Evolution of poor reporting and inadequate methods over time in 20 920 randomised controlled trials included in Cochrane reviews: research on research study
BMJ 2017; 357 doi: https://doi.org/10.1136/bmj.j2490 (Published 08 June 2017) Cite this as: BMJ 2017;357:j2490
- Agnes Dechartres, associate professor14,
- Ludovic Trinquart, researcher14,
- Ignacio Atal, data scientist14,
- David Moher, senior scientist5,
- Kay Dickersin, professor6,
- Isabelle Boutron, professor14,
- Elodie Perrodeau, statistician13,
- Douglas G Altman, professor7,
- Philippe Ravaud, professor148
- 1INSERM, U1153, Paris, France
- 2Faculté de Médecine, Université Paris Descartes, Sorbonne Paris Cité, Paris, France
- 3Centre d’Épidémiologie Clinique, Hôpital Hôtel Dieu, AP-HP (Assistance Publique des Hôpitaux de Paris), Paris, France
- 4Cochrane France, Paris, France
- 5Clinical Epidemiology Program, Ottawa Hospital Research Institute, School of Epidemiology, Public Health and Preventive Medicine, Canadian EQUATOR Centre, University of Ottawa, Ottawa, ON, Canada
- 6Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- 7Centre for Statistics in Medicine, Oxford, UK
- 8Columbia University, Mailman School of Public Health, Department of Epidemiology, New York, NY, USA
- Correspondence to: A Dechartres
- Accepted 9 May 2017
Objective To examine how poor reporting and inadequate methods for key methodological features in randomised controlled trials (RCTs) have changed over the past three decades.
Design Mapping of trials included in Cochrane reviews.
Data sources Data from RCTs included in all Cochrane reviews published between March 2011 and September 2014 reporting an evaluation of the Cochrane risk of bias items: sequence generation, allocation concealment, blinding, and incomplete outcome data.
Data extraction For each RCT, we extracted consensus on risk of bias made by the review authors and identified the primary reference to extract publication year and journal. We matched journal names with Journal Citation Reports to get 2014 impact factors.
Main outcome measures We considered the proportions of trials rated by review authors at unclear and high risk of bias as surrogates for poor reporting and inadequate methods, respectively.
Results We analysed 20 920 RCTs (from 2001 reviews) published in 3136 journals. The proportion of trials with unclear risk of bias was 48.7% for sequence generation and 57.5% for allocation concealment; the proportion of those with high risk of bias was 4.0% and 7.2%, respectively. For blinding and incomplete outcome data, 30.6% and 24.7% of trials were at unclear risk and 33.1% and 17.1% were at high risk, respectively. Higher journal impact factor was associated with a lower proportion of trials at unclear or high risk of bias. The proportion of trials at unclear risk of bias decreased over time, especially for sequence generation, which fell from 69.1% in 1986-1990 to 31.2% in 2011-14 and for allocation concealment (70.1% to 44.6%). After excluding trials at unclear risk of bias, use of inadequate methods also decreased over time: from 14.8% to 4.6% for sequence generation and from 32.7% to 11.6% for allocation concealment.
Conclusions Poor reporting and inadequate methods have decreased over time, especially for sequence generation and allocation concealment. But more could be done, especially in lower impact factor journals.
The public has the right to expect that information about the efficacy and safety of health interventions is complete, transparent, and reliable and that research investments are not wasted.1234567 Randomised controlled trials (RCTs) are the reference standard for assessing the efficacy of interventions, but how they are planned, conducted, and reported raises important concerns.56 Empirical evidence shows that using inadequate methods to generate randomisation sequences, not concealing allocation, lack of blinding, and excluding patients from analyses can bias findings.89101112 Yet half of RCTs fail to take adequate steps to reduce such bias.3713
Poor reporting of methods is another common problem.35 Although it does not necessarily reflect poor methods,1415 poor reporting prevents readers from adequately assessing whether the methods are reliable and whether the results and conclusions of RCTs can be trusted. It also limits reproducibility.16 RCTs that use inadequate methods or that are poorly reported might not contribute to the evidence base, which is a waste of resources that affects not only RCTs but also the systematic reviews that include them.
Many methodological studies have evaluated the quality of reporting and risk of bias in RCTs.1718192021222324252627282930 But most have focused on small numbers of trials or specific diseases, journals, or time periods. These studies used various criteria for their assessments, which were frequently not defined.17 We therefore lack knowledge of the global magnitude of poor reporting and inadequate methods in RCTs and how they have changed over time. A comprehensive picture of the quality of research would help us better understand the disparities between journals and to propose practical ways of improvement.
Cochrane reviews are uniquely placed to evaluate the quality of research. They synthesise findings from RCTs and are used to aid decision making and develop practice guidelines.3132 The risk of bias of each included study is systematically assessed in duplicate by trained reviewers who reach consensus. Reviewers use the Cochrane risk of bias tool, which assesses several methodological items that are considered crucial to evaluating the validity of an RCT.3334
We examined how poor reporting and inadequate methods have changed over the past three decades, both in and between journals, using the data included in Cochrane reviews.
We used the proportion of trials considered by the review authors to be at high or unclear risk of bias as surrogate measures of inadequate methods and poor reporting, respectively. We focused on several key items of the Cochrane risk of bias tool: sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessors, and incomplete outcome data (table 1) because they are associated with intervention effect estimates in meta-epidemiological studies.333536
We obtained data from all Cochrane reviews published between March 2011 and September 2014. We chose March 2011 because it corresponded to the most recent update of the Cochrane risk of bias tool.34 An XML file was provided for each review consisting of all data entered by the review authors in Review Manager, the software used for preparing and maintaining Cochrane reviews.37 Each file contained information for all studies included in the review, including methods summary, references, and consensus on risk of bias assessment between the reviewers.34
We downloaded the Web of Science and PubMed databases for the list of indexed journals and the Journal Citation Reports database for the list of journals with their medical categories and 2014 impact factors.
Constitution of a unique database
We combined all individual XML files into a single database using R v3.2.2 with the XML package (https://www.R-project.org/). We first standardised the wording about risk of bias items because it varied across reviews. Two of us (AD, LT) manually classified them using a standardised set of risk of bias items, and all disagreements were resolved by discussion.
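A minimal sketch of this merging and standardisation step, written in Python rather than the R and XML package the authors used. The element and attribute names (`review`, `study`, `rob`, `item`, `judgment`) and the alias table are hypothetical and do not reproduce the actual Review Manager file format; in the study the item labels were classified manually by two authors.

```python
import xml.etree.ElementTree as ET

# Hypothetical alias table: free-text item labels -> standardised item names.
# In the study this classification was done manually by two authors.
ITEM_ALIASES = {
    "random sequence generation": "sequence_generation",
    "adequate sequence generation?": "sequence_generation",
    "allocation concealment": "allocation_concealment",
    "allocation concealment?": "allocation_concealment",
}


def standardise_item(label):
    """Return the standardised item name, or None if the label is unknown."""
    return ITEM_ALIASES.get(label.strip().lower())


def rows_from_review(xml_text):
    """Flatten one review file (hypothetical schema) into one row per judgment."""
    root = ET.fromstring(xml_text)
    rows = []
    for study in root.iter("study"):
        for rob in study.iter("rob"):
            rows.append({
                "review_id": root.get("id"),
                "study_id": study.get("id"),
                "item": standardise_item(rob.get("item", "")),
                "judgment": rob.get("judgment"),
            })
    return rows


def build_database(xml_texts):
    """Combine the rows of all review files into a single list."""
    database = []
    for xml_text in xml_texts:
        database.extend(rows_from_review(xml_text))
    return database


# Invented example file, for illustration only.
EXAMPLE = """<review id="CD000001">
  <study id="Smith 1999">
    <rob item="Adequate sequence generation?" judgment="unclear"/>
    <rob item="Allocation concealment" judgment="low"/>
  </study>
</review>"""
```

Calling `build_database([EXAMPLE])` yields two rows, one per risk of bias judgment, each carrying the review and study identifiers. Labels missing from the alias table map to `None`, which marks them for manual classification.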
Selection of eligible Cochrane reviews
If the Cochrane review had been updated within our search limits, we considered the most recent version. We excluded reviews that had been withdrawn and “empty” reviews (that is, those not including any studies). We focused on reviews of RCTs only and excluded those of observational or non-randomised studies. To identify observational or non-randomised studies, two of us (AD, LT) made a list of keywords that could correspond to observational studies—for example, “observational,” “cohort,” and “case-control”—that were automatically searched in the free text description of the methods summary. We also identified risk of bias items that could correspond to observational studies; for example, “potential confounding factors taken into account.” We excluded reviews reporting these keywords or items.
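The automated keyword screen could look roughly like the following Python sketch. It uses only the three keywords and the one risk of bias item quoted above; the authors' full lists are not given here, and the function name and input shapes are assumptions made for illustration.

```python
import re

# Only the examples quoted in the text; the authors' complete lists are not reproduced.
OBSERVATIONAL_KEYWORDS = ("observational", "cohort", "case-control")
OBSERVATIONAL_ITEMS = ("potential confounding factors taken into account",)


def flags_observational(methods_text, rob_item_labels):
    """Return True if a review shows a keyword or item suggesting observational studies."""
    text = methods_text.lower()
    keyword_hit = any(
        re.search(r"\b" + re.escape(keyword) + r"\b", text)
        for keyword in OBSERVATIONAL_KEYWORDS
    )
    item_hit = any(
        label.strip().lower() in OBSERVATIONAL_ITEMS for label in rob_item_labels
    )
    return keyword_hit or item_hit
```

A purely automatic screen of this kind would also flag a randomised trial whose methods summary happens to mention a cohort, so a flagged review may still need to be checked by hand.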
Then, we identified reviews that included an assessment of the risk of bias for the following key items: sequence generation, allocation concealment, blinding (whatever the type of blinding item), and incomplete outcome data. Some reviews assessed blinding overall (according to a previous version of the risk of bias tool), with no distinction between blinding of participants and personnel and blinding of outcome assessors. So we considered reviews eligible if they reported at least one item concerning blinding—blinding overall, blinding of participants and personnel, or blinding of outcome assessors.
Selection of eligible RCTs and identification of corresponding primary reference
We excluded RCTs if the results were not reported in at least one journal article published after 1985. We excluded RCT results published before 1986 (corresponding to the 10th centile of trial year of publication) to focus on contemporary trials. We first extracted the references for all RCTs included in the eligible Cochrane reviews, including the reference type (journal article, book section, or conference proceedings). Then we used a matching algorithm38 and manual validation to identify duplicate references, and we excluded trials that were included in more than one review. For RCTs that referenced more than one journal article, we selected the primary reference reported by the review authors. For the few cases that had several primary references, we manually identified the reference corresponding to reporting of the main results. We excluded RCTs for which no primary reference was reported. We also excluded RCTs reported in abstract format only (reported as such in the characteristics of included studies). We extracted the year of publication and journal names of the primary reference for all selected RCTs.
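The selection rules in this paragraph can be sketched in Python as below. The duplicate key (normalised title plus year) is a simple stand-in, not the matching algorithm the authors cite, and the record layout (`primary_reference`, `type`, `year`, `title`) is hypothetical. The sketch also assumes the input is ordered so that the copy to keep comes first.

```python
import re


def reference_key(reference):
    """Crude duplicate key: lower-cased title without punctuation, plus year."""
    title = re.sub(r"[^a-z0-9]+", " ", reference["title"].lower()).strip()
    return (title, reference["year"])


def select_trials(trials):
    """Apply the eligibility rules and keep one copy of each duplicated trial."""
    seen = set()
    selected = []
    for trial in trials:
        primary = trial.get("primary_reference")
        if primary is None:
            continue  # no primary reference reported
        if primary["type"] != "journal article":
            continue  # for example conference proceedings or book sections
        if primary["year"] < 1986:
            continue  # results published before 1986
        key = reference_key(primary)
        if key in seen:
            continue  # already included from another review
        seen.add(key)
        selected.append(trial)
    return selected
```

Exact matching on a normalised key will miss references that differ by more than punctuation or case, which is why the study combined automatic matching with manual validation.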
Matching journal names with Web of Science, PubMed, and Journal Citation Reports
We used the Web of Science and PubMed databases to standardise journal names and abbreviations. One of us (AD) manually reviewed journal names that could not be matched to verify whether they corresponded to existing journals that were not indexed. This enabled us to identify variations in journal names (for example, Critical Care, Critical Care London, and Critical Care London England), which were corrected according to Web of Science or PubMed. We excluded RCTs with journal names corresponding to non-existing journals. Finally, for each journal we extracted the 2014 journal impact factor, medical category, average impact factor centile across medical categories, country, and language from Journal Citation Reports. For journals not included in Journal Citation Reports, we manually evaluated the main medical category and the language.
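As an illustration of the name standardisation, the Python sketch below normalises journal names and resolves known variants through an alias table. The function names and the alias table are hypothetical; in the study, unmatched names were reviewed manually and corrected according to Web of Science or PubMed.

```python
import re


def normalise(name):
    """Lower-case a journal name and collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def build_index(canonical_names, aliases):
    """Map normalised names and known variants to their canonical journal name."""
    index = {normalise(name): name for name in canonical_names}
    for variant, canonical in aliases.items():
        index[normalise(variant)] = canonical
    return index


def match_journal(name, index):
    """Return the canonical name, or None if the name needs manual review."""
    return index.get(normalise(name))


# Variants taken from the example in the text.
INDEX = build_index(
    ["Critical Care"],
    {
        "Critical Care London": "Critical Care",
        "Critical Care London England": "Critical Care",
    },
)
```

With this index, `match_journal("Critical care (London, England)", INDEX)` resolves to `"Critical Care"`, while a name absent from the index returns `None`.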
Extraction of risk of bias assessment
For each RCT we extracted the Cochrane risk of bias judgments, which were “low,” “high,” or “unclear” for each of the five key items (sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessors, and incomplete outcome data). When a Cochrane review assigned more than one judgment to the same item (for example, blinding might have been assessed for each outcome) we considered the risk of bias judgment corresponding to subjective outcomes, because objective outcomes should be low risk. For trials that assessed blinding overall, we considered the risk of bias to be the same for both blinding of participants and personnel and blinding of outcome assessors. For trials that were included in more than one Cochrane review, we extracted the risk of bias assessment from the most recent review.
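These extraction rules can be expressed as a short Python sketch. The input layout (each item mapped to a list of `(outcome_type, judgment)` pairs) and the item names are hypothetical. The fallback to the first judgment when no subjective outcome is recorded is an assumption, not something the text specifies.

```python
KEY_ITEMS = (
    "sequence_generation",
    "allocation_concealment",
    "blinding_participants_personnel",
    "blinding_outcome_assessors",
    "incomplete_outcome_data",
)


def pick_judgment(entries):
    """Choose one judgment from a list of (outcome_type, judgment) pairs."""
    if not entries:
        return None
    for outcome_type, judgment in entries:
        if len(entries) > 1 and outcome_type == "subjective":
            return judgment  # prefer the judgment for subjective outcomes
    return entries[0][1]  # single judgment, or assumed fallback


def resolve_judgments(raw):
    """Return one judgment per key item for a trial within a single review."""
    resolved = {item: pick_judgment(raw.get(item, [])) for item in KEY_ITEMS}
    overall = pick_judgment(raw.get("blinding_overall", []))
    if overall is not None:
        # Blinding assessed overall: use the same judgment for both blinding items.
        for item in ("blinding_participants_personnel", "blinding_outcome_assessors"):
            if resolved[item] is None:
                resolved[item] = overall
    return resolved
```

For a trial assessed in several reviews, this function would be applied to the judgments from the most recent review only.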
Summary of data available for each included RCT
Publication characteristics—year of publication, journal name, medical category, language, and whether the journal was indexed in Web of Science, PubMed, and Journal Citation Reports. For RCTs published in a journal indexed in Journal Citation Reports, we also had the journal 2014 impact factor.
Risk of bias assessment—final judgment, corresponding to the consensus of two trained reviewers, for: sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessors, and incomplete outcome data.
Assessment of poor reporting and inadequate methods
The Cochrane Handbook says that risk of bias should be considered unclear if there is insufficient information to permit judgment of low or high risk.3334 So an item classified as unclear can be considered poorly reported. A judgment of high risk is given when inadequate methods or conduct could result in a bias of sufficient magnitude to have a notable effect on the results or conclusions of the trial.34 We used the proportion of trials considered by the review authors to be at unclear or high risk of bias as surrogates for poor reporting and inadequate methods, respectively.
The analysis was descriptive. We first assessed the overall proportion of trials at unclear and high risk of bias. Then we examined how poor reporting and inadequate methods changed over time. Because inadequate methods can only be assessed when reporting is adequate, we performed two analyses for the evolution of inadequate methods: one based on all trials and one based on only trials that were not at unclear risk.
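The three descriptive quantities for a single risk of bias item can be sketched in Python as follows. The function name and the list-of-strings input are assumptions made for illustration.

```python
def risk_proportions(judgments):
    """Summarise a list of 'low', 'high', or 'unclear' judgments for one item."""
    total = len(judgments)
    unclear = judgments.count("unclear")
    high = judgments.count("high")
    assessable = total - unclear  # trials reported well enough to be judged
    return {
        # surrogate for poor reporting
        "unclear_all": unclear / total,
        # surrogate for inadequate methods, all trials in the denominator
        "high_all": high / total,
        # inadequate methods among trials not at unclear risk
        "high_excluding_unclear": high / assessable if assessable else None,
    }
```

With ten trials of which five are unclear, one high, and four low, the proportion at high risk is 10% of all trials but 20% of the trials that could be judged. This is why the second analysis gives larger values than the first.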
We predefined four subgroup analyses according to: impact factor (≥10, 5-10, <5, no IF); centile of impact factor within the medical categories (≥90th centile, 70-90th, <70th centile, no IF); medical subject category according to Journal Citation Reports (“medicine, general and internal” versus other); and each of the 10 journals with the most RCTs. One reviewer suggested an additional subgroup analysis on language (English only versus other).
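The two impact factor groupings could be coded as in the Python sketch below. The function names and group labels are illustrative. The text does not say how values falling exactly on a centile boundary were assigned, so the sketch assumes that lower bounds are inclusive.

```python
def impact_factor_group(impact_factor):
    """Assign a journal to an impact factor subgroup (None means no impact factor)."""
    if impact_factor is None:
        return "no IF"
    if impact_factor >= 10:
        return ">=10"
    if impact_factor >= 5:
        return "5-10"
    return "<5"


def centile_group(centile):
    """Assign a journal to a subgroup by impact factor centile within its category."""
    if centile is None:
        return "no IF"
    if centile >= 90:
        return ">=90th"
    if centile >= 70:  # assumption: a value of exactly 70 falls in the middle group
        return "70-90th"
    return "<70th"
```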
Patients were not involved in any aspect of the study design, conduct, or in the development of the research question or outcome measures. There are no plans to disseminate the results of the research to study participants or the relevant patient community. This study is research on existing published research and therefore there was no active patient recruitment for data collection.
Selection and general characteristics
The selection process is reported in web appendix 1. We included data from 2001 systematic reviews, including 20 920 unique articles (median year of publication 2003, interquartile range 1997-2008) published in 3136 journals. 19 551 (93.4%) articles were published in 2390 journals indexed in Web of Science or PubMed, and 17 944 (85.8%) were published in 1706 journals indexed in Journal Citation Reports. The median impact factor for the journals was 3.4 (interquartile range 2.0-5.5) (table 2). Most RCTs were published in journals with impact factors <5 or between 5 and 10 (12 496 (59.7%) and 3134 (15.0%), respectively). Of the journals indexed in Journal Citation Reports, 165 (9.7%) were non-English and included 584 RCTs, corresponding to 3.2% of RCTs published in journals with impact factors. 2976 RCTs were published in 1430 journals without impact factors (14.2%).
Characteristics of journals without impact factors
Among the 1430 journals without impact factors, 743 (52.0%) were non-English and included 1511 RCTs (50.8% of RCTs published in journals without impact factors). The most common medical categories were integrative and complementary medicine and medicine, general and internal with 521 RCTs for each. We identified 12 journals (including 29 RCTs) that did not have an impact factor in 2014 but had one in 2015 and 102 journals (including 287 RCTs) that had had an impact factor before 2014.
Overall assessment of poor reporting and inadequate methods
The proportion of trials at unclear risk of bias was high for sequence generation and allocation concealment (48.7% and 57.5%, respectively) and lower for blinding of participants and personnel and incomplete outcome data (30.6% and 24.7%, respectively) (fig 1). The proportion of trials at high risk of bias was 4.0% and 7.2% of all trials for sequence generation and allocation concealment, respectively, but was 33.1% and 22.6% for blinding of participants and personnel and blinding of outcome assessors, respectively, and 17.1% for incomplete outcome data (fig 1).
For all five risk of bias items, we found a lower proportion of trials at unclear or high risk of bias in journals with high impact factors than in those with low or no impact factor (fig 2). For allocation concealment, for example, 38.0% of trials published in journals with impact factors ≥10 were at unclear risk of bias, compared with 73.4% of those in journals with no impact factor. We found the same trend when grouping the journals by centiles of impact factor: 45.9% for trials published in journals with impact factors above the 90th centile v 73.4% in those with no impact factor. The proportion of trials at high or unclear risk of bias was lower for journals in the “medicine, general and internal” category than for those in other categories and those not indexed (fig 2). We describe risk of bias for the 10 main medical categories in web appendix 2.
A lower proportion of trials were at unclear or high risk of bias in English journals than in non-English journals (web appendix 3). For allocation concealment, 55.7% and 6.6% of RCTs were at unclear and high risk of bias in English journals, compared with 73.7% and 12.5%, respectively, in non-English journals.
Evolution of poor reporting and inadequate methods over time
Evolution of poor reporting
The proportion of trials at unclear risk of bias decreased over time, especially for sequence generation, which fell from 69.1% in 1986-1990 to 31.2% in 2011-14, and for allocation concealment, which fell from 70.1% to 44.6% in the same period (fig 3).
The fall in unclear risk of bias over time for sequence generation and allocation concealment was consistent across all types of journals but seemed more marked for journals with higher impact factors (fig 4) and for general journals compared with specialist journals (web appendix 4).
The evolution of poor reporting for the 10 journals with the most RCTs is shown in web appendix 5. All journals showed an improvement over time but with differences between journals.
Evolution of inadequate methods
Evolution of inadequate methods over time is shown in fig 5. When considering all trials, including those considered at unclear risk of bias, the change seems minimal. For sequence generation, the proportion of trials at high risk of bias dropped from 4.6% in 1986-1990 to 3.2% in 2011-14; for allocation concealment, the proportion fell from 9.8% to 6.4% over the same period. The decrease was greater after excluding trials at unclear risk of bias, from 14.8% to 4.6% and from 32.7% to 11.6%, respectively. We found a slight decrease for blinding of outcome assessors and incomplete outcome data, from 24.0% to 20.3% and from 19.8% to 14.5%, respectively, when considering all trials and from 42.0% to 30.6% and from 28.3% to 18.7% after excluding trials at unclear risk of bias. By contrast, the proportion of trials at high risk of bias for blinding of participants and personnel slightly increased from 31.0% to 36.1% for all trials and from 47.3% to 49.3% after excluding trials at unclear risk of bias. We found no clear difference in evolution over time by journal impact factor (web appendix 6).
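The two sets of figures for sequence generation and allocation concealment are linked by a simple relation: the proportion at high risk after excluding unclear trials equals the proportion at high risk among all trials divided by the proportion not at unclear risk. The Python sketch below recomputes the second set from the percentages reported in the text. It is only a consistency check on rounded values, not the authors' analysis, and the names used are illustrative.

```python
def high_excluding_unclear(high_all_pct, unclear_all_pct):
    """Percentage at high risk among trials that are not at unclear risk."""
    return 100 * high_all_pct / (100 - unclear_all_pct)


# (high % of all trials, unclear % of all trials, reported high % excluding unclear)
REPORTED = {
    ("sequence generation", "1986-1990"): (4.6, 69.1, 14.8),
    ("sequence generation", "2011-14"): (3.2, 31.2, 4.6),
    ("allocation concealment", "1986-1990"): (9.8, 70.1, 32.7),
    ("allocation concealment", "2011-14"): (6.4, 44.6, 11.6),
}

for (item, period), (high_all, unclear_all, reported) in REPORTED.items():
    recomputed = high_excluding_unclear(high_all, unclear_all)
    print(f"{item}, {period}: recomputed {recomputed:.1f}%, reported {reported}%")
```

The recomputed values differ from the reported ones by less than 0.1 percentage points, which is within what rounding of the inputs to one decimal place can produce.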
We extensively mapped the research included in Cochrane reviews and found a fall in poor reporting over time, especially for sequence generation and allocation concealment. The proportion of trials with inadequate methods has also decreased slightly for these items. But we found important differences between journals based on their impact factor and between general and specialist journals. Our results raise concerns about trials published in journals without impact factors, in light of their high number (2976 trials; 14.2% of our sample) and the prevalence of poor reporting and inadequate methods.
Strengths and weaknesses
We built a comprehensive database of primary research (RCTs) by compiling a large amount of data routinely collected for Cochrane systematic reviews. These data are of good quality,39 are standardised in part, and are available in an electronic format. Using these data, we identified more than 20 000 RCTs and the corresponding risk of bias assessments, which were collected in duplicate by trained Cochrane reviewers. Although RCTs included in Cochrane reviews do not represent all RCTs, they cover a large and important body of evidence.3132 We think that research on such a large group of trials would not have been possible without Cochrane reviews.
Our study has several limitations. We relied on Cochrane reviewers’ assessments of risk of bias. Although reviewers should be trained in use of the risk of bias tool, variability might exist. For RCTs included in more than one review, we relied on the risk of bias assessment in the most recent review. We compared this assessment with previous reviews for 1065 RCTs that shared the same primary reference and found agreement in 83% (n=881) and 75% (n=802) of RCTs for sequence generation and allocation concealment, respectively. Most disagreements concerned the distinction between low and unclear risk of bias. Some Cochrane reviewers might contact study authors for clarification and additional information for some methodological elements that were not clearly reported in study reports. This variability seems to be random and does not seem to be correlated with the time of conduct of the review.
Reviewers might have excluded some trials because of inadequate methods, leading us to underestimate the number of studies with poor reporting and inadequate methods in publications. We cannot exclude the possibility that a classification bias might explain the differences by impact factor. Cochrane reviewers might be influenced by the impact factor of the journal when assessing the risk of bias, with attribution of better scores to journals with the highest impact factors. We think that this is more likely to affect the extreme categories (impact factors ≥10 and no impact factor), but we observed a clear trend for all categories.
We relied on the 2014 impact factor from Journal Citation Reports, which could differ from that in the year the trial was published. Finally, we focused on inadequate methods and poor reporting and did not consider other important sources of waste in RCTs, such as research questions not relevant to patients or their doctors or failure to report trial results.37
Comparison with other studies
This study goes beyond previous literature on this topic with assessment of changes over time and comparisons between journals. Our results show the magnitude of poor reporting in RCTs included in Cochrane reviews, with around half of these RCTs considered at unclear risk of bias by the review authors for sequence generation and allocation concealment, which agrees with previous findings.134041 We found an improvement in reporting for these items over time, with the proportion of trials at unclear risk of bias being halved over three decades, which is consistent with the results of a study of variation in risk of bias over time from a sample of 1732 RCTs included in 97 systematic reviews, published in 2012.40
We found a lower proportion of RCTs with poor reporting and inadequate methods in journals with higher impact factors. General medical journals seemed to be associated with lower risk of bias and better reporting than specialist journals, but this may be explained by the higher impact factors of general medical journals.
Journal impact factor might be a surrogate for other factors. Previous studies show that trials published in journals with high impact factors are more likely to report methodological safeguards against bias4243 and to adhere to reporting guidelines than those with lower impact factors.4344 Journals with high impact factors might have more technical resources, which can help to ensure adherence to reporting guidelines by checking submission of the checklist or might be used to detect selective reporting of outcomes by checking information from clinical trial registries. Moreover, journals with high impact factors might be more engaged in quality improvement, with methodologists involved in peer review. Finally, the selection of trials to be published might be more stringent in journals with high impact factors. However, although some studies support a relationship between impact factor and quality,424344 others show no such relationship,45 and impact factors remain controversial.4647 Publication in a journal with a high impact factor does not ensure that an RCT is at low risk of bias.42
Trials published in a journal without an impact factor represented 14.2% of all trials included in Cochrane reviews we examined. Poor reporting and inadequate methods were particularly common in these trials, with limited improvement over time. Although impact factors should not be assumed to represent journal quality, journals without an impact factor are likely to be different from those that do have one. In our sample we identified 1430 journals without impact factors, 52.0% of which were non-English journals (compared with 9.7% of journals with impact factors).
The improvements over time that we observed are encouraging but could be better. Most of the waste related to poor reporting or inadequate methods could be avoided. We previously showed that half of the waste related to using inadequate methods could be limited at the planning stage of the trial with simple and inexpensive methodological adjustments.13 Waste related to poor reporting could be completely avoided. Although using reporting guidelines, such as the CONSORT checklist, is associated with more complete reporting,4849 their implementation varies between journals44505152 with many journals having no policy or mentioning only the existence of the CONSORT statement in the instructions to authors.44505152 We need to promote more active implementation, such as submission of the checklist with the manuscript, as it has been associated with better reporting.51
Research investigators and other stakeholders must also act responsibly to report clear, transparent, and reliable research findings.6 This responsibility can be communicated through education and training starting at university and continuing through residency, fellowship, and the various stages of a career (for example, through professional societies and university departments). Investigators should work with methodologists from the planning stage of their trial to increase the likelihood of adequate study design and quality of reporting.53 If having a methodologist as a co-investigator is not possible, writing aid tools might be useful.54
Our data provide an overview of the quality of evidence in Cochrane reviews. Poor reporting and inadequate methods are common in RCTs included in Cochrane reviews and might affect their results and conclusions, as previous meta-epidemiological studies have shown.89101112 Our results highlight the importance of assessing risk of bias and of incorporating this assessment in evidence synthesis. Improvements at the trial level are necessary to improve the quality of evidence provided by systematic reviews.
This study is part of a larger project, the next step of which is to talk to journals either alone or perhaps grouped by specialty55 about whether this type of audit and feedback is useful to their journal practice. Such data can serve an important monitoring function, examining incremental changes in quality over time. If our process could be automated, this might be a powerful tool for funding agencies and others interested in assessing the value of their research investments.156756 Cochrane is uniquely placed to observe the quality of research, as its data are routinely collected with excellent quality assurance.
This extensive mapping of trials shows a decrease in waste related to poor reporting and inadequate methods over time, but with important differences between risk of bias items and between journals. Our approach, based on the use of data already collected by Cochrane for its reviews, could be a first step in the development of a live observatory to monitor the quality of research over time.
What is already known on this subject
Poor reporting and inadequate methods are common in randomised controlled trials (RCTs)
Many methodological studies have evaluated the quality of reporting and risk of bias in RCTs, but they are limited in terms of number of trials evaluated, most focusing on specific diseases, journals, or time periods.
What this study adds
We took advantage of the amount and quality of data included in Cochrane reviews to map the evolution of poor reporting and inadequate methods in and between journals
From nearly 21 000 RCTs published in 3136 journals over three decades, our results show a decrease over time of poor reporting and inadequate methods especially for sequence generation and allocation concealment
We found a lower proportion of RCTs with poor reporting and inadequate methods in journals with higher impact factors. By contrast, our results raise concerns about journals without impact factors because of the prevalence of poor reporting and inadequate methods
The next step would be to provide feedback to journals and to evaluate whether this type of audit has an impact by using Cochrane data as a live observatory to monitor changes over time
We thank David Tovey, editor in chief of the Cochrane Library, for sharing data from Cochrane reviews; Javier Mayoral Campos, system administrator; the Cochrane Central Executive for preparing files; and all Cochrane reviewers who collected data. We also thank Carolina Riveros, from Inserm U1153, for help with data extraction and Elise Diard, from Inserm U1153, for help with figures.
Contributors: PR generated the idea; AD, LT, and PR designed the study; AD, LT, and IA selected and collected the data; IA and EP managed the data and conducted the statistical analysis; AD, LT, DM, KD, IB, DGA, PR interpreted the data; AD and PR wrote the manuscript; and LT, IA, DM, KD, IB, and DGA critically reviewed the manuscript. AD is the guarantor. She had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Funding: The researchers did not receive external funding. This work was internally funded by Cochrane France. The funding source had no role in the design, conduct, writing, or submission of this article.
Competing interests: DM is a member of the Cochrane Library Oversight Committee and a member of the Cochrane Bias Methods Group. He received funding from the Cochrane Collaboration for an unrelated project. IB and DGA are also members of the Cochrane Bias Methods Group. DGA is supported by Cancer Research UK (C5529). KD is the director of the US Cochrane Center. PR is the director of Cochrane France. The other authors declare no competing interests.
Ethical approval: Not applicable. This is a research on research study.
Data sharing: Data and analysis code available on request from the authors.
Transparency declaration: The guarantor (AD) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.