Outcome reporting bias in clinical trials: why monitoring matters
BMJ 2017; 356 doi: https://doi.org/10.1136/bmj.j408 (Published 14 February 2017) Cite this as: BMJ 2017;356:j408
- 1Departments of Medicine, Health Research and Policy, and Statistics, and Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Palo Alto, CA 94305, USA
- 2Division of Medical Ethics, School of Medicine, NYU Langone Medical Center, New York, NY 10016, USA
- 3Clinical Research, BUC (Biosciences UAM+CSIC) Program, International Campus of Excellence, Universidad Autónoma de Madrid, Ciudad Universitaria de Cantoblanco, Madrid 28049, Spain
- Correspondence to: J P A Ioannidis
Selective reporting of outcomes can bias clinical trials.1 This happens when investigators publish some prespecified outcomes but not others, depending on the nature and direction of the results. They may also add outcomes that were not prespecified or modify some of the prespecified outcomes in subtle or major ways. These selective reporting practices tend to make trial results more attractive but also spurious.
The proposed solution has been trial registration, including the explicit listing of prespecified outcomes before launch, and the transparent description of all changes that occur afterwards. The BMJ asks the guarantor of research to sign a declaration of transparency at submission, stating that the manuscript is an “honest, accurate, and transparent account” of the methods and results, including explanations of “any discrepancies from the study as planned” (http://www.bmj.com/about-bmj/resources-authors/). We discuss efforts to identify discrepancies between trial protocols and published papers and outline possible quality control measures for the future.
From October 2015 to January 2016 the Centre for Evidence Based Medicine Outcome Monitoring Project (COMPare) team checked articles published in the top five general medicine journals based on impact factor (New England Journal of Medicine, Lancet, JAMA, The BMJ, and Annals of Internal Medicine) against protocols and registries pre-dating trial launch.2 Only nine of the 67 clinical trials published (13%) were “perfectly reported,” that is, all primary and secondary outcomes were the same in the protocols or registries and the articles. COMPare found that investigators didn’t report the results of 354 outcomes that were prespecified in protocols or registries and that they added, without flagging it to the editor and reviewers, 357 outcomes that were not prespecified. COMPare sent correction letters to the five journals and corresponded further when journals refused to publish the corrections.
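The comparison described above can be pictured as a set difference between the outcomes prespecified in a registry or protocol and those reported in the article. The following Python sketch is illustrative only and is not COMPare's actual procedure: the function names, the example outcomes, and the naive text normalisation are all assumptions. Real crosschecking needs human judgment about whether two differently worded outcomes are in fact the same.

```python
def normalise(outcome):
    """Lowercase and collapse whitespace (an assumed, deliberately naive matching rule)."""
    return " ".join(outcome.lower().split())


def crosscheck(prespecified, reported):
    """Compare prespecified outcomes with reported ones (hypothetical helper)."""
    pre = {normalise(o) for o in prespecified}
    rep = {normalise(o) for o in reported}
    return {
        # Prespecified in the registry or protocol but absent from the article
        "unreported": sorted(pre - rep),
        # Present in the article but never prespecified
        "added": sorted(rep - pre),
        # "Perfectly reported" in the sense used above: both sets are identical
        "perfectly_reported": pre == rep,
    }


# Invented example data, for illustration only
registry = ["All-cause mortality", "Hospital readmission at 30 days", "Quality of life"]
article = ["all-cause mortality", "Quality of life", "Length of stay"]

result = crosscheck(registry, article)
print(result)
```

In this invented example one prespecified outcome is unreported and one reported outcome was not prespecified, so the trial would not count as perfectly reported.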
To date only The BMJ and Lancet have agreed to highlight discrepancies and publish corrections. Two of the three trials in The BMJ have been corrected, and correction letters have been published. As at May 2016 Lancet had published 11 of COMPare’s 19 correction letters, but they were accompanied by replies from the original trial authors, who typically argued that what they did was correct. New England Journal of Medicine, JAMA, and Annals of Internal Medicine did not make corrections. Their justifications include: that not all outcomes are published in the same paper (for example, some secondary outcomes are in secondary publications); that there is not enough space to report all secondary outcomes; that protocols change for good reasons, so a fairer comparison would be the published paper versus the latest protocol update; and that editors and reviewers should ensure that important outcomes are reported and best analyses are applied, even if they are not prespecified. The COMPare team replied that they simply sought explicit acknowledgment and justification for adding, deleting, or changing outcomes.
Despite lack of consensus between editors and COMPare investigators, several benefits have emerged from this engagement. The usefulness of crosschecking was shown in The BMJ with publication of a final correction3 4 in which the investigators “broadly acknowledged initial errors and corrected them swiftly and clearly.”2 Moreover, since 19 April 2016 Annals of Internal Medicine has required authors to submit with their manuscript a copy of the trial protocol with all dated amendments, which will be available to readers (http://annals.org/public/authorsinfo.aspx).
What we have learnt so far
Instead of arguing over who is right or wrong and blaming unrealistic watchdogs, misbehaving authors, or complacent editors, we should focus on the lessons learnt about study design. Trials analysed by COMPare probably have, on average, fewer problems with outcome reporting than most; the majority of trials are not even registered,5 6 most journals don’t require trial registration,7 and few journals require protocols to be submitted.
Many trials either don’t prespecify outcomes or do so vaguely.8 When outcomes are prespecified, exact statistical analyses are usually undeclared. This allows substantial variation (“vibration”) in results,9 depending on what statistical model is used, whether adjustments are made,10 whether data are withheld, how missing data are handled, and other factors. Sometimes even key factors such as the time points of analyses are vague. The choice of prespecified outcomes and analyses can be suboptimal, non-informative, or even misleading for patients, and trials on the same condition have unnecessary variety in chosen outcomes and analyses. Some investigators publish many secondary publications for the same trial, causing confusion. Reviewers, editors, and authors should occasionally improve the analysis of a trial, if recorded data allow suboptimal or misleading choices to be replaced with unambiguously better ones, for example using a clinical outcome that is relevant to patients instead of a poor surrogate.
Incentives for editors and investigators should be aligned towards publishing accurate, complete, and useful results. The notion that outcome manipulation can help a paper get published in a major journal (if unnoticed) or decrease its chances of publication (if noticed) is supported by rather weak evidence. One empirical study found that the chance of rejection from high ranked journals was not very different for trials with (79%) or without (71%) primary outcome discrepancies.11 Instead of trying to fool each other, we should try to be transparent.
Outcomes may differ between preregistration records and final publications for many reasons, some more justifiable than others (table 1). For example, a primary outcome may be registered inaccurately by whoever enters the data. In this case, reporting the correct outcome in the full paper is justifiable, and even desirable. Conversely, manipulating outcomes after unblinding of the data to make the results more attractive is absolutely not justifiable.
Given this complexity, we are surprised that as many as 13% of trials were perfectly reported in the COMPare assessment. Taking into account the spectrum of vagueness, overt and hidden multiplicity, justified or manipulative modifications, and retrospective adjustments, one might think that almost no trial can be trusted. This is unfair, because well done clinical trials can, and do, reliably inform clinical practice. However, more transparency is needed than that conveyed by signing a declaration of transparency. Authors should be open to acknowledging changes without fear that it represents misconduct or suboptimal practice. The notion of a perfect, static trial is unrealistic. Adoption of standardised, core outcomes12 13 and of more efficient and unbiased statistical analyses is useful even after the start of a trial. What matters is whether changes made after the trial was started were made with or without any knowledge of the accumulated data that may have affected the reported results.14 Authors may still be reluctant to present the full gamut of changes.
Moving forward: who needs to do what?
Quality control, that is, the crosschecking that the COMPare team did retrospectively, could be done at the time of submission for publication.15 The peer review process should include quality control of the trial methods (for example, appropriateness of outcomes, control group, inferences, and statistical analyses). But editors and reviewers seem either to have no access to preregistered protocols or not to check them for concordance with reported outcomes. Even the in-depth editorial processes of top journals are insufficient. Data show that between 1 September 2013 and 30 June 2014 The BMJ’s editorial and peer review process reduced unreported prespecified outcomes from 27% to 21% but did not modify the proportion of reported trials with non-prespecified outcomes (11%).16
Editors could explicitly ask authors to declare at the time of submission any changes in the manuscript compared with the registry or protocol.17 They could use table 1 as guidance. Peer reviewers could also do the crosschecking between the manuscript and the protocol or registry, but as overburdened volunteers they are not keen to do so18 19 and are rarely asked by editors.19 Crosschecking could be done by editorial staff members or by external groups similar to the COMPare team, comprising trained students led by a few experts.2 These external groups could liaise with journal editorial teams but not with manuscript authors. Using external groups for the core crosschecking work has advantages and disadvantages (box 1).
Box 1: Using external groups of reviewers to crosscheck trials
Advantages
Could flag outcome reporting biases to editors and peer reviewers
Could be conducted by groups specialised in specific areas
Could enhance transparency of clinical trials reporting
Could enhance trial results reliability
Might become a source of prestige to the organisations hosting these groups
Belonging to one of these groups should be recognised professionally (eg, academic credit)
Challenges and caveats
Many groups may be needed to deal with all trials across the world
Implementing a commonly agreed standard operating procedure on how to conduct and report the crosschecking across all groups might be difficult
Efficient groups might be overwhelmed with requests
The degree to which universities, research institutions, professional societies, and other organisations will support this approach is unknown
May further complicate the peer review and publication system
Financial issues need to be sorted, if groups do this for a fee rather than just academic credit
Experts—such as the International Committee of Medical Journal Editors (ICMJE), the EQUATOR network (http://www.equator-network.org/), or the COMET initiative (http://www.comet-initiative.org/)—could agree on how crosschecking should be conducted and reported to ensure a common approach. The results of crosschecking could be published as a supplementary file with the article, open for further postpublication review by the scientific community and journal readership. Editors could decide on appropriate steps if discrepancies were flagged by crosschecking at any stage in the publication process.
What is the objective?
These efforts aim to ensure that clinical trials are appropriately reported. Crosschecking will also reassure readers that any change (deletion, addition, change in outcomes or analysis) introduced during the conduct or analysis of a trial is justified, was openly disclosed to editors and peer reviewers, and is eventually disclosed to readers. For example, outcomes that are reported but weren’t prespecified should be tagged as retrospective.16 This may improve the credibility of clinical evidence to prescribers and patients, both for single trials and for systematic reviews of many trials. Currently, 34% of Cochrane systematic reviews include one or more trials with a high suspicion of selective reporting bias for their primary outcome.20
How big is the task?
Trials are published in a large number of journals.21 To be maximally effective, quality control procedures should be implemented by hundreds of journals. The vast majority of journals cannot afford editorial staff to conduct the crosschecking in-house, so they may have to use external groups or depend on self declarations from authors, standard peer reviewers, and postpublication review.
Bastian and colleagues estimate that 75 randomised clinical trials are published a day.22 Given that 60% of randomised clinical trials are eventually published,23 we estimate that 125 trials are started daily, which is 2.5 times more than the number of trial protocols registered daily (51) in 2014-2015 on clinicaltrials.gov. About 5% of registered trials are never started,24 and many others are terminated early (12% of those posted on clinicaltrials.gov25; 28% of those approved by research ethics committees).23 Given that many trials also have secondary publications,26 we estimate that between 150 and 200 trials would need to be assessed each day. Including rejections and resubmissions, sometimes with further outcome changes, the number could be even higher. Many external groups would be needed to cover this volume.
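The first step of this estimate can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the paragraph above; the function name is hypothetical. It does not attempt to reproduce the range of 150 to 200 assessments a day, which rests on the authors' judgment about secondary publications and resubmissions rather than on a stated formula.

```python
def trials_started_per_day(published_per_day, fraction_published):
    """Back-calculate trials started per day from the publication rate (hypothetical helper)."""
    return published_per_day / fraction_published


published_per_day = 75      # randomised trials published a day (reference 22)
fraction_published = 0.60   # share of trials eventually published (reference 23)
registered_per_day = 51     # protocols registered daily on clinicaltrials.gov, 2014-2015

started = trials_started_per_day(published_per_day, fraction_published)
ratio = started / registered_per_day

print(f"Trials started per day: {started:.0f}")       # 125
print(f"Started versus registered: {ratio:.2f}")      # 2.45, quoted in the text as 2.5
```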
Perhaps priority could be given only to the larger, late phase trials that are most likely to have substantial clinical impact. However, eventually crosschecking of all trials submitted for publication should be the final aim of all parties involved. Although this will be an extra effort, we think it will be almost negligible compared with the overall effort expended by investigators and sponsors in setting up, conducting, analysing, and reporting a clinical trial and the benefit it will bring in reducing waste.
The ICMJE’s recent proposal for individual trial participant data to be de-identified27 is a step towards full clinical trial transparency but will be of limited efficacy in preventing outcome reporting bias. Instead of blaming authors or editors for selective reporting of outcomes, we need preventive action. Editors must seriously consider implementing better quality control procedures. Investigators and journals could sometimes be acknowledged for making changes in trial protocols and analyses; even then it should be done transparently. We know that significant efforts are ongoing to encourage more careful reporting of changes in clinical trial protocols, but the stakes are too high to let selective outcome reporting continue.
Key messages
Selective reporting of outcomes in clinical trials is common
Many investigators omit prespecified trial outcomes or report outcomes that were not prespecified
Editors could implement better quality control measures to prevent selective reporting of outcomes
Editors could use external groups of experts that could crosscheck the information of the manuscript with that of the registry or protocol
Any change (in outcomes or analysis) introduced during analysis of the data or the peer review process should be disclosed and defended with arguments in the article
Contributors and sources: JPAI and RD-R wrote the first drafts. ALC suggested further important changes. All authors approved the final version of the manuscript and are accountable for all aspects included in it. JPAI is the guarantor. The authors assume full responsibility for the accuracy and completeness of the ideas presented.
Funding: This work was not supported by any external grants or funds.
Competing interests: All authors have completed the unified competing interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare that they have neither financial nor non-financial interests that may be relevant to the submitted work. AC is the non-voting, non-paid chairperson of the Compassionate Use Advisory Committee, a panel of internationally recognised medical experts, bioethicists, and patient representatives formed by NYU School of Medicine, which advises Janssen about requests for compassionate use of its investigational medicines.
Provenance and peer review: Not commissioned; externally peer reviewed.