
Editorials

Catalogue of errors in papers reporting clinical trials

BMJ 2015; 351 doi: https://doi.org/10.1136/bmj.h4843 (Published 21 September 2015) Cite this as: BMJ 2015;351:h4843
  1. Nick Freemantle, professor of clinical epidemiology and biostatistics,
  2. Greta Rait, reader in primary care
  Department of Primary Care and Population Health, PRIMENT Clinical Trials Unit, UCL Medical School (Royal Free Campus), London NW3 2PF, UK
  Correspondence to: N Freemantle nicholas.freemantle{at}ucl.ac.uk

Errors are linked to retraction, but are an unreliable marker for fraudulent or harmful research

The paper by Cole and colleagues1 examines the association between discrepancies and retractions in clinical trial reports, concluding that discrepancies or errors could be an early signal of unreliability in clinical trials. It is another foray by these authors into the examination of error in reported research. Their previous paper2 3 demonstrated a clear and concerning relation between error and inflated effect size in stem cell research. But is the identification of unreliable, misleading, or even fraudulent research really as simple as counting the number of errors or discrepancies in clinical trial reports? Cole and colleagues describe errors or discrepancies as red flags, but several limitations within their own study indicate that these red flags are an unreliable marker for deeper problems within a piece of research.

The authors undertake a case-control study and make recommendations based on sensitivity and specificity, even though they acknowledge that the predictive value of any such test depends on prevalence and that a case-control design is inappropriate for estimating it.4 There are far more unretracted reports of clinical trials (802 953 papers on PubMed, 6 July 2015) than retracted reports (379 papers on the same date), so the predictive power of an error count for identifying rogue papers will be extremely low. You would have to kiss many princes to find a frog. The authors’ own table of included studies (web appendix 2) illustrates the point. Fifteen (30%) retracted papers (frogs) had fewer than three errors each (the level at which they suggest a red flag should be raised), whereas 17 (34%) unretracted controls (princes) had more than three errors.
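To see just how low that predictive value could be, consider a rough back-of-envelope calculation based only on the counts quoted above. The sensitivity (35/50) and specificity (33/50) used below are our own reading of the appendix figures, and the prevalence uses the PubMed counts from 6 July 2015; this is an illustration of the base rate problem, not a re-analysis of Cole and colleagues’ data.

```python
# Illustrative positive predictive value (PPV) calculation.
# All inputs are taken from the counts quoted in the text above;
# the sensitivity and specificity are our assumed reading of web appendix 2.

retracted_total = 379          # retracted trial reports on PubMed, 6 July 2015
unretracted_total = 802_953    # unretracted trial reports on PubMed, same date

sensitivity = 35 / 50   # assumed: 35/50 retracted papers reached the error threshold
specificity = 33 / 50   # assumed: 33/50 controls stayed below the error threshold

prevalence = retracted_total / (retracted_total + unretracted_total)

true_positives = sensitivity * prevalence
false_positives = (1 - specificity) * (1 - prevalence)
ppv = true_positives / (true_positives + false_positives)

print(f"Prevalence of retraction: {prevalence:.5f}")       # about 0.00047
print(f"PPV of an error-count red flag: {ppv:.4f}")        # about 0.001
```

In other words, even granting the authors’ error counts, only about one in every thousand flagged papers would turn out to be a retracted one.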

Cole and colleagues interpret their outcome (retraction) as if it were always homogeneous and harmful. But retracted papers do more or less harm depending on the reason for retraction. For example, a duplicate publication, while wrong, could paradoxically help a practitioner to spot an important piece of information that they might otherwise miss, whereas fabricated results lead more directly to patient harm if they wrongly suggest that an ineffective treatment works. Retraction cannot be considered “the variable capable of providing the most clinically relevant and convincing evidence.”5 Instead, the authors could have chosen a more specific outcome, such as retraction for misconduct.

The authors anonymise their selected papers but provide National Library of Medicine identification numbers, making it possible, although tedious, for readers to identify the journals that published these papers and to scrutinise their study material. Basic demographics of participants are an essential part of good reporting practice; in the same spirit, the authors could and arguably should have reported the journal of publication and the specialty of each included paper. While reviewing the articles, we thought it important to note, for example, that eight (16%) retracted papers came from the same anaesthetic journal, Anesthesia and Analgesia. On the basis of the journal titles alone, 15 (30%) are in the area of anaesthesia, an over-representation that requires further scrutiny. Indeed, when considering the retracted papers in anaesthesia, we quickly discovered evidence of systematic fraud6 focused on the work of two authors, who between them have had 102 articles retracted.

Cole and colleagues analyse the association between errors and retraction in their case-control study. Five journals contributed multiple papers to the study sample, between them publishing 17 (34%) of the pairs of retracted and control papers, and the sample covered only a limited number of clinical areas. Despite this hierarchical structure in their data, the authors failed to account for the clustering of papers within journals, which is curious given that elsewhere they recognise that papers from the same journal cannot be considered independent.
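For readers who want to see what accounting for such clustering might look like, the sketch below uses entirely hypothetical data and a generalised estimating equation with the journal as the clustering unit. It is one possible approach under our own assumptions, not the method Cole and colleagues used or the only defensible one.

```python
# A minimal sketch of relating error counts to retraction while treating
# papers from the same journal as correlated rather than independent.
# The data frame is hypothetical and exists only to make the code runnable.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

papers = pd.DataFrame({
    "retracted": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],   # 1 = retracted, 0 = control
    "errors":    [5, 1, 4, 3, 0, 2, 6, 1, 7, 2, 3, 0],   # discrepancies per paper
    "journal":   ["A", "A", "A", "B", "B", "B",
                  "C", "C", "C", "D", "D", "D"],          # publishing journal
})

# Generalised estimating equations with an exchangeable working correlation
# allow for within-journal clustering when estimating the error-retraction link.
model = smf.gee(
    "retracted ~ errors",
    groups="journal",
    data=papers,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Exchangeable(),
)
print(model.fit().summary())
```

A conditional (matched pairs) logistic regression, or a multilevel model with a random effect for journal, would be alternative ways of respecting the same structure.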

So what should we make of the study’s findings? The authors identify a relation between errors in clinical trial reports and retraction. Although this association may be real, it is unlikely to be helpful in practice if the main goal is to identify seriously misleading or fraudulent research that could harm patients. Identifying errors could increase our suspicion that a paper has a higher chance of being retracted, and that risk of retraction might reflect misleading or fabricated findings. But it could equally reflect another, perhaps less harmful, problem such as duplication.

A more fundamental question is why the many errors and inconsistencies identified by Cole and colleagues were not picked up by the journal editors or technical editors whose job it is to ensure the validity of the work they publish. Our experience is that many journals manage the publication process very well. It is surprising, therefore, that the so-called “top scoring” retracted paper in Cole and colleagues’ study (published by a BMJ group journal) contained 35 different errors. Almost all of these (n=34) were numerical errors that technical editors might have been expected to identify.

Schafer describes the considerable efforts that editors made to uncover the extent of research fraud in the work of authors in anaesthesia after errors were highlighted by readers,6 and recognises the duty of journals to ensure high quality research. But he likens the growth of human knowledge to the weaving of a tapestry, and the sudden loss of retracted articles to the ripping of a thread. Many papers are published, but fortunately few are fabricated or irrevocably flawed. We must expect both benign and malicious errors to occur in reports of research and devise ways to identify and correct errors of substance during the research publication process. However, Cole and colleagues’ recommendation that a red flag be raised over clinical trial reports containing more than three errors is premature and not well supported by the evidence in their paper.


Footnotes

  • Research, doi:10.1136/bmj.h4708
  • Competing interests: We have read and understood the BMJ Group policy on declaration of interests and declare the following interests: NF is paid to advise Sanofi, Novo Nordisk, and Ipsen on trial methodology and outcomes research. GR declares no conflicts of interest.

  • Provenance and peer review: Commissioned, not externally peer reviewed.
