
Analysis

The need to consider the wider agenda in systematic reviews and meta-analyses: breadth, timing, and depth of the evidence

BMJ 2010;341:c4875. doi: https://doi.org/10.1136/bmj.c4875 (Published 13 September 2010)
  1. John P A Ioannidis, professor and chairman [1]
  2. Fotini B Karassa, lecturer in rheumatology [2]

  [1] Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
  [2] Division of Rheumatology, Department of Internal Medicine, University of Ioannina School of Medicine, Ioannina, Greece

  Correspondence to: J P A Ioannidis, jioannid{at}cc.uoi.gr
  Accepted 21 July 2010

As well as focusing on a precise question, systematic reviewers need to consider the whole research programme for the interventions under study, argue John Ioannidis and Fotini Karassa

The problem: a wide research programme

A powerful company develops a promising new blockbuster. The more diseases and conditions the drug can get approved for, the greater the sales. The company therefore launches trials for many different indications as its clinical research programme unfolds. Independent committees set interim analyses and appropriate stopping rules for these trials, to avoid harming people on placebo for too long if the drug proves effective. Some trials then start showing statistically significant benefits, so they are stopped early and the drug gets approved for those indications. Suppose all findings and all results are reported (that is, no reporting bias [1] operates). This sounds like the ideal success of honest drug development and clinical investigation. However, it can be shown that the drug is less effective than these trials suggest, and sometimes not effective at all. Why?

Explanations for the problem

Two reasons explain this paradox: firstly, the drug is tested for many indications; secondly, the first trials have been stopped early. The first reason refers to the breadth of the evidence. The second refers to the timing and depth of the evidence.

Firstly, multiplicity of analyses [2]: if we test a totally ineffective drug for 20 indications, by chance it is likely to show a significant effect (P<0.05) for at least one of them. If we test 10 different independent outcomes in each indication, then we expect at least one outcome to be statistically significant for almost half of the indications, even without any reporting bias [3].
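
The arithmetic behind these figures can be checked with a short calculation. The sketch below is illustrative only: it assumes that every test is independent and that each has exactly a 5% chance of a false positive result when the drug is truly ineffective. The function name is hypothetical and does not come from the cited work.

```python
def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n_tests independent tests is
    'significant' at level alpha when no true effect exists."""
    # Each test stays non-significant with probability (1 - alpha);
    # under independence, all n_tests do so with probability (1 - alpha) ** n_tests.
    return 1.0 - (1.0 - alpha) ** n_tests


# One test per indication, 20 indications (assumed independent).
p_any_of_20_indications = prob_at_least_one_false_positive(20)

# Ten independent outcomes tested within a single indication.
p_any_of_10_outcomes = prob_at_least_one_false_positive(10)

# Expected number of the 20 indications with at least one significant outcome.
expected_indications = 20 * p_any_of_10_outcomes

print(round(p_any_of_20_indications, 2))  # about 0.64
print(round(p_any_of_10_outcomes, 2))     # about 0.40
print(round(expected_indications, 1))     # about 8.0
```

Under these assumptions, an ineffective drug has roughly a 64% chance of a significant result in at least one of 20 indications, and about 40% of indications would show at least one significant outcome among 10. Correlated outcomes, which are common in practice, would change these numbers.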

Secondly, early stopping: trials stopped early because of perceived effectiveness give inflated estimates of the treatment effect [4, 5, 6]. An empirical investigation of 91 early stopped trials showed that on …
