Reasons or excuses for avoiding meta-analysis in forest plots
BMJ 2008;336:1413 doi: https://doi.org/10.1136/bmj.a117 (Published 19 June 2008)
- John P A Ioannidis, professor1,
- Nikolaos A Patsopoulos, research fellow1,
- Hannah R Rothstein, professor2
- 1Clinical Trials and Evidence Based Medicine Unit, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine and Biomedical Research Institute, Foundation for Research and Technology-Hellas, Ioannina 45110, Greece
- 2Department of Management, Zicklin School of Business, City University of New York, New York, NY 10010, USA
- Correspondence to: J P A Ioannidis
- Accepted 30 March 2008
Some systematic reviews simply assemble the eligible studies without performing meta-analysis. This may be a legitimate choice. However, an interesting situation arises when reviews present forest plots (quantitative effects and uncertainty per study) but do not calculate a summary estimate (the diamond at the bottom). These reviews imply that it is important to visualise the quantitative data but that final synthesis is inappropriate. For example, a review of sexual abstinence programmes for HIV prevention claimed that owing to “data unavailability, lack of intention-to-treat analyses, and heterogeneity in programme and trial designs… a statistical meta-analysis would be inappropriate.”1 As we discuss, options almost always exist for quantitative synthesis, and sometimes they may offer useful insights. Reviewers and clinicians should be aware of these options, reflect carefully on their use, and understand their limitations.
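To make the summary estimate (the diamond) concrete, a minimal sketch of standard inverse-variance pooling is shown below. The per-study log odds ratios and standard errors are hypothetical illustrative numbers, not data from any review discussed here; the method (fixed-effect inverse-variance weighting) is the conventional one, but real meta-analyses would also consider random-effects models.

```python
import math

# Hypothetical per-study effects, as one might read them off a forest plot:
# log odds ratios and their standard errors (illustrative numbers only).
log_or = [-0.40, -0.10, -0.55, 0.05]
se = [0.20, 0.25, 0.30, 0.22]

# Fixed-effect (inverse-variance) pooling: weight each study by 1/SE^2.
weights = [1.0 / s ** 2 for s in se]
pooled = sum(w * y for w, y in zip(weights, log_or)) / sum(weights)
pooled_se = math.sqrt(1.0 / sum(weights))

# 95% confidence interval for the summary estimate (the "diamond").
lo = pooled - 1.96 * pooled_se
hi = pooled + 1.96 * pooled_se
print(f"summary log OR = {pooled:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

The point of the sketch is that, given any set of per-study estimates and standard errors plotted in a forest plot, a summary estimate is always computable; whether it is meaningful is the separate judgment the article addresses.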
Why meta-analysis is avoided
Of the 1739 systematic reviews in issue 4 of the Cochrane Database of Systematic Reviews (2005) that included at least one forest plot with at least two studies, 135 (8%) had 559 forest plots with no summary estimate.
The reasons provided for avoiding quantitative synthesis typically revolved around heterogeneity (table 1). The included studies were thought to be too different, either statistically or in clinical (including methodological) terms. Differences in interventions, metrics, outcomes, designs, participants, and settings were implied.
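Statistical heterogeneity, the commonest stated reason, is itself quantifiable. A minimal sketch of the standard measures, Cochran's Q and the I² statistic, follows; the per-study effects and standard errors are hypothetical illustrative numbers, not data from the reviews surveyed.

```python
import math

# Hypothetical per-study effects and standard errors (illustrative only).
effects = [-0.40, -0.10, -0.55, 0.05]
se = [0.20, 0.25, 0.30, 0.22]

# Fixed-effect summary, needed as the reference point for Q.
w = [1.0 / s ** 2 for s in se]
pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

# Cochran's Q: weighted squared deviations of each study from the summary.
q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
df = len(effects) - 1

# I^2: the proportion of total variability beyond chance, truncated at 0.
i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.0f}%")
```

Such measures put a number on "too different" statistically, although, as the next section notes, they say nothing about clinical heterogeneity, which remains a matter of judgment.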
How much heterogeneity is too much?
This question of lumping versus splitting is difficult to answer objectively for clinical heterogeneity. Logic models based on the PICO (population, intervention, comparator, outcomes) framework may help with the challenge of deciding what to include and what to exclude. Still, different reviewers, readers, …