Misleading meta-analysis

BMJ 1995;310:752 (Published 25 March 1995)
  1. Matthias Egger,
  2. George Davey Smith
  1. Senior research fellow Department of Social and Preventive Medicine, University of Berne, CH-3012 Berne, Switzerland
  2. Professor of clinical epidemiology Department of Social Medicine, University of Bristol, Bristol BS8 2PA

    Lessons from “an effective, safe, simple” intervention that wasn't

    A meta-analysis of treatments in myocardial infarction published in 1992 retrospectively showed that streptokinase was associated with a highly significant fall in mortality by 1977, after inclusion of 15 trials.1 Thrombolysis was, however, not widely recommended until 10 years later, after the effect was confirmed in two mega trials.1 2 3 In the case of magnesium, a substantial fall in mortality was evident by 1990, after inclusion of seven trials. In 1993, on the basis of an updated meta-analysis, it was argued that magnesium treatment represented an “effective, safe, simple and inexpensive” intervention that should be introduced into clinical practice without further delay.4 The negative results of ISIS 4 (the fourth international study of infarct survival), published in last week's Lancet,5 have dealt a blow to enthusiasm for both magnesium and meta-analysis.6 As the findings of meta-analyses and systematic reviews are generally not tested in mega trials, the situation regarding magnesium represents an opportunity to examine a false positive meta-analysis.

    The table compares the meta-analyses of trials of magnesium and streptokinase after myocardial infarction. Trials were cumulatively included until the treatment effect was significant at P<0.001. For magnesium, seven small trials whose results were published in the 1980s were sufficient to establish the effect. Although trials were larger in the case of fibrinolytic treatment, twice as many studies and two decades were necessary to reach the same level of significance. Until recently it could have been argued that this was due to the larger effect apparently associated with magnesium treatment, which should be detectable in a smaller number of trials. In the light of ISIS 4, however, another explanation must exist.
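The cumulative procedure described above can be sketched in a few lines. This is a minimal illustration, assuming a fixed effect (inverse variance) pooling of log odds ratios and a two sided normal test; the original analyses may have used a different pooling method. The function names and the trial counts are hypothetical and are not taken from the magnesium or streptokinase trials.

```python
import math

def log_odds_ratio(deaths_t, n_t, deaths_c, n_c):
    """Log odds ratio and its variance for one trial (assumes no zero cells)."""
    a, b = deaths_t, n_t - deaths_t      # treated: deaths, survivors
    c, d = deaths_c, n_c - deaths_c      # control: deaths, survivors
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance

def cumulative_meta_analysis(trials, alpha=0.001):
    """Add trials in the given order, pooling after each one.

    Returns a list of (number of trials, pooled odds ratio, two sided P),
    stopping at the first step where P falls below alpha.
    """
    sum_w = 0.0
    sum_wy = 0.0
    steps = []
    for k, trial in enumerate(trials, start=1):
        y, v = log_odds_ratio(*trial)
        w = 1 / v                        # inverse variance weight
        sum_w += w
        sum_wy += w * y
        pooled = sum_wy / sum_w
        z = pooled / math.sqrt(1 / sum_w)
        p = math.erfc(abs(z) / math.sqrt(2))   # two sided P from the normal distribution
        steps.append((k, math.exp(pooled), p))
        if p < alpha:
            break
    return steps

# HYPOTHETICAL trials in order of publication:
# (deaths treated, n treated, deaths control, n control)
trials = [
    (2, 40, 6, 40),
    (3, 60, 9, 60),
    (4, 80, 11, 80),
    (5, 100, 14, 100),
    (6, 120, 15, 120),
]

for k, pooled_or, p in cumulative_meta_analysis(trials):
    print(f"after {k} trials: pooled OR {pooled_or:.2f}, P = {p:.4f}")
```

With small trials that all point the same way, the threshold is crossed after only a handful of studies, which is the pattern the magnesium meta-analysis showed.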

    Could selective identification of positive studies have led to this finding? Trials that support a beneficial effect are cited more frequently than unsupportive trials and are thus more likely to be located for meta-analysis.7 8 We have addressed this hypothesis by hand searching relevant specialist journals and by extending the search to the literature in languages other than English. This has yielded another five small trials9 10 11 12 13; however, two of them showed a significant (P<0.05) reduction in total mortality9 13 and the three others a non-significant trend in the same direction.

    Comparison of two meta-analyses—of intravenous magnesium and streptokinase for acute myocardial infarction—refuted and confirmed by subsequent large randomised controlled trials

    Publication bias is another possibility. Small positive trials are more likely to be published than negative ones, potentially distorting the findings of meta-analyses. If publication bias is operating, one would thus expect that, of published studies, the larger ones report the smaller effects. This can be examined in funnel plots, in which the estimates of effect size obtained in the studies are plotted against the sample size. If there is no publication bias, the plot should resemble a symmetrical inverted funnel, with the results of smaller studies being more widely scattered than those of larger studies. The figure shows the funnel plots for the magnesium and streptokinase trials that appeared before the relevant mega trials, with the mega trials added. The plot for the streptokinase trials is symmetrical, and the pooled estimate is in line with the results of the mega trials, GISSI (Gruppo Italiano per lo Studio della Streptochinasi nell'Infarto Miocardico) and ISIS 2.2 3

    This is clearly not the case for magnesium. The pooled estimate is at odds with the results of ISIS 4, and there is a gap in the bottom right of the funnel, which indicates the absence of negative small studies. Selective non-publication of negative trials thus seems to be a likely explanation for the discrepant findings of the magnesium meta-analysis. The possibility that negative trial results were turned into positive results by selective exclusion of patients from the analysis or other inadequate handling of data must also be considered. For example, a significantly (P<0.05) reduced mortality from cardiac causes was initially reported for the group treated with magnesium in one trial (A M Thogersen et al, VIth international magnesium symposium, Indore, India, 1991); when the results were later analysed on an intention to treat basis, however, this difference became non-significant.12 Furthermore, 16 deaths due to non-cardiac causes were reported during nine months of follow up,12 but only eight such deaths were mentioned in a later paper covering 22 months of follow up.14

    Such biases are probably less likely to act in larger, well monitored trials and could thus produce the same asymmetrical pattern in funnel plots. It remains unclear to what extent publication bias and inadequate handling of data and analysis have contributed to this situation. Finally, it should be kept in mind that, with hundreds of meta-analyses being performed, a few will produce misleading results by chance alone—though this is unlikely in the present case. Indeed, the situation regarding magnesium is not unique. Reviews of the use of nitrates in myocardial infarction15 and of aspirin in the prevention of pre-eclampsia16 are further examples of meta-analyses that were based on small trials and whose positive results were later substantially modified by larger trials.5 17
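The funnel plot check discussed above can be approximated numerically. The sketch below computes the coordinates of a funnel plot (log odds ratio against total sample size) and then applies a deliberately crude indicator of asymmetry: it compares the pooled effect in the smaller half of the trials with that in the larger half. This split is an assumption made for illustration and is not a formal test of publication bias. All trial counts and function names are hypothetical.

```python
import math

def log_odds_ratio(deaths_t, n_t, deaths_c, n_c):
    """Log odds ratio and its variance for one trial (assumes no zero cells)."""
    a, b = deaths_t, n_t - deaths_t
    c, d = deaths_c, n_c - deaths_c
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def pooled_log_or(trials):
    """Fixed effect (inverse variance) pooled log odds ratio."""
    num = den = 0.0
    for trial in trials:
        y, v = log_odds_ratio(*trial)
        num += y / v
        den += 1 / v
    return num / den

def funnel_points(trials):
    """(log odds ratio, total sample size) pairs, one per trial."""
    return [(log_odds_ratio(*t)[0], t[1] + t[3]) for t in trials]

def small_vs_large(trials):
    """Pooled log odds ratio in the smaller and the larger half of the trials."""
    ordered = sorted(trials, key=lambda t: t[1] + t[3])
    half = len(ordered) // 2
    return pooled_log_or(ordered[:half]), pooled_log_or(ordered[half:])

# HYPOTHETICAL trials: three small ones with large apparent benefit,
# three large ones with little or no effect.
trials = [
    (2, 40, 6, 40),
    (3, 60, 9, 60),
    (4, 80, 11, 80),
    (95, 1000, 100, 1000),
    (480, 5000, 490, 5000),
    (2200, 29000, 2100, 29000),
]

for log_or, n in funnel_points(trials):
    print(f"n = {n:6d}  log OR = {log_or:+.2f}")

small, large = small_vs_large(trials)
print(f"pooled log OR, small trials: {small:+.2f}; large trials: {large:+.2f}")
```

A marked difference between the two pooled estimates, with the small trials showing the larger benefit, corresponds to the gap in the funnel that the text attributes to missing negative small studies.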

    Evidence from mega trials will continue to be unavailable for most medical interventions, and in these situations systematic reviews that are based on meta-analyses of randomised controlled trials are clearly the best strategy for appraising the available evidence.18 But to avoid this strategy becoming discredited several steps should be taken. Firstly, more research into the factors associated with misleading meta-analysis is needed. This research should focus on the process of identifying and selecting studies and on the refinement of methods to scrutinise results and should lead to a better estimate of the incidence of the problem. Secondly, registers of clinical trials should be established, with new studies being documented at inception. This is the most effective way of reducing the risk of negative trials disappearing from view. To ensure complete registration, ethics committees should link their approval to the requirement that trials are registered.19

    Thirdly, in the meantime results of meta-analyses that are exclusively based on small trials should be distrusted—even if the combined effect is statistically highly significant. Several medium sized trials of high quality seem necessary to render results trustworthy.

    Fourthly, the results of meta-analyses should always be subjected to careful sensitivity analyses to test the robustness of the findings. For example, the use of β blockers in secondary prevention after myocardial infarction is widely recommended, largely on the basis of a meta-analysis published in 1985.1 20 The results of this meta-analysis are robust to the choice of the statistical methods used for combining the data and to the exclusion of trials of lesser quality or of studies terminated early. A symmetrical funnel plot suggests that publication bias did not distort the findings.
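A sensitivity analysis of the kind described above might look as follows. This is a sketch under stated assumptions: the pooled odds ratio is recomputed with a fixed effect model and with a DerSimonian and Laird random effects model, first for all trials and then after excluding trials flagged as being of lesser quality. The choice of these two models, the quality flag, the function names, and the trial counts are all hypothetical and do not reproduce the 1985 β blocker analysis.

```python
import math

def log_odds_ratio(deaths_t, n_t, deaths_c, n_c):
    """Log odds ratio and its variance for one trial (assumes no zero cells)."""
    a, b = deaths_t, n_t - deaths_t
    c, d = deaths_c, n_c - deaths_c
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def pool(effects, method="fixed"):
    """Pooled log odds ratio from a list of (log odds ratio, variance) pairs."""
    weights = [1 / v for _, v in effects]
    mu = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    if method == "fixed" or len(effects) < 2:
        return mu
    # DerSimonian and Laird estimate of the between trial variance tau^2
    q = sum(w * (y - mu) ** 2 for w, (y, _) in zip(weights, effects))
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    w_star = [1 / (v + tau2) for _, v in effects]
    return sum(w * y for w, (y, _) in zip(w_star, effects)) / sum(w_star)

def sensitivity_analysis(trials):
    """Pooled odds ratio under each combination of trial subset and model."""
    subsets = {
        "all trials": trials,
        "high quality only": [t for t in trials if t[4]],
    }
    results = {}
    for label, subset in subsets.items():
        effects = [log_odds_ratio(*t[:4]) for t in subset]
        for method in ("fixed", "random"):
            results[f"{label}, {method}"] = math.exp(pool(effects, method))
    return results

# HYPOTHETICAL trials:
# (deaths treated, n treated, deaths control, n control, high quality?)
trials = [
    (30, 400, 40, 400, True),
    (55, 800, 70, 800, True),
    (12, 150, 20, 150, False),
    (90, 1500, 115, 1500, True),
    (8, 100, 9, 100, False),
]

results = sensitivity_analysis(trials)
for label, pooled_or in results.items():
    print(f"{label}: pooled OR {pooled_or:.2f}")
```

If the pooled estimates stay close to one another across models and subsets, the finding can be called robust in the sense used above; large swings would be a warning sign.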

    Such sensitivity analysis should be part of any article reporting the results of meta-analyses21; it could, in fact, have prevented the misleading conclusions drawn from the magnesium trials.22 Therefore, finally, meta-analyses and systematic reviews published in print or electronically should be scrutinised carefully. Analyses based exclusively on small studies should be treated with caution.

