BMJ No 7122 Volume 315 Education and debate Saturday 13 December 1997
Meta-analysis: Beyond the grand mean?
George Davey Smith, Matthias Egger, Andrew N Phillips
This is the third in a series of six articles examining the procedures in conducting reliable meta-analysis in medical research
In the previous two articles(1,2) we outlined the potentials and principles of meta-analysis and the practical steps in performing a meta-analysis. Now we will examine how to use meta-analysis to do more than simply combine the results from all the individual trials into a single effect estimate. Firstly, we discuss the advantages and disadvantages of performing subgroup analyses. Secondly, we consider the situation in which the differences in effects between individual trials are related in a graded way to an underlying phenomenon, such as the degree of mortality risk of the trial participants.
Subgroup analysis
The main aim of a meta-analysis is to produce an estimate of the average effect seen in trials of a particular treatment. The direction and magnitude of this average effect is intended to guide decisions about clinical practice for a wide range of patients. Clinicians are thus being asked to treat their patients as though each one is well represented by the patients in the clinical trials included in the meta-analysis. This runs against doctors' concerns to use the specific characteristics of a patient to tailor that patient's management.(3) Indeed, the effect of a given treatment is unlikely to be identical across different groups of patients - for example, young people versus elderly people, those with mild disease versus those with severe disease. It may therefore seem reasonable to base treatment decisions on the results of the trials that have included participants with similar characteristics to the patient under consideration rather than on the overall evidence as provided by meta-analysis. Decisions based on subgroup analyses, however, are often misleading. Consider, for example, a doctor in Germany being confronted by the meta-analysis of long term β blockade after myocardial infarction (see previous article(2)). Although a robust beneficial effect is seen in the overall analysis, in the only trial that recruited a substantial proportion of German patients (trial N in previous article),(4) there was, if anything, a detrimental effect associated with β blockers. Should the doctor give β blockers to German patients who have had an infarction? Common sense suggests that being German does not prevent a patient from obtaining benefit from β blockade. Thus the best estimate of the outcome for German patients may come through discounting the trial carried out in German patients. This may seem paradoxical; indeed the statistical expression of this phenomenon is known as Stein's paradox (box).(5)
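The idea behind Stein's paradox, that an individual estimate can be improved by pulling it towards the average of the others, can be sketched in a few lines. The following is only an illustration under stated assumptions: it uses the positive-part James-Stein form that shrinks towards the grand mean, it assumes every trial estimate has the same known variance, and the log odds ratios are hypothetical numbers, not the results of the β blocker trials.

```python
def shrink_towards_mean(estimates, variance):
    """Positive-part James-Stein shrinkage towards the grand mean.

    Assumes (for illustration only) that every estimate has the same
    known sampling variance. Needs at least four estimates.
    """
    k = len(estimates)
    if k < 4:
        raise ValueError("shrinkage towards the mean needs at least four estimates")
    mean = sum(estimates) / k
    spread = sum((y - mean) ** 2 for y in estimates)
    if spread == 0:
        return list(estimates)
    # The factor lies between 0 (replace by the mean) and 1 (no shrinkage).
    factor = max(0.0, 1.0 - (k - 3) * variance / spread)
    return [mean + factor * (y - mean) for y in estimates]


# Hypothetical log odds ratios; the last trial shows apparent harm.
trial_log_odds_ratios = [-0.5, -0.3, -0.2, -0.1, 0.2]
assumed_variance = 0.04  # hypothetical common sampling variance

for raw, shrunk in zip(trial_log_odds_ratios,
                       shrink_towards_mean(trial_log_odds_ratios, assumed_variance)):
    print(f"raw {raw:+.2f} -> shrunk {shrunk:+.3f}")
```

In this sketch the outlying trial is moved towards the average of all trials, which mirrors the argument that the evidence from the other trials should temper the result of the single discordant one.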
Making decisions between overall effects and particular results is not just a problem created by meta-analysis; it also applies to the interpretation of individual clinical trials.(6) Authors of trial reports often spend more time discussing the results seen in subgroups of patients included in the trial than the overall results. Yet frequently the findings of these subgroup analyses fail to be confirmed by later research. The various trials of β blockade after myocardial infarction yielded several subgroup findings with apparent clinical significance.(7) Treatment was said to be beneficial in patients aged under 65 but harmful in older patients, or only beneficial in patients with anterior myocardial infarction. When examined in subsequent studies or in a formal pooling project(8) these findings received no support.(7) It can be shown that if an overall treatment effect is significant at the 5% level (P<0.05) and the patients are divided at random into two similarly sized groups then there is a 1 in 3 chance that the treatment effect will be large and highly significant in one group but irrelevant and non-significant in the other.(9) Which subgroup "clearly" benefits from an intervention is thus often a chance phenomenon, inundating the literature with contradictory findings from subgroup analyses and wrongly inducing clinicians to withhold treatments from some patients.(10-12)
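How easily a random split produces discordant subgroups can be checked with a small simulation. The sketch below is not a reconstruction of the calculation behind the quoted 1 in 3 figure, whose exact thresholds are not given here. It assumes an overall z statistic of exactly 1.96, two equally sized subgroups with independent, normally distributed z statistics, no true difference between the subgroups, and a conventional 5% threshold in each subgroup.

```python
import math
import random


def subgroup_split_probability(z_overall=1.96, z_threshold=1.96,
                               n_sim=100_000, seed=0):
    """Estimate how often a random 50:50 split of a trial gives one
    'significant' and one 'non-significant' subgroup.

    Assumption (not from the article): the subgroup z statistics z1 and z2
    are independent unit-variance normals with the same mean, and they
    combine to the overall result as z_overall = (z1 + z2) / sqrt(2).
    """
    rng = random.Random(seed)
    discordant = 0
    for _ in range(n_sim):
        # Given the overall z, the contrast (z1 - z2) / sqrt(2) is standard normal.
        d = rng.gauss(0.0, 1.0)
        z1 = (z_overall + d) / math.sqrt(2)
        z2 = (z_overall - d) / math.sqrt(2)
        if (abs(z1) > z_threshold) != (abs(z2) > z_threshold):
            discordant += 1
    return discordant / n_sim


print(subgroup_split_probability())
```

Under these assumptions a substantial fraction of purely random splits yields one "significant" and one "non-significant" subgroup; raising `z_threshold` shows how the fraction depends on what is counted as highly significant.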
The difference between the two classes of β blockers was significant (P<0.01). Since then, however, a trial was published showing a particularly strong beneficial effect of acebutolol, an agent with intrinsic sympathomimetic activity,(16) whereas another trial using metoprolol, a β blocker without intrinsic sympathomimetic activity, was essentially negative.(17) This illustrates that, far from aiding clinicians, post hoc subgroup analyses may confuse and mislead. A more reliable way of assessing differences in treatment effects is to relate outcome to some underlying patient characteristic on a continuous, or ordered, scale.(18,19)
Meta-regression: examining gradients in treatment effects
The clinical trials included in a meta-analysis often differ in a way that would be expected to modify the outcome. In trials of cholesterol reduction the degree of cholesterol lowering attained differs markedly between studies, and the reduction in mortality from coronary heart disease is greater in the trials in which larger reductions in cholesterol are achieved.(18,20) Such graded associations are not limited to situations where greater benefits would be expected consequent on greater changes in a risk factor. In the case of thrombolysis after acute myocardial infarction, the greater the delay in treatment, the smaller the benefit of thrombolysis.(21,22) Here, the graded association is seen between the outcome and a characteristic of the treatment used. Such a gradient allows for a more powerful examination of differences in outcomes, as a statistical test for trend can be performed, rather than the less powerful test for evidence of global heterogeneity. Other attributes of study groups - such as age and length of follow up - can readily be analysed in this way. As discussed later in this series,(23) such analyses will often require data on individual patients rather than published summary statistics.
Risk stratification
A factor that is often related to a given treatment effect is the underlying risk of occurrence of the event that the treatment aims to prevent. It makes intuitive sense that patients at high risk are more likely to benefit than those at low risk. In the case of trials of cholesterol lowering, for example, the patient groups have ranged from survivors of heart attack with gross hypercholesterolaemia to groups of healthy asymptomatic people with moderately raised cholesterol concentrations. The death rates from coronary heart disease in the first group have been up to 100 times higher than the death rates in the second group.
The outcome of treatment in terms of all cause mortality has been more favourable in the trials recruiting participants at high risk than in the trials recruiting participants at relatively low risk.(18) Two factors contribute to this. Firstly, among the high risk participants, the great majority of deaths will be from coronary heart disease, the risk of which is reduced by cholesterol reduction. A 30% reduction in mortality from coronary heart disease therefore translates into a near equivalent reduction in total mortality. In the low risk participants, on the other hand, a much smaller proportion - about 40% - of deaths will be from coronary heart disease. In this case a 30% reduction in mortality from coronary heart disease would translate into a much smaller - about 10% - reduction in all cause mortality. Secondly, if there is any detrimental effect of treatment it may easily outweigh the benefits of cholesterol reduction in the low risk group, whereas in high risk patients, among whom a substantial benefit is achieved from cholesterol reduction, this will not be the case. In a recent meta-analysis of cholesterol lowering trials this situation was evident for trials using fibrates but not for trials using other drugs.(24)
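The arithmetic behind these two factors can be written out as a short sketch. It assumes that the relative reduction in all cause mortality is the share of deaths due to coronary heart disease multiplied by the relative reduction in coronary deaths, minus any relative excess in the remaining deaths. The 90% share used for the high risk group and the 25% excess in other deaths are hypothetical values chosen only to illustrate the direction of the effect.

```python
def all_cause_reduction(chd_share_of_deaths, chd_mortality_reduction,
                        excess_in_other_deaths=0.0):
    """Relative reduction in all cause mortality (negative means net harm).

    chd_share_of_deaths: proportion of all deaths due to coronary heart disease
    chd_mortality_reduction: relative reduction in coronary deaths (0.30 = 30%)
    excess_in_other_deaths: hypothetical relative increase in non-coronary deaths
    """
    benefit = chd_share_of_deaths * chd_mortality_reduction
    harm = (1.0 - chd_share_of_deaths) * excess_in_other_deaths
    return benefit - harm


# Low risk group from the text: about 40% of deaths are coronary.
print(all_cause_reduction(0.40, 0.30))        # about 0.12, i.e. roughly 10%
# High risk group: 90% is a hypothetical stand-in for "the great majority".
print(all_cause_reduction(0.90, 0.30))        # about 0.27
# With a hypothetical 25% excess in non-coronary deaths:
print(all_cause_reduction(0.40, 0.30, 0.25))  # net harm in the low risk group
print(all_cause_reduction(0.90, 0.30, 0.25))  # still a clear net benefit
```

The same adverse effect that leaves a clear net benefit in the high risk group is enough to tip the low risk group into net harm, which is the second point made above.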
When outcomes are very different in groups at different levels of risk it is inappropriate to perform a meta-analysis in which an overall estimate of the effect of treatment is calculated. In the zidovudine trials, for example, an overall effect estimate from all eight trials (odds ratio 0.96; 95% confidence interval 0.75 to 1.22) is very different from that seen in the only trial among patients with AIDS (0.04; 0.01 to 0.33). If there had been more trials among patients with AIDS the overall effect would seem highly beneficial. Conversely, if there had been more large trials among asymptomatic patients the confidence limits around the overall effect estimate would exclude any useful benefit, which would be misleading if applied to patients with AIDS.
Problems in risk stratification
When many trials have been conducted in a particular field, risk stratification can be performed at the level of individual trials. This was carried out in the case of cholesterol lowering, with mortality from coronary heart disease in the control arm of the trials as the stratification variable.(18) This stratification is of clinical use, as this is the risk of death from coronary heart disease in patients without treatment - that is, the risk level that clinicians want to use for deciding whether patients will benefit from therapeutic cholesterol lowering. The analysis can also use risk of death in the control group as a continuous variable, through the examination of the interaction between treatment effect and risk in a logistic regression analysis. A significant statistical test for interaction suggests that there is a real difference in outcome at different levels of risk.
The resulting regression line intersects with the "null effect" line at a rate of 6 per 1,000 person years in the control group (fig 3 (top)). This was interpreted as showing "that drug treatment for mild to moderate hypertension has no effect on, or may even increase, all cause mortality in middle aged patients."(32) In other words, antihypertensive treatment was considered to be beneficial only in patients at relatively high risk of death. This interpretation, however, is misleading because it ignores the influence of random fluctuations on the slope of the regression line.(29) If, owing to finite sample sizes, mortality in a control group is by chance particularly low then mortality in the treatment group will, on average, seem high. Conversely, if mortality among controls is by chance high then mortality in the treatment group will seem low. The effect of random error will thus rotate the regression line around a pivot, making it cross the line of identity on the right hand side of the origin. This phenomenon, a manifestation of regression to the mean,(30) can be illustrated in computer simulations. Using the same rates in the control group and assuming a constant reduction of all cause mortality of 10% in treated groups (relative risk 0.9), we considered the situation both assuming no random fluctuations in rates and allowing random error (fig 3 (bottom)).(29) After we added error (by sampling 1,000 times from the corresponding Poisson distribution) the regression line rotated and crossed the no effect line. Indeed, the intersection is at almost the same point as that found in the earlier meta-analysis - namely, at a mortality in the control group of about 6 per 1,000 person years. It is thus quite possible that what was interpreted as reflecting detrimental effects of antihypertensive treatment(32) was in fact produced by random variation in event rates.
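A simulation of this kind can be sketched as follows. This is not the original analysis: the control group rates, the person years per arm, and the choice to regress the observed treated rate on the observed control rate by ordinary least squares are all assumptions made for illustration. Only the constant relative risk of 0.9, the Poisson sampling, and the 1,000 repetitions follow the description above.

```python
import math
import random

# Hypothetical true control group death rates (per 1,000 person years)
TRUE_CONTROL_RATES = [4, 6, 8, 10, 12, 14, 16, 18, 20, 22]
PERSON_YEARS = 5000     # hypothetical follow up per arm in each trial
RELATIVE_RISK = 0.9     # constant 10% reduction in every trial


def poisson_draw(rng, mean):
    """Poisson sample by Knuth's multiplication method (fine for modest means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def fit_line(x, y):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def simulate(n_sim=1000, seed=1):
    """Average slope and intercept of treated rate on control rate when both
    rates carry Poisson error, plus the control rate at which the averaged
    line crosses the line of identity (treated rate equals control rate)."""
    rng = random.Random(seed)
    slopes, intercepts = [], []
    for _ in range(n_sim):
        control, treated = [], []
        for rate in TRUE_CONTROL_RATES:
            expected_control = rate * PERSON_YEARS / 1000
            expected_treated = RELATIVE_RISK * expected_control
            control.append(poisson_draw(rng, expected_control) * 1000 / PERSON_YEARS)
            treated.append(poisson_draw(rng, expected_treated) * 1000 / PERSON_YEARS)
        slope, intercept = fit_line(control, treated)
        slopes.append(slope)
        intercepts.append(intercept)
    mean_slope = sum(slopes) / n_sim
    mean_intercept = sum(intercepts) / n_sim
    return mean_slope, mean_intercept, mean_intercept / (1 - mean_slope)


# Without random error the line has slope 0.9 and passes through the origin.
print(fit_line(TRUE_CONTROL_RATES, [RELATIVE_RISK * r for r in TRUE_CONTROL_RATES]))
# With random error the line flattens and crosses the line of identity.
print(simulate())
```

Although every simulated trial has the same true 10% benefit, the fitted line with error has a slope below 0.9 and a positive intercept, so it crosses the line of identity at a positive control group rate and wrongly suggests harm in low risk trials.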
When mortality in the control groups varies greatly or when trials are large, the chance fluctuations that produce such spurious associations will be less important. Alternatively, the analysis can be performed using the overall mortality in the control and treatment arms of the trials as the risk indicator.(18) This will generally, but not always, lead to bias in the opposite direction, diluting any real association between level of risk and treatment effect.(30) Use of event rates from either the control group or overall trial participants as the stratifying variable when relating treatment effect to level of risk is thus problematic.(29,30) Although some, more complex, statistical methods are less susceptible to these biases,(31,34) it is preferable to use indicators of risk that are not based on outcome measures. In the case of the effect of angiotensin converting enzyme inhibitors on mortality in patients with heart failure, use of risk in the control group showed greater relative and absolute benefit in trials recruiting higher risk participants.(25) In a meta-analysis, data were available on treatment effects according to clinical indicators within strata from many of the trials.(35) Twenty nine per cent of patients with an ejection fraction of 0.25 and under at entry died during the trials, compared with 17% of patients with an ejection fraction of >0.25. A substantial reduction in mortality (odds ratio 0.69; 95% confidence interval 0.57 to 0.85) was seen in the first, higher risk group, whereas little effect on mortality was seen in the second, lower risk group (0.98; 0.79 to 1.23). A similar difference was seen if a combined end point of mortality or admission to hospital for congestive heart failure was used as the outcome measure.
Confounding
That randomised controlled trials are included in meta-analyses does not mean that comparisons made between trials are randomised comparisons.
When outcomes are related to characteristics of the trial participants, to differences in treatments used in the separate trials, or to the situations in which treatments were given, the associations seen are subject to the potential biases of observational studies. Confounding could exist between one trial characteristic - say, drug trials versus diet trials in the case of cholesterol lowering - and another characteristic, such as level of risk of the participants in the trial. In many cases there are simply too few trials, or differences in the average characteristics of participants in the trials are too small, for a stratified analysis to be performed at the level of the individual trial. It may be possible to consider strata within the trials - for example, male versus female, or those with or without existing disease - to increase the number of observations to be included in the regression analysis. Increasing the number of data points in this way is of little help if there are strong associations between the factors under consideration. For example, in a meta-regression analysis of total mortality outcomes of cholesterol lowering trials various factors seem to influence the outcome: greater cholesterol reduction leads to greater benefit; trials including participants with a higher level of risk of coronary heart disease show larger mortality reductions; and the fibrate drugs lead to less benefit than other interventions.(20,24) These findings are difficult to interpret, however, as the variables included are strongly related - fibrates have been used mainly in trials recruiting lower risk participants, and they lower cholesterol much less than statins. In this situation all the problems of performing multivariable analyses with correlated covariates are introduced.(36,37)
Conclusion
It is tempting to use a meta-analysis to produce more than a simple overall effect estimate, but caution is needed, for the reasons detailed above.
One of the more useful extensions of meta-analysis beyond the grand mean relates to the examination of publication bias and other inclusion biases, which will be discussed later in this series.
The department of social medicine at the University of Bristol
and the department of primary care and population sciences at the Royal
Free Hospital School of Medicine, London, are part of the Medical
Research Council's health services research collaboration.
Funding: ME was supported by the Swiss National Science
Foundation.
Department of Social Medicine; Department of Primary Care and Population Sciences
Correspondence to: Professor Davey Smith (email: zetkin@bristol.ac.uk)
References
1 Egger M, Davey Smith G. Meta-analysis: potentials and
promise. BMJ 1997;315:1371-4.
2 Egger M, Davey Smith G, Phillips A N. Meta-analysis: principles
and procedures. BMJ 1997;315:1533-7.
3 Wittes R E. Problems in the medical interpretation of
overviews. Stat Med 1987;6:269-76.
4 The European Infarction Study Group. European infarction study
(EIS). A secondary prevention study with slow release oxprenolol after
myocardial infarction: morbidity and mortality. Eur Heart
J 1984;5:189-202.
5 Efron B, Morris C. Stein's paradox in statistics. Sci
Am 1977;236:119-27.
6 Oxman A D, Guyatt G H. A consumer's guide to subgroup analyses.
Ann Intern Med 1992;116:78-84.
7 Yusuf S, Wittes J, Probstfield J, Tyroler H A. Analysis and
interpretation of treatment effects in subgroups of patients in
randomized clinical trials. JAMA 1991;266:93-8.
8 Beta-Blocker Pooling Project Research Group. The beta-blocker
pooling project (BBPP): subgroup findings from randomized trials in
post infarction patients. Eur Heart J 1988;9:8-16.
9 Peto R. Statistical aspects of cancer trials. In: Halnan K E,
ed. Treatment of cancer. London: Chapman and Hall, 1982.
10 Buyse M E. Analysis of clinical trial outcomes: some comments on
subgroup analyses. Controlled Clinical Trials
1989;10:187-94S.
11 Mauri F, Gasparini M, Barbonaglia L, Santoro E, Franzosi M,
Tognoni G, et al. Prognostic significance of the extent of myocardial
injury in acute myocardial infarction treated by streptokinase (the
GISSI trial). Am J Cardiol 1989;63:1291-5.
12 Peto R. Misleading subgroup analysis in GISSI. Am J
Cardiol 1990;64:771.
13 Schroeder R. Oxprenolol in myocardial infarction survivors:
brief review of the European infarction study results in the light of
other beta-blocker post infarction trials. Z Kardiol
1985;74(suppl 6):165-72.
14 Yusuf S, Peto R, Lewis J, Collins R, Sleight P. Beta blockade
during and after myocardial infarction: an overview of the randomized
trials. Prog Cardiovasc Dis 1985;27:335-71.
15 Peto R. Why do we need systematic overviews of randomized
trials? Stat Med 1987;6:233-40.
16 Boissel J P, Leizorovicz A, Picolet H, Peyrieux J C. Secondary
prevention after high-risk acute myocardial infarction with low-dose
acebutolol. Am J Cardiol 1990;66:251-60.
17 Lopressor Intervention Trial Research Group. The lopressor
intervention trial: multicentre study of metoprolol in survivors of
acute myocardial infarction. Eur Heart J 1987;8:1056-64.
18 Davey Smith G, Song F, Sheldon T A. Cholesterol lowering and
mortality: the importance of considering initial level of risk.
BMJ 1993;306:1367-73.
19 Bailey K R. Generalizing the results of randomized clinical
trials. Controlled Clinical Trials 1994;15:15-23.
20 Holme I. Relationship between total mortality and cholesterol
reduction as found by meta-regression analysis of randomized
cholesterol lowering trials. Controlled Clinical Trials
1996;17:13-22.
21 Fibrinolytic Therapy Trialists' (FTT) Collaborative Group.
Indications for fibrinolytic therapy in suspected acute myocardial
infarction: collaborative overview of early mortality and major
morbidity results from all randomised trials of more than 1000
patients. Lancet 1994;343:311-22.
22 Zelen M. Intravenous streptokinase for acute myocardial
infarction. New Engl J Med 1983;308:593.
23 Davey Smith G, Egger M. Meta-analysis: unresolved issues and
future developments. BMJ (in press).
24 Davey Smith G. Low blood cholesterol and non-atherosclerotic
disease mortality: where do we stand? Eur Heart J
1997;18:6-9.
25 Davey Smith G, Egger M. Who benefits from medical
interventions? Treating low risk patients can be a high risk strategy.
BMJ 1994;308:72-4.
26 Antiplatelet Trialists' Collaboration. Collaborative overview
of randomised trials of antiplatelet therapy - I: prevention of death,
myocardial infarction, and stroke by prolonged antiplatelet therapy in
various categories of patients. BMJ 1994;308:81-106.
27 Fischl M A, Richman D D, Griego M H, Gottlieb M S, Volberding P A,
Laskin O L, et al. The efficacy of azidothymidine (AZT) in the treatment
of patients with AIDS and AIDS-related complex. A double-blind,
placebo-controlled trial. N Engl J Med 1987;317:185-91.
28 Egger M, Neaton J D, Phillips A N, Davey Smith G. Concorde trial
of immediate versus deferred zidovudine. Lancet
1994;343:1355.
29 Egger M, Davey Smith G. Risks and benefits of treating mild
hypertension: a misleading meta-analysis? J Hypertens
1995;13:813-5.
30 Sharp S J, Thompson S G, Altman D G. The relation between
treatment benefit and underlying risk in meta-analysis.
BMJ 1996;313:735-8.
31 Thompson S G, Smith T E C, Sharp S J. Investigating underlying risk
as a source of heterogeneity in meta-analysis. Stat Med
(in press).
32 Hoes A W, Grobbee D E, Lubsen J. Does drug treatment improve
survival? Reconciling the trials in mild-to-moderate hypertension.
J Hypertens 1995;13:805-11.
33 L'Abbé K A, Detsky A S, O'Rourke K. Meta-analysis in clinical
research. Ann Intern Med 1987;107:224-33.
34 McIntosh M W. The population risk as an explanatory variable in
research synthesis of clinical trials. Stat Med
1996;15:1713-28.
35 Garg R, Yusuf S, for the Collaborative Group on ACE Inhibitor
Trials. Overview of randomised trials of angiotensin-converting enzyme
inhibitors on mortality and morbidity in patients with heart failure.
JAMA 1995;273:1450-6.
36 Phillips A N, Davey Smith G. How independent are
"independent" effects? Relative risk estimation when correlated
exposures are measured imprecisely. J Clin Epidemiol
1991;44:1223-31.
37 Davey Smith G, Phillips A N. Confounding in epidemiological
studies: why "independent" effects may not be all they seem.
BMJ 1992;305: 75