Numbers needed to treat derived from metaanalyses—sometimes informative, usually misleading
BMJ 1999; 318 doi: http://dx.doi.org/10.1136/bmj.318.7197.1548 (Published 05 June 1999) Cite this as: BMJ 1999;318:1548
 Liam Smeeth, research fellow^{a},
 Andy Haines, professor of primary care^{a},
 Shah Ebrahim, professor of epidemiology and ageing (shah.ebrahim{at}bristol.ac.uk)^{b}
 ^{a}Department of Primary Care and Population Sciences, Royal Free and University College Medical School, University College London, London NW3 2PF
 ^{b}MRC Health Services Research Collaboration, Department of Social Medicine, University of Bristol, Bristol BS8 2PR
 Correspondence to: Professor Ebrahim
 Accepted 10 February 1999
The number needed to treat—the number of patients who must be treated to prevent one adverse outcome—is a widely used measure. 1 2 It is increasingly being calculated by pooling absolute risk differences in trials included in metaanalyses. 3 4 This option is available in statistical software and the Cochrane Database of Systematic Reviews. 5 6 In this paper, we examine pooled numbers needed to treat derived from trials and metaanalyses of interventions to prevent cardiovascular disease. We show that a pooled number needed to treat may be misleading because of variation in the event rates in trials, differences in the outcomes considered, effects of secular trends on disease risk, and differences in clinical setting. The number needed to treat should be derived by applying the relative risk reductions from treatment which have been estimated by trials or metaanalysis to relevant baseline risks for different types of patients. This provides a range of possible numbers needed to treat in different patient groups.
Summary points
Numbers needed to treat are often used to summarise treatment effects in a clinically relevant way
They are derived from the baseline risk without treatment and the reduction in risk achieved with treatment
Numbers needed to treat are sensitive to factors that change the baseline risk such as the outcome considered, patients' characteristics, secular trends in incidence and case fatality, and the clinical setting
Pooled numbers needed to treat derived from metaanalyses can be seriously misleading because the baseline risk often varies appreciably between the trials
Applying the pooled relative risk reductions calculated from metaanalyses or individual trials to the baseline risk relevant to a specific patient group produces a useful number needed to treat
Methods
The interventions selected for study were use of statins for lowering cholesterol concentrations in primary and secondary prevention of coronary heart disease.7-11 In secondary prevention after myocardial infarction, the interventions considered were antiplatelet drugs,12 β blockers,13 and multiple risk factor interventions.14 Comparisons of estimates of efficacy between clinical settings (that is, primary care compared with hospital clinics) were made using metaanalyses of trials of antihypertensive drugs in elderly people.15 Secular trends from 1975 to 1995 in coronary heart disease mortality for men and women aged 55-64 years in England and Wales were obtained from routine mortality statistics.16 Pooled absolute risk reductions were obtained using EasyMa software,17 which also provides a pooled number needed to treat to avoid each event considered. Numbers needed to treat were calculated for five years of treatment.
Effect of choice of outcome
The effects of treatment with statins are shown in table 1. Numbers needed to treat vary greatly, depending on the outcome chosen. In communicating a positive message, it is tempting to choose the smallest number needed to treat, for example, “all the bad things that can happen.”18 The combined end point, all vascular events, was made up of different proportions of events in the different trials. In the WOSCOPS and AFCAPS/TexCAPS studies, coronary heart disease deaths made up 16% and 3% respectively of the combined end point. 7 8
In the AFCAPS/TexCAPS trial of lovastatin, the absolute coronary heart disease mortality difference was very close to zero, with 95% confidence intervals which included the possibility of benefit and also of harm. As the absolute difference comes close to zero, the number needed to treat becomes very large and approaches infinity. If the event rate is higher in the treated group than in the control group, treatment is not beneficial, and the reciprocal of the absolute difference becomes a number needed to harm.19
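The behaviour of the reciprocal near zero can be sketched in a few lines of Python. This is a minimal illustration rather than the method used in any of the trials: the helper name `number_needed` and all the example risks are hypothetical, and treatment is flagged as harmful whenever the treated group's risk exceeds the control group's.

```python
import math


def number_needed(control_risk, treated_risk):
    """Return a (label, value) pair for one outcome.

    Hypothetical helper: label is "NNT" when the treated group has the
    lower risk and "NNH" when the treated group has the higher risk.
    """
    difference = control_risk - treated_risk
    if difference == 0:
        # No difference at all: no finite number of patients is enough.
        return ("NNT", math.inf)
    if difference > 0:
        return ("NNT", 1 / difference)
    return ("NNH", 1 / -difference)


# Hypothetical risks, chosen only to show how the reciprocal behaves
# as the difference shrinks towards zero and then changes sign.
examples = [(0.050, 0.030), (0.010, 0.0099), (0.010, 0.010), (0.010, 0.012)]
for control, treated in examples:
    label, value = number_needed(control, treated)
    print(f"control={control}, treated={treated}: {label} = {value:.0f}")
```

A confidence interval for the risk difference that spans zero therefore corresponds to a range running from a number needed to treat, through infinity, to a number needed to harm.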
Effect of variation in baseline risk
Interventions for secondary prevention after myocardial infarction are shown in table 2. While all treatments show very similar reductions in relative risk, the numbers needed to treat vary much more and have wide 95% confidence intervals. The baseline mortality in the individual trials varied greatly, in one case by more than an order of magnitude. These very large differences in absolute risk in the individual trials reflect the participants selected for inclusion. In the trials of multiple risk factor interventions, applying the same relative risk reduction to the trials with the highest and lowest coronary heart disease mortality rates resulted in numbers needed to treat for five years of 2 and 317 respectively.
Effect of secular trends
In primary prevention, the background population risk of coronary heart disease mortality is relevant in determining the numbers needed to treat. To illustrate the effect of secular trends on the number needed to treat, we have assumed that statins were available over the past 20 years and that the reductions in relative risk seen in the trials were achievable, applied to the whole population, and occurred over the whole range of coronary heart disease mortality (table 3). Numbers needed to treat for five years show instability over time. This is more noticeable for women than men, reflecting the greater absolute falls in mortality among women. If there are secular trends in case fatality, perhaps through changes in disease severity, caution in the use of numbers needed to treat may also apply to treatments used in secondary prevention.
Effect of clinical setting
For each outcome, the pooled relative risks are similar for primary care and secondary care trials. By contrast, the numbers needed to treat for five years vary twofold depending on the setting, but the 95% confidence intervals are wide. Despite the large numbers of patients in these trials, it is uncertain whether the large differences in numbers needed to treat are simply due to chance. Generalising the numbers needed to treat derived from trials undertaken in one setting to patient care in another setting may be misleading. However, the relative estimates of efficacy varied very little across the different settings and could be generalised with more confidence (table 4).
Discussion
Randomised controlled trials aim to achieve internal validity through careful inclusion criteria for participants, random allocation to intervention, and blind ascertainment of end points. Consequently, the absolute event rates in trials may bear little relation to the event rates that might be expected in routine clinical practice.
Pooling assumptions
There are two main categories of statistical model for metaanalysis: fixed effect models and random effects models. The fixed effect models assume that all the studies are estimating the same “true” effect, and that the variation in effects seen between different trials is a result of random error alone. The random effects model assumes that the trials included in the metaanalysis are a random sample from a hypothetical population of trials of different underlying effect sizes. 20 21 There will almost always be differences in the baseline risks of trials carried out in different populations and at different times. Consequently, there is no single, true, absolute risk difference, as assumed in the fixed effect models. Neither is the variation in risk difference between trials solely the result of a sampling effect. Decisions affecting the baseline risk of participants in a trial, such as inclusion and exclusion criteria or geographical setting, are not made in a random way, as is assumed in the random effects models. Consequently, pooled numbers needed to treat may contravene statistical assumptions made by these models.
Duration of treatment effect
Trials have different lengths of follow up. However, to produce a number needed to treat for five years, for example, all the absolute risk differences need to be standardised for five years. This standardisation requires an assumption of constancy of effect over time, an assumption that may not be reasonable. For example, in the Scandinavian simvastatin survival study, no effect on total mortality was evident until one year of treatment, after which the reduction in absolute risk gradually increased with the duration of follow up.9
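One simple way to standardise, sketched here in Python, is to rescale each trial's absolute risk difference linearly to five years. Linear scaling is an assumption made for illustration and is not necessarily what any given metaanalysis program does; the function name `standardise_risk_difference` and the two trials are hypothetical.

```python
def standardise_risk_difference(risk_difference, follow_up_years, target_years=5.0):
    """Rescale an absolute risk difference to a common treatment duration.

    Assumes the absolute effect accrues at a constant rate per year,
    which is exactly the constancy assumption questioned in the text.
    """
    return risk_difference * target_years / follow_up_years


# Hypothetical trials: (name, risk difference over follow up, years of follow up)
trials = [("Trial A", 0.02, 2.0), ("Trial B", 0.03, 6.0)]
for name, risk_difference, years in trials:
    five_year = standardise_risk_difference(risk_difference, years)
    print(f"{name}: five year risk difference {five_year:.3f}, NNT {1 / five_year:.0f}")
```

If the benefit only appears after the first year, as in the simvastatin example, a short trial rescaled in this way will misstate the five year effect.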
Care must be taken in calculating absolute risk differences in metaanalysis programs. For example, most programs require the number of participants in each arm of the trial to be input as denominators. Since trials tend to have differing lengths of follow up, pooled absolute differences calculated using participants, rather than person years as denominators, will assume equal length of follow up across trials, and result in false estimates of absolute risk differences.
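The effect of the choice of denominator can be illustrated with a crude Python sketch. All figures are hypothetical, the pooling is a simple summation rather than the weighted methods used by metaanalysis software, and person years are approximated as participants multiplied by years of follow up (ignoring deaths and dropouts).

```python
# Hypothetical trials with equal arm sizes but different lengths of follow up.
trials = [
    {"n_per_arm": 1000, "years": 2.0, "control_events": 100, "treated_events": 70},
    {"n_per_arm": 1000, "years": 5.0, "control_events": 250, "treated_events": 175},
]


def crude_pooled_difference(trials, use_person_years):
    """Crude pooled difference in events per participant or per person year."""
    control_events = sum(t["control_events"] for t in trials)
    treated_events = sum(t["treated_events"] for t in trials)
    if use_person_years:
        # Approximate person years per arm: participants x years of follow up.
        denominator = sum(t["n_per_arm"] * t["years"] for t in trials)
    else:
        denominator = sum(t["n_per_arm"] for t in trials)
    return (control_events - treated_events) / denominator


per_participant = crude_pooled_difference(trials, use_person_years=False)
per_person_year = crude_pooled_difference(trials, use_person_years=True)
print(f"Difference per participant (duration undefined): {per_participant:.4f}")
print(f"Difference per person year: {per_person_year:.4f}")
```

The per participant figure mixes two year and five year risks and so has no defined duration, whereas the per person year figure can be related to a stated period of treatment.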
Interpretation
In calculating an overall number needed to treat for a metaanalysis, all the data from the trials are taken and pooled—producing a less useful result than that provided by the individual trials. In terms of health economics, an incremental cost effectiveness analysis of an intervention at different levels of baseline risk will almost always be more informative than a summary of cost effectiveness based on a pooled number needed to treat.22 The pooled value may also result in wrong decisions about who should receive treatment if the concept of a threshold number needed to treat, separating those who are likely to benefit from those who are not, is applied.
Deriving number needed to treat
It is preferable to derive numbers needed to treat by applying the relative risk reductions from trials or metaanalyses to estimates of prognosis from cohort studies (representative of the groups for whom treatment decisions are to be made), rather than from the trials and metaanalyses themselves.23 If the relative risk reduction varies across different baseline risks, prognostic variables and regression techniques can be used to produce an estimate of treatment benefit to patients with different baseline risks.24 This technique can be used for both individual trials25 and metaanalyses.26
Calculating numbers needed to treat
The number needed to treat is the reciprocal of the absolute risk difference for a bad outcome between treated subjects and the control or placebo group—that is, 1÷(risk of bad outcome in placebo group−risk of bad outcome in treated group). It can be calculated by applying the relative risk reduction obtained from a metaanalysis or a trial to a baseline risk that reflects the risk of the type of patients to be treated. For example, statins achieve a pooled relative risk on treatment of 0.69 for all coronary vascular disease events, but the number needed to treat varies according to the risk group of the patients studied as illustrated below.
In a high risk group
The likely risk of a bad outcome in a high risk group of patients might be as high as 5% a year. This can be estimated from studies of prognosis in relevant patient groups. In this case:
Baseline risk of a bad outcome without treatment is 5%=0.05 per year
Risk of a bad outcome on treatment is 0.05×0.69=0.0345 per year
Risk difference is 0.05−0.0345=0.0155
Number needed to treat is 1÷risk difference=1/0.0155=64 people treated for 1 year to avoid one bad outcome.
In a low risk group
The likely risk of a bad outcome in a low risk group of patients, for example in primary care, might be as low as 0.5% a year. In this case:
Baseline risk of a bad outcome without treatment is 0.5%=0.005 per year
Risk of a bad outcome on treatment is 0.005×0.69=0.00345 per year
Risk difference is 0.005−0.00345=0.00155
Number needed to treat is 1÷risk difference=1/0.00155=645 people treated for 1 year to avoid one bad outcome.
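The two worked examples follow the same steps and can be reproduced with a short Python sketch. The function name `nnt_from_baseline` is hypothetical; the relative risk of 0.69 and the baseline risks of 5% and 0.5% a year are the figures used in the examples.

```python
def nnt_from_baseline(baseline_risk, relative_risk):
    """Number needed to treat for a given baseline risk and relative risk.

    Applies the relative risk to the baseline risk, takes the absolute
    risk difference, and returns its reciprocal.
    """
    treated_risk = baseline_risk * relative_risk
    risk_difference = baseline_risk - treated_risk
    return 1 / risk_difference


POOLED_RELATIVE_RISK = 0.69  # pooled relative risk on statins quoted in the text

for label, baseline in [("high risk", 0.05), ("low risk", 0.005)]:
    nnt = nnt_from_baseline(baseline, POOLED_RELATIVE_RISK)
    print(f"{label}: baseline {baseline} per year, NNT about {nnt:.1f}")
```

The unrounded values are about 64.5 and 645.2: a tenfold difference in baseline risk gives a tenfold difference in the number needed to treat, even though the relative risk is identical.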
Understanding numbers needed to treat
The increasing use of numbers needed to treat is welcome in some respects, but caution is required. Numbers needed to treat are no better understood than other measures. 27 28 The method of presenting results of studies influences healthcare decisions. Patients,29 purchasers,27 general practitioners,30 and doctors in teaching hospitals31 are all more likely to believe an intervention is desirable when effectiveness data are presented as a reduction in relative risk than when data from the same studies are presented as a number needed to treat.
Conclusion
In spite of the reservations we outline here, we still believe that the number needed to treat has a place. In the drug treatment of hypertension, numbers needed to treat have been used appropriately to show the greater effectiveness in preventing cardiovascular events achieved when treating older rather than younger patients32 and treating patients with moderate rather than mild hypertension.2 When numbers needed to treat are presented for an intervention, the setting in which it occurred, the time period, the outcome, and the baseline risk of the patients for whom the number needed to treat is thought to be applicable should be described.
Acknowledgments
Contributors: LS contributed to the idea for and design of the project, carried out the analyses on clinical settings, and helped to write the paper. AH contributed to the idea for and design of the project and to writing of the paper. SE contributed to the idea for and design of the project, carried out the analyses other than in clinical settings, wrote the first draft of the paper, and coordinated the project. SE is the guarantor.
Footnotes

Funding: LS was funded by a research studentship from the North Thames NHS Executive Research and Development Directorate.

Competing interests: None declared.