Numbers needed to treat derived from meta-analyses—sometimes informative, usually misleading
BMJ 1999; 318 doi: https://doi.org/10.1136/bmj.318.7197.1548 (Published 05 June 1999) Cite this as: BMJ 1999;318:1548
Editor,
Smeeth et al [1] demonstrated how inappropriate methods of
calculating NNTs in meta-analyses can be misleading.
Interestingly, the examples they chose quoted event rates in
patient-years. Smeeth et al calculated (correctly in our
view) their NNTs directly from the events per patient-years.
However, some commentators on such trials quote NNTs for the
average follow-up period of the trial. This alternative
method may be considered acceptable, although it is only an
approximation to the first, but we would draw attention to
how misleading a failure to recognise the difference between
the two methods can be.
For example, in the UKPDS 38 trial, method 1 would give an
NNT to prevent any diabetes-related death of 152 patients
per annum, or 15.2 patients over 10 years. Method 2 would
give an NNT of 20 over the median follow-up of 8.4 years.
The three possible NNTs (152, 15.2, and 20) can lead to
misunderstanding. That this is a real problem was
illustrated in an electronic response regarding the UKPDS 38
trial [2]:
“....We are concerned that there is a discrepancy between
the numbers needed to treat which are stated in the article,
and those that can be calculated. The study states that the
number needed to treat over 10 years to prevent any
complication is 6.1 and to prevent death from a diabetes
related cause is 15.0. In calculating the numbers needed to
treat by using the values in figure 4 (based upon a median
follow up of 8.4 years), we conclude that the number needed
to treat to prevent any complication is 11, and to prevent
death is 20....”
Stefan M Groetsch, Joseph T LaVan, John W Epling, (Full
response on http://www.bmj.com/cgi/eletters/317/7160/703#EL6)
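The arithmetic behind the two methods can be sketched in a few lines of Python. The event rates below are hypothetical, chosen only to illustrate the calculations; they are not the UKPDS 38 data.

```python
def nnt_from_patient_years(rate_control, rate_treated):
    """Method 1: NNT from event rates expressed per patient-year.

    The reciprocal of the absolute rate difference gives the number
    of patients who must be treated for one year to prevent one event.
    """
    return 1.0 / (rate_control - rate_treated)

# Hypothetical rates (events per patient-year), not the UKPDS figures
rate_control, rate_treated = 0.0200, 0.0134

nnt_per_annum = nnt_from_patient_years(rate_control, rate_treated)

# Treating for T years divides the annual NNT by T (approximately,
# ignoring depletion of susceptible patients over time)
nnt_over_10_years = nnt_per_annum / 10

# Method 2 would instead take the reciprocal of the difference in
# cumulative event proportions at the end of the trial's follow-up,
# which yields a different number for the same trial.
```

The two methods thus answer different questions, which is why quoting an NNT without stating the method and time frame invites the kind of confusion described above.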
At our local critical appraisal seminars for GPs in Suffolk,
we encountered similar confusion when two participants
presented NNTs from the HOT trial [3]. An added twist to the
potential for confusion arises when some trials report the
average follow-up period as a mean (the HOT trial) while
others report the median (UKPDS 38).
There is a case for standardising the way NNTs are reported
for trials which give their results as events per
patient-years, or at least for insisting that commentaries
make clear which method they are using.
A fuller explanation of the different methods can be viewed
at:
http://www.suffolk-maag.ac.uk/ebm/pt-yrs&NNTs.html
with examples available for the UKPDS 38 trial:
http://www.suffolk-maag.ac.uk/stats/cpukpds.html
and with examples available for the HOT trial:
http://www.suffolk-maag.ac.uk/stats/cphot.html
Kevork Hopayian, Leiston Surgery, Suffolk, England.
k.hopayian@tesco.net
John McGough, Aldeburgh Surgery, Suffolk, England.
References
1 Smeeth L, Haines A, Ebrahim S. Numbers needed to treat
derived from meta-analyses - sometimes informative, usually
misleading. BMJ 1999;318:1548-51.
http://www.bmj.com/cgi/content/full/318/7197/1548
2 UK Prospective Diabetes Study Group. Tight blood pressure
control and risk of macrovascular and microvascular
complications in type 2 diabetes. BMJ 1998;317:703-13.
3 Hansson L, Zanchetti A, Carruthers SG, et al. Effects of
intensive blood-pressure lowering and low-dose aspirin in
patients with hypertension: principal results of the
Hypertension Optimal Treatment (HOT) randomised trial.
Lancet 1998;351:1755-62.
Competing interests: No competing interests
EDITOR - There is much to agree with in the article on numbers needed
to treat by Smeeth and colleagues [1]. But to use the word
'misleading' in the title is in itself misleading. Numbers
needed to treat are a huge advance on what we had before.
They point out, as has been done previously, that for
numbers needed to treat to be comparable they must define
patients’ condition and severity, the intervention, outcome,
and duration [2], and perhaps other relevant issues. Their
suggestion that numbers needed to treat should reflect
underlying baseline risk for an individual patient (or group
of patients) is a restatement of a method described by
Sackett et al [3].
The problem with their argument, encountered so often, is
that it is derived from examples of interventions used to
prevent small effects in large numbers of patients. Most of
us live in a medical world where we need interventions which
produce large effects in small populations. In these
circumstances, the conclusion is that numbers needed to
treat from meta-analysis are usually informative and seldom
misleading.
Take acute pain as an example. There are many high quality
randomised, double-blind and placebo-controlled clinical
trials done over 50 years. For trials to be clinically valid
patients have to have moderate or severe pain on entry. Pain
is measured using standard scales over periods of 4-6 hours.
Using the outcome of at least half pain relief over this
time we have been able to calculate numbers needed to treat
compared with placebo for a range of analgesic interventions
[4-7]. Numbers needed to treat are unaffected by pain model
(dental or postoperative), pain measurement, duration (four
or six hours) or reporting quality (given that trials are
randomised and double-blind) [7].
Moreover, we have been able to use large amounts of data
from individual patients and clinical trials to investigate
the effect of random chance on baseline and experimental
event rates [8]. Because individual clinical trials are set
up to investigate the direction of treatment effect
(treatment better than control), we need to know how much
information is needed to overcome random effects in
estimating the magnitude of the clinical effect of an
intervention – or when a number needed to treat becomes
clinically valid [8].
Numbers needed to treat are tools. Like any tool, they are
helpful and effective when used appropriately. What we have
to do is ensure that in any given situation we know the
rules for using the tool correctly. Making swingeing
oversimplifications from the same selected trials doesn't
move us any further forward.
Andrew Moore, Consultant Biochemist
Henry McQuay, Clinical Reader in Pain Relief
References:
1 Smeeth L, Haines A, Ebrahim S. Numbers needed to treat
derived from meta-analyses - sometimes informative, usually
misleading. BMJ 1999;318:1548-51.
2 McQuay HJ, Moore RA. Using numerical results from
systematic reviews in clinical practice. Ann Intern Med
1997;126:712-20.
3 Sackett DL, Richardson WS, Rosenberg W, Haynes RB.
Evidence-based medicine: how to practice and teach EBM.
New York: Churchill Livingstone, 1997:168-71.
4 Moore A, Collins S, Carroll D, McQuay H. Paracetamol with
and without codeine in acute pain: a quantitative systematic
review. Pain 1997;70:193-201.
5 Moore RA, McQuay H. Single-patient data meta-analysis of
3453 postoperative patients: oral tramadol versus placebo,
codeine and combination analgesics. Pain 1997;69:287-94.
6 McQuay HJ, Carroll D, Moore RA. Injected morphine in
postoperative pain: a quantitative systematic review.
J Pain Symptom Manage 1999;17:164-74.
7 Edwards JE, Oldman A, Smith L, Wiffen PJ, Carroll D,
McQuay HJ, Moore RA. Oral aspirin in postoperative pain: a
quantitative systematic review. Pain (in press).
8 Moore RA, Gavaghan D, Tramèr MR, Collins SL, McQuay HJ.
Size is everything - large amounts of information are needed
to overcome random effects in estimating direction and
magnitude of treatment effects. Pain 1998;78:209-16.
Competing interests: No competing interests
Editor
Better late than never. Several years after one group of
epidemiologists put forward the NNT ('number needed to treat'), derived
from mega-trials and meta-analyses, as a summary statistic suitable for
expressing the effectiveness of medical interventions, another group of
epidemiologists has at long last realised that the NNT is very seldom a
valid measure [1].
Some of us came to the conclusion that NNTs were 'not necessarily
true' rather more rapidly, and without the need for three and a half pages
of cumbersome and dubiously appropriate statistical analysis [2]. The deep
flaws in the NNT statistic can be understood by a straightforward act of
inference based on an understanding of the relevant clinical science, and
guided by the principle of 'garbage in, garbage out'.
The spurious precision of the NNT is a statistical artifact which
derives, not from clinical knowledge, but from the illegitimate pooling of
the large amounts of qualitatively unlike and clinically irrelevant data
that are incorporated in almost all mega-trials and meta-analyses. Unless
trials incorporate patients of the same nature, with the same prognosis,
and being given the same treatment as those to whom the trial results
will apply, then statistical summary is inevitably misleading [2].
It is somewhat galling that mega-epidemiologists and
biostatisticians so routinely take credit both for the act of creating
spurious analytic tools, and then for belatedly dismantling them - but so
it goes. The wheels of epidemiology grind exceedingly slow. At least
Smeeth et al got there in the end.
When clinical epidemiology gives up its grandiose and self-awarded
claim to be 'evidence-based medicine' and once again becomes an activity
based in clinical science, maybe such absurdities will become a thing of
the past. I hope so.
Bruce G Charlton MD
Department of Psychology
Ridley Building
University of Newcastle upon Tyne
NE1 7RU
1 Smeeth L, Haines A, Ebrahim S. Numbers needed to treat derived
from meta-analysis - sometimes informative, usually misleading. BMJ 1999;
318: 1548-51.
2 Charlton BG. The future of clinical research: from megatrials
towards methodological rigour and representative sampling. Journal of
Evaluation in Clinical Practice 1996;2:159-69.
Competing interests: No competing interests
Editor - In their excellent and timely paper, Smeeth et al [1] make a
very important point. The apparent interpretability of the number needed
to treat (NNT) measure is bought at the cost of considerable distortion,
especially if, as will almost always be the case, background risk varies.
This becomes most noticeable if the trials in a meta-analysis themselves
vary but, even if they do not, it is quite possible that identifiable
groups will vary within trials or in general clinical practice, and as
such a single summary NNT will be misleading.
We need measurements of the treatment effect that are as nearly
constant as can be managed from trial to trial [2] ('additive', to use the
statistician's term). 'Additive at the point of study, relevant at the
point of application' ought to be our motto. In recommending the relative
risk, however, Smeeth et al do not go far enough. Contrary to much
perceived wisdom, the relative risk is generally only acceptable as an
approximation to the odds-ratio and not vice versa. This approximation is
adequate when (as in their examples) the background risk is small. The
(log) odds-ratio is the measure that is invariant to the arbitrary choice
of death or survival as the main outcome of interest. To concentrate on
one side of the story only (usually, in the case of pessimistic
epidemiologists, the deaths) is no more legitimate than in analysing a 2 x
2 contingency table to take the contribution to the chi-square from the
two cells corresponding to death only ignoring those for survival. This
also usually provides a fair approximation if deaths are rare but it is
still conceptually false. Despite being harder to understand than relative
risks, (log) odds-ratios should be used. Accurate prediction should be our
goal, even if such prediction is complicated.
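The invariance argument can be checked numerically. The sketch below uses hypothetical group risks and shows that the odds ratio for death is exactly the reciprocal of the odds ratio for survival (so the log odds ratio merely changes sign), whereas the relative risk has no such symmetry.

```python
import math

def relative_risk(p_treated, p_control):
    """Ratio of event risks in the two groups."""
    return p_treated / p_control

def odds_ratio(p_treated, p_control):
    """Ratio of the odds of the event in the two groups."""
    return (p_treated / (1 - p_treated)) / (p_control / (1 - p_control))

# Hypothetical risks of death in treated and control groups
p_treated, p_control = 0.10, 0.15

or_death = odds_ratio(p_treated, p_control)
or_survival = odds_ratio(1 - p_treated, 1 - p_control)
# Log odds ratios have equal magnitude and opposite sign, whichever
# side of the outcome (death or survival) is analysed
assert math.isclose(math.log(or_death), -math.log(or_survival))

rr_death = relative_risk(p_treated, p_control)
rr_survival = relative_risk(1 - p_treated, 1 - p_control)
# rr_survival is not the reciprocal of rr_death: the relative risk
# depends on which outcome is chosen as the event of interest
```

At the modest risks used here the asymmetry of the relative risk is already visible; it grows as the background risk increases.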
1. Smeeth L, Haines A, Ebrahim S. Numbers needed to treat
derived from meta-analyses - sometimes informative, usually misleading.
BMJ 1999;318:1548-51.
2. Senn SJ. Statistical Issues in Drug Development. Chichester:
Wiley, 1997.
Competing interests: No competing interests
Poor reporting of length of follow-up in clinical trials and systematic reviews
EDITOR- Smeeth et al [1] raise interesting issues concerning the
validity of reporting numbers needed to treat in systematic reviews that
combine trials with varying periods of follow-up. In such situations, only
when the absolute treatment effect is constant over time can the number
needed to treat be correctly estimated from the reciprocal of the pooled
absolute risk difference. By contrast, if a treatment has a constant
relative effect over time, then within a single trial the number needed to
treat will decrease with increasing follow-up [2]. Similarly, we expect
that the number needed to treat will also vary among several similar
trials with different lengths of follow-up.
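The dependence of the NNT on follow-up length under a constant relative effect can be sketched numerically. The control rate and hazard ratio below are hypothetical, and a simple exponential (constant event rate) model is assumed.

```python
import math

def cumulative_risk(rate, years):
    """Cumulative event risk under a constant event rate (exponential model)."""
    return 1 - math.exp(-rate * years)

def nnt(rate_control, hazard_ratio, years):
    """NNT as the reciprocal of the absolute risk difference at a given follow-up."""
    risk_c = cumulative_risk(rate_control, years)
    risk_t = cumulative_risk(rate_control * hazard_ratio, years)
    return 1.0 / (risk_c - risk_t)

# Hypothetical: control rate 0.03 events/patient-year, constant hazard
# ratio 0.7. The NNT falls steadily as follow-up lengthens, even though
# the relative effect of treatment never changes.
nnts = {t: nnt(0.03, 0.7, t) for t in (1, 5, 10)}
```

This is why pooling absolute risk differences from trials with different follow-up lengths, and then inverting the pooled difference, can give a misleading NNT unless the absolute effect really is constant over time.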
We have found that the reporting of length of follow-up is often
inadequate to assess whether the constant absolute risk model or constant
relative risk model is the more appropriate in a given systematic review,
or to make adjustments for length of follow-up in the analysis. We
assessed the quality of reporting of length of follow-up in the systematic
reviews published on the Cochrane Library (Issue 1, 1998) that synthesised
mortality outcomes. We excluded reviews in pregnancy and childbirth where
duration of follow-up is typically not an issue. Forty-four relevant
systematic reviews were found which combined 306 trials. For 43% of the
trials there was no mention of the duration of follow-up in the published
review.
To assess whether the cause was inadequate trial reporting or poor
data abstraction we considered in more detail the 17 systematic reviews
for interventions related to stroke, and compared the reporting of follow-
up in the reviews with that in the 103 trials on which they were based.
We noted how the reviewers had abstracted the length of follow-up,
categorising it as fixed follow-up (all participants studied for the same
length of time), variable follow-up (summarised by mean, median, or range
of follow-up), or not stated. We found 93% agreement
between the reviewers' abstractions and our own assessments, suggesting
that poor trial reporting is responsible for many of the omissions.
These findings support those of other reviews of the reporting of
follow-up in clinical trials and cohort studies [3,4].
Our findings suggest that so many trial reports omit mentioning
length of follow-up that in practice it may not be possible to adjust for
length of follow-up in a systematic review.
Acknowledgement
We are grateful to Hazel Fraser and the Cochrane Stroke Review
Group for allowing us access to copies of the 103 trials.
References
1 Smeeth L, Haines A, Ebrahim S. Numbers needed to treat derived
from meta-analyses - sometimes informative, usually misleading. BMJ 1999;
318: 1548-51.
2 Altman DG, Andersen PK. Calculating the number needed to treat for
trials where the outcome is time to an event. BMJ, in press.
3 Altman DG, De Stavola BL, Love SB, Stepniewska KA. Review of
survival analyses published in cancer journals. Br J Cancer 1995;72:511-
18.
4 Schemper M, Smith TL. A note on quantifying follow-up in studies of
failure time. Controlled Clinical Trials 1996; 17(4): 343-346.
Roberto D'Amico
Medical Statistician
Jonathan J Deeks
Medical Statistician
Douglas G Altman
Professor of Statistics in Medicine
ICRF/NHS Centre for Statistics in Medicine
Institute of Health Sciences
Old Road, Headington
Oxford OX3 7LF
Competing interests: No competing interests