BMJ 1995;311:485 (19 August)

Statistics notes

Absence of evidence is not evidence of absence

Douglas G Altman, head (a), J Martin Bland, reader in medical statistics (b)

(a) Medical Statistics Laboratory, Imperial Cancer Research Fund, London WC2A 3PX; (b) Department of Public Health Sciences, St George's Hospital Medical School, London SW17 0RE

Correspondence to: Mr Altman.

The non-equivalence of statistical significance and clinical importance has long been recognised, but this error of interpretation remains common. Although a significant result in a large study may sometimes not be clinically important, a far greater problem arises from misinterpretation of non-significant findings. By convention a P value greater than 5% (P>0.05) is called "not significant." Randomised controlled clinical trials that do not show a significant difference between the treatments being compared are often called "negative." This term wrongly implies that the study has shown that there is no difference, whereas usually all that has been shown is an absence of evidence of a difference. These are quite different statements.

The sample size of controlled trials is generally inadequate, with a consequent lack of power to detect real, and clinically worthwhile, differences in treatment. Freiman et al [1] found that only 30% of a sample of 71 trials published in the New England Journal of Medicine in 1978-9 with P>0.1 were large enough to have a 90% chance of detecting even a 50% difference in the effectiveness of the treatments being compared, and they found no improvement in a similar sample of trials published in 1988. To interpret all these "negative" trials as providing evidence of the ineffectiveness of new treatments is clearly wrong and foolhardy. The term "negative" should not be used in this context [2].
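
To make the idea of power concrete, the sketch below approximates the power of a two-arm trial comparing proportions, using the normal approximation to a two-sided test. The event rates (30% versus 15%, one possible reading of a "50% difference") and the arm sizes are hypothetical values chosen for illustration; they are not taken from Freiman et al, and the function name is an assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Assumes equal arm sizes and rates strictly between 0 and 1.
    """
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    # Standard error of the difference under the assumed true rates (unpooled).
    se = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    shift = abs(p1 - p2) / se
    # Probability that the test statistic lands in either rejection region.
    return std.cdf(shift - z_crit) + std.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical rates: 30% in one arm, 15% in the other.
    for n in (25, 50, 100, 200):
        print(n, round(power_two_proportions(0.30, 0.15, n), 2))
```

Under these assumed rates, power rises steeply with the number of patients per arm, which is why a small trial can easily miss a halving of the event rate.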

A recent example is given by a trial comparing octreotide and sclerotherapy in patients with variceal bleeding [3]. The study was carried out on a sample of only 100 despite a reported calculation that suggested that 1800 patients were needed. This trial had only a 5% chance of getting a statistically significant result if the stated clinically worthwhile treatment difference truly existed. One consequence of such low statistical power was a wide confidence interval for the treatment difference. The authors concluded that the two treatments were equally effective despite a 95% confidence interval that included differences between the cure rates of the two treatments of up to 20 percentage points.
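
The link between small samples and wide confidence intervals can be sketched with a simple (Wald) interval for the difference between two proportions. The counts used here (42 of 50 versus 40 of 50) are invented for illustration and are not the data of the trial discussed above; `diff_ci` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(successes_a, n_a, successes_b, n_b, level=0.95):
    """Wald confidence interval for the difference in proportions (a minus b)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Unpooled standard error of the difference between the two proportions.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_a - p_b
    return diff, diff - z * se, diff + z * se


# Hypothetical counts: 42/50 cured on treatment A, 40/50 on treatment B.
diff, lo, hi = diff_ci(42, 50, 40, 50)
print(f"difference {diff:+.1%}, 95% CI {lo:+.1%} to {hi:+.1%}")
```

With these assumed counts the interval runs from roughly -11 to +19 percentage points. It includes zero, so the result is "not significant", yet it is also compatible with a large difference in either direction, which is very different from showing that the treatments are equivalent.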

Similar evidence of the dangers of misinterpretation of non-significant results is found in numerous meta-analyses (overviews) of published trials, when few or none of the individual trials were statistically large enough. A dramatic example is provided by the overview of clinical trials evaluating fibrinolytic treatment (mostly streptokinase) for preventing reinfarction after acute myocardial infarction. The overview of randomised controlled trials found a modest but clinically worthwhile (and highly significant) reduction in mortality of 22% [4], but only five of the 24 trials had shown a statistically significant effect with P<0.05. The lack of statistical significance of most of the individual trials led to a long delay before the true value of streptokinase was appreciated.
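
How several individually non-significant trials can combine into a clearly significant overview can be sketched with fixed-effect, inverse-variance pooling of log risk ratios. This is a standard textbook method used here for illustration, not necessarily the method of the cited overview, and the six trials below are hypothetical numbers, not the fibrinolytic trial data.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Log risk ratio (treated vs control) and its large-sample standard error."""
    lrr = log((events_t / n_t) / (events_c / n_c))
    se = sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    return lrr, se


def two_sided_p(estimate, se):
    """Two-sided P value from a normal approximation."""
    return 2 * (1 - NormalDist().cdf(abs(estimate / se)))


def pool_fixed_effect(estimates):
    """Inverse-variance weighted average of (estimate, se) pairs."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    return pooled, sqrt(1 / sum(weights))


# Hypothetical trials: (deaths treated, n treated, deaths control, n control).
TRIALS = [
    (40, 400, 51, 400),
    (30, 300, 38, 300),
    (55, 500, 70, 500),
    (18, 200, 24, 200),
    (62, 600, 78, 600),
    (45, 450, 58, 450),
]

per_trial = [log_risk_ratio(*t) for t in TRIALS]
individual_p = [two_sided_p(est, se) for est, se in per_trial]
pooled_lrr, pooled_se = pool_fixed_effect(per_trial)
pooled_rr = exp(pooled_lrr)
pooled_p = two_sided_p(pooled_lrr, pooled_se)

print("individual P values:", [round(p, 2) for p in individual_p])
print(f"pooled risk ratio {pooled_rr:.2f}, P = {pooled_p:.4f}")
```

In this invented set every trial points the same way but none reaches P<0.05 on its own, while the pooled estimate does. Calling each of these trials "negative" would have hidden a consistent effect.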

While it is usually reasonable not to accept a new treatment unless there is positive evidence in its favour, when issues of public health are concerned we must question whether the absence of evidence is a valid enough justification for inaction. A recently publicised example is the suggested link between some sudden infant deaths and antimony in cot mattresses. Statements about the absence of evidence are common: for example, in relation to the possible link between violent behaviour and exposure to violence on television and video, the possible harmful effects of pesticide residues in drinking water, the possible link between electromagnetic fields and leukaemia, and the possible transmission of bovine spongiform encephalopathy from cows. Can we be comfortable that the absence of clear evidence in such cases means that there is no risk or only a negligible one?

When we are told that "there is no evidence that A causes B" we should first ask whether absence of evidence means simply that there is no information at all. If there are data we should look for quantification of the association rather than just a P value. Where risks are small P values may well mislead: confidence intervals are likely to be wide, indicating considerable uncertainty. While we can never prove the absence of a relation, when necessary we should seek evidence against the link between A and B, for example from case-control studies. The importance of carrying out such studies will relate to the seriousness of the postulated effect and how widespread the exposure is in the population.

  1. Freiman JA, Chalmers TC, Smith H, Kuebler RR. The importance of beta, the type II error, and sample size in the design and interpretation of the randomized controlled trial: survey of two sets of "negative" trials. In: Bailar JC, Mosteller F, eds. Medical uses of statistics. 2nd ed. Boston, MA: NEJM Books, 1992:357-73.
  2. Chalmers I. Proposal to outlaw the term "negative trial." BMJ 1985;290:1002.
  3. Sung JJY, Chung SCS, Lai C-W, Chan FKL, Leung JWC, Yung M-L, Kassianides C, et al. Octreotide infusion or emergency sclerotherapy for variceal haemorrhage. Lancet 1993;342:637-41.
  4. Yusuf S, Collins R, Peto R, Furberg C, Stampfer MJ, Goldhaber SZ, et al. Intravenous and intracoronary fibrinolytic therapy in acute myocardial infarction: overview of results on mortality, reinfarction and side-effects from 33 randomized controlled trials. Eur Heart J 1985;6:556-85.





