BMJ 2001;322:226-231 (27 January)

Education and debate

Sifting the evidence---what's wrong with significance tests?

Jonathan A C Sterne, senior lecturer in medical statistics, George Davey Smith, professor of clinical epidemiology

Department of Social Medicine, University of Bristol, Bristol BS8 2PR

Correspondence to: J Sterne jonathan.sterne@bristol.ac.uk

The first 150 words of the full text of this article appear below.

The findings of medical research are often met with considerable scepticism, even when they have apparently come from studies with sound methodologies that have been subjected to appropriate statistical analysis. This is perhaps particularly the case with respect to epidemiological findings that suggest that some aspect of everyday life is bad for people. Indeed, one recent popular history, the medical journalist James Le Fanu's The Rise and Fall of Modern Medicine, went so far as to suggest that the solution to medicine's ills would be the closure of all departments of epidemiology.1

One contributory factor is that the medical literature shows a strong tendency to accentuate the positive; positive outcomes are more likely to be reported than null results.2-4 By this means alone a host of purely chance findings will be published, as by conventional reasoning examining 20 associations will produce one result that is "significant at P=0.05" by chance . . .
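
The "one in 20" reasoning in the extract can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the article: it assumes the 20 tests are independent and that every null hypothesis is true, and the function names are hypothetical. Under those assumptions the expected number of chance findings is 20 x 0.05 = 1, while the probability of seeing at least one is roughly 0.64.

```python
# Illustrative sketch, not from the article.
# Assumptions: tests are independent and every null hypothesis is true.

def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of 'significant' results arising by chance alone."""
    return n_tests * alpha


def prob_at_least_one(n_tests: int, alpha: float) -> float:
    """Probability that at least one of n_tests is 'significant' by chance."""
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    n_tests, alpha = 20, 0.05
    print(f"Expected chance findings: {expected_false_positives(n_tests, alpha):.2f}")
    print(f"P(at least one chance finding): {prob_at_least_one(n_tests, alpha):.2f}")
```

If the tests are correlated, as they often are when many associations are examined in one dataset, the second figure would differ, so it should be read as a rough guide rather than an exact rate.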



