Br Med J (Clin Res Ed)  1986;292:746-750 (15 March), doi:10.1136/bmj.292.6522.746

Confidence intervals rather than P values: estimation rather than hypothesis testing.

M J Gardner, D G Altman

Overemphasis on hypothesis testing, and the use of P values to dichotomise results as significant or non-significant, has detracted from more useful approaches to interpreting study results, such as estimation and confidence intervals. In medical studies investigators are usually interested in determining the size of the difference in a measured outcome between groups, rather than a simple indication of whether or not it is statistically significant. Confidence intervals present a range of values, on the basis of the sample data, in which the population value for such a difference may lie. Some methods of calculating confidence intervals for means and differences between means are given, with similar information for proportions. The paper also gives suggestions for graphical display. Confidence intervals, if appropriate to the type of study, should be used for major findings in both the main text of a paper and its abstract.
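
As an illustration of the kind of calculation the abstract refers to, the following is a minimal sketch in Python, not the authors' own method. It assumes large samples, so it uses a normal approximation throughout; for small samples a t distribution would usually replace the normal quantile for means. The function names `ci_mean_difference` and `ci_proportion` are hypothetical and chosen only for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def ci_mean_difference(a, b, level=0.95):
    """Approximate CI for mean(a) - mean(b).

    Assumption: samples are large enough for a normal approximation,
    and the standard error is computed without pooling the variances.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    diff = mean(a) - mean(b)
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return diff - z * se, diff + z * se


def ci_proportion(successes, n, level=0.95):
    """Approximate CI for a single proportion (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    se = sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se


if __name__ == "__main__":
    # Made-up illustrative data, not taken from the paper.
    group_a = [1, 2, 3, 4, 5]
    group_b = [2, 3, 4, 5, 6]
    print(ci_mean_difference(group_a, group_b))
    print(ci_proportion(50, 100))
```

Reporting the estimate together with an interval of this kind conveys both the size of the difference and the precision with which it has been estimated, which a P value alone does not.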






