BMJ 1996;313:628 (7 September)

Letters

Logistic regression models used in medical research are poorly presented

EDITOR,--The use of multiple regression models in medical research has increased greatly in recent years.1 Nevertheless, the accuracy with which a regression model describes the data (goodness of fit) is rarely assessed in medical research. Hence, medical journals may be publishing papers in which regression models are misused or results are misinterpreted.

We investigated the use of logistic regression in papers published in the BMJ, JAMA, the Lancet, and the New England Journal of Medicine during 1991-4. A Medline search using the strings "logistic regression" and "proportional odds model" yielded 111 papers. Of these, two stated in the abstract that logistic regression had been used, but the Cox model had been used instead. The remaining 109 papers used some kind of logistic regression. For each paper we recorded which kind of logistic regression was used (binary, polytomous, or ordinal), whether a statistical reference and the computer software were specified, and whether a valid assessment of the goodness of fit of the logistic models2 was reported.

Only one paper used the proportional odds model for ordinal response; the other 108 articles used binary logistic regression. A reference for logistic regression was specified in 48 papers, for the software in 57, and for both in only 26 papers. This is not in line with the guidelines of the International Committee of Medical Journal Editors.3 The most frequently specified reference was the book by Hosmer and Lemeshow,2 followed by the book by Breslow and Day4 and various SAS manuals, while the most popular software packages in descending order were SAS, SPSS, BMDP, EGRET, and GLIM.

Goodness of fit was rarely assessed. Three papers stated the use of the Hosmer-Lemeshow test,2 two compared the predicted and observed outcomes, and two reported the analysis of residuals. A further two reported the use of likelihood ratio statistics, but as the models contained continuous covariates the likelihood ratio test was inadequate.2 Thus only seven papers reported a valid assessment of the adequacy of their regression model.
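As an illustration of what such an assessment involves, the following is a minimal sketch of the Hosmer-Lemeshow statistic in Python. It is not taken from any of the surveyed papers. The function name hosmer_lemeshow and its parameters are hypothetical, and the grouping into 10 equal-sized groups by predicted probability is the conventional choice, assumed here rather than stated above. The sketch returns only the statistic and its degrees of freedom; the P value would come from a chi-squared distribution with that many degrees of freedom.

```python
def hosmer_lemeshow(y_true, p_pred, groups=10):
    """Return (statistic, degrees_of_freedom) for a Hosmer-Lemeshow check.

    y_true: observed binary outcomes (0 or 1).
    p_pred: fitted probabilities from a logistic model, same length.
    groups: number of risk groups (10 is the conventional choice).
    Illustrative sketch only; names and defaults are assumptions.
    """
    if len(y_true) != len(p_pred):
        raise ValueError("y_true and p_pred must have the same length")

    # Sort observations by predicted probability (stable sort on p only).
    pairs = sorted(zip(p_pred, y_true), key=lambda pair: pair[0])
    n = len(pairs)

    statistic = 0.0
    for g in range(groups):
        # Split the sorted observations into near-equal-sized groups.
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed events in the group
        expected = sum(p for p, _ in chunk)   # expected events in the group
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance

    # Reference distribution: chi-squared with (groups - 2) degrees of freedom.
    return statistic, groups - 2
```

A large statistic relative to the chi-squared reference distribution suggests that predicted and observed event counts disagree, that is, that the model fits poorly.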

As the validity of all results and conclusions depends strongly on the goodness of fit of the models used, this reporting practice is unsatisfactory and should be changed. We agree with Campillo that clear standardised publication criteria are needed to improve the current poor presentation of regression models in biomedical journals.5 We recommend that authors always report the goodness of fit of regression models to avoid invalid results.

RALF BENDER Statistician Department of Metabolic Diseases and Nutrition, Heinrich-Heine-University Dusseldorf, PO Box 10 10 07, D-40001 Dusseldorf, Germany

ULRICH GROUVEN Statistician Department of Anaesthesiology, Research Group Informatics and Biometry, Hanover Medical School, Hospital Oststadt, Podbielskistrasse 380, D-30659 Hanover, Germany


  1. Altman DG. Statistics in medical journals: development in the 1980s. Stat Med 1991;10:1897-913.
  2. Hosmer DW, Lemeshow S. Applied logistic regression. New York: Wiley, 1989.
  3. International Committee of Medical Journal Editors. Uniform requirements for manuscripts submitted to biomedical journals. JAMA 1993;269:2282-6.
  4. Breslow NE, Day NE. Statistical methods in cancer research. Vol 1. The analysis of case-control studies. Lyons: International Agency for Research on Cancer, 1980.
  5. Campillo C. Standardizing criteria for logistic regression models. Ann Intern Med 1993;119:540-1.

