Prognosis and prognostic research: validating a prognostic model
BMJ 2009; 338 doi: https://doi.org/10.1136/bmj.b605 (Published 28 May 2009)
Cite this as: BMJ 2009;338:b605
- Douglas G Altman, professor of statistics in medicine1,
- Yvonne Vergouwe, assistant professor of clinical epidemiology2,
- Patrick Royston, senior statistician3,
- Karel G M Moons, professor of clinical epidemiology2
- 1Centre for Statistics in Medicine, University of Oxford, Oxford OX2 6UD
- 2Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, Netherlands
- 3MRC Clinical Trials Unit, London NW1 2DA
- Correspondence to: D G Altman doug.altman{at}csm.ox.ac.uk
- Accepted 6 October 2008
Prognostic models, like the one we developed in the previous article in this series,1 yield scores to enable the prediction of the risk of future events in individual patients or groups and the stratification of patients by these risks.2 A good model may allow the reasonably reliable classification of patients into risk groups with different prognoses. To show that a prognostic model is valuable, however, it is not sufficient to show that it successfully predicts outcome in the initial development data. We need evidence that the model performs well for other groups of patients.1 3 In this article, we discuss how to evaluate the performance of a prognostic model in new data.4 5
Summary points
Unvalidated models should not be used in clinical practice
When validating a prognostic model, calibration and discrimination should be evaluated
Validation should be done on different data from those used to develop the model, preferably from patients in other centres
Models may not perform well in practice because of deficiencies in the development methods or because the new sample is too different from the original
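The two performance aspects named above can be illustrated with a minimal sketch. The data, variable names, and simulation here are invented for illustration (they are not from the article); the sketch assumes a model's linear predictor is applied to a new validation sample, discrimination is summarised by the c statistic (area under the ROC curve), and calibration by the slope and intercept of a logistic regression of observed outcomes on the linear predictor (slope near 1 and intercept near 0 suggest good calibration).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical validation sample: outcomes simulated so that the model's
# predicted risks are, by construction, roughly well calibrated.
rng = np.random.default_rng(0)
n = 2000
lp = rng.normal(0, 1, n)            # linear predictor from the existing model
p_pred = 1 / (1 + np.exp(-lp))      # predicted risks for the new patients
y = rng.binomial(1, p_pred)         # observed binary outcomes

# Discrimination: c statistic (equivalent to the area under the ROC curve)
c_stat = roc_auc_score(y, p_pred)

# Calibration: logistic regression of observed outcomes on the linear
# predictor (large C approximates an unpenalised fit)
cal = LogisticRegression(C=1e6).fit(lp.reshape(-1, 1), y)
cal_slope = cal.coef_[0][0]
cal_intercept = cal.intercept_[0]

print(f"c statistic: {c_stat:.2f}")
print(f"calibration slope: {cal_slope:.2f}, intercept: {cal_intercept:.2f}")
```

In a real validation study the linear predictor would come from the published model applied to independently collected patient data; a calibration slope well below 1 in such data is a typical sign of overfitting in the development sample.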
Why prognostic models may not predict well
Various statistical or clinical factors may lead a prognostic model to perform poorly when applied to other patients.4 6 The model’s predictions may not be reproducible because of deficiencies in the design or modelling methods used in the study to derive the model, if the model was overfitted, or if an important predictor is absent from the model (which may be hard to know).1 Poor performance in new patients can also arise from differences between the setting of patients in the new and derivation …