- Susan Mallett, medical statistician (1),
- Steve Halligan, professor of radiology (2),
- Matthew Thompson, GP, senior clinical scientist, codirector of Oxford Centre for Monitoring and Diagnosis (1),
- Gary S Collins, senior medical statistician (3),
- Douglas G Altman, professor of statistics in medicine (3)
- (1) University of Oxford, Department of Primary Care Health Sciences, Oxford OX2 6GG, UK
- (2) University College London, Centre for Medical Imaging, London NW1 2BU, UK
- (3) University of Oxford, Centre for Statistics in Medicine, Oxford OX2 6UD
- Correspondence to: S Mallett
- Accepted 17 May 2012
Studies of tests that aim to diagnose clinical conditions, and that are directly applicable to daily practice, should present results that can be interpreted directly in terms of individual patients, for example, the number of true positive and false positive diagnoses. We do not examine measures used for early experimental (exploratory) studies, in which diagnostic thresholds have not been established.
Results obtained from a diagnostic test accuracy study are expressed by comparison with a reference standard of the “true” disease status for each patient. Thus, once a clinically relevant diagnostic threshold has been established, patients’ results can be categorised by the test as true positive (TP), false positive (FP), true negative (TN), and false negative (FN) (fig 1).
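The categorisation described above can be sketched in a few lines of Python. This is an illustration only: the patient values, the threshold of 0.5, the function names, and the convention that a value at or above the threshold counts as test positive are all assumptions, not taken from any study.

```python
from collections import Counter


def classify(test_value, has_disease, threshold):
    """Label one patient as TP, FP, TN, or FN.

    Assumption for illustration: a value at or above the threshold
    is a positive test result. has_disease is the reference standard.
    """
    test_positive = test_value >= threshold
    if test_positive:
        return "TP" if has_disease else "FP"
    return "FN" if has_disease else "TN"


def two_by_two(test_values, reference, threshold):
    """Count the four cells of the 2x2 table at a single threshold."""
    counts = Counter(
        classify(value, diseased, threshold)
        for value, diseased in zip(test_values, reference)
    )
    return {cell: counts.get(cell, 0) for cell in ("TP", "FP", "TN", "FN")}


# Hypothetical test results and reference standard for eight patients.
values = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1, 0.6, 0.3]
disease = [True, True, True, False, False, False, True, False]

table = two_by_two(values, disease, threshold=0.5)
print(table)
```

Changing the threshold moves patients between cells, which is why the counts are meaningful only once a clinically relevant threshold has been fixed.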
Diagnostic accuracy can be presented at a specific threshold by using paired results such as sensitivity and specificity, or alternatively positive predictive value (PPV) and negative predictive value (NPV) (see fig 1). Other methods summarise accuracy over a range of different test thresholds, for example, the area under the receiver operating characteristic curve (ROC AUC, see fig 1).
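As a minimal sketch, the paired measures at a single threshold follow directly from the four cell counts using their standard definitions. The function name and the counts below are hypothetical and chosen only to make the arithmetic easy to check.

```python
def accuracy_measures(tp, fp, tn, fn):
    """Paired accuracy measures from the cells of a 2x2 table.

    Assumes every denominator is non-zero, that is, the sample contains
    diseased and non-diseased patients and both positive and negative
    test results.
    """
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # non-diseased patients who test negative
        "ppv": tp / (tp + fp),          # positive results that are correct
        "npv": tn / (tn + fn),          # negative results that are correct
    }


# Hypothetical counts: 100 patients with disease, 200 without.
measures = accuracy_measures(tp=90, fp=40, tn=160, fn=10)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are calculated within the diseased and non-diseased groups respectively, whereas PPV and NPV are calculated within the test positive and test negative groups, so the predictive values change with the prevalence of disease in the sample.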
Despite the simplicity of the 2×2 structure, the presentation and interpretation of tests and comparisons between them are not straightforward. Graphical presentation can be highly informative, in particular an ROC plot, which is a plot of sensitivity against 1−specificity (or false positive rate). Figure 2 shows an ROC plot of test accuracy of a single test at different thresholds. ROC plots are also …
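To make the construction of an ROC plot concrete, the sketch below computes one (1−specificity, sensitivity) point per threshold and estimates the area under the resulting curve with the trapezoidal rule. The data, the function names, and the convention that values at or above a threshold are test positive are assumptions for illustration; the trapezoidal rule is one common way to estimate the area, not necessarily the method used for the figures in this article.

```python
def roc_points(test_values, reference):
    """Return (1 - specificity, sensitivity) pairs, one per threshold.

    Each distinct test value is used as a threshold, from highest to
    lowest. Assumption: values at or above the threshold are positive.
    """
    n_diseased = sum(reference)
    n_healthy = len(reference) - n_diseased
    points = [(0.0, 0.0)]  # threshold above every value: no positives
    for threshold in sorted(set(test_values), reverse=True):
        tp = sum(1 for v, d in zip(test_values, reference) if v >= threshold and d)
        fp = sum(1 for v, d in zip(test_values, reference) if v >= threshold and not d)
        points.append((fp / n_healthy, tp / n_diseased))
    return points


def auc_trapezoid(points):
    """Area under the ROC points by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test results and reference standard for eight patients.
values = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1, 0.6, 0.3]
disease = [True, True, True, False, False, False, True, False]

points = roc_points(values, disease)
auc = auc_trapezoid(points)
print(points)
print(auc)
```

Each point corresponds to one 2×2 table, so a single ROC plot condenses the sensitivity and specificity of the same test at every threshold considered.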