Papers

Reporting of precision of estimates for diagnostic accuracy: a review

BMJ 1999; 318 doi: http://dx.doi.org/10.1136/bmj.318.7194.1322 (Published 15 May 1999) Cite this as: BMJ 1999;318:1322
  1. Robert Harper, principal optometrist (robert.harper@man.ac.uk)a,
  2. Barnaby Reeves, senior lecturerb
  1. a Department of Ophthalmology, Manchester Royal Eye Hospital, Manchester M13 9WH
  2. b Health Services Research Unit, London School of Hygiene and Tropical Medicine, London WC1E 7HT
  1. Correspondence to: Dr Harper
  • Accepted 15 December 1998

Diagnostic accuracy is usually characterised by the sensitivity and specificity of a test, and these indices are most commonly presented when evaluations of diagnostic tests are reported. It is important to emphasise that, as in other empirical studies, specific values of diagnostic accuracy are merely estimates. Therefore, when evaluations of diagnostic accuracy are reported the precision of the sensitivity and specificity estimates or likelihood ratios should be stated.1-3 If sensitivity and specificity estimates are reported without a measure of precision, clinicians cannot know the range within which the true values of the indices are likely to lie.

Confidence intervals are widely used in medical literature, and journals usually require confidence intervals to be specified for other descriptive estimates and for epidemiological or experimental analytical comparisons. Journals seem less vigilant, however, for evaluations of diagnostic accuracy. For example, a recent review of compliance with methodological standards in diagnostic test research found that for the period 1978-93 only 12 of 112 studies published in the New England Journal of Medicine, JAMA, the BMJ, and the Lancet reported the precision of the estimates of diagnostic accuracy.3 We have found that the reporting of 95% confidence intervals for estimates is somewhat better in a more recent two year interval for studies published in the BMJ but still far from ideal.

Methods and results

We searched the Medline database (for 1996 and 1997) for reports of diagnostic evaluations in the BMJ. After we excluded letters, case reports, and review or education articles we identified 16 studies (references supplied on request). Only eight papers (50%; 95% confidence interval 25% to 75%) reported precision for the estimates of diagnostic accuracy, and two of these provided confidence intervals only for predictive values or likelihood ratios, not for the sensitivity and specificity estimates that were also reported.

Comment

Evaluations of diagnostic accuracy should be presented with confidence intervals. We have also recently reviewed the extent of compliance with the reporting of confidence intervals in the ophthalmic literature and concluded that evaluations of diagnostic tests in this specialty are similarly flawed.4 The omission of the precision of estimates for diagnostic accuracy can make a considerable difference to a clinician's interpretation of the findings of a study. For example, an evaluation of the sensitivity and specificity of an imaging system for the optic nerve head for the detection of glaucoma reported estimates of 89% and 78%, respectively5; the 95% confidence intervals of these estimates (not reported in the paper) ranged from 80% to 98% for sensitivity and from 66% to 90% for specificity. For a test with poorer diagnostic accuracy, these 95% confidence intervals would have been even larger for an equivalent sample size because of the dependence of the standard error of a proportion on the proportion itself (figure). The figure shows how the precision of the sensitivity or specificity estimate varies as a function of both the point estimate itself and the sample size.
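As an illustration of the calculation, the Python sketch below computes exact binomial (Clopper-Pearson) 95% confidence intervals for counts chosen only so that the point estimates are close to the quoted 89% and 78%; the denominators are not reported in the original evaluation and are purely hypothetical, so the resulting intervals are illustrative rather than a reproduction of the figures above.

```python
# A minimal sketch, assuming hypothetical denominators (the paper does not
# report how many diseased and healthy eyes the estimates are based on).
from scipy.stats import beta

def exact_binomial_ci(successes, n, level=0.95):
    """Exact (Clopper-Pearson) confidence interval for a proportion."""
    alpha = 1 - level
    lower = beta.ppf(alpha / 2, successes, n - successes + 1) if successes > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, successes + 1, n - successes) if successes < n else 1.0
    return lower, upper

# Hypothetical counts giving point estimates close to 89% and 78%
print(exact_binomial_ci(41, 46))   # sensitivity 41/46 ≈ 0.89
print(exact_binomial_ci(36, 46))   # specificity 36/46 ≈ 0.78
```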

Figure 1

Breadth of exact binomial 95% confidence intervals as a function of the sample estimate of the proportion of interest and the sample size; from outside to centre, pairs of lines represent sample sizes of 20, 40, 60, 100, 200, and 500. Note that the 95% confidence interval is widest for a proportion equal to 0.5 and narrows as the proportion tends to 0 or 1. To use the figure, read off the upper and lower 95% confidence limits and simply add them to and subtract them from the sample estimate; for example, a sample estimate of 0.5, based on a sample size of 100, has a 95% confidence interval that ranges from 0.5−0.1 to 0.5+0.1 (0.4 to 0.6)
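The pattern shown in the figure can also be generated numerically. The short sketch below (sample sizes taken from the figure legend; the proportions are chosen for illustration) prints exact binomial 95% confidence intervals, including the worked example of an estimate of 0.5 from a sample of 100.

```python
# Sketch reproducing the figure's pattern: exact binomial 95% confidence
# intervals are widest near a proportion of 0.5 and narrow as the sample grows.
from scipy.stats import beta

def exact_binomial_ci(successes, n, level=0.95):
    alpha = 1 - level
    lower = beta.ppf(alpha / 2, successes, n - successes + 1) if successes > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, successes + 1, n - successes) if successes < n else 1.0
    return lower, upper

for n in (20, 40, 60, 100, 200, 500):      # sample sizes plotted in the figure
    for p in (0.5, 0.8, 0.95):             # illustrative proportions
        x = round(p * n)                   # nearest whole number of "positives"
        lo, hi = exact_binomial_ci(x, n)
        print(f"n={n:3d}  estimate={x / n:.2f}  95% CI {lo:.2f} to {hi:.2f}")
```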

Most statistical packages will generate exact binomial confidence intervals. Approximate confidence intervals can easily be calculated by using the formula for the SE of a proportion, √(pq/n), which is based on a normal approximation to the binomial distribution and can be used to calculate 95% confidence intervals for sensitivity and specificity (for instance, p±1.96√(pq/n), where p represents either sensitivity or specificity, q=1−p, n is the sample size, and both n×p and n×q are >10).
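A minimal sketch of this approximation is given below; the sample size is again hypothetical and is used only to show how the interval p±1.96√(pq/n) is obtained.

```python
# Approximate (normal/Wald) 95% confidence interval for a proportion,
# i.e. p ± 1.96*sqrt(p*q/n); reasonable when the counts are fairly large.
from math import sqrt

def approx_ci(p, n, z=1.96):
    """Approximate 95% CI for a sensitivity or specificity estimate p from n subjects."""
    se = sqrt(p * (1 - p) / n)              # standard error of the proportion
    return max(0.0, p - z * se), min(1.0, p + z * se)

# Illustrative example: sensitivity of 0.89 observed in a hypothetical sample of 46 eyes
print(approx_ci(0.89, 46))                  # approximately (0.80, 0.98)
```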

To enhance the quality of information on diagnostic tests made available to clinicians we recommend that 95% confidence intervals are supplied with estimates of diagnostic accuracy. Referees and journal editors should enforce this requirement in the same way as they routinely do for other descriptive or comparative estimates.

Acknowledgments

Contributors: RH and BR both contributed to the idea and the methods. RH carried out the search and reviewed the papers, and BR performed the calculations to develop the figure. RH and BR jointly drafted and revised the paper and are both guarantors.

Funding: No external funding.

Competing interests: None declared.

References
