Interpreting diagnostic accuracy studies for patient care

BMJ 2012; 345 doi: (Published 2 July 2012)
Cite this as: BMJ 2012;345:e3999

Recent rapid responses



This is a superb analysis of diagnostic clinical tests. The same analysis applied to the performance characteristics of symptoms and signs, clinical prediction rules or referral guidelines potentially determines whether we overdiagnose (1) and overmedicalise, and indeed whether we can afford a publicly funded health service. And the key is a concise term I have never heard before: “the relative misclassification cost”. How many false positives is a true positive worth?
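As a rough illustration of that question, the sketch below discounts false positives by the number judged to be worth one true positive. It is a simplified illustration, not necessarily the measure used in the paper; the function name, the parameter `fp_per_tp`, and all the numbers are assumptions made for the example.

```python
def net_true_positives(n, prevalence, sensitivity, specificity, fp_per_tp):
    """True positives minus false positives, with false positives
    discounted by the relative misclassification cost.

    fp_per_tp is the (assumed) number of false positives judged to be
    worth one true positive.
    """
    diseased = n * prevalence
    healthy = n - diseased
    true_pos = diseased * sensitivity
    false_pos = healthy * (1 - specificity)
    return true_pos - false_pos / fp_per_tp


# Hypothetical test: 1000 people, 10% prevalence, sensitivity 0.9, specificity 0.8.
# That gives 90 true positives and 180 false positives.
for cost in (10, 2):
    net = net_true_positives(1000, 0.10, 0.9, 0.8, fp_per_tp=cost)
    print(f"1 true positive worth {cost} false positives: net benefit {net:.0f}")
```

With these invented figures the same test yields a net benefit of about 72 true positives when ten false positives are accepted per true positive, and about zero when only two are accepted.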

Embarking on diagnostic testing causes a great deal of low level harm to the healthy but very significant benefit to the few. The benefit is clear. The low level harm to the majority is hard to measure and largely ignored.

The authors cite a study which showed that 63% of women thought that 500 false positives in breast screening were worth one life saved (2). But suppose the question were phrased differently: would you suffer the anxiety of 350 false positive results to have a 50% chance of benefiting ($0.998^{350} \approx 0.5$)?

Patients referred under the “Two Week Rule” for suspected cancer in the UK have an 11% chance of having cancer (3). I may expect to be referred 6 times to have an evens chance of benefit ($0.89^{6} \approx 0.5$). That seems OK. If the threshold of risk for referral was 1% then I would need to be referred 69 times for a 50% chance of having a diagnosis of cancer and possibly benefiting. As a 53 year old male I could expect two urgent referrals per year for 30 years before I had a 50:50 chance of benefiting.
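The arithmetic behind these figures can be sketched as follows, assuming each referral or screen is independent and carries the same chance of benefit (an assumption of the sketch; the function names are illustrative).

```python
import math


def chance_of_at_least_one(p, n):
    """Probability of at least one benefit in n independent events,
    each with probability p."""
    return 1 - (1 - p) ** n


def events_for_target_chance(p, target=0.5):
    """Smallest n for which the chance of at least one benefit reaches target."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


print(events_for_target_chance(0.11))     # 11% risk per referral -> 6
print(events_for_target_chance(0.01))     # 1% risk per referral  -> 69
print(events_for_target_chance(1 / 500))  # 1 in 500              -> 347
```

The last figure, 347, is the unrounded counterpart of the 350 quoted for breast screening. Under the same assumption, `chance_of_at_least_one(0.01, 60)` is about 0.45, so two referrals a year for 30 years at a 1% threshold falls slightly short of an evens chance.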

It is the “relative misclassification cost” that is the key, and it is something we rarely consider.

Reference List

1. Moynihan R, Doust J, Henry D. Preventing overdiagnosis: how to stop harming the healthy. BMJ 2012;344:e3502.
2. Schwartz LM, Woloshin S, Sox HC, et al. US women's attitudes to false-positive mammography results and detection of ductal carcinoma in situ: cross sectional survey. West J Med 2000;173:307-12.
3. Meecham D, Gildea C, Hollingworth L, et al. Variation in use of the 2-week referral pathway for suspected cancer: a cross sectional analysis. British Journal of General Practice 2012;about to be published.

Competing interests: None declared


Hoyland House Surgery, Painswick, Glos, GL6 6RD


The paper by Dr Mallett and colleagues on interpreting diagnostic accuracy studies for patient care addresses the important issue of including disease prevalence in the casemix in any summary of the clinical benefit of diagnostic tests [1]. Another, related, issue concerns patient selection for such diagnostic accuracy studies.

For example, in studies of the diagnostic accuracy of new dementia screening instruments, subjects are often selected on the basis of known diagnoses (e.g. dementia vs. controls, or dementia vs. mild cognitive impairment, or Alzheimer’s disease vs. frontotemporal dementia), with or without additional exclusion criteria to ensure relative purity of the comparison groups. This rigorous research approach is appropriate for index studies of new tests, although comparison with normal controls may inflate test metrics. Such methodology is compliant with criteria for the assessment of the quality of studies examining the diagnostic accuracy of clinical tests such as the Standards for Reporting Diagnostic Accuracy (STARD) [2] and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) [3]. However, this approach is alien to the idiom of day-to-day clinical practice, wherein undiagnosed, rather than diagnosed, individuals attend for assessment and diagnosis, and patients cannot be turned away because of the presence of exclusion criteria. In clinical practice many of the variables which are controlled for in research studies cannot be controlled.

Just as the results of randomised controlled trials, the gold standard for assessment of therapeutic efficacy, may not be replicable and are acknowledged to be problematic when applied to the messy contingencies of non-trial, day-to-day, clinical practice, so STARD- and/or QUADAS-compliant diagnostic accuracy studies may, despite their methodological rigour, not assist in clinical decision making and patient care. Too slavish an adherence to test cutoffs derived from such studies may not best serve diagnostic accuracy. To better reflect clinical practice, pragmatic diagnostic accuracy studies are required.

Pragmatic studies will select consecutive patient referrals, rather than selecting on the basis of diagnosis/known aetiology. For example, in the memory clinic setting, all attendees will at minimum have subjective memory impairment; there are no normal controls. The need to adjust test cutoffs for maximal diagnostic accuracy has been found for a number of cognitive screening instruments (Addenbrooke’s Cognitive Examination and its Revision, Montreal Cognitive Assessment, Test Your Memory Test) examined in pragmatic diagnostic accuracy studies [4]. Such pragmatic studies, reflecting the naturalistic, cross-sectional nature of clinical assessment and the spectrum bias of day-to-day clinical practice, may better inform clinical decision making and patient care.
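A minimal sketch of how a cutoff might be re-derived from a consecutive clinic sample is given below. The scores and diagnoses are invented for illustration and do not come from any of the instruments named above; the cutoff of 20 stands in for a hypothetical published threshold, and overall accuracy is only one of several criteria that could be maximised.

```python
def accuracy_at_cutoff(scores, impaired, cutoff):
    """Fraction correctly classified when a score at or below the cutoff
    is called test-positive (lower scores indicate worse cognition)."""
    correct = sum((s <= cutoff) == d for s, d in zip(scores, impaired))
    return correct / len(scores)


def best_cutoff(scores, impaired):
    """Observed score that maximises overall accuracy in this sample.
    Ties are resolved in favour of the lowest cutoff."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: accuracy_at_cutoff(scores, impaired, c))


# Hypothetical consecutive referrals: test score and final diagnosis.
scores = [12, 15, 18, 20, 21, 22, 23, 24, 25, 26, 27, 28]
impaired = [True, True, True, True, False, True, True,
            False, False, False, False, False]

published = 20  # hypothetical cutoff from an index study
local = best_cutoff(scores, impaired)
print(local, accuracy_at_cutoff(scores, impaired, local))
print(published, accuracy_at_cutoff(scores, impaired, published))
```

In this invented sample the locally derived cutoff of 23 classifies 11 of 12 patients correctly, against 10 of 12 for the hypothetical published cutoff.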

1. Mallett S, Halligan S, Thompson M, Collins GS, Altman DG. Interpreting diagnostic accuracy studies for patient care. BMJ 2012;345:e3999. (25 August.)
2. Bossuyt PM, Reitsma JB, Bruns DE, et al. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Clin Chem 2003;49:7-18.
3. Whiting P, Rutjes AW, Dinnes J, Reitsma J, Bossuyt PM, Kleijnen J. Development and validation of methods for assessing the quality of diagnostic accuracy studies. Health Technol Assess 2004;8:iii,1-234.
4. Larner AJ. Dementia in clinical practice: a neurological approach. Studies in the dementia clinic. London: Springer, 2012: 34,38,41-2,43-4.

Competing interests: None declared

Andrew J Larner, Consultant Neurologist

Walton Centre for Neurology and Neurosurgery, Liverpool, L9 7LJ


The index of specificity in combination with sensitivity provides only a limited indication of the potential usefulness and accuracy of a diagnostic test. It provides a helpful indication of a test’s performance when screening a well defined population for a disease. However, it cannot be applied to differential diagnosis and treatment selection [1].

During diagnostic reasoning it is important to know the frequency with which different diagnoses (i.e. their diagnostic criteria) occur in those with a diagnostic finding in a range of settings. If the list accounting for the finding is short, then it is a helpful diagnostic lead. It is important to know if findings occur commonly in some of the diagnoses suggested by a lead and rarely or never in others (i.e. so that the ratio of sensitivities is low or zero). It is also important not to divide numerical results into positive and negative results but to interpret each numerical result by plotting their distributions. All indices depend on the ‘disease’ being first identified by a diagnostic criterion. It is therefore very important for evidence to be provided that tests used as diagnostic and treatment selection criteria are appropriate for that purpose [1].
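The role of the ratio of sensitivities can be sketched with a two-diagnosis example. This is a simplification that assumes only two candidate diagnoses account for the lead; the function name and all frequencies are hypothetical.

```python
def probability_of_a(prior_a, prior_b, p_finding_given_a, p_finding_given_b):
    """Probability of diagnosis A, among patients with either A or B,
    once a further finding has been observed.

    The priors are the frequencies of A and B in patients with the
    initial lead; the conditional probabilities are the frequencies of
    the further finding in each diagnosis (its 'sensitivities')."""
    weight_a = prior_a * p_finding_given_a
    weight_b = prior_b * p_finding_given_b
    return weight_a / (weight_a + weight_b)


# Finding common in A (80%) and rare in B (8%): ratio of sensitivities 0.1.
print(round(probability_of_a(0.5, 0.5, 0.80, 0.08), 3))  # 0.909

# Finding never seen in B: ratio of sensitivities zero, so B is excluded.
print(probability_of_a(0.5, 0.5, 0.80, 0.0))  # 1.0
```

The lower the ratio of sensitivities, the more strongly the finding favours one diagnosis over the other, whatever the test's specificity in a screening population.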

Therefore, when providing evidence on how well tests perform, it is also important to assess their performance in suggesting differential diagnoses, differentiating between those diagnoses, defining diagnoses in terms of sufficient and necessary criteria and selecting patients for treatment.


1. Llewelyn H, Ang AH, Lewis K, Abdullah A. The Oxford Handbook of Clinical Diagnosis, 2nd edition. Oxford: Oxford University Press, 2009: 754-60.

Competing interests: None declared

Huw Llewelyn, General physician and endocrinologist

Nevill Hall Hospital, Brecon Road, Abergavenny, NP7 7EG
