Rapid Responses to:

ANALYSIS:
Holger J Schünemann, Andrew D Oxman, Jan Brozek, Paul Glasziou, Roman Jaeschke, Gunn E Vist, John W Williams, Jr, Regina Kunz, Jonathan Craig, Victor M Montori, Patrick Bossuyt, Gordon H Guyatt for the GRADE Working Group
Grading quality of evidence and strength of recommendations for diagnostic tests and strategies
BMJ 2008; 336: 1106-1110

Rapid Responses published:

Quality of evidence for logical diagnosis and treatment selection
Huw Llewelyn, Rhys Llewelyn, ST1 surgery, The Royal Gwent Hospital, Newport, NP20 2UB   (19 May 2008)

Quality of evidence for logical diagnosis and treatment selection 19 May 2008
Huw Llewelyn,
Consultant Physician
Kettering General Hospital, NN16 8UZ,
Rhys Llewelyn, ST1 surgery, The Royal Gwent Hospital, Newport, NP20 2UB


Schünemann and colleagues advise that diagnostic tests should be assessed in clinical trials by randomising patients to a new or old test strategy and then observing the outcomes. They propose dividing test results into ‘positive’ and ‘negative’ when doing this. However, the dividing lines themselves need to be established in an evidence-based way. For example, the IRMA 2 trial showed no difference in outcome between irbesartan and placebo in diabetic patients with normal or well controlled blood pressures when the albumin excretion rate (AER) was between 20 and 40 mcg/min. In the other 70% of patients in the same trial, whose AER was above 40 mcg/min, irbesartan was increasingly superior to placebo [1]. Cut-off points are usually chosen without reference to such evidence, typically as two standard deviations from the mean in a reference population (e.g. an AER of 20 mcg/min). This arbitrary approach means that some patients may be treated unnecessarily and others deprived of an effective treatment.
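The contrast between the two ways of choosing a cut-off can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reference values and the per-band outcome risks are invented for the example and are not the IRMA 2 results, and the function names (two_sd_cutoff, evidence_based_cutoff) and the minimum risk reduction are hypothetical. The sketch also ignores statistical uncertainty, which a real analysis would have to address.

```python
from statistics import mean, stdev

# Hypothetical AER values (mcg/min) from a reference population; not real data.
reference_aer = [4, 6, 7, 8, 9, 10, 11, 12, 14, 19]


def two_sd_cutoff(values):
    """Conventional cut-off: mean plus two sample standard deviations."""
    return mean(values) + 2 * stdev(values)


# Hypothetical trial summary, one tuple per AER band:
# (lower bound, upper bound, risk of outcome on placebo, risk on treatment).
# Invented for illustration; NOT the IRMA 2 results.
bands = [
    (20, 40, 0.02, 0.02),
    (40, 80, 0.10, 0.05),
    (80, 200, 0.30, 0.10),
]


def evidence_based_cutoff(bands, min_risk_reduction=0.01):
    """Lower bound of the lowest band in which the absolute risk reduction
    reaches min_risk_reduction, or None if no band does.

    Simplification: assumes benefit persists in all higher bands.
    """
    for lower, _upper, placebo_risk, treated_risk in sorted(bands):
        if placebo_risk - treated_risk >= min_risk_reduction:
            return lower
    return None


print(f"Reference-range cut-off: {two_sd_cutoff(reference_aer):.1f} mcg/min")
print(f"Outcome-based cut-off:   {evidence_based_cutoff(bands)} mcg/min")
```

With these invented numbers the reference-range rule gives a cut-off a little under 20 mcg/min, whereas the outcome-based rule places it at 40 mcg/min, so patients between the two values would be labelled ‘positive’ without any demonstrated benefit from treatment.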

The performance of tests as diagnostic leads [2] or ‘pivots’ [3] is also important: the shorter the list of differential diagnoses, the more helpful the test. For example, localised right lower quadrant tenderness may provide weak likelihood ratios but it narrows the differential diagnosis in a helpful way. ‘Guarding’ might also have a weak likelihood ratio for appendicitis compared to ‘not appendicitis’ (which includes cholecystitis, etc.) but it differentiates well between appendicitis and non-specific abdominal pain. Therefore, the ratio of its likelihoods (i.e. sensitivities) between these two diagnoses alone will be strong and helpful. Such ‘diagnostic leads’ and ‘differentiators’ complement each other to predict the appearance of the appendix at laparotomy more accurately than would be expected by merely assuming statistical independence between their likelihood ratios [4]. So sensitivities, specificities and likelihood ratios alone are not enough to represent a test’s performance in the differential diagnostic process.
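The arithmetic behind the ‘guarding’ example can be made concrete with a minimal Python sketch. All frequencies are hypothetical and chosen only to show the calculation; the diagnosis mix among patients without appendicitis and the function names are likewise assumptions made for the illustration.

```python
# Hypothetical P(guarding | diagnosis); illustrative values, not measured data.
guarding_given = {
    "appendicitis": 0.60,
    "cholecystitis": 0.55,
    "non_specific_abdominal_pain": 0.10,
}

# Hypothetical share of each diagnosis among patients WITHOUT appendicitis.
mix_without_appendicitis = {
    "cholecystitis": 0.5,
    "non_specific_abdominal_pain": 0.5,
}


def likelihood_ratio_vs_rest(finding_given, target, mix_of_rest):
    """Conventional likelihood ratio: P(finding | target) divided by
    P(finding | not target), where the latter is a weighted average over
    the other diagnoses."""
    p_rest = sum(finding_given[d] * share for d, share in mix_of_rest.items())
    return finding_given[target] / p_rest


def pairwise_likelihood_ratio(finding_given, a, b):
    """Ratio of the finding's likelihoods (sensitivities) between two diagnoses."""
    return finding_given[a] / finding_given[b]


lr_rest = likelihood_ratio_vs_rest(
    guarding_given, "appendicitis", mix_without_appendicitis
)
lr_pair = pairwise_likelihood_ratio(
    guarding_given, "appendicitis", "non_specific_abdominal_pain"
)
print(f"Appendicitis vs 'not appendicitis': {lr_rest:.1f}")
print(f"Appendicitis vs non-specific abdominal pain: {lr_pair:.1f}")
```

With these invented figures the conventional likelihood ratio is about 1.8, because cholecystitis also produces guarding, while the ratio between appendicitis and non-specific abdominal pain alone is 6. The same finding is therefore a weak discriminator in one comparison and a strong one in the other.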

The GRADE system should therefore also incorporate these logical approaches when assessing tests. They matter for training as well, so that future doctors can learn to use old and new tests intelligently and to maximum effect in an evidence-based way. Otherwise, resources will continue to be wasted and patients will continue to suffer from avoidable misdiagnoses and consequent mistreatments.

References

1. Llewelyn DEH, Garcia-Puig J. How different urinary albumin excretion rates can predict progression to nephropathy and the effect of treatment in hypertensive diabetics. JRAAS 2004; 5: 141-5.

2. Llewelyn H, Ang H, Lewis K, Al-Abdullah A. The Oxford handbook of clinical diagnosis. Oxford: Oxford University Press, 2006.

3. Eddy DM, Clanton. The art of diagnosis. NEJM 1982; 306: 1263-8.

4. Llewelyn DEH. Assessing the validity of diagnostic tests and clinical decisions. MD thesis, University of London, 1988.

Competing interests: None declared