Analysis: Rating quality of evidence and strength of recommendations

Grading quality of evidence and strength of recommendations for diagnostic tests and strategies

BMJ 2008; 336 doi: (Published 15 May 2008) Cite this as: BMJ 2008;336:1106
  1. Holger J Schünemann, professor (1, 2)
  2. Andrew D Oxman, researcher (3)
  3. Jan Brozek, research fellow (1)
  4. Paul Glasziou, professor (4)
  5. Roman Jaeschke, clinical professor (5)
  6. Gunn E Vist, researcher (3)
  7. John W Williams Jr, professor (6)
  8. Regina Kunz, associate professor (7)
  9. Jonathan Craig, associate professor (8)
  10. Victor M Montori, associate professor (9)
  11. Patrick Bossuyt, professor (10)
  12. Gordon H Guyatt, professor (2)

for the GRADE Working Group

  1. Department of Epidemiology, Italian National Cancer Institute Regina Elena, 00144 Rome, Italy
  2. CLARITY Research Group, Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada L8N 3Z5
  3. Norwegian Knowledge Centre for the Health Services, PO Box 7004, 0130 Oslo, Norway
  4. Centre for Evidence-Based Medicine, Department of Primary Health Care, University of Oxford, Oxford OX3 7LF
  5. Department of Medicine, McMaster University, 1200 Main Street West, Hamilton, Ontario, Canada L8N 3Z5
  6. Department of Medicine, Duke University and Durham VA Medical Center, Durham, NC 27705, USA
  7. Basel Institute of Clinical Epidemiology, University Hospital Basel, Hebelstrasse 10, 4031 Basel, Switzerland
  8. Screening and Test Evaluation Program, School of Public Health, University of Sydney, Department of Nephrology, Children’s Hospital at Westmead, Sydney, Australia
  9. Knowledge and Encounter Research Unit, Department of Medicine, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
  10. Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Centre, University of Amsterdam, Amsterdam 1100 DE, Netherlands

Correspondence to: H J Schünemann schuneh{at}

The GRADE system can be used to grade the quality of evidence and strength of recommendations for diagnostic tests or strategies. This article explains how patient-important outcomes are taken into account in this process

Summary points

  • As for other interventions, the GRADE approach to grading the quality of evidence and strength of recommendations for diagnostic tests or strategies provides a comprehensive and transparent approach for developing recommendations

  • Cross sectional or cohort studies can provide high quality evidence of test accuracy

  • However, test accuracy is a surrogate for patient-important outcomes, so such studies often provide low quality evidence for recommendations about diagnostic tests, even when the studies do not have serious limitations

  • Inferring from data on accuracy that a diagnostic test or strategy improves patient-important outcomes will require the availability of effective treatment, reduction of test related adverse effects or anxiety, or improvement of patients’ wellbeing from prognostic information

  • Judgments are thus needed to assess the directness of test results in relation to consequences of diagnostic recommendations that are important to patients

In this fourth article of the five part series, we describe how guideline developers are using GRADE to rate the quality of evidence and move from evidence to a recommendation for diagnostic tests and strategies. Although recommendations on diagnostic testing share the fundamental logic of recommendations on treatment, they present unique challenges. We will describe why guideline panels should be cautious when they use evidence of the accuracy of tests (“test accuracy”) as the basis for recommendations and why evidence of test accuracy often provides low quality evidence for making recommendations.

Testing makes a variety of contributions to patient care

Clinicians use tests that are usually referred to as “diagnostic”—including signs and symptoms, imaging, biochemistry, pathology, and psychological testing—for various purposes.1 These purposes include identifying physiological derangements, establishing prognosis, monitoring illness and response to treatment, and diagnosis. This article …
