GRADE Series
GRADE guidelines: 11. Making an overall rating of confidence in effect estimates for a single outcome and for all outcomes

https://doi.org/10.1016/j.jclinepi.2012.01.006

Abstract

GRADE requires guideline developers to make an overall rating of confidence in estimates of effect (quality of evidence—high, moderate, low, or very low) for each important or critical outcome. GRADE suggests, for each outcome, the initial separate consideration of five domains of reasons for rating down the confidence in effect estimates, thereby allowing systematic review authors and guideline developers to arrive at an outcome-specific rating of confidence. Although this rating system represents discrete steps on an ordinal scale, it is helpful to view confidence in estimates as a continuum, and the final rating of confidence may differ from that suggested by separate consideration of each domain.

An overall rating of confidence in estimates of effect is relevant only in settings where recommendations are being made. In general, it is based on the critical outcome that provides the lowest confidence.

Introduction

What is new?

Key points

  1. GRADE requires a rating of confidence in effect estimates (quality of evidence) for each outcome.

  2. Rating confidence in the evidence requires a gestalt judgment that simultaneously considers all eight domains (risk of bias, imprecision, inconsistency, and so forth).

  3. Guideline developers using GRADE will subsequently make an overall rating of confidence in effect estimates across all outcomes based on those outcomes they consider critical to their recommendation.

  4. Optimal application of GRADE requires making the reasons for key judgments transparent.

Prior articles in this series have explored GRADE’s approach to rating confidence in estimates of effect (quality of evidence) and grading strength of recommendations (guidance for practice). We have dealt with issues of framing the question [1]; introduced GRADE’s conceptual approach to rating the confidence in a body of evidence [2]; presented five reasons for rating down the confidence in effect estimates (risk of bias [3], imprecision [4], inconsistency [5], indirectness [6], and publication bias [7]); presented three reasons for rating up the confidence in effect estimates [8] (a large magnitude of effect, a dose-response gradient, and a situation in which plausible biases, if present, would serve to increase our confidence in the effect estimate); and dealt with issues specific to resource use. This 11th article in the series focuses on (1) summarizing the confidence in effect estimates for each important or critical outcome and (2) determining the overall confidence in effect estimates across all critical outcomes.
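The second of these tasks can be illustrated with a short sketch. As the abstract notes, the overall rating is, in general, based on the critical outcome that provides the lowest confidence. The Python below is only an illustration of that rule under simplifying assumptions: the names `Confidence` and `overall_confidence`, the outcome labels, and the ratings assigned to them are all hypothetical and are not part of GRADE or of any GRADE software. It does not attempt to model the judgment involved in rating an individual outcome.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """The four ordinal confidence levels (hypothetical encoding)."""
    VERY_LOW = 1
    LOW = 2
    MODERATE = 3
    HIGH = 4


def overall_confidence(ratings, critical):
    """Return the lowest rating among the outcomes judged critical.

    ratings  -- dict mapping outcome name to its Confidence rating
    critical -- names of the outcomes considered critical
    """
    critical_ratings = [ratings[name] for name in critical]
    if not critical_ratings:
        raise ValueError("at least one critical outcome is required")
    return min(critical_ratings)


# Made-up ratings, for illustration only.
ratings = {
    "mortality": Confidence.HIGH,
    "stroke": Confidence.MODERATE,
    "severe nausea": Confidence.LOW,
}

# All three outcomes treated as critical.
initial = overall_confidence(ratings, ["mortality", "stroke", "severe nausea"])

# The adverse event is later judged important but not critical.
revised = overall_confidence(ratings, ["mortality", "stroke"])

print(initial.name, revised.name)
```

In this made-up example, dropping the adverse event from the set of critical outcomes changes the overall rating, which is why the choice of critical outcomes needs to be made explicitly and transparently.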

Summarizing the confidence in effect estimates for individual outcomes

GRADE’s approach to rating down (or not) with respect to each of five criteria and to rating up (or not) with respect to three others is sometimes straightforward and enhances the transparency of the system. Most commonly, authors will be comfortable with the rating of confidence in estimate of effect that results from considering each criterion separately. Not infrequently, however, if ratings are applied in a blanket or rote fashion without considering context and the relation of one

Determining the confidence in effect estimates across outcomes

GRADE is the first formal system of rating quality of evidence to acknowledge that quality may differ across outcomes and to explicitly address this issue. For systematic reviews that are not associated with recommendations, and therefore do not require an overall confidence rating across outcomes, we suggest presenting confidence ratings for each important outcome and not determining the confidence in effect estimates across outcomes.

Such systematic reviews may, however, subsequently inform

Which outcomes are critical may depend on the evidence

The overall confidence in effect estimates may not come from the outcomes judged critical at the beginning of the guideline development process—that is, judgments about what is critical may change when considering the results. For instance, a particular adverse event (e.g., severe nausea and vomiting) may be considered critical at the outset. If it turns out, however, that the event occurs very infrequently—say, less than 3% of patients—the final decision may be that the adverse effect is

Conclusions

GRADE defines criteria for rating the confidence in effect estimates for a given outcome, thereby allowing systematic review authors and guideline developers to arrive at an outcome-specific confidence in effect estimates rating. Although this rating system represents discrete steps on an ordinal scale, it is helpful to view confidence in effect estimates as a continuum. An overall confidence in effect estimates rating across outcomes is only relevant in settings when recommendations are being


The GRADE system has been developed by the GRADE Working Group. The named authors drafted and revised this article. A complete list of contributors to this series can be found on the JCE Web site.
