Education And Debate

Fortnightly Review: Development of review criteria: linking guidelines and assessment of quality

BMJ 1995;311:370 (Published 05 August 1995)
  1. Richard Baker, director (a)
  2. Robin C Fraser, director (b)
  (a) Eli Lilly National Clinical Audit Centre, Department of General Practice, University of Leicester, Leicester LE5 4PW
  (b) Department of General Practice, University of Leicester, Leicester
  Correspondence to: Dr Baker.
  Accepted 5 August 1995

Review criteria are designed to enable clinicians and others to assess care. However, there is no established method for developing criteria, and they are often confused with guidelines. Criteria should comprise measurable activities that are appropriate for the setting in which they are to be used. They should also be based on research evidence and prioritised according to the strength of that evidence and effect on outcome. Good criteria can be used to aid implementation of guidelines by providing a standard against which to monitor performance and enabling clinical audit.

In the continuing debate about the most effective methods for assessing high quality care reference is often made to “guidelines” and “review criteria.” Although the purpose of guidelines is to assist in making clinical decisions and criteria are used in the assessment of care1 (box 1), these crucial distinctions are not always clearly made, leading to confusion about their development and application in clinical practice. Furthermore, much less attention has been given to methods for developing and using review criteria compared with guidelines. In improving care, sound measures for the assessment of quality are as necessary as “systematically developed statements to assist practitioner and patient decisions.”1 The aims of this paper are, firstly, to make explicit the desirable attributes of criteria and, secondly, to propose a framework for linking them with the process of development of guidelines.

Box 1

Definitions of guidelines, criteria, and protocols for clinical practice

The respective roles of guidelines and criteria can be clarified by the following example. The guidelines of the British Hypertension Society state that “great emphasis should be placed on encouraging patients to stop smoking as the coexistence of smoking as an additional risk factor in hypertensive patients confers a much increased risk of subsequent cardiovascular events.”2 To convert this guideline into review criteria it would need to become “the records show that at least annually (a) there has been an assessment of smoking habit and (b) appropriate advice has been given to smokers.” The criteria make clear what information is required to assess clinician compliance, how the information is to be obtained, and the time period in which smoking habit should be assessed. This example illustrates how criteria used for assessment need to be more detailed and specific than guideline statements used to assist decision making.

The importance of establishing a method for developing criteria lies in part in the role they can have in the implementation of guidelines. Guidelines have been produced by clinicians (that is, doctors, nurses, and other professionals directly providing clinical care) for many years, but the current interest in their potential value is due to several new factors. Firstly, only recently has a method been suggested for the development of systematic guidelines based on evidence, which may offer one approach to incorporating the findings of research into routine clinical practice.3 4 Secondly, evidence has now become available to show that guidelines can encourage improvements in performance when coupled with effective implementation strategies.5 6 Thirdly, the NHS Executive has recognised the potential part clinical guidelines could play in improving clinical effectiveness if incorporated by purchasers and providers in contracts for care.7 Despite these developments, however, benefit from using guidelines does not automatically follow.8 9 10 Changes in the performance of clinicians depend not only on the guidelines themselves but also on the methods used to encourage their acceptance and use.5 6

Criteria can be used in two ways to augment the implementation of guidelines. They can be used by purchasers and providers to monitor compliance with the recommendations of guidelines, with one or more of a range of methods being chosen when necessary to improve compliance. Criteria are also an essential component of clinical audit, which is itself one method that can be used by clinicians to help them comply with guidelines. In audit, information about performance is collected for comparison with explicit criteria, and feedback of the findings to participants is then used either alone or in combination with other strategies to encourage appropriate changes in performance.11 12 13

Attributes of criteria for monitoring and clinical audit

Some desirable attributes of medical review criteria have been proposed by the American Institute of Medicine's committee on clinical practice guidelines.1 It should be remembered, however, that these medical review criteria are intended for use in health services research14 or in peer review programmes to assess the appropriateness of care provided by clinicians for reimbursement by organisations such as insurance companies. The process of developing such criteria for the external assessment of care is often detailed and prolonged.15 16 In Britain criteria are used mainly in the context of clinical audit by clinicians themselves to identify aspects of care that could be improved. The use of criteria by purchasers to monitor performance is still uncommon. Moreover, because they are used for different purposes the properties of criteria used in the American context are different from those used in Britain, whether by purchasers, providers, or clinicians.

Grimshaw and Russell have proposed a useful set of requirements for review criteria3 derived from Kessner et al17 and Irvine.18 They recommend that criteria should be easy to define; relate to morbidity amenable to improvement by medical care; have a sound scientific basis; yield sufficient data for statistical analysis; and span the range of morbidity, skills required, and resources specified by the guideline; also the effects of non-medical factors on performance should be understood. These proposals form a valuable starting point, but more consideration still needs to be given to which requirements are most important and whether they are all applicable. For example, in some types of monitoring of performance such as significant event audit or occurrence screening the need for statistical analysis would not arise.

In our opinion, the most important requirement of criteria—as with guidelines—is that they should be based on evidence from research (box 2). Indeed, the approach recommended for the development of systematic evidence linked guidelines can be directly applied to criteria. Evidence from research should be evaluated by using reputable methods3 19 to differentiate criteria for which there is strong supporting evidence from those for which evidence is less clear or completely lacking. To increase the confidence of practitioners in the practical value of the criteria and their impact on the outcome of care, information should be provided about how the review of the research literature was undertaken, including the database used and how research reports were evaluated before being accepted. Although criteria can be derived from reputable guidelines, few current guidelines are of sufficient scientific validity to recommend such a course of action routinely.7

Box 2—Attributes of criteria for assessment of quality

  • Based on research evidence

  • Prioritised according to strength of research evidence and influence on outcome

  • Measurable—clear and precise

  • Appropriate to the clinical setting

Furthermore, as the strength of the supporting evidence from research for different criteria will vary there is a need to make clear which criteria are most or least important in influencing quality of care—that is, criteria need to be prioritised. We would recommend that this should be judged primarily on the strength of the evidence from research and, secondly, on the potential of a particular criterion for influencing the outcome of care. Some elements of care which have only a marginal impact on outcome may have strong evidence from research and vice versa. In any event, selected criteria should have a substantial impact on outcome backed by satisfactory evidence from research. A practical advantage of prioritisation is that it enables participants in audit to focus their energies selectively. This will ensure that audit is efficient with effort devoted only to those criteria that lead to maximum improvements in outcome.

Several classification systems for prioritising criteria have been suggested. For example, Donabedian proposed two types, categorical or contingent, depending on the strength of evidence from research.20 Complex procedures for weighting criteria for external assessment have been developed.16 To meet the needs of clinical audit for a system which is easy to understand and practical in general use the Eli Lilly National Clinical Audit Centre has devised the classification of “must do,” “should do,” and “could do” criteria.21 The “must do” criteria are those for which there is solid evidence from research of substantial impact on outcome. “Should do” criteria are those for which there is good evidence, but the impact on outcome is less substantial or the evidence is less strong. “Could do” criteria are those elements of care for which evidence of impact is inconclusive. If it cannot be shown that an element of care is either beneficial or harmful in clinical or economic terms there is little to be gained by undertaking an audit to ensure that element of care is provided. Such criteria should not, therefore, be routinely included in assessments of the quality of care.
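The classification described above can be sketched as a simple decision rule. The following Python fragment is an illustration only, not the procedure used by the Eli Lilly National Clinical Audit Centre: the function name, the evidence grades ("solid", "good", "inconclusive"), and the impact grades ("substantial", "less substantial") are hypothetical labels chosen to mirror the wording of the paragraph, and the criterion names are placeholders.

```python
EVIDENCE_GRADES = {"solid", "good", "inconclusive"}   # hypothetical labels
IMPACT_GRADES = {"substantial", "less substantial"}   # hypothetical labels


def classify_criterion(evidence: str, impact: str) -> str:
    """Assign a criterion to "must do", "should do", or "could do".

    Assumed reading of the text:
    - inconclusive evidence of impact      -> "could do"
    - solid evidence of substantial impact -> "must do"
    - anything else (good evidence, or a
      less substantial impact)             -> "should do"
    """
    if evidence not in EVIDENCE_GRADES:
        raise ValueError(f"unknown evidence grade: {evidence!r}")
    if impact not in IMPACT_GRADES:
        raise ValueError(f"unknown impact grade: {impact!r}")
    if evidence == "inconclusive":
        return "could do"
    if evidence == "solid" and impact == "substantial":
        return "must do"
    return "should do"


def select_for_audit(criteria: dict) -> dict:
    """Drop "could do" criteria, which the text says should not be
    routinely included in assessments of the quality of care."""
    classified = {
        name: classify_criterion(evidence, impact)
        for name, (evidence, impact) in criteria.items()
    }
    return {name: cat for name, cat in classified.items() if cat != "could do"}


# Placeholder criteria with illustrative grades (not taken from any guideline).
example = {
    "criterion A": ("solid", "substantial"),
    "criterion B": ("good", "substantial"),
    "criterion C": ("solid", "less substantial"),
    "criterion D": ("inconclusive", "substantial"),
}
selected = select_for_audit(example)
```

In this sketch only criterion A would be classed as "must do", criteria B and C as "should do", and criterion D would be left out of the audit. In practice the grading of evidence and impact is a matter of judgment that a rule of this kind cannot replace.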

Criteria for audit or monitoring should also be measurable. Each criterion should specify clearly and precisely what is to be measured so that it is possible to determine compliance. For example, a criterion stating that “a patient with hypertension should avoid excessive alcohol intake” would be difficult to measure accurately as the term “excessive alcohol intake” is not precisely defined and the patient's habits cannot easily be directly observed. To be measurable the criterion might be worded “the records show that annually the patient has been advised to limit weekly alcohol intake to less than 21 units if male and less than 14 units if female.”2
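One way to see why the reworded criterion is measurable is that it can be checked mechanically against a record. The sketch below assumes a hypothetical record format in which the dates on which alcohol advice was recorded are held as a list, with None standing for a record that could not be found. The function name, the record format, and the reading of "annually" as "within the 365 days before the audit date" are all assumptions made for illustration.

```python
from datetime import date, timedelta


def assess_alcohol_advice(advice_dates, audit_date):
    """Return "yes", "no", or "not known" for one patient's record.

    advice_dates: list of datetime.date values on which advice to limit
                  alcohol intake was recorded, or None if the record is
                  missing (hypothetical format).
    audit_date:   the date on which the record is reviewed.
    """
    if advice_dates is None:
        # Missing information: compliance cannot be determined.
        return "not known"
    # Assumption: "annually" means at least once in the preceding 365 days.
    window_start = audit_date - timedelta(days=365)
    if any(window_start <= d <= audit_date for d in advice_dates):
        return "yes"
    return "no"


audit_day = date(1995, 8, 5)
recent = assess_alcohol_advice([date(1995, 3, 1)], audit_day)
lapsed = assess_alcohol_advice([date(1993, 1, 1)], audit_day)
missing = assess_alcohol_advice(None, audit_day)
```

The vaguer wording ("should avoid excessive alcohol intake") could not be reduced to a check of this kind, because neither the threshold nor the source of information is specified.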

While guidelines may allow some flexibility on the part of the clinician to take into account unusual circumstances, criteria have greater rigidity. In the assessment of care the response to most criteria will be one of three possible answers—yes (complied), no (not complied), or not known (missing information). The use of a standard—that is, the percentage of cases which should comply with the criterion (box 1)—is the mechanism by which allowances are made for the variability of patients and clinical settings. The standard should be set at a realistic level which can still trigger systematic efforts to improve care. For many criteria the ultimate objective may be a standard of 100%, but in the short term a lower standard may be chosen which is then gradually increased in a series of steps.
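The relation between the three possible answers and a standard can be shown with a small calculation. This is a minimal sketch under stated assumptions: the function name and the example figures are hypothetical, and cases with missing information are kept in the denominator, which is one possible convention that the text does not prescribe.

```python
from collections import Counter

VALID_ANSWERS = {"yes", "no", "not known"}


def compliance_summary(responses, standard_pct):
    """Compare observed compliance for one criterion with a standard.

    responses:    list of "yes", "no", or "not known", one per case.
    standard_pct: the percentage of cases that should comply.
    """
    if not responses:
        raise ValueError("no cases audited")
    unexpected = set(responses) - VALID_ANSWERS
    if unexpected:
        raise ValueError(f"unexpected answers: {sorted(unexpected)}")
    counts = Counter(responses)
    # Assumption: "not known" cases remain in the denominator, so missing
    # information lowers the observed compliance percentage.
    compliance_pct = 100.0 * counts["yes"] / len(responses)
    return {
        "yes": counts["yes"],
        "no": counts["no"],
        "not known": counts["not known"],
        "compliance_pct": compliance_pct,
        "meets_standard": compliance_pct >= standard_pct,
    }


# Illustrative figures only: 20 cases audited against a standard of 80%.
responses = ["yes"] * 14 + ["no"] * 4 + ["not known"] * 2
summary = compliance_summary(responses, standard_pct=80)
```

With these illustrative figures, 14 of 20 cases comply, giving 70% against a standard of 80%, so the standard is not met. Raising the standard in a series of steps, as the paragraph suggests, would correspond to repeating the comparison with a higher value of standard_pct at each round of audit.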

The criteria must also be appropriate to the setting in which care is provided as clinical probabilities are related to context. For example, investigation of patients who are diagnosed as hypertensive in the setting of general practice would probably not identify many with secondary hypertension. Patients referred to hospital with uncontrolled hypertension, however, might warrant investigation for secondary causes. Thus, criteria used in an audit of hospital managed hypertension would be—and should be—different from those in general practice. Furthermore, by ensuring that criteria are specific for the setting participants in audit are more likely to accept them as relevant and practicable in their day to day work.

A framework for linking guidelines and criteria

To evaluate the quality of care given to patients with a specific condition such as hypertension, criteria are required which cover every important aspect of care. For example, they should include specific diagnostic criteria that must be satisfied before patients are labelled as hypertensive and management criteria for the modification of risk factors and the control of blood pressure. In particular, all criteria in the “must do” category must be included. A complete set of criteria constitutes a protocol (box 1). Audit protocols also need to include specific advice on setting standards and the practical steps to be taken to conduct an audit in such a way as to be confident of the reliability and validity of the findings. Therefore, the procedures for collection and analysis of data must also be described.

An increasing number of scientifically valid guidelines will probably become available in the next few years. Various approaches will be used to encourage their adoption by clinicians, including monitoring of contracts, audit, and the feedback of information about performance. It has been proposed that clinical guidelines developed by expert agencies could be adapted locally for use in audit.22 This approach can be justified only if the locally adapted guidelines comply with the evidence from research, but no studies have yet been undertaken to evaluate the quality of guidelines that have been adapted for local use.23 Likewise, criteria and protocols could be developed locally by doctors or other clinicians undertaking audit, but the validity of the protocols produced may be questionable. The evaluation of evidence from research requires both skill and time that may not be readily available in every locality. Guidelines could be used to identify criteria, but if adequate evidence linked guidelines are not available the task of developing protocols will be even more complex. Even if guidelines are available, they have not usually been developed or presented to facilitate the identification of criteria. For example, current guidelines do not usually indicate which elements of care should be prioritised into categories such as “must do,” “should do,” or “could do.”

Ownership is commonly seen as an important influence on whether the criteria or guidelines are used and lead to improved care.22 24 It is argued that ownership is achieved only if the criteria are developed by those who use them. In a recent review of rigorous evaluations of methods for the implementation of guidelines, however, the balance of evidence did not indicate that ownership was of critical importance.6 Furthermore, local development need not be the only way to ensure that clinicians accept and comply with the criteria. Ownership is facilitated by allowing clinicians the freedom to choose whether performance is judged against “must do” criteria or “must do” plus “should do” and to select the most appropriate standards for their circumstances.

The concentration of resources for development of protocols regionally or nationally would be more efficient and more likely to lead to the use of appropriate and rigorous methods. As national or regional guidelines linked to evidence are produced, related protocols should be developed. The developers of a guideline might themselves devise a relevant protocol, but it is important that the specific circumstances of potential users of the protocol are taken into account. Most guidelines are not developed with a single clinical setting in mind, such as general practice or cardiology outpatient departments, and in most cases clinicians from the relevant setting must take the lead in defining the wording of criteria in the light of the guidelines and the research evidence. Locally, health commissions, audit groups, and other agencies concerned in the assessment of care can select a protocol which has been rigorously developed by an expert group for an aspect of care in which they are interested. Alternatively, if a set of adequate and approved guidelines has been issued they can then choose the criteria to be used, taking into account their prioritisation and the requirements of local clinicians. Local standards may also be set, and a systematic method of collecting and analysing information about compliance with the criteria can be used.

We hope that our clarification of the meaning and practical application of terms used will encourage the linkage of guidelines with criteria, thus providing a more effective approach to implementing the findings of research in everyday practice. Clinical practice based on evidence is more likely to be achieved if guidance to assist decision making and assessments of care are both used as related parts of an integrated system. The combination of an increasing number of scientifically valid guidelines, equally sound regional or national criteria, and an explicit mechanism used to ensure their appropriate use, such as that described in this paper, should facilitate the rational monitoring and audit of care.