Education And Debate

Developing guidelines

BMJ 1999; 318 doi: (Published 27 February 1999) Cite this as: BMJ 1999;318:593
  1. Paul G Shekelle, senior research associate, Health Services Research and Development Service (shekelle{at},
  2. Steven H Woolf, professor of family medicine (b),
  3. Martin Eccles, professor of clinical effectiveness (c),
  4. Jeremy Grimshaw, professor of public health (d)
  1. (a) West Los Angeles Veterans Affairs Medical Center (111G), 11301 Wilshire Blvd, Los Angeles, CA 90073, USA
  2. (b) Department of Family Practice, Virginia Commonwealth University, Fairfax, Virginia 22033, USA
  3. (c) Centre For Health Services Research, University of Newcastle upon Tyne, Newcastle upon Tyne NE2 4AA
  4. (d) Health Services Research Unit, University of Aberdeen, Aberdeen AB9 2ZD
  1. Correspondence to: Dr Shekelle

    This is the second in a series of four articles on issues in the development and use of clinical guidelines

    The methods of guideline development should ensure that treating patients according to the guidelines will achieve the outcomes that are desired. This article presents a combination of the literature about guideline development and the results of our combined experience in guideline development in North America and Britain. It considers the five steps in the initial development of an evidence based guideline. The dissemination, implementation, and evaluation of practice guidelines will be discussed in the final article in this series.1

    Summary points

    • Identifying and refining the subject area is the first step in developing a guideline

    • Convening and running guideline development groups is the next step

    • On the basis of systematic reviews, the group assesses the evidence about the clinical question or condition

    • This evidence is then translated into a recommendation within a clinical practice guideline

    • The last step in guideline development is external review of the guideline

    Identifying and refining the subject area of a guideline

    Prioritising topics

    Guidelines can be developed for a wide range of subjects. Clinical areas can be concerned with conditions (abnormal uterine bleeding, coronary artery disease) or procedures (hysterectomy, coronary artery bypass surgery). Given the large number of potential areas, some priority setting is needed to select an area for guideline development. Potential areas can emerge from an assessment of the major causes of morbidity and mortality for a given population, uncertainty about the appropriateness of healthcare processes or evidence that they are effective in improving patient outcomes, or the need to conserve resources in providing care.

    Refining the subject area

    The topic for guideline development will usually need to be refined before the evidence can be assessed in order to answer exact questions. The usual way of refining the topic is by a dialogue among clinicians, patients, and the potential users or evaluators of the guideline. Discussions about the scope of the guideline will also take place within the guideline development panel.

    If the topic is not refined, the clinical condition or question may be too broad in scope. For example, a guideline on the management of diabetes could cover primary, secondary, and tertiary care elements of management and also multiple aspects of management, such as screening, diagnosis, dietary management, drug therapy, risk factor management, or indications for referral to a consultant. Though all of these could legitimately be dealt with in a guideline, the task of developing such a guideline would be considerable; therefore a group needs to be clear which areas are and are not within the scope of their activities. It is possible to develop guidelines that are both broad in scope and evidence based, but to do so usually requires considerable time and money, both of which are frequently underestimated by inexperienced developers of evidence based clinical practice guidelines.

    One method of defining the clinical question of interest and also identifying the processes for which evidence needs to be collected and assessed is the construction of models or causal pathways.2 A causal pathway is a diagram that illustrates the linkages between intervention(s) of interest and the intermediate, surrogate, and health outcomes that the interventions are thought to influence. In designing the pathway, guideline developers make explicit the premises on which their assumptions of effectiveness are based and the outcomes (benefits and harms) that they consider important. This identifies the specific questions that must be answered by the evidence to justify conclusions of effectiveness and highlights gaps in the evidence, for which future research is needed.
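
    A causal pathway can be held as a simple list of linkages, so that each linkage becomes a question for the evidence review and unsupported linkages show up as gaps. The following is a minimal sketch in Python, not a method described by the authors: the `Link` class, the function names, the generic node labels, and the evidence flags are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One linkage in a causal pathway (hypothetical structure)."""
    source: str
    target: str
    evidence_found: bool  # filled in after the evidence has been assessed


def questions(pathway):
    """Turn each linkage into a question the evidence must answer."""
    return [f"Does {link.source} lead to {link.target}?" for link in pathway]


def evidence_gaps(pathway):
    """Return linkages with no supporting evidence (candidates for research)."""
    return [link for link in pathway if not link.evidence_found]


# Generic, invented pathway: benefits and harms are both made explicit.
pathway = [
    Link("intervention", "intermediate outcome", True),
    Link("intermediate outcome", "health outcome", False),
    Link("intervention", "adverse effect", True),
]

for question in questions(pathway):
    print(question)
print("Gaps:", [(gap.source, gap.target) for gap in evidence_gaps(pathway)])
```

    Listing harms as linkages in their own right keeps them in view alongside the benefits when the group later weighs the evidence.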

    Running guideline development groups

    Setting up a guideline development project

    To successfully develop a guideline it may be necessary to convene more than one group. A project or management team could undertake the day to day running of the work, such as the identification, synthesis, and interpretation of relevant evidence; the convening and running of the guideline development groups; and the production of the resulting guidelines. Additional guideline development group(s) would produce recommendations in the light of the evidence or of its absence.

    Group membership and roles

    Group members: Identifying stakeholders involves identifying all the groups whose activities would be covered by the guideline or who have other legitimate reasons for having an input into the process. This is important to ensure adequate discussion of the evidence (or its absence) when developing the recommendations in the guideline. When presented with the same evidence a single specialty group will reach different conclusions than a multidisciplinary group: the specialty group will be systematically biased in favour of performing procedures in which it has a vested interest.3 4 For example, the conclusions of a group of vascular surgeons favoured the use of carotid endarterectomy more than did a mixed group of surgeons and medical specialists.5 Individuals' biases may be better balanced in multidisciplinary groups, and such balance may produce more valid guidelines. Ideally the group should have at least six but no more than 12-15 members; too few members limit adequate discussion and too many members make effective functioning of the group difficult. Under certain circumstances (for example, guidelines for broad clinical areas) it may be necessary to trade off full representation against the requirement of having a functional group.

    Roles: Roles required within guideline development groups are those of group member, group leader, specialist resource, technical support, and administrative support. Group members are invited to participate as individuals working in their field; their role is to develop recommendations for practice based on the available evidence and their knowledge of the practicalities of clinical practice.

    The role of the group leader is both to ensure that the group functions effectively (the group process) and that it achieves its aims (the group task). The process is best moderated by someone familiar with (though not necessarily an expert in) the management of the clinical condition and the scientific literature, but who is not an advocate. He or she stimulates discussion and allows the group to identify where true agreement exists but does not inject his or her own opinion in the process. This requires someone with both clinical skills and group process skills. Using formal group processes rather than informal ones in group meetings produces different and possibly better outcomes.6-8

    Skills needed for guideline development

    • Literature searching and retrieval

    • Epidemiology

    • Biostatistics

    • Health services research

    • Clinical experts

    • Group process experts

    • Writing and editing

    Identifying and assessing the evidence

    Identifying and assessing the evidence is best done by performing a systematic review. The purpose of a systematic review is to collect all available evidence, assess its potential applicability to the clinical question under consideration, inspect the evidence for susceptibility to bias, and extract and summarise the findings.

    What sort of evidence?

    Identifying the clinical questions of interest will help set the boundaries for admissible evidence (types of study designs, year of publication, etc). For example, questions of the efficacy of interventions usually mean that randomised controlled trials should be sought, while questions of risk usually mean that prospective cohort studies should be sought.

    Where to look for evidence?

    The first step in gathering the evidence is to see if a suitable, recent systematic review has already been published. A search of the Cochrane Library will also identify relevant Cochrane review groups, which should be contacted to see if a review is in progress.

    If a current systematic review is not available, a computer search of Medline and Embase is the usual starting point, using search strategies tailored to appropriate types of studies (though such strategies have been validated only for randomised controlled trials9). For example, randomised controlled trials provide the best evidence to answer questions about the effectiveness of treatments, whereas prospective cohort studies generally provide the best evidence for questions about risk. The Cochrane controlled trials register (part of the Cochrane Library) contains references to over 218 000 clinical trials that have been identified through database and hand searching; it should be examined early on in any review process. Checking references in articles will show additional relevant articles not identified by the computer search, and having experts in the field examine the list of articles helps ensure there are no obvious omissions. Additional search strategies, including searches for articles published in languages other than English,10-12 computer searches of specialised databases, hand searching relevant journals, and searching for unpublished material, will often yield additional studies, but the resources needed for such activities are considerable. The cost effectiveness of various search strategies has not been established. It is best to match the scope of the search strategy to the available resources.

    Assessing studies for relevance

    Once studies have been identified, they are assessed for relevance to the clinical questions of interest and for bias. 13 14 Screening for relevance is often possible from the abstract; it narrows the set of studies to those needing a more detailed assessment. Using explicit rather than implicit criteria should improve the reliability of the process.
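
    One way to make relevance criteria explicit is to write them down as data and apply them identically to every record. The sketch below is illustrative only: the `Criteria` fields, the record layout, and the example records are hypothetical, and a real screening form would be agreed by the review team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criteria:
    """Explicit inclusion criteria, fixed before screening starts."""
    designs: frozenset   # admissible study designs
    earliest_year: int   # earliest admissible publication year
    keywords: tuple      # at least one must appear in the abstract


def is_relevant(record, criteria):
    """Apply every criterion in the same way to every record."""
    abstract = record["abstract"].lower()
    return (
        record["design"] in criteria.designs
        and record["year"] >= criteria.earliest_year
        and any(word.lower() in abstract for word in criteria.keywords)
    )


# Invented criteria and records, for illustration only.
criteria = Criteria(
    designs=frozenset({"randomised controlled trial"}),
    earliest_year=1980,
    keywords=("carotid endarterectomy",),
)
records = [
    {"design": "randomised controlled trial", "year": 1991,
     "abstract": "A trial of carotid endarterectomy in symptomatic patients."},
    {"design": "case series", "year": 1995,
     "abstract": "Carotid endarterectomy outcomes at a single centre."},
]
shortlist = [record for record in records if is_relevant(record, criteria)]
print(len(shortlist), "of", len(records), "records need detailed assessment")
```

    Records that pass this first screen still need the more detailed assessment for bias described above; the screen only narrows the set.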

    Summarising evidence

    Data are extracted from the relevant studies on the benefits, the harms, and (where applicable) the costs of the interventions being considered. These data are usually presented in a form that allows the designs and results of studies to be compared. Where appropriate, meta-analysis can be used to summarise results of multiple studies.
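
    As an illustration of how a meta-analysis summarises several studies, the sketch below pools effect estimates with fixed effect, inverse variance weighting. This is one common technique chosen here as an assumption; the article does not specify a pooling method, and the function name and inputs are hypothetical. Each study is weighted by the inverse of its variance, so more precise studies count for more.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Fixed effect, inverse variance pooling of study effect estimates.

    effects: per-study estimates on a common scale (e.g. log odds ratios)
    std_errors: the corresponding standard errors (all must be positive)
    Returns (pooled estimate, pooled standard error, approximate 95% interval).
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("need one standard error per effect estimate")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Normal approximation for the 95% confidence interval.
    interval = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, interval


# Invented numbers, for illustration only.
pooled, pooled_se, interval = pool_fixed_effect([-0.4, -0.1, -0.3], [0.2, 0.3, 0.25])
print(f"pooled = {pooled:.3f}, SE = {pooled_se:.3f}")
```

    A fixed effect model assumes the studies estimate the same underlying effect; where results differ more than chance would explain, pooling in this way may not be appropriate.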

    Categorising evidence

    Summarised evidence is categorised to reflect its susceptibility to bias. This is a shorthand method of conveying specific aspects of the evidence to a reader of the guideline. A number of such “strength of evidence” classification schemes exist, but empirical evidence exists only for schemes that categorise effectiveness studies by study design. 15 16 The box shows a simple scheme for classifying the evidence that supports statements in practice guidelines and the strength of the recommendations. Guideline developers should use a limited number of explicit criteria, incorporating criteria for which there is explicit evidence.

    Classification schemes

    Category of evidence
    • Ia: evidence from meta-analysis of randomised controlled trials

    • Ib: evidence from at least one randomised controlled trial

    • IIa: evidence from at least one controlled study without randomisation

    • IIb: evidence from at least one other type of quasi-experimental study

    • III: evidence from non-experimental descriptive studies, such as comparative studies, correlation studies, and case-control studies

    • IV: evidence from expert committee reports or opinions or clinical experience of respected authorities, or both

    Strength of recommendation
    • A: directly based on category I evidence

    • B: directly based on category II evidence or extrapolated recommendation from category I evidence

    • C: directly based on category III evidence or extrapolated recommendation from category I or II evidence

    • D: directly based on category IV evidence or extrapolated recommendation from category I, II or III evidence
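
    The relation between the two lists in this box can be written as a small lookup. The sketch below is an illustration under stated assumptions, not part of the scheme itself: the four strengths are labelled A to D in the order the box lists them, the function returns the strongest grade the box permits (the box also allows weaker grades for extrapolated recommendations), and extrapolation from category IV is rejected because the box does not provide for it. The function and table names are hypothetical.

```python
# Strongest grade permitted when the recommendation is directly based on
# evidence of the given category.
DIRECT = {"I": "A", "II": "B", "III": "C", "IV": "D"}

# Strongest grade permitted when the recommendation is extrapolated from
# evidence of the given category (assumption: one step weaker than direct).
EXTRAPOLATED = {"I": "B", "II": "C", "III": "D"}


def top_level(category):
    """Drop the a/b suffix: 'Ia' -> 'I', 'IIb' -> 'II', 'III' -> 'III'."""
    return category.rstrip("ab")


def strongest_grade(category, extrapolated=False):
    """Return the strongest recommendation grade the box permits."""
    level = top_level(category)
    table = EXTRAPOLATED if extrapolated else DIRECT
    if level not in table:
        raise ValueError(f"no grade defined for category {category!r} "
                         f"(extrapolated={extrapolated})")
    return table[level]


print(strongest_grade("Ia"))                      # direct, category I
print(strongest_grade("Ib", extrapolated=True))   # extrapolated from category I
```

    The result is an upper bound only: the group may assign a weaker grade once applicability, costs, and the other factors that shape a recommendation are taken into account.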

    Translating evidence into a clinical practice guideline

    The evidence, once gathered, needs to be interpreted (see box). Since conclusive evidence exists for relatively few healthcare procedures, deriving recommendations solely in areas of strong evidence would lead to a guideline of limited scope or applicability.17 This could be sufficient if, for example, the guideline is to recommend the most strongly supported treatments for a given illness, but more commonly the evidence needs to be interpreted into a clinical, public health, policy, or payment context. Therefore within the guideline development process a decision should be taken about how opinion will be both used and gathered.

    Factors contributing to the process of deriving recommendations

    • The nature of the evidence (for example, its susceptibility to bias)

    • The applicability of the evidence to the population of interest (its generalisability)

    • Costs

    • Knowledge of the healthcare system

    • Beliefs and values of the panel

    Using and gathering opinion

    Opinion will be used to interpret evidence and also to derive recommendations in the absence of evidence. When evidence is being interpreted, opinion is needed to assess issues such as its generalisability: for example, to what degree evidence from small randomised clinical trials or controlled observational studies may be generalised, or whether results from a study in one population can be extrapolated to the population of interest in the guideline (such as extrapolating from a study in a tertiary, academic medical centre to the community population of interest to potential users of the guideline).

    Recommendations based solely on clinical judgment and experience are likely to be more susceptible to bias and self interest. Therefore, after deciding what role expert opinion is to play, the next step is deciding how to collect and assess expert opinion. There is currently no optimal method for this, but the process needs to be made as explicit as possible.

    Resource implications and feasibility

    In addition to scientific evidence and the opinions of expert clinicians, practice guidelines must often take account of the resource implications and feasibility of interventions. Judgments about whether the costs of tests or treatments are reasonable depend on how cost effectiveness is defined and calculated, on the perspective taken (for example, clinicians often view cost implications differently than would payers or society at large), and on the resource constraints of the healthcare system (for example, cash limited public systems versus private insurance based systems). Feasibility issues worth considering include the time, skills, staff, and equipment necessary for the provider to carry out the recommendations, and the ability of patients and systems of care to implement them.

    Grading recommendations

    It is common to grade each recommendation in the guideline. Such information provides the user with an indication of the guideline development group's confidence that following the guideline will produce the desired health outcome. “Strength of recommendation” classification schemes (such as the one in the box) range from simple to complex; no one scheme has been shown to be superior. Given the factors that contribute to a recommendation, strong evidence does not always produce a strong recommendation, and the classification should allow for this. The classification is probably best done by the group panel, using a democratic voting process after group discussion of the strength of the evidence.

    Reviewing and updating guidelines

    Guidelines should receive external review to ensure content validity, clarity, and applicability. External reviewers should cover three areas: people with expertise in clinical content, who can review the guideline to verify the completeness of the literature review and to ensure clinical sensibility; experts in systematic reviews or guideline development, or both, who can review the method by which the guideline was developed; and potential users of the guideline, who can judge its usefulness. In Britain there is a further review process whereby guidelines are appraised by an independent unit to assess whether the NHS Executive can commend them to the NHS.

    The guideline can be updated as soon as each piece of relevant new evidence is published, but it is better to specify a date for updating the systematic review that underpins the guideline.


    New advances in understanding the science of systematic reviews, the workings of groups of experts, and the relation between guideline development and implementation are all likely in the next three to five years.

    We believe that three principles will remain basic to the development of valid and usable guidelines:

    • The development of guidelines requires sufficient resources, both financial support and people with a wide range of skills, including expert clinicians, health services researchers, and group process leaders;

    • A systematic review of the evidence should be at the heart of every guideline; and

    • The group assembled to translate the evidence into a guideline should be multidisciplinary.


    Series editors: Martin Eccles, Jeremy Grimshaw

