Elsevier

The Lancet

Volume 364, Issue 9432, 31 July–6 August 2004, Pages 429-437

Articles
An experimental study of determinants of group judgments in clinical guideline development

https://doi.org/10.1016/S0140-6736(04)16766-4

Summary

Background

Clinical guidelines for improving the quality of care are a familiar part of clinical practice. Formal consensus methods such as the nominal group technique are often used as part of guideline development, but little is known about the factors that affect the statements produced by nominal groups, or about the consistency of those statements with the research evidence.

Methods

Cognitive behavioural therapy, behavioural therapy, brief psychodynamic interpersonal therapy, and antidepressants for irritable bowel syndrome, chronic fatigue syndrome, and chronic back pain were selected for study. Sixteen nominal groups in a factorial design allowed comparison of groups of general practitioners (GPs) only with mixed groups of GPs and specialists, of provision of a literature review with no provision, and of ratings made in the context of realistic or ideal levels of health-care resources. Participants rated appropriateness independently, and again after a facilitated meeting. Audiotapes of four group discussions were analysed.
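As an illustration only, the factorial layout described above can be enumerated in a few lines of Python. The three factors and their levels come from the text; the assumption that the 16 groups were spread evenly at two groups per cell is ours and is not stated in this summary, and the names `FACTORS`, `GROUPS_PER_CELL`, and `build_design` are hypothetical.

```python
from itertools import product

# The three two-level factors named in the Methods paragraph.
FACTORS = {
    "composition": ("GP-only", "mixed GPs and specialists"),
    "literature_review": ("provided", "not provided"),
    "resources": ("realistic", "ideal"),
}

# ASSUMPTION: a balanced design with two groups per cell (8 cells x 2 = 16).
# The summary gives only the total of 16 groups, not the allocation.
GROUPS_PER_CELL = 2


def build_design(factors=FACTORS, groups_per_cell=GROUPS_PER_CELL):
    """Return one dict per nominal group, covering every factor combination."""
    names = list(factors)
    design = []
    for levels in product(*factors.values()):
        for replicate in range(1, groups_per_cell + 1):
            group = dict(zip(names, levels))
            group["replicate"] = replicate
            design.append(group)
    return design


design = build_design()
print(len(design))  # number of groups in the sketched layout
```

A full factorial layout of this kind lets each factor be compared across all levels of the other two, which is what allows the three comparisons to be made with the same set of groups.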

Findings

There was agreement with the research evidence for 51% of 192 scenarios. Agreement was more likely if the group was GP-only, if a literature review was provided, or if the evidence was in accordance with clinicians' beliefs. Assumptions about the level of resources available had no impact. Clinical and social cues had mixed effects, irrespective of the research evidence. Qualitative analysis showed the modifying effect of clinical experience and beliefs about research evidence.

Interpretation

Guidelines cannot be based on data alone; judgment is unavoidable. The nominal group technique is a method of eliciting and aggregating judgments in a transparent and structured way. It can provide important information on levels of agreement between experts. However, conclusions can be at odds with the published literature. If they are, reasons need to be explicit.

Introduction

In many countries, there are clinical guidelines for disseminating good practice in medicine.1, 2 Ideally, guidelines should be based on evidence from large, well conducted studies, but often such research does not exist3 and, where it does, how the results might be applied to particular patients can be unclear.1 Also, guidelines may depend implicitly on interpretation of the literature, on judgments about value and risk,4 on the funding and organisation of health services,5 and, if public funding is involved, on policies about priorities and equity. The synthesis of the research evidence may be rigorous and transparent, but the judgments tend to be opaque.

Formal consensus development methods, often based on the nominal group technique, are widely used because, unlike informal methods such as committees, they offer structured, transparent, and replicable ways of synthesising individual judgments.6 In the UK, NICE and other professional bodies have used modified nominal group techniques, as have at least seven other countries.2, 7, 8

In the modified nominal group technique, participants first express their views independently via a postal questionnaire. They then meet for review and discussion, after which they complete the questionnaire again privately, revising their views if they wish. The practical application of this process has been far from uniform. A systematic review revealed a dearth of research into its workings9 and, despite some subsequent studies,10, 11, 12, 13, 14, 15 the key questions posed by the review have not yet been adequately answered. Our aim was to investigate how the judgments produced, and the extent to which they agreed with research evidence, were affected by: (1) three types of factor used to generate the clinical scenarios provided in questionnaires (the clinical condition, the treatment, and clinical or social cues); and (2) three ways in which nominal groups can differ (provision of a literature review or not, group composition, and background assumptions about the level of health-care resources available).

We also aimed to explore qualitatively the reasons behind the group judgments.

The other research priorities identified by the systematic review were to assess the reliability and representativeness of formal consensus techniques. Results of these investigations will be reported elsewhere.
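The two-round rating process of the modified nominal group technique described above can be sketched as follows. This is a minimal illustration, not the authors' analysis: it assumes that each participant gives a numeric appropriateness rating per scenario (a 1 to 9 scale is assumed here), and it summarises each round by the group median. The scenario labels and all rating values are invented for the example, and the function names are hypothetical.

```python
from statistics import median


def group_medians(ratings_by_scenario):
    """Summarise one round: the median rating of the group for each scenario."""
    return {scenario: median(ratings)
            for scenario, ratings in ratings_by_scenario.items()}


def rating_shift(round1, round2):
    """Change in the group median between the postal round and the
    private re-rating that follows the facilitated meeting."""
    before = group_medians(round1)
    after = group_medians(round2)
    return {scenario: after[scenario] - before[scenario] for scenario in before}


# Invented ratings from five hypothetical participants (assumed 1-9 scale).
round1 = {"scenario_a": [2, 5, 7, 8, 3], "scenario_b": [6, 7, 7, 8, 9]}
round2 = {"scenario_a": [4, 5, 6, 6, 5], "scenario_b": [7, 7, 8, 8, 8]}

print(group_medians(round1))
print(rating_shift(round1, round2))
```

The median is used here because it is less sensitive than the mean to a single extreme rating; whether a group's ratings also count as agreement would need a separate rule that this sketch does not attempt.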

Methods

Three conditions (chronic back pain, irritable bowel syndrome, and chronic fatigue syndrome) were selected because they fulfilled the following criteria:

(1) there was a mismatch between current clinical practice and research evidence; (2) care was provided by at least two groups of clinicians (general practitioners (GPs) and mental-health professionals); (3) these conditions are important problems; and (4) national guidelines for the conditions had not been published in the UK at the time the

Results

There were 177 participants in the 16 groups, of whom 76% were GPs and 24% mental-health professionals. Mean age was 47 years, most were men (62%) and white (84%).

The relation between ratings for physical and for psychological outcomes for each scenario across the 16 groups was assessed by plotting their medians (figure 2). In view of the close agreement, subsequent analyses used the physical outcome ratings only.

Initial ratings produced by the GP-only nominal groups showed moderate agreement (κ

Discussion

A formal consensus development method produced judgments that were consistent with our assessments of the research evidence in about half the scenarios considered. The extent of concordance varied between the conditions and treatments studied. Concordance was more likely if a literature review was provided and if this evidence supported clinicians' experiences and beliefs. If clinical experience and beliefs were not consistent with research evidence, then the experience and beliefs seemed to

References (29)

  • M Murphy et al. Consensus development methods, and their use in clinical guideline development. Health Technol Assessment (1998)
  • A Nicollier-Fahrni et al. Development of appropriateness criteria for colonoscopy: comparison between a standardized expert panel and an evidence-based medicine approach. Int J Qual Health Care (2003)
  • P Wortman et al. Consensus among experts and research synthesis: a comparison of methods. Int J Technol Assess Health Care (1998)
  • S Bernstein et al. Effect of specialty and nationality on panel judgement of the appropriateness of coronary revascularisation: a pilot study. Med Care (2001)