BMJ 2003;326:816-819 (12 April)

Education and debate

Improving the quality of health care

Research methods used in developing and applying quality indicators in primary care

S M Campbell, research fellow a, J Braspenning, senior researcher b, A Hutchinson, professor in public health c, M N Marshall, professor of general practice a

a National Primary Care Research and Development Centre, University of Manchester, Manchester M13 9PL, b UMC St Radboud, WOK, Centre for Quality of Care Research, (229), Postbus 9101, 6500 HB Nijmegen, Netherlands, c University of Sheffield, Section of Public Health, Sheffield, S1 4DA

Correspondence to: S Campbell stephen.campbell{at}man.ac.uk.

Before we can take steps to improve the quality of health care, we need to define what quality care means. This article describes how to make best use of available evidence and reach a consensus on quality indicators.

Quality improvement is part of the daily routine for healthcare professionals and a statutory obligation in many countries. Quality can be improved without measuring it---for example, by guiding care prospectively in the consultation using clinical guidelines.1 It is also possible to assess quality without quantitative measures, by using approaches such as peer review, videoing consultations, and patient interviews. Measurement, however, plays an important part in improvement.2 We discuss the methods available for developing and applying quality indicators in primary care.
Summary points


Most quality indicators are used in hospital practice but they are increasingly being developed for primary care

The information required to develop quality indicators can be derived by systematic or non-systematic methods

Non-systematic methods are quick and simple but the resulting indicators may be less credible than those developed by using systematic methods

Systematic methods can be based directly on scientific evidence or clinical guidelines or combine evidence and professional opinion

All measures should be tested for acceptability, feasibility, reliability, sensitivity to change, and validity




    What are quality indicators?

Indicators are explicitly defined and measurable items referring to the structures, processes, or outcomes of care.3 Indicators are operationalised by using review criteria and standards, but they are not the same thing; indicators are also different from guidelines (box 1). Care rarely meets absolute standards,5 and standards have to be set according to local context and patient circumstances. 6 7

Activity indicators measure how frequently an event happens, such as the rate of influenza immunisation. In contrast, quality indicators infer a judgment about the quality of care provided,6 and performance indicators8 are statistical devices for monitoring performance (such as use of resources) without any necessary inference about quality. Indicators do not provide definitive answers but indicate potential problems or good quality of care. Most indicators have been developed for use in hospitals but they are increasingly being developed for use in primary care.


    Principles of development

Three preliminary issues require consideration when developing indicators. The first is which aspects of care to assessw1 w2: structures (staff, equipment, appointment systems, etc),w3 processes (such as prescribing, investigations, interactions between professionals and patients),9 or outcomes (such as mortality, morbidity, or patient satisfaction).w4 Our focus is on process indicators, which have been the primary object of quality assessment and improvement. 2 10 The second issue is that stakeholders have different perspectives about quality of care.2 w5 For example, patients often emphasise good communication skills, whereas managers' views are often influenced by data on efficiency. It is important to be clear which stakeholder views are being represented when developing indicators. Finally, development of indicators requires supporting information or evidence. This can be derived by systematic or non-systematic methods.


Box 1: Definitions and examples of guidelines, indicators, review criteria, and standards


Guideline
Definition: Systematically developed statements to help practitioners and patients make decisions in specific clinical circumstances. They essentially define best practice1
Example: If a blood pressure reading is raised on one occasion, the patient should be followed up on two further occasions within 6 months

Indicator
Definition: Retrospectively measurable element of practice performance for which there is evidence or consensus that it can be used to assess quality of care provided and hence change it6
Example: Patients with a blood pressure >160/90 mm Hg should have their blood pressure remeasured within 3 months

Review criterion
Definition: Systematically developed statement relating to a single act of medical care.6 The statement is so clearly defined that it is possible to determine retrospectively whether the element of care occurred4
Example: If an individual patient's blood pressure was >160/90 mm Hg, was it remeasured within 3 months?

Standard
Definition: The level of compliance with a criterion or indicator6

  Target standard
  Definition: Set prospectively and stipulates a level of care that providers must strive to meet
  Example: 90% of practice's patients with blood pressure >160/90 mm Hg should have their blood pressure remeasured within 3 months

  Achieved standard
  Definition: Measured retrospectively and details whether a care provider met a predetermined standard
  Example: 80% of practice's patients with blood pressure >160/90 mm Hg had their blood pressure remeasured within 3 months
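
The relation between a review criterion, a target standard, and an achieved standard in box 1 can be illustrated with a short calculation. The Python sketch below is illustrative only: the record structure, field names, and patient data are hypothetical, and reading ">160/90 mm Hg" as "either value above its threshold" is our assumption rather than something the box specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PatientRecord:
    """Hypothetical audit record; field names are illustrative only."""
    systolic: int                         # mm Hg
    diastolic: int                        # mm Hg
    months_to_remeasure: Optional[float]  # None if never remeasured


def is_eligible(record: PatientRecord) -> bool:
    # Assumption: ">160/90 mm Hg" means either value exceeds its threshold.
    return record.systolic > 160 or record.diastolic > 90


def criterion_met(record: PatientRecord) -> bool:
    # Review criterion: was blood pressure remeasured within 3 months?
    return (record.months_to_remeasure is not None
            and record.months_to_remeasure <= 3)


def achieved_standard(records: List[PatientRecord]) -> Optional[float]:
    """Proportion of eligible patients for whom the criterion was met."""
    eligible = [r for r in records if is_eligible(r)]
    if not eligible:
        return None  # nothing to assess without eligible patients
    return sum(criterion_met(r) for r in eligible) / len(eligible)


TARGET_STANDARD = 0.90  # set prospectively, as in the box 1 example

# Made-up records for one practice.
records = [
    PatientRecord(170, 95, 1.0),
    PatientRecord(165, 88, 2.5),
    PatientRecord(150, 96, 3.0),
    PatientRecord(180, 100, None),  # raised reading, never remeasured
    PatientRecord(162, 92, 2.0),
    PatientRecord(130, 80, None),   # not eligible: blood pressure not raised
]

achieved = achieved_standard(records)
print(f"Achieved standard: {achieved:.0%}; "
      f"target met: {achieved >= TARGET_STANDARD}")
```

The target is fixed before the audit, whereas the achieved standard is computed afterwards from the records, which mirrors the prospective and retrospective definitions in the box.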




    Non-systematic research methods

Non-systematic approaches are not evidence based, but indicators developed in this way can still be useful, not least because they are quick and easy to create. One example is a quality improvement project based on a single case study, such as a termination of pregnancy in a 13 year old girl. 11 12 Examination of her medical records showed two occasions when contraception could have been discussed, and this led to the development of a quality indicator relating to contraceptive counselling.


    Systematic, evidence based methods

Whenever possible, indicators should be based solely on scientific evidence such as rigorously conducted (trial based) empirical studies. 13 14 The better the evidence, the stronger the benefits of applying the indicators in terms of reduced morbidity and mortality. An example of an evidence based indicator is that patients with confirmed coronary artery disease should receive low dose (75 mg) aspirin unless contraindicated, as aspirin is associated with health benefits in such patients.


    Systematic methods combining evidence and expert opinion

Many areas of health care have a limited or methodologically weak evidence base, 2 6 15 especially within primary care. Quality indicators therefore have to be developed using other evidence alongside expert opinion. However, because experts often disagree on the interpretation of evidence, rigorous methods are needed to incorporate their opinion.

Consensus methods are structured facilitation techniques that explore consensus among a group of experts by synthesising opinions. Group judgments are preferable to individual judgments, which are prone to personal bias. Several consensus techniques exist,16-19 including consensus development conferences,17 w6 the Delphi technique,w7 w8 the nominal group technique,w9 the RAND appropriateness method,20 w10 and iterated consensus rating procedures (table).21


                              

Characteristics of informal and formal methods for developing consensus*

Consensus development conferences
In this technique, a selected group of about 10 people are presented with evidence by interested individuals or organisations that are not part of the decision making group. The selected group discusses this evidence and produces a consensus statement.w11 However, unlike the other techniques, these conferences use implicit methods for aggregating the judgments of individuals (such as majority voting). Explicit techniques use aggregation methods in which panellists' judgments are combined using predetermined mathematical rules, such as the median of individual judgments.17 Moreover, although these conferences provide a public forum for debate, they are expensive16 and there is little evidence of their effect on clinical practice or patient outcomes.w12

Indicators derived from guidelines by iterated consensus rating procedure
Indicators can be based on clinical guidelines.w13 w14 Review criteria derived directly from clinical guidelines are now part of NHS policy in England and Wales through the work of the National Institute for Clinical Excellence. One example is the management of type 2 diabetes.w15 Iterated consensus rating is the most commonly used method in the Netherlands,w13 w16 where indicators are based on the effect of guidelines on outcomes of care rated by expert panels and lay professionals.w17

Delphi technique
The Delphi technique is a postal method involving two or more rounds of questionnaires. Researchers clarify a problem, develop questionnaire statements to rate, select panellists to rate them, conduct anonymous postal questionnaires, and feed back results (statistical, qualitative, or both) between rounds. It has been used to develop prescribing indicators.w18 A large group can be consulted from a geographically dispersed population, although different viewpoints cannot be debated face to face. Delphi procedures have also been used to develop quality indicators with users or patients.w19
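
The statistical feedback given between Delphi rounds can be as simple as a per-statement summary of the anonymous ratings. The sketch below is one possible way to produce it in Python. The 1 to 9 rating scale, the statement labels, and the ratings are hypothetical, and the choice of median and quartiles as feedback statistics is our assumption, not a requirement of the technique.

```python
from statistics import median, quantiles
from typing import Dict, List


def round_feedback(ratings: Dict[str, List[int]]) -> Dict[str, Dict[str, float]]:
    """Summarise one round of anonymous ratings for each statement."""
    summary = {}
    for statement, scores in ratings.items():
        # quantiles(..., n=4) returns the three quartile cut points.
        lower, _, upper = quantiles(scores, n=4)
        summary[statement] = {
            "median": median(scores),
            "lower_quartile": lower,
            "upper_quartile": upper,
        }
    return summary


# Hypothetical first-round ratings from five panellists (1 = disagree, 9 = agree).
round_one = {
    "Statement A": [7, 8, 8, 9, 3],
    "Statement B": [2, 5, 8, 4, 6],
}

feedback = round_feedback(round_one)
for statement, stats in feedback.items():
    print(statement, stats)
```

Each panellist would then see the group summary alongside his or her own rating before re-rating in the next round.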

Nominal group technique
The nominal group technique aims to structure interaction within a group of experts. 16 17 w9 The group members meet and are asked to suggest, rate, or prioritise a series of questions, discuss the questions, and then re-rate and prioritise them. The technique has been used to assess the appropriateness of clinical interventionsw20 and to develop clinical guidelines.w21 This technique has not been used to develop quality indicators with patients, although it has been used to determine patients' views of, for example, diabetes.w22

RAND appropriateness method
The RAND method requires a systematic literature review for the condition to be assessed, generation of indicators based on this literature review, and the selection of expert panels. This is followed by a postal survey, in which panellists are asked to read the evidence and rate the preliminary indicators, and a face to face panel meeting, in which panellists discuss and re-rate each indicator.w10 The method therefore combines characteristics of both the Delphi and nominal group techniques. It has been described as the only systematic method of combining expert opinion and evidence.w23 It also incorporates a rating of the feasibility of collecting data.

The method has been used mostly to develop review criteria for clinical interventions in the United Statesw24 and the United Kingdom.7 w25 As with the nominal group technique, panellists meet and discuss the criteria, but because panellists have access to a systematic literature review, they can ground their ratings in the scientific evidence. Agreement between similar panels rating the same indicators has been found to have greater reliability than the reading of mammograms.w10 However, users or patients are rarely included, and the cost implications are not considered.
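
Explicit aggregation of panellists' ratings by a predetermined rule, such as the median, can be sketched in a few lines of Python. The example below is a minimal illustration, not the RAND procedure itself: the 1 to 9 scale, the cut-off values, and the disagreement rule (at least a third of ratings at each extreme) are assumptions made for the example, and a real panel would fix its own definitions in advance.

```python
from statistics import median
from typing import List


def has_disagreement(ratings: List[int]) -> bool:
    # Assumed rule: at least a third of panellists rate 1-3
    # and at least a third rate 7-9.
    n = len(ratings)
    low = sum(1 for r in ratings if r <= 3)
    high = sum(1 for r in ratings if r >= 7)
    return low * 3 >= n and high * 3 >= n


def classify(ratings: List[int]) -> str:
    """Classify an indicator from second-round ratings on a 1 to 9 scale."""
    if has_disagreement(ratings):
        return "disagreement"
    overall = median(ratings)
    if overall >= 7:
        return "valid"
    if overall <= 3:
        return "invalid"
    return "equivocal"


# Hypothetical second-round ratings from a nine-member panel.
panel_ratings = {
    "Indicator 1": [8, 9, 7, 8, 8, 9, 7, 8, 9],
    "Indicator 2": [1, 2, 3, 8, 9, 9, 2, 8, 5],
    "Indicator 3": [5, 4, 6, 5, 5, 6, 4, 5, 5],
}

results = {name: classify(r) for name, r in panel_ratings.items()}
print(results)
```

Because the rule is fixed before the ratings are collected, each panellist's rating carries the same weight regardless of how much he or she contributed to the discussion.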
 

Maximising effectiveness

Several factors affect the outputs derived using consensus techniques.19 These include:
  • Selection of participants (number, level of homogeneity, etc)
  • How the information is presented (for example, level of evidence)
  • How the interaction is structured (for example, number of rounds)
  • Method of synthesising individual judgments (for example, definition of agreement)
  • Task set (for example, questions to be rated).

The composition of the group is particularly important. For example, group members who are familiar with a procedure are more likely to rate it higher.w26 The feedback provided to panellists is also important.w27

Group meetings rely on skilled moderators and on the willingness of the group to work together in a structured meeting. Unlike postal surveys, group meetings can inhibit some members if they feel uncomfortable sharing their ideas, although panellists' ratings carry equal weight, however much they have contributed to the debate. Panels for group meetings are smaller than Delphi panels for practical reasons.




    Research methods for applying indicators

Measures developed by consensus techniques have face validity and those based on rigorous evidence possess content validity. This is a minimum prerequisite for any quality measure. All measures have to be tested for acceptability, feasibility, reliability, sensitivity to change, and validity. 3 22 This can be done by assessing measures' psychometric properties (including factor analyses), surveys (patient or practitioner, or both), clinical or organisational audits, interviews or focus groups. Box 2 gives an example of the development and testing of review criteria for angina, asthma, and diabetes. 9 23


Box 2: Developing and applying review criteria for angina, asthma, and type 2 diabetes

Aim---Quality assessment of angina, asthma, and type 2 diabetes 9 23

Sample---60 general practices in England

Patient sample---1000 patients with angina, 1000 with asthma, 1000 with diabetes

Method---Clinical audit; semistructured interviews with nurses and doctors

Acceptability---Used only review criteria that were rated acceptable and valid by the nurses and doctors working in the practices

Reliability---Excluded criteria with an inter-rater reliability kappa coefficient <0.6

Feasibility---Excluded criteria relating to <1% of the population sample

Acceptability ---The acceptability of the data collected depends on whether the findings are acceptable to both those being assessed and their assessors. For example, doctors and nurses can be asked about the acceptability of review criteria being used to assess their quality of care.

Feasibility--- Information about quality of care is often driven by availability of data.w28 Quality is difficult to measure without accurate and consistent information,w1 which is often unavailable at both the macro (health organisations) and micro (individual medical records) level.w29 Quality indicators must also relate to enough patients to make comparing data feasible---for example, by excluding those aspects of care that occur in less than 1% of clinical audit samples.
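
The 1% rule mentioned above amounts to a simple filter on the audit data. The sketch below shows one way to apply it in Python. The criterion names and counts are invented for illustration, and the function name and its parameters are hypothetical.

```python
from typing import Dict, List


def feasible_criteria(eligible_counts: Dict[str, int],
                      sample_size: int,
                      min_share: float = 0.01) -> List[str]:
    """Keep criteria that relate to at least min_share of the audit sample."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return [name for name, count in eligible_counts.items()
            if count / sample_size >= min_share]


# Hypothetical audit of 1000 patients: number of patients to whom each
# criterion applied.
eligible_counts = {
    "Criterion A": 640,
    "Criterion B": 9,    # applies to under 1% of the sample, so excluded
    "Criterion C": 25,
}

kept = feasible_criteria(eligible_counts, sample_size=1000)
print(kept)
```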

Reliability ---Reliability refers to the extent to which a measurement with an indicator is reproducible. This depends on several factors relating to both the indicator itself and how it is used. For example, indicators should be used to compare organisations or practitioners with similar organisations or practitioners. The inter-rater reliability refers to the extent to which two independent raters agree on their measurement of an item of care.22 In one study, five diabetes criteria out of 31 developed using an expert panel9 were found to have poor agreement between raters when used in an audit.23
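
Inter-rater agreement of the kind reported in box 2 is commonly summarised with Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The sketch below computes kappa for two raters in plain Python and applies the 0.6 threshold from box 2. The audit data and criterion names are hypothetical, and returning 1.0 when both raters use a single identical category (where kappa is strictly undefined) is our convention for the example.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters judging the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # kappa undefined here; treated as perfect agreement
    return (observed - expected) / (1 - expected)


KAPPA_THRESHOLD = 0.6  # criteria below this value were excluded in box 2

# Hypothetical audit: 1 = criterion recorded as met, 0 = not met.
audits = {
    "Criterion A": ([1, 0, 1, 1, 0, 1, 0, 1], [1, 0, 1, 1, 0, 1, 0, 1]),
    "Criterion B": ([1, 0, 1, 0], [1, 1, 0, 0]),
}

kappas = {name: cohen_kappa(a, b) for name, (a, b) in audits.items()}
retained = [name for name, k in kappas.items() if k >= KAPPA_THRESHOLD]
print(kappas, retained)
```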

Sensitivity to change ---Quality measures need to detect changes in quality of care in order to discriminate between and within subjects.22 This is an important and often forgotten dimension of a quality indicator.6 Little research is available on sensitivity to change of quality indicators using time series or longitudinal analyses.

Validity ---Content validity in this context refers to whether any criteria were rated valid by panels contrary to known results from randomised controlled trials.w30 The validity of indicators has received more attention recently.3 w2 w31 Although little evidence exists of the content validity of the Delphi and nominal group techniques in developing quality indicators,16 there is some evidence of validity for indicators developed with the RAND method.w30 There is also evidence of the predictive validity of indicators developed with the RAND method.w32


    Conclusion

Although it may never be possible to produce an error-free measure of quality, measures should be tested during their development and application for acceptability, feasibility, reliability, sensitivity to change, and validity. This will optimise their effectiveness in quality improvement strategies. Indicators are more likely to be effective if they are derived from rigorous scientific evidence. Because evidence in health care is often unavailable, consensus techniques facilitate quality improvement by allowing a broader range of aspects of care to be assessed and improved.7 However, simply measuring something will not automatically improve it, and indicators must be used within quality improvement approaches that focus on whole healthcare systems.24

    Footnotes

This is the second of three articles on research to improve the quality of health care

Competing interests: None declared.

Further references are available on bmj.com. These are denoted in the text by the prefix w.

    References

1. Grimshaw JM, Russell IT. Effect of clinical guidelines on medical practice: a systematic review of rigorous evaluations. Lancet 1993; 342: 1317-1322.
2. Donabedian A. Explorations in quality assessment and monitoring. Vol 1. The definition of quality and approaches to its assessment. Ann Arbor, MI: Health Administration Press, 1980.
3. McGlynn EA, Asch SM. Developing a clinical performance measure. Am J Prev Med 1998; 14: 14-21.
4. Donabedian A. Explorations in quality assessment and monitoring. Vol 2. The criteria and standards of quality. Ann Arbor, MI: Health Administration Press, 1982.
5. Seddon ME, Marshall MN, Campbell SM, Roland MO. Systematic review of studies of clinical care in general practice in the United Kingdom, Australia and New Zealand. Quality in Health Care 2001; 10: 152-158.
6. Lawrence M, Olesen F. Indicators of quality health care. Eur J Gen Pract 1997; 3: 103-108.
7. Marshall M, Campbell SM, Hacker J, Roland MO, eds. Quality indicators for general practice: a practical guide for health professionals and managers. London: Royal Society of Medicine, 2002.
8. Buck D, Godfrey C, Morgan A. Performance indicators and health promotion targets. York: Centre for Health Economics, University of York, 1996. (Discussion paper 150.)
9. Campbell SM, Roland MO, Shekelle PG, Cantrill JA, Buetow SA, Cragg DK. Development of review criteria for assessing the quality of management of stable angina, adult asthma and non-insulin dependent diabetes in general practice. Quality in Health Care 1999; 8: 6-15.
10. Brook RH, McGlynn EA, Shekelle PG. Defining and measuring quality of care: a perspective from US researchers. Int J Qual Health Care 2000; 12: 281-295.
11. Pringle M. Preventing ischaemic heart disease in one general practice: from one patient, through clinical audit, needs assessment, and commissioning into quality improvement. BMJ 1998; 317: 1120-1124.
12. Pringle M. Clinical governance in primary care. Participating in clinical governance. BMJ 2000; 321: 737-740.
13. Hearnshaw HM, Harker RM, Cheater FM, Baker RH, Grimshaw GM. Expert consensus on the desirable characteristics of review criteria for improvement of health quality. Quality in Health Care 2001; 10: 173-178.
14. McCall A, Roderick P, Gabbay J, Smith H, Moore M. Performance indicators for primary care groups: an evidence-based approach. BMJ 1998; 317: 1354-1360.
15. Naylor CD. Grey zones in clinical practice: some limits to evidence based medicine. Lancet 1995; 345: 840-842.
16. Jones JJ, Hunter D. Consensus methods for medical and health services research. BMJ 1995; 311: 376-380.
17. Murphy MK, Black NA, Lamping DL, McKee CM, Sanderson CFB, Askham J, et al. Consensus development methods, and their use in clinical guideline development. Health Technol Assess 1998; 2(3).
18. Fink A, Kosecoff J, Chassin M, Brook RH. Consensus methods: characteristics and guidelines for use. Am J Pub Health 1984; 74: 979-983.
19. Black N, Murphy M, Lamping D, McKee M, Sanderson C, Askham J, et al. Consensus development methods: a review of best practice in creating clinical guidelines. Journal of Health Services Research and Policy 1999; 4: 236-248.
20. Brook RH, Chassin MR, Fink A, Solomon DH, Kosecoff J, Park RE. A method for the detailed assessment of the appropriateness of medical technologies. International Journal of Technology Assessment in Health Care 1986; 2: 53-63.
21. Braspenning J, Drijver R, Schiere AM. Kwaliteits- en doelmatigheidsindicatoren voor het handelen in de huisartspraktijk. Nijmegen, Utrecht: Centre for Quality of Care Research, Dutch College of General Practitioners, 2001.
22. Streiner DL, Norman GR. Health measurement scales: a practical guide to their development and use. Oxford: Oxford Medical Publications, 1995.
23. Campbell SM, Hann M, Hacker J, Roland MO. Quality assessment for three common conditions in primary care: validity and reliability of review criteria developed by expert panels for angina, asthma and type 2 diabetes. Quality and Safety in Health Care 2002; 11: 125-130.
24. Ferlie EB, Shortell SM. Improving the quality of health care in the United Kingdom and the United States: a framework for change. Milbank Q 2001; 79: 281-315.


© 2003 BMJ Publishing Group Ltd




