Quality of life measurement: bibliographic study of patient assessed health outcome measures

BMJ 2002;324:1417 (published 15 June 2002). doi: http://dx.doi.org/10.1136/bmj.324.7351.1417
  1. Andrew Garratt (andrew.garratt{at}uhce.ox.ac.uk), research officer (a)
  2. Louise Schmidt, research officer (a)
  3. Anne Mackintosh, database manager (a)
  4. Ray Fitzpatrick, professor (b)

  (a) National Centre for Health Outcomes Development, Unit of Health-Care Epidemiology, Institute of Health Sciences, University of Oxford, Oxford OX3 7LF
  (b) Department of Public Health, Institute of Health Sciences, University of Oxford, Oxford OX3 7LF

  Correspondence to: A Garratt
  Accepted 9 January 2002

Abstract

Objectives: To assess the growth of quality of life measures and to examine the availability of measures across specialties.

Design: Systematic searches of electronic databases to identify developmental and evaluative work relating to health outcome measures assessed by patients.

Main outcome measures: Types of measures: disease or population specific, dimension specific, generic, individualised, and utility. Specialties in which measures have been developed and evaluated.

Results: 3921 reports that described the development and evaluation of patient assessed measures met the inclusion criteria. Of those that were classifiable, 1819 (46%) were disease or population specific, 865 (22%) were generic, 690 (18%) were dimension specific, 409 (10%) were utility, and 62 (1%) were individualised measures. During 1990-9 the number of new reports of development and evaluation rose from 144 to 650 per year. Reports of disease specific measures rose exponentially. Over 30% of evaluations were in cancer, rheumatology and musculoskeletal disorders, and older people's health. The generic measures—SF-36, sickness impact profile, and Nottingham health profile—accounted for 612 (16%) reports.

Conclusions: In some specialties there are numerous measures of quality of life and little standardisation. Primary research through the concurrent evaluation of measures and secondary research through structured reviews of measures are prerequisites for standardisation. Recommendations for the selection of patient assessed measures of health outcome are needed.

What is already known on this topic

Quality of life measures are increasingly used for measuring health outcomes in evaluative research

There is little standardisation in the use of such measures within clinical trials

What this study adds

There has been exponential growth in reports relating to the development and evaluation of quality of life measures

The number of reports varies considerably according to the health problem

Introduction

Clinical trials and similar forms of evaluative study should incorporate the patient's perspective of outcome.1 For complete assessment of the benefits of an intervention it is essential to provide evidence of the impact on the patient in terms of health status and health related quality of life. These terms refer to experiences of illness such as pain, fatigue, and disability and also broader aspects of the individual's physical, emotional, and social wellbeing. 2 3 Unlike conventional medical indicators, these broader impacts of illness and treatment need, wherever possible, to be assessed and reported by the patient.

Several reviews have criticised researchers for their failure to use appropriate measures of health related quality of life in evaluations purporting to address the impact of interventions by assessing outcomes of concern to patients.3-7 Trials may either neglect outcomes other than conventional clinical, laboratory, and radiological measures or may use limited, inappropriate, or poorly validated indicators as surrogates of the patient's own experiences.6 It is not clear whether this failure to incorporate patients' assessments of outcome arises because appropriate methods do not exist or because methods exist but have not been widely adopted. There may be practical or logistical difficulties in obtaining reliable reports of outcomes from patients. There may also be differences in the perceived importance of health related quality of life and related constructs in different aspects of clinical and evaluative research.

In recent years considerable enthusiasm has been expressed for the potential of questionnaires to provide accurate evidence of outcomes from the patient's perspective. It is not clear how well developed such methods are and whether they are available across the full range of health problems. We undertook an extensive review to describe the extent to which patient assessed outcome measures have been developed and applied and to examine whether such instruments are available for all aspects of clinical research.

Methods

Search strategy—We retrieved reports relating to the development and evaluation of patient assessed measures. We based our search terms on terminology applicable to the development and evaluation of patient assessed health outcomes and terminology used in structured reviews. 3 8 We searched the following from their inception to 2000: AMED, Biological Abstracts, British Nursing Index, Cinahl, Econlit, Embase, Medline, PAIS International, PsycInfo, Royal College of Nursing database, Sigle, and Sociological Abstracts. We searched only for references in English. We created an electronic database from the retrieved records.
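Pooling records from several databases usually returns the same report more than once. The sketch below shows one simple way such duplicates might be removed before screening. It is an illustration only, not the procedure used in this study: the `Record` fields, the matching rule (normalised title plus year), and the example titles are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A bibliographic record; the fields are illustrative assumptions."""
    source: str  # database the record came from, e.g. "Medline"
    title: str
    year: int


def dedup_key(record: Record) -> tuple[str, int]:
    """Normalise the title so case and punctuation differences are ignored."""
    title = re.sub(r"[^a-z0-9]+", " ", record.title.lower()).strip()
    return (title, record.year)


def remove_duplicates(records: list[Record]) -> list[Record]:
    """Keep the first record seen for each (normalised title, year) key."""
    seen: set[tuple[str, int]] = set()
    unique: list[Record] = []
    for record in records:
        key = dedup_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Invented example records, not taken from the study database.
records = [
    Record("Medline", "The SF-36 health survey: a validation study", 1993),
    Record("Embase", "The SF-36 Health Survey - A Validation Study.", 1993),
    Record("PsycInfo", "Measuring depression in older people", 1995),
]
unique_records = remove_duplicates(records)
print(len(records), "records retrieved,", len(unique_records), "after deduplication")
```

Matching on title and year alone is deliberately crude. In practice a rule of this kind would need checking by hand, because different reports can share a title and the same report can be indexed with different titles.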

Assessment of reports—The inclusion criteria comprised the development and testing of patient assessed measures including aspects of health status and quality of life, summary items, and symptoms. We included measures completed on behalf of the patient by proxy and measures of carers' health and quality of life. We excluded reports that related solely to the use of measures. We assessed the reports for the different types of measure (box) and specialties using a classification based on that used in a review of quality of life measures within randomised clinical trials, supplementing where necessary.2

Types of measure

Dimension specific measures focus on particular aspects of health such as psychological wellbeing and usually produce a single score; for example, the Beck depression inventory (reference 9).

Disease or population specific measures include aspects of health that are relevant to particular health problems and may measure several health domains; for example, the asthma quality of life questionnaire (reference 10).

Generic measures can be used across different patient populations; they usually measure several health domains. An example is the SF-36 (reference 11).

Individualised measures allow respondents to include and weight the importance of aspects of their own life; they usually sum to produce a single score. An example is the patient generated index (reference 12).

Utility measures have been developed for economic evaluation, incorporate preferences for health states, and produce a single index; for example, the EuroQol EQ-5D (reference 13).

Figure 1

Number of reports for four main types of measure by year

Results

Search strategy

After we removed duplicates the initial download from the electronic databases produced 23 042 records. Of these, 3921 (17.0%) met the inclusion criteria and reported on the development and testing of patient assessed measures of health outcome. The 3921 reports cited 1275 identifiable measures. The figure shows the number of new reports for the four most common types of measure in the period 1990-9. The number of reports increased from 144 new records in 1990 to 650 in 1999. At the time of our search the databases were incomplete for 2000.
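The proportions in the paragraph above can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the text. The average annual growth rate is a derived illustration that assumes constant compound growth between 1990 and 1999; it is not a figure reported in the study.

```python
# Counts quoted in the Results text.
records_after_dedup = 23_042
reports_included = 3_921
reports_1990 = 144
reports_1999 = 650

# Share of retrieved records that met the inclusion criteria.
inclusion_share = reports_included / records_after_dedup

# Implied average annual growth, assuming constant compound growth
# over the nine yearly steps from 1990 to 1999 (an assumption).
years = 1999 - 1990
growth_factor = reports_1999 / reports_1990
annual_growth = growth_factor ** (1 / years) - 1

print(f"Included: {inclusion_share:.1%}")
print(f"Growth factor 1990-9: {growth_factor:.2f}")
print(f"Implied average annual growth: {annual_growth:.1%}")
```

Under that assumption, the 4.5-fold rise corresponds to growth of roughly 18% a year.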

Table 1 shows the types of measure evaluated within the 3921 reports. There was considerable overlap between the types of measure because a large number of reports described the concurrent validation of more than one measure. Most (1819) reported the development and evaluation of measures specific for a disease or population; 865 reported generic measures; 690 reported dimension specific measures; 409 reported utility measures; and 62 reported individualised measures. Within the category of generic measures, most reported the development of health profiles such as the SF-36,11 Nottingham health profile,14 and sickness impact profile.15 Within the category of dimension specific measures, most reported measures of psychological wellbeing, symptoms, and physical function. Within the category of utility measures, most reported the development and testing of multi-attribute measures such as the EuroQol13 and health utilities index.16
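As a rough check, the counts above can be expressed as shares of the 3921 included reports. The sketch assumes that 3921 is the denominator for every type; the abstract describes its percentages as relating to classifiable reports and gives them as whole numbers, so small differences from the values computed here are to be expected.

```python
total_reports = 3921  # assumed denominator for every type of measure

# Counts quoted in the text for each type of measure.
reports_by_type = {
    "disease or population specific": 1819,
    "generic": 865,
    "dimension specific": 690,
    "utility": 409,
    "individualised": 62,
}

# Percentage share of all included reports, to one decimal place.
shares = {
    name: 100 * count / total_reports
    for name, count in reports_by_type.items()
}

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

Because some reports evaluated more than one type of measure, these shares describe how often each type appears and should not be read as a partition of the reports.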

Table 1.

Number of reports for different types of measure

The figure shows that reports of disease or population specific measures are responsible for most of the growth in evaluations over the period 1990-9. After nine years of growth, the number of reports relating to generic measures declined in 1999. There was more modest growth in reports of dimension specific and utility measures over the entire period.

Table 2 shows the number of reports for each of the 30 specialties that we identified. The largest numbers of evaluations were for rheumatology and musculoskeletal medicine, cancer, and older people; these three accounted for 31% of the 3921 reports. Mental health, neurological diseases, paediatrics and child health, and respiratory diseases were the only other specialties that each accounted for more than 5% of records. There were also a large number of reports (6%) on generic and utility measures evaluated within general populations.

Table 2.

Specialties covered by 3921 reports

The frequency with which specialties appear is reflected in the number of records pertaining to individual measures in table 3. The arthritis impact measurement scales,17 health assessment questionnaire,18 and European Organisation for Research and Treatment of Cancer quality of life questionnaire (EORTC QLQ-C30)19 were the three disease specific measures reported most frequently. However, it was the generic measures, including the SF-36,11 sickness impact profile,15 and Nottingham health profile,14 that had undergone the largest number of evaluations. These three measures accounted for 16% of the total number of reports; they have been evaluated across numerous patient populations and have been translated into several languages. Population norms are also widely available for these measures. Of the utility measures, the EuroQol13 and health utilities index16 have undergone the largest number of evaluations.

Table 3.

Most widely evaluated measures within 3921 reports

Discussion

The application of patient assessed measures of health outcome has become increasingly important to evaluation of health care.1 We have shown considerable growth in the production of measures to support this trend. Growth has not been consistent across specialties or health problems and has been concentrated around the development of measures specific for diseases or populations.

Growth by specialty

There was considerable variation in the development and evaluation of measures across the 30 specialties identified. Much of this work has taken place in cancer and in rheumatology and musculoskeletal medicine, which together account for over a fifth of reports. Both specialties have a long established history of assessing quality of life, and funding bodies and professional organisations have introduced policies to promote the use of measures within evaluative research. 20 21 The remaining 28 specialties each accounted for between 0.3% (infection) and 8% (older people) of the total records. Relatively little development and evaluation of measures was found for burns and trauma, intensive care, and gynaecology. Many problems presented in gynaecology clinics, while benign, are chronic and associated with substantial psychosocial distress.22 It is therefore surprising that the assessment of outcomes from the patient's perspective is not better advanced in this specialty.

Types of measure

Four of the five main types of measure (generic, disease or population specific, dimension specific, and utility) have undergone sustained growth in the development and evaluation of their measurement properties. Generic measures are broadly applicable and can therefore be used across patient populations. They include the SF-36,11 which was the most widely evaluated measure, accounting for over 10% of the total number of reports.

Disease or population specific measures focus on aspects of health that are important to specific health problems and therefore have greater potential to fulfil the necessary measurement criteria for outcome measures within clinical research, including responsiveness to small but important changes in health.3 This type of measure formed the bulk of the growth in evaluations, accounting for almost half of all reports. The demand for such measures has been stimulated by their relevance to clinical trials, which continue to be the area in which patient assessed measures are most commonly applied. There are many measures that are specific to health problems such as asthma,23 back pain,24 chronic obstructive pulmonary disease,25 and diabetes.8

The growth in utility based and individualised measures is more modest. Both have measurement properties that are not captured by generic and specific measures based on summed rating scales.3 Utility measures incorporate preferences for health states and produce a single index, which is useful for making comparisons across treatments and health problems for purposes of economic evaluation. Individualised measures allow patients to include aspects of their lives that they consider important, together with weightings designed to measure the relative importance of different domains.12

Selection of measures

The different types of measure are all potentially useful for evaluating health outcomes from the perspective of the individual patient, and there are now multiple measures available within these individual categories. Those wishing to select a measure for a specific application face a daunting task. Although there is some evidence for the standardisation of generic measures, with a few measures achieving widespread application, this is not the case for disease specific measures. For many patient populations there are several specific measures. It is perhaps not surprising that there is evidence of a lack of consistency in the selection of measures for clinical trials, which hinders comparisons between studies.2 In a study of 67 clinical trials, 48 were found to use 62 different existing measures and 13 reported new measures.2

The selection of measures can be informed through primary research that compares measures against recommended criteria,3 recommendations based on expert consensus, and structured reviews that assess the evidence for different measures. The concurrent evaluation of measures within primary research typically involves the comparative evaluation of reliability, validity, and responsiveness. Recommendations have been produced for the use of patient assessed measures in rheumatoid arthritis and back pain. 20 26 Our search strategy identified 314 reviews of instruments. The quality of the reviews was variable, with just 47 using the words comprehensive, structured, or systematic within the title or abstract. Most reviews compared measures for reliability, validity, and responsiveness to change. However, several other important considerations relating to the selection of patient assessed measures have been described, 3 27 the most pertinent being the relevance of the content of a measure to the proposed application.
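The keyword check described above (whether a review uses the words comprehensive, structured, or systematic in its title or abstract) can be sketched as a simple text screen. The function name, the whole-word matching rule, and the example reviews below are assumptions made for illustration; the source does not describe how the check was carried out.

```python
import re

# Words named in the text; whole-word, case-insensitive matching is an assumption.
KEYWORDS = ("comprehensive", "structured", "systematic")
PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def mentions_method_keyword(title: str, abstract: str) -> bool:
    """Return True if any keyword appears in the title or the abstract."""
    return bool(PATTERN.search(title) or PATTERN.search(abstract))


# Invented examples, not records from the study.
reviews = [
    ("A systematic review of back pain questionnaires", ""),
    ("Measures of fatigue: an overview", "We conducted a structured search."),
    ("Quality of life instruments in asthma", "A narrative overview."),
]
flagged = [mentions_method_keyword(title, abstract) for title, abstract in reviews]
print(flagged)

# Share of reviews using one of the keywords, from the counts in the text.
share_with_keyword = 47 / 314
print(f"{share_with_keyword:.1%} of the 314 reviews used one of the keywords")
```

A screen of this kind only records whether a word is present. It says nothing about the quality of a review, which is why the text treats the count as a rough indicator.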

Conclusions

The huge growth in the number of patient assessed measures of health outcome has obvious benefits in terms of the availability of measures for specific populations. However, potential users require guidance particularly when faced with multiple measures. Structured reviews together with recommendations based on patient and professional consensus are required for the effective application of measures. Concurrent evaluation can also help to determine the most suitable measure for a particular application. Finally, researchers should undertake comprehensive literature searches to ascertain whether a suitable measure is available before they decide to develop a new one.

Acknowledgments

We thank Elizabeth Oram and Monique Raats, who helped with data management and literature searches.

Contributors: RF and AG designed the protocol for the study. AM performed the literature searches under the supervision of RF and AG. Data were extracted by AG and AM. AG, AM, and LS assessed records against inclusion criteria. RF, AG, and LS analysed the records. All authors helped to write the paper. AG is the guarantor.

Footnotes

  • Funding AG, AM, and LS are funded by the Department of Health as part of its funding of the National Centre for Health Outcomes Development. The views expressed in this paper are those of the authors and not necessarily those of the Department of Health.

  • Competing interests None declared.

References

  1.
  2.
  3.
  4.
  5.
  6.
  7.
  8.
  9.
  10.
  11.
  12.
  13.
  14.
  15.
  16.
  17.
  18.
  19.
  20.
  21.
  22.
  23.
  24.
  25.
  26.
  27.