General Practice

Measuring outcomes in primary care: a patient generated measure, MYMOP, compared with the SF-36 health survey

BMJ 1996;312:1016 (Published 20 April 1996)
  1. Charlotte Paterson, general practitioner, Warwick House Medical Centre, Taunton, Somerset TA1 2YJ
  • Accepted 29 February 1996


Objective: To assess the sensitivity to within person change over time of an outcome measure for practitioners in primary care that is applicable to a wide range of illness.

Design: Comparison of a new patient generated instrument, the measure yourself medical outcome profile (MYMOP), with the SF-36 health profile and a five point change score; all scales were completed during the consultation with practitioners and repeated after four weeks. 103 patients were followed up for 16 weeks and their results charted; seven practitioners were interviewed.

Setting: Established practice of the four NHS general practitioners and four of the private complementary practitioners working in one medical centre.

Subjects: Systematic sample of 218 patients from general practice and all 47 patients of complementary practitioners; patients had had symptoms for more than seven days.

Outcome measures: Standardised response mean and index of responsiveness; views of practitioners.

Results: The index of responsiveness, relating to the minimal clinically important difference, was high for MYMOP: 1.14 for the first symptom, 1.33 for activity, and 0.85 for the profile compared with <0.45 for SF-36. MYMOP's validity was supported by significant correlation between the change score and the change in the MYMOP score and the ability of this instrument to detect more improvement in acute than in chronic conditions. Practitioners found that MYMOP was practical and applicable to all patients with symptoms and that its use increased their awareness of patients' priorities.

Conclusion: MYMOP shows promise as an outcome measure for primary care and for complementary treatment. It is more sensitive to change than the SF-36 and has the added bonus of improving patient-practitioner communication.

Key messages

  • A generic health status instrument provides a useful profile of an individual or population, but is not necessarily responsive to change

  • An instrument that is patient generated may be responsive while remaining brief

  • The use of a patient generated measure within the consultation helps the practitioner to be more patient centred

  • Outcome measurements in chronic disease are more meaningful if charted alongside the diverse treatment options that patients use.


Medical outcomes belong first and foremost to patients. Their personal experience of illness, as well as the influence of the wide variety of help and treatments they seek, needs to be incorporated into the measurement process. The outcomes we are interested in measuring in primary care are seldom single before-after events; they are usually related to patients' progress over time.

In a multidisciplinary primary health care team, an outcome measure helps with systematising and with learning from the daily clinical work. Requirements, adapted from the work of Ruta and colleagues,1 are that such a measure should:

  • Measure the aspects and effects of the illness that the patient decides are most important

  • Enable the patient to score the chosen variables

  • Be a sensitive measure of within person change over time

  • Be applicable to the whole spectrum of illness seen in primary care

  • Be capable of measuring the effects of a wide variety of care

  • Be brief and simple enough to complete in a 7-10 minute consultation.

A recent review of outcome measures for primary care illustrates the evolution of instruments that acknowledge the importance of subjective perceptions of health and which focus on the measurement of function and quality of life.2 Many scales were originally validated for their discriminatory function, however, and there has been little research on their ability to evaluate change over time. A study of the sickness impact profile suggests that a good discriminant scale is not necessarily good at evaluating change.3

The medical outcome study has produced a range of evaluative scales from which the short form health survey, SF-36, has been tested in Britain. It produces distinctive profiles for common conditions4 and is applicable to primary care5 and to minor conditions such as varicose veins.6 However, these studies have not assessed the sensitivity of SF-36 to change, and such data are available only for a few major surgical interventions.7 8 The SF-36 is not suitable for completion and scoring within a consultation, which detracts from its clinical usefulness.

The COOP-WONCA charts have the advantage of providing instant information within a consultation but they allow patients only a limited number of responses and no input into what is measured. They have been tested in British primary care and found to be acceptable,9 and they have been used to measure change over time in acute asthma10 and heart failure.11 However, few data on reliability and responsiveness were published in these studies.

The move towards involving the patient in generating the measure as well as in scoring it has led to a variety of disease specific measures.12 13 There is evidence from these studies that involving the patient in generating the measure may produce an instrument that is highly responsive to change over time while remaining brief.14

Failure to find an appropriate outcome measure resulted in the design and piloting of a new instrument. This study tests the instrument, the “measure yourself medical outcome profile”—MYMOP—alongside the SF-36 health survey for responsiveness, validity, and clinical usefulness in primary care.



The study took place at Warwick House Medical Centre, Taunton, Somerset, which houses a four partner, non-fundholding practice as well as nine part time complementary practitioners. The organisation of this team has been described elsewhere.15 The two osteopaths, the acupuncturist, the homoeopath, and all four doctors took part in the research.

The MYMOP follow up questionnaire


MYMOP was designed and piloted in the practice over four months. It consists of four items, each scored by the patient on a seven point scale (see box). The first two scales are for the two symptoms that the patient specifies as most important. The third is an activity of daily living that is being disrupted or prevented by the illness, which the patient also specifies. The fourth asks the patient to rate their general feeling of wellbeing. All ratings are for the previous week. On second and subsequent profiles the wording of the previously chosen items is unchanged but there is an optional fifth item for a new symptom. The profile score is calculated as the mean of the scored items.
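
The profile calculation described above can be sketched as follows. This is a minimal illustration rather than the study's own scoring procedure: the function name and the example scores are hypothetical, and it assumes that an item the patient did not nominate is simply left out of the mean.

```python
from typing import Optional, Sequence


def mymop_profile(scores: Sequence[Optional[float]]) -> float:
    """Mean of the items the patient actually scored.

    `scores` holds the ratings for symptom 1, symptom 2, activity and
    wellbeing (and, on follow up forms, an optional new symptom).
    Items that were not nominated or scored are passed as None and are
    excluded from the mean (an assumption, see the lead-in).
    """
    scored = [s for s in scores if s is not None]
    if not scored:
        raise ValueError("at least one item must be scored")
    return sum(scored) / len(scored)


# Hypothetical patient who nominated one symptom and an activity,
# and rated wellbeing, but gave no second symptom.
example = mymop_profile([4, None, 5, 3])
```

Excluding unscored items, rather than treating them as zero, keeps the profile comparable between patients who nominate different numbers of items.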


The sample consisted of a systematic sample of general practitioner patients plus all practice patients who consulted the complementary practitioners as new patients during the study period. Doctors' appointment books had every seventh appointment in normal surgeries highlighted, and receptionists handed out information and MYMOP forms to all these patients on arrival. The entry criteria were that the patient gave consent, presented a symptom of more than seven days' duration, and was not already in the study. If the patient was eligible, MYMOP 1 was completed within the consultation; all ineligible patients had a reason specified on an exemption slip. Practitioners gave guidance on completion but care was taken to ensure that the patient's, not the practitioner's, criteria were chosen and scored.
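
The systematic selection of appointments can be sketched as below. This is an illustration only: the function name and the appointment list are hypothetical, and it assumes counting started at the first appointment of a surgery, which the text does not specify.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def systematic_sample(appointments: Sequence[T], step: int = 7) -> List[T]:
    """Return every `step`-th appointment (the 7th, 14th, 21st, ...)."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    # Slicing from index step - 1 selects the step-th item first.
    return list(appointments[step - 1::step])


# Hypothetical surgery with 21 booked appointment slots, numbered 1 to 21.
highlighted = systematic_sample(list(range(1, 22)))
```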

All follow up was postal. The previously chosen symptoms and activity were written on the form, ready for scoring, but the previous score was not known to the patient. Follow up forms included a five point change score and questions on help seeking behaviour (box). MYMOP was repeated at two and four weeks, and the SF-36 was completed at entry and four weeks. A subsample (consecutive patients of each practitioner until 10 patients for that practitioner had completed 16 weeks' follow up) repeated MYMOP at eight and 16 weeks. At the end of the study a MYMOP chart (see fig 1) was sent to the patient and to the treating practitioner(s).

Clinical usefulness was assessed with an audiotaped semistructured interview with all practitioners at the end of the study period.


Responsiveness, or sensitivity to change, was assessed in a variety of ways. The gradient of changes in scores across the spectrum of clinical change was analysed.3 The standardised response mean (the mean change in score divided by the SD of change in scores8) was calculated. The index of responsiveness14 was calculated as the change in scores of patients reporting themselves “a little better” divided by the SD of change in scores for patients reporting themselves “about the same”; it was not calculated for patients reporting “a little worse” because of small numbers in this group.
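
The two responsiveness statistics described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study: the function names and example values are hypothetical, the sample SD (n - 1 denominator) is an assumption, and the numerator of the index is taken to be the mean change in the "a little better" group.

```python
from statistics import mean, stdev
from typing import Sequence


def standardised_response_mean(changes: Sequence[float]) -> float:
    """Mean change in score divided by the SD of the change scores."""
    return mean(changes) / stdev(changes)


def responsiveness_index(
    changes_little_better: Sequence[float],
    changes_about_same: Sequence[float],
) -> float:
    """Mean change in patients reporting themselves "a little better",
    divided by the SD of change in patients reporting "about the same".
    """
    return mean(changes_little_better) / stdev(changes_about_same)


# Hypothetical change scores (baseline minus follow up), not study data.
srm = standardised_response_mean([1, 2, 3])
index = responsiveness_index([1, 1, 2], [-1, 0, 1])
```

Because the denominator of the index comes from patients whose condition was stable, a scale that fluctuates widely in unchanged patients will have a low index even if it moves a long way in patients who improve.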

Epi-Info software was used for statistical analysis. MYMOP change scores had a normal distribution and, when variances were equal, parametric tests were used. SF-36 change scores were not normally distributed and non-parametric tests were used.



The sample consisted of 265 patients, of whom 218 were recruited by general practitioners and 47 by complementary practitioners. All of the 659 patients who made up the general practitioners' systematic sample were either recruited (218) or had recorded reasons for their ineligibility (441). The reasons for ineligibility were no symptoms (161), symptoms for seven days or less (110), not attending appointment (46), withholding consent (30), being in study already (29), doctor or receptionist forgot (20), and other reason—for example, patient too distressed, doctor running too late (45).

At one month 215 patients (81%) returned their third MYMOP questionnaire, and 193 patients (73%) returned both their third MYMOP and their second SF-36. Of the 135 patients followed up for four months, 103 (76%) completed follow up.

The mean (SD) age of the sample was 47 (17.6) years (range 2-84 years) and 174 patients (69%) were female. Six children under 15 years completed MYMOP with the help of their parent, but they were not given the SF-36. Table 1 shows the SF-36 profile of patients entering this study.

Table 1

Mean change in MYMOP and SF-36 scores at four weeks, and standardised response mean (SRM)



In total, 387 MYMOP forms were mailed out (at 2, 4, 8, and 16 weeks) to be completed at home. Of these, 29 (7%) were incomplete, mostly because the patient had failed to score one of the variables.

Of the 265 patients who completed MYMOP in the consultation, 174 (66%) nominated a second symptom and 210 patients (79%) nominated a restricted activity. At follow up, a third symptom was nominated on one occasion by 67 patients (25%) and on more than one occasion (different symptoms) by 19 patients (7%).


The change in MYMOP scores at two weeks (table 2) and at four weeks showed a consistent gradient across the spectrum of clinical change. This gradient, and the difference between scores for “a little better” and “about the same,” were significant for all MYMOP scales except wellbeing. SF-36 change scores at four weeks, with the exception of bodily pain, did not show a smooth gradient from clinical improvement to deterioration, and the differences between the five change ratings were not significant for any of the SF-36 scales.

Table 2

Construct validity of MYMOP: change in MYMOP scores over two weeks (between first and second administration) for categories of perceived change in clinical condition


Standardised response mean (table 1) and index of responsiveness (table 3) were high for MYMOP and lower for the SF-36.

Table 3

Mean change in scores over first four weeks, and index of responsiveness, for MYMOP and SF-36



Construct validity is shown by the correlation between perceived change in condition and MYMOP score (table 3). In addition, the first symptom, activity, wellbeing, and the MYMOP profile score all showed significantly greater improvement for acute conditions (symptom present for <4 weeks) than for chronic conditions (symptom present for >4 weeks). For example, the mean change in symptom 1 score at four weeks for patients with acute conditions was 1.94 (SD 2.14) and for chronic conditions it was 1.23 (SD 1.72) (P=0.009, Mann-Whitney test).

Criterion validity is shown by comparison with SF-36 scores (table 4). Because good health is denoted by high scores on SF-36 and low scores on MYMOP, positive correlations have a minus coefficient. For the total study sample when the questionnaires were first given, correlations between MYMOP scales and SF-36 scales were positive and significant and were strongest for the wellbeing scale and MYMOP profile. For symptom 1 the correlation coefficients ranged from -0.08 to -0.24, for activity from -0.16 to -0.31, for wellbeing from -0.19 to -0.48, and for the profile from -0.24 to -0.45. Table 4 shows that correlations between the MYMOP profile and the SF-36 scales were weaker for patients who had had their problem for less than four weeks, becoming non-significant for the scales measuring physical functioning, general health, and role, emotional. This is expected, as SF-36 asks questions relating to the whole of the past month.
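
The sign convention described above can be illustrated with a short sketch. The scores below are invented for illustration and are not study data; the text does not state which correlation coefficient was used, so the Pearson coefficient here is an assumption, as are the function and variable names.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical patients: a high MYMOP score means worse health,
# a high SF-36 score means better health.
mymop_scores = [5, 4, 3, 2, 1]
sf36_scores = [30, 45, 50, 70, 90]

r = pearson(mymop_scores, sf36_scores)  # negative when the scales agree
```

When the two instruments agree about who is in better health, the coefficient is negative, which is why agreement is reported in the text with a minus sign.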

Table 4

Criterion validity of MYMOP: correlations between MYMOP profile scores and SF-36 scores when questionnaires were first given



Practitioners reported that MYMOP was quick and easy to do and was popular with patients. However, fitting it into the consultation was not easy for most practitioners, irrespective of length of appointments. It was useful to see the patient complete the scale in the consultation. The patient's choice of symptom or activity was helpful in understanding the patients' viewpoint and directing treatment in that direction, or in uncovering problems the patient had not presented directly or that the practitioner had not “heard”: “What I would have written down for my patients isn't what they did …it makes you realise you're not listening to what they're saying. What they wrote down was a good reflection of their feelings—but when someone describes something to you, you interpret it in your own terms don't you?” said one.

The numerical scoring by patients also led to new insights. The MYMOP charts (fig 1) were found useful in reviewing cases, especially when patients had not returned to see the practitioner, and there was enthusiasm for using them in case discussions. It was also suggested that their use in chronic conditions might help the patient uncover patterns and influences on their symptoms.

Fig 1

Typical MYMOP chart for a patient treated at the medical centre


In this study four doctors and four complementary practitioners of considerable diversity used MYMOP with a systematic sample of 218 patients in general practice and 47 patients of complementary practitioners. The instrument was applicable to all patients presenting with symptoms to conventional and complementary practitioners, and it elicited high response and completion rates. Six children and parents completed it without any apparent difficulty, but a separate investigation would be required to determine at what age a child's response could be measured separately from the parent's response.

MYMOP is designed to measure within person change over time, and thus it must be both valid and responsive. The property of responsiveness includes the concept of reproducibility, as the denominator of the responsiveness index is the variability in score in stable subjects. Thus, for evaluative instruments, responsiveness replaces the concept of reliability.16 The responsiveness index relating to minimal clinically important change was greater, for all MYMOP scales except wellbeing, than the level of 0.8 nominated as “high” by previous work.17 These results support the hypothesis that a patient generated measure may be responsive despite being brief. The wellbeing scale was less responsive, but practitioners reported that it was clinically useful, especially in chronic disease, where an improvement in wellbeing may be a more realistic aim than a large improvement in symptoms or function.

MYMOP's validity was supported by its ability to detect different degrees of change in relation to change scores and in acute and chronic conditions, and by its correlations with SF-36 scores. Although the issue of clinical usefulness was clouded by follow up being postal and not related to clinical follow up, interviews provided important information on the effect of using the instrument in the consultation. In particular, the practitioners gained new insights into the patient's view of the problem.

Whether these results are generalisable to other settings remains to be tested. Further theoretical and practical investigation is needed of the use of the second and third symptoms and the combining of mean scores as a MYMOP profile. When the first two symptoms relate to different problems there are difficulties in interpretation, both clinically and theoretically, and a profile score in this situation is meaningless. Future trials will modify the instrument so that each questionnaire presented to a patient relates to only one problem, with the patient being the arbiter of attribution of symptoms. The clinical usefulness of the questionnaire needs further investigation in a situation where scores are part of routine clinical follow up. The meaning of MYMOP to the patient, and whether patients find it as easy to understand as the pictorial COOP-WONCA charts, needs to be explored by patient interviews and comparative studies.


In the evaluation of complementary treatments a patient generated measure may overcome the problem of the different diagnostic frameworks of different disciplines.18 For example, a group of patients labelled by conventional medicine as homogeneous in suffering from migraine would be a heterogeneous mix of a variety of diagnoses by traditional Chinese medicine or homoeopathy. Taking the definition of the problem back to the patient's concerns for the purpose of outcome evaluation means that complementary medicine will not be falsely constrained by the assumptions of scientific medicine.

Basing research solely on doctors' diagnostic categories is also a problem with conventional medicine. Howie's research into upper respiratory illness in general practice showed that doctors do not agree on diagnosis, nor do they necessarily base their treatment on it.19


Responsiveness to change of the SF-36 health survey was poor in this study no matter what the method of assessment. Bodily pain and social functioning were the most responsive scales, but even then the standardised response mean was low compared with other studies,8 and the index of responsiveness was in the small-moderate range.17

A generic measure such as SF-36 may be expected to be less responsive because small treatment effects can be lost in the stability of other measured variables; on the other hand, the scale gives a global assessment of health status and allows for comparison between study populations. Patient generated measures are quite the opposite as they concentrate on measuring only those features that the patient wishes to change. These results suggest that in measuring the outcome of care MYMOP would be a useful addition to the SF-36.


The MYMOP questionnaire and chart can be used to visually chart progress and quantify outcomes in case studies. Further trials are necessary to assess the usefulness of the instrument in routine clinical practice and audit, observational studies of a defined patient group, n of 1 trials, and clinical trials. Its use is likely to make the consultation and treatment more patient centred, and investigation of its use as a teaching aid at both undergraduate and postgraduate levels is warranted.

I am grateful for the enthusiastic collaboration of practitioners Gillian Blacklock, Mike Bridger, Jo Corbett, Sian Hanson, Anthony Hawks, Sally Hill, Sara Kennard, Rosemary Norton, Andrew Perry, Martha Price, Alan Spragg, and Phil Walpole, and I thank the staff and patients at Warwick House.

This paper is based on a research project presented in part requirement for the MSc in general practice at the United Medical and Dental Schools of Guy's and St Thomas's Hospitals, where I am grateful to Nicky Britten and Jane Ogden for many helpful discussions. Thanks also to Gordon Guyatt for his valuable input.


  • Funding: The study was funded by a grant from the Royal College of General Practitioners and a bursary from the Royal Society of Medicine.

  • Conflict of interest: None.

