General Practice

Collecting morbidity data in general practice: the Somerset morbidity project

BMJ 1996;312:1517 (Published 15 June 1996)
  1. Nicky Pearson, senior registrar public health medicinea,
  2. Jim O Brien, consultant public health medicinea,
  3. Huw Thomas, general practitionerb,
  4. Paul Ewings, district statisticiana,
  5. Lesley Gallier, project coordinatora,
  6. Alan Bussey, consultant public health medicinea
  a Somerset Health Authority, Taunton TA2 7PQ
  b King Edward Road Surgery, Minehead
  Correspondence to: Dr Pearson.
  • Accepted 25 April 1996


Objective: To collect a valid, complete, continuous, and representative database of morbidity presenting to primary care and to use the data to help commission services on the basis of local need and effectiveness.

Setting: Computerised general practices in Somerset.

Methods: Participating general practices were selected to be representative of the district health authority population for general practice and population characteristics. All conditions presented at face to face consultations were assigned a Read code and episode type and the data were regularly validated. Data were sent by modem from the practices via a third party to the health authority each week.

Main outcome measures: Proportion of consultations coded and accuracy of coding.

Results: 11 practices agreed to participate. Validations for completeness during April 1994 to March 1995 revealed that 96.4% of the records were coded; 94% of the 1090 records validated had appropriate episode types and 87% appropriate Read codes. The results have been used to help formulate the health authority's purchasing plans and have enabled a change in the local contracts for surgery for glue ear.

Conclusions: The project has shown the feasibility of establishing a network of practices recording and reporting the morbidity seen in primary care. Early indications are that the data can be useful in evidence based purchasing.

Key messages

  • This study shows that general practitioners will contribute to a morbidity database if given financial support and confidentiality is safeguarded

  • Validation procedures ensure that the data are of high quality

  • Information from the database can be used to purchase health care based on the population's needs as well as to monitor and improve the health of the population


An important responsibility of health authorities and general practitioners is to assess the health needs of their local populations and to commission services to meet these needs. This task is hampered by a lack of routinely available, up to date, and accurate information on incidence and prevalence of disease. Existing national sources of morbidity data, of which the most comprehensive are the national morbidity studies,1 2 3 4 are not representative of smaller localities or health authority populations, may be out of date as the studies take place only decennially, and may be biased by the fact that only general practitioners who have volunteered contribute.

General practice records have already been recognised as a potentially rich source of morbidity data.5 Improved uptake of computerisation in primary care and technological advances in electronic data transfer make it possible to use general practice computer systems for assessing local health needs. However, attempts at collecting up to date and reliable information in other areas of Britain have had only limited success because of problems with completeness and accuracy of the data, Read codes, computer software, and information technology, particularly arising as a result of the wide variety of commercially available general practice computer systems.6 7 8

The Somerset morbidity project aims to collect valid, complete, continuous, and representative data on disease presenting to primary care. The resulting database is readily obtainable; useful to purchasers, providers, and general practitioners; and could be reproduced in other health districts. We report on the data collection methods and analyse the data collected on three conditions in greater detail: glue ear, where the health authority has used the data to commission surgical services; asthma, which has generated interest among the general practitioner user group; and hypertension, where the local prevalence seemed to be significantly different from the prevalence found nationally.


To avoid the potential bias of using volunteer general practitioners we selected practices with a suitable computer system that were representative of the district health authority population for general practitioner, practice, and population characteristics. In 1993, when recruitment began, 97% of practices in Somerset were computerised and 64% were using a computer system suitable for our study (using Read codes and software compatible with our data extraction software). We recruited from these practices in four stages using a computerised model established for the purpose. The model initially generated all possible samples of practices that met the representativeness criteria within predefined acceptable limits. Those practices appearing most frequently in the samples were approached and recruited. As practices were recruited the sampling process was repeated.

We chose this method as it was systematic, repeatable by others, and maximised the chances of selecting a representative sample if a substantial number of practices declined to participate. However, only one practice we approached declined. We treated the resulting sample as a randomly selected sample of clusters of unequal size for the purpose of calculating confidence intervals for population characteristics using the method described by Kish.9
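To illustrate the statistical approach, the calculation below is a minimal sketch of a confidence interval for a disease rate when practices are treated as clusters of unequal size, using the standard ratio estimator on cluster totals (in the spirit of Kish's method, though the paper does not give its exact formulas). The case counts and list sizes for the 11 practices are invented for illustration.

```python
import math

def cluster_rate_ci(cases, populations, z=1.96):
    """Rate per 100 000 with a 95% CI, treating each practice as a
    cluster of unequal size (ratio estimator on cluster totals)."""
    m = len(cases)
    total_n = sum(populations)
    r = sum(cases) / total_n                       # overall rate
    # linearised variance of the ratio estimator across clusters
    ss = sum((y - r * n) ** 2 for y, n in zip(cases, populations))
    se = math.sqrt(m / (m - 1) * ss) / total_n
    lo, hi = r - z * se, r + z * se
    return r * 1e5, lo * 1e5, hi * 1e5

# hypothetical counts for 11 practices
cases = [12, 8, 15, 9, 20, 7, 11, 14, 6, 10, 13]
pops  = [5200, 4100, 7800, 4900, 9500, 3600, 5400, 6700, 3100, 5100, 6300]
rate, lo, hi = cluster_rate_ci(cases, pops)
```

Because practices, not patients, are the sampling units, the interval is wider than a naive binomial interval on the pooled counts would suggest.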

Each practice is paid a small fee, currently 40 pence a year for each patient registered with the practice, which covers the cost of generating the weekly report.


Practices record Read and episode type codes for each face to face contact between patients and practice medical and nursing staff. The episode types refer to whether the patient is consulting with the illness for the first time (F), attending for a new episode of a previously diagnosed illness (N), or for a follow up consultation (O).

The extraction software searches for each diagnosis for each patient during the time specified. It also prioritises episode types for each diagnosis: F over N, and N over O. Each diagnosis is therefore represented only once for a patient for the time concerned. Data are routinely extracted at the end of each week, the end of each quarter, and the end of each year. These searches produce different databases that enable disease processes with different epidemiological patterns to be studied optimally.
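The prioritisation rule described above can be sketched as follows. This is an illustrative reconstruction, not the project's actual extraction software; the patient numbers and Read codes are invented.

```python
# Priority order: a first-ever presentation (F) outranks a new episode (N),
# which outranks a follow up consultation (O).
PRIORITY = {"F": 0, "N": 1, "O": 2}

def collapse_episodes(contacts):
    """Reduce raw contacts to one row per (patient, diagnosis),
    keeping the highest-priority episode type seen in the period."""
    best = {}
    for patient, read_code, episode in contacts:
        key = (patient, read_code)
        if key not in best or PRIORITY[episode] < PRIORITY[best[key]]:
            best[key] = episode
    return best

contacts = [
    (101, "H33..", "F"),   # first-ever presentation
    (101, "H33..", "O"),   # follow up in the same period
    (102, "H33..", "O"),
    (102, "H33..", "N"),   # new episode outranks the follow up
]
collapsed = collapse_episodes(contacts)
```

Running the same collapse over weekly, quarterly, and yearly windows yields the different databases the text describes, since the highest-priority episode within each window survives.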

The data are sent electronically by modem to an independent collection agency (the Royal College of General Practitioners Birmingham Research Unit) and then on to the database held at the health authority. No identifiable patient details leave the practice, and the research unit gives data from each practice a code, which is known only to the project coordinator. Analyses are presented coded by practice, and each practice knows only its own code.


Several systems are in place to ensure optimum data quality. Firstly, all general practitioners and practice staff are given initial training and regular updates. Secondly, the project coordinator visits each practice on a random day every three months to validate the data. The Read codes entered for a random 25 patient sample are checked against the paper notes to ensure that they are compatible, appropriate, and given correct codes and episode types. Thirdly, the computer carries out edit checks for diagnoses recorded outside the usual age and sex parameters (for example, gynaecological disorders in men) and for very rare diagnoses where a coding error is more likely than the disease (for example, tetanus immunisation wrongly coded as a case of clinical tetanus). Incorrectly coded items noted during the validation visits and as a result of the edit checks are brought to the practice's attention and the computerised records amended. Data integrity checks are also carried out to verify the electronic interchange process, and a general practice user group meets regularly to enable data feedback and interpractice comparisons and to agree common use of Read codes where appropriate across the participating practices.
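The third safeguard, the computerised edit checks, amounts to a small rule table matched against each record. The sketch below shows the idea; the rules, diagnosis labels, and record layout are invented examples, not the project's actual checks.

```python
# Illustrative edit-check rules: diagnoses outside the usual sex
# parameters, and diagnoses rare enough that a coding error is more
# likely than the disease itself.
RULES = {
    "gynaecological disorder": {"sex": {"F"}},
    "prostatic disease":       {"sex": {"M"}},
    "clinical tetanus":        {"flag_always": True},
}

def edit_check(record):
    """Return a reason string if the record should be queried, else None."""
    rule = RULES.get(record["diagnosis"])
    if rule is None:
        return None
    if rule.get("flag_always"):
        return "very rare diagnosis: confirm against paper notes"
    if "sex" in rule and record["sex"] not in rule["sex"]:
        return "diagnosis outside usual sex parameters"
    return None

queried = edit_check({"patient": 7, "sex": "M",
                      "diagnosis": "gynaecological disorder"})
```

Flagged records are what the text describes being brought back to the practice for correction of the computerised notes.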


We recruited a final sample of 11 practices with a combined population of about 65 000. The practices were representative of the health authority for geographical spread, urban-rural balance, practice size, presence of a female partner, and age and sex structure (table 1). Table 2 compares the morbidity recorded in the Somerset practices with that recorded by the practices which participated in the fourth national morbidity study.4

Table 1

Characteristics of practices in Somerset morbidity project and whole district health authority

Table 2

Rate of disease per 100 000 patients (95% confidence interval) in Somerset morbidity project April 1994-March 1995 and national survey 1991-2


Of the 4685 records validated for completeness during April 1994 to March 1995, 4516 (96.4%) of the records were coded. During the same period 1035 (94%) of the 1090 records validated had appropriate episode types and 948 (87%) appropriate Read codes. Of the 142 cases where the Read code was queried, most were not incorrectly coded but were either not coded consistently throughout the course of the illness or not coded in as much detail as was possible from the available information.

The distribution of illness across the disease categories in Somerset was similar to the national distribution (table 2). However, some differences with national data were apparent.


Table 3 shows the incidence of glue ear in 0-14 year olds during April 1994 to March 1995 (using all the Read codes which participating general practitioners are currently selecting for glue ear). The incidence of new cases was 17.7 per 1000 0-14 year olds.

Table 3

Incidence (rate/100 000 children) of glue ear in Somerset morbidity project April 1994-March 1995


Table 4 shows the age and sex breakdown of patients with asthma from Somerset compared with national data. The prevalence of asthma in Somerset is higher than the prevalence nationally for both males and females, with children aged 5-14 years and young adults aged 15-24 years having the highest prevalence. Nationally, however, the 0-4 age group has the highest prevalence.4 There was wide variation in prevalence of asthma between participating practices (table 5), practice F having over twice as many asthmatic patients as practice A after standardising for age and sex.

Table 4

Prevalence (rate/100 000 patients) of asthma and hypertensive disease in Somerset morbidity project April 1994 to March 1995 and national survey 1991-2

Table 5

Variation in prevalence (rate/100 000 patients) of disease among practices in Somerset morbidity project


We found a much higher prevalence of hypertension in Somerset than in the national study (table 4). The variation in the prevalence of hypertension between practices was also large.



There are obvious difficulties in collecting data in general practice, particularly because, in the short time available for the average consultation, it is not feasible to ask general practitioners to adhere to exact, previously defined diagnostic criteria for all illnesses they see. Thus, some of the interpractice variation is undoubtedly not true variation in morbidity but due to general practitioners' different diagnostic practices. However, much prescribing and many referrals to secondary care are triggered by these working diagnoses, and the interpractice variation therefore has practical relevance.

Our system is a practical non-disruptive method of gathering data across the range of disease in the sector of the health service where most people consult. Useful information on the incidence of common disorders has previously been gathered from general practitioners' diagnoses,10 and the approach of using general practitioners' working diagnoses has been defended in the context of the weekly returns service.11 We believe we have improved on previous methods of collecting data in primary care by concurrently removing the potential bias of volunteer practices, incorporating local representativeness, recording across the whole range of disease, using a range of validation procedures, and making data available promptly.

Our data were similar to those reported in the fourth national morbidity study, which is reassuring given that other authors have compared their general practice data with data from other sources to assess external consistency.12


Locally valid and representative data on morbidity are vital to purchasers of services if evidence based commissioning is to be realised. Glue ear is one example of how these data have been used. The recent publication of surgical treatment rates for glue ear across the United Kingdom showed that surgical intervention rates varied from four per 1000 children in the lowest region to nine operations per 1000 in the highest region. National effectiveness studies have shown that about half of episodes of glue ear will resolve spontaneously in three months and three quarters will resolve by six months with little recurrence.13 Therefore, an appropriate surgical intervention rate for glue ear should be roughly 25% of new episodes. The incidence of new episodes of glue ear in Somerset was 17.7 per 1000 0-14 year olds, suggesting that the surgical intervention rate should be about 4.4 per 1000. The actual surgical intervention rate was 6 per 1000 children. This indicates that some children in Somerset may be receiving inappropriate surgery. Clearly the figure of 4.4 per 1000 is not exact and there could be variation around it, but the combination of widely accepted effectiveness information with local valid data on the disease enabled a modification to the contract to purchase reduced levels of surgery and increased watchful waiting.
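The glue ear arithmetic above follows directly from the figures in the text and can be reproduced in a few lines:

```python
# If roughly three quarters of new episodes resolve spontaneously within
# six months, only about a quarter may warrant surgery, so the expected
# surgical rate follows from the local incidence.
incidence_per_1000 = 17.7        # new episodes per 1000 0-14 year olds
appropriate_fraction = 0.25      # ~75% resolve by six months
expected_rate = incidence_per_1000 * appropriate_fraction  # ~4.4 per 1000
actual_rate = 6.0                # observed surgical rate per 1000
excess = actual_rate - expected_rate
```

The positive excess is what motivated the contract change towards reduced surgery and increased watchful waiting, with the caveat the text gives that 4.4 per 1000 is approximate.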

As with many common surgical conditions, services for patients with glue ear have traditionally been supply led rather than need driven. When local providers have been challenged to explain high intervention rates they have often asserted that these result from a high local incidence of the condition rather than a high inclination for intervention. Until now such arguments could be countered only by conducting expensive and time consuming community based surveys. Our data enabled more informed debate over the contract for glue ear services.

The general practitioner user group also believe they will be able to use the data on asthma, which showed wide interpractice variation. They intend to use the relative proportions of first presentations of asthma, acute attacks in patients known to have asthma, and attendances for ongoing management as a way of gauging the effectiveness of their management of chronic asthma. Practices whose asthma management clinics are functioning optimally would be expected to have relatively fewer attendances for acute attacks compared with attendances for ongoing management.


The availability of up to date, representative, and accurate data from primary care on all disease in Somerset has allowed general practitioners, public health doctors, and commissioners to begin to work together to pursue a goal of primary care led purchasing. Databases such as ours offer an opportunity to use peer reviewed evidence of effectiveness in a local context, which we hope to pursue further. They are a useful addition to available information about the health of our local population and may enable us to monitor the effect of evidence based purchasing on the diseases seen in primary care.

We thank all participating general practitioners and staff in the Somerset morbidity project, Dr Douglas Flemming and his staff at the Weekly Returns Service for advice and support, and Dr Anna McCormick for software for computerised edit checks.


  • Funding South Western Regional Health Authority Research and Development Directorate (grant No 275), Department of Health Information Management Group, and Somerset Health Authority.

  • Conflict of interest None.

