- Tony Kendrick, professor of primary medical care¹,
- Christopher Dowrick, professor of primary medical care²,
- Anita McBride, research fellow¹,
- Amanda Howe, professor of primary care³,
- Pamela Clarke, research assistant²,
- Sue Maisey, research associate³,
- Michael Moore, senior lecturer¹,
- Peter W Smith, professor of social statistics⁴
- ¹University of Southampton Primary Medical Care Group, Aldermoor Health Centre, Southampton SO16 5ST
- ²University of Liverpool School of Population, Community and Behavioural Sciences, University of Liverpool, Liverpool L69 3GB
- ³University of East Anglia School of Medicine, Health Policy and Practice, University of East Anglia, Norwich NR4 7TJ
- ⁴Southampton Statistical Sciences Research Institute, University of Southampton, Southampton SO17 1BJ
- Correspondence to: T Kendrick
- Accepted 19 February 2009
Objective: To determine if general practitioner rates of antidepressant drug prescribing and referrals to specialist services for depression vary in line with patients’ scores on depression severity questionnaires.
Design: Analysis of anonymised medical record data.
Setting: 38 general practices in three sites: Southampton, Liverpool, and Norfolk.
Data reviewed: Records for 2294 patients assessed with severity questionnaires for depression between April 2006 and March 2007 inclusive.
Main outcome measures: Rates of prescribing of antidepressants and referrals to specialist mental health or social services.
Results: 1658 patients were assessed with the 9 item patient health questionnaire (PHQ-9), 584 with the depression subscale of the hospital anxiety and depression scale (HADS), and 52 with the Beck depression inventory, 2nd edition (BDI-II). Overall, 79.1% of patients assessed with either PHQ-9 or HADS received a prescription for an antidepressant, and 22.8% were referred to specialist services. Prescriptions and referrals were significantly associated with higher severity scores. However, overall rates of treatment and referral were similar for patients assessed with either measure despite the fact that, with PHQ-9, 83.5% of patients were classified as moderately to severely depressed and in need of treatment, whereas only 55.6% of patients were so classified with HADS. Rates of treatment were lower for older patients and for patients with comorbid physical illness (including coronary heart disease and diabetes) despite the fact that screening for depression among such patients is encouraged in the quality and outcomes framework.
Conclusions: General practitioners do not decide on drug treatment or referral for depression on the basis of questionnaire scores alone, but also take account of other factors such as age and physical illness. The two most widely used severity questionnaires perform inconsistently in practice, suggesting that changing the recommended threshold scores for intervention might make the measures more valid, more consistent with practitioners’ clinical judgment, and more acceptable to practitioners as a way of classifying patients.
Since April 2006 the United Kingdom general practice contract quality and outcomes framework (QOF) has provided incentives to general practitioners to measure the severity of depression with a validated questionnaire at the start of treatment in all diagnosed cases.1 The aim is to improve the targeting of treatment, particularly antidepressants, to patients with moderate to severe depression, in line with guidelines.2 3 The rationale is that doctors’ global assessments of depression severity do not agree well with valid and reliable self reported measures of severity in terms of cut-off levels for case identification,4 5 6 7 resulting in overtreatment of mild cases and undertreatment of moderate to severe cases.7 8
The three recommended measures of depression severity are the 9 item patient health questionnaire (PHQ-9),9 the depression subscale of the hospital anxiety and depression scale (HADS),10 11 and the Beck depression inventory, 2nd edition (BDI-II).1 12 13 In principle, a higher score on these measures indicates greater severity requiring greater intervention. However, the QOF guidance also recommends that clinicians consider the degree of associated disability, history of depression, and patient preference when assessing the need for treatment rather than relying completely on the questionnaire score.1
These measures were designed for slightly varying purposes, and none of them is a “gold standard” measure of depression severity. HADS is a short screening instrument designed to identify patients with a greater probability of depression, who should then be further assessed with a more extended measure or a clinical interview, rather than a measure of severity in itself.10 BDI-II, on the other hand, was designed as a longer measure of severity,12 and PHQ-9 was developed as both a screening instrument and a severity measure.14 Data on the completion of these measures from the National Health Service Information Centre showed that they were used in a mean of 91% of diagnosed cases across all UK practices in 2007-8, up from 81% in 2006-7.15 The accuracy and utility of the measures have been questioned, however, suggesting that, even if they use the questionnaires, practitioners may ignore the scores when deciding about treatment or referral.16
The aim of this study was to examine the general practice management of patients with depression who completed severity questionnaires, to determine whether the use of the measures was consistent with the rationale for their introduction—specifically whether rates of treatment with antidepressant drugs and referrals for psychological or psychiatric treatment differed in line with patients’ scores on the measures. Other potentially important predictors of intervention—including demographic factors, history of depression, and concurrent physical illness—were also examined to explore whether these factors seemed to influence rates of treatment or referral as well.
The study was conducted in three sites that, between them, served inner city, suburban, and rural areas with varying levels of deprivation, presence of ethnic minorities, and availability of primary care mental health workers and psychological therapies. This allowed us to explore the effect of site on rates of treatment and referral.
Qualitative data on the use of the measures were also collected through interviews with general practitioners and patients and are reported separately.17
The study was conducted in three primary care trusts—Southampton City, Liverpool, and Norfolk. All general practices within Southampton City Primary Care Trust were approached. Practices approached in Liverpool were either members of the Mersey Primary Care R&D Consortium (and therefore had an interest in research) or were members of the Matchworks Consortium, an informal association of practices with an interest in mental health. Practices approached in Norfolk were those which had previously indicated an interest in participating in research. Interested practices were asked if they would provide anonymised data on all patients they had assessed with the depression severity measures between April 2006 and March 2007 inclusive.
With the help of practices’ clerical staff and local primary care trusts’ pharmaceutical advisers, anonymised data were extracted from the computerised medical records of patients for whom a depression severity questionnaire score had been recorded between April 2006 and March 2007. Data extracted included patients’ scores recorded on the questionnaires; age and sex; concurrent physical illness; history of depression; and subsequent management within three months of completion of the questionnaire scores (including follow-up appointments with general practice staff, antidepressant drug treatment, and referrals for psychological, psychiatric, or social services). Data were entered into a Microsoft Excel database and transferred into the statistical programs SPSS, version 15, and Stata, version 10, for analysis.
We estimated that the incidence of new diagnoses of depression in the year would be around 1%,2 a mean of around 60 per practice, but assumed conservatively that only about half of the affected patients would be assessed with a severity measure. We aimed to detect a 15% difference in the proportion of patients treated with antidepressants between those with mild depression and those with moderate to severe depression. On the basis of these assumptions, we aimed to gather data on about 560 patients with mild depression and 280 with moderate to severe depression, from a minimum of 28 group practices providing a mean of 30 patients each: this would provide 90% power at the 5% level of significance to detect the 15% difference, allowing for an intracluster correlation coefficient of 0.05 based on the levels typically found in primary care studies.18
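The allowance for clustering described above can be illustrated with the standard design effect formula, 1 + (m − 1) × ICC, where m is the mean number of patients per practice. The sketch below is an assumption about how such an adjustment is commonly made; it is not a reconstruction of the authors' power calculation, and the function names are hypothetical.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Variance inflation caused by sampling patients in clusters (practices)."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(total_n: float, mean_cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return total_n / design_effect(mean_cluster_size, icc)


# Planning figures quoted in the text: 28 practices x 30 patients = 840 patients,
# split as 560 mild and 280 moderate to severe, with an ICC of 0.05.
planned_n = 560 + 280
deff = design_effect(mean_cluster_size=30, icc=0.05)
n_eff = effective_sample_size(planned_n, mean_cluster_size=30, icc=0.05)

print(f"design effect: {deff:.2f}")          # about 2.45
print(f"effective sample size: {n_eff:.0f}")  # about 343
```

Under these assumptions the planned 840 patients carry roughly the information of 343 independent patients, which is why the target sample is much larger than an unclustered calculation would suggest.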
Rates of follow-up, treatment, and referrals were analysed to determine associations with the severity scores recorded. We compared patients in three categories (minimal depression, mild depression, and moderate to severe depression) using a χ2 test with an adjustment to allow for clustering within practices. The threshold scores for mild depression and for moderate to severe depression were 5 and 10 respectively for the patient health questionnaire (PHQ-9).9 14 For the depression subscale of the hospital anxiety and depression scale (HADS) they were 8 and 11 respectively,10 and for the Beck depression inventory (BDI-II) they were 14 and 20 respectively.13 The effects of the other factors were investigated using logistic regression models which included, when significant, a random effect to allow for clustering within practices. Originally, our plan was to analyse data on all patients together, regardless of which instrument was used to categorise the severity of their depression.
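The threshold scores above translate directly into a three-way classification. The following is a minimal sketch of that mapping; the dictionary and function names are hypothetical, and only the threshold values come from the text.

```python
# Lower bounds for (mild, moderate to severe) on each instrument, as stated above.
THRESHOLDS = {
    "PHQ-9": (5, 10),
    "HADS": (8, 11),
    "BDI-II": (14, 20),
}


def classify_severity(instrument: str, score: int) -> str:
    """Map a questionnaire score to one of the three analysis categories."""
    mild_min, moderate_min = THRESHOLDS[instrument]
    if score >= moderate_min:
        return "moderate to severe"
    if score >= mild_min:
        return "mild"
    return "minimal"


# The same raw score can fall in different categories on different instruments.
for instrument in THRESHOLDS:
    print(instrument, classify_severity(instrument, 10))
```

A score of 10, for example, counts as moderate to severe on PHQ-9 but as mild on HADS and minimal on BDI-II, which is one reason the instruments cannot be pooled without care.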
The numbers of practices agreeing to take part were 15 (54%) of the 28 approached in Liverpool, 13 (34%) of 38 approached in Southampton, and 10 (38%) of 26 approached in Norfolk. Overall, 2294 patients from the 38 practices had a depression severity score recorded in their records, including 1658 for PHQ-9, 584 for HADS, and 52 for BDI-II. The mean number of patients assessed per practice was 60.4, twice as many as we had anticipated. PHQ-9 was used in 32 practices (13 in Liverpool, 10 in Southampton, and 9 in Norfolk) and HADS in 21 (8 in Liverpool, 9 in Southampton, and 4 in Norfolk). Both measures were used in 15 practices. The number assessed with the BDI-II was considered too small for any meaningful analysis. Table 1 shows the characteristics of patients assessed by the other two measures. The sample assessed with HADS included fewer older patients, fewer patients with recurrent depression, and fewer with concurrent physical illness.
Figure 1 shows the distribution of scores for the two measures. The distribution was roughly normal for both measures, although the PHQ-9 scores were slightly more positively skewed. The mean (standard deviation) PHQ-9 score was 15.5 (6.0) and the mean HADS score 11.1 (4.6).
Figure 2 shows the proportions of patients in three categories of depression severity as defined by their scores: minimal depression (scores 0-4 with PHQ-9, 0-7 with HADS); mild (scores 5-9 with PHQ-9, 8-10 with HADS); and moderate to severe (≥10 with PHQ-9, ≥11 with HADS). With PHQ-9, 1384 (83.5%) of the 1658 patients assessed were categorised as having moderate to severe depression (for which active intervention is generally recommended), compared with only 325 (55.6%) of the 584 patients assessed with HADS. On finding these differences in the proportions of patients classified as having moderate to severe depression, we decided to change our plan of analysis. Instead of analysing the data all together regardless of the instrument used, we carried out separate analyses for the two main instruments.
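The two proportions quoted above can be recomputed from the reported counts. The sketch below also attaches a simple Wald 95% confidence interval; note that this interval ignores clustering within practices, so it is an illustrative assumption and not the clustering-adjusted analysis the authors describe. The function name is hypothetical.

```python
import math


def proportion_with_ci(events: int, total: int, z: float = 1.96):
    """Return (proportion, lower, upper) using a naive Wald interval."""
    p = events / total
    se = math.sqrt(p * (1.0 - p) / total)
    return p, p - z * se, p + z * se


# Counts of patients classified as moderate to severe, taken from the text.
phq9 = proportion_with_ci(1384, 1658)
hads = proportion_with_ci(325, 584)

print(f"PHQ-9: {phq9[0]:.1%}")  # 83.5%
print(f"HADS:  {hads[0]:.1%}")  # 55.7% unrounded; the text reports 55.6%
print(f"difference: {(phq9[0] - hads[0]) * 100:.1f} percentage points")
```

The gap of roughly 28 percentage points between the instruments is the difference that prompted the separate analyses.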
Table 2 shows the proportions of patients, within each category of depression severity according to PHQ-9 and HADS, in receipt of a follow-up appointment within four weeks; a prescription for an antidepressant; and a referral to a counsellor, primary care mental health worker, psychology, social services, or psychiatry. Overall, 1774 (79.1%) of 2242 patients assessed with either measure received a prescription for an antidepressant, and 512 (22.8%) were referred to specialist services. Prescriptions for antidepressants were significantly associated with greater severity of depression for both measures. Follow-up appointments, referrals for counselling, and any referral to specialist services were also significantly associated with greater severity for patients assessed with PHQ-9, but not for those assessed with HADS.
Table 3 shows odds ratios for bivariate associations between items of management and patient characteristics for those assessed with PHQ-9. Follow-up within four weeks, a prescription for antidepressants, and any referral to specialist services were significantly more likely with moderate to severe depression compared with minimal depression. Men were more likely to be referred to psychiatry than women, but referrals and antidepressant prescriptions were significantly less likely for older patients or for patients with diabetes, coronary heart disease, or other chronic physical illness.
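For readers unfamiliar with the statistic, a bivariate odds ratio of this kind comes from a 2×2 table of a patient characteristic against an item of management. The sketch below uses made-up counts purely for illustration; they are not study data, the function name is hypothetical, and the Woolf (log) confidence interval shown ignores clustering within practices.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Woolf 95% CI for a 2x2 table.

    a: group 1 with outcome      b: group 1 without outcome
    c: group 2 with outcome      d: group 2 without outcome
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# HYPOTHETICAL counts (not from the study): prescribed vs not prescribed,
# in a higher-severity group (80, 20) and a lower-severity group (50, 50).
estimate, lower, upper = odds_ratio(80, 20, 50, 50)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An odds ratio above 1 with a confidence interval that excludes 1 corresponds to the "significantly more likely" wording used for the associations in the text.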
Table 4 shows odds ratios for bivariate associations between items of management and patient characteristics for those assessed with HADS. Again, a prescription for antidepressants was significantly more likely with increasing severity of depression, and there were trends towards more referrals with increasing severity and towards fewer antidepressant prescriptions for older patients and for those with concurrent physical illnesses.
Tables 3 and 4 also show that in Southampton practices antidepressant prescriptions were significantly more likely whereas referrals to psychology were significantly less likely, compared with the Liverpool and Norfolk practices. Referrals to primary care mental health workers were significantly more likely in Norfolk practices, at least among those assessed with HADS.
Table 5 shows the results of logistic regression analyses of factors associated with antidepressant prescriptions or any referral to mental health or social services to account for confounding of the various patient and recruitment centre factors. Three factors remained consistently significant across the two groups of patients. A prescription for an antidepressant was significantly more likely for patients with moderate to severe depression on either measure, while patients aged ≥65 years were less likely to be referred to specialist services, and patients in Southampton practices were more likely to be prescribed antidepressants than patients at the other two sites.
The distribution of the questionnaire scores we found represents the right hand end of the distribution which is found when all general practice patients are screened.19 General practitioner diagnosis of depression is associated with greater severity,4 20 and the distribution we found is consistent with the purpose of the quality and outcomes framework (QOF) indicator, which encourages the use of questionnaires to assess the severity of depression that has already been diagnosed and is being considered for treatment rather than to screen patients for undiagnosed depression.
Around 80% of patients assessed with either the patient health questionnaire (PHQ-9) or the hospital anxiety and depression scale (HADS) received prescriptions for antidepressants, and around 20% were referred to specialist services. In our logistic regression analysis, a greater severity of depression according to either measure was significantly associated with an increased likelihood of being prescribed an antidepressant, and being referred to specialist services was significantly more likely for patients with the highest scores on HADS. These findings are in line with the rationale for the introduction of the measures. They also accord with previous research showing that treatment of depression in general practice is related to greater severity of symptoms.21 However, we found other factors to be associated with treatment and referral, and rates of treatment and referral were not consistent for categories of severity when we compared the two measures of depression.
Strengths and weaknesses of the study
We recruited 38 general practices, 10 more than the 28 we had aimed for, and found that twice as many patients per practice had been assessed with the severity questionnaires as we had anticipated, which meant we had a much larger sample overall than we expected. This increased the potential power of the sample. However, this was not a random sample of general practices as practices had to be willing to take part, so we probably recruited practices including doctors with a greater interest in the assessment of depression. They may not be representative of all UK general practices, although they were recruited from a range of locations including inner city areas, relatively affluent suburbs, and rural areas across the three recruitment sites, including a range of levels of deprivation and differing proportions of minority ethnic patients. The somewhat higher response rates among practices in Liverpool and Norfolk were probably because the practices approached there were members of groups interested in research or mental health or both.
Our sample included too few patients assessed with the Beck depression inventory for meaningful analysis of that measure, so we were unable to include it. Originally, we had intended to analyse the total sample together, but when we found a marked difference between the samples assessed with PHQ-9 and with HADS in the proportions of patients classified as having moderate to severe depression, and therefore potentially in need of treatment, we decided we should keep the analyses separate for the two instruments. The considerably larger than expected sample size favoured this separate analysis, but the number assessed with HADS was only a third of the number assessed with PHQ-9, which meant that within the HADS sample we had less power to identify significant associations between scores and treatment and referral, increasing the risk of a type II error.
Findings in relation to other studies
Our logistic regression analyses showed that factors other than measures of depression severity were independently associated with treatment or referral. Patients aged ≥65 years were less likely to be referred to specialist services, which is in line with previous research suggesting undertreatment of older people with depression, recently highlighted by Age Concern.22
The bivariate associations between older age and lower rates of antidepressant treatment among patients assessed with PHQ-9 seemed to be accounted for by the presence of diabetes or coronary heart disease when these were considered together in the logistic regression. It seems likely that more older patients were administered the PHQ-9, not because they were being considered for treatment in ordinary surgeries, but because they had been screened for depression routinely as part of practice care for diabetes or coronary heart disease, as screening in those conditions is encouraged in the quality and outcomes framework.1 If so, it is possible that general practitioners tended to discount depression detected on screening when compared with depression presented to them by patients, perhaps because of an unwillingness to medicalise distress and label patients who were not complaining of depression. However, we cannot be sure this was the reason, and ideally we would need to compare rates of treatment and referral between patients with and without diabetes or coronary heart disease in a sample where all patients had been screened for depression to determine whether this explanation is correct.
Whatever the reason, it is perhaps surprising that patients with diabetes, coronary heart disease, and other physical illness tended to be treated and referred less often, despite the fact that they are known to be at higher risk of depression, and that depression is associated with a worse prognosis for the physical illnesses.23 24 25 It seems that encouraging general practitioners to screen for depression in these patients does not make them more likely to receive treatment than patients without diabetes or coronary heart disease. It is possible that the practitioners were concerned about possible drug side effects affecting the comorbid physical problems. However, this would not explain the lower rates of referral for psychological treatments. Another possibility is that patients with comorbid physical conditions and multiple medications may be reluctant to accept either treatment or referral, even more so if they have been detected by screening rather than presenting symptoms themselves.
A history of depression was associated with a greater likelihood of drug treatment, at least among patients assessed with PHQ-9, which suggests that patients with a history may be more willing to consider medication in light of previous experience of the illness, or general practitioners may be more willing to prescribe for them, or both.
There were differences in treatment and referral rates between recruitment sites, which may be explained by local circumstances. Patients in Southampton practices were more likely to be prescribed antidepressants than patients at the other two sites, which is consistent with prescribing analyses and cost data from the Prescription Pricing Authority26 showing that Southampton practices were relatively high prescribers of antidepressants. The average daily quantity use of selective serotonin reuptake inhibitors was 2150.15 per 1000 “age, sex and temporary resident originated prescribing units” in Southampton City Primary Care Trust for the second quarter of 2006 compared with 1529.69 for England (personal communication Julie Mulvihill, prescribing support technician, Southampton City). At the time covered by our study data, psychological services and counselling had longer waiting lists in Southampton than in Liverpool and Norfolk, which may explain the differences in treatment rates, as general practitioners probably prescribe antidepressants more often if specialist psychological treatment is less readily available.
Meaning of the study and implications for practice
The proportion of patients categorised as having moderate to severe depression was 83.5% for those assessed with PHQ-9 compared with only 55.6% for those assessed with HADS. Despite this large difference in classification of patients, the general practitioners prescribed antidepressants to a similar proportion of patients assessed with either measure, which suggests that the score on the measure was not the defining factor in the decision to prescribe. It is likely that the practitioners usually decided on treatment on the basis of their clinical judgment, taking into account other factors besides the symptom count—such as the degree of associated impairment, a history of depression, and previous treatment with antidepressants. This explanation is supported by the findings of the qualitative interviews of participating general practitioners which suggested that they usually used their clinical judgment to decide on treatment and were only occasionally influenced by the questionnaire scores.17 Future research could address this issue by asking practitioners to rate their certainty of diagnosis and need for treatment before administering the instruments.
Research suggests that general practitioners’ categorical assessments of patients often differ from those of questionnaire measures.4 7 The question then is which is more accurate in terms of predicting the need for treatment, general practitioners’ clinical judgment or questionnaire measures? Previous research into concomitant administration, to a single sample of patients, of the two measures included in this study found that a greater proportion of the sample was classified as having moderate to severe depression according to PHQ-9 than according to HADS.27
It should be emphasised that neither of the two measures is an optimum measure of the severity of depression, and scores above the recommended cut-off values give only an indication that a particular patient is likely to have major depressive disorder. Recent validation studies against more extensive diagnostic assessments have suggested that the accuracy of the measures in predicting major depressive disorder could be improved by using a more conservative cut-off score of 12 rather than 10 with PHQ-9,6 28 and a less conservative cut-off score of 9 rather than 11 with HADS.6 In our sample 76.1% of patients scored ≥12 on PHQ-9 and 72.6% of patients scored ≥9 on HADS, so if these two cut-off values had been used a similar proportion of patients would have been classified as having moderate to severe depression by the two instruments. These proportions would also be more in line with the general practitioners’ rates of treatment, and so changing the recommended cut-off scores in these ways might make the scores more valid, more consistent with practitioners’ clinical judgment, and therefore more acceptable to practitioners as a way of classifying patients.
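The effect of the alternative cut-off scores can be made concrete by comparing the gap between the two instruments under each pair of thresholds. The sketch below uses only the percentages reported in this paper; the variable and function names are hypothetical.

```python
# Percentage of each sample at or above the cut-off, as reported in the text.
PERCENT_ABOVE_CUTOFF = {
    "recommended": {"PHQ-9 >= 10": 83.5, "HADS >= 11": 55.6},
    "alternative": {"PHQ-9 >= 12": 76.1, "HADS >= 9": 72.6},
}


def instrument_gap(percentages: dict) -> float:
    """Absolute difference, in percentage points, between the two instruments."""
    values = list(percentages.values())
    return abs(values[0] - values[1])


for label, percentages in PERCENT_ABOVE_CUTOFF.items():
    print(f"{label}: gap of {instrument_gap(percentages):.1f} percentage points")
```

With the recommended cut-offs the instruments disagree by about 28 percentage points; with the alternative cut-offs the gap falls to about 3.5 points, and both figures sit closer to the 79.1% of patients who received an antidepressant.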
Both PHQ-9 and HADS were used in some of the participating practices, so even within group practices it seems that doctors may differ in terms of their instrument of choice. This is an argument for retaining the option of using a number of severity measures within the quality and outcomes framework, as long as greater consistency in classifying mild and moderate depression can be achieved.
What is already known on this topic
- Under the quality and outcomes framework, UK general practitioners are rewarded for using validated questionnaire measures of depression severity at the outset of treatment in all diagnosed cases
- While general practitioners are using the questionnaires in more than 90% of diagnosed cases, qualitative evidence suggests they doubt the validity of the measures and use their clinical judgment to decide about treatment regardless of patients’ questionnaire scores
What this study adds
- Prescriptions for antidepressants and referrals to specialist services were significantly associated with higher depression severity scores, but other factors were independently associated with treatment and referral, including patient age, concurrent physical illness, and geographical area
- The patient health questionnaire classified 83.5% of patients as moderately to severely depressed and in need of treatment, compared with only 55.6% of patients assessed with the hospital anxiety and depression scale, but overall rates of treatment and referral were similar for patients assessed with either measure
- Changing the recommended severity thresholds for intervention might make the measures more valid, more consistent with practitioners’ clinical judgment, and therefore more acceptable to practitioners as a way of classifying patients
Cite this as: BMJ 2009;338:b750
We thank the participating practices and their staff, and the PCT pharmaceutical advisers for their help with data collection. We also thank Simon Gilbody, Geraldine M Leydon, Robert Peveler, Deborah Sharp, and Andre Tylee for advice on the design, conduct, and interpretation of the study.
Contributors: TK and CD devised the idea of the study and designed the methods. TK raised the funding. AMcB, PC, and SM were responsible for implementing the study. PWS led the analysis with TK, with contributions from CD, MM, and AH. TK prepared the first draft of the manuscript, and all authors contributed to each section of the final draft of the manuscript. TK is guarantor for this study.
Funding: This study was funded by an unrestricted educational grant from Lilly, Lundbeck, Servier, and Wyeth pharmaceuticals. It also received funding from Southampton City Primary Care Trust and the Mental Health Research Network, West, North West, and East Anglia hubs. None of the above bodies had any role in study design; the collection, analysis, and interpretation of data; the writing of the paper; or the decision to submit this paper for publication. The study was sponsored by the University of Liverpool.
Competing interests: The study was funded by pharmaceutical companies that manufacture antidepressants. TK has received fees for presenting at educational meetings from Lilly, Lundbeck, Wyeth, and Pfizer pharmaceuticals. TK and CD are members of the mental health expert panel of advisers for the UK general practice contract quality and outcomes framework, which recommended the inclusion of the incentives for using depression severity questionnaires in the contract.
Ethical approval: Permission for the study was obtained from Liverpool Paediatric Research Ethics Committee, reference 07/Q1502/23, and approval was obtained from local ethics committees and primary care trust research governance offices at all three sites.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.