Open access (CC BY-NC)
Research

Effect of telephone health coaching (Birmingham OwnHealth) on hospital use and associated costs: cohort study with matched controls

BMJ 2013; 347 doi: https://doi.org/10.1136/bmj.f4585 (Published 06 August 2013) Cite this as: BMJ 2013;347:f4585
  1. Adam Steventon, senior research analyst (1)
  2. Sarah Tunkel, director (2)
  3. Ian Blunt, senior research analyst (1)
  4. Martin Bardsley, head of research (1)

Affiliations
  1. The Nuffield Trust, London W1G 7LP, UK
  2. Ernst & Young, London SE1 2AF, UK

Correspondence to: A Steventon adam.steventon{at}nuffieldtrust.org.uk
  • Accepted 9 July 2013

Abstract

Objectives To test the effect of a telephone health coaching service (Birmingham OwnHealth) on hospital use and associated costs.

Design Analysis of person level administrative data. Difference-in-difference analysis was done relative to matched controls.

Setting Community based intervention operating in a large English city with industry.

Participants 2698 patients recruited from local general practices before 2009 with heart failure, coronary heart disease, diabetes, or chronic obstructive pulmonary disease; and a history of inpatient or outpatient hospital use. These individuals were matched on a 1:1 basis to control patients from similar areas of England with respect to demographics, diagnoses of health conditions, previous hospital use, and a predictive risk score.

Intervention Telephone health coaching involved a personalised care plan and a series of outbound calls usually scheduled monthly. Median length of time enrolled on the service was 25.5 months. Control participants received usual healthcare in their areas, which did not include telephone health coaching.

Main outcome measures Number of emergency hospital admissions per head over 12 months after enrolment. Secondary metrics calculated over 12 months were: hospital bed days, elective hospital admissions, outpatient attendances, and secondary care costs.

Results In relation to diagnoses of health conditions and other baseline variables, matched controls and intervention patients were similar before the date of enrolment. After this point, emergency admissions increased more quickly among intervention participants than matched controls (difference 0.05 admissions per head, 95% confidence interval 0.00 to 0.09, P=0.046). Outpatient attendances also increased more quickly in the intervention group (difference 0.37 attendances per head, 0.16 to 0.58, P<0.001), as did secondary care costs (difference £175 per head, £22 to £328, P=0.025). Checks showed that we were unlikely to have missed reductions in emergency admissions because of unobserved differences between intervention and matched control groups.

Conclusions The Birmingham OwnHealth telephone health coaching intervention did not lead to the expected reductions in hospital admissions or secondary care costs over 12 months, and could have led to increases.

Introduction

Facing rising costs, healthcare systems around the world are exploring innovative ways to improve efficiency. Particular attention has been placed on the use of technology to help manage long term health conditions,1 including one-to-one telephone health coaching. This involves a regular series of phone calls between patient and health professional. The calls aim to provide support and encouragement to the patient and promote healthy behaviours such as treatment control, healthy diet, physical activity and mobility, rehabilitation, and good mental health.2 The hope is that the patient will maintain their own health more independently, and that the professional and patient will be in a better position to identify problems before they become critical. In turn, admissions to hospital may be prevented.3 Avoidable hospital admissions are both undesirable for the patient and expensive for the payer.

In a systematic review of telephone health coaching for people with long term conditions, only nine of 34 studies investigated effects on health service use.2 Four studies showed effects in this area, but findings were hard to generalise because the studies looked at a range of different health conditions, had relatively small samples (the average sample was fewer than 360 people), and included interventions that were heterogeneous. Five of the nine interventions included telemonitoring of vital signs such as blood pressure alongside telephone coaching. Since the review, other larger studies have been conducted.

Wennberg and colleagues conducted a large randomised controlled trial of 174 120 patients with employer based health insurance.4 The intervention was given to patients who were at high predictive risk (for example, of future hospital costs), with a lower risk threshold used in one treatment group than in the other. The researchers concluded that telephone care management reduced hospital admissions overall and among patients with selected long term conditions (heart failure, coronary artery disease, chronic obstructive pulmonary disease, diabetes, and asthma), although there was no statistically significant effect on the subset of admissions that came through the emergency room. Among the subgroup of patients with long term conditions, overall medical and pharmacy costs were $51 (£33.8; €38.8) per month lower in the more aggressively targeted group than the other. Their intervention did not include telemonitoring, but it did include an element of shared decision making for “preference sensitive conditions” (for example, with regards to treatment options for arthritis of the hip). The shared decision making could be largely responsible for the intervention’s effect on admissions that did not come through the emergency room.5

Further, Lin and colleagues studied telephone health coaching for 874 US Medicaid members of working age (range 18-64 years, mean 45.3) with at least one of 10 qualifying long term conditions and at least two acute hospital admissions or emergency department visits within a 12 month period.6 Their intervention relied solely on phone calls and mailed educational materials. Health coaches provided information on conditions and treatment options, empowered patients to self manage and self monitor their conditions, and encouraged patients to communicate their preferences to providers. Compared with a matched control group, the authors found no effects on hospital use and expenditures over one year. However, over two years, the number of emergency department visits reduced by a smaller amount for intervention patients than for matched controls, leading to a relative 20% increase in emergency department visits for intervention patients. Findings over one year were later reproduced in a randomised controlled trial among Medicaid patients.7

There is continued interest in telephone health coaching with several providers, but the evidence base is unclear. As only a limited number of large studies have examined the effects on hospital use and have produced contradictory findings,4 6 more information is needed to understand the elements that make up a successful service. We were commissioned by the Department of Health in 2010 to evaluate the effect of England’s largest example of telephone health coaching (Birmingham OwnHealth) on hospital use and associated costs. Previous evaluations of Birmingham OwnHealth had shown that patients had high levels of satisfaction and believed that the service reduced their need to go to hospital.8 In addition, a study showed reductions in levels of glycated haemoglobin (HbA1c), blood pressure, and body mass index among a subset of patients with poorly controlled diabetes.9 We present estimates of the effect of the Birmingham OwnHealth service on hospital admissions and associated costs. The service was decommissioned in 2012 after a consultation exercise.10

Methods

In summary, a retrospective design was used to assess the effects of an existing service. Matched controls aimed to reflect the changes in hospital use that can occur over time even without an intervention.11 12

Intervention, including patient recruitment

Birmingham OwnHealth was established in 2006 in a large city with industry. This area had health inequalities and some parts with high deprivation. Birmingham OwnHealth aimed to improve self care strategies, improve clinical indicators, and reduce health service use.3

The service targeted people with heart failure, coronary heart disease, diabetes, or chronic obstructive pulmonary disease. Inclusion criteria were all of the following:

  • A recorded diagnosis of one of the targeted conditions

  • A minimum level of disease severity (for example, HbA1c>7.4 in the past 15 months)3

  • Age 18 or older

  • Ability to communicate on the telephone

  • A recorded address and practice registration.

Potentially eligible patients were identified through analysis of data extracts sourced from the participating general practices. Summary files were then reviewed and ratified by general practitioners. General practitioners applied additional clinical judgment in determining which patients to refer into the Birmingham OwnHealth service, based on a set of consideration factors. These factors included comorbidities under active treatment, personality and mental health problems, and life circumstances such as pregnancy. General practice staff sent an introductory letter to the selected patients, which was followed by a phone call by a representative from Birmingham OwnHealth.

Once enrolled, patients (who were known as “members”) were assigned a care manager, a specially trained nurse employed by NHS Direct. General practice data were also transferred onto the operational systems used by Birmingham OwnHealth. During the programme, care managers followed five fundamental steps: assessment, recommendation, follow-up, ongoing management, and review.

Care managers made regular telephone calls to patients. These calls were usually made monthly at a predetermined date and time to suit the user, although a minority of patients received calls more frequently (two to four times per month) owing to disease severity, social isolation, or severe weather. As the number of members grew, some members were stepped down (or “graduated”) to quarterly calls. During the telephone calls, care managers asked patients about current health status and symptoms, and recorded this information along with other information such as recent test results and changes to treatment. Care managers then gave personalised guidance and support, aiming to build continuing relationships with patients and provide motivation, skills, and knowledge to encourage patients to better manage their health conditions. The calls focussed on eight priorities in care management—namely, to ensure that patients were able to do all of the following:

  • Know how and when to get help for health and social care problems

  • Learn about their condition, and agree and set treatment goals within a personalised care plan

  • Take medicines correctly

  • Get recommended tests and services

  • Act to keep the condition in good control

  • Learn how to make changes to lifestyle and circumstances to reduce risks

  • Build on strengths and overcome obstacles, while strengthening personal social networks

  • Follow up with specialists and appointments.

Calls were structured into modules focussing on each of these priorities, and used screen prompted algorithms to structure the conversation. The software prompted care managers to follow guidelines in each priority area (for example, provide basic information about treatment). Care managers, however, were not constrained to follow protocols. Additional educational materials could be sent to patients.

Care managers aimed to coordinate input across services, for example, when developing personalised care plans, and could refer patients onto existing services such as mental health services and social care. General practitioners were offered monthly phone calls and quarterly meetings with their assigned care manager to discuss patients. The service was set up to provide proactive calls to patients rather than act as a phone-in service, and inbound calls were less common, comprising about 5% of calls.

Study populations

Study participants were enrolled in Birmingham OwnHealth between the start of the intervention in April 2006 and December 2008. This cut-off was chosen to allow at least 12 months of follow-up at the time this study was commissioned. The service provider identified all such patients recorded in their operational datasets, from which we excluded those without a record of inpatient or outpatient hospital use in the three years before enrolment. Because the matching variables came from hospital data, we could not accurately characterise patients without previous hospital activity. However, we would expect these excluded patients to have low future levels of hospital use,11 and therefore represent little potential to reduce hospital costs over 12 months.

The large size of the service reduced the scope to find matched controls locally, and there may have been spillover effects. Therefore, we selected several comparable areas within England to provide a pool of potential matched controls. Four of these areas (Bradford and Airedale, Sandwell, Stoke, and Wolverhampton) were drawn from a national area classification13 and were similar to the intervention area in terms of demography; occupational mix; and rates of education, occupation, limiting long term illness, and unpaid care. We also included Walsall, which was commonly used as a comparator by the Primary Care Trust that commissioned Birmingham OwnHealth. We checked that a similar health coaching service via telephone had not operated in the selected areas, using internet searches and discussions with colleagues. As a result, we excluded parts of Walsall from the pool.

Study endpoints and sample size calculation

Our primary endpoint was the number of urgent and unplanned (“emergency”) hospital admissions per head over 12 months. Our hypothesis was that the service could alter rates of emergency admissions in either direction. Increases in emergency admissions have been suggested by studies of other types of interventions involving patient outreach in England.14

We performed a sample size calculation at the outset of the study to check that we were likely to have data for a sufficient number of patients to produce meaningful conclusions. We thought it important to detect relative changes of 15% should they occur, based on the level of effect judged as meaningful in similar studies,15 at 90% power and with a two sided significance level of 0.05. Annual admission rates for the usual care group were assumed to be 0.25 per person with a standard deviation of 0.4 (reference 16). Calculations were performed in SAS 9.3, and assumed a correlation of 0.15 between the number of admissions for intervention and matched control patients. Based on these assumptions, 2035 intervention patients were needed.
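The arithmetic behind a calculation of this kind can be sketched as follows. This is an illustration only: it assumes a paired comparison with a normal approximation, which may differ from the procedure actually run in SAS, and the function name and parameters are hypothetical. Under those assumptions it gives a figure close to the 2035 patients reported.

```python
import math
from statistics import NormalDist


def paired_sample_size(base_rate, relative_change, sd, correlation,
                       power=0.90, alpha=0.05):
    """Approximate number of matched pairs needed (normal approximation).

    Hypothetical helper, not the authors' SAS procedure.
    """
    # Absolute difference to detect: 15% of 0.25 admissions per head.
    delta = base_rate * relative_change
    # Variance of a within-pair difference when both arms share the same SD.
    var_diff = 2 * sd ** 2 * (1 - correlation)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two sided test
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil((z_alpha + z_beta) ** 2 * var_diff / delta ** 2)


n_pairs = paired_sample_size(base_rate=0.25, relative_change=0.15,
                             sd=0.4, correlation=0.15)
print(n_pairs)
```

The small positive correlation between matched patients reduces the variance of the paired difference, which is why it lowers the required sample relative to an unpaired comparison.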

Secondary endpoints calculated over 12 months included the number of planned (“elective”) hospital admissions, number of hospital bed days, number of attendances for hospital based ambulatory care (“outpatient attendances”), and costs of secondary care.

Data sources and data linkage

The service providers had access to identifiable data for participants, including a national patient identifier (the “NHS number”), sex, date of birth, and postcode. These data were transferred to the NHS Information Centre for health and social care who used them to link participants to national administrative data for secondary care activity (the hospital episode statistics (HES)).16 The data linkage required an exact or partial match on several of the variables at once. After the data had been linked, the HES identity was transferred to the evaluation team together with the date of patient enrolment into Birmingham OwnHealth, year of birth, sex, and small geographical area code. As a result, we only had access to “pseudonymised” data in which all identifiable fields had been removed or encrypted. The ethics and confidentiality committee of the National Information Governance Board confirmed that data could be linked in this way without explicit patient consent.

Variable definitions

For intervention patients, study endpoints were calculated over 12 months after the date of enrolment into Birmingham OwnHealth. For matched controls, we used the date of enrolment for the corresponding intervention patient. Variables were therefore calculated over the same period for both members of each matched pair.

Analysis of inpatient activity was limited to “ordinary admissions” by excluding regular ward attendances, maternity events, and transfers. Admissions were then classified into either emergency or elective admissions, based on the method of admission. Bed days included stays after emergency admission only and excluded same day admissions and discharges.

Secondary care costs included inpatient and outpatient costs. They were estimated by applying a set of unit costs specific to the case mix,17 which represented the tariff amounts that providers were allowed to charge commissioners in the United Kingdom’s health service. We did not include adjustments for regional differences in care costs, to allow robust comparison of the volume of care services between intervention patients and matched controls. We only attached costs to activity covered by mandatory tariffs, which excluded locally negotiated non-tariff payments and augmented care payments associated with critical care.

Baseline variables were derived using hospital data recorded before the enrolment dates. The variables were based on those used in an established predictive model for emergency hospital admissions over 12 months.18 These variables were age band; sex; area based socioeconomic deprivation score19; health conditions; the number of long term health conditions; and previous emergency, elective, and outpatient hospital use. There were 16 health condition variables, formed from primary and secondary codes from ICD-10 (international classification of diseases, 10th revision) on inpatient data over three years. These health conditions were anaemia, angina, asthma, atrial fibrillation and flutter, cancer, cerebrovascular disease, congestive heart failure, chronic obstructive pulmonary disease, diabetes, history of falls, history of injury, hypertension, ischaemic heart disease, kidney failure, mental health conditions, and peripheral vascular disease.

In addition to these baseline variables, we estimated the risk of emergency hospital admission in the subsequent 12 months. The predictive risk models were based on the variables used in the published model18 but reweighted to reflect patterns of hospital use of Birmingham residents who had never been enrolled into the service. Models were constructed using logistic regression on a monthly basis through the enrolment period and validated on split samples. The estimated β coefficients from the validated models were then applied to intervention patients and potential controls to produce the risk scores that would be used in the matching process.
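As a minimal sketch of the last step, the following shows how fitted logistic regression coefficients can be applied to a patient's baseline variables to give a predicted probability of emergency admission. The variable names and coefficient values are hypothetical and are not taken from the published model.

```python
import math


def risk_score(intercept, coefficients, features):
    """Predicted probability from a fitted logistic regression model.

    coefficients and features are dicts keyed by variable name.
    """
    linear = intercept + sum(coefficients[name] * value
                             for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients, for illustration only.
example_coefficients = {"prior_emergency_admissions": 0.6, "has_copd": 0.4}
example_patient = {"prior_emergency_admissions": 2, "has_copd": 1}
print(risk_score(-2.5, example_coefficients, example_patient))
```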

Methods to select control group

There are several methods for selecting matched control groups, but the aim is always to select, from the wider population of potential controls, a subgroup of patients that is similar to the intervention group with respect to predictive baseline variables.20 The risk score was strongly predictive, so we used a calliper approach whereby the pool of potential matches for a given intervention patient was narrowed down to those patients with a similar risk score (within 20% of one standard deviation).21 From within this restricted set, one control was selected for the intervention patient based on the individual baseline variables using the Mahalanobis multivariate distance metric.22 One matched control was selected for each intervention patient. This was done without replacement, so that the control group consisted of distinct individuals.
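The two-stage selection described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a greedy pass through intervention patients in the order given, a calliper based on the standard deviation of the pooled risk scores, and a covariance matrix estimated on the pooled sample. The function and argument names are hypothetical.

```python
import numpy as np


def match_controls(risk_t, X_t, risk_c, X_c, calliper_sd=0.2):
    """Greedy 1:1 matching without replacement (illustrative sketch).

    risk_t, risk_c: 1-D arrays of risk scores (intervention, potential controls).
    X_t, X_c: 2-D arrays of baseline variables, one row per patient.
    Returns {intervention index: control index}; unmatched patients are omitted.
    """
    # Calliper: 20% of one standard deviation of the risk score (pooled; an assumption).
    calliper = calliper_sd * np.std(np.concatenate([risk_t, risk_c]), ddof=1)
    # Inverse covariance of the baseline variables for the Mahalanobis distance.
    cov = np.cov(np.vstack([X_t, X_c]), rowvar=False)
    cov_inv = np.linalg.pinv(np.atleast_2d(cov))
    available = np.ones(len(risk_c), dtype=bool)
    matches = {}
    for i in range(len(risk_t)):
        # Stage 1: restrict to unused controls with a similar risk score.
        within = np.abs(risk_c - risk_t[i]) <= calliper
        candidates = np.flatnonzero(available & within)
        if candidates.size == 0:
            continue
        # Stage 2: choose the candidate closest in squared Mahalanobis distance.
        diff = X_c[candidates] - X_t[i]
        dist_sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
        best = candidates[np.argmin(dist_sq)]
        matches[i] = int(best)
        available[best] = False  # without replacement
    return matches
```

A greedy pass is the simplest way to enforce matching without replacement, but the result can depend on the order in which intervention patients are processed.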

The main diagnostic in matched control studies is balance, which refers to the similarity of the distribution of baseline variables between intervention and matched control groups. Formal statistical tests are not recommended in the assessment of balance, because they depend on the size of the groups as well as their similarity.23 Instead, we assessed balance using the standardised difference, which is the difference in means as a proportion of the pooled standard deviation.24 Although the standardised difference would ideally be minimised without limit, 10% is often used as a threshold to denote meaningful imbalance.25 The ultimate aim was to select a matched control group that was well balanced across all of the baseline variables, including the predictive risk score; therefore, we adapted the set of variables included in the Mahalanobis distance until we achieved satisfactory balance.26
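The standardised difference for a single baseline variable can be computed as in the sketch below. It assumes that the pooled standard deviation is the square root of the average of the two group variances, which is a common convention but is not spelt out in the text. The function name is hypothetical.

```python
import math
from statistics import mean, variance


def standardised_difference(intervention, control):
    """Difference in means as a percentage of the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(intervention) + variance(control)) / 2)
    return 100 * (mean(intervention) - mean(control)) / pooled_sd


# Illustrative values only; an absolute value above 10 would flag imbalance.
print(standardised_difference([64, 70, 58, 66], [63, 71, 59, 65]))
```

Unlike a P value, this quantity does not shrink or grow with the number of patients, which is why it is preferred for judging balance.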

Statistical approach

After matched groups had been constructed, we estimated the intervention effect using a difference-in-difference estimator. Thus, intervention and matched control groups were compared in terms of the change in the number of hospital admissions observed from the year before the enrolment date to the year after the enrolment date. Paired t tests were conducted on the change scores to reflect the matched nature of the data.27
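A minimal sketch of the estimator, using paired change scores, is shown below. It reports the mean difference-in-difference, a confidence interval, and the paired test statistic. The interval uses a normal approximation rather than the t distribution, an assumption that makes little difference with several thousand pairs. The function name and the example data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def difference_in_difference(pre_t, post_t, pre_c, post_c, level=0.95):
    """Paired difference-in-difference estimate for matched pairs.

    Each list holds one value per matched pair, in the same order.
    Returns (estimate, (lower, upper), test statistic).
    """
    # Change in the intervention patient minus change in the matched control.
    d = [(pt - bt) - (pc - bc)
         for bt, pt, bc, pc in zip(pre_t, post_t, pre_c, post_c)]
    estimate = mean(d)
    se = stdev(d) / math.sqrt(len(d))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, (estimate - z * se, estimate + z * se), estimate / se


# Admissions in the year before and after enrolment for four illustrative pairs.
est, ci, stat = difference_in_difference(
    pre_t=[0, 1, 0, 2], post_t=[1, 1, 1, 2],
    pre_c=[0, 1, 0, 2], post_c=[0, 1, 1, 2])
print(est, ci, stat)
```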

Analysis was conducted over 12 months, regardless of death. The use of national administrative data to define variables meant that we considered there was a limited amount of missing data, because patients could be tracked even if they moved out of the Birmingham area, provided that they remained within England. We did not analyse patients who could not be linked to hospital data or could not be matched to a control.

Efforts to avoid bias and sensitivity analysis

The analysis was designed to reduce the susceptibility of the study to differences between intervention and control groups. Differences in variables that are predictive of future hospital use could result in confounding and biased estimates. The matching algorithm removed differences in important predictive variables and the difference-in-difference estimator was expected to remove the effect of all confounders that are not time varying, regardless of whether or not they were observed.

We conducted sensitivity analyses to test the robustness of our findings to time varying, unobserved confounding.28 Firstly, we followed the recommendation of West and colleagues29 and compared the intervention and matched control groups in terms of an outcome that we did not expect to be influenced by the intervention, namely in-hospital mortality over 12 months (although one randomised study has reported effects of telephone health coaching on mortality30). Secondly, we assessed the strength of unobserved confounding that would be required to alter our findings in relation to a dichotomised version of our primary endpoint. Specifically, we simulated a hypothetical unobserved confounder and estimated the odds ratios that would be required between this confounder and intervention status and outcome for our findings to be altered.31 The values thus obtained were compared with estimates of odds ratios for unobserved confounders, based on another study.9

Ethics approval

The National Research Ethics Service confirmed that ethical approval was not required for this work, because it involved retrospective analysis of non-identifiable data for the purposes of service evaluation.

Results

Study populations

Of 3525 patients enrolled during the study period, 3070 (87.1%) were linked uniquely to one individual in the hospital data. Of the 455 records that did not link to hospital data, 86% had missing or incomplete personal linkage data in the service’s operational system. Of 3070 patients linked to hospital data, 2703 (88.0%) patients had a history of inpatient or outpatient hospital use. Matched controls were found for 2698 (99.8%) of these patients. Data from the Birmingham OwnHealth service’s operational system showed that the median duration of enrolment for the included intervention patients was 776 days (25.5 months; fig 1). Telephone calls typically lasted for 15 minutes.

Fig 1 Length of time spent enrolled in the Birmingham OwnHealth service. Solid line=best estimate; shaded area=95% confidence interval

Predictive risk models were fitted for each of the 33 months spanning participant enrolment and were validated in separate samples. The median positive predictive value was 57.0% (range 55.6% to 58.6%) and the median sensitivity was 4.3% (3.9% to 4.7%), calculated using a risk threshold of 0.5. The area under the receiver operating characteristic curve had a median value of 0.698 (range 0.688 to 0.701).
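For readers less familiar with these metrics, the sketch below shows how positive predictive value and sensitivity follow from a set of risk scores and observed outcomes at a given threshold. Treating scores equal to the threshold as flagged is an assumption, and the function name and example data are hypothetical.

```python
def ppv_and_sensitivity(risk_scores, admitted, threshold=0.5):
    """Positive predictive value and sensitivity at a risk threshold.

    risk_scores: predicted probabilities; admitted: observed outcomes (bool).
    """
    flagged = [score >= threshold for score in risk_scores]
    true_pos = sum(f and a for f, a in zip(flagged, admitted))
    false_pos = sum(f and not a for f, a in zip(flagged, admitted))
    false_neg = sum((not f) and a for f, a in zip(flagged, admitted))
    # PPV: share of flagged patients who were admitted.
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    # Sensitivity: share of admitted patients who were flagged.
    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    return ppv, sensitivity


print(ppv_and_sensitivity([0.9, 0.6, 0.2, 0.1], [True, False, True, False]))
```

A threshold as high as 0.5 flags only the patients with the highest predicted risk, so a low sensitivity alongside a moderate positive predictive value is to be expected.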

In relation to diagnoses of health conditions and other baseline variables, intervention patients differed markedly from the general population of the control areas (table 1). For example, intervention patients had 1.2 chronic health conditions on average, compared with 0.3 conditions for the general population. However, after matching, matched controls and intervention patients had similar characteristics. For example, both groups had 1.2 chronic health conditions on average, mean age of 65.5 years, mean predictive risk score of 0.17, similar prevalence of health conditions, and similar previous hospital use (table 1, fig 2). Standardised differences were much lower than the 10% threshold, apart from diagnoses for angina (11.1%) and mental health conditions (15.0%).

Table 1

Differences in demographic, health or healthcare characteristics of study groups before and after matching. Data are proportion (%) of individuals or mean (standard deviation) unless otherwise stated

Fig 2 Differences between 2698 coached patients and 2698 matched controls at the start of intervention. HD=heart disease; CHF=congestive heart failure; COPD=chronic obstructive pulmonary disease; CVD=cerebrovascular disease; PVD=peripheral vascular disease. *Score of 10=most deprived; tenths were defined on the basis of national data for the Index of Multiple Deprivation 2007 (reference 19)

Comparing hospital use and costs

Intervention patients had more emergency admissions in the year after enrolment than in the year before enrolment (0.38 v 0.31 admissions per head). A smaller increase was observed for matched controls (table 2, fig 3). Comparing the two groups, emergency admissions increased by 0.05 per head more among intervention patients than among matched controls (95% confidence interval 0.00 to 0.09, P=0.046; table 3), which was a relative increase of 13.6% (0.2% to 27.1%).

Table 2

Estimated intervention effects for secondary care use

Fig 3 Comparison of rate of emergency admissions

Table 3

Difference-in-difference estimate of intervention effect (per head)

Differences between groups in hospital bed days and elective admissions were not significant (table 3). Outpatient attendances rose by 0.37 attendances per head more among intervention patients than among matched controls (95% confidence interval 0.16 to 0.58, P<0.001), due to a fall among controls. Overall secondary care costs increased by £175 (€203; $268) per head more among intervention patients than among controls (£22 to £328, P=0.025).

Sensitivity analysis for unobserved confounding

Within 12 months of the intervention, 2.3% of patients in both the intervention (n=63) and matched control (n=63) groups died in hospital. Sensitivity analysis simulated a hypothetical unobserved confounding variable and showed that, for the apparent increase in emergency admissions to be reversed, such a variable would need to be strongly associated with both intervention status and outcome, with odds ratios greater than 2.8. By comparison, insulin treatment (which is one variable we did not observe) had an odds ratio of 1.6 with intervention status.9

Discussion

Statement of findings

Telephone health coaching aims to support patients in managing their long term health conditions. The hope is that, by promoting healthy behaviours and by providing a means to identify problems before they become critical, telephone health coaching can help prevent crises that lead to hospital admissions. We compared a large sample of people receiving telephone health coaching in England to a well balanced, retrospectively matched control group using person level data. Rather than see a reduction in hospital activity in the study group, we found that emergency admissions increased at a faster rate among intervention patients than matched controls, as did outpatient attendances and secondary care costs. Therefore, there was no evidence of reductions in hospital admissions, and no savings were detected from which to offset the cost of the intervention.

Strengths and weaknesses

We were able to study a large number of intervention patients with a high rate of data linkage (87%). Imperfect linkage was mainly due to imperfect recording of individual identifiers on the service’s operational system, because most records that did not link had missing or incomplete personal linkage data. On the assumption that recording omissions happened at random, our sample was an unbiased sample from the population receiving the intervention. Although the analysis then focussed on patients with previous hospital use, this is the group in which the scope for savings was highest.

The use of administrative data meant that data were available for a high proportion of patients, and avoided problems of under-reporting by patients about how many services were used.32 However, the quality of data was not directly under our control. Potential problems with administrative data included limited insight into the quality and appropriateness of care,33 and observational intensity bias if coding practices varied between geographic areas.34

We obtained data for more patients than our sample size calculation suggested was needed (2698 v 2035). Therefore, although we originally envisaged that we would only be able to detect differences in emergency admission rates of 15% or higher, the 13.6% increase detected was statistically significant, and was unlikely to be the result of chance (P=0.046). A 13.6% increase in emergency admissions is substantial for the health service, and much more than the general increase in age standardised rates of admission of 2.5% a year.35

The main risk to validity in this observational study was that, although intervention and matched control groups were similar in terms of an established set of predictors of future hospital use, they could have differed in ways that we could not observe (that is, there may have been unobserved confounding). Typically, only a small proportion of eligible patients receive complex interventions out of the hospital.36 Birmingham OwnHealth was a relatively established service, and at least 80% of the local general practices had participating patients. Nevertheless, there are around 9000 patients in the area with uncontrolled diabetes,37 for example, while only around 3000 patients received the intervention in the time period chosen.

We sought to minimise unobserved confounding by careful selection of the pool of potential controls, matching on previous outcomes and difference-in-difference estimation. The eligibility criteria for the service included clinical variables such as HbA1c. These variables were not recorded in our dataset. However, we ensured that the prevalence of health conditions was similar between the two groups, as were variables that are correlated with clinical indicators, such as hospital use.38 Sensitivity analysis showed that, although the increase in emergency admissions could conceivably have been caused by unobserved confounding, it is unlikely that we missed a reduction. To have missed such a reduction, the amount of unobserved confounding would have had to be greater than is realistic for clinical variables. Further, it is reassuring that no differences were observed in in-hospital mortality between the two groups. For example, if disease control had been worse among intervention patients, more deaths might have been expected.

Observational study designs have some advantages over randomised controlled trials. This study looked at a population that was selected to participate in telephone health coaching in routine practice. By contrast, randomised controlled trials may have poor generalisability when the patients in the trial differ from those who would receive the intervention routinely. Such differences might occur because of the selection of healthcare settings or practitioners for a trial, the choice of eligibility criteria for a trial, certain individuals preferring not to participate, or study design.39

This study investigated effects on hospital use and associated costs for people enrolled between 2006 and 2008. The effect of the service might have changed over time, because the eligibility criteria were later broadened to include chronic kidney disease, stroke, and transient ischaemic attack from 2009; and hypertension and older patients at risk from 2010. Although not the focus of this study, Birmingham OwnHealth might have affected the use of primary care or health related quality of life. Because the median duration of enrolment was more than two years (25.5 months), effects could have emerged over longer periods than the 12 months analysed in this study. Other work has found improvements in clinical metrics among participants with poorly controlled diabetes,9 as well as high levels of patient satisfaction.8

Comparison with other studies

Previous studies of the effect of telephone health coaching on service use have generally been encouraging. Four of nine studies identified by a systematic review found evidence of an effect on health service use, although sample sizes were typically small.2 A more recent, large randomised controlled trial found reductions in hospital admissions and expenditures.4 Before the current study, the largest observational study of telephone health coaching included 874 Medicaid members, and found no effect.6 The current study supports that finding in a larger sample drawn from England (n=2698).

One possible explanation for the apparently contradictory nature of the findings from these studies might be subtle differences in the design of the interventions. Aspects of intervention design such as the frequency of telephone calls vary widely between studies,2 although the profile of telephone calls in the current study (usually monthly) was not out of line. The largest randomised controlled trial4 included decision making for preference sensitive conditions, while three of the four effective interventions identified by the systematic review involved telemonitoring of vital signs in addition to health coaching. Telemonitoring could be effective at reducing hospital admissions even when combined with automated motivational messages and symptom questions rather than health coaching.15 This study adds weight to the conclusions suggested by the Medicaid studies6 7 that health coaching is not effective at reducing hospital use by itself. Further, although potentially explained by unobserved confounding, we found evidence that the intervention in Birmingham increased emergency admissions. Previous evaluations of other complex interventions out of hospital have also found indications of increases.14 One possibility is that increases occur as a result of greater observation. Indeed, in other settings, more intense observation and greater use of diagnostic tests have been found to correlate with the number of medical interventions made.40

Discrepancies between the findings of different studies could also be due to study settings, with both the Medicaid studies and the current study relating to a publicly insured population. Targeting the intervention using the outputs from a predictive risk model may increase effectiveness, as could better integration with existing primary and secondary care services. Finally, the evaluation method could affect results. Confounding is possible in observational studies, although our study design attempted to limit this threat to validity as much as possible.

Conclusions

We conclude that Birmingham OwnHealth did not lead to the anticipated levels of reductions in hospital admissions or associated costs. Based on a systematic review and subsequent studies, including the present study, standard telephone health coaching seems unlikely to lead to reductions in hospital use without the addition of other elements such as telemonitoring, shared decision making for preference sensitive conditions, or predictive modelling. More care coordination might also be needed. Unless health coaches have established relationships with other clinical staff, new interventions could prove to be additions to existing patterns of service use, rather than creating efficiencies.

The study serves as a warning that efficacy as demonstrated by randomised controlled trials might not imply effectiveness in routine practice.41 Because administrative datasets are regularly updated, the methods used in the present study may be useful to monitor new services to ensure that benefits are achieved.

What is already known on this topic

  • Telephone health coaching provides support and encouragement to patients to manage long term health conditions

  • It is hoped that hospital admissions will be prevented as a result, creating efficiency gains for healthcare systems

  • However, the current evidence base is unclear; many studies have been small and interventions are heterogeneous

What this study adds

  • This study adds weight to the existing view that health coaching by itself is not effective at reducing hospital use over 12 months

  • Coaching could be coupled with other interventions such as shared decision making or telemonitoring, and the context in which interventions are delivered might also be crucial

  • Efficacy of new services as demonstrated by randomised controlled trials might not imply effectiveness in routine practice

Notes

Cite this as: BMJ 2013;347:f4585

Footnotes

  • We thank staff in Birmingham East and North Primary Care Trust who organised data for scheme participants; John Grayland for his ongoing support; the NHS Information Centre for health and social care for providing invaluable support and acting as a trusted third party for the linkage to national hospital data; and Sally Inglis, Susannah McClean, and Doug Altman for their comments on a previous version of this manuscript. The data analysis for this paper was generated using SAS software, version 9.3 (SAS Institute). SAS and all other SAS Institute product or service names are registered trademarks or trademarks of SAS Institute.

  • Contributors: AS, IB, and MB designed the study. In addition, AS led the analysis and prepared the draft manuscript, ST liaised with the scheme about access to participant data, and IB derived unit costs for HES data. All authors reviewed the manuscript. AS was the study guarantor.

  • Funding: This study was funded by the Department of Health in England, which reviewed the study protocol as part of the application for funding and agreed to publication. The views expressed are those of the authors and not of the Department of Health, and do not constitute any form of assurance, legal opinion, or advice. The organisations at which the authors are based shall have no liability to any third party in respect of the contents of this article.

  • Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: funding from the Department of Health for the submitted work; AS, IB, and MB have a range of current or pending research grants on related topics from funding bodies including the National Institute for Health Research, Technology Strategy Board, and NHS trusts; ST, as an Ernst & Young employee, has declared that Ernst & Young is a consulting firm which may at times undertake consultancy work relevant to the commissioning and provision of community based care.

  • Ethical approval: The ethics and confidentiality committee of the National Information Governance Board confirmed that data linkage (as described in the methods) was possible without explicit patient consent. The National Research Ethics Service confirmed that ethical approval was not required for this work, because it involved retrospective analysis of non-identifiable data for the purposes of service evaluation.

  • Data sharing: No additional data available.

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/.

References
