Evaluation of telephone first approach to demand management in English general practice: observational study

Objective To evaluate a “telephone first” approach, in which all patients wanting to see a general practitioner (GP) are asked to speak to a GP on the phone before being given an appointment for a face to face consultation. Design Time series and cross sectional analysis of routine healthcare data, data from national surveys, and primary survey data. Participants 147 general practices adopting the telephone first approach compared with a 10% random sample of other practices in England. Intervention Management support for workload planning and introduction of the telephone first approach provided by two commercial companies. Main outcome measures Number of consultations, total time consulting (59 telephone first practices, no controls). Patient experience (GP Patient Survey, telephone first practices plus controls). Use and costs of secondary care (hospital episode statistics, telephone first practices plus controls). The main analysis was intention to treat, with sensitivity analyses restricted to practices thought to be closely following the companies’ protocols. Results After the introduction of the telephone first approach, face to face consultations decreased considerably (adjusted change within practices −38%, 95% confidence interval −45% to −29%; P<0.001). An average practice experienced a 12-fold increase in telephone consultations (1204%, 633% to 2290%; P<0.001). The average duration of both telephone and face to face consultations decreased, but there was an overall increase of 8% in the mean time spent consulting by GPs, albeit with large uncertainty on this estimate (95% confidence interval −1% to 17%; P=0.088). These average workload figures mask wide variation between practices, with some practices experiencing a substantial reduction in workload and others a large increase. 
Compared with other English practices in the national GP Patient Survey, in practices using the telephone first approach there was a large (20.0 percentage points, 95% confidence interval 18.2 to 21.9; P<0.001) improvement in length of time to be seen. In contrast, other scores on the GP Patient Survey were slightly more negative. Introduction of the telephone first approach was followed by a small (2.0%) increase in hospital admissions (95% confidence interval 1% to 3%; P=0.006), no initial change in emergency department attendance, but a small (2% per year) decrease in the subsequent rate of rise of emergency department attendance (1% to 3%; P=0.005). There was a small net increase in secondary care costs. Conclusions The telephone first approach shows that many problems in general practice can be dealt with over the phone. The approach does not suit all patients or practices and is not a panacea for meeting demand. There was no evidence to support claims that the approach would, on average, save costs or reduce use of secondary care.


Appendix 1. Further details of analytic methods [posted as supplied by author]
Our evaluation related to 147 practices using the 'telephone first' approach supported by the two commercial companies. Of these, 145 practices were included in the analysis of hospital utilisation and costs because two had closed before the final data collection point that we used. A total of 146 practices were used in the analysis of the GP Patient Survey, data not being available for one practice.

Methods of analysing primary care utilisation data
Data on phone and face to face appointments, and on continuity of care, were extracted from the practices' computer systems by the commercial company up to 28th October 2016 and transferred to the research team as anonymised data sets for analysis. For individual appointment-level data, information was available on: the date and time an appointment was booked; the date and time the appointment occurred; the type of appointment (face-to-face, telephone, home visit or administrative); and who the appointment was with (GP, nurse or other). The continuity of care data included the usual provider continuity score and patient age. To preserve anonymity, the commercial company created the usual provider continuity score [1], calculated for each patient who had two or more appointments in any calendar month as the number of appointments with the GP most frequently seen divided by the total number of appointments in that time period. Patients could not be linked across different months. The company also provided details on the date the 'telephone first' approach was introduced and the current status of the system, i.e. whether the practice was still running the 'telephone first' system per protocol, was running a hybrid system, or had abandoned the approach.
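The usual provider continuity score defined above can be illustrated with a minimal sketch. The function name and the GP identifiers are hypothetical and are not taken from the company's extraction code; the sketch only restates the definition (appointments with the most frequently seen GP divided by all appointments in the month, for patients with two or more appointments).

```python
from collections import Counter


def usual_provider_continuity(gp_ids):
    """Return the usual provider continuity score for one patient-month.

    gp_ids: list of GP identifiers, one per appointment in the calendar month.
    Returns None when the patient had fewer than two appointments, because
    the score is only defined for two or more appointments.
    """
    if len(gp_ids) < 2:
        return None
    # Number of appointments with the GP seen most often in the month
    most_seen_count = Counter(gp_ids).most_common(1)[0][1]
    return most_seen_count / len(gp_ids)


# Hypothetical patient: four appointments in one month, three with the same GP
example_score = usual_provider_continuity(["gp_a", "gp_b", "gp_a", "gp_a"])
print(example_score)
```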
Only practices which launched the 'telephone first' approach before 31st December 2015 were included in the final data set to allow sufficient time for the system to have bedded in (potentially allowing at least 10 months of post intervention data for each practice, though in reality this was often less).
Two types of analysis were carried out for each of the outcomes. The first was a before and after analysis in intervention practices, illustrated by the 'super-posed epoch graphs' in appendix 3.1 where the introduction of the system in each practice is set at time zero. Second, a regression analysis was performed for each outcome looking first for step changes at the time the intervention was introduced and second for a change in the preceding trend (e.g. slowing down of a previous increase). We also model heterogeneity in these changes to examine whether the intervention has a different effect in different practices.

Before and after super-posed epoch analyses
Graphical super-posed epoch analyses were performed for the five main outcomes to illustrate the unadjusted change in outcomes before and after the introduction of the intervention. For all outcomes except continuity of care, the average of the outcome was calculated over 30-day periods relative to the launch date, with the practice launch date defining time zero for individual practices.
For continuity of care the calculation was based on calendar months due to the format of data provided. Given the intervention started at different time points in each practice, different relative time periods include data from different periods of time.
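As an illustration of how appointments might be assigned to 30-day periods relative to each practice's launch date, a minimal sketch follows. The function name is hypothetical, and it assumes that period 0 begins on the launch date itself and that earlier periods are numbered negatively; the source does not state exactly how period boundaries were drawn.

```python
from datetime import date


def relative_period(event_date, launch_date, period_days=30):
    """Index of the 30-day period containing event_date.

    Assumption: period 0 covers the launch date and the following 29 days,
    period -1 covers the 30 days immediately before launch, and so on.
    """
    return (event_date - launch_date).days // period_days


# Hypothetical practice launching on 1 March 2015
launch = date(2015, 3, 1)
for appointment_date in (date(2015, 2, 28), date(2015, 3, 1), date(2015, 3, 31)):
    print(appointment_date, relative_period(appointment_date, launch))
```

Because each practice has its own launch date, the same relative period index covers different calendar dates in different practices.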
The outcomes, definitions and analytic approach taken are summarised in table A.

Table A
Outcome: Continuity of care
Definition: For patients with two or more appointments in a month, the proportion of appointments that are with the GP most frequently seen in that month (score from 0 to 1).
Appointment types: GP appointments; face-to-face appointments only
Analytic approach: Linear mixed effects regression
Unit of analysis: Individual patients in each month

the mean of individual practice means with available data for each time period. These means were plotted as time relative to launch.

Regression analysis
Mixed effects regression analysis was used to investigate within practice changes associated with the intervention. The models used all take a similar form, but different types of model were used depending on the outcome. In brief, the models captured a step change associated with the start of the intervention as well as a change in trend; for example, the intervention may have led to an immediate increase relative to the background trend which was then eroded by a reduction over time. We also model heterogeneity in these changes to examine whether the intervention has a different effect in different practices. A complete case analysis was used with the exception of the total time spent consulting.
Each model contained a categorical variable for month to account for seasonality and a categorical variable for day of week to account for variations across the week. Continuous time relative to launch date (in years) was included to account for underlying secular trends. A random intercept was included to account for differing baselines by practice and the resulting clustering of observations within practices.
A binary variable indicating when the intervention was present captured any instantaneous "step" change in the outcome at the start of the intervention. An interaction between time relative to launch and the intervention indicator captured whether the linear trend changed following intervention. For continuity of care models only, the age of the patient was also included as a third order (cubic) polynomial in addition to the variables described above. Analyses were performed for all types of appointments combined as well as separately by type of appointment (face to face or telephone).
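The three time-related terms described above (secular trend, step change, and change in trend) can be sketched as derived variables for a single observation. This is an illustration only: the function and key names are hypothetical, and the conversion of days to years using 365.25 is an assumption.

```python
from datetime import date


def intervention_terms(obs_date, launch_date):
    """Derive the time-related model terms for one observation.

    time_years:  continuous time relative to launch (negative before launch)
    post:        1 once the intervention is present, else 0 (step change)
    post_x_time: interaction of the two (change in linear trend after launch)
    """
    time_years = (obs_date - launch_date).days / 365.25  # assumed year length
    post = 1 if obs_date >= launch_date else 0
    post_x_time = time_years if post else 0.0
    return {"time_years": time_years, "post": post, "post_x_time": post_x_time}


# Hypothetical practice launching on 1 January 2015
before = intervention_terms(date(2014, 7, 1), date(2015, 1, 1))
after = intervention_terms(date(2015, 7, 1), date(2015, 1, 1))
print(before)
print(after)
```

In a fitted model, the coefficient on `post` would correspond to the instantaneous step and the coefficient on `post_x_time` to the change in slope relative to the underlying trend.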
A large amount of data on length of appointment was missing from the practice data provided by the commercial company (30%), especially so for telephone consultations (52% compared with 18% for face to face consultations). Ignoring appointments with missing durations would have led to a systematic underestimation of the total time spent consulting. To overcome this, appointment length was imputed for those appointments with missing length and then added to the observed lengths to obtain a better estimate of the total time spent consulting for each day in each practice. Because we were imputing individual consultation lengths rather than total time spent consulting per day in a practice, a single imputation was made using a linear regression model similar to that used in the analysis of individual consultation lengths but with fixed effects for practice rather than random ones and stratified by before and after intervention launch. As required to avoid biasing standard errors under single imputation, the imputed standard errors were multiplied by the square root of the ratio of the proportion of cases requiring no imputation.
To better approximate normal distributions, data were log-transformed prior to analysis. For ease of interpretation, exponentiated coefficients are presented as duration ratios (i.e. the relative change in total time spent consulting by a practice).
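A short worked example may help with reading duration ratios. The sketch below uses the overall 8% increase in time spent consulting reported in the abstract; the function names are hypothetical, and the log-scale coefficient is back-calculated here purely for illustration rather than taken from the fitted model.

```python
import math


def duration_ratio(log_scale_coefficient):
    """Exponentiate a coefficient from a model fitted to log-transformed durations."""
    return math.exp(log_scale_coefficient)


def percent_change(ratio):
    """Express a duration ratio as a percentage change."""
    return (ratio - 1.0) * 100.0


# A ratio of 1.08 corresponds to a log-scale coefficient of log(1.08)
coefficient = math.log(1.08)
ratio = duration_ratio(coefficient)
print(round(coefficient, 3), round(ratio, 2), round(percent_change(ratio), 1))
```

A ratio below 1 indicates a reduction; for example, a ratio of 0.99 corresponds to a 1% decrease.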
Our main analysis was done on an intention to treat basis. It includes all practices identified by the commercial company, even when the practices were using a hybrid form of the 'telephone first' approach or had since ceased using it altogether. A sensitivity analysis (appendix 2) was performed restricting the analysis to practices where we believed, on the basis of information provided by the commercial company, that the system was being run consistently with the company's protocols.

Methods of analysing national GP Patient Survey data
responses) following the introduction of a reminder postcard in addition to the two reminder surveys already used. The data period included data from a total of 8,323 practices, but not all practices had data for each wave of the survey; the number of practices contributing data varied between 7,687 and 8,243 in any one wave. This is primarily due to practice closures and openings.
Practices using a 'telephone first' approach between 2011 and 2016 were identified based on information provided by the two commercial companies providing support for implementing this service (hereafter we refer to these practices as intervention practices). The companies provided details on the practices that were using their system, the date the 'telephone first' approach was introduced and the current status of the system i.e. whether still running a full 'telephone first' system, running a hybrid system or if the practices had abandoned the system. Only practices which launched the 'telephone first' approach before 31st December 2015 were included in the analysis to allow sufficient time for the system to have bedded in; those with later launch dates were classified as non-intervention practices. In total 146 intervention practices were identified in the GPPS data set. The number of practices receiving the intervention varied over time (see Table B); only one practice was using the 'telephone first' approach throughout the entire data period. A total of 29,472 surveys were received from intervention practices post launching the 'telephone first' approach.
2. Generally, how easy is it to get through to someone at your GP surgery on the phone?
3. Would you recommend your GP surgery to someone who has just moved to your local area?
4. How often do you see or speak to the GP you prefer?
5. How long after initially contacting the surgery did you actually see or speak to them?
6. How convenient was the appointment you were able to get?
7. Overall, how would you describe your experience of making an appointment?
In each case responses were rescaled between 0 (poor experience) and 100 (good experience). In the case of the first question above on 'GP communication', a composite variable was created taking the mean of all informative responses as long as three or more informative responses were given.
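The rescaling and the composite rule can be sketched as follows. This is an illustration under stated assumptions: a linear rescaling of ordinal response options is assumed (the source says only that responses were rescaled between 0 and 100), responses are assumed to be coded from 1 (poorest) upwards, non-informative responses are represented by None, and the function names are hypothetical.

```python
def rescale(response, n_options):
    """Linearly map an ordinal response coded 1..n_options onto 0..100.

    Assumption: 1 is the poorest experience and n_options the best.
    """
    return 100.0 * (response - 1) / (n_options - 1)


def composite_score(item_scores, min_informative=3):
    """Mean of informative item scores, or None if too few are informative.

    item_scores: rescaled scores, with None for non-informative responses.
    """
    informative = [score for score in item_scores if score is not None]
    if len(informative) < min_informative:
        return None
    return sum(informative) / len(informative)


# Hypothetical respondent: three informative items out of five
print(composite_score([rescale(5, 5), rescale(4, 5), None, rescale(3, 5), None]))
# Hypothetical respondent with only two informative items: no composite
print(composite_score([rescale(5, 5), None, rescale(3, 5)]))
```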
Two types of analysis were carried out for each of the outcomes. The first was a before and after analysis, illustrated by the 'super-posed epoch graphs' where the introduction of the system in each practice is set at time zero. Second, a regression analysis was performed for each outcome looking first for step changes at the time the intervention was introduced and second for a change in the preceding trend (e.g. slowing down of a previous increase). We also model heterogeneity in these changes to examine whether the intervention has a different effect in different practices.

Before and after analysis of GPPS scores (intervention practices only)
Graphs have been produced illustrating changes in patient experience scores in intervention practices before and after the introduction of the intervention (shown in appendix 3). A super-posed epoch analysis is performed whereby the number of survey waves relative to the intervention launch date is calculated for each practice; the survey immediately preceding the intervention launch date is defined as time zero, the survey wave immediately following intervention launch date is time period one, the one following that is time period two, and so on. Given the intervention started at different time points in each practice, different relative time periods include data from different periods of time. Further, not all practices had data for all time periods relative to intervention launch. The analysis was restricted to data two years either side of the launch date.
For each of the seven GPPS experience measures we calculated: (1) the mean score within each intervention practice for each relative time period; and (2) the mean across all intervention practices with available data for each relative time period. These means were plotted as time relative to launch.
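The relative wave numbering and the two-stage averaging can be sketched as below. The function names, wave dates and scores are hypothetical. The sketch assumes each survey wave is represented by a single date (for example the mail-out date) and that a wave dated exactly on the launch date counts as preceding launch; neither detail is specified in the text.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date
from statistics import mean


def relative_wave(wave_index, wave_dates, launch_date):
    """Wave number relative to launch: 0 for the last wave before launch,
    1 for the first wave after launch, and so on.

    wave_dates must be sorted in ascending order.
    """
    last_pre_launch = bisect_right(wave_dates, launch_date) - 1
    return wave_index - last_pre_launch


def mean_of_practice_means(records):
    """records: iterable of (practice_id, relative_period, score).

    Step 1: mean score within each practice for each relative period.
    Step 2: mean of those practice means for each relative period.
    """
    by_practice_period = defaultdict(list)
    for practice, period, score in records:
        by_practice_period[(practice, period)].append(score)
    by_period = defaultdict(list)
    for (practice, period), scores in by_practice_period.items():
        by_period[period].append(mean(scores))
    return {period: mean(values) for period, values in sorted(by_period.items())}


# Hypothetical wave dates and a practice launching between waves 1 and 2
waves = [date(2013, 7, 1), date(2014, 1, 1), date(2014, 7, 1), date(2015, 1, 1)]
print([relative_wave(i, waves, date(2014, 3, 1)) for i in range(len(waves))])

# Hypothetical scores: practice p1 has two respondents in period 0
scores = [("p1", 0, 60.0), ("p1", 0, 80.0), ("p2", 0, 50.0), ("p1", 1, 90.0)]
print(mean_of_practice_means(scores))
```

Averaging practice means, rather than pooling all respondents, gives each practice equal weight regardless of how many surveys it returned.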

Comparison with other practices in England (controlled regression analysis)
The super-posed epoch analysis does not control for what is happening external to the intervention and may also be confounded by a number of factors, for example the timing of intervention launch relative to nationwide trends and other NHS initiatives. In the second stage of the analysis we undertook a controlled regression analysis to estimate a within practice difference-in-difference effect of the intervention. Controls were selected at random from all practices classified as non-intervention practices (i.e. those practices not on the list provided by commercial companies). For computational reasons our analysis was restricted to data from all intervention practices and a random 10% sample of non-intervention practices, with between 778 and 976 control practices providing data at any one time.
For each continuous GPPS experience measure a separate mixed effects linear regression model was used. With the exception of changing the outcome (patient experience measure), the structure of the models is otherwise the same. Patient-level adjustment is made for self-reported gender, age (eight groups) and ethnicity (five groups) taken from GPPS responses and Index of Multiple Deprivation (IMD, a small area measure of socio-economic deprivation based on patient's postcode of residence) using groups defined by national quintiles. There is an indicator variable for survey wave capturing both seasonal differences and longer term trends. Variation in baseline levels between practices, which results in clustering, is accounted for using a random intercept for practice, as well as a random slope for time allowing for differential trends. This modelling allows us to capture the background scores against which the effect of the intervention can be measured. Respondents who had not provided data for one or more of the socio-demographic variables were excluded from the analysis.
The effect of the intervention is captured using two fixed effect variables. The first is an indicator variable which takes the value 1 where the practice was an intervention practice and the survey was mailed after the intervention had started, and zero otherwise. This term is intended to capture a step change in experience measure after starting the 'telephone first' approach. The second variable is equal to the time (in years) between the start of the intervention at a practice and survey mail out. Where the practice is a non-intervention practice, or the survey mail out precedes the intervention, this variable equals zero. This variable is intended to capture a change in trend of patient experience scores post intervention. Further, random slopes are included for both of these variables to allow for heterogeneity in the effect of the intervention between practices. The estimated standard deviation of these random slopes is combined with the fixed effects to calculate a 95% mid-range, i.e. the range of intervention effect we expect to see across most practices.
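One common way to obtain such a mid-range is sketched below. This is an assumption about the calculation, not a statement of the authors' exact method: it takes the practice-specific effects to be normally distributed around the fixed effect, so that the central 95% lie within 1.96 standard deviations. The fixed effect of 20.0 percentage points is the time-to-be-seen estimate from the abstract; the standard deviation of 5.0 is a hypothetical value chosen only for illustration.

```python
def mid_range_95(fixed_effect, random_slope_sd):
    """Central 95% range of practice-specific effects.

    Assumes normally distributed random slopes: fixed effect +/- 1.96 * SD.
    """
    half_width = 1.96 * random_slope_sd
    return (fixed_effect - half_width, fixed_effect + half_width)


# 20.0 is taken from the abstract; the SD of 5.0 is hypothetical
lower, upper = mid_range_95(20.0, 5.0)
print(round(lower, 1), round(upper, 1))
```

Unlike a confidence interval, which describes uncertainty in the average effect, a mid-range of this kind describes how much the effect itself varies from practice to practice.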
Finally, a supplementary analysis was performed to investigate whether the effect of the intervention differed between those in and not in work, by including a main effect for working status (based on GPPS responses) and an interaction between working status and the intervention variable. This analysis was motivated by early findings from the qualitative workstream.

For all observations where a patient's general practice was recorded, patient age, gender, and index of multiple deprivation (IMD) (a measure of small area-level deprivation of the patient's home address) were extracted. Age groups were then defined as 5 year age groups up to the age of 19, then 10 year bands up to the age of 89, with a further group containing all ages 90 and over. IMD was classified into 5 groups with quintile defining cut points, from most to least deprived. The number of attendances/admissions per calendar month in each age group, by gender, and by IMD stratum was calculated within each practice separately for each year and month of data for the following outcomes:
• A&E attendances
We also excluded data from practices in years in which their practice code did not appear in the NHAIS system denominator files, even when attendances/admissions were attributed to patients at the practice. Further, we excluded the data from practices in the year preceding one where the practice did not appear, in order to exclude practices where mergers or closures may have occurred during the year of analysis. One intervention practice was excluded on the basis of these criteria, with a launch date in September but no longer appearing in the Exeter denominator files the following April. This gives a final analysis sample of 841 control practices and 145 intervention practices.
Finally, the data in intervention practices were restricted to between one year before and one year after the practice launch date. This was to focus any changes on those attributable to the 'telephone first' approach rather than any other factors, including potential concerns about the impact of practice mergers and closures. Data for control practices were included for the whole time period to give the best characterisation of background trends and variability.
In some months, practices included in the analysis had no relevant admissions or attendances for A&E, Inpatient or Outpatient data in one or more age, gender and IMD combinations (there are 120 possible combinations). For each of these practice months, where there were no attendances or admissions for a particular stratum, we included the stratum with a count of zero admissions or attendances.
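The figure of 120 combinations follows from 12 age groups (four 5 year groups up to 19, seven 10 year bands up to 89, and 90 and over), two genders and five IMD groups. The sketch below enumerates the strata and fills in zero counts for one practice month. The stratum labels, the coding of gender as two categories, and the function name are assumptions made for illustration.

```python
from itertools import product

# Four 5 year groups, seven 10 year bands, and one open-ended group (labels assumed)
AGE_GROUPS = (
    ["0-4", "5-9", "10-14", "15-19"]
    + [f"{low}-{low + 9}" for low in range(20, 90, 10)]
    + ["90+"]
)
GENDERS = ["female", "male"]      # assumed two-category coding
IMD_QUINTILES = [1, 2, 3, 4, 5]   # 1 = most deprived, 5 = least deprived

ALL_STRATA = list(product(AGE_GROUPS, GENDERS, IMD_QUINTILES))


def zero_filled_counts(observed):
    """Return counts for every stratum in one practice month.

    observed: dict mapping (age_group, gender, imd_quintile) to a count for
    strata with at least one attendance or admission. Missing strata get 0.
    """
    return {stratum: observed.get(stratum, 0) for stratum in ALL_STRATA}


# Hypothetical practice month with attendances in a single stratum
filled = zero_filled_counts({("20-29", "female", 1): 3})
print(len(AGE_GROUPS), len(ALL_STRATA), sum(filled.values()))
```

Including the zero-count strata matters for a count model: dropping them would leave only the non-zero observations and so overstate the underlying rates.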
A regression analysis was performed for each outcome looking first for within practice step changes at the time the intervention was introduced and second for a within practice change in the preceding trend (e.g. slowing down of a previous increase). We also model heterogeneity in these changes to examine whether the intervention has a different effect in different practices.
Our main analysis was conducted on an intention to treat basis. It includes all practices as identified by the commercial companies even when they informed us that the practices were running a hybrid form of the system or were no longer running the system. This was done in order to avoid selection bias, whereby only the successful practices continue with the system in the recommended form. A sensitivity analysis (appendix 3.5) was also performed restricting the analysis to practices where we believed, on the basis of information provided by the commercial companies, that the system was being run consistent with the companies' protocols.
A mixed effects Poisson regression was used adjusting for patient age, gender and time period (96 dummy variables, one for each month from April 2008 to March 2016) as fixed effects, a random intercept to account for different baseline rates between practices, and a random slope for a continuous time variable allowing for different trends between practices. The effect of the intervention is captured using two fixed effect variables. The first is an indicator variable which takes the value 1 where the practice was an intervention practice and the HES data related to a time period after the intervention had started, and zero otherwise. This term is intended to capture a step change in attendances after starting the 'telephone first' approach. The second variable is equal to the time (in years) since the intervention started at a practice. Where the practice is a non-intervention practice, or the data relate to a period preceding the intervention, this variable equals zero. This variable is intended to capture a change in trend of attendances or admissions post intervention. We additionally included a second time period variable with time pre- and post-intervention in intervention practices only, and 0 in control practices, to allow for a differential trend in admissions or attendances in control compared with intervention practices overall, unrelated to the intervention start date. Further, a random slope is included for the step change variable to allow for heterogeneity in the effect of the intervention between practices. The estimated standard deviation of this random slope is combined with the fixed effects to calculate a 95% mid-range, i.e. the range of intervention effect we expect to see across most practices.
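The time-related variables in this model can be sketched as follows. The first function simply enumerates the months from April 2008 to March 2016 to confirm the count of 96. The second derives the three intervention-related variables for one practice month; the function and key names are hypothetical, and measuring the additional trend variable as time relative to launch is one reading of the description above rather than a confirmed detail.

```python
def month_labels(start=(2008, 4), end=(2016, 3)):
    """List 'YYYY-MM' labels for every month from start to end inclusive."""
    labels = []
    year, month = start
    while (year, month) <= end:
        labels.append(f"{year}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return labels


def hes_intervention_terms(years_since_launch, is_intervention_practice):
    """Derive intervention variables for one practice month.

    years_since_launch: time relative to the practice launch date in years
    (negative before launch); ignored for control practices.
    """
    if not is_intervention_practice:
        return {"step": 0, "time_since_start": 0.0, "group_trend": 0.0}
    step = 1 if years_since_launch >= 0 else 0
    return {
        "step": step,                                        # step change
        "time_since_start": years_since_launch if step else 0.0,  # trend change
        "group_trend": years_since_launch,                   # assumed definition
    }


print(len(month_labels()))
print(hes_intervention_terms(-0.5, True))
print(hes_intervention_terms(0.5, True))
print(hes_intervention_terms(None, False))
```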

Methods of economic analysis
The methods of economic analysis are summarised in table C.
Costs reported as non-staff fixed and variable costs related to the implementation and routine operation of telephone first, and costs of any staffing changes.
Overall change in minutes spent consulting multiplied by cost per minute.
Analysis of quantity and cost of drugs monthly from 2008 to 2016 in control practices and +/-1 year of commencement in intervention practices. Sub-analyses for ambulatory care-sensitive conditions & antibiotic prescriptions. 2016 price year. Multilevel mixed-effects log-linear regression adjusting for time period and intervention as fixed effects and random intercept and slope.

Secondary Care Costs
HES data: all intervention practices and 829 randomly sampled control practices for 2008 to 2016. Application of unit costs to calculate cost of A&E and outpatient attendances and inpatient admissions (all, elective and emergency, and for ambulatory care-sensitive conditions).
The telephone survey on the costs of the intervention was based on a previous survey undertaken by members of the research team [4]. The survey was administered in the twenty intervention practices taking part in the survey of patient experience. The questionnaire was completed in a single telephone interview with the practice manager or a colleague nominated by the practice manager, with