Exercise versus usual care after non-reconstructive breast cancer surgery (UK PROSPER): multicentre randomised controlled trial and economic evaluation

Abstract
Objective To evaluate whether a structured exercise programme improved functional and health related quality of life outcomes compared with usual care for women at high risk of upper limb disability after breast cancer surgery.
Design Multicentre, pragmatic, superiority, randomised controlled trial with economic evaluation.
Setting 17 UK National Health Service cancer centres.
Participants 392 women undergoing breast cancer surgery, at risk of postoperative upper limb morbidity, randomised (1:1) to usual care with structured exercise (n=196) or usual care alone (n=196).
Interventions Usual care (information leaflets) only or usual care plus a physiotherapy led exercise programme, incorporating stretching, strengthening, physical activity, and behavioural change techniques to support adherence to exercise, introduced at 7-10 days postoperatively, with two further appointments at one and three months.
Main outcome measures Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire at 12 months, analysed by intention to treat. Secondary outcomes included DASH subscales, pain, complications, health related quality of life, and resource use, from a health and personal social services perspective.
Results Between 26 January 2016 and 31 July 2017, 951 patients were screened and 392 (mean age 58.1 years) were randomly allocated, with 382 (97%) eligible for intention to treat analysis. 181 (95%) of 191 participants allocated to exercise attended at least one appointment. Upper limb function improved after exercise compared with usual care (mean DASH 16.3 (SD 17.6) for exercise (n=132); 23.7 (22.9) usual care (n=138); adjusted mean difference 7.81, 95% confidence interval 3.17 to 12.44; P=0.001). Secondary outcomes favoured exercise over usual care, with lower pain intensity at 12 months (adjusted mean difference on numerical rating scale −0.68, −1.23 to −0.12; P=0.02) and fewer arm disability symptoms at 12 months (adjusted mean difference on Functional Assessment of Cancer Therapy-Breast+4 (FACT-B+4) −2.02, −3.11 to −0.93; P=0.001). No increase in complications, lymphoedema, or adverse events was noted in participants allocated to exercise. Exercise accrued lower costs per patient (on average −£387 (€457; $533); 95% confidence interval −£2491 to £1718; 2015 pricing) and was cost effective compared with usual care.
Conclusions The PROSPER exercise programme was clinically effective and cost effective and reduced upper limb disability one year after breast cancer treatment in patients at risk of treatment related postoperative complications.
Trial registration ISRCTN Registry ISRCTN35358984.

Disabilities of the arm, shoulder and hand (DASH) subscales by treatment arm over time. *Adjusted for age, baseline DASH subscale score, breast surgery, axillary surgery, radiotherapy and chemotherapy. Higher scores indicate greater disability. AL = activity limitation; PR = participation restriction; I = impairment. Strenuous activity = strenuous sport or recreational activity in previous week.

Economic Evaluation Appendix
Overview
A within-trial economic evaluation was conducted to estimate the cost-effectiveness of the PROSPER exercise programme compared to usual care after breast cancer surgery. The primary health economic analysis took the form of a cost-utility analysis, expressed in terms of cost per quality-adjusted life year (QALY) gained and incremental net monetary benefit. The analysis adopted the intention-to-treat principle. In line with NICE guidance 2, the analysis was based on an NHS and Personal Social Services (PSS) perspective. The price year adopted for the analysis was 2015, when the trial intervention materials were developed. The health economic analysis used a 12-month time horizon and consequently no discounting of costs or outcomes was required. Multiple imputation was used to address missing data. Hierarchical linear models were used to analyse the single cost and QALY endpoints, whilst a hierarchical net-benefit regression framework was used to jointly examine costs and consequences. Uncertainty around cost-effectiveness was characterised using net-benefit plots and cost-effectiveness acceptability curves (CEACs), in addition to multiple sensitivity analyses.

Measurement of resource use, costs and outcomes
Intervention costs were captured using a combination of methods including case-report forms (CRFs), an adapted client-service receipt inventory (CSRI) at six months and 12 months follow up, and intervention delivery data collected by physiotherapists and the trial team. The costs within the analysis were divided into four components: direct intervention costs; broader health care/PSS costs; wider costs; and set up costs.
The primary analysis adopted an NHS and PSS perspective and was concerned with the costs of delivering the intervention within an NHS setting. Set up costs and wider costs were considered within sensitivity analysis.

Direct intervention costs
All participants received usual care, which involved a five-minute contact with a specialist breast cancer nurse who provided usual care leaflets (BCC6 3 and BCC151 4). In addition to leaflets, the intervention group received a physiotherapist-led exercise programme. Resource use was captured prospectively alongside the trial; the collection of resource use components is summarised in Table S6.
Broader health care and PSS resource use (Table S7) was captured primarily through the CRF at six and 12 months. Data on healthcare use were collected for: inpatient care, outpatient care, community health care, medication, and equipment provided. Hospital Episode Statistics (HES) data were obtained for 242 patients who had reached 12 months from randomisation by the end of the 2017-2018 financial year, for use in secondary analysis. The resource use data collected within the CRFs were the primary source of cost data within the trial. Other wider costs considered within secondary analyses included out of pocket costs, privately purchased equipment, and private health care costs. A further analysis included set up costs, which covered resource use associated with training physiotherapists.

Measurement of outcomes
In line with NICE guidelines 2, quality adjusted life years (QALYs) were the primary outcome for the economic evaluation.

Estimating QALYs
To calculate QALYs, it was necessary to obtain health state values for trial participants over multiple time points. We used the EQ-5D-5L, a five-dimension measure of HRQoL recommended by NICE. 2,5 There are multiple value sets that allow the calculation of a utility value for each state generated by the EQ-5D-5L measure. 6 At the time of writing, NICE preferred the use of the Van Hout et al. algorithm 7, hence this value set was used to calculate utility values. Health states were measured prospectively using the EQ-5D-5L at three time points: baseline, six and 12 months. Health state values as measured by the EQ-5D-5L were combined with time to calculate QALYs by calculating the area under the curve using the trapezium rule. 8 This method assumes that health state values change linearly between measurement time points. Participants who died during follow up were given an EQ-5D-5L score of zero at subsequent follow ups beyond the date of death.
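The area-under-the-curve calculation described above can be sketched as follows. This is a minimal illustration, not the trial's analysis code: the function name and the participant values are hypothetical, and time is expressed in years so that the result is in QALYs.

```python
def qalys_trapezium(utilities, times_years):
    """Area under the utility curve using the trapezium rule.

    utilities   : utility values at each assessment (e.g. EQ-5D-5L index scores)
    times_years : assessment times in years, in increasing order
    """
    if len(utilities) != len(times_years) or len(utilities) < 2:
        raise ValueError("need at least two matching utility/time points")
    total = 0.0
    for i in range(1, len(utilities)):
        width = times_years[i] - times_years[i - 1]
        # Linear interpolation between adjacent assessments
        total += width * (utilities[i - 1] + utilities[i]) / 2
    return total


# Hypothetical participant assessed at baseline, six months and 12 months
qalys_alive = qalys_trapezium([0.70, 0.65, 0.75], [0.0, 0.5, 1.0])

# Hypothetical participant who died before the six month follow up:
# utility is set to zero at subsequent assessments
qalys_died = qalys_trapezium([0.60, 0.0, 0.0], [0.0, 0.5, 1.0])
```

With three assessments over one year, the six month value receives twice the weight of the baseline and 12 month values, which is why a baseline imbalance in utility carries through to the QALY estimate and is adjusted for in the analysis.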

Cost-effectiveness analysis methods
Missing data and multiple imputation
Because the cost-effectiveness analysis combines multiple cost components and multiple EQ-5D-5L scores across time points, multiple imputation (MI) was necessary to avoid the pitfalls associated with complete case analysis when missing data are substantial. Missing data were assumed to be missing at random. To maximise the use of available data, MI was conducted at the component level, i.e. for each healthcare cost variable and for the EQ-5D-5L at each time point. Costs and EQ-5D-5L scores were imputed jointly using chained equations and predictive mean matching; the imputation model included age, ethnicity, marital status, employment status and recruiting site as covariates. Fifteen participants lacking co-variate data were dropped from the MI analysis. Given that approximately 30%-35% of data were missing for each cost component, 35 imputations were performed to produce 35 complete data sets. MI procedures were conducted within Stata 16. 9
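The appendix does not spell out how estimates were combined across the 35 imputed data sets. The conventional approach is Rubin's rules, sketched below under that assumption. The function name and the input values are hypothetical, and the sketch covers only the pooling step, not the chained-equations imputation itself.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets using Rubin's rules.

    estimates : point estimate from each imputed data set
    variances : squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # variance of estimates across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical incremental cost estimates from three imputed data sets
pooled_est, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty due to missing data into the pooled standard error.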

Analyses of resource use, cost and QALYs
Resource use between trial arms was examined using standard statistical methods: descriptively, and using t-tests for continuous variables and chi-squared tests for categorical variables. These results are extensive and are reported elsewhere. Regression models using the MI data were used to examine the impact of the intervention on the single cost and QALY end points. Multi-level linear models, which account for the hierarchical data structure by including random effect parameters, were used to estimate the single economic end points. Following recommendations, we adjusted for baseline differences between the two arms in the analysis of QALYs by including the baseline EQ-5D-5L score as a co-variate. 2

Estimating cost-effectiveness
To examine cost-effectiveness, it was necessary to jointly assess the incremental costs and incremental effects. The net-benefit regression framework was chosen to assess cost-effectiveness as it has several strengths: i) it transforms the cost/QALY data from a ratio into a continuous variable, allowing for easier manipulation whilst often normalising the data; ii) by combining costs and outcomes, it can seamlessly account for correlation between the two end points; iii) it allows easy control for baseline and co-variate imbalances 10; iv) it can correct for clustering using a multi-level framework; v) it effectively deals with uncertainty around the decision maker's willingness to pay (WTP) for the health outcome of interest; vi) it facilitates the generation of cost-effectiveness acceptability curves (CEACs) to present decision uncertainty; and vii) it is relatively straightforward to implement within Stata using MI data.
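The transformation in point i) can be illustrated with a short sketch. Each participant's net monetary benefit is the WTP multiplied by their QALYs, minus their cost; the incremental net benefit is then the difference between arms. The example below is a deliberate simplification: it takes an unadjusted difference in means, whereas the trial used a multi-level regression with co-variates on MI data. All names and values are hypothetical.

```python
from statistics import mean


def net_monetary_benefit(qalys, cost, wtp):
    """Net monetary benefit for one participant at a given willingness to pay."""
    return wtp * qalys - cost


def incremental_nmb(intervention, control, wtp):
    """Unadjusted difference in mean net monetary benefit between arms.

    intervention, control : lists of (qalys, cost) tuples, one per participant
    """
    nmb_int = [net_monetary_benefit(q, c, wtp) for q, c in intervention]
    nmb_ctl = [net_monetary_benefit(q, c, wtp) for q, c in control]
    return mean(nmb_int) - mean(nmb_ctl)


# Hypothetical participants: (QALYs over 12 months, total cost in GBP)
exercise_arm = [(0.70, 1200.0), (0.68, 900.0)]
usual_care_arm = [(0.66, 1500.0), (0.64, 1100.0)]

inmb_20k = incremental_nmb(exercise_arm, usual_care_arm, wtp=20_000)
inmb_zero = incremental_nmb(exercise_arm, usual_care_arm, wtp=0)
```

At a WTP of zero the incremental net benefit reduces to the cost saving alone, which is why a net-benefit plot that is positive at zero indicates lower costs in the intervention arm.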

Characterising uncertainty
CEACs are a graphical representation of the probability that an intervention is cost-effective at different levels of WTP. NICE recommends that WTP thresholds of £20,000 and £30,000 per QALY are included within the CEAC when assessing uncertainty. 2 CEACs were therefore created for a range of WTP thresholds, including those specified by NICE, to characterise uncertainty within the cost-effectiveness estimates.
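One simple way to see how a CEAC is traced out is to treat the incremental net benefit at each WTP as approximately normally distributed and read off the probability that it exceeds zero. The sketch below makes that assumption explicitly; it is not the procedure used in the trial, which derived the curve from the net-benefit regression on MI data. The function name, the correlation parameter and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_cost_effective(wtp, d_qaly, se_qaly, d_cost, se_cost, corr=0.0):
    """Probability that incremental net benefit is positive at a given WTP.

    Assumes the incremental QALY and cost estimates are jointly normal
    with correlation `corr` (an assumption of this sketch).
    """
    inmb = wtp * d_qaly - d_cost
    # Var(wtp * dQ - dC) = wtp^2 Var(dQ) + Var(dC) - 2 wtp Cov(dQ, dC)
    var = (wtp * se_qaly) ** 2 + se_cost ** 2 - 2 * wtp * corr * se_qaly * se_cost
    if var <= 0:
        raise ValueError("variance of incremental net benefit must be positive")
    return NormalDist().cdf(inmb / sqrt(var))


# Hypothetical estimates: 0.02 extra QALYs (SE 0.01) at 100 GBP extra cost (SE 500)
thresholds = [0, 5_000, 20_000, 30_000]
ceac = [prob_cost_effective(w, 0.02, 0.01, 100.0, 500.0) for w in thresholds]
```

In this hypothetical case the intervention costs more but yields more QALYs, so the probability of cost-effectiveness is below 50% at a WTP of zero and rises as the WTP increases.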

Sensitivity analyses
Several sensitivity analyses were conducted to examine the uncertainty surrounding trial results:
i) Complete case analysis. This analysis considered only complete cases.
ii) Cost per DASH point. Should the intervention arm be associated with higher costs than the usual care arm, the cost per DASH point was to be estimated.
iii) Costing from a societal perspective. In this sensitivity analysis, wider societal costs were included within the cost-effectiveness analysis: NHS health costs, private costs and over the counter (OTC) medication.
iv) Incorporating training within the evaluation. Sites were trained both centrally and at hospital sites; this analysis used a conservative approach whereby it was assumed that each site was trained separately, with up to two trial staff undertaking training for four hours at each hospital site.
v) Excluding high cost cancer healthcare usage. This analysis limited costs to intervention costs, community care costs, outpatient physiotherapy, outpatient pain clinics, outpatient complementary therapies/exercise facilities, and analgesics.
vi) Using HES cost data instead of CSRI data for hospital costs. This sensitivity analysis re-ran the primary analysis for the 242 participants with 12 months of complete data post-randomisation prior to the HES cut-off date (31/3/2018), and used HES data instead of CSRI inpatient and outpatient data to cost hospital care. As these hospital data are obtained centrally, we assumed these data were complete. Inpatient spells during the study and other hospital-based care costs were estimated by linking hospital episode data with Health Resource Groups, using the Reference Cost Grouper software 11, and then costed using NHS reference costs. 12

Analysis of cost
Direct intervention costs were relatively small (Table S8). The mean cost of physiotherapy appointments for those in the intervention arm was £103. Both trial arms received information leaflets alongside a 5-minute discharge appointment; however, these contributed very little to cost. For the intervention arm, there were other costs, such as the personalised exercise planner and manual, manuals for the physiotherapist, and Therabands; these again were relatively small (£26). The total direct incremental cost associated with the intervention compared to the usual care arm was £129. Breast cancer related treatment formed most of the hospital costs (both inpatient and outpatient) for both arms, with non-cancer related costs being relatively minor. Medication costs were high and variable in both arms, reflecting the often high cost of cancer therapeutics. When comparing total costs between the two arms (Table S9), the mean cost per participant was £387 lower in the intervention arm than in the control arm (95% CI: -£2491, £1717), representing a cost saving. As shown by the wide confidence interval, there was substantial uncertainty surrounding this figure, driven by the high costs related to breast cancer treatment.

Analysis of QALYs
Health utility at each time point is reported in Table S10. At baseline, there was a very slight imbalance between the two arms, with the usual care arm having a mean utility score of 0.666 compared to 0.683 in the intervention arm. The period from baseline to six months was associated with a small decrease in health utility in both arms (control = 0.648, intervention = 0.673). Between six months and 12 months the utility scores diverged, with the intervention arm increasing to 0.705; in contrast, the usual care arm deteriorated to 0.633. Thus by 12 months the intervention arm reported improved utility scores compared to baseline, whilst the usual care arm reported worse scores. The utility scores which use the MI data tell a similar story (Figure S1). Imputed utility scores at all time points are near identical to the complete case data; however, uncertainty surrounding those estimates is reduced, as reflected by the slightly narrower confidence intervals.
Figure S1: EQ-5D-5L Trajectory
The analysis of QALYs is shown in Table S11. Using the MI data and controlling for baseline imbalance, the intervention arm accrued 0.029 (95% CI 0.001, 0.056) more QALYs than the usual care arm. This was a statistically significant increase (p=0.04).

Cost-effectiveness analysis
From the analysis of costs and the analysis of QALYs it was evident that the intervention arm dominated the usual care arm. The joint cost-effectiveness results combining costs and QALYs within a net-benefit framework are shown in Figure S2 and Figure S3. As seen in Figure S2, net-benefit was positive at all levels of WTP including zero; this reflects the dominance of the intervention over the usual care arm. As shown by the lower 95% confidence limit for net-benefit being below zero, there is uncertainty surrounding the results. This aligns with the cost analysis, which showed a large degree of uncertainty surrounding the incremental cost estimate. To examine the levels of uncertainty around the results, a CEAC was created (Figure S3). Even at a WTP of £0 there is still a 61% chance that the intervention is more cost-effective than the usual care arm. The CEAC is upward sloping due to the positive co-efficient associated with incremental QALYs in the intervention arm. That is, as WTP for health benefits increases, so does the probability that the intervention is cost-effective. At the NICE specified WTP threshold values of £20,000 per QALY and £30,000 per QALY there is a 78% and 84% probability, respectively, that the intervention is the more cost-effective of the two arms. Given that EQ-5D-5L utility scores were diverging at the final time point, it is reasonable to conclude that this probability would increase if the time horizon were extended beyond the trial, as the intervention arm would be expected to continue accruing more QALYs than the usual care arm.
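As a rough arithmetic check on the direction of these results, the incremental net monetary benefit can be recomputed from the point estimates reported in the analyses of cost and QALYs (an incremental cost of -£387 and 0.029 incremental QALYs). This is a back-of-envelope sketch only: the function name is illustrative, and the values it produces are not the regression estimates plotted in Figure S2.

```python
def incremental_nmb_from_estimates(wtp, delta_qalys, delta_cost):
    """Incremental net monetary benefit: WTP x incremental QALYs - incremental cost."""
    return wtp * delta_qalys - delta_cost


DELTA_COST = -387.0   # GBP, intervention minus usual care (reported point estimate)
DELTA_QALYS = 0.029   # intervention minus usual care (reported point estimate)

for wtp in (0, 20_000, 30_000):
    inmb = incremental_nmb_from_estimates(wtp, DELTA_QALYS, DELTA_COST)
    print(f"WTP GBP {wtp:>6}: incremental net benefit of about GBP {inmb:,.0f}")
```

Because the intervention is both cheaper and more effective on these point estimates, the incremental net benefit is positive at a WTP of zero (roughly £387) and grows with the WTP (roughly £967 at £20,000 and £1,257 at £30,000 per QALY), which is consistent with the dominance described above.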

Sensitivity analyses
Summary results for the primary analysis and all sensitivity analyses are shown in Table S12.
Sensitivity analysis 1 considers the cost-effectiveness results using the complete case data. The complete-case analysis provides supporting evidence for cost-effectiveness, with a 65% chance that the intervention is the more cost-effective option at a WTP of £20,000 per QALY, rising to 68% at a WTP of £30,000 per QALY.
Sensitivity analysis 2 considered cost per DASH point. As reported in the main manuscript, the exercise intervention was associated with improved DASH score and lower costs. Given this, the intervention dominated the usual care arm, and so a cost per DASH point was deemed unnecessary due to the problems associated with interpreting a negative ICER.
Sensitivity analysis 3 considers the impact of broadening the costing perspective from NHS and PSS to a societal perspective. This included private healthcare costs, private equipment purchases, over the counter medication, and other costs. Income losses were omitted due to the lack of data for this variable. This analysis further strengthens the case for cost-effectiveness, with the intervention continuing to dominate the usual care arm. When costed from this perspective, the intervention has an 83% chance of being more cost-effective than the usual care arm at a threshold of £20,000 per QALY.
Sensitivity analysis 4 included training costs. Across the 17 sites, a total of 312 hours of training time were accounted for, including the time of the trainers. The inclusion of these costs led to an increase in costs per intervention participant of £55.54. This had very little impact on the results of the cost-effectiveness analysis. In this analysis the probability of the intervention being cost-effective at a threshold of £20,000 per QALY falls marginally to 76.8%.
Sensitivity analysis 5 considers a narrower costing perspective limited to those costs that are more plausibly affected by shoulder problems, such as upper limb stiffness and pain, rather than by cancer more generally. This led to much lower cost estimates, with the mean costs falling to £732 (95% CI: £649, £815) per person. In this analysis the intervention arm is associated with an increased cost of £106 (95% CI: -£49, £262), with the probability of the intervention being cost-effective at a threshold of £20,000 per QALY increasing to 97%. This reflects the low costs and reduced uncertainty around cost estimates within this analysis.
Sensitivity analysis 6, the final sensitivity analysis, used HES data to calculate hospital costs instead of CSRI data. Given the timescales involved in obtaining HES data within the trial timeline, it was only possible to obtain full 12-month data for 242 (63%) of the recruited participants. Within this analysis, costs were slightly higher within the intervention arm (+£166), with a great deal of uncertainty surrounding the estimate (95% CI: -£3849, £4181). This large increase in uncertainty is reflected in the CEAC, with a 62% chance that the intervention is more cost-effective at a threshold of £30,000 per QALY.

Discussion
This economic evaluation examined the costs and outcomes associated with the PROSPER exercise intervention in comparison to usual care. A multi-level net-benefit regression framework was used to assess the cost-effectiveness of the intervention and to estimate the uncertainty surrounding the results. The results showed that the exercise intervention was cost-effective compared to usual care, with the exercise intervention within the primary analysis having a 78% chance of being the more cost-effective option at the NICE cost-effectiveness threshold of £20,000 per QALY. The results were robust to a range of sensitivity analyses. Given that EQ-5D-5L utility scores were diverging at the final time point, it is reasonable to assume that these estimates are conservative. This is reinforced by sensitivity analysis 5, which found there was a 97% chance of cost-effectiveness when excluding likely non-attributable costs, e.g. high cost cancer treatments and inpatient surgery, which drove much of the uncertainty around the cost estimates in the other analyses.
There were several limitations to this economic analysis. Whilst the proportion of missing EQ-5D-5L data was relatively low, a substantial proportion of health care usage data was missing, as is common within trials. To address this, multiple imputation was used to make the most of available data whilst retaining uncertainty; this requires an assumption that data are missing at random, which may not be the case. Although the cost-effectiveness estimates were favourable, there was large uncertainty surrounding incremental cost estimates. This was due to the high cost and variable nature of breast cancer treatment, whereby certain cancer treatments unrelated to the rehabilitation of the shoulder post-surgery account for most of the costs. Consequently, we included a sensitivity analysis that included only those costs that might plausibly be related to shoulder pain and discomfort. In this analysis, there was much less uncertainty around cost estimates, which resulted in a very high probability of the intervention being cost-effective (97%). The sensitivity analysis that used HES data was only performed on a subset of the data and therefore is not fully representative of the sample. Furthermore, it only captured hospitalisations and therefore will have missed those costs (savings) most likely to be attributable to the intervention.