- Jeremy Fairbank, consultant orthopaedic surgeon1,
- Helen Frost, research fellow2,
- James Wilson-MacDonald, consultant orthopaedic surgeon1,
- Ly-Mee Yu, statistician3,
- Karen Barker, director of physiotherapy research1,
- Rory Collins, professor4
- for the Spine Stabilisation Trial Group
- 1Nuffield Orthopaedic Centre, Oxford OX3 7LD
- 2University of Warwick, Division of Health in the Community, Coventry CV4 7AL
- 3Centre for Statistics in Medicine, Oxford OX3 7LF
- 4Clinical Trial Service Unit and Epidemiological Studies Unit, Radcliffe Infirmary, Oxford OX2 6HE
- Correspondence to: J Fairbank
- Accepted 24 March 2005
Objectives To assess the clinical effectiveness of surgical stabilisation (spinal fusion) compared with intensive rehabilitation for patients with chronic low back pain.
Design Multicentre randomised controlled trial.
Setting 15 secondary care orthopaedic and rehabilitation centres across the United Kingdom.
Participants 349 participants aged 18-55 with chronic low back pain of at least one year's duration who were considered candidates for spinal fusion.
Intervention Lumbar spine fusion or an intensive rehabilitation programme based on principles of cognitive behaviour therapy.
Main outcome measure The primary outcomes were the Oswestry disability index and the shuttle walking test measured at baseline and two years after randomisation. The SF-36 instrument was used as a secondary outcome measure.
Results 176 participants were assigned to surgery and 173 to rehabilitation. 284 (81%) provided follow-up data at 24 months. The mean Oswestry disability index changed favourably from 46.5 (SD 14.6) to 34.0 (SD 21.1) in the surgery group and from 44.8 (SD 14.8) to 36.1 (SD 20.6) in the rehabilitation group. The estimated mean difference between the groups was –4.1 (95% confidence interval –8.1 to –0.1, P = 0.045) in favour of surgery. No significant differences between the treatment groups were observed in the shuttle walking test or any of the other outcome measures.
Conclusions Both groups reported reductions in disability during two years of follow-up, possibly unrelated to the interventions. The statistical difference between treatment groups in one of the two primary outcome measures was marginal and only just reached the predefined minimal clinical difference, and the potential risk and additional cost of surgery also need to be considered. No clear evidence emerged that primary spinal fusion surgery was any more beneficial than intensive rehabilitation.
Chronic low back pain is a common cause of distress and results in considerable personal and public financial consequences. Management is mostly non-operative, but spinal fusion has been used for nearly 90 years. Spinal fusion rates vary between and within countries.1 In England about 1000 lumbar fusions are performed per year.2 An almost direct relation exists between the numbers of operations performed each year and of orthopaedic and neurosurgeons per head of population.3 In the United States, spinal fusions for “degenerative changes” rose sharply from around 11 000 operations per year in 1996 to 37 000/year in 2001 (more than a threefold rise).4 Both the rationale and the techniques used to fuse the spine have been questioned.5 Multidisciplinary rehabilitation programmes that focus on physical, psychological, social, and occupational factors have been advocated for patients with chronic pain of the low back.6–8
This trial was conceived in response to the identification of weak evidence for surgery as a priority by the NHS standing group on health technology in 1994.9 10 The pragmatic trial was designed to compare two treatment strategies (spinal stabilisation surgery or intensive rehabilitation) for patients considered by surgeons to be candidates for surgical stabilisation of the lumbar spine.
This multicentre, randomised trial was set in 15 hospitals in the United Kingdom. Only consultant surgeons with training and expertise in performing spinal fusions participated. We approached an additional 39 centres where either the surgeon was unwilling to recruit patients or implementation of the intensive rehabilitation programme was impossible.
We used the uncertainty of outcome principle to define our entry criteria and therefore depended on the current practice of many experienced spine surgeons and their patients.11 Patients who were candidates for surgical stabilisation of the spine were eligible if the clinician and patient were uncertain which of the study treatment strategies was best. Patients had to be aged between 18 and 55, with more than a 12 month history of chronic low back pain (with or without referred pain) and irrespective of whether they had had previous root decompression or discectomy.
Patients were ineligible if the surgeon considered that any medical or other reasons made one of the trial interventions unsuitable. These included infection or other comorbidities (inflammatory disease, tumours, fractures), psychiatric disease, inability or unwillingness to complete the trial questionnaires, or pregnancy. Patients who had had previous stabilisation surgery of the spine were also excluded.
The aim was to determine whether surgical stabilisation of the spine (by fusion or flexible stabilisation) was more or less effective at achieving worthwhile relief of symptoms over a two year period than an intensive rehabilitation programme based on principles of cognitive behaviour therapy.
A trial research therapist in each centre assessed outcomes at baseline and at 6, 12, and 24 months after randomisation. If a patient was unable to attend a follow-up appointment we mailed the questionnaire. We approached non-responders by phone, through their family doctor, and via national databases.
The two primary measures at 24 months included a back pain specific questionnaire and a standardised walking test. The Oswestry low back pain disability index is scored from 0% (no disability) to 100% (totally disabled or bedridden) and designed to assess limitations of various activities of daily living.12 13 The shuttle walking test is a standardised, progressive, maximal test of walking speed and endurance.14–16
The short form 36 general health questionnaire (SF-36) includes 35 items summarised in two measures related to physical and mental health. Each scale ranges from 0 (worst health state) to 100 (best health state). The summary measures are transformed to give a population mean of 50 (SD 10). The SF-36 is recommended as an outcome assessment for spinal disorders because it has strong psychometric support and extensive normative data.
Psychological assessment—We used the distress and risk assessment method (DRAM), which includes the modified Zung depression index and somatic perception questionnaire, to assess anxiety and depression.17
Complications—We recorded the intraoperative use of anaesthetic agents, implants, and radiological investigations; complications of surgery and any adverse effects of rehabilitation; postoperative complications, implant failure and repeat surgery; and personal items and devices purchased by the patient because of lower back pain. Work status was monitored. We recorded “obvious pseudoarthrosis” only where it was clear to the treating surgeon that fusion had failed and that this was a problem to the patient.
We used the Oswestry disability index to determine the sample size. The trial was designed to be able to detect a difference in mean score between the intervention groups of as little as 4 points.12 13 We estimated that 133 subjects would be required in each group to detect such a difference at the α = 0.05 level with 80% power. We initially planned to recruit at least this number of patients in each of three separate clinical groups to allow reliable subgroup analysis, but most of the patients were recruited in one clinical category.
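The sample size statement above can be checked with the usual normal approximation for comparing two means. The sketch below is illustrative only: the report does not state the standard deviation assumed in the calculation, so the value of 11.6 points is an assumption, backed out from the reported 133 per group, and the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Patients per arm for a two-sided comparison of two means
    (normal approximation, equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def implied_sd(n, delta, alpha=0.05, power=0.80):
    """Standard deviation that would make n per arm just sufficient
    to detect a difference of delta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return delta * sqrt(n / 2) / (z_alpha + z_beta)


# Assumption: the SD is NOT reported in the paper; it is inferred here.
sd_backed_out = implied_sd(133, 4)
n_check = n_per_group(4, 11.6)
print(round(sd_backed_out, 1), n_check)
```

Under these assumptions a 4 point difference with 133 patients per arm corresponds to an assumed standard deviation of roughly 11 to 12 points, somewhat smaller than the baseline standard deviations of about 14.6 to 14.8 reported for the two groups.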
Spinal stabilisation surgery—The particular technique used for spinal fusion was left to the discretion of the operating surgeon. This allowed choice of the most appropriate surgical approach, implant (if any), interbody cages, and bone graft material for that patient. A small number of surgeons used flexible stabilisation of the spine (the Graf or Global technique). This was recorded for each patient before randomisation.
Intensive rehabilitation programme—Each centre was modelled on a daily outpatient programme of education and exercise running on five days per week for three weeks continuously. Further details of the programme are reported elsewhere.15 Most centres offered 75 hours of intervention (range 60-110 hours), with one day of follow-up sessions at one, three, six, or 12 months after treatment. The rehabilitation programmes were led by physiotherapists but included clinical psychologists in all but one centre, as well as medical support. The daily exercises were individually tailored and paced to increase repetitions and duration, aiming to build on the participants' baseline ability. They included stretching of major muscle groups, spinal flexibility exercises, general muscle strengthening, spine stabilisation exercises, and cardiovascular endurance exercise using any mode of aerobic exercise (treadmill walking, step-ups, cycling, rowing). All but one centre included daily sessions of hydrotherapy. We used principles of cognitive behaviour therapy to identify and overcome fears and unhelpful beliefs that many patients develop when in pain.
Treatment allocation and recruitment
Surgeons approached patients who were candidates for spinal fusion. Each centre employed a trial research therapist to organise the trial locally, recruit patients, book treatment appointments, and carry out assessments. Patients were given verbal, written, and videotape (OMI, Oxford) explanations of the background and nature of the trial. The trial research therapists obtained written consent and carried out baseline assessments before randomisation.
Randomisation was generated centrally by computer program, with minimisation for various potential confounding factors: age, smoking, litigation, Oswestry score, clinical classification, and planned use of the Graf procedure.
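Minimisation allocates each new patient to whichever arm leaves the groups best balanced across the chosen factors. The following is a minimal sketch of the general technique, not the trial's actual program: the factor names and levels are hypothetical, ties are broken at random, and no additional random element is applied, none of which is specified in the report.

```python
import random
from collections import defaultdict

ARMS = ("surgery", "rehabilitation")


class Minimiser:
    """Simple minimisation: pick the arm with the smallest marginal total."""

    def __init__(self, factors, seed=None):
        self.factors = factors
        self.rng = random.Random(seed)
        # counts[arm][(factor, level)] = patients already allocated to that arm
        self.counts = {arm: defaultdict(int) for arm in ARMS}

    def allocate(self, patient):
        # For each arm, count earlier patients sharing this patient's levels.
        totals = {
            arm: sum(self.counts[arm][(f, patient[f])] for f in self.factors)
            for arm in ARMS
        }
        lowest = min(totals.values())
        candidates = [arm for arm in ARMS if totals[arm] == lowest]
        arm = self.rng.choice(candidates)  # random choice only when tied
        for f in self.factors:
            self.counts[arm][(f, patient[f])] += 1
        return arm


# Hypothetical factor coding, loosely following the factors listed above.
factors = ["age_band", "smoker", "litigation", "oswestry_band",
           "clinical_group", "graf_planned"]
profile = {"age_band": "35-44", "smoker": True, "litigation": False,
           "oswestry_band": "40-59", "clinical_group": "A",
           "graf_planned": False}

minimiser = Minimiser(factors, seed=1)
first_arm = minimiser.allocate(profile)
second_arm = minimiser.allocate(profile)  # same profile goes to the other arm
print(first_arm, second_arm)
```

Two consecutive patients with identical profiles end up in different arms, which is the balancing behaviour minimisation is meant to provide.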
We carried out an intention to treat analysis. We used analysis of covariance (ANCOVA) to analyse quantitative outcomes at 24 months, with corresponding baseline values and treatment group as covariates.
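As an illustration of the ANCOVA model described above, the sketch below fits follow-up score = b0 + b1 × baseline + b2 × group by ordinary least squares and reports b2, the baseline-adjusted treatment effect. The data are invented for the example (they are not trial data), and the function name is hypothetical; a real analysis would use a statistics package and report standard errors as well.

```python
def ancova_effect(baseline, followup, treated):
    """Least squares fit of followup = b0 + b1*baseline + b2*treated.
    Returns b2, the treatment effect adjusted for baseline."""
    rows = [(1.0, float(b), float(t)) for b, t in zip(baseline, treated)]
    k = 3
    # Augmented matrix for the normal equations (X'X | X'y).
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, followup))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return beta[2]


# Invented, noise-free data constructed so the true effect is -4 points.
baseline = [40, 50, 60, 45, 55, 35]
treated = [1, 0, 1, 0, 1, 0]  # 1 = surgery, 0 = rehabilitation (hypothetical)
followup = [5 + 0.7 * b - 4 * t for b, t in zip(baseline, treated)]

effect = ancova_effect(baseline, followup, treated)
print(round(effect, 3))
```

Because the example data contain no noise, the fitted coefficient recovers the constructed difference; with real data the estimate would carry a confidence interval.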
We used multiple imputation to handle missing data. To impute the missing data we constructed multiple regression models that included variables potentially related to missingness as well as variables correlated with the outcome. Analyses in Stata (StataCorp, College Station, Texas, USA)18 and with PROC MI in SAS (SAS Institute, Cary, NC, USA) gave similar answers; only the Stata results are presented.
A total of 349 patients were randomised between June 1996 and February 2002 from 15 centres in the UK (176 allocated to surgery and 173 to rehabilitation). The figure shows the progression through the trial. Table 1 shows the baseline characteristics of patients who entered the trial.
Compliance with treatment and follow-up
Table 2 shows data on participants' compliance with their treatment and follow-up. Forty eight (28%) patients randomised to rehabilitation had surgery by two years. Seven (4%) patients randomised to surgery had rehabilitation instead of surgery.
Intraoperative complications occurred in 19 surgical cases (table 3). Eleven patients required further operations on their lumbar spine during the two year follow-up. We did not identify any specific complications of the rehabilitation programmes.
Oswestry scores improved slightly more in the surgery group than in the rehabilitation group (mean difference –4.1, 95% confidence interval –8.1 to –0.1, P = 0.045). After imputation for missing follow-up data the mean difference was –4.5 (–8.2 to –0.8, P = 0.02) (tables 4 and 5). No significant heterogeneity in the effect on the Oswestry score was observed between the predefined groups of patients (table 6). No other difference between groups in any of the other outcomes at 24 months reached significance, even when we used imputed values (tables 4 and 5).
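The reported confidence intervals and P values can be cross-checked against each other. The sketch below assumes a normal approximation, whereas the trial's P values came from ANCOVA, so small discrepancies would not be surprising; the function name is hypothetical.

```python
from statistics import NormalDist


def p_from_ci(estimate, lower, upper, level=0.95):
    """Two-sided P value implied by an estimate and its confidence
    interval, assuming a normally distributed estimator."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (upper - lower) / (2 * z_crit)  # CI half-width divided by z
    z = estimate / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


p_complete = p_from_ci(-4.1, -8.1, -0.1)  # available follow-up data
p_imputed = p_from_ci(-4.5, -8.2, -0.8)   # after multiple imputation
print(f"{p_complete:.3f} {p_imputed:.3f}")
```

Both values fall close to the reported P = 0.045 and P = 0.02, so the intervals and P values are mutually consistent under this approximation.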
Patients with low back pain who are considered by surgeons to be candidates for spinal fusion may obtain similar benefits from an intensive rehabilitation programme as they do from surgery. Our large randomised controlled trial of spinal fusion surgery compared with intensive rehabilitation was limited by recruitment difficulties, some crossover between intervention groups, and incomplete follow-up at 24 months, but the results should help clinicians and service providers make decisions about the management of chronic low back pain. Both groups improved over time, but this effect may reflect a natural resolution of chronic low back pain or regression to the mean. The Oswestry scores improved significantly more in patients allocated to surgery than in those allocated to rehabilitation. Although this difference just exceeds the 4 points specified in the sample size calculation, clinically this difference is small considering the potential risks and additional costs of surgery. Analyses adjusting for baseline variations or per protocol analysis do not change this interpretation (data not shown). Overall, since the other primary outcome of the shuttle walking test and the other measures did not differ (even after imputation for missing values), the small difference observed in Oswestry scores should be interpreted cautiously. Furthermore, the confidence intervals can be used to rule out differences in Oswestry scores of more than 10 points in favour of surgery and of more than 2 points in favour of rehabilitation. Consequently, they narrow substantially the range of plausible estimates for any benefit of surgery.
Comparison with related research
A Cochrane review in 1999 found a complete absence of randomised controlled evidence for spinal fusion.5 Three randomised controlled trials have been reported subsequently. Möller and Hedlund reported a trial in isthmic spondylolisthesis, with 77 patients randomised to different forms of surgery and 34 patients randomised to an exercise programme.19 The patients allocated to surgery reported greater benefits at two years in terms of Oswestry scores compared with those allocated to exercise, but instrumentation and bone grafting was not found to produce an advantage over bone grafting alone. A Swedish trial randomised 222 patients to three surgical groups of equal size and 72 patients to physiotherapy.20 They reported decreased pain and disability in the surgical group compared with physiotherapy at two years but no difference in outcomes between the different surgical techniques. Little effect of physiotherapy was apparent at two years, although this may have been because of the type or intensity of treatment. Routine physiotherapy and intensive rehabilitation are not the same and should not be considered as such. Brox et al reported no differences between groups in a small trial of 64 patients comparing instrumented posterior fusion with rehabilitation followed to 12 months.21 Improvements in outcomes were comparable with those in both arms of the present trial and in the surgical arm of the Swedish trial.
Evidence is moderate to strong that multidisciplinary rehabilitation including general exercise programmes of muscle strengthening, flexibility training, and cardiovascular endurance along with a cognitive behaviour approach improves function and reduces pain and work loss in patients with chronic pain of the low back compared with usual care or non-multidisciplinary treatment.8 22 23 This type of treatment was difficult to implement in the trial and, although recommended in recent European guidelines,24 is not routinely available in the NHS.
Strengths and limitations of the study
The uncertainty principle had initially been expected to aid trial accrual by bringing the process of informed consent closer to standard medical practice. However, recruitment was slow and numbers enrolled smaller than planned. Eligibility was based on the uncertainty of outcome principle, but uncertainty does not come easily to surgeons when patients are demanding clear direction and advice. Factors influencing recruitment will be presented elsewhere. This pragmatic trial reflects current practice across the UK of experienced spine surgeons selecting patients for fusion. Surgeons may argue that we excluded the best candidates for surgery through “certainty” of outcome, but this certainty varied between surgeons. Evidence from the Swedish trial25 shows that patients with low neuroticism, narrow discs, and a short time off work do best with surgery.
Surgeons were allowed their own choice of operation to improve the chance of clinical success. The Swedish trial showed no difference between three surgical techniques of fusion.24 These results call into question what lumbar fusion is actually doing to patients with chronic back pain. Elucidation of this question was not the objective of this study. The results are highly relevant to spinal fusion surgery, as well as the new techniques of flexible stabilisation and disc replacement that are being applied to this group of patients.
Loss to follow-up
Loss to follow-up at 24 months (19%) limits the internal validity of the trial. We used multiple imputation as a sensitivity analysis to tackle potential bias resulting from the poor response rate. Overall estimates of the treatment effect were very similar with all methods of statistical analysis.
The pre-randomisation outcomes were scored by the trial research therapists and later checked by computer. All subsequent outcomes were scored centrally. We were not able to blind the trial research therapists to patient allocation after the baseline assessment.
Limitation of outcomes
The available outcome measures are blunt instruments for assessing a complex condition. The minimum clinically important change in the Oswestry scores has been estimated by different observers as being somewhere between 4 and 17.26 Debate continues among back pain experts over the question of what represents a clinically important change. Functional measures are difficult to apply in a multicentre setting, and although the use of muscle measurement techniques may be useful, it was not possible to use them in this trial because of financial limitations. Walking capacity was chosen as it is simple and cheap to measure and often a limitation for people with chronic low back pain.
Compliance with treatment protocol
The 48 (28%) patients who were randomised to rehabilitation and then had additional surgery by two years should be considered as an additional outcome of the trial and taken into account in the interpretation of the results. Although some patients and surgeons were clearly not satisfied with the results of rehabilitation, many more seem to have benefited and avoided surgical intervention.
Nearly three quarters of those patients allocated to rehabilitation avoided surgery by two years. Rehabilitation including a cognitive behaviour approach is not routinely or widely available to patients with chronic pain of the low back, and this trial implies that it should be. Rehabilitation programmes require finance, space, and training, but above all they need the strong support of all clinicians involved in the care of these patients.
What is already known on this topic
Limited evidence shows that patients with severe chronic low back pain treated with spinal stabilisation surgery have a better outcome in terms of pain and disability than with traditional conservative management
The results of spinal stabilisation surgery seem to be similar whatever surgical technique is used
Intensive multidisciplinary rehabilitation including a biopsychosocial approach improves pain and function in severe chronic low back pain compared with usual care or traditional conservative treatment
What this study adds
No clear evidence emerged that primary spinal fusion surgery was more beneficial than intensive rehabilitation using principles of cognitive behaviour therapy
Evidence exists to support intensive rehabilitation with cognitive behaviour principles as an alternative to spinal fusion surgery in the management of chronic low back pain
We thank the patients, who permitted a difficult decision to be made for them; referees, physiotherapists, and surgeons, inside and outside the trial (www.ndos.ox.ac.uk/SST), who helped develop the protocol; the Medical Research Council for supporting the study; and NHS R&D (especially Richard Lilford) for supporting and promoting the study.
Additional contributors and members of the MRC steering committee are on www.bmj.com
Contributors JF was responsible for the overall study design, the organisation of the study, recruiting and operating on patients in the study, data analysis, and writing the first draft of this report. HF was responsible for overall study design, the organisation of the study, the design and implementation of the rehabilitation programme, data analysis, and writing the first draft of the report. JWM was responsible for overall study design, the organisation of the study, recruiting and operating on patients in the study, data analysis, and editing the report. RC was responsible for overall study design, statistical advice and data analysis, and presented analyses to the data monitoring committee. LMY was responsible for statistical analysis. JF and HF are guarantors. KB was responsible for recruiting patients and data analysis. Of the contributors who are not listed as authors, Douglas Altman was responsible for statistical analysis and sat on the data monitoring committee. Alastair Gray was responsible for study design, the organisation of the study, and economic analysis. Nicolas Maniadakis was responsible for economic data collection and analysis. Kate Johnston, Helen Campbell, and Oliver Rivero were responsible for economic analysis. Patricia Carver was responsible for data collection and analysis. L Morgan was responsible for data collection and database design. Kate Stevens, Victoria Erlanger, and Rebecca Bale collected and entered data. Peter Smith developed and maintained the database.
Funding The Medical Research Council supported the trial financially and was represented on the steering committee. The NHS (326) or private patient insurance (23) funded the treatment of patients. MRC grant number G94431172.
Competing interests JF and JWM have received funding from Synthes for a spinal fellow.
Ethical approval The trial was approved by a multicentre research ethics committee (twice; references 98/5/14 for original and 03/05/034 for long term follow-up) and 15 local research ethics committees.