In drug intervention studies, there is an agreed set of quality standards. These include the use of an appropriate placebo control, random allocation to treatment, and blinding of both patients and researchers. Also increasingly important is preregistration of outcome measures, so that authors do not selectively report only the most favourable outcome measures.
Studies of behavioural interventions, including the current PACE study and its predecessor, 1,2 have not consistently been evaluated by these standards, which has sometimes led to exaggerated claims as to their effectiveness. Here, I comment on this issue from the perspective of an experimental psychologist (I leave it to others to consider theoretical issues, such as validity of the authors’ underlying illness model).
The original PACE study reported that one year after a 24-week graded exercise therapy programme (GET), 61% of patients improved on a combined self-rated measure of fatigue and physical function. 2 CBT yielded a similar improvement rate of 59%. On the face of it, that looks impressive. However, the control condition - specialist medical care - also yielded an impressive 45% improvement. This in itself casts doubt on the validity of the self-report measures used to assess improvement. But more importantly, this high level of baseline improvement means that conclusions rest on the differences in improvement rates between the various conditions, some 14-16 percentage points. The design of the no-treatment control therefore becomes crucial.
Drug intervention studies include a placebo condition, which controls for spurious factors known to affect outcomes, such as the expectation for improvement, and the patient’s degree of investment. The baseline condition used in the PACE study (specialist medical care) is not adequate to control for either factor: patients in this condition would be unlikely to have the same expectations of improvement as those in the intervention groups (nor, arguably, would those in the pacing intervention), nor would they be likely to invest as much effort into the “treatment”, or to develop the same kind of rapport with the therapist.
Recent evidence suggests that self-report measures are much more vulnerable to the placebo effect than more objective measures. 3,4 Given that treatment was unblinded, and not all factors influencing placebo responding were adequately controlled for, objective outcomes – from blinded raters – are essential in order to overcome these criticisms. However, the study reports only one such outcome: the average distance walked in six minutes increased after all treatments but reliably more so after GET. A number of other objective measures planned in the original protocol were simply never reported. 5 All other positive outcomes are from self-report.
This leads us to a third major problem, the highly selective reporting of outcome measures. Selective reporting is a major criticism that has been raised against current psychological research standards. 6 The problem is quite simple: our criterion for statistical significance (less than 5% probability of obtaining a significant effect by chance alone) means that up to 1 in every 20 statistical results could very well be artefactual. The more outcomes one samples, the higher the chances of at least one effect being significant by chance alone. By measuring multiple outcomes, and selectively reporting only favourable ones, the researcher is concealing from the reader this heightened probability of a spurious result.
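The arithmetic behind this point can be made concrete with a minimal sketch. It assumes, purely for illustration, that the outcome measures are statistically independent and that none of them reflects a real effect; the function name and the outcome counts below are hypothetical and are not taken from the PACE protocol. Under those assumptions, the probability of at least one spuriously significant result among k outcomes is 1 - (1 - 0.05)^k.

```python
def familywise_error_rate(num_outcomes: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among independent tests.

    Assumes every test is run at the same threshold `alpha` and that
    no true effect exists for any outcome (illustrative assumption).
    """
    return 1.0 - (1.0 - alpha) ** num_outcomes


if __name__ == "__main__":
    # Hypothetical numbers of outcome measures, chosen only to show the trend.
    for k in (1, 5, 10, 20):
        print(f"{k:2d} outcomes: {familywise_error_rate(k):.2f}")
```

With 10 such outcomes the chance of at least one spurious "significant" result is roughly 40%, and with 20 it is roughly 64%. Real outcome measures are usually correlated, so these figures are only a rough guide to how quickly the risk grows.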
The most recent output from this group, the mediation study by Chalder and colleagues, is therefore difficult to interpret, since it claims as its starting premise that the one-year outcomes of the original study demonstrate substantive evidence of genuine treatment effects. 1 Given the problems noted above, I would argue that this requirement has not been met.
The new paper’s most valuable contribution is that it reports one new objective measure: the fitness test (heart rate after a step exercise test). However, GET patients – the group predicted to show the greatest improvement – did not differ from the non-treatment control on this measure. This result, and its lack of prominence in the original paper, leads to further concerns about the selective reporting in the study and the heavy reliance on self-report.
One might object to these criticisms, arguing that in behavioural interventions adequate control conditions are difficult to design, and objective measures difficult to obtain. Both objections can be easily countered. Recently, Lynch, Laws and McKenna reviewed a selection of studies of CBT interventions for major depressive disorder and other severe psychiatric conditions. 7 They identified studies that: a) used control conditions meeting the requirements set out above (e.g., supportive therapy, psycho-education); and b) reported objective outcome measures, from blinded observers. Alarmingly, in a meta-analysis of these studies, the treatment effect for CBT was much weaker and/or absent. When standards are raised to the same level as those required in drug interventions, outcomes look much less impressive.
Some might argue that the risk of harm is lower for behavioural than for drug interventions, so there is no need to adhere to the same rigorous standards. However, this assumption needs to be challenged: in the case of ME/CFS, there is evidence that GET may in fact result in adverse effects for some patients. 8 Also, drawing unwarranted conclusions from behavioural intervention studies can do great harm in less direct ways. For example, in the case of ME/CFS, if policy makers and practitioners believe that there is already a valid “treatment” out there for this condition, they may be less motivated to examine other, more valid treatment options. Even more seriously, we have seen in the British media this week that the results of such studies may be used to support a view of ME/CFS that minimises its severity, exaggerates its responsiveness to treatment, and places responsibility for the illness back on the patient.
In the psychology literature, there has recently been much discussion of some of the more general weaknesses in psychological methodology. 6 If medical and psychiatric journals wish to continue publishing behavioural studies, they need to make themselves more aware of this literature. Behavioural research should not have a “get out of jail free” card when it comes to scientific rigour.
1. Chalder T, Goldsmith KA, White PD, Sharpe M, Pickles AR. Rehabilitative therapies for chronic fatigue syndrome: a secondary mediation analysis of the PACE trial. Lancet Psychiatry 14 Jan 2015, doi:10.1016/S2215-0366(14)00069-8.
2. White PD, Goldsmith KA, Johnson AL, Potts L, Walwyn R, DeCesare JC, et al, for the PACE trial management group. Comparison of adaptive pacing therapy, cognitive behaviour therapy, graded exercise therapy, and specialist medical care for chronic fatigue syndrome (PACE): a randomised trial. Lancet 2011; 377: 823-36.
3. Spiegel D, Kraemer H and Carlson RW. Is the placebo powerless? N Engl J Med 2001; 345: 1276-79.
4. Hróbjartsson A and Gøtzsche PC. Placebo interventions for all clinical conditions. Cochrane Database Syst Rev 2010; 1.
5. White PD, Sharpe MC, Chalder T, DeCesare JC, Walwyn R; PACE trial group. Protocol for the PACE trial: a randomised controlled trial of adaptive pacing, cognitive behaviour therapy, and graded exercise, as supplements to standardised specialist medical care versus standardised specialist medical care alone for patients with the chronic fatigue syndrome/myalgic encephalomyelitis or encephalopathy. BMC Neurol 2007; 7:6.
6. Simmons JP, Nelson LD and Simonsohn U. False-positive Psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science 2011; 22: 1359-1366.
7. Lynch D, Laws KR and McKenna PJ. Cognitive behavioural therapy for major psychiatric disorder: does it really work? A meta-analytical review of well-controlled trials. Psychological Medicine 2010; 40: 9-24.
8. Twisk FN and Maes M. A review on cognitive behavioral therapy (CBT) and graded exercise therapy (GET) in myalgic encephalomyelitis (ME) / chronic fatigue syndrome (CFS): CBT/GET is not only ineffective and not evidence-based, but also potentially harmful for many patients with ME/CFS. Neuro Endocrinol Lett 2009; 30: 284-99.
Competing interests: No competing interests