Re: Updated NICE guidance on chronic fatigue syndrome
Turner-Stokes and Wade, in their commentary addressing the NICE guidance on chronic fatigue syndrome, have correctly identified the highly problematic nature of the guidance. They have, however, with the exception of one sentence, attributed the problem to the wrong source. The authors note, correctly, that “the new draft is based on qualitative evidence provided by a small number of service users”. This, together with a disastrous misapplication of GRADE methodology, rather than the GRADE methodology itself as the authors contend, is the source of the problem.
As the commentary authors note, the guidelines have chosen to downplay the evidence supporting graded exercise therapy. An appropriate application of GRADE would have come to a very different conclusion, as did a recent Cochrane review of exercise in chronic fatigue syndrome using GRADE methodology: “Exercise therapy probably reduces fatigue at end of treatment (SMD −0.66, 95% CI −1.01 to −0.31; 7 studies, 840 participants; moderate‐certainty evidence)”.
The commentators’ inference that GRADE will consistently undervalue the quality of evidence of complex interventions is further belied by a systematic survey of Cochrane reviews. The survey found that 7/16 (44%) reviews of complex interventions rated the quality of evidence of primary outcomes as moderate – sufficient in GRADE methodology to justify strong favorable recommendations.
Why did the commentators go so wrong in impugning GRADE as the source of the problem in the NICE guidance? They make the case that five aspects of GRADE will lead to nihilistic conclusions regarding quality of evidence supporting complex interventions. They are wrong on all five counts, as we will briefly illustrate.
Blinding of patients may be possible in complex interventions, and lack of blinding should not necessarily lead to rating down the certainty of evidence. Interventions such as surgery, graded exercise, or cognitive behavioural therapy (CBT) do not allow blinding of clinicians providing treatment. Blinding of patients may, however, be possible: for example, using a sham surgery control or an attention control in a trial of CBT.
At the same time, trial results may or may not be appreciably affected by blinding. A recent meta-epidemiological study reported no difference in estimated treatment effect between trials with versus those without blinding of patients, healthcare providers, or outcome assessors. It is therefore reasonable not to rate down the certainty of evidence for risk of bias because of failure to blind as the sole problem.
Small trials are at risk of imprecision. Complex interventions focusing on outcomes measured as continuous variables may, however, have sufficient sample sizes for robust conclusions. Note the results of the Cochrane review on chronic fatigue syndrome: expressed in standard deviation units, they provide a point estimate of a moderate to large effect (standardized mean difference 0.66), and the boundary of the confidence interval closest to no effect (0.31) excludes the threshold, an SMD of 0.2, suggested as a small effect.
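The imprecision check described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the Cochrane review or of GRADE guidance: the names `Estimate`, `ci_excludes_small_effect`, and `approx_standard_error` are hypothetical, the 0.2 threshold is the small-effect value mentioned in the text, and the standard error is back-calculated assuming a symmetric 95% interval under a normal approximation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """A pooled effect in SMD units with its 95% confidence interval."""
    point: float
    ci_low: float
    ci_high: float


def ci_excludes_small_effect(est: Estimate, threshold: float = 0.2) -> bool:
    """Return True when the whole interval lies on one side of zero and
    its bound nearest to zero is still larger in magnitude than threshold."""
    if est.ci_low > 0 or est.ci_high < 0:
        nearest_to_zero = min(abs(est.ci_low), abs(est.ci_high))
        return nearest_to_zero > threshold
    return False  # interval includes zero, so no effect cannot be excluded


def approx_standard_error(est: Estimate, z: float = 1.96) -> float:
    """Back-calculate the standard error from a symmetric 95% interval
    (assumption: normal approximation)."""
    return (est.ci_high - est.ci_low) / (2 * z)


# Fatigue at end of treatment, as quoted from the Cochrane review.
fatigue = Estimate(point=-0.66, ci_low=-1.01, ci_high=-0.31)

print(ci_excludes_small_effect(fatigue))
print(round(approx_standard_error(fatigue), 3))
```

With the quoted figures, the bound nearest to zero has magnitude 0.31, which exceeds 0.2, so the check returns true; an interval such as −0.50 to −0.10 would fail it.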
Regarding directness, changes to diagnostic criteria for chronic fatigue syndrome, fibromyalgia, irritable bowel syndrome, or other complex conditions that lack pathognomonic findings may or may not affect results. Systematic review authors can explore the issue in subgroup analyses focused on diagnostic criteria. The Cochrane review carried out such a subgroup analysis and found little or no difference between subgroups based on different diagnostic criteria. It is inappropriate to rate down for indirectness without clear evidence of a difference in effects between trials using different criteria.
Serious inconsistency, if it exists, warrants exploration to understand the sources. Inconsistency may not, however, be a problem. For instance, the Cochrane review of chronic fatigue syndrome did not rate down results for fatigue at the end of therapy for inconsistency.
GRADE does not rate down the certainty of evidence on the basis that subjective outcomes are reported directly by patients. Indeed, GRADE provides detailed guidance on presenting and interpreting results of patient-reported outcomes such as fatigue, pain, physical functioning, and quality of life. This guidance reflects GRADE’s emphasis on what is most important to patients. In the case of chronic fatigue syndrome, the Cochrane review finding of important improvement in fatigue with exercise will be crucial for patients in choosing their treatment.
The NICE evidence review associated with the guideline does not provide a GRADE summary of findings table for fatigue related to exercise interventions. The review authors tell us why: “The use of CBT and GET (Graded Exercise Therapy) has been strongly criticised by people with ME/CFS (myalgic encephalomyelitis/chronic fatigue syndrome) on the grounds that their use is based on a flawed model of causation involving abnormal beliefs and behaviours, and deconditioning. People with ME/CFS have reported worsening of symptoms with GET.” The authors are telling us they reject the randomized trial evidence focusing on patient-important outcomes on the basis of theoretical arguments and anecdote. This, of course, has nothing to do with GRADE; indeed, it is the antithesis of the GRADE approach.
The commentary authors make the case for individually tailored programs combining a range of physical, cognitive and psychological approaches. They may well be right, but they then go on to state: “current NICE methods would discount any randomised controlled trials using this approach, citing risk of bias, inconsistency, imprecision, and subjective outcomes”. There are two serious problems with this statement. First, this is not the reason NICE rejected the evidence: as the passage quoted above shows, NICE rejected it because of theoretical objections and anecdotes from patients. Second, had NICE rejected the evidence on the basis suggested, it would have exhibited, as we have noted, a profound misunderstanding of GRADE.
The commentary authors end by suggesting that NICE abandon GRADE for a system of rating quality of evidence that is “independent of trial design”, and they offer one such approach, developed by the first author of the commentary and published in 2006. An approach that ignores study design would ignore advances in evidence evaluation over the last 70 years. The suggestion is, as we have pointed out, based on a seriously misguided understanding of how GRADE should be applied to complex interventions.
1. Turner-Stokes L, Wade DT. Updated NICE guidance on chronic fatigue syndrome. BMJ. 2020; 371: m4774.
2. Larun L, Brurberg KG, Odgaard-Jensen J, Price JR. Exercise therapy for chronic fatigue syndrome. Cochrane Database Syst Rev. 2019; 10(10): CD003200.
3. Movsisyan A, Melendez-Torres GJ, Montgomery P. Outcomes in systematic reviews of complex interventions never reached "high" GRADE ratings when compared with those of simple interventions. J Clin Epidemiol. 2016; 78: 22-33.
4. Karanicolas PJ, Farrokhyar F, Bhandari M. Practical tips for surgical research: blinding: who, what, when, why, how? Can J Surg. 2010; 53(5): 345-8.
5. Moustgaard H, Clayton GL, Jones HE, Boutron I, Jørgensen L, et al. Impact of blinding on estimated treatment effects in randomised clinical trials: meta-epidemiological study. BMJ. 2020; 368: l6802.
6. Schandelmaier S, Briel M, Varadhan R, Schmid CH, Devasenapathy N, et al. Development of the Instrument to assess the Credibility of Effect Modification Analyses (ICEMAN) in randomized controlled trials and meta-analyses. CMAJ. 2020; 192(32): E901-E906.
7. Guyatt GH, Thorlund K, Oxman AD, Walter SD, Patrick D, Furukawa TA, Johnston BC, Karanicolas P, Akl EA, Vist G, Kunz R, Brozek J, Kupper LL, Martin SL, Meerpohl JJ, Alonso-Coello P, Christensen R, Schunemann HJ. GRADE guidelines: 13. Preparing summary of findings tables and evidence profiles-continuous outcomes. J Clin Epidemiol. 2013; 66(2): 173-83.
8. Turner-Stokes L, Harding R, Sergeant J, Lupton C, McPherson K. Generating the evidence base for the National Service Framework for Long Term Conditions: a new research typology. Clin Med (Lond). 2006; 6: 91-7.
Competing interests: GHG is co-chair of the GRADE Working Group; SAF, EAA, PD, MD, MR, JJM, RM, MR and POV are members of the GRADE Working Group. This Rapid Response is not an official communication from the GRADE Working Group.