Re: Pop a million happy pills? Antidepressants, nuance, and the media
What is the threshold for a clinical minimally important drug effect?
Kate Adlington1 draws attention to the uncritical promotion of antidepressant drugs by various professionals and news outlets following the publication of a recent meta-analysis by Cipriani and colleagues.2 She also points out that the small effect sizes were largely neglected and that most news reports did not differentiate mild depression from severe depression. The crucial issues are thus whether the effects of antidepressants are clinically relevant and whether these drugs also work in mild-to-moderate depression. A leading psychiatric journal recently published a meta-analysis, co-authored by Dr. Cipriani, in which the authors claim that their findings demonstrate efficacy of newer-generation antidepressants in both mild-to-moderate and severe depression.3 This article was accompanied by an editorial entitled “The alleged lack of efficacy of antidepressants in non-severe depression: a myth debunked”.4 Evidently, the journal and the authors want to disseminate the message that antidepressants are an effective treatment for mild-to-moderate depression and that practice guidelines should incorporate these findings. However, neither the original paper by Furukawa et al.3 nor the editorial by Eriksson and Hieronymus4 in fact provides evidence that the drugs are clinically effective for any form of depression. Instead, their conclusion appears to rely exclusively on the fact that there is a statistically significant drug-placebo difference in mean symptom change. The same misinterpretation of statistical significance also appears to have fuelled the simplistic and over-optimistic conclusions drawn from the Lancet meta-analysis by Cipriani et al.2 In the following, I will summarise their findings and reflect on the reported average treatment effect. I will stress that equating statistical significance with proof of drug efficacy is a scientific fallacy, and I will demonstrate that a drug-placebo difference of this magnitude is evidence of a lack of efficacy.
Research question, findings and conclusions
The recent paper by Furukawa et al.3 addresses a research question with important treatment implications: Are antidepressant drugs equally effective across the spectrum of initial depression severity? The authors use individual-participant data to meta-analytically examine the interaction between baseline depression severity and mean differences in change scores between antidepressants and placebo across 8 weeks of treatment. They report a statistically significant mean drug-placebo difference of 1.6 points on the Hamilton Rating Scale for Depression (HRSD; range 0-52 points), but find no significant interaction between change scores and initial severity. That is, the drug-placebo difference is comparable across the whole spectrum of baseline severity, which agrees with other large-scale meta-analyses of individual-participant data.5, 6 The average treatment effect of 1.6 HRSD points found in this meta-analysis is also close to the approximately 2 HRSD points found in group-level meta-analyses of moderate-to-severe depression, including the highly publicised work by Cipriani and colleagues.2 Furukawa et al.3 conclude that “Patients would benefit equally through the whole spectrum of severity…” and that “The myth of specifically smaller benefit of antidepressants for the milder spectrum of the disorder in comparison with its severer spectrum must now be expelled” (p. 456). As mentioned above, this conclusion was reiterated in the accompanying editorial by Eriksson and Hieronymus.4 But do their findings actually warrant such strong and far-reaching claims?
Fallacy of statistical significance testing
I agree with Furukawa and others that the efficacy of antidepressants is comparable across the whole severity spectrum, but the data clearly do not support the claim that antidepressants have a clinical benefit. In fact, what their3 and related findings2 5 6 reveal is that antidepressants are largely ineffective across the whole spectrum of severity, because a 1.6 point drug-placebo difference is a negligibly small effect. Empirically derived criteria indicate that a minimal clinically relevant drug-placebo difference would be 7 points in HRSD change scores.7 The reported 1.6 point effect size further represents a very small fraction (≤16%) of the 10 to 12 point symptom change that is commonly required to define clinical response in moderate to severe depression. So why are these findings interpreted as evidence that the drugs are effective? Many researchers misinterpret statistical significance as evidence of a clinically relevant drug effect. However, one must not confound statistical significance with clinical relevance, since even the most minor effect sizes reach statistical significance when sample size is large.8 9 The American Statistical Association10 formally states that “A p-value, or statistical significance, does not measure the size of an effect or the importance of a result” and further emphasizes that “Any effect, no matter how tiny, can produce a small p-value if the sample size or measurement precision is high enough …” (p. 132). As a matter of fact, statistical significance alone does not prove that antidepressants are effective. Whether a finding is practically important can only be judged by careful examination of the reported effect size. Unfortunately, such a critical evaluation was provided neither in the original paper nor in the accompanying editorial. On the contrary, Cipriani et al.11 even complained elsewhere that there is “an undue focus on the binary and polarising question of clinical significance” (p. 462).
Such statements underline how widespread the misinterpretation of statistical significance testing is. This poses a serious threat to the translation of research findings into psychiatric practice. There is not an undue focus on clinical significance, but rather on statistical significance. Weighing the practical importance of drug effects is a prerequisite for clinical practice. Claiming otherwise undermines scientific progress.
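The dependence of statistical significance on sample size can be illustrated with a minimal sketch. It assumes a simple two-sample z-test with equal arm sizes and a hypothetical standard deviation of 8 HRSD points for change scores in both arms; that value, the arm sizes and the function name are chosen for illustration only and are not taken from the cited meta-analyses. Only the 1.6 point mean difference comes from the findings discussed above.

```python
import math


def two_sided_p(mean_diff, sd, n_per_arm):
    """Two-sided p-value of a two-sample z-test (equal arm sizes, common SD)."""
    # Standard error of the difference between two independent arm means.
    se = sd * math.sqrt(2.0 / n_per_arm)
    z = abs(mean_diff) / se
    # For a standard normal variable Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


MEAN_DIFF = 1.6   # drug-placebo difference in HRSD points, as reported in the text
ASSUMED_SD = 8.0  # hypothetical SD of change scores, for illustration only

ARM_SIZES = (20, 100, 500, 5000)  # hypothetical numbers of patients per arm
p_values = {n: two_sided_p(MEAN_DIFF, ASSUMED_SD, n) for n in ARM_SIZES}

for n, p in p_values.items():
    print(f"n per arm = {n:>5}: p = {p:.3g}")
```

Under these assumptions, the same 1.6 point difference is far from significant with 20 patients per arm and highly significant with 5000 patients per arm, although the size of the effect has not changed at all.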
A drug is hardly effective when its average treatment effect falls so far short of a minimal clinically important improvement. Unfortunately, many researchers settle for statistical significance and misinterpret statistically significant findings as evidence that a drug is effective. But without considering the reported effect sizes in context, statistical significance testing is a mindless and misleading endeavour.8 12 The judgement of whether a drug is helpful to patients must be based on a careful risk-benefit analysis.13 Given that antidepressant use is associated with significantly increased odds of serious adverse events and many debilitating side effects, including persistent sexual dysfunction,14 pathological behavioural activation and mania,15 suicide attempts,16 as well as physical dependence and various bodily disorders,17 it is questionable whether a minor treatment effect of less than 2 points on the HRSD outweighs those serious risks. A cautious interpretation of these data3 and related findings,5 6 including the meta-analysis by Cipriani et al.,2 would therefore be that evidence for clinical benefits of antidepressants is lacking not only in mild-to-moderate depression, but also in severe depression. Instead of urging still more drug prescriptions,1 I suggest that researchers should scrutinize whether antidepressants work for any form of depression in a clinically meaningful way by balancing risks and benefits. Anything else would be incompatible with the principles of evidence-based medicine.
1. Adlington K. Pop a million happy pills? Antidepressants, nuance, and the media. BMJ 2018;360:k1069. doi: 10.1136/bmj.k1069.
2. Cipriani A, Furukawa TA, Salanti G, Chaimani A, Atkinson LZ, Ogawa Y, et al. Comparative efficacy and acceptability of 21 antidepressant drugs for the acute treatment of adults with major depressive disorder: a systematic review and network meta-analysis. Lancet 2018;391(10128):1357-66. doi: 10.1016/S0140-6736(17)32802-7.
3. Furukawa TA, Maruo K, Noma H, Tanaka S, Imai H, Shinohara K, et al. Initial severity of major depression and efficacy of new generation antidepressants: individual participant data meta-analysis. Acta Psychiatr Scand 2018. doi: 10.1111/acps.12886.
4. Eriksson E, Hieronymus F. The alleged lack of efficacy of antidepressants in non-severe depression: a myth debunked. Acta Psychiatr Scand 2018;in press.
5. Rabinowitz J, Werbeloff N, Mandel FS, Menard F, Marangell L, Kapur S. Initial depression severity and response to antidepressants v. placebo: patient-level data analysis from 34 randomised controlled trials. Br J Psychiatry 2016;209(5):427-8. doi: 10.1192/bjp.bp.115.173906.
6. Gibbons RD, Hur K, Brown CH, Davis JM, Mann JJ. Benefits from antidepressants: synthesis of 6-week patient-level outcomes from double-blind placebo-controlled randomized trials of fluoxetine and venlafaxine. Arch Gen Psychiatry 2012;69(6):572-9. doi: 10.1001/archgenpsychiatry.2011.2044.
7. Moncrieff J, Kirsch I. Empirically derived criteria cast doubt on the clinical significance of antidepressant-placebo differences. Contemp Clin Trials 2015;43:60-2. doi: 10.1016/j.cct.2015.05.005.
8. Szucs D, Ioannidis JPA. When null hypothesis significance testing is unsuitable for research: A reassessment. Front Hum Neurosci 2017;11:390. doi: 10.3389/fnhum.2017.00390.
9. Cohen J. The earth is round (P<.05). Am Psychol 1994;49(12):997-1003. doi: 10.1037/0003-066X.49.12.997.
10. Wasserstein RL, Lazar NA. The ASA's statement on p-values: Context, process, and purpose. Am Stat 2016;70(2):129-33. doi: 10.1080/00031305.2016.1154108.
11. Cipriani A, Salanti G, Furukawa TA, Egger M, Leucht S, Ruhe HG, et al. Antidepressants might work for people with major depression: where do we go from here? Lancet Psychiat 2018;5(6):461-3. doi: 10.1016/S2215-0366(18)30133-0.
12. Gigerenzer G. Mindless statistics. Journal of Socio-Economics 2004;33:587-606.
13. Hengartner MP. Methodological flaws, conflicts of interest, and scientific fallacies: Implications for the evaluation of antidepressants’ efficacy and harm. Front Psychiatry 2017;8:275. doi: 10.3389/fpsyt.2017.00275.
14. Jakobsen JC, Katakam KK, Schou A, Hellmuth SG, Stallknecht SE, Leth-Moller K, et al. Selective serotonin reuptake inhibitors versus placebo in patients with major depressive disorder. A systematic review with meta-analysis and Trial Sequential Analysis. BMC Psychiatry 2017;17(1):58. doi: 10.1186/s12888-016-1173-2.
15. Tondo L, Vazquez G, Baldessarini RJ. Mania associated with antidepressant treatment: comprehensive meta-analytic review. Acta Psychiatr Scand 2010;121(6):404-14. doi: 10.1111/j.1600-0447.2009.01514.x.
16. Fergusson D, Doucette S, Glass KC, Shapiro S, Healy D, Hebert P, et al. Association between suicide attempts and selective serotonin reuptake inhibitors: systematic review of randomised controlled trials. BMJ 2005;330(7488):396. doi: 10.1136/bmj.330.7488.396.
17. Carvalho AF, Sharma MS, Brunoni AR, Vieta E, Fava GA. The safety, tolerability and risks associated with the use of newer generation antidepressant drugs: A critical review of the literature. Psychother Psychosom 2016;85(5):270-88. doi: 10.1159/000447034.
Competing interests: No competing interests