Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence
BMJ 2015; 351 doi: https://doi.org/10.1136/bmj.h4320 (Published 16 September 2015) Cite this as: BMJ 2015;351:h4320
All rapid responses
TO THE EDITOR:
I greatly admire the BMJ for stimulating in a variety of ways the discussion of complex issues surrounding S329. But as a Child, Adolescent and General Psychiatrist in private practice, I have neither the time nor the skills to carefully evaluate the positions taken either by Le Noury et al [1] or Keller et al [2]. Nonetheless, I am of the firm opinion that the original 2001 article should have been retracted long ago by the JAACAP. My opinion is based on reading the relevant parts of the Department of Justice [DOJ] vs GlaxoSmithKline [GSK] court ruling [3].
A summary of the ruling [4] states:
Paxil: In the criminal information, the government alleges that, from April 1998 to August 2003, GSK unlawfully promoted Paxil for treating depression in patients under age 18, even though the FDA has never approved it for pediatric use. The United States alleges that, among other things, GSK participated in preparing, publishing and distributing a misleading medical journal article that misreported that a clinical trial of Paxil demonstrated efficacy in the treatment of depression in patients under age 18, when the study failed to demonstrate efficacy. At the same time, the United States alleges, GSK did not make available data from two other studies in which Paxil also failed to demonstrate efficacy in treating depression in patients under 18.
The settlement itself [4] speaks even more definitely against S329. Examples of this taken from the ruling include:
• …while concealing the fact that Paxil failed to show efficacy on any of the primary endpoints in three controlled trials funded by GSK…. To drive these promotional efforts, GSK touted a medical journal article that it paid to have drafted and that exaggerated Paxil’s efficacy while downplaying risks identified during one of the trials. [p3-4]
• Three GSK Clinical Trials Failed to Demonstrate Paxil’s Effectiveness…. [p4]
• Study 329 Failed to Show Efficacy of Paxil for Children and Adolescents. [p4]
• …GSK hired Scientific Therapeutics Information, Inc. (STI) to prepare a journal article about Study 329. GSK worked closely with STI on the article by providing a draft clinical report to “serve as a template for the proposed publication,” commenting on multiple drafts, and approving the final version. [p5-6]
• JAMA rejected the article in December 1999 and provided comments to the article’s lead author, which he then circulated to GSK and STI. Some of the comments were extremely critical of how the article portrayed the study’s results. [p7]
• Given the comments received, GSK and the lead author decided to revise the article and send it to what they called “a less demanding journal.” [p7]
• (The article was submitted to the JAACAP and published.) “The final published article still mischaracterized the results of Study 329, even with the changes.” [p8]
• GSK Caused the JAACAP Article to Misrepresent and Minimize Paxil’s Risks to Children and Adolescents. [p9]
• GSK and STI instead revised the article to falsely state that only one of the 11 serious adverse events in Paxil patients was considered related to treatment…. [p10]
• GSK added Dr. Wagner (one of the listed authors of S329) to the agenda of a Paxil Forum meeting in June 2001 to “capitalize” on the impending JAACAP publication. Dr. Wagner’s presentations during the Forum meetings were similar to the one she gave to the sales force. Dr. Wagner said adolescent patients who received Paxil in the 329 study showed “significantly greater improvement.” [p18]
The positions presented by the DOJ led to GSK settling the case, a consequence of which was the largest fine, $3 billion, ever levied against a pharmaceutical corporation. This outcome does not speak well for S329 and is consistent with the findings of Le Noury et al [1]. As a child and adolescent psychiatrist, I find it embarrassing that none of the listed authors of S329, a number of whom are my colleagues, has acknowledged the problems with the article, and that the JAACAP has not retracted it despite over a decade of complaints about its authenticity.
Yours,
Edmund C. Levin, M.D.
1. Le Noury J, Nardo JM, Healy D, et al. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. BMJ 2015;351:h4320.
2. Keller MB, Ryan ND, Strober M, et al. Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial. J Am Acad Child Adolesc Psychiatry 2001;40(7):762-72.
3. http://www.justice.gov/sites/default/files/opa/legacy/2012/07/02/us-comp...
4. http://www.justice.gov/opa/pr/glaxosmithkline-plead-guilty-and-pay-3-bil.... p 5.
Competing interests: No competing interests
The response by Keller and selected colleagues [1] to our Restoring Study 329 article alleges three overarching faults[2]: bias and lack of blind ratings in relation to harms; lack of detailed methodology; and failure to consider the available methodological knowledge regarding paediatric depression from twenty-four years ago. Regarding the second issue, there is in fact a detailed explanation of all the methods in our paper and its RIAT Audit Record (appendix 1). We tackle the first and third issues below.
Efficacy
While there was uncertainty twenty-four years ago about the appropriate rating scale to use in pediatric depression trials, there were serious methodological problems in the conduct and reporting of Study 329 that have nothing to do with that uncertainty. Instead, in their reporting of efficacy in Study 329[3], and their defence of it, Keller and colleagues have asked that the field suspend many widely held tenets about clinical trial analysis, by asking us to do the following:
• accept that the a priori protocol is not binding, and that changes can be made to the outcome variables while the study is ongoing, without amending the protocol with the IRB or documenting the rationale for the change
• ignore the requirement to correct the threshold of significance for the analysis of multiple variables
• ignore the requirement that when there are more than two groups, preliminary omnibus statistical analysis needs to be done prior to making any pairwise comparisons between groups - an integral part of the ANOVA analysis declared in the Study 329 protocol
• allow the parametric analysis of rank-order, ordinal rating scales [CGI, HAM-D and K-SADS-L Depressed Mood Items] rather than the expected non-parametric methods specifically derived for this kind of data
• allow 19 outcome measures to be added to the original eight at various times up to and after the breaking of the blind, purportedly according to an analytical plan ‘developed prior to opening of the blind’. (In spite of multiple requests, neither GSK nor Keller and colleagues have ever produced this analytic plan, suggesting that either it does not exist, or that it contains information unsympathetic to their claims.)
• accept the dismissal of protocol-specified secondary outcomes and the introduction of rogue variables on the grounds that ‘the Hamilton Depression Rating Scale (our primary outcome measure) had significant limitations in assessing mood disturbance in younger patients’, when none of the protocol-specified secondary outcome measures that they discarded were based on the HAM-D, and two of the rogue measures that they introduced were HAM-D measures
• accept the clinically dubious improvements in four of these rogue variables as evidence of efficacy. (Although these measures achieved statistical significance in the pre-defined eighth (final) week of the acute phase of the study, they did not do so in the weekly assessments over the previous seven weeks, a pattern unseen in any known antidepressant; we are working on another manuscript analysing Keller et al’s rogue variables.)
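The multiplicity point in the list above can be illustrated with a minimal sketch. It assumes a simple Bonferroni correction and independent tests, neither of which is stated in the Study 329 protocol as quoted here; the function names are hypothetical. The only figures taken from the text are the eight original outcomes and the 19 added ones.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests tests,
    assuming (for illustration only) that the tests are independent."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1.0 - (1.0 - alpha) ** n_tests


# Eight protocol outcomes plus the 19 added later, as described above.
n_outcomes = 8 + 19
per_test_alpha = bonferroni_threshold(0.05, n_outcomes)
uncorrected_fwer = familywise_error_rate(0.05, n_outcomes)

print(f"Corrected per-test threshold: {per_test_alpha:.5f}")
print(f"Familywise error rate if each test uses 0.05: {uncorrected_fwer:.2f}")
```

Under these assumptions, testing 27 outcomes at an uncorrected 0.05 gives roughly a three in four chance of at least one spurious "significant" result, which is why a corrected threshold is expected.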
There was no ambiguity about the appropriateness of these methodological manoeuvres when Study 329 was conducted and reported. However, although some of these problems were obvious when the paper was first published, others were not apparent until we had access to the raw clinical data. This lack of transparency erodes confidence that RCTs will be conducted, analysed and reported free from covert manipulation.
Furthermore, Keller and colleagues also failed to report on the continuation phase of Study 329, even though that was a protocol-specified outcome. A report of this phase is almost ready for submission by us.
Harms
With regard to harms, Keller and colleagues are simply incorrect in many of their claims about our purported bias and lack of blind ratings.
First, our paper makes it clear that both coders in the re-analysis were blind to randomisation status.
Second, there was no ‘re-scoring’. This odd choice of words raises doubts that Keller et al have much expertise in analysing harms. We used a dictionary that adhered much more closely to the verbatim terms used by the face-to-face interviewers. The fact that Keller and colleagues say that we have labelled emotional lability as suicidality makes us wonder if they have seen the individual patient level data; it was SKB’s coders who came up with the term ‘emotional lability’, not the face-to-face interviewers, whose verbatim terms were of suicidal thoughts and behaviour. Simply using the verbatim terms that the named authors or their colleagues had used when faced with these adolescents reveals a striking rate of suicidal events. To argue that our return to these verbatim terms was arbitrary is bizarre.
Third, we made it clear there is unavoidable uncertainty in coding, and we invited others to download the data we have made available and juggle it to see if they can improve on our categorisation of the data. In our correspondence with BMJ, we made it clear that there are items that GSK could argue are more appropriately coded differently. We would be receptive to a rationale for alternate coding of certain items that is cogently argued rather than simply asserted, but our hunch is that a disinterested observer reviewing the coding as presented by GSK across all 1500 adverse effects in this study (or 2000+ if we include the continuation phase) would conclude that our efforts are a better representation of the data.
Fourth, reading our paper makes it clear why we reviewed the clinical records of 93 subjects; these were the subjects who dropped out or became suicidal. Our claims about underreporting of adverse events stand independently of that non-random sub-sample.
With regard to suicidal ideation and attempts, Keller et al. refer to a reanalysis by Bridge and colleagues, which found that there was no significant difference in suicidality between paroxetine and placebo. But Bridge et al. relied on Keller et al.'s misleading 2001 report.
With regard to bias, our point was that the best protection against bias is rigorous adherence to predetermined protocols and making data freely available. We, like everyone, are subject to the unwitting influence of our bias. The question is whether the Keller et al publication of 2001 manifests unconscious bias or deliberate misrepresentation.
The original and restored studies, the study data, reviews and responses are all available at Study329.org, offering a broad range of options when it comes to consideration of authorship, research misconduct and the newly described species, ‘research parasite’[4].
1 Keller MB, Birmaher B, Carlson GA, Clarke GN, Emslie GJ, Koplewicz H, Kutcher S, Ryan N, Sack WH, Strober M. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. Response from the authors of the original Study 329. BMJ 2015;351:h4320
2 Le Noury J, Nardo JM, Healy D, Jureidini J, Raven M, Tufanaru C, Abi-Jaoude E. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. BMJ. 2015 Sep 16;351:h4320.
3 Keller MB, Ryan ND, Strober M, et al. Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial. J Am Acad Child Adolesc Psychiatry 2001;40:762-72.
4 Longo DL, Drazen JM. Data sharing. N Engl J Med. 2016;374:276-7.
Competing interests: As noted in our RIAT paper
In their letter [1], Keller and co-authors of the Study 329 report published in Journal of the American Academy of Child and Adolescent Psychiatry (JAACAP) in 2001[2] challenge the validity of the 2015 re-analysis published in The BMJ by Le Noury et al. [3], in part because of “substantial problems with RIATT [sic] methodology.” Keller et al. note that the RIAT approach lacks detailed protocols or other documents explaining RIAT methodology, and consider this problematic.
As lead and senior authors of the Restoring Invisible and Abandoned Trials (RIAT) declaration [4], we disagree with Keller et al.’s criticism, and use this opportunity to explain our position.
RIAT is a conceptual framework for bringing corrective action to the scientific literature by publishing unpublished trials and re-publishing published-but-misreported clinical trials. The 2015 publication by Le Noury et al. used the RIAT framework to republish Study 329. The basis for this was the misreporting of the study by Keller et al. in their 2001 publication in JAACAP.[2] This publication was a key piece of evidence in the US Department of Justice criminal lawsuit against GlaxoSmithKline which ultimately settled for US $3 billion in 2012.[5] Other researchers have used the RIAT framework to publish an unpublished trial on colorectal surgery.[6,7]
The RIAT declaration [4] outlines a number of steps that “restorative authors” (here, Le Noury et al.) should use to enable an ethical primary publication of a clinical trial. A key one is that RIAT papers must report the clinical trial according to the original protocol of the original trial. Any analyses conducted that were not pre-specified in the original protocol must be clearly marked as such. (We wrote: “RIAT analyses should follow the analyses specified in the protocol (including any specified in amendments). Any other analyses are discouraged, but if done must be clearly noted as exploratory and not prespecified. At the same time, RIAT authors may wish to critically appraise the trials they report. This can be useful, but the critique should be clearly identifiable and placed in the discussion section.”[4])
PD served as one of the formal peer reviewers for the Le Noury paper and as far as he can tell, the authors followed this guidance.
We and our co-authors specifically intended RIAT to be a living concept, open to suggestions for improvement. While Keller et al. express concern over a lack of “detailed methodology” for RIAT, they do not cite the RIAT declaration [4] nor mention any details of what is actually lacking in the current process. We invite them to read the RIAT declaration.
We welcome all thoughts on how to ensure the most robust RIAT papers possible.
Peter Doshi and Tom Jefferson
References:
[1] Keller M, Birmaher B, Carlson GA, Clarke GN, Emslie GJ, Koplewicz H, et al. Re: Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. Response from the authors of the original Study 329 [Internet]. 2016 [cited 2016 Jan 20]. Available from: http://www.bmj.com/content/351/bmj.h4320/rr-27
[2] Keller MB, Ryan ND, Strober M, et al. Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial. J Am Acad Child Adolesc Psychiatry 2001;40:762-72. http://www.ncbi.nlm.nih.gov/pubmed/11437014
[3] Le Noury J, Nardo JM, Healy D, Jureidini J, Raven M, Tufanaru C, et al. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. The BMJ. 2015 Sep 16;351:h4320. http://www.bmj.com/content/351/bmj.h4320
[4] Doshi P, Dickersin K, Healy D, Vedula SS, Jefferson T. Restoring invisible and abandoned trials: a call for people to publish the findings. BMJ 2013;346:f2865. http://www.bmj.com/cgi/doi/10.1136/bmj.f2865
[5] Doshi P. No correction, no retraction, no apology, no comment: paroxetine trial reanalysis raises questions about institutional responsibility. The BMJ. 2015 Sep 16;351:h4629. http://www.bmj.com/content/351/bmj.h4629
[6] Treasure T, Monson K, Fiorentino F, Russell C. The CEA Second-Look Trial: a randomised controlled trial of carcinoembryonic antigen prompted reoperation for recurrent colorectal cancer. BMJ Open. 2014 May 1;4(5):e004385. http://bmjopen.bmj.com/content/4/5/e004385
[7] Treasure T, Monson K, Fiorentino F, Russell C. Operating to remove recurrent colorectal cancer: have we got it right? BMJ. 2014 May 13;348(may13 2):g2085. http://www.bmj.com/content/348/bmj.g2085
Competing interests: We are the first and senior authors of the RIAT declaration, which was coauthored by David Healy, who is part of the group that reanalysed Study 329. PD served as one of the formal peer reviewers for the reanalysis manuscript and provided the Jureidini team with unpaid advice on the RIAT process before the paper was submitted and while it was under review. PD is also a graduate of Brown University, where Professor Keller is Professor Emeritus of Psychiatry and Human Behavior. PD initiated an inquiry in 2012 that resulted in additional information from clinical study reports of Study 329 and eight other studies being posted on GSK’s website. In addition, PD received €1500 from the European Respiratory Society in support of his travel to the society’s September 2012 annual congress in Vienna, where he gave an invited talk on oseltamivir. PD and TJ were co-recipients of a UK National Institute for Health Research grant (HTA – 10/80/01 Update and amalgamation of two Cochrane Reviews: neuraminidase inhibitors for preventing and treating influenza in healthy adults and children: http://www.nets.nihr.ac.uk/projects/hta/108001). This review relied on clinical study reports provided by GSK for zanamivir. TJ receives royalties from his books published by Blackwells and Il Pensiero Scientifico Editore, Rome. TJ is occasionally interviewed by market research companies for anonymous interviews about Phase 1 or 2 pharmaceutical products. In 2011-2013, TJ acted as an expert witness in a litigation case related to oseltamivir phosphate; Tamiflu [Roche] and in a labour case on influenza vaccines in healthcare workers in Canada. In 1997-99 TJ acted as a consultant for Roche, in 2001-2 for GSK, and in 2003 for Sanofi-Synthelabo for pleconaril (an anti-rhinoviral, which did not get approval from the Food and Drug Administration).
TJ was a consultant for IMS Health in 2013, and in 2014 was retained as a scientific adviser to a legal team acting on the drug Tamiflu (oseltamivir, Roche). In 2014-15 TJ was a member of two advisory boards for Boehringer and is in receipt of a Cochrane Methods Innovations Fund grant to develop guidance on the use of regulatory data in Cochrane reviews. TJ has a potential financial conflict of interest in the investigation of the drug oseltamivir. TJ is acting as an expert witness in a legal case involving the drug oseltamivir (Roche). TJ is a member of an Independent Data Monitoring Committee for a Sanofi Pasteur clinical trial.
The BMJ article entitled “Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence”[1] reanalyzed data from the original Paroxetine 329 study[2], a double-blind placebo controlled comparison of paroxetine to imipramine. Paroxetine 329 was designed between 1991 and 1992. Subject enrollment began in 1994, and was completed in 1997. Academic psychiatrists designed the study, with very little change by GSK, which funded the study in an academic / industry partnership. The goal of the study was to advance the treatment of depression in youth, rather than primarily as a drug registration trial.
Overarching issues with the “Restoring Study 329” include:
1. Authors of “Restoring Study 329” evidenced both bias and a lack of blind ratings. In a recent article about “Restoring Study 329” in the Chronicle of Higher Education, Dr. Jureidini is quoted as saying: “We don’t think we’ve done the definitive analysis. It’s not something that can be done absolutely objectively, particularly the interpretation of harms. We can’t protect ourselves completely from our own biases.”[3]
Biases are a serious consideration for Restoring Study 329 because Dr. Jureidini, as he declares in a footnote on the subject of “Competing interests”, served as an expert witness for plaintiff’s lawyers in lawsuits against GSK related to Study 329. In that work Dr. Jureidini would have studied all available data looking at both efficacy and suicidal side effects, using many different approaches to best capture any potential harms.
2. The “restoring invisible and abandoned trials” (RIAT) approach to reanalyzing published studies may provide general guidelines but we could not find publications or available working RIAT documents on detailed protocols. Lack of detailed methodology is a serious concern because there is general consensus in the field that there is not, nor ever will be, a single correct approach to reanalysis. Small differences in analysis frequently make big differences in statistical results and conclusions.
3. “Restoring Study 329” did not consider available knowledge 24 years ago, when Paroxetine 329 was developed and performed. Clinical research methodology has evolved considerably in the past two decades. These aspects are addressed in comments by established investigators not involved in Paroxetine 329. For example, referring to “Restoring Study 329” as reported in Psychiatric News Alert [4], Mark Olfson said, “However, the new reanalysis does not alter the totality of clinical trial evidence that continues to support the safety and efficacy of SSRIs for adolescent depression.” And Daniel Pine said “We have known for some time that antidepressant medications have both significant benefits for some children as well as significant risks for other children. This new analysis really does nothing to change this knowledge, and provides no new insights into what we have known about these medications for the past few years.”
Efficacy
Antidepressants considered as a group are superior to placebo for the treatment of anxiety disorders and for depression in adolescents, with similar overall response rates in anxiety and depression. [5]
The two primary outcome measures in Paroxetine 329 did not reach statistical significance. The abstract of the published paper noted: "The two primary outcome measures were endpoint response (Hamilton Rating Scale for Depression [Ham-D] score ≤ 8 or ≥50% reduction in baseline HAM-D) and change from baseline HAM-D score." In Table 2, the p value for the first primary endpoint (Ham-D score ≤ 8 or ≥50% reduction in baseline HAM-D) was reported at p < 0.11 for paroxetine versus placebo. In the same table, the p value for change in HAM-D total score is reported at p < 0.13. While both outcomes were in the direction of a better response for paroxetine over placebo, neither reached our critical alpha level of 0.05. This is clear in the abstract and text of the publication.
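The endpoint response criterion quoted above (HAM-D score ≤ 8, or a reduction of at least 50% from baseline) can be written as a small rule. This is a minimal sketch for readers who want the definition in explicit form; the function name and the example scores are hypothetical and are not taken from the trial data.

```python
def is_responder(baseline_hamd: float, endpoint_hamd: float) -> bool:
    """Apply the quoted response rule: endpoint HAM-D <= 8,
    or a reduction from baseline of at least 50%."""
    if baseline_hamd <= 0:
        raise ValueError("baseline HAM-D score must be positive")
    reduction = (baseline_hamd - endpoint_hamd) / baseline_hamd
    return endpoint_hamd <= 8 or reduction >= 0.5


# Hypothetical patients, for illustration only.
examples = [(20, 10), (20, 11), (12, 8), (30, 16)]
for baseline, endpoint in examples:
    print(baseline, endpoint, is_responder(baseline, endpoint))
```

Because the rule is a logical "or", a patient with a low baseline score can count as a responder by crossing the ≤ 8 threshold even when the proportional reduction is well under 50%.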
In the interval from when we planned the study to when we approached the data analysis phase, but prior to the blind being broken, the academic authors, not the sponsor, added several additional measures of depression as secondary outcomes. We did so because the field of pediatric-age depression had reached a consensus that the Hamilton Depression Rating Scale (our primary outcome measure) had significant limitations in assessing mood disturbance in younger patients. Taking this into consideration, and in advance of breaking the blind, we added secondary outcome measures agreed upon by all authors of the paper. We found statistically significant indications of efficacy in these measures. These secondary outcomes were clearly reported as separate from the negative primary outcomes.
Thus, the authors of “BMJ-Restoring Study 329” were incorrect in stating that “Both before and after breaking the blind, however, the sponsors made changes to the secondary outcomes as previously detailed. We could not find any document that provided any scientific rationale for these post hoc changes and the outcomes are therefore not reported in this paper.” Rather, secondary outcomes were decided by the authors prior to the blind being broken. Secondary outcome measures are frequently, and appropriately, included in study reports even when the primary measures do not reach statistical significance. The authors of “Restoring Study 329” state “there were no discrepancies between any of our analyses and those contained in the CSR [clinical study report]”. The disagreement on treatment outcomes rests on this arbitrary and non-blind dismissal of our secondary outcome measures.
In the abstract we stated “Conclusions: Paroxetine is generally well tolerated and effective for major depression in adolescents.” In this sample and with the state of knowledge at the time, it was justified and appropriate.
Our goal was to learn as much as possible about the use of this compound in youth, so that we could understand what role it (and by extension other SSRIs) could play in the treatment of adolescents with MDD. For us the question was given (1) the data distribution and statistical results for efficacy and side effects that we saw in the study, and (2) that there was well replicated research evidence of efficacy and relative safety of paroxetine in adults, what conclusions should be drawn from our data? The clinical outcomes were substantially in the right direction, with a number of them reaching the 0.05 level of statistical significance. The clinical results comparing paroxetine to placebo and imipramine to placebo paralleled those reported in adults. Side effects were similar to what was known in adults. Thus we reached the conclusions reported.
Harms
The “Restoring Study 329” reanalysis uses the FDA MedDRA approach to side effect data, which was not available when our study was done. That one can do better reanalyzing adverse event data using refinements in approach that have accrued in the 15 years since a study’s publication is unsurprising and not a valid critique of Paroxetine 329 study as performed and presented.
We emphatically disagree with the “Restoring Study 329” position that statistics are not useful in understanding adverse side effects and that each individual reader should decide for herself when a difference in rates of adverse side effects is meaningful. Statistics offer several approaches to the question of when there is a meaningful difference in the side effect rates between different treatments.
Specific methodology problems in the reanalysis of the “harm” data are as follows: 1) the authors chose a non-random subsample of 85 subjects who were withdrawn from the study plus 8 subjects whom the authors labeled “suicidal” based on their inspection of the data; 2) a different instrument was utilized to re-score the harm effects and only one of the authors was trained in the scoring of the instrument; 3) some side effects were arbitrarily interpreted (e.g., upper respiratory symptoms were labeled as “dystonia” and emotional lability labeled as “suicidality”); 4) in the original paper, side effects were analyzed only during the acute phase, but in the reanalysis, the authors analyzed them during the acute phase, as well as the tapering and follow up phases of the study; 5) in the original study patients were interviewed face-to-face whereas the reanalysis was based only on the interpretation of the data; and 6) importantly, the two authors were not blind to patients’ randomization status.
Suicidal ideation and attempts
Our field’s understanding of how to approach analysis of suicidal ideation, suicide attempts, and completed suicide has advanced enormously since publication of study 329.
Two definitive reanalyses of suicidality with antidepressants in adolescents are: 1) The 2003 FDA reanalysis of all RCT data of SSRI studies in youth for all indications.[6] In the FDA analysis the average risk ratio for SSRI versus placebo treated subjects was 1.96 (CI: 1.28-2.98). Considered separately, Study 329 did not reach statistical significance for increased suicidality (CI: 0.42-33.21). 2) The methodologically superior reanalysis by Bridge and colleagues, which also found that in study 329 there was no significant risk difference between paroxetine and placebo.[7]
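The interval-based reading used in the paragraph above (a risk ratio is called statistically significant when its confidence interval excludes 1) can be sketched as follows. The log-scale Wald interval shown here is a common textbook method and is an assumption; it is not necessarily the method used in the FDA analysis. The function names and the event counts are hypothetical. Only the two quoted intervals come from the text.

```python
import math


def risk_ratio_ci(events_treated, n_treated, events_control, n_control, z=1.96):
    """Risk ratio with an approximate 95% CI (log-scale Wald interval).
    Requires at least one event in each arm."""
    if events_treated <= 0 or events_control <= 0:
        raise ValueError("this simple method needs events in both arms")
    rr = (events_treated / n_treated) / (events_control / n_control)
    se_log_rr = math.sqrt(
        1 / events_treated - 1 / n_treated + 1 / events_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1 or upper < 1


# Intervals quoted in the text.
print(excludes_one(1.28, 2.98))   # pooled FDA estimate
print(excludes_one(0.42, 33.21))  # Study 329 considered separately

# Hypothetical counts, for illustration only: 10/100 versus 5/100.
rr, lower, upper = risk_ratio_ci(10, 100, 5, 100)
print(f"RR = {rr:.2f}, CI: {lower:.2f}-{upper:.2f}")
```

A wide interval such as 0.42-33.21 reflects a small number of events in a single trial: it is compatible with no increase in risk and also with a very large one.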
Paroxetine treatment in youth does not appear to significantly differ from other SSRIs in the risk of suicidal ideation or attempts and whether SSRIs increase or decrease completed suicide remains an open question. [8-12]
Summary
We strongly support efforts to make anonymized raw data from scientific studies available for reanalysis. The validity of “Restoring 329”, however, is doubtful because of author bias and substantial problems with RIATT methodology. To describe Paroxetine 329 as “misreported” is pejorative and wrong based on both state-of-the-art research methods 24 years ago, and retrospectively from the standpoint of current best practices.
Sincerely,
Martin B. Keller, M.D.
Boris Birmaher, M.D.
Gabrielle A. Carlson, MD
Gregory N. Clarke, Ph.D.
Graham J. Emslie, M.D.
Harold Koplewicz, M.D.
Stan Kutcher, M.D.
Neal Ryan, M.D.
William H. Sack, M.D.
Michael Strober, Ph.D.
1. Le Noury J, Nardo JM, Healy D, et al. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. BMJ 2015;351:h4320.
2. Keller MB, Ryan ND, Strober M, et al. Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial. J Am Acad Child Adolesc Psychiatry 2001;40(7):762-72.
3. Basken P. Landmark analysis of an infamous medical study points out the challenges of research oversight. 2015. http://chronicle.com/article/Landmark-Analysis-of-an/233179.
4. Reanalysis of JAACAP Study on Paroxetine Sparks Controversy. 2015. http://alert.psychnews.org/2015/09/reanalysis-of-jaacap-study-on.html.
5. Bridge JA, Iyengar S, Salary CB, et al. Clinical response and risk for reported suicidal ideation and suicide attempts in pediatric antidepressant treatment: a meta-analysis of randomized controlled trials. JAMA 2007;297(15):1683-96.
6. Hammad TA, Laughren T, Racoosin J. Suicidality in pediatric patients treated with antidepressant drugs. Arch Gen Psychiatry 2006;63(3):332-9.
7. Bridge JA, Birmaher B, Iyengar S, et al. Placebo response in randomized controlled trials of antidepressants for pediatric major depressive disorder. Am J Psychiatry 2009;166(1):42-9.
8. Gibbons RD, Brown CH, Hur K, et al. Early evidence on the effects of regulators' suicidality warnings on SSRI prescriptions and suicide in children and adolescents. Am J Psychiatry 2007;164(9):1356-63.
9. Olfson M, Shaffer D, Marcus SC, et al. Relationship between antidepressant medication treatment and suicide in adolescents. Arch Gen Psychiatry 2003;60(10):978-82.
10. Gibbons RD, Hur K, Bhaumik DK, et al. The relationship between antidepressant prescription rates and rate of early adolescent suicide. Am J Psychiatry 2006;163(11):1898-904.
11. Valuck RJ, Libby AM, Sills MR, et al. Antidepressant treatment and risk of suicide attempt by adolescents with major depressive disorder: a propensity-adjusted retrospective cohort study. CNS Drugs 2004;18(15):1119-32.
12. Gibbons RD, Mann JJ. Strategies for quantifying the relationship between medications and suicidal behaviour: what has been learned? Drug Saf 2011;34(5):375-95.
Competing interests: Please see attachments to this response
Faced with the suicidal events in Study 329, Drs Verdolini and Agius invite readers to consider the underlying neurobiology but their letter has no links to neurobiology whatsoever. It offers a series of claims regularly made by those with a mania for bipolar disorder. There is no link in these claims to biology and it is not clear that there is any clinical footing to the claims either.
For the record, healthy volunteers become suicidal on serotonin reuptake inhibitors. Are all of these bipolar? The rate of suicidal events on SSRIs in non-depressive indications is roughly the same as it is in depression - are these eating disorder and other patients all bipolar?
The rate of suicidal events on anticonvulsant supposed mood stabilizers in clinical trials of bipolar disorder is roughly double the placebo rate in the same trials - the results map onto the rates for suicidal events in antidepressant trials. It goes without saying that these suicidal patients are actually bipolar, but offering this as an explanation would be ridiculous.
The rate of suicidal events in trials of anticonvulsants used for migraine and epilepsy is again similar to that in bipolar disorder trials - what are we to make of this?
What are we to make of the fact that the rate of suicidal events in antipsychotic trials is also roughly similar - regardless of indication?
A much more parsimonious hypothesis is that certain drugs do not suit certain individuals. We have no idea what the biology is in these cases. We do not even know why some patients on SSRIs become intensely nauseated and others do not - a much more common side effect than suicidality. To suggest that we do know what is going on by bringing bipolar disorder into the frame, a disorder whose biology remains quite opaque, is not helpful.
Competing interests: As outlined in Restoring Study 329
Le Noury and colleagues (1) are to be congratulated for their recent reanalysis of Study 329. However, it is important to consider the neurobiology which may underlie the apparent failure of both imipramine and paroxetine in treating adolescents with major depression.
The population included in this study was composed of "adolescents aged 12-18 who met DSM-III-R criteria for major depression for at least 8 weeks" (1). Bipolar disorder was one of the exclusion criteria.
However, the usual pattern of the development of bipolar disorder in the UK is that patients begin experiencing recurrent depression in early adolescence and then develop episodes of hypomania some years later (2).
Furthermore, according to Sharma and colleagues (3), 80% of the patients with antidepressant (AD)-resistant ''unipolar'' depressive disorder have threshold and subthreshold bipolar disorder.
Hence it is very likely that, despite the exclusion of clearly bipolar patients from the study design, there may have been some patients included within the sample who could be either subsyndromal for bipolar disorder or who could be still in the early phase of recurrent depressive episodes but are on the trajectory for developing bipolar disorder. As a consequence, these patients, like those who have fully developed bipolar disorder, may be resistant to antidepressants such as SSRIs (4).
It could then be argued that the resistance to treatment could have led to suicidal ideation because of the lack of effectiveness of the medication provided or that the antidepressant medication could have disclosed a mixed state, characterized by increased suicidality (5).
These considerations may explain the outcomes of such studies as Study 329, and should be considered in their interpretation. Furthermore it is important to assess subthreshold current or life-time (hypo)manic symptoms and to take into consideration validators for bipolar disorder such as family history of bipolar disorder, early onset of depressive episodes, and psychiatric comorbidity (4) in assessing adolescent depressive patients. Such an assessment would help identify patients who are at risk of future bipolar conversion as well as being at increased risk of developing suicidal ideation.
References
(1) Le Noury J, Nardo JM, Healy D, Jureidini J, Raven M, Tufanaru C et al. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. BMJ. 2015 Sep 16;351:h4320. doi: 10.1136/bmj.h4320.
(2) Fawcett M, Agius M. Are there different genotypes in Bipolar II and Bipolar I disorder and if so, why then do we tend to observe Unipolar Depression converting to Bipolar II and then converting to Bipolar I? Psychiatr Danub. 2015 Sep;27 Suppl 1:S160-9
(3) Sharma V, Khan M, Smith A. A closer look at treatment resistant depression: is it due to a bipolar diathesis? J Affect Disord. 2005;84:251-257
(4) Rihmer Z, Dome P, Gonda X. Antidepressant Response and Subthreshold Bipolarity in ''Unipolar'' Major Depressive Disorder. Implications for Practice and Drug Research. Journal of Clinical Psychopharmacology. August 2013; 33(4): 449-452.
(5) Pacchiarotti I, Nivoli AM, Mazzarini L, Kotzalidis GD, Sani G, Koukopoulos A, et al. The symptom structure of bipolar acute episodes: in search for the mixing link. J Affect Disord. 2013 Jul;149(1-3):56-66.
Competing interests: No competing interests
In scientific inquiry, there is a role for both hypothesis-driven and exploratory research. Research trainees have often heard one form or another of the saying, ‘If you torture your data long enough, it will tell you what you want to hear.’ Nevertheless, the practice of inappropriate data interrogation with the aim of obtaining a p value that is less than 0.05 pervades the literature(1–12). Hence, prespecifying hypotheses helps protect against spurious findings, and also helps ensure that research pursuits are based on a rationale that is informed by known scientific evidence.
Nevertheless, clinical research endeavours typically demand considerable time and other resources, and the knowledge and interpretation of scientific evidence that may provide a basis for hypotheses is constantly evolving; thus, it is imperative that data from studies are adequately interrogated. The key, however, is that this is done in a transparent manner, both in terms of reporting the full extent of exploratory analyses, and in tempering interpretations of findings arising from such exploratory pursuits.
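The risk of spurious findings from repeated testing can be made concrete with a small calculation. The sketch below is illustrative only and rests on simplifying assumptions that are not drawn from any of the trials discussed here: every test is independent, every null hypothesis is true, and each test uses a threshold of 0.05. The function name `familywise_error` is hypothetical.

```python
def familywise_error(k: int, alpha: float = 0.05) -> float:
    """Chance of at least one p < alpha across k independent tests
    when no real effect exists (assumption: tests are independent)."""
    return 1 - (1 - alpha) ** k


# Print the chance of at least one false positive for a few test counts.
for k in (1, 5, 10, 20):
    print(f"{k:>2} tests: {familywise_error(k):.3f}")
```

Under these assumptions the chance of at least one nominally significant result is about 40% after ten unplanned tests, which is why the distinction between prespecified and exploratory outcomes matters. Real outcome measures are usually correlated, so the exact figures would differ.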
The study published by Keller and colleagues(13) is not transparent about the distinction between the prespecified outcome measures and the additional analyses carried out. In fact, one such exploratory variable is misleadingly described as a “(i.e., primary outcome measure)”(13)(page 765). An internal SKB memo describes the aim to “effectively manage the dissemination of these data in order to minimise any potential negative commercial impact”(14). It described “no plans to publish data from Study 377” [a trial similar to Study 329 with similarly negative results](14) (pdf page 1), and that “Positive data from Study 329 will be published”(14) (pdf page 5). Despite repeated requests from us, GSK was not able to produce adequate evidence for an analytical plan outlining the rationale for the exploratory analyses. Thus, there is nothing to indicate that the additional exploratory measures came about from a renewed understanding of the scientific merits of the additional exploratory variables.
Professor Eriksson and Dr. Hieronymus go on to propose the “depressed mood” item as a more sensitive and appropriate measure of antidepressant efficacy than the total sum of the Hamilton Depression Rating Scale (HDRS)(15). They refer to their recently published analyses of pharmaceutical company studies of SSRIs for adult depression, in which “whereas 56% of 32 comparisons failed to reveal a significant difference between groups when HDRS sum was used as effect parameter, only 9% failed to detect a significant superiority of the active drug with respect to the “depressed mood” item”(15,16). However, whether the single “depressed mood” item is a more appropriate measure of antidepressant efficacy is debatable.
What can be made of the finding that statistical significance is reached on a single item – depressed mood – but not on the sum total of items representing the constellation of symptoms that we presently refer to as major depression(16)? Perusing the results presented in Table 2, almost all HDRS endpoint mean scores for the depressed mood item fall between ratings ‘1’ and ‘2’: the placebo arms mean scores are closer to ‘2’, i.e., ‘spontaneously reported verbally’, and the SSRI arms mean scores are closer to ‘1’, i.e., ‘indicated only on questioning’(16)(Table 2). This finding could be readily explained by SSRI-induced apathy, a common yet underappreciated effect of these drugs(17–23). Thus, patients experiencing SSRI-induced apathy could be less likely to spontaneously report a depressed mood than patients on placebo, all the while there is no substantial difference between the two in terms of their overall symptoms of depression.
Furthermore, while the effect size based on this single “depressed mood” item is described as moderate, the change of much greater magnitude is that of the endpoint versus baseline mean scores, for both the SSRI and placebo arms(16)(Table 2). This highlights the important role of placebo and non-specific factors in the SSRI response. The additional effect from SSRI versus placebo could be partly a result of unblinding due to adverse effects. Both clinician and patient participants can tell with a high degree of accuracy whether they have been assigned to a drug or placebo arm in a trial(24,25), and this is rarely reported in clinical studies(26). Further, adverse events have been shown to correlate with effect size in antidepressant trials(27,28). Of note, in an early study by Thomson, whereas 43 of 68 trials showed tricyclic agents to be superior to inert placebo, only 1 out of 7 trials showed the antidepressant to be superior relative to atropine as an active placebo(28).
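For readers who prefer the Thomson figures as proportions, the arithmetic can be sketched as follows. The counts are those quoted above (43 of 68 trials against inert placebo, 1 of 7 against atropine); the helper name `proportion` is hypothetical, and with only seven active-placebo trials the second proportion is necessarily imprecise.

```python
def proportion(successes: int, total: int) -> float:
    """Share of trials in which the antidepressant was superior."""
    return successes / total


inert = proportion(43, 68)   # tricyclic versus inert placebo
active = proportion(1, 7)    # tricyclic versus atropine as active placebo

print(f"superior to inert placebo:  {inert:.1%}")
print(f"superior to active placebo: {active:.1%}")
```

This gives roughly 63% against inert placebo and 14% against active placebo.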
As an alternative to symptom-based scales, more meaningful, patient-relevant measures include those that assess function and quality of life. In Study 329, none of these protocol-defined measures showed paroxetine to be more efficacious than placebo, including the clinical global impression mean score, autonomous function check list change, self perception profile change, or the sickness impact profile(29).
In conclusion, while exploratory analyses can yield useful information, they can be – and very often are – used to fish for statistically significant results that are presented in a misleading manner(1–12). It is worthwhile to explore more appropriate and meaningful alternatives to current popular measures to capture patient response to intervention. However, this necessitates full transparency, including access to clinical trial protocols and raw data. Otherwise, we will continue to subject our patients to interventions with a distorted impression of benefits and harms.
1. Dwan K, Altman DG, Cresswell L, Blundell M, Gamble CL, Williamson PR. Comparison of protocols and registry entries to published reports for randomised controlled trials. Cochrane Database Syst Rev. 2011;(1):MR000031.
2. Mathieu S, Boutron I, Moher D, Altman DG, Ravaud P. Comparison of registered and published primary outcomes in randomized controlled trials. JAMA. 2009 Sep 2;302(9):977–84.
3. Hughes S, Cohen D, Jaggi R. Differences in reporting serious adverse events in industry sponsored clinical trial registries and journal articles on antidepressant and antipsychotic drugs: a cross-sectional study. BMJ Open. 2014;4(7):e005535.
4. Chan A-W, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA. 2004 May 26;291(20):2457–65.
5. Melander H, Ahlqvist-Rastad J, Meijer G, Beermann B. Evidence b(i)ased medicine--selective reporting from studies sponsored by pharmaceutical industry: review of studies in new drug applications. BMJ. 2003 May 31;326(7400):1171–3.
6. Chan A-W, Altman DG. Identifying outcome reporting bias in randomised trials on PubMed: review of publications and survey of authors. BMJ. 2005 Apr 2;330(7494):753.
7. Bourgeois FT, Murthy S, Mandl KD. Outcome reporting among drug trials registered in ClinicalTrials.gov. Ann Intern Med. 2010 Aug 3;153(3):158–66.
8. Vedula SS, Bero L, Scherer RW, Dickersin K. Outcome reporting in industry-sponsored trials of gabapentin for off-label use. N Engl J Med. 2009 Nov 12;361(20):1963–71.
9. Rising K, Bacchetti P, Bero L. Reporting bias in drug trials submitted to the Food and Drug Administration: review of publication and presentation. PLoS Med. 2008 Nov 25;5(11):e217; discussion e217.
10. McGauran N, Wieseler B, Kreis J, Schüler Y-B, Kölsch H, Kaiser T. Reporting bias in medical research - a narrative review. Trials. 2010;11:37.
11. Saini P, Loke YK, Gamble C, Altman DG, Williamson PR, Kirkham JJ. Selective reporting bias of harm outcomes within studies: findings from a cohort of systematic reviews. BMJ. 2014;349:g6501.
12. Dwan K, Altman DG, Arnaiz JA, Bloom J, Chan A-W, Cronin E, et al. Systematic review of the empirical evidence of study publication bias and outcome reporting bias. PloS One. 2008;3(8):e3081.
13. Keller MB, Ryan ND, Strober M, Klein RG, Kutcher SP, Birmaher B, et al. Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial. J Am Acad Child Adolesc Psychiatry. 2001 Jul;40(7):762–72.
14. 19981014PositionPiece.pdf [Internet]. [cited 2015 Oct 25]. Available from: http://www.healthyskepticism.org/documents/documents/19981014PositionPie...
15. Anonymous. Study 329 did detect an antidepressant signal from paroxetine. The BMJ [Internet]. 2015 Oct 18 [cited 2015 Oct 25]; Available from: http://www.bmj.com/content/351/bmj.h4320/rr-14
16. Hieronymus F, Emilsson JF, Nilsson S, Eriksson E. Consistent superiority of selective serotonin reuptake inhibitors over placebo in reducing depressed mood in patients with major depression. Mol Psychiatry. 2015 Apr 28;
17. Barnhart WJ, Makela EH, Latocha MJ. SSRI-induced apathy syndrome: a clinical review. J Psychiatr Pract. 2004 May;10(3):196–9.
18. Bolling MY, Kohlenberg RJ. Reasons for quitting serotonin reuptake inhibitor therapy: paradoxical psychological side effects and patient satisfaction. Psychother Psychosom. 2004 Dec;73(6):380–5.
19. Fava M, Graves LM, Benazzi F, Scalia MJ, Iosifescu DV, Alpert JE, et al. A cross-sectional study of the prevalence of cognitive and physical symptoms during long-term antidepressant treatment. J Clin Psychiatry. 2006 Nov;67(11):1754–9.
20. Lee SI, Keltner NL. Antidepressant apathy syndrome. Perspect Psychiatr Care. 2005 Dec;41(4):188–92.
21. Opbroek A, Delgado PL, Laukes C, McGahuey C, Katsanis J, Moreno FA, et al. Emotional blunting associated with SSRI-induced sexual dysfunction. Do SSRIs inhibit emotional responses? Int J Neuropsychopharmacol Off Sci J Coll Int Neuropsychopharmacol CINP. 2002 Jun;5(2):147–51.
22. van Geffen ECG, van der Wal SW, van Hulten R, de Groot MCH, Egberts ACG, Heerdink ER. Evaluation of patients’ experiences with antidepressants reported by means of a medicine reporting system. Eur J Clin Pharmacol. 2007 Dec;63(12):1193–9.
23. Wongpakaran N, van Reekum R, Wongpakaran T, Clarke D. Selective serotonin reuptake inhibitor use associates with apathy among depressed elderly: a case-control study. Ann Gen Psychiatry. 2007;6:7.
24. Margraf J, Ehlers A, Roth WT, Clark DB, Sheikh J, Agras WS, et al. How “blind” are double-blind studies? J Consult Clin Psychol. 1991 Feb;59(1):184–7.
25. Rabkin JG, Markowitz JS, Stewart J, McGrath P, Harrison W, Quitkin FM, et al. How blind is blind? Assessment of patient and doctor medication guesses in a placebo-controlled trial of imipramine and phenelzine. Psychiatry Res. 1986 Sep;19(1):75–86.
26. Fergusson D, Glass KC, Waring D, Shapiro S. Turning a blind eye: the success of blinding reported in a random sample of randomised, placebo controlled trials. BMJ. 2004 Feb 21;328(7437):432.
27. Greenberg RP, Bornstein RF, Zborowski MJ, Fisher S, Greenberg MD. A meta-analysis of fluoxetine outcome in the treatment of depression. J Nerv Ment Dis. 1994 Oct;182(10):547–51.
28. Thomson R. Side effects and placebo amplification. Br J Psychiatry J Ment Sci. 1982 Jan;140:64–8.
29. Le Noury J, Nardo JM, Healy D, Jureidini J, Raven M, Tufanaru C, et al. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. The BMJ. 2015 Sep 16;351:h4320.
Competing interests: Authors on Restoring Study 329
The serious conflicts of interest revealed in Le Noury et al's paper and Doshi's commentary, with at least the potential for frank corruption, are surely just one aspect of a much larger problem: the gross over-prescribing of antidepressant drugs for what is commonly called ‘depression’ but could in many or even most cases be truthfully called ‘understandable unhappiness’. Placebo-controlled RCTs do not typically show more than a modest advantage for active vs placebo medication but, apart from Study 329, there are many other trials in which the advantage is either non-existent or visible only after much torturing of the data to make them say what the authors (or the manufacturers) hoped to find. When ‘active’ placebos are used (with side effects that make it harder for both doctors and patients to identify treatment groups) the advantage tends to be even less. The design of most positive antidepressant RCTs makes it difficult to know whether any overall advantage represents a small benefit for a large proportion of patients or a large benefit affecting mainly a rather small proportion suffering from what it is justifiable but no longer fashionable to regard as largely ‘endogenous’ depression. I suspect the latter is closer to the truth but whichever explanation is correct, the consistent reality is that many more patients respond to the placebo and non-specific effects of antidepressants than respond to their specific pharmacological effects.
As early as the 1970s, I drew attention in the BMJ and elsewhere to the probability that many more depressed patients (200-300 annually at a conservative estimate) were dying from deliberate tricyclic antidepressant overdoses than were likely to have been saved by antidepressants from committing suicide.[1] Since then, antidepressant prescribing has greatly increased without any obvious beneficial effect on suicide rates or lost work days due to depression. The comparative safety of SSRIs in overdosage has possibly aided the increase. So has the progressive medicalization and pathologising of normal human experience embodied in successive editions of the DSM.
The brain is an organ and capable, in principle, of dysfunction, like other organs. Such dysfunction, whose nature is still quite speculative despite all those colourful fMRI pictures, can presumably be the major factor in some psychiatric disorders, classic manic-depressive illness being an obvious candidate. It may be a factor, though probably a much smaller one, in some cases of apparently ‘reactive’ depression. However, my experience (which includes providing the psychiatric service at a large university) has been that depression that seems initially to have no obvious or sufficient cause in terms of life events and personality not infrequently becomes understandable after a few sessions when the patient feels able to admit to me (or to himself) that all is not well in his life. It surely requires rather strange philosophical principles to believe that medication should be among the primary responses to such personal conflicts and disappointments, other than through placebo and non-specific effects or the relief of insomnia.
Incidentally, Peter Doshi’s remark, in his commentary on this study, that supplies of placebo medication in Study 329 were compromised at one point because the placebo had passed its expiry date deserves a place in some future anthology of surreal (or anti-bureaucratic) jokes.
REFERENCES
1 Brewer C. Suicide with tricyclic antidepressants. Brit Med J 1976;2:110.
Competing interests: No competing interests
We read with interest this re-analysis of Study 329 (1) and applaud the authors for the perseverance with which they approached this question. This article clearly shows the importance of independent access to primary trial data. Although the Journal of the American Academy of Child and Adolescent Psychiatry (JAACAP) has stated that no editorial action is called for, we still hope that this publication will finally lead to the long-overdue retraction of the 2001 paper.
Although Study 329 might well be the most infamous example of biased reporting within the psychiatric literature, this practice is widespread. We have previously examined the reporting of randomized controlled trials of antidepressants for the treatment of anxiety disorders and depression in adults, which were submitted to the Food and Drug Administration (FDA) as part of an application for marketing approval (2,3). Sixty-one percent of all negative and questionable (not-positive) trials of antidepressants for depression and 44% of all not-positive trials for anxiety disorders were not published at all. Equally concerning, however, was the fact that the majority of published not-positive trials were reported as positive. Such a positive conclusion was accomplished either through outcome reporting bias (eg, changing primary outcomes) or through spin (concluding that the treatment is beneficial in spite of non-significant results for the primary outcome).
In case of outcome reporting bias, it is especially clear that the published articles present misleading results. For example, our analysis of antidepressant trials for anxiety disorders included Pfizer trial STL-N/S-95-003, which examined the efficacy of sertraline in the treatment of social anxiety disorder. The FDA drug application package for sertraline includes a memo by the medical team leader (Thomas Laughren), stating: “Since the sponsor acknowledged that this was a negative study, one not conducted for registration purposes, it was agreed that they needed to submit only a summary report” (note 1). However, in 2001, this trial was published as a success story by the British Journal of Psychiatry (4). A closer examination reveals that the significant results for sertraline were obtained only by combining two subscales of the Clinical Global Impression – Social Phobia scale with scores on the Social Phobia Scale, and dichotomizing this combined score into nonresponse versus response categories. The discrepancy between this paper’s conclusions and the company’s acknowledgement of a negative result in the FDA review is striking.
It would have been difficult to identify and prevent outcome reporting bias a decade ago. Clinical trial registration (eg, ClinicalTrials.gov) has now made this task easier and fortunately, there is a growing awareness of the perils of biased reporting. Although we did not examine primary trial data like the authors of this RIAT re-analysis, the FDA drug application packages clearly identify numerous negative trials, which, in the published literature, were reported as positive. However, in the 6 months since the publication of our paper on antidepressant trials for anxiety disorders and the 7 years since the publication of Turner et al.’s paper on trials for depression, none of the identified biased trial publications have been retracted.
As with Study 329, these biased articles continue to exist within the literature, with nothing to alert the unsuspecting reader that an independent, protocol-compliant analysis of the data, performed by the FDA, found that the trial did not, in fact, support the efficacy of the medication. Since publications with outcome reporting bias can be readily identified in our manuscripts, we encourage journal editors, pharmaceutical companies and the authors of these papers to retract these publications. Such an unequivocal stance against biased reporting will help ensure that the medical literature is a faithful representation of the true results and a dependable resource for researchers and clinicians.
Notes
1. From the same memo, however, it is clear that this study was initially intended to be submitted as primary support for the claim that sertraline is effective in the treatment of social anxiety disorder.
References
1. Le Noury J, Nardo JM, Healy D, et al. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. BMJ. 2015;351:h4320.
2. Roest AM, de Jonge P, Williams CD, de Vries YA, Schoevers RA, Turner EH. Reporting bias in clinical trials investigating the efficacy of second-generation antidepressants in the treatment of anxiety disorders: a report of 2 meta-analyses. JAMA Psychiatry. 2015;72(5):500–10.
3. Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selective publication of antidepressant trials and its influence on apparent efficacy. N Engl J Med. 2008;358(3):252–60.
4. Blomhoff S, Haug TT, Hellström K, Holme I, Humble M, Madsbu HP, et al. Randomised controlled general practice trial of sertraline, exposure therapy and combined treatment in generalised social phobia. Br J Psychiatry. 2001;179(1):23–30.
Competing interests: No competing interests
Re: Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence
THE CASE OF TRIAL 329 CONFIRMS THE NEED FOR ACCESS TO COMPLETE INDIVIDUAL PATIENT DATA
There is extensive selective reporting of antidepressant trials (1,2), and the revised publication of trial 329 according to the RIAT principle (3), using the clinical study report (CSR), identified several cases of suicidality that were missed by the FDA and the original Keller et al. (2001) paper (4).
We recently undertook a review of harms including suicidality (called suicidal or self-injurious behaviours in the RIAT paper) caused by newer antidepressants, using CSRs that included trial 329 (our trial 27) (5). We had obtained this CSR from the European Medicines Agency (EMA) plus its first appendix, which was the protocol (appendix A). The RIAT authors had access to all the appendices, of which D, G (individual patient listings of adverse events for all patients) and H (case report forms) were particularly important (6). It was stated within the documents of the CSR that we had obtained from EMA that additional appendices were ‘available upon request’, but the EMA hadn’t requested them. Moreover, we also did not have access to the CSR for the continuation or extension phase of this study, so our comparison is limited to the acute phase.
We compared our results for suicidality (our supplementary data C) (5) with the RIAT paper (its Appendix 3) (4) and confirmed that there was selective reporting within the CSR itself. The RIAT authors identified three additional suicidality events, one on paroxetine and two on imipramine (one of the patients on imipramine had previously had an event, which was mentioned in the CSR). Additionally, there were four suicidality events, which the RIAT authors identified using case report forms, one on placebo and three in two patients on paroxetine (one of these patients had previously had an event, mentioned in the CSR). The RIAT authors also considered the individual data in the case report forms for the Kiddie SADS scores and HAM-D data scores to confirm suicidal behaviour for these patients (see Table 1).
There were minor discrepancies between our analysis and theirs. Some were related to different methodology, e.g. they coded two events for one patient (329.003.00313, case 4, codes 4.1 and 4.2), while we recorded only the most severe event as per the methods we had in our protocol (the suicidal ideation and preparatory act - going to the roof to jump but being stopped before jumping - both took place on day 12).
Other differences were about judgment. For example, the RIAT authors reported that 11 patients became suicidal in the acute phase on paroxetine and one in the continuation phase. If we take into account their additional data, which we did not have access to, our numbers are 10 patients in the acute phase and two in the continuation phase. The RIAT authors assigned one patient (329.002.00058, case 1) to the taper phase of the study who commenced the continuation phase the day after having completed the acute phase and who had a suicidality event (an intentional overdose on 80 pills of Tylenol) two months later. They explain in their appendix that even though the patient was in the middle of the continuation phase, it appeared that “this case had stopped drug three days before the overdose, then overdosed, and was discontinued completely from the study….. On this basis we have put the case into taper”. The RIAT authors also discuss a possible case of suicidal behaviour on imipramine (329.010.00279, Case 2) where the patient was said to have strange thoughts (thinking abnormal), but as no narrative or additional information was available, we are less convinced, as strange thoughts could be about something other than suicide. In the coding dictionary Medical Dictionary for Regulatory Activities (MedDRA), there is a preferred term of ‘Thinking abnormal’ which is under the high level term of ‘Thinking disturbances.’
We applaud the authors for undertaking such a rigorous piece of work and finally correcting Study 329, 14 years after the original publication. Both their study (4) and our review (5) illustrate the importance of making all data available for accurate assessments of adverse effects of drugs. Access to only the main report of the CSR is clearly insufficient. Apart from appendices containing individual patient listings for all patients, access to case report forms is also essential, as some relevant events may never get coded as adverse events.
References
(1) Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selective Publication of Antidepressant Trials and Its Influence on Apparent Efficacy. N Engl J Med 2008; 358(3):252-260.
(2) Whittington CJ, Kendall T, Fonagy P, Cottrell D, Cotgrove A, Boddington E. Selective serotonin reuptake inhibitors in childhood depression: systematic review of published versus unpublished data. The Lancet 2004; 363(9418):1341-1345.
(3) Doshi, P., Dickersin, K., Healy, D., Vedula, S. S., & Jefferson, T. Restoring invisible and abandoned trials: a call for people to publish the findings. BMJ 2013;346:f2865.
(4) Le Noury J, Nardo JM, Healy D, et al. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. BMJ 2015;351:h4320.
(5) Sharma T, Guski LS, Freund N, Gøtzsche PC. Suicidality and aggression during antidepressant treatment: systematic review and meta-analyses based on clinical study reports. BMJ 2016;352:i65.
(6) GSK, Paroxetine - paediatric and adolescent patients. Clinical study reports: Unipolar major depression study 329. Available at: http://www.gsk.com/en-gb/media/resource-centre/paroxetine/paroxetine-pae... (accessed 20 July 2015).
Competing interests: TS received an honorarium from The BMJ Publishing Group to externally validate some of the suicidality data from Appendix 3 of the RIAT study during its peer review process.