
Open access (CC BY-NC)

Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence

BMJ 2015; 351 doi: (Published 16 September 2015) Cite this as: BMJ 2015;351:h4320

Re: Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. Response from the authors of the original Study 329

The BMJ article entitled “Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence”[1] reanalyzed data from the original Paroxetine 329 study[2], a double-blind, placebo-controlled comparison of paroxetine to imipramine. Paroxetine 329 was designed between 1991 and 1992. Subject enrollment began in 1994 and was completed in 1997. Academic psychiatrists designed the study, with very little change by GSK, which funded it in an academic/industry partnership. The goal of the study was to advance the treatment of depression in youth; it was not primarily a drug registration trial.

Overarching issues with “Restoring Study 329” include:

1. Authors of “Restoring Study 329” evidenced both bias and a lack of blind ratings. In a recent article about “Restoring Study 329” in the Chronicle of Higher Education, Dr. Jureidini is quoted as saying: “We don’t think we’ve done the definitive analysis. It’s not something that can be done absolutely objectively, particularly the interpretation of harms. We can’t protect ourselves completely from our own biases.”[3]

Biases are a serious consideration for “Restoring Study 329” because Dr. Jureidini, as he declares in a footnote on the subject of “Competing interests”, served as an expert witness for plaintiffs’ lawyers in lawsuits against GSK related to Study 329. In that work Dr. Jureidini would have studied all available data, looking at both efficacy and suicidal side effects, using many different approaches to best capture any potential harms.

2. The “restoring invisible and abandoned trials” (RIAT) approach to reanalyzing published studies may provide general guidelines, but we could not find publications or available working RIAT documents on detailed protocols. Lack of detailed methodology is a serious concern because there is general consensus in the field that there is not, nor ever will be, a single correct approach to reanalysis. Small differences in analysis frequently make big differences in statistical results and conclusions.

3. “Restoring Study 329” did not consider the knowledge available 24 years ago, when Paroxetine 329 was developed and performed. Clinical research methodology has evolved considerably in the past two decades. These aspects are addressed in comments by established investigators not involved in Paroxetine 329. For example, referring to “Restoring Study 329”, as reported in Psychiatric News Alert [4], Mark Olfson said, “However, the new reanalysis does not alter the totality of clinical trial evidence that continues to support the safety and efficacy of SSRIs for adolescent depression.” And Daniel Pine said, “We have known for some time that antidepressant medications have both significant benefits for some children as well as significant risks for other children. This new analysis really does nothing to change this knowledge, and provides no new insights into what we have known about these medications for the past few years.”


Antidepressants considered as a group are superior to placebo for the treatment of anxiety disorders and for depression in adolescents, with similar overall response rates in anxiety and depression. [5]

The two primary outcome measures in Paroxetine 329 did not reach statistical significance. The abstract of the published paper noted: "The two primary outcome measures were endpoint response (Hamilton Rating Scale for Depression [Ham-D] score ≤ 8 or ≥50% reduction in baseline HAM-D) and change from baseline HAM-D score." In Table 2, the p value for the first primary endpoint (Ham-D score ≤ 8 or ≥50% reduction in baseline HAM-D) was reported at p < 0.11 for paroxetine versus placebo. In the same table, the p value for change in HAM-D total score is reported at p < 0.13. While both outcomes were in the direction of a better response for paroxetine over placebo, neither reached our critical alpha level of 0.05. This is clear in the abstract and text of the publication.
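The responder definition and the alpha threshold quoted above can be written as simple rules. The following is a minimal sketch in Python; the function names and the example scores are hypothetical illustrations and are not data from the study.

```python
ALPHA = 0.05  # critical alpha level stated in the text


def is_responder(baseline: float, endpoint: float) -> bool:
    """Apply the quoted rule: endpoint HAM-D <= 8 OR >= 50% reduction from baseline."""
    if baseline <= 0:
        raise ValueError("baseline HAM-D score must be positive")
    reduction = (baseline - endpoint) / baseline  # fractional drop from baseline
    return endpoint <= 8 or reduction >= 0.5


def is_significant(p_value: float, alpha: float = ALPHA) -> bool:
    """A result counts as statistically significant only if p is below alpha."""
    return p_value < alpha


# Hypothetical patients (baseline score, endpoint score), for illustration only.
for baseline, endpoint in [(20, 10), (20, 12), (12, 8)]:
    print(baseline, endpoint, is_responder(baseline, endpoint))

# The two p values reported for the primary outcomes are both above 0.05.
print(is_significant(0.11), is_significant(0.13))
```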

In the interval from when we planned the study to when we approached the data analysis phase, but prior to the blind being broken, the academic authors, not the sponsor, added several additional measures of depression as secondary outcomes. We did so because the field of pediatric-age depression had reached a consensus that the Hamilton Depression Rating Scale (our primary outcome measure) had significant limitations in assessing mood disturbance in younger patients. Taking this into consideration, and in advance of breaking the blind, we added secondary outcome measures agreed upon by all authors of the paper. We found statistically significant indications of efficacy in these measures. These secondary outcomes were clearly reported as separate from the negative primary outcomes.

Thus, the authors of “BMJ-Restoring Study 329” were incorrect in stating that “Both before and after breaking the blind, however, the sponsors made changes to the secondary outcomes as previously detailed. We could not find any document that provided any scientific rationale for these post hoc changes and the outcomes are therefore not reported in this paper.” Rather, secondary outcomes were decided by the authors prior to the blind being broken. Secondary outcome measures are frequently, and appropriately, included in study reports even when the primary measures do not reach statistical significance. The authors of “Restoring Study 329” state “there were no discrepancies between any of our analyses and those contained in the CSR [clinical study report]”. The disagreement on treatment outcomes rests on this arbitrary and non-blind dismissal of our secondary outcome measures.

In the abstract we stated “Conclusions: Paroxetine is generally well tolerated and effective for major depression in adolescents.” In this sample and with the state of knowledge at the time, it was justified and appropriate.

Our goal was to learn as much as possible about the use of this compound in youth, so that we could understand what role it (and by extension other SSRIs) could play in the treatment of adolescents with MDD. For us the question was: given (1) the data distribution and statistical results for efficacy and side effects that we saw in the study, and (2) the well-replicated research evidence of efficacy and relative safety of paroxetine in adults, what conclusions should be drawn from our data? The clinical outcomes were substantially in the right direction, with a number of them reaching the 0.05 level of statistical significance. The clinical results comparing paroxetine to placebo and imipramine to placebo paralleled those reported in adults. Side effects were similar to what was known in adults. Thus we reached the conclusions reported.


The “Restoring Study 329” reanalysis uses the FDA MedDRA approach to side effect data, which was not available when our study was done. That one can do better reanalyzing adverse event data using refinements in approach that have accrued in the 15 years since a study’s publication is unsurprising and is not a valid critique of the Paroxetine 329 study as performed and presented.

We emphatically disagree with the “Restoring Study 329” position that statistics are not useful in understanding adverse side effects and that each individual reader should decide for herself when a difference in rates of adverse side effects is meaningful. Statistics offer several approaches to the question of when there is a meaningful difference in side effect rates between different treatments.
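As one illustration of such an approach, Fisher's exact test compares the proportion of patients with an adverse event in two treatment arms. The sketch below is a minimal pure-Python version offered under stated assumptions: the function name and the example counts are hypothetical, and nothing here is taken from Study 329 or from either analysis of it.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Rows are treatment arms; columns are (event, no event).
    """
    row1, row2 = a + b, c + d
    col1 = a + c                      # total number of events
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in the first arm.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: 10 of 90 patients with an event in one arm, 3 of 90 in the other.
p = fisher_exact_two_sided(10, 80, 3, 87)
print(f"two-sided p = {p:.3f}")
```

An exact test is a reasonable choice when event counts are small, as is common for adverse events, because it does not rely on large-sample approximations.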

Specific methodology problems in the reanalysis of the “harm” data are as follows: 1) the authors chose a non-random subsample of 85 subjects who were withdrawn from the study plus 8 subjects whom the authors labeled “suicidal” based on their inspection of the data; 2) a different instrument was utilized to re-score the harm effects and only one of the authors was trained in the scoring of the instrument; 3) some side effects were arbitrarily interpreted (e.g., upper respiratory symptoms were labeled as “dystonia” and emotional lability was labeled as “suicidality”); 4) in the original paper, side effects were analyzed only during the acute phase, but in the reanalysis, the authors analyzed them during the acute phase as well as the tapering and follow-up phases of the study; 5) in the original study patients were interviewed face-to-face, whereas the reanalysis was based only on the interpretation of the data; and 6) importantly, the two authors were not blind to patients’ randomization status.

Suicidal ideation and attempts

Our field’s understanding of how to approach analysis of suicidal ideation, suicide attempts, and completed suicide has advanced enormously since publication of Study 329. Two definitive reanalyses of suicidality with antidepressants in adolescents are available. 1) The 2003 FDA reanalysis of all RCT data from SSRI studies in youth for all indications.[6] In the FDA analysis, the average risk ratio for SSRI- versus placebo-treated subjects was 1.96 (CI: 1.28-2.98). Considered separately, Study 329 did not reach statistical significance for increased suicidality (CI: 0.42-33.21). 2) The methodologically superior reanalysis by Bridge and colleagues, which also found that in Study 329 there was no significant risk difference between paroxetine and placebo.[7]
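The reading of the two confidence intervals above can be checked with a short sketch. A risk ratio interval that contains 1 is compatible with no difference between arms, which is why the pooled estimate is significant and the Study 329 estimate is not. The function names are hypothetical, and the back-calculated point estimate assumes the interval is symmetric on the log scale, which the text does not state.

```python
from math import sqrt


def excludes_no_effect(lower: float, upper: float, null_value: float = 1.0) -> bool:
    """True if the confidence interval lies entirely on one side of the null value."""
    return lower > null_value or upper < null_value


def implied_point_estimate(lower: float, upper: float) -> float:
    """Geometric mean of the bounds.

    Assumption: the interval was built symmetrically on the log scale.
    """
    return sqrt(lower * upper)


pooled_ci = (1.28, 2.98)      # FDA pooled risk ratio interval quoted in the text
study_329_ci = (0.42, 33.21)  # Study 329 interval quoted in the text

print(excludes_no_effect(*pooled_ci))     # interval lies above 1
print(excludes_no_effect(*study_329_ci))  # interval spans 1
print(round(implied_point_estimate(*pooled_ci), 2))
```

Under that log-symmetry assumption, the pooled bounds imply a point estimate close to the reported 1.96, which suggests the assumption is reasonable for these figures.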

Paroxetine treatment in youth does not appear to differ significantly from other SSRIs in the risk of suicidal ideation or attempts, and whether SSRIs increase or decrease completed suicide remains an open question. [8-12]


We strongly support efforts to make anonymized raw data from scientific studies available for reanalysis. The validity of “Restoring Study 329”, however, is doubtful because of author bias and substantial problems with RIAT methodology. To describe Paroxetine 329 as “misreported” is pejorative and wrong, both by the state-of-the-art research methods of 24 years ago and retrospectively from the standpoint of current best practices.


Martin B. Keller, M.D.
Boris Birmaher, M.D.
Gabrielle A. Carlson, M.D.
Gregory N. Clarke, Ph.D.
Graham J. Emslie, M.D.
Harold Koplewicz, M.D.
Stan Kutcher, M.D.
Neal Ryan, M.D.
William H. Sack, M.D.
Michael Strober, Ph.D.

1. Le Noury J, Nardo JM, Healy D, et al. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. BMJ 2015;351:h4320.
2. Keller MB, Ryan ND, Strober M, et al. Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial. J Am Acad Child Adolesc Psychiatry 2001;40(7):762-72.
3. Basken P. Landmark analysis of an infamous medical study points out the challenges of research oversight. Chronicle of Higher Education 2015.
4. Reanalysis of JAACAP Study on Paroxetine Sparks Controversy. Psychiatric News Alert 2015.
5. Bridge JA, Iyengar S, Salary CB, et al. Clinical response and risk for reported suicidal ideation and suicide attempts in pediatric antidepressant treatment: a meta-analysis of randomized controlled trials. JAMA 2007;297(15):1683-96.
6. Hammad TA, Laughren T, Racoosin J. Suicidality in pediatric patients treated with antidepressant drugs. Arch Gen Psychiatry 2006;63(3):332-9.
7. Bridge JA, Birmaher B, Iyengar S, et al. Placebo response in randomized controlled trials of antidepressants for pediatric major depressive disorder. Am J Psychiatry 2009;166(1):42-9.
8. Gibbons RD, Brown CH, Hur K, et al. Early evidence on the effects of regulators' suicidality warnings on SSRI prescriptions and suicide in children and adolescents. Am J Psychiatry 2007;164(9):1356-63.
9. Olfson M, Shaffer D, Marcus SC, et al. Relationship between antidepressant medication treatment and suicide in adolescents. Arch Gen Psychiatry 2003;60(10):978-82.
10. Gibbons RD, Hur K, Bhaumik DK, et al. The relationship between antidepressant prescription rates and rate of early adolescent suicide. Am J Psychiatry 2006;163(11):1898-904.
11. Valuck RJ, Libby AM, Sills MR, et al. Antidepressant treatment and risk of suicide attempt by adolescents with major depressive disorder: a propensity-adjusted retrospective cohort study. CNS Drugs 2004;18(15):1119-32.
12. Gibbons RD, Mann JJ. Strategies for quantifying the relationship between medications and suicidal behaviour: what has been learned? Drug Saf 2011;34(5):375-95.

Competing interests: Please see attachments to this response

18 January 2016
Martin B Keller
Boris Birmaher, M.D., Gabrielle A. Carlson, MD, Gregory N. Clarke, Ph.D., Graham J. Emslie, M.D., Harold Koplewicz, M.D., Stan Kutcher, M.D., Neal Ryan, M.D., William H. Sack, M.D., Michael Strober, Ph.D.
attn: Martin B Keller, MD, 700 Butler Drive, Blumer 120, Providence, RI 02906, USA