Statistical analysis in the SWEPIS trial is flawed
I read the article reporting the SWEPIS trial (1) with great interest. The trial was stopped early and unfortunately appears to suffer from serious errors in its conduct, analysis, and reporting.
The authors state that “The trial was conducted according to the CONSORT guidelines” (1). Strictly speaking, this statement is not meaningful, as the CONSORT guidelines are guidelines for “reporting”, not for “conducting”, trials (2). In any case, according to CONSORT, any use of group sequential statistical methods (interim analyses) should be pre-specified in the trial protocol, and “Authors should report whether they or a data monitoring committee took multiple “looks” at the data and, if so, how many there were, what triggered them, the statistical methods used (including any formal stopping rule), and whether they were planned before the start of the trial” (2). The authors state that “An interim analysis was planned when 50% of the women had been recruited and had delivered” (1). However, the trial was stopped when only 2762 (27.5%) of the planned 10038 women had been recruited. It is also stated that “The statistical analyses were carried out according to a prespecified analysis plan”, but the plan appears not to have been published. The reporting of the trial is therefore deficient.
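To illustrate why uncorrected interim looks matter, the following Monte Carlo sketch shows how repeatedly testing accumulating data at a nominal two-sided alpha of 0.05 inflates the overall type I error rate. All parameters here (number of looks, group sizes, event rate) are illustrative assumptions, not the SWEPIS design:

```python
# Illustrative simulation (assumed parameters, not SWEPIS data):
# under the null hypothesis, taking several uncorrected "looks"
# at accumulating data raises the chance of a spurious P < 0.05.
import math
import random

def z_test_p(events_a, events_b, n_per_arm):
    """Two-sided p-value for a difference in two proportions (normal approximation)."""
    p_pool = (events_a + events_b) / (2 * n_per_arm)
    se = math.sqrt(max(2 * p_pool * (1 - p_pool) / n_per_arm, 1e-12))
    z = (events_a / n_per_arm - events_b / n_per_arm) / se
    # 2 * (1 - Phi(|z|)) via the error function
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def trial_rejects(looks, n_per_look, event_rate, rng):
    """Simulate one null trial; True if ANY look yields p < 0.05."""
    a = b = n = 0
    for _ in range(looks):
        a += sum(rng.random() < event_rate for _ in range(n_per_look))
        b += sum(rng.random() < event_rate for _ in range(n_per_look))
        n += n_per_look
        if z_test_p(a, b, n) < 0.05:
            return True
    return False

rng = random.Random(1)
sims = 2000
results = {}
for looks in (1, 5):
    rejections = sum(trial_rejects(looks, 200, 0.1, rng) for _ in range(sims))
    results[looks] = rejections / sims
    print(f"{looks} look(s): empirical type I error ~ {results[looks]:.3f}")
```

With a single analysis the false positive rate stays near the nominal 5%; with five uncorrected looks it roughly doubles or triples, which is exactly why group sequential designs adjust their stopping boundaries.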
In the results section the first statement reads: “On 2 October 2018 the Data and Safety Monitoring Board strongly recommended the SWEPIS steering committee to stop the study owing to a statistically significant higher perinatal mortality in the expectant management group. ... (five stillbirths and one early neonatal death; P=0.03).” However, the reported significance test was not corrected for multiple “looks” at the data, so the P value is misleading, as is the description of the finding as “statistically significant” (2). No references to the rich statistical literature on trials stopped early are listed in the section “Sample size and statistical analyses”. Theory and simulation suggest that trials stopped early for benefit “systematically overestimate treatment effects” (3). A systematic review showed that this bias was greatest in smaller studies, with effect estimates as much as 5-15 times too high when the total number of events was less than 20 (3). In the SWEPIS trial the number of events is as low as six. The statistical analysis is therefore flawed.
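The overestimation mechanism can also be demonstrated by simulation. In the sketch below (all event rates, group sizes, and the stopping rule are illustrative assumptions, not the SWEPIS protocol), trials with a true but modest benefit are stopped at the first interim look showing a nominally significant result in favour of treatment; the effect estimates from the stopped trials are then compared with those from trials that ran to completion:

```python
# Illustrative simulation (assumed parameters, not SWEPIS data):
# stopping at the first interim look with nominal p < 0.05 in favour
# of treatment selects extreme estimates and inflates the apparent effect.
import math
import random

def sim_trial(rng, looks=4, n_per_look=250, p_ctrl=0.04, p_trt=0.03):
    """Return (risk difference estimate, stopped_early) for one simulated trial."""
    trt = ctrl = n = 0
    for look in range(1, looks + 1):
        trt += sum(rng.random() < p_trt for _ in range(n_per_look))    # treatment events
        ctrl += sum(rng.random() < p_ctrl for _ in range(n_per_look))  # control events
        n += n_per_look
        # naive two-proportion z-test at an uncorrected alpha of 0.05
        p_pool = (trt + ctrl) / (2 * n)
        se = math.sqrt(max(2 * p_pool * (1 - p_pool) / n, 1e-12))
        z = (ctrl / n - trt / n) / se
        p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
        if look < looks and p < 0.05 and ctrl > trt:
            return (ctrl - trt) / n, True   # stopped early for apparent benefit
    return (ctrl - trt) / n, False

rng = random.Random(2)
early, completed = [], []
for _ in range(4000):
    est, stopped = sim_trial(rng)
    (early if stopped else completed).append(est)

true_rd = 0.04 - 0.03  # true risk difference under the assumed rates
print(f"true risk difference:             {true_rd:.4f}")
print(f"mean estimate, stopped early:     {sum(early) / len(early):.4f}")
print(f"mean estimate, ran to completion: {sum(completed) / len(completed):.4f}")
```

The stopped-early trials report a mean risk difference several times the true value, mirroring the pattern described in the systematic review (3): the fewer the events at the time of stopping, the larger the exaggeration.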
Perhaps there are good reasons why some methodologists have stated: “Indeed, we may ask whether trials should ever stop early for apparent benefit.” (4)
Until the above issues have been clarified, I suggest that the authors withdraw the article, as the findings of the study may be unreliable. Until a reliable re-analysis has been conducted, no decisions should be based on this study.
At a more general level, “investigators, journal editors, and clinical experts are not mindful of the problematic inferences that may arise from truncated RCTs. Top journals continue to publish results of trials stopped early but do not require authors [...] to report details that would allow readers to carefully evaluate the decision to stop early.” (5) Thus, if the authors of the present article decide not to withdraw it, I suggest the editors of the BMJ ensure that readers get access to the pre-specified analysis plan and that any errors in the published version are corrected. For example, the first sentence in the results section of the abstract, “The study was stopped early owing to a significantly higher rate of perinatal mortality in the expectant management group” (1), should be corrected to read something like “The study was stopped early owing to an erroneous calculation and interpretation of statistical significance in relation to one of many secondary outcomes. The results are thus not reliable”.
The Research Unit for General Practice in Copenhagen
Department of Public Health
Faculty of Health Sciences
University of Copenhagen
1. Wennerholm UB, Saltvedt S, Wessberg A, et al. Induction of labour at 41 weeks versus expectant management and induction of labour at 42 weeks (SWEdish Post-term Induction Study, SWEPIS): multicentre, open label, randomised, superiority trial. BMJ 2019;367:l6131. doi: 10.1136/bmj.l6131
2. Moher D, Hopewell S, Schulz KF, et al. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ 2010;340:c869. doi: 10.1136/bmj.c869
3. Bassler D, Briel M, Montori VM, et al. Stopping randomized trials early for benefit and estimation of treatment effects: systematic review and meta-regression analysis. JAMA 2010;303(12):1180-7. doi: 10.1001/jama.2010.310
4. Guyatt GH, Briel M, Glasziou P, et al. Problems of stopping trials early. BMJ 2012;344:e3863. doi: 10.1136/bmj.e3863. Erratum in: BMJ 2014;348:319.
5. Montori VM, Devereaux PJ, Adhikari NK, et al. Randomized trials stopped early for benefit: a systematic review. JAMA 2005;294(17):2203-9.
Competing interests: No competing interests