Study claiming Tamiflu saved lives was based on “flawed” analysis
BMJ 2014; 348 doi: https://doi.org/10.1136/bmj.g2228 (Published 19 March 2014) Cite this as: BMJ 2014;348:g2228
All rapid responses
Historically, medical journals have served as neutral forums for the presentation and discussion of scientific issues. BMJ has in recent years rejected this historical model by taking an activist role in advocacy campaigns such as #alltrials and the current "investigations" of the regulatory history of Tamiflu. Whether or not BMJ is ultimately found to be on the "right" side of these issues, I believe that the inherent conflict of interest between the journal's putative role as a neutral forum for scientific discussion and the advocacy roles taken by its editors will harm the journal's reputation.
The present episode, in which an opinionated criticism of a paper on one side of the Tamiflu debate was written with the actively solicited input of those on the other side of that debate, illustrates the problem caused by these conflicts of interest. The authors of the paper were not even given the opportunity to rebut the journal's criticism in the same issue in which the criticism appeared. Given this level of hardball partisanship, to what extent can any article published in the BMJ be considered truly peer-reviewed? Obviously, the editors have the power to choose reviewers who will render decisions that they agree with. And it is obvious that on many of these issues, the editors have a dog in the fight.
In the long run, it is the reputation of the journal that will suffer.
Competing interests: No competing interests
On Wednesday 19th March 2014, researchers from the PRIDE Consortium(1) published the first outputs from a project investigating the effectiveness of neuraminidase inhibitors against outcomes of public health importance during the 2009 pandemic, in The Lancet Respiratory Medicine(2). The headline results suggested that neuraminidase inhibitors were associated with statistically significant reductions in mortality: overall adjusted odds ratio (OR)=0.81 (95% CI 0.70–0.93; p=0.0024) vs. no treatment; and OR=0.50 (95% CI 0.37–0.67; p<0.0001) if treatment was started within two days of symptom onset.
Within 48 hours, the BMJ issued an article written by a staff journalist, which claimed that the new study “was based on flawed analysis”(3). Zosia Kmietowicz had contacted Dr Mark Jones, University of Queensland, who is also working with the Cochrane Collaboration on another project related to neuraminidase inhibitors. In turn, Jones had provided a detailed statistical critique of the PRIDE study, which formed the centerpiece of Kmietowicz’s article. The PRIDE Consortium were not forewarned about the article and, rather more importantly, not offered any a priori right of reply, as would normally be the case during post-publication correspondence. Faced with such a one-sided critique of its work, the PRIDE Consortium had no option but to post its initial rebuttal in the BMJ(4). There has since been a further critique from Jones and a further statistical rebuttal from the PRIDE Consortium(4).
Thus, the correspondence and debate relating to a major publication in a Lancet Group journal has been played out in the pages of the BMJ, fronted by an entirely one-sided article from a BMJ staff journalist. The major question here seems to be the propriety of the BMJ and Dr Jones in going beyond a reasonable response to a press release by asking potential opponents for a detailed statistical critique, without offering the authors of the study any right to reply alongside. A more conventional and considerably more ethical approach would have been to submit correspondence post-publication to The Lancet Respiratory Medicine, which could then have considered the response in the normal way, including offering the PRIDE Consortium a realistic period of time to consider the critique and write a rejoinder.
Jonathan S. Nguyen-Van-Tam
Senior Author, PRIDE Consortium
University of Nottingham, UK
1. http://www.nottingham.ac.uk/research/groups/healthprotection/projects/pr... (last accessed 1 April 2014)
2. Muthuri SG, Venkatesan S, Myles PR, et al. (2014) Effectiveness of neuraminidase inhibitors in reducing mortality in patients admitted to hospital with influenza A H1N1pdm09 virus infection: a meta-analysis of individual participant data. Lancet Respir Med; published online March 19. http://dx.doi.org/10.1016/S2213-2600(14)70041-4
3. Kmietowicz Z. Study claiming Tamiflu saved lives was based on “flawed” analysis. BMJ 2014; 348 doi: http://dx.doi.org/10.1136/bmj.g2228 (Published 19 March 2014)
Competing interests: Senior author of the paper Muthuri et al. (2014) that is being critiqued
Dr Jones has misunderstood our methods as described in the paper Muthuri et al. (2014) and our response to his previous critique.
Our published paper reported the results of two separate survival analyses using a Cox regression shared frailty model. The first survival analysis was prompted by the question of whether treatment with NAI antivirals was associated with a reduction in mortality as compared to no NAI antiviral treatment. For this analysis we followed the exact approach suggested by Dr Jones, i.e. we modelled NAI treatment as a time-dependent covariate that, ‘equals 0 while the patient is untreated, then becomes 1 when treatment begins’ (with NAI untreated patients being coded as ‘0’ for the entire duration of follow-up). This analysis yielded an adjusted hazard ratio (HR) of 0.51 (95% confidence intervals, CI: 0.45-0.58); p<0.0001.
The survival curves in Figure 2, on the other hand, do not relate to this first survival analysis; a second survival analysis was prompted by the question of whether the time to treatment initiation is associated with mortality among NAI antiviral-treated patients only. For this second survival analysis the effects associated with time to initiation of treatment were explored by stratifying treatment by categories denoting ‘time to treatment initiation from illness onset’. This analysis showed an incremental increase in the mortality hazard with each day’s delay in initiation of treatment up to day 5, as compared with treatment initiated within 2 days of symptom onset [adjusted HR 1.23 (95% CI: 1.18-1.28)]; p<0.0001.
We fear Dr Jones has again misread and/or misinterpreted our paper when he claims that our 'so-called time-dependent analysis (hazard ratio 0.51) is impossible compared to your analysis where treatment is assumed time-independent (relative risk 0.81)'. The risk estimate he quotes, [adjusted odds ratio (OR): 0.81 (95% CI: 0.70-0.93); p=0.0024] is an adjusted odds ratio obtained from a generalised linear mixed model. A direct comparison of this to the adjusted hazard ratio [adjusted HR: 0.51 (95% CI: 0.45-0.58); p<0.0001] obtained from the Cox regression shared frailty model is not appropriate.
Whilst we agree that the paper by Beyersmann et al. (2008) cited by Dr Jones offers mathematical proof that a time-dependent analysis should diminish the effect size in favour of treatment, this is predicated on the assumption that the only time-dependent bias at work is immortal time bias. Other time-dependent biases may artificially increase the hazard associated with treatment; for example, selection biases related to illness severity or stage of illness. We have repeatedly made the point, in the paper and in our previous response, that severely ill patients were frequently diagnosed with influenza late in the course of the illness, and that antiviral drugs were similarly started late in patients who by that stage may have had little chance of survival. Therefore, performing time-dependent analyses will not only remove immortal time bias; it can reduce other forms of time-dependent bias. Thus, the results from our time-dependent analysis are entirely realistic and our statistical approach is sound.
We hope we have assured Dr Jones that we have already conducted a proper time-dependent analysis. We accept the limitation of missing data on timing of treatment in our pooled sample. This is precisely why, even though the Cox regression model is generally considered superior to logistic regression models (Kleinbaum and Klein, 2012), we opted to use a multilevel logistic regression model as our primary analysis strategy, so as to include all treated patients in the analysis. Even then, we observed a statistically significant association between NAI antiviral use (irrespective of the stage of illness at which it was administered) and reduced mortality [adjusted OR: 0.81 (95% CI: 0.70-0.93); p=0.0024]. In the case of early NAI antivirals administered within 2 days of illness onset, the association with reduced mortality was even more marked [adjusted OR: 0.50 (95% CI: 0.37-0.67); p<0.0001].
References
Muthuri SG, Venkatesan S, Myles PR, et al. (2014) Effectiveness of neuraminidase inhibitors in reducing mortality in patients admitted to hospital with influenza A H1N1pdm09 virus infection: a meta-analysis of individual participant data. Lancet Respir Med; published online March 19. http://dx.doi.org/10.1016/S2213-2600(14)70041-4
Beyersmann J, Gastmeier P, Wolkewitz M, Schumacher M. (2008) An easy mathematical proof showed that time-dependent bias inevitably leads to biased effect estimation. Journal of Clinical Epidemiology 61: 1216-1221.
Kleinbaum DG, Klein M. Survival Analysis: A Self-Learning Text. 3rd ed. New York: Springer, 2012.
Competing interests: Co-authors of the paper Muthuri et al. (2014) that is being critiqued
Thank you for your explanation of how you conducted your time-dependent analysis. It is now clear why you have obtained such a biased estimate of treatment effect. By ignoring the time prior to NAI treatment you have increased the immortal time bias, not eliminated it. In fact you have not included NAI treatment as a time-dependent exposure; you have simply begun follow-up at different points in time for each treatment group. You have failed to take into account that patients in both treatment groups needed to survive long enough to reach hospital. Once they reached hospital they needed to survive long enough to get the opportunity to receive NAI treatment. The way to eliminate immortal time bias is to begin follow-up at the same time in each treatment group, at the logical beginning of follow-up, which is hospital admission. Assuming all patients were untreated prior to admission, all patients then begin follow-up untreated. Once a patient begins NAI treatment, they join the NAI treatment group. This analysis can easily be conducted using a Cox regression model in which NAI treatment is a time-dependent covariate that equals 0 while the patient is untreated, then becomes 1 when treatment begins. Van Walraven et al. state that time-dependent bias is always in favour of treatment, and Beyersmann et al. prove this mathematically. This is why your result for your so-called time-dependent analysis (hazard ratio 0.51) is impossible compared with your analysis in which treatment is assumed time-independent (relative risk 0.81).
The bias always favours the treatment group because the time from initiation of follow up to initial treatment exposure is incorrectly allocated to the treatment group in an analysis that incorrectly includes treatment exposure as time-independent. This has the effect of artificially inflating the number of patients at risk in the treatment group and conversely artificially deflating the number of patients at risk in the non-treatment group. This makes the estimates of mortality over time too low for the treatment group and too high for the non-treatment group. An analysis that correctly includes treatment exposure as time-dependent therefore diminishes any treatment effect in favour of treatment or even changes its direction.
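The mechanism described above can be checked with a small simulation (an illustrative Python sketch, not the PRIDE data or analysis code; all parameters are arbitrary). With no true treatment effect, attributing each treated patient's entire follow-up to the treated group produces an apparent benefit, while splitting person-time at treatment start does not:

```python
import random

random.seed(1)

def simulate(n=20000, follow_up=30.0, rate=0.05, mean_tx_time=5.0):
    """Simulate survival with NO true treatment effect.

    Each patient has an exponential death time (hazard `rate`) and a
    scheduled treatment time; treatment actually starts only if the
    patient is still alive then (the source of immortal time).
    Returns death-rate ratios (treated/untreated) under two schemes.
    """
    # person-time and deaths, indexed [untreated, treated]
    naive_pt, naive_d = [0.0, 0.0], [0, 0]
    td_pt, td_d = [0.0, 0.0], [0, 0]
    for _ in range(n):
        death = random.expovariate(rate)             # true death time
        tx = random.expovariate(1.0 / mean_tx_time)  # scheduled treatment
        end = min(death, follow_up)
        died = death <= follow_up
        treated = tx < end                           # survived to treatment
        # (a) naive, time-independent: whole follow-up goes to one group
        g = 1 if treated else 0
        naive_pt[g] += end
        naive_d[g] += died
        # (b) time-dependent: split person-time at treatment start
        if treated:
            td_pt[0] += tx               # pre-treatment time is untreated
            td_pt[1] += end - tx
            td_d[1] += died              # any death occurs while treated
        else:
            td_pt[0] += end
            td_d[0] += died
    naive_rr = (naive_d[1] / naive_pt[1]) / (naive_d[0] / naive_pt[0])
    td_rr = (td_d[1] / td_pt[1]) / (td_d[0] / td_pt[0])
    return naive_rr, td_rr

naive_rr, td_rr = simulate()
print(f"naive rate ratio (time-independent coding): {naive_rr:.2f}")
print(f"rate ratio with person-time split at treatment: {td_rr:.2f}")
```

With these arbitrary settings the naive rate ratio should come out well below 1 despite the null effect, whereas the split-person-time accounting should return a ratio close to 1, illustrating the direction of the bias.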
In light of this I hope that you can conduct a proper time-dependent analysis and report your results. This, I believe, will provide a much more realistic estimate of the effect of NAI treatment on mortality. However, it will not help with the problem of 35% missing data, which also introduces bias because the patients with missing timing of treatment did most poorly of all. For the missing data you could use multiple imputation based on a regression model for predicting time from admission to treatment.
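Regression-based imputation of the kind suggested here can be sketched as follows (a deliberately minimal Python illustration with a single hypothetical predictor and made-up records; a real multiple-imputation analysis would use many covariates and proper between-imputation variance handling):

```python
import random

random.seed(0)

def impute_tx_time(records, m=5):
    """Very simplified sketch of regression-based multiple imputation
    for missing 'days from admission to treatment'.

    records: list of (severity_score, tx_day or None); severity_score
    is a hypothetical predictor. Returns m completed datasets.
    """
    obs = [(s, t) for s, t in records if t is not None]
    n = len(obs)
    mean_s = sum(s for s, _ in obs) / n
    mean_t = sum(t for _, t in obs) / n
    # least-squares slope/intercept of tx_day on severity
    sxx = sum((s - mean_s) ** 2 for s, _ in obs)
    sxy = sum((s - mean_s) * (t - mean_t) for s, t in obs)
    slope = sxy / sxx
    intercept = mean_t - slope * mean_s
    resid_sd = (sum((t - (intercept + slope * s)) ** 2
                    for s, t in obs) / (n - 2)) ** 0.5
    completed = []
    for _ in range(m):
        ds = []
        for s, t in records:
            if t is None:
                # draw from the predictive distribution, floored at 0 days
                t = max(0.0, random.gauss(intercept + slope * s, resid_sd))
            ds.append((s, t))
        completed.append(ds)
    return completed

# Made-up records: (severity score, treatment day or missing)
records = [(1, 1.0), (2, 2.1), (3, 2.9), (4, 4.2), (5, None), (2, None)]
for ds in impute_tx_time(records, m=2):
    print(ds)
```

Each of the m completed datasets would then be analysed separately and the results combined, so that the uncertainty due to imputation is reflected in the final estimates.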
1. Van Walraven et al. Time-dependent bias was common in survival analyses published in leading clinical journals. Journal of Clinical Epidemiology 2004;57:672-682.
2. Beyersmann J, Gastmeier P, Wolkewitz M, Schumacher M. An easy mathematical proof showed that time-dependent bias inevitably leads to biased effect estimation. Journal of Clinical Epidemiology 2008;61:1216-1221.
Competing interests: I am an unfunded researcher working on a Cochrane Review of neuraminidase inhibitors for influenza
A point-by-point response to Dr Jones’s critique by the authors of the Muthuri et al. (2014) paper follows:
Dr Jones’s critique: “A crude analysis of the data shows an increased risk of mortality associated with neuraminidase inhibitor treatment,” suggesting that the finding of a reduced risk of death was incorrect.
Authors’ response: It is indeed the case that, based on simple number counts, more people in the antiviral-treated group died than in the untreated group. However, this can be explained if people in the treated group had a higher baseline risk of dying than people in the untreated group. In this kind of work we encounter the issue of non-equivalent comparison groups.
Ideally, one would conduct a randomised controlled trial (experimental study) where equivalent patients are randomly assigned to treatment or placebo. This way if we observe any differences in patient outcome, we can be more confident that these could be attributed to treatment status alone. In a pandemic situation, it would have been unethical to randomly deny antiviral treatment to patients and indeed we uncovered no RCT data during our extensive search for data pertaining to the pandemic period. Thus, we only had the option of studying actual treatment practice and resultant patient outcomes during the 2009-10 pandemic.
In order to overcome the issue of comparing non-equivalent patient groups, we used statistical methods to ‘adjust’ for any patient differences to allow us to disentangle treatment effects from outcomes arising due to fundamental differences among patients. This is why the adjusted results are paramount not the crude (unadjusted) results which Jones has chosen to highlight. Notwithstanding, we considered it important to present the crude results in the interest of transparent scientific reporting.
Of rather more concern is the fact that Jones has ignored the clustering of effects by study centre in his calculations of the crude estimates. We amalgamated data from 78 studies; it is standard statistical ‘good practice’ to account for such clustering, and in our analysis we considered this to be essential.
‘Clustering of effects’ means that there may be differences at the study level such as differences in the way healthcare is provided, accessibility to treatment, payment for healthcare or prescriptions etc. that could introduce further differences (heterogeneity) in our study. This is why it is standard statistical practice to account for these study level differences by performing ‘clustered’ analyses, also known as multilevel modelling. As indicated in the paper, the crude and adjusted results were modelled using generalised linear mixed models to take into account the correlation among participants within the same study (clustering). These models were used to preserve the validity of our conclusions, since ignoring such correlation can lead to misleading clinical and statistical inferences. Where there is variation in the baseline event rate between the studies and the true effect size is clinically important (as within our study), simulations have shown naïve analyses, such as those performed by Jones, can bias effect estimates towards the null (Abo-Zaid, 2013). Therefore, it is not surprising that our valid results do not reflect the misleading results of Jones who, put simply, has undertaken the wrong analysis.
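The effect of ignoring clustering can be illustrated with hypothetical numbers (not the PRIDE data): if sicker centres both treat more patients and have higher baseline mortality, a crude pooled odds ratio can even point in the opposite direction from the within-centre estimate. The sketch below uses a Mantel-Haenszel odds ratio as a simple stand-in for the mixed-model adjustment described above:

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a,b = treated deaths/survivors; c,d = untreated."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel OR pooled across strata (here: study centres)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Hypothetical centres (made-up counts). The severe centre treats most
# of its patients and has much higher baseline mortality.
centre_mild = (10, 190, 80, 720)     # 5% vs 10% mortality
centre_severe = (240, 560, 90, 110)  # 30% vs 45% mortality

crude = odds_ratio(*[sum(x) for x in zip(centre_mild, centre_severe)])
stratified = mantel_haenszel_or([centre_mild, centre_severe])
print(f"crude OR (clustering ignored): {crude:.2f}")
print(f"Mantel-Haenszel OR (stratified by centre): {stratified:.2f}")
```

With these made-up counts the crude OR is about 1.63 (apparent harm) while the centre-stratified OR is about 0.51 (benefit), even though the within-centre treatment effect is the same in both centres.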
Dr Jones’s critique: The complex analysis does not take into account time-dependent bias.
Authors’ response: We have acknowledged the point about time-dependent treatment effects in our paper. Time-dependent treatment effects can make treatment appear favourable as compared to no treatment because of a bias termed ‘immortal time bias’, which arises because patients who die early do not get an opportunity to receive treatment. This is precisely why, for the subset of our population where dates of illness onset and antiviral administration were available, we used a time-dependent Cox regression shared frailty model. Even after using this approach we found an approximately 50% reduction in mortality associated with neuraminidase inhibitor (NAI) antiviral treatment.
Standard techniques for modelling treatment variables as time-dependent covariates involve splitting follow-up time into ‘time before the treatment’ and ‘time after the treatment’ (Kleinbaum and Klein, 2012). This is the approach we have followed for our survival analysis comparing NAI treatment to no treatment, in order to overcome immortal time bias. In essence then, we only include follow-up time after NAI treatment has been initiated to avoid counting apparent ‘survival time’ before NAI treatment was prescribed. This is further explained in the accompanying diagram.
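The splitting of follow-up time described above can be expressed in the standard counting-process form accepted by most Cox regression implementations, with treatment coded 0 before it begins and 1 afterwards (an illustrative Python sketch; the record layout and variable names are hypothetical, not the PRIDE Stata code):

```python
def split_followup(entry, exit, event, tx_start=None):
    """Split one patient's follow-up at treatment start into
    (start, stop, treated, event) episodes, counting-process style.

    Pre-treatment time is attributed to the untreated state, so a
    patient contributes no 'treated' person-time before treatment
    actually begins.
    """
    if tx_start is None or tx_start >= exit:
        return [(entry, exit, 0, event)]      # never treated in follow-up
    episodes = []
    if tx_start > entry:
        episodes.append((entry, tx_start, 0, 0))  # untreated, no event yet
    episodes.append((tx_start, exit, 1, event))   # treated until exit
    return episodes

# Hypothetical patients: (entry, end of follow-up, died, treatment day)
print(split_followup(0, 12, 1, tx_start=3))  # treated on day 3, died day 12
print(split_followup(0, 8, 1))               # never treated, died day 8
```

Each patient becomes one or two (start, stop, treated, event) episodes; the event is carried only by the final episode, which is what prevents apparent pre-treatment ‘survival time’ from being credited to treatment.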
Dr Jones’s critique: “The analysis that is reported to include NAI [neuraminidase inhibitor] treatment as a time-dependent exposure is incorrect, because the result is impossible, and the survival curves indicate a standard Cox regression has been fitted.”
Authors’ response: Jones would need to provide a more detailed explanation of why he thinks that our results are ‘impossible’. His assertion is predicated on the fact that time-dependent bias can only ever work in the direction of favouring treatment. In fact it was a frequent clinical observation during the 2009-10 pandemic period that in many instances, a diagnosis of influenza and the instigation of antiviral treatment occurred when the patient was relatively late in the illness and by this stage deteriorating rapidly; this would conceal a treatment benefit.
With regard to the survival curves in our published paper, Figure 2 relates to a different survival analysis from the one reported in the text for treated versus non-treated patients. Figure 2 represents survival curves for treated patients only; their purpose is to explore the effects of treatment delay on survival. To answer this rather different question we do need to take into account the time between illness onset and NAI treatment initiation, and a time-dependent analysis involving splitting of follow-up time into ‘time before treatment’ and ‘time after treatment’, as outlined in our previous point, is not relevant. Rather, the effects associated with time to initiation of treatment are explored by stratifying treatment by categories denoting ‘time to treatment initiation from illness onset’.
This has been made very clear in the manuscript with regard to the analysis relating to treated patients only (emphasis added for the purpose of this rebuttal):
“When only treated cases were considered, there was an approximately 25% increase in the hazard rate with each day’s delay in initiating treatment up to day 5 as compared to treatment initiated within 2 days of symptom onset [adj. HR, 1.23 (95% CI, 1.18- 1.28)].”
Dr Jones’s call for release of our data for verification by independent researchers
Authors’ response: Finally, Jones asks the authors to release their data so that a full independent analysis can be done. Ideally, in the spirit of transparency, we would have a publicly accessible pooled dataset so that other researchers can validate our analyses. In reality though, we are subject to data sharing agreements with our data contributors and may not be given permission by individual research groups within the PRIDE consortium or their local ethics boards to share this data even in a pooled format. Our data sharing agreements include explicit clauses on this matter as follows:
“(The PRIDE study investigators) recognise the Depositor's rights in the Data. Save as expressly provided for in this Agreement, no rights to or property in the Data shall pass to the University and the Depositor reserves all such rights…shall not transfer, distribute or release the Data to any third party without the prior written consent of the Depositor”
We will revisit this discussion with all our data contributors once we have completed all our planned analyses outlined in the PRIDE study protocol (PROSPERO CRD 42011001273). We have however endeavoured to be transparent in the reporting of our methods and can make available our Stata code used for the analyses and log files to independent researchers who would like to scrutinise our approach on application to Professor Jonathan Nguyen Van-Tam (jvt@nottingham.ac.uk). We will always be happy to provide clarifications regarding our published findings, to the extent that the data allow.
Professor Jonathan Van-Tam (senior author and strategic lead for this study), on behalf of the Department of Health, England has already provided UK data from the FLUCIN hospital pandemic influenza surveillance cohort to Professor Chris Del Mar, Coordinating Editor, Cochrane Acute Respiratory Infections Group on the 5th of July 2012.
On the basis of our clarifications above, we entirely reject Jones’s assertion that our analysis was ‘flawed’. Our study has limitations inherent to observational data, which we do not deny and have clearly declared in the manuscript, but within these constraints we have used the best analytical approach available to us and have taken on board further advice on statistical analysis from experts outside the PRIDE study collaboration, including a co-convenor of the Cochrane individual patient data meta-analysis methods group. In the absence of better evidence, our study provides highly credible evidence that should be considered by policy makers as they contemplate any independent decisions about replenishment of antiviral stockpiles as part of ongoing pandemic preparedness strategies.
In summary, in response to Jones’s critique of our analytical methods:
• A crude analysis of our data that ignores clustering of effects by study centres is inappropriate.
• We have taken into account time-dependent biases in our Cox regression shared frailty models and this reiterates the findings obtained from the generalised linear mixed models that NAI treatment was significantly associated with a reduction in mortality during the 2009-10 pandemic.
• The survival curves referred to by Jones relate to treated patients only to explore the effects of treatment delay on survival.
References
Muthuri SG, Venkatesan S, Myles PR, et al. (2014) Effectiveness of neuraminidase inhibitors in reducing mortality in patients admitted to hospital with influenza A H1N1pdm09 virus infection: a meta-analysis of individual participant data. Lancet Respir Med; published online March 19. http://dx.doi.org/10.1016/S2213-2600(14)70041-4
Abo-Zaid G, Guo B, Deeks JJ, Debray TPA, Steyerberg EW, Moons KGM, Riley RD. Individual participant data meta-analyses should not ignore clustering. Journal of Clinical Epidemiology 2013; 66(8):865-873.e4
Kleinbaum DG, Klein M. Survival Analysis: A Self-Learning Text. 3rd ed. New York: Springer, 2012.
Competing interests: Co-authors of the paper being critiqued in this article.
Re: Study claiming Tamiflu saved lives was based on “flawed” analysis
Jonathan Nguyen-Van-Tam questions the way The BMJ handled its news reporting of his paper. By way of background, The BMJ regularly carries news reports on studies published in other journals, and our reports usually include comments from other researchers which are often critical of the study. We do not routinely offer a right of reply to the study's authors. However, given the extent of the critique in this case and the title of the news story, I can see that a right of reply would have been appropriate. I thank the authors for their response to the news report, and I hope readers will read the full correspondence in the accompanying rapid responses.
Competing interests: I am the editor of The BMJ and responsible for its policies and for everything it contains.