The effects of excluding treatments from network meta-analyses: survey
BMJ 2013; 347 doi: https://doi.org/10.1136/bmj.f5195 (Published 05 September 2013) Cite this as: BMJ 2013;347:f5195
- Edward J Mills, associate professor12,
- Steve Kanters, biostatistician1,
- Kristian Thorlund, associate professor23,
- Anna Chaimani, PhD student4,
- Areti-Angeliki Veroniki, PhD student4,
- John P A Ioannidis, professor and director2
- 1Faculty of Health Sciences, University of Ottawa, Ottawa, Canada
- 2Stanford Prevention Research Center, Stanford University, Stanford, USA
- 3Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Canada
- 4Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
- Correspondence to: E J Mills Faculty of Health Sciences, University of Ottawa, 43 Templeton Street, Ottawa, Canada K1N6X1
- Accepted 4 August 2013
Objective To examine whether the exclusion of individual treatment comparators, including placebo/no treatment, affects the results of network meta-analysis.
Design Survey of networks with individual trial data.
Data sources PubMed and communication with authors of network meta-analyses.
Study selection and methods We included networks that had five or more treatments, contained at least two closed loops, had at least twice as many studies as treatments, and had trial level data available. Investigators abstracted information about study design, participants, outcomes, network geometry, and the exclusion of eligible treatments.
Results Among 18 eligible networks involving 757 randomised controlled trials with 750 possible treatment comparisons, 11 had decided upfront not to consider all treatment comparators and only 10 included placebo/no treatment nodes. In 7/18 networks, there was at least one node whose removal caused a more than 1.10-fold average relative change in the estimated treatment effects, and switches in the top three treatments were observed in 9/18 networks. Removal of placebo/no treatment caused large relative changes in the treatment effects (average change 1.16-3.10-fold) for four of the 10 networks that had originally included placebo/no treatment nodes. Exclusion of currently uncommonly used drugs resulted in substantial changes in the treatment effects (average 1.21-fold) in one of three networks on systemic treatments for advanced malignancies.
Conclusion Excluding treatments in network meta-analyses can sometimes have important effects on their results and can diminish the usefulness of the research to clinicians if important comparisons are missing.
Network meta-analysis (also called multiple or mixed treatment comparison meta-analysis, MTC) permits the evaluation of the comparative effectiveness of multiple interventions.1 2 This approach has an inherent appeal for clinicians and decision makers as new or existing interventions must be placed within the context of all available evidence.3 4 5 6 Often, those undertaking an MTC will selectively choose interventions to include in the analysis. For example, some MTCs exclude placebo or no treatment from consideration because it is sometimes believed that placebo trials vary over time or are set in favourable conditions to appease regulatory authorities.7 Other MTCs may include only the treatments available in particular settings (for example, a specific country), only those of perceived dose relevance, or (often in the case of industry submissions to health technology assessment bodies) only specific competing treatments.8 To obtain empirical evidence on whether these choices make a difference in the results such as treatment effect estimates and treatment rankings, we examined a sample of complex networks and reanalysed their data after excluding specific treatment nodes.
Eligibility criteria and retrieval of data from existing networks
We considered networks that had five or more treatments, contained at least two closed loops, had at least twice as many studies as nodes, and had individual trial level data or estimates available. The eligibility criteria aimed to generate a sample of networks that had many treatments and studies and sufficient data to explore the impact of exclusions. We used a systematic literature search that has been published previously that identified potentially eligible networks.9 We also attempted to contact study authors for missing individual data at trial level. We included an additional network from an MTC conducted by our team, where we had direct access to the primary data at trial level. In studies that considered more than one outcome using MTCs, we favoured the efficacy outcome over safety outcomes.
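The eligibility criteria above can be expressed as a simple check on a list of two-arm trials. The following is a minimal sketch under stated assumptions, not the procedure we used: the function names are hypothetical, multi-arm trials are ignored, and "closed loops" are counted as the number of independent cycles of the comparison graph (edges minus nodes plus connected components), which is only one possible way of counting loops.

```python
from collections import defaultdict


def count_independent_loops(trials):
    """Number of independent cycles (E - V + C) in the comparison graph.

    Assumption: a "closed loop" is counted as an independent cycle of the
    undirected graph whose nodes are treatments and whose edges are
    distinct head-to-head comparisons.
    """
    # Several trials of the same pair of treatments form a single edge.
    edges = {frozenset(pair) for pair in trials if pair[0] != pair[1]}
    nodes = {treatment for edge in edges for treatment in edge}

    adjacency = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Count connected components with an iterative depth-first search.
    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(adjacency[node] - seen)

    return len(edges) - len(nodes) + components


def is_eligible(trials):
    """Apply the three structural criteria to a list of (a, b) trial pairs."""
    treatments = {treatment for pair in trials for treatment in pair}
    return (
        len(treatments) >= 5                        # five or more treatments
        and count_independent_loops(trials) >= 2    # at least two closed loops
        and len(trials) >= 2 * len(treatments)      # twice as many studies as nodes
    )


# Hypothetical network: five treatments, ten two-arm trials, two loops.
example = (
    [("A", "B")] * 3 + [("B", "C")] * 2 + [("C", "A")]
    + [("C", "D")] * 2 + [("D", "E"), ("E", "C")]
)
print(is_eligible(example))
```

The fourth criterion, availability of trial level data, is a property of the publication and cannot be checked from the network structure.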
For each eligible network with available trial level data, we recorded whether the eligibility criteria excluded specific types of active or inactive/control (placebo, no treatment, best supportive care) treatment comparators, and the rationale for such exclusions. We recorded for each network the number of studies, treatments, and loops; the geometry of the network (the distribution of treatments and comparisons thereof in each network); the condition being treated; the primary outcome measure and the statistical effect measure used; and the range of node connectivity (the number of direct comparisons connected to each node). The supplementary figure displays the concepts of loops and connectivity.
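Node connectivity, as defined above, is the number of distinct treatments that a node is directly compared with. A minimal sketch, assuming trials are supplied as hypothetical pairs of treatment labels (one pair per two-arm trial):

```python
from collections import defaultdict


def node_connectivity(trials):
    """Map each treatment to the number of distinct direct comparators."""
    neighbours = defaultdict(set)
    for a, b in trials:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {treatment: len(others) for treatment, others in neighbours.items()}


# Hypothetical star-shaped network with one extra head-to-head comparison.
toy_trials = [
    ("placebo", "A"), ("placebo", "B"), ("placebo", "C"),
    ("A", "B"), ("placebo", "A"),
]
connectivity = node_connectivity(toy_trials)
connectivity_range = (min(connectivity.values()), max(connectivity.values()))
print(connectivity, connectivity_range)
```

In this toy network placebo is the most connected node, which is the typical position of the influential nodes described in the results.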
Regardless of the analyses chosen in the original publications, we analysed each network using random effects Bayesian MTCs with non-informative priors, the most common analytical approach used for network meta-analysis.8 Details on code and specific analysis are available from the authors.
For each network we analysed the complete available data (full model) and also performed analyses excluding one or multiple treatment nodes, that is, disregarding in the calculations data from trials where the excluded nodes were comparators. Firstly, we investigated the effect of excluding the treatment node with the largest expected impact from each network. We used the Brier score to identify the treatment node with the largest expected impact on results.10 The Brier score is the average of the squared differences between the log ratios (odds, relative risk or hazard) estimated with the full treatment network data versus the treatment network data where one or more treatment nodes are excluded. Secondly, we investigated the effect of excluding other single treatment nodes that could be classified as active interventions (that is, not placebo/no treatment). Thirdly, we investigated the impact of excluding placebo/no treatment from the treatment network. Lastly, we focused on selected examples of situation specific exclusions that were chosen owing to perceived relevance for clinical practice or decision making. Specifically, for the network of thrombolytic therapy we excluded data from trials involving anisoylated plasminogen streptokinase activator complex (APSAC) and urokinase because these treatments were not available in the United Kingdom and one previous UK based network meta-analysis had excluded them11; and for the networks of breast, colorectal, and ovarian cancer, we excluded nodes of treatment regimens that are currently uncommonly used (all those deemed miscellaneous older agents alone or in combination for breast cancer; fluorouracil monotherapy and older drug monotherapy for colorectal cancer; platinum monotherapy and non-platinum/non-taxane drug regimens for ovarian cancer).
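The Brier score described above can be illustrated with a short calculation. This is a sketch under stated assumptions rather than our analysis code: the function names are hypothetical, effects are assumed to be point estimates of each treatment versus a common reference, and the average is taken over the comparisons that remain estimable in both the full and the reduced network.

```python
import math


def brier_score(full_ratios, reduced_ratios):
    """Mean squared difference of log ratios, full versus reduced network.

    Both arguments map a treatment label to its estimated ratio (odds,
    rate, or hazard) against the same reference treatment.
    """
    shared = sorted(set(full_ratios) & set(reduced_ratios))
    if not shared:
        raise ValueError("no comparisons are shared between the two networks")
    squared_differences = [
        (math.log(full_ratios[k]) - math.log(reduced_ratios[k])) ** 2
        for k in shared
    ]
    return sum(squared_differences) / len(shared)


def most_influential_node(full_ratios, reduced_by_node):
    """Return the excluded node whose removal gives the largest Brier score."""
    return max(
        reduced_by_node,
        key=lambda node: brier_score(full_ratios, reduced_by_node[node]),
    )


# Hypothetical odds ratios versus a common reference treatment.
full = {"B": 0.80, "C": 0.50, "D": 0.70}
reduced_by_node = {
    "B": {"C": 0.50, "D": 0.70},  # estimates after excluding node B
    "C": {"B": 0.88, "D": 0.70},  # estimates after excluding node C
}
print(most_influential_node(full, reduced_by_node))
```

Here the removal of node C shifts one of the remaining estimates, so C is flagged as the more influential node, whereas the removal of B leaves the other estimates unchanged.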
We used odds ratios for dichotomous outcomes, rate ratios for Poisson outcomes, and hazard ratios for survival outcomes. From each network we estimated treatment effects for each treatment comparison and the probability of each treatment being the best treatment.12
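The probability of being the best treatment is usually derived from the posterior simulations: at each iteration the treatments are ranked, and the proportion of iterations in which a treatment ranks first is its probability of being best. A minimal sketch, assuming hypothetical posterior draws of log ratios versus a reference (fixed at 0) and an outcome for which lower values are better:

```python
from collections import Counter


def probability_best(draws, lower_is_better=True):
    """Proportion of posterior iterations in which each treatment ranks first.

    draws maps a treatment label to a list of posterior draws of its effect
    versus the reference; all lists must have the same length.
    """
    treatments = list(draws)
    n_iterations = len(draws[treatments[0]])
    if any(len(draws[t]) != n_iterations for t in treatments):
        raise ValueError("all treatments need the same number of draws")

    pick = min if lower_is_better else max
    wins = Counter()
    for i in range(n_iterations):
        best = pick(treatments, key=lambda t: draws[t][i])
        wins[best] += 1
    return {t: wins[t] / n_iterations for t in treatments}


# Hypothetical draws of log odds ratios; A is the reference treatment.
posterior = {
    "A": [0.0, 0.0, 0.0, 0.0],
    "B": [-0.2, 0.1, -0.3, -0.1],
    "C": [-0.1, -0.2, -0.4, 0.2],
}
print(probability_best(posterior))
```

Because this quantity depends only on which treatment ranks first at each iteration, it can shift appreciably when a node is removed even if the ordering of the point estimates is unchanged.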
To evaluate the impact of excluding treatment nodes we calculated the relative change in the estimated treatment effects of the remaining treatments; changes in the top three ranked treatments, according to the probability of being best; and changes in the probability of being the best treatment for the top treatment of the full model. Relative changes in the estimated treatment effects of remaining nodes are always expressed as a value greater than 1.00. For example, if a specific odds ratio is 0.80 in the full network and 0.88 in the network with one node excluded, then the relative change is 0.88/0.80=1.10-fold; if these odds ratios are 0.80 and 0.60, then the relative change is 0.80/0.60=1.33-fold. Given that most treatment effects are relatively small or modest,13 relative changes of 1.10-fold can be considered large and of 1.20-fold can be considered substantial. Choice of reference group does not affect the values of the fold change. As most networks have a natural choice for the reference group we chose to keep this node as a reference group to best mimic changes that would be observed by researchers. For each network with excluded nodes, we noted the maximum and geometric average of the relative change.
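The relative change and its two summaries can be sketched as follows. The function names and the toy estimates are hypothetical; the only assumption taken from the text is that the larger of the two ratios is always divided by the smaller, so that the fold change is at least 1.00.

```python
import math


def fold_change(full_ratio, reduced_ratio):
    """Relative change between two ratio estimates, always >= 1.00."""
    ratio = reduced_ratio / full_ratio
    return max(ratio, 1 / ratio)


def summarise_changes(full_ratios, reduced_ratios):
    """Return (maximum, geometric mean) of fold changes over shared treatments."""
    shared = sorted(set(full_ratios) & set(reduced_ratios))
    if not shared:
        raise ValueError("no comparisons are shared between the two networks")
    folds = [fold_change(full_ratios[k], reduced_ratios[k]) for k in shared]
    # Geometric mean computed on the log scale.
    geometric_mean = math.exp(sum(math.log(f) for f in folds) / len(folds))
    return max(folds), geometric_mean


# Hypothetical odds ratios before and after excluding one node.
full_estimates = {"B": 0.80, "C": 0.50}
reduced_estimates = {"B": 0.88, "C": 0.50}
maximum, average = summarise_changes(full_estimates, reduced_estimates)
print(round(maximum, 2), round(average, 3))
```

With these toy values one estimate changes 1.10-fold and the other is unchanged, so the maximum is 1.10 and the geometric average is the square root of 1.10, about 1.05.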
Analyses were conducted using WinBUGS version 1.4 (Medical Research Council Biostatistics Unit, Cambridge) and R version 2.15.1 (www.r-project.org/).
The search identified 890 relevant abstracts, of which 276 were assessed as potentially eligible and their full articles were screened. After adding an eligible network from our own available data, 18 networks with individual trial data met the inclusion criteria,14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 involving 757 randomised controlled trials with 750 possible treatment comparisons and 1 036 701 patients. Table 1 shows the characteristics of the individual networks. Networks ranged in size from 11 studies on five treatments to 128 studies on 22 treatments (figure).
Exclusion of treatment nodes in analysed networks
Table 1 shows the eligibility criteria that had been set by the original authors of these meta-analyses and that resulted in exclusion of specific treatment comparators from the network. Eleven of the 18 networks decided upfront not to consider all treatment comparators. One network mentioned that a previous network meta-analysis had excluded data on two treatments (APSAC and urokinase for thrombolysis) that were not licensed in the United Kingdom, but the full data were available in the current analysis. Several networks deliberately excluded placebos or no treatment options (for example, best supportive care for cancers). Several other networks focused on specific types of treatments that they considered to be of clinical interest and made statements that other treatments were not considered at all, because they were of a different class, old (for example, trials published before 1997), or to be considered in a separate review.
Changes after removal of treatment node with largest expected impact
When the node with the largest Brier score was removed (table 2) the average relative change of treatment effects exceeded 1.10-fold in seven of the 18 networks.14 16 18 22 27 29 31 The largest observed relative change exceeded 1.10-fold (maximum 3.64-fold) in all but four networks. In three networks some non-significant effects became significant, and in 12 networks the opposite change was seen. Switches in the ranking of the top three treatments were observed in nine of the 18 networks.16 17 18 21 22 25 27 29 31 Table 3 shows a worked example of the calculations and results for a single node removal in the Cooper 2006 network.18
The most influential nodes were typically highly connected. In 10 of the 18 networks they were the most connected node. However, the most influential nodes were never the top ranked treatment of the networks and almost always (15/18) had a 10% or less probability of being the best treatment (0% probability in 11/18 networks). Thus standards of care, placebo, or other treatments that may easily be overlooked are nodes whose exclusion could influence the network results.
Changes after removal of other active intervention nodes
Removal of the remaining, less influential nodes caused minor changes (average relative change ≤1.03-fold) in 13 of the 18 networks. None the less, seven networks contained a node the exclusion of which caused a relative change greater than 1.10-fold. Of the nine networks with changes in the top ranks by the exclusion of the most influential node, three had an additional node whose removal affected the top ranks.16 18 29
Changes after placebo node removals or context specific node removals
Table 4 presents the results after removal of placebo/no treatment nodes or after removal of context specific sets of treatment nodes. Placebos and no treatment nodes were removed from 10 networks that had originally included such nodes. This resulted in important changes in the results of four of these 10 networks,16 18 22 27 with average relative changes ranging from 1.16-fold to 3.10-fold, whereas in the other six networks the average relative changes were minor (1.01-1.03-fold). In one network, exclusion of the placebo node resulted in a switch in position between the first and second best treatment. In the other cases, the ranking of the top three treatments did not change, but modest to large changes were seen in the probability of being the best treatment, such as in Mills 2011 in which the probability of high dose nicotine replacement being the best treatment went from 58% to 90% (table 4).26
Removal of APSAC and urokinase from the network on thrombolysis and percutaneous intervention did not result in major changes. Removal of uncommonly used treatment regimens from the networks of treatments for malignancies resulted in a large change in estimated effects (on average by 1.21-fold) for ovarian cancer and small changes for colorectal and breast cancer. However, in the case of advanced breast cancer, these removals switched the ranks of novel non-taxane agents+taxanes and taxanes (combination regimen), the first and third ranked treatments, respectively.
In this metaepidemiological study involving 18 large network meta-analyses, we found that it is common for networks to exclude trials and specific comparators based on widely diverse criteria. Furthermore, we documented that exclusions occasionally can have an important impact on the results. The comparators that had the largest potential to change the results were typically those used in many trials; and they were unlikely to be the best treatments. We also explored the impact of exclusion of each other active treatment and placebo/no treatment as well as the impact of excluding several treatments based on scenarios that may be clinically relevant. Our results show, moreover, that some of these exclusions could affect substantially the estimated treatment effects and occasionally even affect the ranking of the top treatments.
These findings seem particularly relevant to network meta-analysis where analysts choose only to evaluate certain newer treatments or where they have chosen to exclude well established interventions or placebo, or both from a network. Readers of network meta-analysis should examine whether a network represents all available interventions. Excluding treatments from networks may reduce the applicability and usefulness of a multiple or mixed treatment comparison (MTC) if comparisons of interest to doctors are not considered. A substantial literature already exists on traditional pairwise meta-analyses, showing how eligibility criteria may result in differences in the results and conclusions of meta-analyses on the same topic.32 33 34 35 36 As meta-analyses have become popular, most of the topics of interest are addressed by two or more published meta-analyses and these may differ in their eligibility criteria.37 Some examples have started to accumulate where independent network analyses on the same topic may reach different conclusions.38 A key consideration may be which treatment nodes are considered eligible for analysis. For example, on evaluating the relative merits of second generation antidepressants, an MTC that did not consider placebo comparisons yielded different rankings from one that included only placebo controlled comparisons and one that considered both types of comparisons.17 39 40 41 We should also note that although treatment rankings and probabilities are arguably easy to interpret, their interpretability is limited by the fact that they are driven predominantly by the estimated effect sizes, and that standard errors play an unduly small role in determining their position.32 Readers are advised to observe the estimated effects first and use the rankings only as a supplementary measure.
Limitations of this study
There are some caveats to our analysis. Firstly, we excluded simple networks such as star networks (that cannot be analysed in an MTC)42 and poorly connected networks. Less well connected networks with less total evidence are likely to be more “fragile” when nodes are removed, that is, the results may change even more than what we documented here for well connected networks. Secondly, we applied a uniform approach (a random effects analysis using non-informative priors) to our reanalysis of each published network meta-analysis. Different analytical choices may introduce some further variability in the results. Thirdly, some of the examined networks had already excluded specific trials and treatments upfront, and it was not practical for us to try to retrieve the excluded trials and evaluate the impact of their inclusion. Instead, we focused on the potential impact that further exclusions would have had on the results. Fourthly, we did not question the clustering of treatment regimens into specific nodes by the original meta-analysts. However, in some MTCs there may be some ambiguity on whether slightly or modestly different regimens (for example, different doses or schedules for administering drugs or drugs belonging to the same class) should be treated as a single node or separate nodes. This could introduce some additional variability in the results, depending on what choices are made. For example, in the MTC of systemic treatment for colorectal cancer,21 242 trials were identified but only 37 could be used, because the others were deemed to compare regimens that belonged to the same treatment node and thus would not be distinguishable. In all, given these reasons our estimates of the potential impact on the results of choices pertaining to the geometry of the network are probably conservative. To avoid potential bias, network meta-analysts should consider the inclusion of all relevant interventions for the condition of interest.
We should acknowledge that the choice of including all possible interventions that have ever been evaluated in randomised controlled trials on a topic can be daunting. Sometimes specific exclusions are clearly justifiable, and many of these exclusions may not have any substantial impact on the estimated treatment effects of the remaining “relevant” interventions. However, exclusions of potential nodes and decisions about eligibility criteria need to be carefully justified—for example, it may be argued that alternative medicine or non-mainstream comparators should not be included in the same network as standard interventions. It may even be reasonable to perform sensitivity analyses to examine the impact of the removal of specific nodes. Moreover, our findings indicate that the largest impact on the results occurs when well connected nodes are removed. It seems reasonable to advocate that the most evaluated treatments available for a condition should be considered necessary to include for a network to be valid. Typically, this includes older established standards of care, placebo, and well-tested interventions. In particular, some industry sponsored meta-analyses increasingly focus on target competitor agents that compete for market share.43 However, the exclusion of other interventions in a network can importantly affect the results.
What is already known on this topic
Network meta-analysis is an increasingly popular method that allows the comparative effectiveness of multiple treatments to be evaluated
Examples have started to accumulate where network analyses on the same topic have reached different conclusions based on the exclusion of treatment nodes
What this study adds
It is common for networks to exclude trials and specific comparators based on widely diverse criteria
Excluding treatments from a network meta-analysis can importantly change effect estimates and the probability rankings of being the best treatment
Well connected treatments that are unlikely to be the best treatment are the most likely to be influential
The authors appreciate the contribution of Georgia Salanti for providing data and comments on protocol and analysis.
Contributors: EJM, SK, KT, and JPAI conceived the study design. AC, A-AV, and SK acquired the data. SK, KT, and EJM conducted the analyses. EJM, SK, KT, and JPAI wrote the manuscript and all contributed to the writing. All authors approved the final manuscript. EJM is guarantor for the study.
Funding: This study was supported by the Drug Safety and Effectiveness Network (DSEN) of the Canadian Institutes of Health Research. No funding agency has seen this study.
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.
Ethical approval: Not required.
Data sharing: The technical appendix and statistical code are available from the corresponding author at firstname.lastname@example.org.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/.