Tocolytic therapy for preterm delivery: systematic review and network meta-analysis
BMJ 2012; 345 doi: https://doi.org/10.1136/bmj.e6226 (Published 09 October 2012) Cite this as: BMJ 2012;345:e6226
- David M Haas, associate professor of obstetrics and gynecology1,
- Deborah M Caldwell, MRC fellow population health science2,
- Page Kirkpatrick, research associate1,
- Jennifer J McIntosh, medical resident1,
- Nicky J Welton, MRC research fellow2
- 1Department of Obstetrics and Gynecology, Indiana University School of Medicine, Indianapolis, IN, USA
- 2School of Social and Community Medicine, University of Bristol, Bristol, UK
- Correspondence to: D M Haas Wishard Memorial Hospital, 1001 W 10th Street, F5102, Indianapolis, IN 46202, USA
- Accepted 4 September 2012
Objective To determine the most effective tocolytic agent at delaying delivery.
Design Systematic review and network meta-analysis.
Data sources Cochrane Central Register of Controlled Trials, Medline, Medline In-Process, Embase, and CINAHL up to 17 February 2012.
Study selection Randomised controlled trials of tocolytic therapy in women at risk of preterm delivery.
Data extraction At least two reviewers extracted data on study design, characteristics, number of participants, and outcomes reported (neonatal and maternal). A network meta-analysis was done using a random effects model with drug class effect. Two sensitivity analyses were carried out for the primary outcome: one restricted to studies at low risk of bias and one restricted to studies excluding women at high risk of preterm delivery (those with multiple gestation and ruptured membranes).
Results Of the 3263 titles initially identified, 95 randomized controlled trials of tocolytic therapy were reviewed. Compared with placebo, the probability of delivery being delayed by 48 hours was highest with prostaglandin inhibitors (odds ratio 5.39, 95% credible interval 2.14 to 12.34) followed by magnesium sulfate (2.76, 1.58 to 4.94), calcium channel blockers (2.71, 1.17 to 5.91), beta mimetics (2.41, 1.27 to 4.55), and the oxytocin receptor blocker atosiban (2.02, 1.10 to 3.80). No class of tocolytic was significantly superior to placebo in reducing neonatal respiratory distress syndrome. Compared with placebo, side effects requiring a change of medication were significantly higher for beta mimetics (22.68, 7.51 to 73.67), magnesium sulfate (8.15, 2.47 to 27.70), and calcium channel blockers (3.80, 1.02 to 16.92). Prostaglandin inhibitors and calcium channel blockers were the tocolytics with the best probability of being ranked in the top three medication classes for the outcomes of 48 hour delay in delivery, respiratory distress syndrome, neonatal mortality, and maternal side effects (all cause).
Conclusions Prostaglandin inhibitors and calcium channel blockers had the highest probability of delaying delivery and improving neonatal and maternal outcomes.
Tocolytic therapy to delay preterm delivery is an important intervention in obstetrics. Although tocolytics have not been shown to improve neonatal outcomes, they can delay preterm delivery long enough for antenatal corticosteroids to be administered or for the mother to be transported to a tertiary care facility.1 In premature neonates, antenatal corticosteroids reduce morbidity and mortality.2 Tocolytic therapy may therefore have an important role in improving outcomes from preterm delivery. With over 500 000 preterm births in the United States alone (12.3% of all births in 2008)3 and 29% of these being less than 34 weeks’ gestation, preterm delivery is an important public health issue.
Many different classes of drugs have been used for tocolytic therapy.4 These include beta mimetics such as ritodrine and terbutaline; magnesium sulfate; prostaglandin inhibitors (for example, indomethacin, ketorolac); calcium channel blockers such as nifedipine; nitrates (for example, nitroglycerine); oxytocin receptor blockers (for example, atosiban), and others. Each tocolytic has a unique mechanism of action, side effects, and degree of complexity to administer.5 Several Cochrane reviews have compared individual tocolytic drugs with placebo or other tocolytics.6 7 8 9 10 A recent pooled meta-analysis and decision analysis of trials on tocolytics showed that to delay delivery for 48 hours and seven days, prostaglandin inhibitors were the best first line tocolytic.1 A standard pairwise meta-analysis, however, can only compare two treatments (or classes) that have been directly compared in head to head trials (direct evidence). Consequently, trials comparing two treatments from the same class are often excluded from class level meta-analyses. In the absence of a single high quality, randomized controlled trial comparing all tocolytic therapies, uncertainty remains about which is the most effective at delaying preterm delivery.11
For a complex condition such as preterm delivery with multiple competing treatment options, not all of which have been directly compared, a network meta-analysis may be better able to allow for comparisons and conclusions about which tocolytic is most effective. A network meta-analysis refers to networks of trial evidence in which all the available direct and indirect evidence on relative treatment effects are pooled simultaneously in a single coherent analysis.12 13 Indirect evidence is obtained when the relative effectiveness of treatment B versus treatment C is inferred through a common comparator A (see supplementary file for equation). Thus a network meta-analysis produces estimates of the relative effects of each treatment compared with every other in a network, even though some pairs may not have been directly compared, and has the potential to reduce the uncertainty in treatment effect estimates.12 It also allows for the calculation of the probability that each treatment, or class, is the best for any given outcome. Network meta-analysis can also be used to identify gaps in the evidence base.14 In an active area such as tocolysis for preterm delivery, with six trials published since 2009, a network meta-analysis has the potential to inform future research agendas. We systematically reviewed and analysed trials on tocolytics and carried out a network meta-analysis to determine the most effective agent for delaying preterm delivery.
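The indirect comparison through a common comparator can be sketched on the log odds ratio scale, where the B versus C effect is the difference between the B versus A and C versus A effects and the variances add. This is a minimal illustration of the standard adjusted indirect comparison, not the equation from the supplementary file; the function name `indirect_log_or` and all input values are hypothetical.

```python
import math


def indirect_log_or(log_or_ba, se_ba, log_or_ca, se_ca):
    """Indirect B versus C estimate through common comparator A.

    Inputs are log odds ratios (B v A and C v A) with standard errors,
    assumed to come from independent sets of trials.
    """
    log_or_bc = log_or_ba - log_or_ca           # effects subtract on the log scale
    se_bc = math.sqrt(se_ba ** 2 + se_ca ** 2)  # variances of independent estimates add
    return log_or_bc, se_bc


# Hypothetical inputs for illustration only (not values from the review)
est, se = indirect_log_or(math.log(3.0), 0.40, math.log(2.0), 0.30)
low, high = math.exp(est - 1.96 * se), math.exp(est + 1.96 * se)
print(f"indirect OR, B v C: {math.exp(est):.2f} (95% CI {low:.2f} to {high:.2f})")
```

The sketch covers only a single indirect loop with a frequentist interval. The network meta-analysis reported here is Bayesian and pools all direct and indirect evidence simultaneously, so the sketch shows the principle rather than the fitted model.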
Using the search terms “preterm labor”, “tocolytic”, and “obstetric labor, premature” we systematically searched the Cochrane Central Register of Controlled Trials (February 2012), Medline (1950-present), Medline In-Process/Daily Update (17 February 2012), Embase (1988-2012), and CINAHL (1982-2012) for published randomized controlled trials of tocolytic therapy. We limited the search to articles reporting trials in humans, and excluded duplicate trial entries. To ensure completeness, we cross referenced our search results with the Cochrane reviews of tocolytic medications, hand searching for additional titles. We did not register a protocol for the review. Based on the titles we read the abstracts of potentially relevant papers and obtained the full text articles for those that seemed pertinent. Included trials were those that reported a comparison between different medications or between a medication and a placebo or usual care for delaying preterm delivery. Trials were excluded if they were not randomized controlled trials, did not study women at risk of preterm delivery (defined by trial), did not study at least one tocolytic drug, used combination drug therapies for tocolysis, or did not report maternal or neonatal outcomes in relation to preterm delivery. As published abstracts did not contain enough information for complete data to be extracted we did not include them. We also excluded personal communications cited in Cochrane reviews. At least two reviewers read the articles and extracted data from the trials. Discordance between the reviewers was resolved by consensus. Abstracts of articles in non-English languages were reviewed. If the article was considered relevant, we obtained the full text and had it translated for possible data extraction. Published abstracts from conferences were not included.
Study quality was assigned utilizing the methodology and categories described in the Cochrane Collaboration Handbook.15 The Cochrane collaboration’s recommended tool for assessing risk of bias is neither a scale nor a checklist but a domain based evaluation, in which critical assessments are made separately for different domains. Briefly, the tool for assessing risk of bias addresses seven specific domains: random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective reporting, and other sources of bias. Each domain is assigned a judgment relating to the risk of bias for that study: low risk, high risk, and unclear (or unknown). For the purpose of a planned sensitivity analysis, we counted the total number of low risk scores (out of 7) for each study. When at least four domains had a low risk score, with at least one of the domains needing to be sequence generation or allocation concealment, we considered the overall study quality to be high.
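The rule for classifying overall study quality can be written as a short check. This is a sketch of the rule as stated above; the function name `is_high_quality`, the dictionary layout, and the example judgments are hypothetical.

```python
DOMAINS = (
    "random sequence generation",
    "allocation concealment",
    "blinding of participants and personnel",
    "blinding of outcome assessment",
    "incomplete outcome data",
    "selective reporting",
    "other sources of bias",
)
KEY_DOMAINS = ("random sequence generation", "allocation concealment")


def is_high_quality(judgments):
    """judgments maps each domain to 'low', 'high', or 'unclear' risk of bias."""
    low = [d for d in DOMAINS if judgments.get(d) == "low"]
    # At least four low risk domains, one of which must be a key domain
    return len(low) >= 4 and any(d in low for d in KEY_DOMAINS)


# Hypothetical study: four low risk domains, including sequence generation
example = dict.fromkeys(DOMAINS, "unclear")
example.update({
    "random sequence generation": "low",
    "blinding of outcome assessment": "low",
    "incomplete outcome data": "low",
    "selective reporting": "low",
})
print(is_high_quality(example))
```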
At least two reviewers extracted data on study design, characteristics, number of participants, and outcomes reported. Relevant studies were those that reported on pregnant women being treated for preterm delivery or at risk of preterm delivery. Information extracted on the mothers included age, estimated gestational age at entry into the study and at delivery, and number with previous preterm births. Primary data extracted on maternal outcomes included the numbers of participants with delivery delayed by 48 hours and the number of side effects (all causes) from the tocolytics. Secondary maternal outcomes extracted included the number of participants with delivery delayed by seven days and until 37 weeks’ gestation, and the mean number of days by which delivery was delayed. These secondary outcomes are not considered in this paper owing to concerns about multiple statistical testing. We assigned a quality score to each study, the results of which are reported separately.16 We also extracted data on the use of antenatal corticosteroids, inclusion or exclusion of multiple gestations or ruptured membranes, and whether or not the tocolytic therapy was short term (a predefined length of time such as 48 hours or until contractions stopped) or long term (usually until 36 or 37 weeks estimated gestational age). Primary neonatal outcomes extracted were rates of respiratory distress syndrome and death. Secondary neonatal outcomes extracted were birth weight, chronic lung disease or bronchopulmonary dysplasia, fetal sepsis, intraventricular hemorrhage, necrotizing enterocolitis, hyperbilirubinemia, and premature closure of the ductus arteriosus. These outcomes were defined in the individual trials. Because of the way outcomes were reported in the trials we were unable to report an overall composite score for neonatal morbidity or mortality.
We classified the drugs as placebo (placebo or usual or standard care without a tocolytic drug); beta mimetics (ritodrine, terbutaline, nylidrin, salbutamol, fenoterol, hexoprenaline, isoxsuprine); calcium channel blockers (nifedipine, nicardipine); magnesium sulfate; nitrates (nitroglycerine, nitric oxide); oxytocin receptor blockers (atosiban, barusiban); others (alcohol, human chorionic gonadotropin, combination tocolytic drugs); and prostaglandin inhibitors (indomethacin, celecoxib, sulindac, ketorolac, rofecoxib). Two authors (DMH, PK, or JJMcI) then examined the complete set of trials to assess whether the characteristics of the trials and participants were similar enough to be combined in the network meta-analysis—that is, that the sets of trials did not differ for distribution of potential effect modifiers. This assumption of “consistency”17 underpins the validity of a network meta-analysis and is akin to an assumption that the direct and indirect evidence estimate the same underlying treatment effect variable.
The primary effectiveness outcome for the network meta-analysis was delivery successfully delayed for 48 hours. We chose this outcome as it was most commonly reported and is a surrogate for the ability to administer a complete course of antenatal corticosteroids or to allow for maternal transport to a tertiary care facility. The secondary outcomes were neonatal mortality, neonatal respiratory distress syndrome, and all cause maternal side effects. All are binary outcomes. From the analysis we excluded studies with zero or 100% events on all arms. A list of excluded trials is available from the authors.
Analyses were done within a Bayesian framework using WinBUGS 1.4.3.18 We carried out a random effects network meta-analysis19 20 21 to simultaneously compare the 18 treatments and eight tocolytic classes for each outcome. Where head to head data were available we also carried out pairwise “direct” meta-analyses using a random effects model. Heterogeneity was assessed using the posterior median between trial variance, τ². However, for ease of interpretation we report the χ² test for heterogeneity and I² statistic for the pairwise meta-analyses (these were calculated using Stata). Owing to the lack of power associated with the χ² test we used P=0.10 for our assessment of heterogeneity.15 In the case of two or fewer trials we carried out a fixed effect meta-analysis. The pairwise meta-analyses were done using the drug classes and not individual treatments as the subject of interest. Posterior median odds ratios and 95% credible intervals were calculated. For the primary outcome we carried out a planned sensitivity analysis based on risk of bias assessment. Additional sensitivity analyses were restricted to studies that excluded multiple gestations or excluded participants with ruptured membranes. A metaregression analyzed the impact of planned duration of treatment (acute or short term tocolysis versus prolonged therapy) on the results.
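The heterogeneity statistics reported for the pairwise meta-analyses follow standard formulas, sketched below from trial level log odds ratios and their variances. The statistics in this review were calculated in Stata; this is only an illustration of Cochran's Q (the χ² heterogeneity statistic) and I², with a hypothetical function name and made-up inputs.

```python
def heterogeneity(log_ors, variances):
    """Return Cochran's Q, its degrees of freedom, and I2 (%)."""
    weights = [1.0 / v for v in variances]  # inverse variance weights
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I2 is the share of total variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Hypothetical trial results (log odds ratios and variances), not from the review
q, df, i2 = heterogeneity([0.2, 1.1, 0.9, 1.8], [0.10, 0.15, 0.08, 0.20])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.0f}%")
```

Q is referred to a χ² distribution with the stated degrees of freedom to obtain the P value, which this sketch does not compute.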
For the network meta-analysis we implemented a class effect model where each treatment effect in the same class is assumed to come from a family of treatment effects with a class specific mean effect and between treatment variability within class (assumed equal across all classes).22 23 Further details on alternative models evaluated are available from the authors together with the WinBUGS code. Goodness of fit was measured by the posterior mean of the residual deviance. In a well fitting model the residual deviance should be close to the number of data points.24 Owing to the way in which the residual deviance is calculated, zero cells on the baseline (control) arm can cause computational difficulties. For the purposes of model selection we removed these trials but included them in the final model on which the results are based. A key assumption of network meta-analysis is that of consistency between the direct and indirect evidence.17 We assessed whether there was inconsistency in each of the three networks by comparing a model assuming consistency with that of an inconsistency model25 using the deviance information criterion. A difference of 3 or more points is considered meaningful.26 Convergence was assessed using two chains and was achieved by 25 000 simulations for delivery delayed by 48 hours, 30 000 for neonatal mortality and respiratory distress syndrome, and 35 000 for maternal side effects (based on the Brooks-Gelman-Rubin diagnostic tool in WinBUGS). We ran a further 50 000 updates after convergence for delivery delayed by 48 hours, 60 000 for neonatal mortality and respiratory distress syndrome, and 70 000 for maternal side effects. All reported results are based on these further samples.
Of 3263 titles initially identified, 159 full text articles were retrieved, of which 95 satisfied the study inclusion and exclusion criteria. Fig 1 summarizes the steps of the systematic review. Nine articles were translated (four in Chinese, two in French, and one each in German, Portuguese, and Spanish). One of the French articles was later excluded for not being a true randomized controlled trial, leaving eight reports in a language other than English (8%). Details of the characteristics of the trials and comparison of the quality of tocolytic studies retrieved are reported elsewhere.16 Twenty five trials contained a placebo arm (26%),27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 60 included beta mimetics (63%),27 28 29 31 34 35 36 37 39 43 46 48 50 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 29 included magnesium sulfate (31%),29 30 33 38 47 53 56 61 68 80 89 90 93 95 96 99 100 101 102 103 104 105 106 107 108 109 110 111 112 29 included calcium channel blockers (31%),43 52 59 63 64 70 71 72 74 76 78 79 84 85 86 91 92 94 101 102 103 104 106 112 113 114 115 116 117 18 included prostaglandin inhibitors (19%),40 41 42 49 57 73 75 81 88 99 107 109 110 111 116 118 119 120 13 included oxytocin receptor blockers (atosiban or barusiban) (14%),32 44 50 51 54 55 65 66 82 83 87 113 114 four included nitrates (4%),45 58 100 115 and five included other drugs (5%).47 50 77 88 108 For the outcome of delivery delayed by 48 hours 64 trials assessing 16 treatments from eight drug classes were eligible for inclusion in the network meta-analysis. For the outcome of respiratory distress syndrome 60 trials assessing 19 treatments from seven drug classes were eligible for inclusion, and for the outcome maternal side effects (all cause) 68 trials assessing 18 treatments from seven drug classes were eligible for inclusion.
Trials included a mean of 111.9 (SD 108.8, range 20-708) participants and were published from 1966-2011 (see supplementary table 1 for details of the trials, along with their quality assessments). Fig 2 presents the complete network of the 95 randomised controlled trials on tocolytics. No trials compared atosiban with magnesium sulfate.
For each outcome no meaningful differences in residual deviance or deviance information criterion values were observed between the inconsistency and consistency models. Furthermore, overlap was substantial between the direct estimates (where available) and network meta-analysis estimates (figs 3-6). This provides support for the assumption of consistency required for network meta-analysis; that is, the estimates for direct treatment effect agree with those generated from the network meta-analysis. Direct pairwise meta-analyses involve only the trial results that directly compare the classes, whereas results from the network meta-analysis utilize the network calculations described previously. The class effect model provided an adequate fit to the data, with a posterior mean residual deviance of 114.4 (112 data points) for the outcome of delivery delayed by 48 hours, 82.6 (81 data points) for neonatal mortality, 78.9 (81 data points) for respiratory distress syndrome, and 123.2 (118 data points) for maternal side effects.
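The goodness of fit criterion (residual deviance close to the number of data points) can be checked directly from the figures reported above. The snippet below simply tabulates the ratio for each outcome; the variable names are illustrative, and a ratio near 1 is read informally as adequate fit rather than as a formal test.

```python
# Posterior mean residual deviance and number of data points, as reported in the text
fit = {
    "delivery delayed by 48 hours": (114.4, 112),
    "neonatal mortality": (82.6, 81),
    "respiratory distress syndrome": (78.9, 81),
    "maternal side effects": (123.2, 118),
}

# Ratio of residual deviance to data points; values near 1 suggest adequate fit
ratios = {outcome: deviance / n_points for outcome, (deviance, n_points) in fit.items()}
for outcome, ratio in ratios.items():
    print(f"{outcome}: {ratio:.3f}")
```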
Delivery delayed by 48 hours: efficacy
For the outcome delivery delayed by 48 hours 55 studies were included in the network meta-analysis and 54 in the pairwise meta-analysis (one trial was excluded as it compared two treatments from the same class). Direct evidence was available for 14 class versus class pairwise comparisons. Heterogeneity was evident in two of the random effects pairwise meta-analyses, both of which compared active treatment with placebo (beta mimetic v placebo, n=4 trials, P=0.08; and magnesium sulfate v placebo, n=3 trials, P=0.001); the I² values for these two comparisons were 50% or more (95% for magnesium sulfate versus placebo). Where available, the supplementary file reports the full results from the pairwise meta-analyses; however, the posterior median odds ratios (95% credible intervals) are also shown in fig 3 alongside the posterior median odds ratios for the full suite of 28 class comparisons available from the network meta-analysis.
The table reports the effectiveness of tocolytic therapies and adverse events from the network meta-analysis using placebo as the reference class to which all other drug classes were compared. All active classes were superior to placebo in successfully delaying delivery by 48 hours, although the category for others and for nitrates did not achieve conventional significance. The results from the network meta-analysis also suggested that prostaglandin inhibitors had a greater beneficial effect than all the other active classes. However, uncertainty in these estimates was considerable (table, fig 3, and supplementary table 3). Calcium channel blockers and magnesium sulfate had a greater effect than oxytocin receptor blockers, nitrates, beta mimetics, others, and placebo. Fig 7 shows the distribution of probabilities of each class being ranked at each of the possible eight positions. The most efficacious treatment class was prostaglandin inhibitors, which had an 83% probability of being the “best” class. This meant there was still a 17% probability of prostaglandin inhibitors not being the best class. The probability of being ranked in the top three most efficacious classes was 96% for prostaglandin inhibitors, 63% for magnesium sulfate, 57% for calcium channel blockers, 33% for beta mimetics, 24% for nitrates, 14% for oxytocin receptor blockers, 13% for others, and 0% for placebo. The probability of being ranked in the bottom three (least efficacious) was 99% for placebo, 79% for others, 52% for nitrates, 41% for oxytocin receptor blockers, 13% for beta mimetics, 10% for calcium channel blockers, 5% for magnesium sulfate, and 1% for prostaglandin inhibitors. See supplementary table 4 for the predicted odds ratios and intervals.
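Rank probabilities of this kind are typically obtained by ranking the classes within each posterior draw and counting how often each class falls in a given position. The sketch below illustrates that counting step only; the class labels, means, and standard deviations are hypothetical stand-ins for posterior samples, not output from the WinBUGS model used in this review.

```python
import random


def rank_probabilities(draws, top_k=3):
    """draws: list of dicts mapping class name to a sampled effect (higher is better).

    Returns (probability of being ranked first, probability of being in the top_k).
    """
    classes = list(draws[0])
    best = dict.fromkeys(classes, 0)
    top = dict.fromkeys(classes, 0)
    for draw in draws:
        ordered = sorted(classes, key=lambda c: draw[c], reverse=True)
        best[ordered[0]] += 1
        for c in ordered[:top_k]:
            top[c] += 1
    n = len(draws)
    return ({c: best[c] / n for c in classes}, {c: top[c] / n for c in classes})


rng = random.Random(0)
# Hypothetical (mean, SD) of log odds ratios versus placebo; placebo is fixed at 0
hypothetical = {
    "class A": (1.6, 0.45),
    "class B": (1.0, 0.30),
    "class C": (0.9, 0.40),
    "class D": (0.7, 0.35),
    "placebo": (0.0, 0.0),
}
draws = [{c: rng.gauss(m, s) for c, (m, s) in hypothetical.items()} for _ in range(10000)]
p_best, p_top3 = rank_probabilities(draws)
for c in hypothetical:
    print(f"{c}: P(best) = {p_best[c]:.2f}, P(top three) = {p_top3[c]:.2f}")
```

Because exactly one class is ranked first in every draw, the probabilities of being best sum to 1 across classes, which is why an 83% probability of being best leaves 17% for all other classes combined.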
Neonatal mortality
For the outcome neonatal mortality 40 trials were included in the network meta-analysis and 34 in the pairwise meta-analysis. Two trials were excluded from the pairwise analysis as they compared treatments from the same class (beta mimetics). Direct evidence was available for 13 class versus class pairwise comparisons (see supplementary appendix 2 for the full results). Heterogeneity was evident in one of the seven random effects analyses (beta mimetic v placebo, n=6 trials, P<0.001, I²=78%).
Fig 4 shows the posterior median odds ratios (95% credible intervals) from each pairwise meta-analysis alongside the posterior median odds ratios for the full suite of 21 class comparisons available from the network meta-analysis. The results from the direct, head to head meta-analyses were consistent with those of the network meta-analysis.
No clear evidence was found that any tocolytic was beneficial compared with placebo for neonatal mortality (table). The point estimates for the active versus active comparisons suggested that calcium channel blockers were more effective than all the other classes of tocolytics (fig 4). However, the credible intervals for the comparisons crossed 1 and uncertainty in all estimates was considerable. This uncertainty is reflected in the rankograms shown in fig 7, which suggest that although calcium channel blockers are the best class for reducing neonatal mortality, the probability was only 41% and so indicative of considerable uncertainty. Indeed, there was a 59% probability that they were not the best class for reducing neonatal mortality. Prostaglandin inhibitors had the next highest probability of being the best (28%). The probability that calcium channel blockers were ranked in the top three classes for reducing neonatal mortality was 85%, followed by beta mimetics (58%), oxytocin receptor blockers (56%), and prostaglandin inhibitors (54%). The worst performing classes were placebo (1%) and others (2%), followed by magnesium sulfate, with a 3% probability of being the best class for reducing neonatal mortality. These results reflect uncertainty around which class is associated with the fewest neonatal deaths.
Neonatal respiratory distress syndrome
For the outcome neonatal respiratory distress syndrome 42 trials were included in the network meta-analysis and 37 in the pairwise meta-analyses. Five trials were excluded from the pairwise meta-analyses as they compared treatments from within the same class. Direct evidence was available for 13 class versus class pairwise comparisons (see supplementary appendix 2 for the full results). Heterogeneity between trials was not evident in any comparison.
Fig 5 shows the posterior median odds ratios (95% credible intervals) from each pairwise meta-analysis alongside the posterior median odds ratios for the full suite of 21 class comparisons available from the network meta-analysis for respiratory distress syndrome.
The results from the network meta-analysis (table and fig 5) suggested no evidence of a difference between the classes in reducing respiratory distress syndrome and no clear effect compared with placebo. The rankograms (fig 7) suggested that calcium channel blockers were the best class for reducing respiratory distress syndrome; however, this probability was only 47% and so indicative of considerable uncertainty. The probability that calcium channel blockers were ranked in the top three classes for reducing respiratory distress syndrome was 80%. The worst performing class was the others, with only an 11% probability of being in the top three classes for reducing respiratory distress syndrome.
All cause maternal side effects
For the outcome all cause maternal side effects 58 trials were included in the network meta-analysis and 55 in the pairwise meta-analyses. Three trials were excluded from the pairwise meta-analyses as they compared treatments from the same class. Direct evidence was available for 16 class versus class pairwise comparisons.
Fig 6 shows the posterior median odds ratios (95% credible intervals) from each pairwise meta-analysis (see supplementary appendix 2 for full results). Heterogeneity was statistically significant in four of the pairwise meta-analyses: magnesium sulfate versus prostaglandin inhibitors (n=3 trials), magnesium sulfate versus beta mimetics (n=6), magnesium sulfate versus calcium channel blockers (n=5), and calcium channel blockers versus beta mimetics (n=15). The I² statistic for these four comparisons was 70% or more.
The network meta-analysis provided treatment effect estimates for all 21 of the pairwise comparisons that are possible from the seven drug classes (fig 6). Although the point estimates indicated that placebo was responsible for fewer maternal side effects than all active classes, some evidence suggested that prostaglandin inhibitors and oxytocin receptor blockers were reasonably well tolerated, with credible intervals consistent with a reduction in all cause maternal side effects (table). Fig 7 reports the probabilities of every drug class being ranked at each of the possible seven positions. For all cause maternal side effects, placebo was ranked first, with a probability of 61%, suggesting that women given placebo experienced fewer maternal side effects. Placebo had a 98% probability of being ranked in the top three drug classes. The closest active competitor for reducing all cause maternal side effects was prostaglandin inhibitors, with a 79% probability of being ranked in the top three, followed by oxytocin receptor blockers (70%). Calcium channel blockers had only a 15% probability of being ranked in the top three drug classes for maternal side effects.
The four outcomes analyzed were the most commonly and consistently reported across the network of trials. It was not possible to fit either pairwise or network meta-analysis models to all the outcomes extracted for the systematic review. This was because of concerns about multiplicity and because of the small number of studies that consistently reported other outcomes such as delivery delayed for seven days, delivery delayed until 37 weeks’ gestation, bronchopulmonary dysplasia, fetal sepsis, intraventricular hemorrhage, necrotizing enterocolitis, hyperbilirubinemia, and premature closure of the ductus arteriosus. Only seven trials reported neonatal composite outcomes.42 50 62 106 108 117 120 The components of the composites in each of these trials were different. Because of this and the way outcomes were reported by individual trials, no overall result for composite outcome could be analyzed.
Two sensitivity analyses were carried out for the primary outcome: one restricted to studies at low risk of bias and one restricted to those excluding women at high risk of preterm delivery (women with multiple gestation and ruptured membranes). Neither analysis changed the finding that prostaglandin inhibitors were the best class for delaying preterm delivery by 48 hours. Excluding these trials removed the class other from the analysis. Class treatment effects were not modified when metaregression was used to explore the effect of the planned duration of treatment (short term or prolonged).
Balancing the results relating to benefits and harms, this systematic review and network meta-analysis on trials of tocolytics found that prostaglandin inhibitors and calcium channel blockers have the highest probability of being the best therapy for preterm delivery on the basis of the four outcomes: delivery delayed by 48 hours, neonatal mortality, neonatal respiratory distress syndrome, and maternal side effects (all causes). Of all the classes considered, prostaglandin inhibitors had the highest probability of being the most effective class for delaying preterm delivery and had the most favorable maternal side effect profile. They did not, however, perform as well for the neonatal outcomes. When ranking in the top three classes for delaying delivery was considered, calcium channel blockers also performed reasonably well, with a probability of 57%. For the two neonatal outcomes, calcium channel blockers also had the highest probability of being the best class. Uncertainty was, however, considerable and the benefit from calcium channel blockers must be considered in light of a somewhat higher probability of being associated with maternal side effects. To fully encapsulate the uncertainty arising from the random effects analysis we also calculated the predicted treatment effect for a new study on a randomly chosen drug from each class reported with a 95% prediction interval (see supplementary table 4). For the outcome of delivery being delayed by 48 hours the prediction intervals were wider than the credible intervals; only prostaglandin inhibitors continued to show a statistically significant beneficial treatment effect. For maternal side effects, only beta mimetics continued to show a statistically significant harmful effect, when compared with placebo. For all other outcomes and all comparisons the prediction interval crossed the line of null effect, indicating the true extent of the uncertainty.
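The difference between a credible interval for a class mean and a prediction interval for a new study can be illustrated by simulation. The sketch below assumes one plausible construction, in which a new drug's effect is drawn around the class mean and a new study's effect is then drawn around that drug's effect. The exact construction used for supplementary table 4 is not given in the text, the standard deviations are held fixed for simplicity (in a full Bayesian analysis they would be sampled too), and all numbers are hypothetical.

```python
import random


def percentile_interval(samples, level=0.95):
    """Equal tailed interval from sorted samples."""
    ordered = sorted(samples)
    n = len(ordered)
    lower = ordered[int((1 - level) / 2 * n)]
    upper = ordered[min(n - 1, int((1 + level) / 2 * n))]
    return lower, upper


rng = random.Random(1)
n_draws = 20000

# Hypothetical posterior draws of a class mean log odds ratio versus placebo
class_mean = [rng.gauss(1.0, 0.3) for _ in range(n_draws)]
within_class_sd = 0.3   # assumed variability between drugs in the same class
between_trial_sd = 0.5  # assumed between trial heterogeneity

# New drug from the class, then a new study of that drug
predicted = [
    rng.gauss(rng.gauss(m, within_class_sd), between_trial_sd) for m in class_mean
]

credible = percentile_interval(class_mean)
prediction = percentile_interval(predicted)
print(f"95% credible interval:   {credible[0]:.2f} to {credible[1]:.2f}")
print(f"95% prediction interval: {prediction[0]:.2f} to {prediction[1]:.2f}")
```

The prediction interval is wider because it adds the within class and between trial variability to the uncertainty in the class mean, which is why fewer comparisons remain statistically significant on that scale.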
Weighing the balance of the results seems to indicate that prostaglandin inhibitors would be reasonable first line agents, followed by calcium channel blockers. This conclusion holds even when the analysis is limited to studies with the least risk of bias and with the most clinically homogeneous participant populations. In the evidence base considered here, however, only one small (n=79) head to head trial compared prostaglandin inhibitors with a calcium channel blocker.116 The findings in this trial suggest that nifedipine (calcium channel blocker) was more effective than indomethacin (prostaglandin inhibitor) for rapid treatment effect but that the delay in delivery was similar between women in both groups who initially responded to treatment. Respiratory distress syndrome was not reported in the study and two neonatal deaths occurred in the indomethacin group compared with none in the nifedipine group.116 Given the small amount of direct evidence and considerable uncertainty we identified for the neonatal outcomes, the findings from our network meta-analysis suggest that a head to head trial of these agents is needed to investigate further the effectiveness, adverse effects, and costs of these regimens to women. We therefore plan on carrying out an expected value of information analysis.
The finding that these two classes of medications have the best outcomes was similar to the findings of a recent pooled meta-analysis and decision analysis.1 In our analysis, however, we used a hierarchical class model, which also retains the individual identity of the within class treatments. Therefore we were able to include all the available evidence. The consistency of findings despite two different methods for meta-analysis strengthens the argument for these agents being the first-line choice for tocolysis. In line with other findings, tocolysis was beneficial for delaying delivery for at least 48 hours compared with no tocolysis.121
Prostaglandin inhibitors have been studied widely but their use is limited in practice. Some data indicate a possible association between neonatal complications and antenatal prostaglandin inhibitors, including reversible premature closure of the ductus arteriosus.5 122 123 A Cochrane review of trials on prostaglandin inhibitors to prevent preterm delivery found that data, albeit limited, did not show increased adverse neonatal outcomes.8 Prostaglandin inhibitors may be effective in delaying delivery partly because a large proportion of women are at risk of preterm delivery owing to intrauterine inflammation and infection.124 Clinically, practitioners who use indomethacin typically limit its use to pregnancies under 32 weeks’ gestation. Because of the limited trial data, in our network meta-analysis we were unable to determine a difference in the rate of premature closure of the ductus arteriosus with prostaglandin inhibitors. Indomethacin is commonly used to close a patent ductus in the newborn period.125 126
To our knowledge this is the first application of network meta-analysis in obstetrics, particularly in tocolysis for preterm delivery. In obstetrics, where multiple treatment options are usual but are often compared only pairwise, network meta-analysis has several benefits. We were able to combine all the available evidence on tocolytic treatments in a single pooled analysis, even where treatments had not been compared in a direct trial. Network meta-analysis makes the assumption of consistency, which should always be checked.25 We found no evidence of inconsistency, and empirical studies comparing direct and indirect evidence have found no systematic differences.127 Where differences have been found, they seem to have been in situations where the doses or treatment combinations in the direct and indirect evidence were not comparable.128 129 As long as the assumption of consistency is fulfilled, all relevant treatments, even those that are deemed outdated or ineffectual, can be included in the network to utilize the information on relative treatment effects and inform the rankings.130 Furthermore, a randomized controlled trial with all of these arms designed to answer the same questions presented here would require the randomization of many more participants.1 131
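The idea of comparing direct and indirect evidence can be sketched with a simple adjusted indirect comparison through a common comparator. This is a generic illustration and not necessarily the consistency check used in this analysis. The function names and inputs are hypothetical, and all estimates are assumed to be on the log odds ratio scale with known standard errors.

```python
import math
from typing import Tuple


def indirect_estimate(d_ac: float, se_ac: float,
                      d_bc: float, se_bc: float) -> Tuple[float, float]:
    """Indirect log odds ratio of A versus B through a common comparator C."""
    # The comparator cancels out; the variances of the two estimates add.
    return d_ac - d_bc, math.sqrt(se_ac ** 2 + se_bc ** 2)


def inconsistency_z(direct: float, se_direct: float,
                    indirect: float, se_indirect: float) -> float:
    """z statistic for the difference between direct and indirect estimates."""
    return (direct - indirect) / math.sqrt(se_direct ** 2 + se_indirect ** 2)


# Hypothetical inputs on the log odds ratio scale, for illustration only.
d_ind, se_ind = indirect_estimate(d_ac=1.0, se_ac=0.3, d_bc=0.4, se_bc=0.4)
z = inconsistency_z(direct=0.5, se_direct=0.5,
                    indirect=d_ind, se_indirect=se_ind)
```

A z statistic close to zero, as in this toy example, would give no indication of inconsistency between the two sources of evidence for that comparison.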
Strengths and limitations of the study
A strength of our systematic review was the inclusion of several non-English language trials. These trials are often excluded from meta-analyses but help to inform the model. We therefore believe that we have included all relevant randomized controlled trials on tocolytics up to the search date.
Our analysis was limited by the data in the included studies and the structure of the reported data. For example, neonatal mortality was included as an outcome because of its high clinical importance when evaluating the harms and benefits of tocolytic therapy. However, few neonatal deaths were reported across the network of trials. Indeed, even after excluding trials with zero deaths on both arms, 24/40 trials (60%) reported one or fewer deaths on at least one arm. Meta-analysis of rare events is known to be problematic.132 133 This is further compounded in network meta-analysis if there are also few trials per comparison, as here for the neonatal mortality outcome (median 3 (range 1-8) trials per comparison). While we understand the importance of this outcome and realize that excluding these trials may lead to a problematic overestimation of rates of neonatal mortality, including these trials in the analysis invalidated the statistical models. Therefore, extra caution should be exercised when interpreting the treatment rankings for this outcome. Furthermore, outcomes were not consistently reported across the ensemble of trials. For example, the most commonly reported efficacy outcome was delivery delayed by 48 hours. However, some studies reported delays at 72 hours or at seven days. Similarly, some studies reported effectiveness as delivery after 34 weeks, 36 weeks, or after 38 weeks’ gestation. This is essentially the same underlying variable, and with better reporting these data would allow a more thorough synthesis.134 A further finding from our systematic review was the need for consistent outcome reporting across trials on tocolytics.
We were also unable to utilize the maternal and neonatal data for several studies that did not include the same outcome measures as the other trials—that is, they were not included in the network meta-analysis although they met the eligibility criteria for the systematic review. This is one of the reasons we were unable to create a composite outcome. Owing to the way data were presented by the individual studies, it was impossible to discern if neonates had more than one outcome, such as respiratory distress syndrome, intraventricular hemorrhage, and necrotizing enterocolitis. Different components of a composite outcome were reported by only seven trials.42 50 62 106 108 117 120 Attempting to extrapolate a composite score for the individual studies would have essentially duplicated the analysis on respiratory distress syndrome. An individual participant meta-analysis might overcome this limitation. It is also possible that some trials, such as potentially some non-English language trials, were not included. We diligently searched the accessible literature on tocolytic studies. The trials included represent the major accessible published literature on tocolytic therapy.
The results may also be limited by the modeling assumptions. We preferred a class effect model because of clinical characteristics and mechanisms of the drugs within the class. When compared with a distinct treatment effect model, the class model reported here showed reasonable fit and provided some support for the assumption of a tocolytic class effect. In addition to the class model reported, we also explored a model with the strong assumption of a single class effect, such that all treatments within a class had the same effect. This model gave an adequate fit to the data. However it did not allow estimation of individual treatment effects, which we considered desirable. We also attempted to relax the assumption of equal within class variances, by fitting a model assuming each individual treatment has a distinct effect, but from a common class, with common class effect and a class specific variance. However, we were unable to obtain results owing to computational problems from insufficient data to estimate the variances.
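One way to write a class effect model of this kind is sketched below. The notation is assumed for illustration and may differ from the specification actually fitted: $r_{ik}$ and $n_{ik}$ are the events and participants in arm $k$ of trial $i$, $b_i$ is the baseline arm of trial $i$, $d_k$ is the effect of treatment $k$ relative to the reference (with $d_{\text{ref}} = 0$), $c(k)$ is the class of treatment $k$, and $m_{c}$ is the mean effect of class $c$. Correlations within multi-arm trials are ignored for brevity.

```latex
\begin{align*}
r_{ik} &\sim \mathrm{Binomial}(n_{ik},\ p_{ik}), \\
\operatorname{logit}(p_{ik}) &= \mu_i + \delta_{ik}, \qquad \delta_{i b_i} = 0, \\
\delta_{ik} &\sim \mathcal{N}\!\left(d_k - d_{b_i},\ \tau^2\right), \\
d_k &\sim \mathcal{N}\!\left(m_{c(k)},\ \sigma^2\right).
\end{align*}
```

In this notation, the strong single class effect assumption corresponds to setting $d_k = m_{c(k)}$ for every treatment in a class (that is, $\sigma^2 = 0$), and relaxing the equal within class variance corresponds to replacing $\sigma^2$ with a class specific $\sigma^2_{c(k)}$.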
Because of the multitude of doses used for many trials, we were unable to stratify for dose of drug. However, to deal with potential concerns about heterogeneity caused by dose, we carried out a metaregression by treatment duration, which did not alter the findings. During our systematic review, we assessed the risk of bias in the retrieved studies. To aid in doing a sensitivity analysis by risk of bias, we arbitrarily combined the risk of bias assessments into a composite score. Although this is often discouraged,135 it was necessary for our planned sensitivity analysis. To facilitate a more global view of the risk of bias in the presented studies the full assessments are presented in the supplementary file. In addition, some of these drugs are not licensed in some countries and may not be available to practitioners.
The data contained in the different trials highlight inconsistencies in reporting of outcomes. Delay of delivery was consistently reported and thus the model fits this outcome best. It was more difficult when focusing on maternal safety and on important neonatal outcomes. Long term morbidity and mortality outcomes were inconsistently reported. In future trials, a standard list of both maternal safety and neonatal short term and long term outcomes should be reported to allow researchers to understand the benefits or lack of benefits of tocolytic therapy.
Tocolytic therapy can delay delivery and has an impact on short term neonatal outcomes. In this network meta-analysis, prostaglandin inhibitors and calcium channel blockers had the highest probability of delaying delivery and improving neonatal outcomes.
What is already known on this topic
Tocolytics are used to delay preterm delivery to allow antenatal corticosteroids to be administered to improve neonatal outcomes
Many different drugs have been utilized as tocolytic therapy, but a standard first-line drug has not emerged
A multitude of trials have compared a few drugs with each other but no comprehensive trial has compared all commonly used drug classes
What this study adds
In this network meta-analysis of tocolytic therapy, prostaglandin inhibitors and calcium channel blockers had the highest probability of delaying delivery and improving neonatal outcomes
Network meta-analysis techniques can be applied to obstetric interventions with heterogeneous treatment options
Reporting of clinically relevant outcome comparisons should be improved and made more consistent
Cite this as: BMJ 2012;345:e6226
Contributors: All authors contributed equally to the study design and preparation and approval of the manuscript. DMH, PK, and JJMcI carried out the literature searches and extracted the data. DMC and NJW did the statistical analysis. DMH is guarantor.
Funding: This study was supported by grants: NIH-NICHD K23HD055305 (DMH) and the Indiana University-Purdue University-Indianapolis Signature Center grant to the Indiana University Center for Pharmacogenetics and Therapeutics Research in Maternal and Child Health (PREGMED). The funding agencies had no role in the study design, implementation, or preparation of results.
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; and no other relationships or activities that could appear to have influenced the submitted work. DMC and NJW have taught for Pfizer and provided training on network meta-analysis to a research organization that undertakes systematic reviews and network meta-analysis for industry. None of these activities directly conflicts with the network meta-analysis presented.
Ethical approval: This study was approved by the Indiana University-Purdue University Indianapolis-Clarian institutional review board.
Data sharing: No additional data available.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.