Induction of labour at 41 weeks versus expectant management until 42 weeks (INDEX): multicentre, randomised non-inferiority trial
BMJ 2019; 364 doi: https://doi.org/10.1136/bmj.l344 (Published 20 February 2019) Cite this as: BMJ 2019;364:l344
- Judit KJ Keulen, PhD student, midwife (1),
- Aafke Bruinsma, PhD student, midwife (1),
- Joep C Kortekaas, PhD student, resident in gynaecology (2),
- Jeroen van Dillen, obstetrician gynaecologist, maternal fetal medicine (2),
- Patrick MM Bossuyt, professor of clinical epidemiology (3),
- Martijn A Oudijk, obstetrician gynaecologist, maternal fetal medicine (1, 4),
- Ruben G Duijnhoven, trialist, statistician (1),
- Anton H van Kaam, professor of neonatology (5),
- Frank PHA Vandenbussche, professor of obstetrics and fetal medicine (2),
- Joris AM van der Post, professor of obstetrics and gynaecology (1),
- Ben Willem Mol, professor of obstetrics and gynaecology (6),
- Esteriek de Miranda, midwife, assistant professor (1)
- (1) Department of Obstetrics and Gynaecology, Amsterdam UMC, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, Netherlands
- (2) Department of Obstetrics and Gynaecology, Radboud University Medical Centre, Nijmegen, Netherlands
- (3) Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Amsterdam UMC, University of Amsterdam, Netherlands
- (4) Department of Obstetrics and Gynaecology, University Medical Centre, Utrecht, Netherlands
- (5) Department of Neonatology, Emma Children’s Hospital, Amsterdam UMC, University of Amsterdam, Netherlands
- (6) Department of Obstetrics and Gynaecology, Monash University, Clayton, Victoria, Australia
- Correspondence to: E de Miranda
- Accepted 21 January 2019
Objective To compare induction of labour at 41 weeks with expectant management until 42 weeks in low risk women.
Design Open label, randomised controlled non-inferiority trial.
Setting 123 primary care midwifery practices and 45 hospitals (secondary care) in the Netherlands, 2012-16.
Participants 1801 low risk women with an uncomplicated singleton pregnancy: randomised to induction (n=900) or to expectant management until 42 weeks (n=901).
Interventions Induction at 41 weeks or expectant management until 42 weeks with induction if necessary.
Primary outcome measures Primary outcome was a composite of perinatal mortality and neonatal morbidity (Apgar score <7 at five minutes, arterial pH <7.05, meconium aspiration syndrome, plexus brachialis injury, intracranial haemorrhage, and admission to a neonatal intensive care unit (NICU)). Secondary outcomes included maternal outcomes and mode of delivery. The null hypothesis that expectant management is inferior to induction was tested with a non-inferiority margin of 2%.
Results Median gestational age at delivery was 41 weeks+0 days (interquartile range 41 weeks+0 days-41 weeks+1 day) for the induction group and 41 weeks+2 days (41 weeks+0 days-41 weeks+5 days) for the expectant management group. The primary outcome was analysed for both the intention-to-treat population and the per protocol population. In the induction group, 15/900 (1.7%) women had an adverse perinatal outcome versus 28/901 (3.1%) in the expectant management group (absolute risk difference −1.4%, 95% confidence interval −2.9% to 0.0%, P=0.22 for non-inferiority). 11 (1.2%) infants in the induction group and 23 (2.6%) in the expectant management group had an Apgar score <7 at five minutes (relative risk (RR) 0.48, 95% CI 0.23 to 0.98). No infants in the induction group and three (0.3%) in the expectant management group had an Apgar score <4 at five minutes. One fetal death (0.1%) occurred in the induction group and two (0.2%) in the expectant management group. No neonatal deaths occurred. 3 (0.3%) neonates in the induction group versus 8 (0.9%) in the expectant management group were admitted to an NICU (RR 0.38, 95% CI 0.10 to 1.41). No significant difference was found in composite adverse maternal outcomes (induction n=122 (13.6%) v expectant management n=102 (11.3%)) or in caesarean section rate (both groups n=97 (10.8%)).
Conclusions This study could not show non-inferiority of expectant management compared with induction of labour in women with uncomplicated pregnancies at 41 weeks; instead a significant difference of 1.4% was found for risk of adverse perinatal outcomes in favour of induction, although the chances of a good perinatal outcome were high with both strategies and the incidence of perinatal mortality, Apgar score <4 at five minutes, and NICU admission low.
Trial registration Netherlands Trial Register NTR3431.
Post-term pregnancy, defined as a pregnancy extended to or beyond 42 weeks, or 294 days or more, is associated with increased perinatal morbidity and mortality.12345678910 The World Health Organization and various guidelines throughout the world therefore recommend induction of labour after 42 weeks.101112131415 Although the overall probability of favourable perinatal outcomes between 40 and 42 weeks is good in high resource settings, the risk of adverse perinatal outcome increases gradually after 40 weeks.16171819
Several studies concluded that induction of labour from 41 weeks onwards improves perinatal outcomes, and this has been confirmed in a meta-analysis.16171920 These results need to be interpreted with caution, however, because of heterogeneity between trials: outcome measures, protocols, and time frames of comparison differed, and several trials compared induction beyond 41 weeks, or induction starting at 42 weeks, with a policy of expectant management far beyond 42 weeks.21
The obstetric management of women with a pregnancy exceeding 41 weeks varies considerably between and within countries. Although induction at 41 weeks has now become an accepted policy in many countries, in some others no consensus exists on the timing of induction in late term pregnancy. In Sweden and the Netherlands, for example, expectant management until 42 weeks is considered standard of care in women with an uncomplicated pregnancy.1522 In Norway, induction is started no later than 42 weeks, and in Denmark delivery takes place before 42 weeks. Guidelines from the Royal College of Obstetricians and Gynaecologists/National Institute for Health and Care Excellence recommend that women should be offered induction between 41 and 42 weeks.23
We compared two strategies: induction of labour at 41 weeks (+0 days/+1 day) and expectant management until 42 weeks (+0 days) with subsequent induction if necessary. We anticipated that a policy of expectant management until 42 weeks, being the simpler strategy, would be acceptable for a low risk population if it did not lead to a substantially higher proportion of women with adverse perinatal outcomes compared with induction at 41 weeks.
Because induction of labour at 41 weeks as well as expectant management until 42 weeks are practised in the Netherlands, our study was designed to investigate non-inferiority of expectant management. We conducted a multicentre, open label, randomised controlled non-inferiority trial to investigate the effect of INDuction of labour at 41 weeks versus a policy of EXpectant management until 42 weeks (INDEX trial) on adverse perinatal outcomes. Women were recruited at 123 primary care midwifery practices and 45 hospitals (secondary care) equally distributed across the Netherlands. Twenty six of these 45 hospitals actively recruited participants, and 19 supported the study by inducing labour in women who had been recruited in a primary care setting and were allocated to induction. In the Netherlands obstetric care is provided by primary care (midwives) for low risk women and secondary care (clinical midwives, residents, and obstetricians) for women with an increased risk of adverse maternal or perinatal outcome, or both. Low risk women in primary care can give birth at home or in an outpatient setting (birth centre or hospital), whereas women in secondary care give birth in hospital. For most low risk women, independent primary care midwives provide obstetric care. If risk factors are present during pregnancy, labour, or the postpartum period, women are referred to secondary care (obstetrician or gynaecologist). Secondary care may also be provided by clinical midwives or trainee obstetricians under the responsibility of an obstetrician.24252627282930
Our protocol has been published previously.31 The study was performed within the Dutch Consortium for Healthcare Evaluation and Research in Obstetrics and Gynaecology in cooperation with the Midwifery Research Network Netherlands.
Women were eligible for the study if they had a low risk, uncomplicated singleton pregnancy with the child in a stable cephalic position at a certain gestational age of 40 weeks+5 days to 41 weeks+0 days and no contraindications to expectant management until 42 weeks. Gestational age had to be determined by ultrasonography before a gestational age of 16 weeks. Exclusion criteria for the study were age younger than 18 years, ruptured membranes or in labour, or both, non-reassuring fetal status (eg, no fetal movements, or abnormal fetal heart rate and/or expected intrauterine growth restriction), known fetal abnormalities (including abnormal karyotype) that could influence perinatal outcome, contraindications to induction (including previous caesarean section), or contraindications to expectant management (eg, pregnancy induced hypertension).
Randomisation and masking
Eligible women were informed about the study at the 40 week antenatal check. At their next visit (40 weeks+5 days to 41 weeks+0 days) the women were counselled by the community midwife, secondary obstetric caregiver, or research nurse or research midwife of the participating centres collaborating in the Dutch Obstetric Research Consortium. After written informed consent had been obtained, the study participants underwent digital vaginal examination to determine the Bishop score, which is used to assess the ripeness of the cervix before planning of induction of labour. It rates position, consistency, and dilation of the cervix and engagement of the fetal head (station) in a single score. Sweeping of the membranes was optional. Participants were randomly allocated by a web based program (ALEA) using randomly permuted block sizes of 4 and 2, stratified by centre, to induction of labour at 41 weeks+0 days-41 weeks+1 day or to expectant management with subsequent induction if necessary at 42 weeks+0 days. Owing to the nature of the intervention it was not possible to blind the women or caregivers to treatment allocation.
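The allocation scheme described above (randomly permuted blocks of sizes 4 and 2, stratified by centre) can be illustrated with a minimal sketch. This is not the ALEA implementation, whose internals are not described here; the function name, the arm labels, the centre labels, and the seeds are assumptions made for illustration only.

```python
import random

def permuted_block_sequence(n_participants, block_sizes=(4, 2), seed=None):
    """Build an allocation list from randomly sized, internally balanced blocks."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        size = rng.choice(block_sizes)                    # pick a block size at random
        block = ["induction", "expectant"] * (size // 2)  # equal numbers per arm
        rng.shuffle(block)                                # permute order within the block
        sequence.extend(block)
    return sequence[:n_participants]

# Stratification by centre: each centre gets its own independent sequence.
centres = ["centre_A", "centre_B"]  # hypothetical centre labels
allocation = {c: permuted_block_sequence(20, seed=i) for i, c in enumerate(centres)}
```

Because every completed block is balanced, the two arms within a centre can differ by at most two participants at any point, while the varying block size makes the next allocation harder to predict.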
Women allocated to induction were scheduled for the procedure at 41 weeks+0 days-41 weeks+1 day. All women were primed or induced, or both according to local protocols. Women with a Bishop score of less than 6 received cervical priming with prostaglandin E1 (misoprostol, oral or vaginal), prostaglandin E2 (dinoprostone), Foley catheter or double balloon catheter, or a combination of these until amniotomy could be performed. Amniotomy was followed by intravenous oxytocin if required.
Women in primary and secondary care who were allocated to expectant management awaited spontaneous onset of labour until 42 weeks+0 days in their initial care setting, with monitoring according to local protocol. Monitoring typically involved a combination of cardiotocography and sonographic assessment of amniotic fluid in secondary care at 41-42 weeks. Women in the expectant management group with ongoing pregnancies were scheduled for induction at 42 weeks+0 days in secondary care, following a similar induction protocol to the intervention group.
In both groups, labour was induced if the maternal or fetal condition was no longer reassuring—for example, reduced fetal movements, non-optimal cardiotocography findings, or oligohydramnios. Labour was also induced if prelabour rupture of membranes had occurred more than 24 hours previously or meconium stained amniotic fluid was present.
The caregivers systematically collected information on perinatal and maternal condition, as well as protocol deviations and the reasons for these. Every case report form was checked for completeness and consistency. Trained staff entered data in an online digital case report form (Oracle Clinical, version 18.104.22.168.1). Anonymised source documents were collected at the midwifery practice or hospital to check adverse perinatal and maternal outcomes. Serious adverse events were reported on a case by case basis to an independent Data Safety and Monitoring Board and to the Dutch national internet portal for the submission, review, and disclosure of medical-scientific research with participants (www.toetsingonline.nl).
The primary outcome was a composite of perinatal mortality and neonatal morbidity.
Perinatal mortality was defined as fetal death, intrapartum death, and neonatal death until 28 days. Neonatal morbidity was defined as having an Apgar score <7 at five minutes and/or an arterial umbilical cord pH <7.05 and/or meconium aspiration syndrome and/or plexus brachialis injury and/or intracranial haemorrhage and/or being admitted to a neonatal intensive care unit (NICU). Though a neonate could suffer from more than one adverse event, it is counted as one composite adverse perinatal outcome (neonatal level).
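The counting rule for the composite (a neonate with several adverse events still counts once) can be sketched as follows. The record layout and field names are hypothetical and were chosen only to mirror the components listed above; they do not reflect the trial's actual case report form.

```python
# Hypothetical component flags, mirroring the definition in the text.
COMPONENTS = (
    "perinatal_death",
    "apgar_below_7_at_5min",
    "arterial_ph_below_7_05",
    "meconium_aspiration_syndrome",
    "plexus_brachialis_injury",
    "intracranial_haemorrhage",
    "nicu_admission",
)

def has_composite_outcome(neonate):
    """True if at least one component is present; multiple events count once."""
    return any(neonate.get(component, False) for component in COMPONENTS)

# Illustrative records only, not trial data.
neonates = [
    {"apgar_below_7_at_5min": True, "nicu_admission": True},  # two events, one outcome
    {"meconium_aspiration_syndrome": True},
    {},                                                       # no adverse event
]
n_composite = sum(has_composite_outcome(n) for n in neonates)
```

In this toy list, three neonates yield two composite outcomes, because the first record contributes only once despite having two events.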
We defined meconium aspiration syndrome as respiratory distress after birth in the presence of meconium stained amniotic fluid. NICU admissions were reviewed to reveal final diagnosis and presence of congenital anomalies.
The cut-off for Apgar score <7 at five minutes was based on the committee opinion of the American College of Obstetricians and Gynecologists and American Academy of Pediatrics (ACOG/AAP), 2006. In October 2015, after trial registration and during inclusion for this study, the ACOG/AAP committee released an update, which stated that the inappropriate use of the Apgar score in outcome studies had led to an erroneous definition of asphyxia.32 Although it is incorrect to use Apgar score alone to diagnose birth asphyxia, an Apgar score <4 at five minutes “can be considered as a non-specific sign of illness.” Because of this mid-trial change of cut-off value, we also planned an additional analysis of the primary outcome including Apgar scores <4 instead of <7 at five minutes.
Secondary outcomes consisted of maternal outcomes: instrumental delivery (instrumental vaginal delivery, caesarean section), pain treatment (epidural, remifentanil, pethidine), postpartum haemorrhage, and severe perineal injury (third or fourth degree perineal tear (obstetrical anal sphincter injuries (OASIS))). Other neonatal outcomes included admission to medium care, congenital abnormality, hypoglycaemia, neonatal infection or sepsis, and small for gestational age (<10th centile) or large for gestational age (>90th centile). We also added a composite of adverse maternal outcome and other delivery outcomes.
The composite adverse maternal outcome included postpartum haemorrhage (≥1000 mL), manual removal of the placenta, third or fourth degree perineal tear (obstetrical anal sphincter injuries), and admission to an intensive care unit (ICU). Other delivery outcomes concerned onset of labour, pain treatment during labour, use of tocolytics, maternal intrapartum infection, meconium stained amniotic fluid, gestational age at delivery, mode of delivery, episiotomy, total postpartum blood loss, and blood transfusion. Though a woman could experience more than one adverse event, it is counted as one composite adverse maternal outcome.
For both the perinatal and the maternal composite outcomes, we also compared the individual components.
Before the start of the trial, we formed an expert panel consisting of midwives, gynaecologists, paediatricians, and methodologists to conceive the design, content, and execution of the trial. Using data on adverse perinatal outcomes in the Netherlands from the Perined registry (www.perined.nl/), we expected an incidence of 3% for the primary composite adverse perinatal outcome with both strategies. The panel made a reasoned choice about the acceptable difference in adverse perinatal outcome and feasibility of the trial. As a result, the non-inferiority margin (∆) was defined as a 2% risk difference in incidence of the composite outcome favouring induction to justify a possible change in management strategy of pregnancies reaching a gestational age of 41 weeks+0 days.
With a one sided α of 0.05, the study could achieve a power (1−β) of more than 0.80 if 900 women were recruited in each trial arm (1800 women in total). Non-inferiority would be concluded if the lower limit of the 95% confidence interval of the risk difference excluded a 2% higher proportion of women with an adverse perinatal outcome in the group allocated to expectant management. We established a Data Safety Monitoring Board to review the accumulating data of the trial. Interim analyses were conducted on safety after 517 and 1088 women had been recruited.
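The stated design values (3% expected incidence in both arms, a 2% margin, one sided α of 0.05, 900 women per arm) can be checked against a standard normal approximation for a non-inferiority comparison of two proportions. This is a sketch under that assumption; the exact formula the authors used is not given in the text, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def noninferiority_power(p_event, margin, n_per_arm, alpha=0.05):
    """Approximate power when the true risks are equal in both arms."""
    # Standard error of the difference between two equal proportions
    se = sqrt(2 * p_event * (1 - p_event) / n_per_arm)
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # one sided critical value
    return NormalDist().cdf(margin / se - z_alpha)

power = noninferiority_power(p_event=0.03, margin=0.02, n_per_arm=900)
```

Under these assumptions the approximation gives a power close to 0.80, in line with the sample size of 900 women per arm stated above.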
The statistician who performed the analyses was blinded to the allocation of the participants and performed the analysis according to a predefined analysis plan. The analysis of the primary outcome was done for both the intention-to-treat groups and the per protocol groups. For the per protocol analysis, we selected all randomised women with start of cervical ripening or spontaneous onset of labour at 41 weeks+0 days or more. Subsequently we defined the per protocol induction group as women allocated to induction who received induction before 41 weeks+2 days or who had a spontaneous onset of labour before induction could be started (<41 weeks+2 days). The per protocol expectant management group included women allocated to expectant management with spontaneous onset of labour until 42 weeks+0 days, women with a medical reason for induction before 42 weeks+0 days during expectant management, and women with induction at 42 weeks+0 days or more.
For all outcomes we estimated relative risks (RR) or median or mean differences, with 95% confidence intervals. As appropriate, we investigated significance using χ2 test, Fisher’s exact test, t test, or Mann-Whitney U test statistics. We plotted Kaplan-Meier curves for the time between randomisation and birth. The log-rank test statistic was used to evaluate the difference in time to birth. Birth centiles were determined using national reference data for the Netherlands on birthweight, ethnicity, parity, and gestational age by week and day. Analyses were performed using SAS software for Windows, version 9.4 (SAS Institute, Cary, NC).
Patient and public involvement
No patients were asked for input in the creation of this article. Patient representatives will be asked to join a multidisciplinary working group consisting of (representatives of) obstetric caregivers (primary and secondary care) and neonatologists to create a new nationwide guideline addressing the management of late term pregnancy. Patients will also be involved in writing patient information brochures and a patient decision aid on this topic.
Between 14 May 2012 and 17 March 2016, 6088 eligible women were invited to participate in the INDEX trial, of whom 4273 declined owing to a maternal preference for induction of labour or expectant management, or refusal to let randomisation determine the management strategy. After randomisation but before analysis, one woman (induction group) withdrew her consent, and 13 women did not meet the eligibility criteria (n=6 induction and n=7 expectant management). Of the remaining 1801 participants, 900 were randomly allocated to the induction group and 901 to the expectant management group (fig 1). Baseline characteristics were comparable between the groups, except for nulliparity: induction 50.8% (457/900) and expectant management 56.7% (511/901) (table 1).
In the induction group, 28.9% (260/900) of the women had a spontaneous onset of labour before the planned induction, and 71.1% (640/900) underwent induction, of whom 59.7% (382/640) had cervical ripening (fig 2). In the induction group, 4.8% (43/900) of the women were not induced at 41 weeks+0 days-41 weeks+1 day but at 41 weeks+2 days or later.
In the expectant management group, 73.7% (664/901) of the women had a spontaneous onset of labour and 26.3% (237/901) were induced (55.7% (132/237) underwent cervical ripening). In the expectant management group, 35.9% (85/237) underwent induction at 42 weeks for post-term pregnancy, and 27.4% (65/237) underwent induction before 42 weeks due to medical reasons (eg, fetal condition in 15.6% (37/237), maternal condition in 9.7% (23/237)), whereas 36.7% (87/237) in the expectant management group underwent induction on request. The median gestational age at time of delivery was 287 days (interquartile range 287-288 days) corresponding with 41 weeks+0 days (interquartile range 41 weeks+0 days-41 weeks+1 day) for the induction group and 289 days (interquartile range 287-292 days), corresponding with 41 weeks+2 days (41 weeks+0 days-41 weeks+5 days) in the expectant management group (table 2). In both groups three quarters of the women had a Bishop score <6 at study entry. Figure 3 shows the time to delivery for both groups.
In the per protocol induction group, 15.1% (92/611) of the women had spontaneous onset of labour before the planned induction. Of these women, 11.1% (67/611) had a spontaneous onset of labour at 41 weeks+0 days and 4.1% (25/611) at 41 weeks+1 day. In the per protocol induction group, 84.9% (519/611) of the women were induced: 62.5% (382/611) at 41 weeks+0 days and 22.4% (137/611) at 41 weeks+1 day. In the per protocol expectant management group, 80.9% (524/647) of the women had a spontaneous onset of labour at 41 weeks+0 days or later and 19.0% (123/647) were induced: 5.4% (35/647) because of concerns about fetal condition at 41 weeks+0 days or later, 3.4% (22/647) because of maternal condition at 41 weeks+0 days or later, 0.6% (4/647) because of rupture of the membranes more than 24 hours previously at 41 weeks+0 days or later, and 9.6% (62/647) because of post-term pregnancy (≥42 weeks+0 days).
Table 3 presents the perinatal outcomes in the intention-to-treat analysis. Fifteen women in the induction group (1.7%) and 28 in the expectant management group (3.1%) had a composite adverse perinatal outcome (absolute risk difference −1.4%, 95% CI −2.9% to 0.0%; number needed to treat (NNT) 69, 95% CI 35 to 3059). The P value for non-inferiority was 0.22, indicating that we could not exclude the possibility that expectant management leads to an absolute increase of 2% or more in adverse perinatal outcomes compared with induction. All neonates in the expectant management group with a composite adverse perinatal outcome were born in secondary care. In these cases, women either had labour started in secondary care or were transferred during labour from primary to secondary care.
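The intention-to-treat figures above can be approximately reproduced with a simple Wald calculation on the reported counts (15/900 v 28/901). This is a sketch only: the text does not state which interval or test method was used, so the Wald approach and the function name are assumptions, and results for other analyses in this paper may differ from what this method gives.

```python
from math import sqrt
from statistics import NormalDist

def risk_difference_noninferiority(events_a, n_a, events_b, n_b, margin=0.02):
    """Wald risk difference (a minus b), 95% CI, non-inferiority P value, and NNT."""
    p_a, p_b = events_a / n_a, events_b / n_b
    rd = p_a - p_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.975)
    ci = (rd - z * se, rd + z * se)
    # H0: risk in group b exceeds risk in group a by the margin or more (rd <= -margin)
    p_noninferiority = 1 - NormalDist().cdf((rd + margin) / se)
    nnt = 1 / abs(rd) if rd != 0 else float("inf")
    return rd, ci, p_noninferiority, nnt

# Group a: induction; group b: expectant management (counts from the text)
rd, ci, p_ni, nnt = risk_difference_noninferiority(15, 900, 28, 901)
```

With these inputs the risk difference is about −1.4 percentage points, the NNT rounds to 69, and the one sided P value for non-inferiority is about 0.22, so the hypothesis that expectant management is worse by the full margin cannot be rejected.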
The per protocol analysis showed a 1.6% risk of an adverse perinatal outcome (10/611) in the induction group compared with 2.9% (19/647) in the expectant management group (risk difference −1.3%, 95% CI −3.0% to 0.4%, P=0.21 for non-inferiority; see supplementary appendix).
Additional analysis of the composite primary outcome including Apgar score <4 at five minutes instead of <7 resulted in 0.4% (4/900) adverse perinatal outcomes in the induction group and 1.3% (12/901) in the expectant management group (absolute risk difference −0.9%, −1.9% to 0.2%; NNT 113, 57 to 4624, P=0.02 for non-inferiority).
The additional per protocol analysis of the composite primary outcome including Apgar score <4 at five minutes showed a 0.5% risk (3/611) of an adverse perinatal outcome in the induction group versus 1.2% (8/647) in the expectant management group (risk difference −0.7%, −2.0% to 0.5%, P=0.02 for non-inferiority; see supplementary appendix).
Three perinatal deaths (stillbirths) occurred: one in the induction group and two in the expectant management group. There were no neonatal deaths. The stillbirth in the induction group was in a 30 year old multiparous woman who was randomised at 40 weeks+5 days and scheduled for induction at 41 weeks+1 day. She had reduced fetal movements at 40 weeks+6 days, and fetal death was diagnosed at consultation. She delivered a neonate weighing 3595 g (20th to 50th centiles). Investigations, including a postmortem examination, did not explain the stillbirth. In the expectant management group, stillbirth was diagnosed in a 36 year old nulliparous woman at 41 weeks+3 days, when she was admitted to hospital in labour. She delivered a neonate weighing 2945 g (5th to 10th centiles). Investigations, including placental examination, did not explain the stillbirth, and the parents declined a postmortem examination. The second stillbirth in the expectant management group was diagnosed in a 32 year old multiparous woman at 41 weeks+4 days during a regular consultation in secondary care for impending post-term pregnancy. She delivered a neonate weighing 3715 g (20th to 50th centiles). No postmortem examination was performed, but the placenta showed signs of chorioamnionitis.
The main contributor to the composite adverse outcome was an Apgar score <7 at five minutes: 1.2% (11/900) of neonates in the induction group and 2.6% (23/901) in the expectant management group (RR 0.48, 95% CI 0.23 to 0.98). Three of these neonates, all in the expectant management group, had an Apgar score <4 at five minutes. The first neonate was born at 41 weeks+6 days after spontaneous onset of labour and an operative vaginal delivery (vacuum), because of fetal distress and failure to progress in second stage. The diagnosis was meconium aspiration syndrome, and the neonate was admitted to the NICU (table 4). Sepsis after spontaneous onset of labour at 40 weeks+6 days and rupture of membranes of more than 24 hours was diagnosed in the second neonate. The third neonate, weighing 4320 g, was born after cervical ripening that started at 41 weeks+6 days and failure to progress of second stage followed by a caesarean section at 42 weeks+2 days. The diagnosis in this neonate was airway obstruction caused by vernix caseosa. Both these neonates were admitted to a medium care unit for observation. All three neonates recovered without complications. Admission to an NICU was reported in 0.3% (3/899) of neonates in the induction group versus 0.9% (8/899) in the expectant management group (RR 0.38, 95% CI 0.10 to 1.41). Of the 11 children admitted to the NICU, six (three in each group) had a diagnosis of severe congenital disorder. Meconium aspiration syndrome was diagnosed twice, but only in the expectant management group, and both neonates recovered fully. No plexus brachialis lesions and no intracranial haemorrhage were diagnosed in the study population. In two admissions because of a (suspected) infection, one neonate had group B streptococcus and the other had a negative culture result. One neonate was admitted because of a pneumothorax.
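The relative risks and confidence intervals quoted above are consistent with the usual log scale (Katz) interval for a ratio of two proportions. The sketch below assumes that method, which the text does not name, and uses an illustrative function name.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def relative_risk(events_a, n_a, events_b, n_b, level=0.95):
    """Relative risk (a over b) with a confidence interval computed on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rr, exp(log(rr) - z * se_log), exp(log(rr) + z * se_log)

# Apgar score <7 at five minutes: induction 11/900 v expectant management 23/901
rr_apgar, lower_apgar, upper_apgar = relative_risk(11, 900, 23, 901)

# NICU admission: induction 3/899 v expectant management 8/899
rr_nicu, lower_nicu, upper_nicu = relative_risk(3, 899, 8, 899)
```

For the Apgar comparison this gives a relative risk of about 0.48 with limits near 0.23 and 0.98. For NICU admission the interval runs from about 0.10 to 1.41 and therefore includes 1. The formula requires at least one event in each group, so it cannot be applied to outcomes with zero events in one arm.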
Arterial pH measurements were not recorded systematically and therefore could not be included in the analysis. Imputing was not possible owing to many missing data (62.0% induction v 70.0% expectant management). However, when we analysed data including the available pH measurements, the composite adverse perinatal outcome was 27/900 (3.0%) in the induction group versus 37/901 (4.1%) in the expectant management group (risk difference −1.11%, 95% CI −2.84% to 0.63%, P=0.16 for non-inferiority). For the per protocol analysis, the composite adverse perinatal outcome including the available pH measurement was 3.1% (19/611) in the induction group versus 4.0% (26/647) in the expectant management group (risk difference −0.91%, −2.98 to 0.01%, P=0.15 for non-inferiority).
When stratifying by parity, we observed 2.4% (11/457) nulliparous women with a composite adverse perinatal outcome in the induction group and 4.1% (21/511) in the expectant management group (RR 0.59, 95% CI 0.29 to 1.20). In multiparous women the incidence of adverse perinatal outcome was lower in both groups compared with nulliparous women: 0.9% (4/443) in the induction group and 1.8% (7/390) in the expectant management group (RR 0.50, 95% CI 0.15 to 1.71). In logistic regression analysis, no interaction was found between parity and induction or expectant management.
Table 3 shows the secondary perinatal outcomes in the intention-to-treat groups. No difference was found in medium care admissions, 6.6% and 6.7% (induction 59/899 v expectant management 60/899). Small for gestational age (<10th centile), according to Dutch birthweight centiles, was similar between the groups: 6.8% (61/900) in the induction group versus 6.9% (62/901) in the expectant management group. Overall, 9.6% (86/900) of infants in the induction group were large for gestational age (>90th centile) versus 11.0% (99/901) in the expectant management group. The incidence of congenital abnormalities was similar between groups: 1.8% in the induction group (16/900) versus 2.1% in the expectant management group (19/901).
Table 2 summarises the characteristics of labour and mode of delivery. Oxytocin was given significantly more often in the induction group than in the expectant management group (59.2% (533/900) and 39.4% (355/901) (RR 1.50, 95% CI 1.36 to 1.66)). Meconium stained amniotic fluid occurred significantly less often in the induction group than in the expectant management group (16.3% (147/900) and 22.8% (205/901) (RR 0.72, 95% CI 0.59 to 0.87)). Ninety seven women in each group (10.8%) had a caesarean section (RR 1.00, 95% CI 0.77 to 1.31), mainly for non-progressive labour at the first stage of labour (table 2).
Table 5 shows the results of adverse maternal outcomes in the intention-to-treat groups. The composite adverse maternal outcome occurred in 13.6% (122/900) of the women in the induction group versus 11.3% (102/901) in the expectant management group (RR 1.20, 95% CI 0.94 to 1.53). Postpartum haemorrhage ≥1000 mL was the main contributor to the composite adverse maternal outcome and occurred in 9.1% (82/900) of women in the induction group versus 8.0% (72/901) in the expectant management group (RR 1.14, 95% CI 0.84 to 1.54). Manual removal of the placenta occurred in 5.1% (41/803) in the induction group versus 4.1% (33/804) in the expectant management group (RR 1.24, 95% CI 0.79 to 1.95). Obstetrical anal sphincter injuries were diagnosed in 3.5% (28/803) of women in the induction group versus 3.9% (31/804) in the expectant management group (RR 0.90, 95% CI 0.55 to 1.49). Three mothers (0.3%) in the induction group and two (0.2%) in the expectant management group were admitted to an ICU post partum (RR 1.50, 95% CI 0.25 to 8.97), all after postpartum haemorrhage. Blood loss in these women was 3000 mL, 5100 mL, and 7000 mL in the induction group and 3390 mL and 5000 mL in the expectant management group. No maternal deaths occurred. During labour, 29.4% (265/900) of the women in the induction group received epidural anaesthesia compared with 25.6% (231/901) in the expectant management group (RR 1.15, 95% CI 0.99 to 1.33).
This randomised controlled trial compared the effect of induction of labour at 41 weeks with expectant management until 42 weeks with subsequent induction if necessary on perinatal and maternal outcomes in women with an uncomplicated pregnancy. A policy of induction resulted in a median reduction in gestational age at delivery of two days. We found a 1.4% difference in composite adverse perinatal outcome favouring induction, although the absolute risk of severe adverse perinatal outcome (perinatal mortality, Apgar score <4 at five minutes, admission to a neonatal intensive care unit (NICU) without severe congenital anomalies) was low in both groups.
Most of our primary composite outcomes can be attributed solely to the component Apgar score <7 at five minutes (73.3% (11/15) in the induction group v 64.3% (18/28) in the expectant management group), which means that these neonates did not have any other adverse outcome besides the Apgar score being <7 at five minutes. We performed a post hoc analysis of the composite outcome including Apgar scores <4 instead of <7 at five minutes owing to the American College of Obstetricians and Gynecologists/American Academy of Pediatrics (ACOG/AAP) mid-trial change in recommended cut-off value for Apgar score at five minutes indicating a non-specific sign of illness. A considerably lower incidence of adverse perinatal outcome was found in both groups (0.4% induction and 1.3% expectant management), with an absolute risk difference of −0.9% (95% confidence interval −1.9% to 0.2%) favouring induction, showing non-inferiority of expectant management with respect to the predefined margin of 2% (P=0.02 for non-inferiority).
Comparison with other studies
The incidence of perinatal death in our study was one after induction compared with two after expectant management. The corresponding risk ratio for perinatal death (RR 0.50, 95% CI 0.05 to 5.51) is comparable with that of the four studies (n=998) starting induction at 41 weeks (n=501) versus expectant management with varying upper limits of gestational age (n=497) included in a Cochrane systematic review (RR 0.33, 95% CI 0.03 to 3.17).192133
Congenital anomaly accounted for a substantial part of the NICU admissions in our trial, although it was an exclusion criterion at study entry. It is unknown if the outcome for these children would have been better if they had been born earlier, although it is unlikely (table 4). For these reasons we also analysed the primary composite outcome using an Apgar score <4 at five minutes and NICU admission without severe congenital anomalies. With these adapted adverse outcomes (perinatal mortality, Apgar score <4 instead of <7 at five minutes, and NICU admission without severe congenital anomalies), the absolute risk of the composite adverse perinatal outcome was substantially lower in both groups, with a still significant difference in favour of induction (0.1% (1/897)) versus expectant management (1.0% (9/898)): absolute risk difference −0.9%, 95% CI −1.6% to −0.2%; P=0.01 for non-inferiority; P=0.02 for Fisher’s exact test; and NNT of 112 (95% CI 63 to 491).
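The relation between the absolute risk difference and the number needed to treat (NNT) in the adapted analysis can be sketched as follows. This is a minimal illustration assuming a simple Wald interval for the difference of two proportions; the helper name `risk_difference_nnt` is hypothetical, and the trial's own method is not described in this section.

```python
import math

def risk_difference_nnt(events_a, n_a, events_b, n_b, z=1.96):
    """Absolute risk difference (A minus B), its Wald interval, and the NNT.

    Hypothetical helper for illustration only.
    """
    p_a, p_b = events_a / n_a, events_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    lo, hi = diff - z * se, diff + z * se
    nnt = 1 / abs(diff)
    # An NNT interval is only well defined when the interval for the difference excludes zero
    if lo < 0 < hi:
        nnt_ci = None
    else:
        nnt_ci = tuple(sorted((1 / abs(lo), 1 / abs(hi))))
    return diff, (lo, hi), nnt, nnt_ci

# Adapted composite outcome: 1/897 (induction) v 9/898 (expectant management)
diff, (lo, hi), nnt, nnt_ci = risk_difference_nnt(1, 897, 9, 898)
print(f"risk difference {diff:.1%} ({lo:.1%} to {hi:.1%})")
print(f"NNT {nnt:.0f} ({nnt_ci[0]:.0f} to {nnt_ci[1]:.0f})")
```

With these counts the sketch gives an absolute risk difference of about −0.9% with both interval bounds below zero, and an NNT of about 112 (63 to 491), in line with the reported NNT and its interval.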
Since in our trial all women in the 41 week induction group received obstetrician led intrapartum secondary care whereas in the expectant management group until 42 weeks 68.7% of the women received midwifery led primary care at start of labour and 34.3% at time of birth, it could be suggested that our study is prone to performance bias (different care) and measurement bias (different assessment of neonates). Several studies, however, showed that Apgar scoring does not differ significantly between midwives and obstetricians.343536 Furthermore, in our trial all neonates in the expectant management group with an adverse outcome were born in secondary care—the women had started labour in secondary care or were referred from primary to secondary care during labour. Various studies have shown that it is safe for low risk women in the Netherlands to deliver in midwifery led care, and the level of care does not seem to influence delivery outcome for these women.2425262728 Although this study could be considered as a comparison between obstetrician led care with labour induction and midwife led care with a policy of expectant management, we cannot adjudicate whether the difference in the composite adverse perinatal outcome is due to the level of care (performance bias) or to a possible difference in Apgar scoring (measurement bias). We do not, however, expect bias to be a major factor.
In our study, meconium aspiration syndrome occurred in two neonates in the expectant management group. In a randomised controlled trial with a comparable time frame, Gelisen et al reported meconium aspiration syndrome in 16/600 neonates of whom 12/300 were in the expectant management group. We found a 10 and 20 times lower rate of meconium aspiration syndrome in the induction and expectant management groups, respectively (0.0% and 0.2% versus 1.3% and 4%), compared with the study by Gelisen et al. Since these authors did not specify how meconium aspiration syndrome was defined, the difference in magnitude could be attributed to a difference in definition. Despite this, Gelisen et al found no difference in NICU admissions (4.3% induction v 5.0% expectant management), which would be expected to be associated with meconium aspiration syndrome. We found a lower rate of NICU admissions compared with the Cochrane systematic review on induction of labour at more than 41 weeks: 0.3% induction and 0.9% expectant management (INDEX trial) v 11% induction and 12% expectant management (systematic review). The systematic review lacked details on NICU admission, such as diagnosis, potential association with gestational age, or presence of congenital anomalies, which hampers a clear comparison.1633
We did not find differences in caesarean section or operative vaginal delivery rates, which is consistent with other large studies on induction of labour.3738 In the only study that compared the same timeframes as our study, the risk ratio for caesarean section was comparable for both groups, although the absolute risk was twice as high compared with that of our study. This could be due to other inclusion criteria (Gelisen et al, Bishop score <5) or differences in policy during labour, as reflected by differences in national overall caesarean rates in Turkey (53% v 16% in the Netherlands).3539 The Cochrane systematic review concluded that induction at or beyond 41 weeks is associated with lower caesarean section rates. The largest contribution to this outcome was from a randomised controlled trial in which women in the control group were induced only with oxytocin according to study protocol, whereas prostaglandin use was allowed in women with low Bishop scores in the induction group.16 Two other systematic reviews including the same trial concluded that the difference in caesarean section rate is possibly due to the influence of this study with incomparable study arms.172040 Population based cohort studies showed conflicting results on the effect of induction on caesarean section rates.414243 In the recently published ARRIVE (A Randomized Trial of Induction Versus Expectant Management) trial, low risk nulliparous women were randomised in the 39th week of pregnancy to be induced at 39 weeks+0 days to 39 weeks+4 days or to expectant management until 41 weeks. No statistically significant difference was found in perinatal outcome (RR 0.80, 95% CI 0.64 to 1.00), although fewer caesarean sections took place in the induction group (18.6% v 22.2%; RR 0.84, 95% CI 0.76 to 0.93). 
Our study comprised more white women (86% v 44%), with a higher median age (30 v 24 years) and a lower percentage of women with a body mass index ≥30 (12% v 52%), whereas 46% of the participants in our study were multiparous women and ARRIVE included only nulliparous women. Caesarean section rates in our nulliparous low risk women were comparable between the groups: 18.6% in the induction group and 18.0% in the expectant group. This could be due to the differences in gestational age, baseline characteristics, indication for induction of labour, or indication for a caesarean section (suspected fetal distress or failure to progress).44
Our trial had some notable results besides those for the main outcomes. Around 85% of participating women were of white ethnicity. The risk of perinatal mortality beyond term has been shown to be higher in women of South Asian, African, and Mediterranean origin compared with white women.45 In our study, we were not able to assess the effect of induction in women of non-white ethnicity owing to the low number of women of other ethnic origin. Also, we were unable to assess the effect of age on adverse perinatal outcome because of the low number of participating older mothers (>35 years).
As in other studies on pregnancies at or beyond 41 weeks, most women in our study had an unfavourable cervix, with a Bishop score of <6 at randomisation. Although induction was planned one or two days after randomisation, 28.9% of the women in the induction group had a spontaneous onset of labour before induction started, compared with 73.7% in the expectant management group. Despite women with suspected or established intrauterine growth restriction being ineligible for inclusion in the study, the birthweight for 7% of the children was less than the 10th Dutch centile (61/900 induction and 62/901 expectant management), confirming the difficulty in diagnosing growth restricted babies at term. In the induction group, 2/61 infants had a birthweight less than the 10th centile and an adverse perinatal outcome: one neonate, weighing 3100 g (<10th centile), had an Apgar score of 6 at five minutes after operative vaginal delivery by forceps because of fetal distress. The other neonate, weighing 2595 g (<2.3rd centile), had an Apgar score of 6 at five minutes after caesarean section because of fetal distress, with an umbilical cord pH of 6.87, possibly due to hypotension of the mother after epidural analgesia for pain relief or multiple entanglement of the umbilical cord. In the expectant management group, 3/62 infants weighed less than the 10th centile at birth and had an adverse perinatal outcome: one (birthweight 2945 g) was a stillbirth, one (2980 g) was admitted to the NICU because of a pneumothorax, and one (3040 g) had an Apgar score of 6 at five minutes that was attributed to pethidine use in the mother.
Strengths and limitations of this study
A major strength of our study is that it concerns a nationwide multicentre randomised controlled trial of a well defined obstetrical population at low risk; the largest trial to date to compare induction of labour at 41 weeks with expectant management until 42 weeks.46 No cases were lost to follow-up.
In the Netherlands, expectant management until 42 weeks is the standard of care in the low risk obstetrical population at 41-42 weeks according to the Dutch Obstetrical Indication List, although there is wide variation in practice because of women’s and caregivers’ preferences, which complicated inclusion.15 Not all eligible women were invited, and not all women who were asked participated, because of a preference for induction or expectant management. Despite this selective participation, our trial offers the best possible representation of pregnant women reaching 41 weeks+0 days in the Netherlands.
We are aware of some potential limitations of our trial. We chose to use a composite adverse perinatal outcome instead of a single outcome like perinatal mortality. We considered any major adverse perinatal outcome in an otherwise uncomplicated pregnancy as undesirable. It is debatable if all the included adverse perinatal outcomes in our composite outcome measure are relevant to identify real severe adverse perinatal outcome with an effect on an infant’s short term or long term health status. However, if we included an Apgar score of <4 instead of <7 at five minutes, according to the ACOG/AAP criteria, and excluded severe congenital abnormalities, induction of labour resulted in a statistically significant risk reduction of 0.9%, although with a substantially lower incidence of the composite adverse outcome in both groups.
We chose the non-inferiority design because we did not expect the Dutch standard policy of expectant management in our low risk obstetrical population to be inferior to a policy of induction of labour, and we considered expectant management acceptable or preferable if it led to comparable outcomes.47 It is good practice to use a per protocol analysis in non-inferiority trials, as an intention-to-treat analysis carries a risk of falsely rejecting the null hypothesis of inferiority. Because we did not reject the null hypothesis and do not conclude non-inferiority, we presented the intention-to-treat analyses first, since such analyses are more common in reports of clinical trials. We also reported the per protocol analysis of the primary outcome (see supplementary appendix for the other per protocol analyses).
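The logic of the non-inferiority test can be illustrated with a one-sided Wald test against the predefined 2% margin. This is a sketch under stated assumptions: it uses the event counts for the primary composite outcome given earlier (15 in the induction group and 28 in the expectant management group) with the intention-to-treat denominators of 900 and 901, a normal approximation, and a hypothetical helper name `noninferiority_p`; the trial's actual test procedure is not described in this section.

```python
import math

def noninferiority_p(events_new, n_new, events_ref, n_ref, margin):
    """One-sided Wald test of H0: risk_new - risk_ref >= margin (inferiority).

    Hypothetical helper for illustration only.
    """
    p_new, p_ref = events_new / n_new, events_ref / n_ref
    diff = p_new - p_ref
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    z = (diff - margin) / se
    # Lower-tail standard normal probability via the error function
    p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return diff, z, p_value

# Expectant management (28/901) tested against induction (15/900), margin 2%
diff, z, p_value = noninferiority_p(28, 901, 15, 900, margin=0.02)
print(f"difference {diff:.1%}, z {z:.2f}, one-sided P {p_value:.2f}")
```

Under these assumptions the excess risk with expectant management is about 1.4% and the one-sided P value is well above 0.05, so the null hypothesis of inferiority is not rejected, which is consistent with the statement that non-inferiority was not concluded.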
We did not stratify randomisation by parity, because we expected a balanced allocation in both groups owing to the large study population. However, this resulted in an imbalance between groups: 50.8% of women in the induction group were nulliparous compared with 56.7% in the expectant management group. After stratifying by parity in an additional analysis, we observed similar results. A higher incidence of the composite adverse perinatal outcome was seen in nulliparous women in both the induction group (nulliparous 2.4% v multiparous 0.9%) and the expectant management group (nulliparous 4.1% v multiparous 1.8%), which is in concordance with other studies.48 Furthermore, we saw no interaction between parity and induction of labour or expectant management in logistic regression analysis.
The measurement of arterial pH is not possible in primary care, and pH measurement is not standard policy for uncomplicated birth in most hospitals in the Netherlands. Because of the high number of missing pH measurements (60-70%) and the impossibility of imputation, we could not include umbilical artery pH in the composite outcome, which could have led to selection bias. Including the available data on umbilical arterial pH in the analyses, however, did not alter the results.
The results of our study can be interpreted in different ways, which might have implications for standard practice. If the composite outcome is interpreted straightforwardly, there is a small benefit of induction at 41 weeks that could justify standard induction at 41 weeks. It could be argued, however, that a change of policy to earlier induction, concerning roughly one fifth of all women with a singleton pregnancy, is too rigorous in light of the relatively low incidence of perinatal mortality, gestational age associated NICU admission, and Apgar score <4 at five minutes as indicator for encephalopathy. This could justify expectant management if women want to avoid induction. On both sides of the spectrum, caregivers are challenged to provide neutral, evidence based counselling of low risk women in late term pregnancy on the pros and cons of induction. In a recent report by Walsh et al, women felt they were not offered a real choice when it came to management of their prolonged pregnancy, and this is confirmed by other studies; induction of labour is often presented as an inevitable next step to women, without information provided on alternative management strategies.495051
Conclusions and policy implications
Our large trial compared induction of labour at 41 weeks with expectant management until 42 weeks and subsequent induction if necessary. Substantially larger trials are needed to evaluate differences in rare outcomes, such as perinatal mortality and NICU admission. A systematic review or individual participant meta-analysis on the comparison between 41 weeks and 42 weeks could then be performed including findings from those studies as well as those of our own study. Future research could also focus on long term adverse perinatal outcome of both strategies, although this requires long term follow-up of children.46 In addition, a more tailored approach will need identification of women who could maintain pregnancy until 42 weeks or are at increased risk of adverse perinatal outcomes (eg, relational model).
The incidence of late term pregnancy varies between countries because of different management strategies.52 Women need to be counselled on the desired policy in late term pregnancy. In this trial, induction of labour at 41 weeks resulted in less overall adverse perinatal outcome than a policy of expectant management until 42 weeks, although the absolute risk of severe adverse outcome (perinatal mortality, NICU admission, Apgar score <4 at five minutes) was low in both groups. As with every intervention in the natural birth process, the decision to induce labour must be made with caution, as the expected benefits should outweigh possible adverse effects for both mother and child.53 The results of our study should be used to inform women approaching a gestational age of 41 weeks, so they can weigh the respective outcomes and decide whether to be induced at 41 weeks or to continue pregnancy until 42 weeks.
What is already known on this topic
A policy of labour induction at or beyond term compared with expectant management is associated with fewer perinatal deaths and fewer caesarean sections, but more operative vaginal births (Cochrane review)
Aggregated results of trials need to be interpreted with caution because of trial heterogeneity caused by different outcome measures, protocols, and time frames of comparison
Evidence is lacking for the recommendation to induce labour at 41 weeks instead of 42 weeks for the improvement of perinatal outcome
What this study adds
Induction of labour at 41 weeks resulted in less overall adverse perinatal outcome than a policy of expectant management until 42 weeks, although the absolute risk of severe adverse outcome (perinatal mortality, NICU admission, Apgar score <4 at five minutes) was low in both groups
We thank the women who participated in this study; the midwives, gynaecologists, residents, and nurses of the participating centres; and those who participated in the setup and execution of the trial.
Contributors: JKJK, AB, and JCK are joint first authors and contributed equally to the study. EdM and BWM initiated this study. EdM and JP supervised this study. JKJK, AB, and JCK wrote the first and subsequent drafts of the paper. RD conducted the statistical analyses and takes responsibility for the integrity of the data and accuracy of the data analyses. PB advised on statistical issues and interpretation of the results. AK is the neonatologist who reviewed all anonymised NICU admissions on case level with JCK and EdM. All authors have approved the final version of this manuscript submitted for publication. JKJK, AB, JCK, JD, JP, and EdM are guarantors. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.
Funding: This study was supported by a grant from the Netherlands Organisation for Health Research and Development ZonMw (grant No 171202008). This funding source had no role in study design, data collection, data analysis, data interpretation, writing of the report, or the decision to submit the paper for publication. The authors had full access to all the data in the study and had final responsibility for the decision to submit for publication.
Competing interests: All authors have completed the ICMJE uniform disclosure form (available on request from the corresponding author) and declare: BWM is supported by a National Health and Medical Research Council practitioner fellowship (GNT1082548) and reports consultancy for ObsEva, Merck, and Guerbet; no support from any other organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years, no other relationships or activities that could appear to have influenced the submitted work.
Ethical approval: This trial was approved by the ethics committee of the Academic Medical Centre, Amsterdam (No NL38455.018.11). The board of directors of each of the participating centres approved local execution of the study.
Data sharing: The full dataset is available from the corresponding author at firstname.lastname@example.org on reasonable request.
Transparency: The corresponding author (EdM) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported, no important aspects of the study have been omitted, and any discrepancies from the study as planned have been explained.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.