Accuracy of cervicovaginal fetal fibronectin test in predicting risk of spontaneous preterm birth: systematic review
BMJ 2002; 325 doi: http://dx.doi.org/10.1136/bmj.325.7359.301 (Published 10 August 2002) Cite this as: BMJ 2002;325:301
- Honest Honest, research fellowa,
- Lucas M Bachmann, research fellowb,
- Janesh K Gupta, senior lecturera,
- Jos Kleijnen, professorc,
- Khalid S Khan, consultanta
- aAcademic Department of Obstetrics and Gynaecology, Birmingham Women's Hospital, Birmingham B15 2TG
- bHorten Centre, University of Zurich, Bolleystrasse 40, CH-8091, Zurich, Switzerland
- cNHS Centre for Reviews and Dissemination, University of York, York YO10 5DD
- Correspondence to: H Honest
- Accepted 13 March 2002
Objective: To determine the accuracy with which a cervicovaginal fetal fibronectin test predicts spontaneous preterm birth in women with or without symptoms of preterm labour.
Design: Systematic quantitative review of studies of test accuracy.
Data sources: Medline, Embase, PASCAL, Biosis, Cochrane Library, Medion, National Research Register, SCISEARCH, conference papers, manual searching of bibliographies of known primary and review articles, and contact with experts and manufacturer.
Study selection: Two reviewers independently selected and extracted data on study characteristics, quality, and accuracy.
Data extraction: Accuracy data were used to form 2×2 contingency tables with spontaneous preterm birth before 34 and 37 weeks' gestation and birth within 7-10 days of testing (for symptomatic pregnant women) as reference standards. Data were pooled to produce summary receiver operating characteristic curves and summary likelihood ratios for positive and negative test results.
Data synthesis: 64 primary articles were identified, consisting of 28 studies in asymptomatic women and 40 in symptomatic women, with a total of 26 876 women. Among asymptomatic women the best summary likelihood ratio for positive results was 4.01 (95% confidence interval 2.93 to 5.49) for predicting birth before 34 weeks' gestation, with corresponding summary likelihood ratio for negative results of 0.78 (0.72 to 0.84). Among symptomatic women the best summary likelihood ratio for positive results was 5.42 (4.36 to 6.74) for predicting birth within 7-10 days of testing, with corresponding ratio for negative results of 0.25 (0.20 to 0.31).
Conclusion: Cervicovaginal fetal fibronectin test is most accurate in predicting spontaneous preterm birth within 7-10 days of testing among women with symptoms of threatened preterm birth before advanced cervical dilatation.
What is already known on this topic
Spontaneous preterm birth is a major cause of neonatal morbidity and mortality
If spontaneous preterm birth can be predicted, effective therapeutic strategies can be used to improve neonatal outcomes
Though the cervicovaginal fetal fibronectin test has been proposed as a predictive test, estimates of its accuracy are variable
What this study adds
The cervicovaginal fetal fibronectin test is most accurate in predicting spontaneous preterm birth within 7-10 days of testing among women with symptoms of threatened preterm birth before advanced cervical dilatation
After a positive test result 17 symptomatic women at 31 weeks' gestation would need to be treated with antenatal steroids to prevent one case of respiratory distress syndrome
Spontaneous preterm birth occurs in 7-11% of pregnancies before 37 weeks' gestation 1 2 and in 3-4% of pregnancies before 34 weeks' gestation.3 Most neonatal deaths of normally formed infants occur when they are born before 34 weeks' gestation. Many of the surviving preterm infants, especially those from the earlier gestations, suffer serious morbidity such as bronchopulmonary dysplasia, intraventricular haemorrhage, retrolental fibroplasia, neurodevelopmental problems, and cognitive difficulties. 4 5 Advances in perinatal health care have not altered the incidence of spontaneous preterm birth, but there is effective management to reduce the associated complications. For example, the landmark Cochrane review showed that antenatal steroids significantly reduced morbidity and mortality.6 Timely institution of such treatment in clinical practice depends on accurate prediction of spontaneous preterm birth.
Many tests have been purported to predict spontaneous preterm birth including cervicovaginal fetal fibronectin testing. Fetal fibronectin is a glycoprotein found in amniotic fluid, placental tissue, and the extracellular substance of the decidua basalis next to the placental intervillous space. It is thought to be released through mechanical or inflammatory mediated damage to the membranes or placenta before birth.7 Swabs can be taken from the ectocervix or posterior vaginal fornix, and an enzyme linked immunosorbent assay (ELISA) containing FDC-6 monoclonal antibody can be used to detect fetal fibronectin.7 The results may indicate the likelihood of spontaneous preterm birth.8 In clinical use, however, factors such as contamination of the sample with maternal blood,9 sampling within 24 hours after intercourse,10 and pre-eclampsia11 may reduce the accuracy of the test and give false positive results.
If the test could be used to identify those women who, though asymptomatic, may be at high risk, antenatal care may be optimised (for example, by instituting closer antenatal surveillance) with a view to maintaining the pregnancy past 34 weeks' gestation, which is now an established milestone in perinatal outcome. 4 5 On the other hand, if the test could predict imminent birth among women with symptoms of threatened spontaneous preterm birth but before advanced cervical dilatation, then antenatal steroids, tocolytics, and in utero transfer (to optimise neonatal care) may be used accordingly. Antenatal steroids are most effective in the two to seven days after they are given,6 and tocolytics can delay birth for at least two days. Therefore, among symptomatic women we are mostly interested in predicting the likelihood of spontaneous preterm birth occurring within 7-10 days after the test because this knowledge is likely to influence subsequent management.
Many primary studies claim that the cervicovaginal fetal fibronectin test can accurately predict spontaneous preterm birth in a clinical setting. However, these studies have not generally had enough participants to provide precise estimates of accuracy. In addition, existing systematic reviews have been restricted to a few databases,12–15 their study selection has often been limited by language, 12 13 15 and often they have not assessed study quality.12–14 These factors are known to introduce potential for bias.16 We conducted a comprehensive and rigorous systematic review to obtain reliable estimates of accuracy. We defined asymptomatic women as those without uterine tightenings or contractions and symptomatic women as those with uterine tightenings or contractions and cervical dilatation of <2-3 cm.
Identification of studies
Our electronic searches targeted all diagnostic procedures among studies on prediction of spontaneous preterm birth. We searched general bibliographic databases: Medline (1966-2000), Embase (1980-2000), PASCAL (1973-2001), and BIOSIS (1969-2001). We also searched specialist computer databases: the Cochrane Library (2000:4), MEDION (1974-2000) (a database of diagnostic test reviews set up by Dutch and Belgian researchers), National Research Register (2000:4), SCISEARCH (1974-2001), and conference papers (1973-2000). Our electronic search strategy is described in detail elsewhere.19 We contacted individual experts and the manufacturer of the fetal fibronectin test to uncover grey literature. We also checked reference lists of known reviews and primary articles to identify cited articles not captured by electronic searches.
Study selection and data extraction procedures
Our selection criteria were studies in asymptomatic or symptomatic pregnant women, cervicovaginal fetal fibronectin testing before 37 weeks' gestation, known gestation at spontaneous birth, and observational cohort design. Studies were selected in a two stage process. Two of us (HH and LMB) independently scrutinised the electronic searches and obtained full manuscripts of all citations that were likely to meet the predefined selection criteria. Final inclusion or exclusion decisions were then made after we examined these manuscripts. In cases of duplicate publication we selected the most recent and complete versions. We had no language restrictions, but we excluded case-control studies. Two of us (HH and LMB) independently assessed English, French, and Spanish manuscripts. LMB assessed German manuscripts, while manuscripts in other languages were assessed by people with sufficient command of the language to allow data extraction. We resolved any disagreements about inclusion or exclusion by consensus or arbitration by a third reviewer (KSK).
We extracted study characteristics, quality, and accuracy of results from each selected article. Study characteristics consisted of women's risk classifications, test characteristics, and reference standards of the test. In studies where multiple tests were performed, we considered any positive result as a positive result overall. Accuracy data were used to construct 2×2 tables of test results and spontaneous preterm birth, which served as the reference standard. We extracted data for asymptomatic and symptomatic women on spontaneous preterm birth before 34 and 37 weeks' gestation. In addition, for symptomatic women we extracted data on spontaneous preterm birth within 7-10 days of testing. We piloted and tested the data extraction form for repeatability on the first eight manuscripts.20–27 Overall, the observer agreement regarding the various components of the data extraction form was 90-100%, with κ values ranging from 0.9 to 1.0.
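The accuracy measures derived from each 2×2 table can be sketched as follows. This is a minimal illustration using the standard definitions of sensitivity, specificity, likelihood ratios, and diagnostic odds ratio; the class name `TwoByTwo` and the cell counts are hypothetical and are not taken from any study in the review.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """One study's 2x2 table: test result against spontaneous preterm birth."""
    tp: int  # test positive, preterm birth occurred
    fp: int  # test positive, no preterm birth
    fn: int  # test negative, preterm birth occurred
    tn: int  # test negative, no preterm birth

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def lr_positive(self) -> float:
        # How much more likely a positive result is in women who deliver preterm
        return self.sensitivity() / (1 - self.specificity())

    def lr_negative(self) -> float:
        # How much more likely a negative result is in women who deliver preterm
        return (1 - self.sensitivity()) / self.specificity()

    def diagnostic_odds_ratio(self) -> float:
        return (self.tp * self.tn) / (self.fp * self.fn)


# Hypothetical counts, for illustration only
table = TwoByTwo(tp=30, fp=20, fn=10, tn=140)
print(table.lr_positive(), table.lr_negative(), table.diagnostic_odds_ratio())
```

A table with a zero cell would make some of these ratios undefined; a continuity correction is one common workaround, but the sketch leaves that choice out.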
Assessment of study quality
We assessed all manuscripts that met the selection criteria for study quality. We defined quality as the confidence that the study design, conduct, and analysis minimised bias in the estimation of test accuracy. Bias can be associated with case-control study designs, lack of blinding of carer to test results, non-consecutive patient enrolment, non-prospective data collection, inadequate test description, use of different reference tests, partial verification, and lack of description of either the population or the reference test.28 The last four items, however, are not relevant to our review because they refer to delivery of neonates (preterm or term births). Therefore, we considered a study to be of good quality if it used a prospective design, consecutive enrolment, adequate test description (to allow replication by others), and blinding of the test result from clinicians managing the patients.29
We synthesised data separately for studies on asymptomatic and symptomatic women with spontaneous preterm birth before 34 and 37 weeks' gestation. For symptomatic women we also synthesised data for spontaneous preterm birth within 7-10 days of testing. We assessed heterogeneity of diagnostic odds ratios graphically (using forest30 and Galbraith plots31) and statistically (using χ2 test) to help us to decide how to proceed with quantitative synthesis.32 For each outcome within the two populations there was significant heterogeneity. We explored possible sources of heterogeneity by meta-regression analysis16 using various independent explanatory variables defined a priori. These variables were risk classifications (high or low as defined by the authors), multiple gestation (included or excluded), type of recruitment (consecutive or others), digital examination before testing (yes or no), sexual intercourse within 24 hours preceding testing (yes or no), bleeding before testing (yes or no), methods of testing (laboratory or bedside), serial testing (yes or no), gestation at testing for asymptomatic women (before or after 24 weeks), blinding of test results (yes or no), study design (prospective or retrospective), and publication language (English or other). When a variable was not explicitly mentioned, it was treated as “no” in the meta-regression analysis. As our meta-regression analysis failed to explain the observed heterogeneity, we proceeded with meta-analysis using a random effects model.33 Consequently, the pooled results should be interpreted with caution. To aid interpretation we examined the estimates of accuracy of the highest quality studies included in our review.
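The heterogeneity test and random effects pooling described above can be illustrated with a short sketch. It assumes the DerSimonian-Laird estimator and a 0.5 continuity correction, neither of which is confirmed by the text as the method actually used; the function names are hypothetical.

```python
import math


def log_dor_and_variance(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its approximate variance for one study.

    Adds 0.5 to every cell (an assumed continuity correction) so that
    tables with a zero cell remain usable.
    """
    tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    log_dor = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return log_dor, variance


def pool_random_effects(effects, variances):
    """Pool study effects with the DerSimonian-Laird random effects model.

    Returns (pooled effect, standard error, Cochran's Q, degrees of freedom,
    between-study variance tau^2). Q is compared against a chi-squared
    distribution with df degrees of freedom to test for heterogeneity.
    """
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random effects weights add tau^2 to each study's own variance,
    # which widens the confidence interval when studies disagree.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, se, q, df, tau2
```

When Q is no larger than its degrees of freedom, tau² is set to zero and the result coincides with fixed effect pooling.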
We used summary receiver operating characteristic (ROC) curves34 as measures of accuracy for all included studies regardless of their thresholds. The area under the curve provides an average measure of accuracy from the combined studies (especially when there are different test thresholds) and a convenient way of comparing accuracy of the test for different outcomes.35 We used summary likelihood ratios as measures of accuracy for studies using 50 ng/ml as the threshold. These ratios indicate by how much a given test result will raise or lower the probability36 of having a spontaneous preterm birth. Using summary ratios we determined probabilities after the test by Bayes's theorem as follows36:
post-test probability = (likelihood ratio × pretest probability) / [1 − pretest probability × (1 − likelihood ratio)].
In this way, ratios are more clinically meaningful than sensitivities or specificities, for which meta-analyses are generally not recommended.37 To detect publication and related bias, we undertook funnel plot (diagnostic odds ratio v reciprocal of its standard error) analysis.38 All statistical analyses were performed with SPSS version 10 and Stata 7.0 statistical packages.
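The conversion from pretest to post-test probability by Bayes's theorem can be sketched as a small function. The likelihood ratios in the usage lines are the summary values reported for symptomatic women (5.42 and 0.25); the pretest probability of 5% is a hypothetical value chosen only for illustration and is not taken from the review.

```python
def post_test_probability(pretest: float, likelihood_ratio: float) -> float:
    """Apply Bayes's theorem in the form given in the text.

    post-test = LR * pretest / [1 - pretest * (1 - LR)]
    """
    if not 0 <= pretest < 1:
        raise ValueError("pretest probability must be in [0, 1)")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    return (likelihood_ratio * pretest) / (1 - pretest * (1 - likelihood_ratio))


# Hypothetical pretest probability; the likelihood ratios are the summary values
pretest = 0.05
after_positive = post_test_probability(pretest, 5.42)
after_negative = post_test_probability(pretest, 0.25)
print(f"after positive test: {after_positive:.3f}")
print(f"after negative test: {after_negative:.3f}")
```

The same result is obtained by converting the pretest probability to odds, multiplying by the likelihood ratio, and converting back to a probability.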
Literature identification and study quality
Figure 1 summarises the process of literature identification and selection. Sixty four primary articles met the selection criteria. (The references we excluded from analysis can be found on webextra.) They consisted of 28 accuracy studies in asymptomatic women and 40 studies in symptomatic women, with a total of 26 876 women. The webextra table summarises each study's salient features according to whether the women were asymptomatic or symptomatic and their risk classifications. Figure 2 summarises the quality of methods. Thirteen (19%) studies, seven among asymptomatic39–45 and six among symptomatic women,46–51 fulfilled all four criteria for good quality. All studies except three52–54 (which accounted for 0.28% of the 22 390 women in our review) used thresholds of 50 ng/ml to indicate an abnormal test result.8
Fibronectin test in asymptomatic women
In women without symptoms three studies examined the accuracy of the test using bedside methods and 26 used laboratory methods. Thirteen studies examined single testing and 16 looked at serial testing. Eight studies examined the use of fibronectin as a screening tool in low risk pregnancy and nine as a selective screening tool in high risk pregnancy. Most studies were carried out during the second trimester or early in the third trimester. Meta-regression analysis showed the accuracy of the test did not depend on the method of testing, how often the test was done, classification of risk, or gestation at testing.
The estimates of the accuracy of the test in predicting spontaneous preterm birth for the various gestations of interest varied considerably. Figure 3 shows the summary receiver operating characteristic curve for asymptomatic women. Figures 4 and 5 show individual study results used to create the summary curve. Figure 6 shows the pooled estimates of likelihood ratios. Figures 7 and 8 show details from individual studies.
When we examined study quality as a source of heterogeneity we found no significant differences in estimates of accuracy in studies with high and low quality features. The estimates of accuracy of studies that fulfilled all four of the quality criteria were generally consistent with the pooled results. For example, the median likelihood ratios for predicting spontaneous preterm birth before 34 weeks' gestation among the five highest quality studies were 3.99 (interquartile range 1.73-10.18) for a positive result and 0.38 (0.10-0.69) for a negative result.
Fibronectin test in symptomatic women
In women with symptoms 11 studies examined the accuracy of the test using bedside methods and 30 used laboratory methods. Thirty five examined single occasion testing, and five looked at serial testing. Meta-regression analysis showed that the accuracy of the test did not depend on the method of testing, how often the test was done, or classification of risk. As for asymptomatic women, the accuracy of the test in predicting spontaneous preterm birth for the various gestations of interest varied considerably. Figure 9 shows the summary receiver operating characteristic curve for symptomatic women. Figures 10, 11, and 12 give details of individual results used to create the summary curve. The pooled estimate of the likelihood ratios can be found in figure 6, with details of individual studies in figures 13, 14, and 15.
When we examined study quality as a source of heterogeneity we found no significant differences in estimates of accuracy in studies with high and low quality features. The estimates of accuracy of studies that fulfilled all four of the quality criteria were generally consistent with the pooled results. For example, the median likelihood ratios for predicting spontaneous preterm birth within 7-10 days of testing among the four highest quality studies were 6.16 (4.53-7.33) for a positive result and 0.32 (0.01-0.45) for a negative result.
Funnel plot analysis showed no evidence of asymmetry that would indicate presence of publication or related bias for the main outcomes.
Our results show that the accuracy of the cervicovaginal fetal fibronectin test in predicting various spontaneous preterm birth outcomes varies. The test is most accurate in predicting spontaneous preterm birth within 7-10 days after testing among women with symptoms of threatened preterm birth before advanced cervical dilatation.
Quality of our review
The strength of our inferences depends on the rigour of our methods. In contrast with the previous four systematic reviews12–15 we identified 64 studies (at least twice as many studies as the largest previous review14) because we did not limit our search to a single database 13 15 nor did we apply language restrictions.13 Because meta-analyses of studies that examine test accuracy are fraught with difficulty owing to poor methodological quality of the primary studies, we scrutinised the selected studies for their quality, an assessment undertaken in only one previous review.15 Methodological issues that may overestimate accuracy, such as case-control design, absence of test descriptions, and different reference tests,28 were not applicable to the studies we reviewed. Our assessments of quality were affected by poor reporting in some instances, though quality did not significantly explain differences between their results. Assessment of heterogeneity and exploration of the reasons behind it were planned a priori. In the presence of unexplained heterogeneity we pooled data with a random effects model, which produces a wider confidence interval.16 However, owing to the large number of studies, the estimates of accuracy were generally more precise than in previous reviews.
The clinical impact of the estimates of accuracy that we have produced depends on how the resultant changes in probabilities due to fibronectin testing alter therapeutic effectiveness in decision making.55 We can illustrate this impact with an example of decision making about the use of antenatal steroids in women with symptoms of threatened preterm birth at 31 weeks' gestation (table).6 The absolute effect of antenatal steroids depends on the risk of spontaneous preterm birth after presentation. The higher the risk, the lower the number of women that need to be treated to prevent one case of respiratory distress syndrome and vice versa. The risk, and hence the therapeutic benefit, depends not only on the gestational age at presentation but also on the post-test probabilities of spontaneous preterm birth associated with fibronectin testing. As shown in the table, if steroids were to be used for all symptomatic women at this gestation without fibronectin testing then we would need to treat 109 women with antenatal steroids to prevent one case of respiratory distress syndrome. If we treated only those women with a positive test result we would need to treat 17, a figure considerably lower than that without testing.
This approach will allow clinicians to make explicit decisions on the basis of more realistic probabilities generated by fibronectin testing and provides a framework for the use of diagnostic evidence in therapeutic decision making. Specifically, our results enable clinicians to take a more rational approach to decision making regarding inpatient admission, administration of antenatal steroids, and in utero transfer in women with threatened spontaneous preterm birth. Future research should focus on undertaking high quality primary studies of test accuracy to improve our ability to predict spontaneous preterm birth.
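The inverse relation between the probability of preterm birth and the number needed to treat can be sketched as follows. This is a simplified illustration, not the calculation behind the table: it assumes that steroids benefit only infants who are actually born preterm, and every input value (probabilities of birth, risk of respiratory distress syndrome, relative risk) is hypothetical. It is not intended to reproduce the figures of 109 and 17.

```python
def number_needed_to_treat(p_birth: float,
                           rds_risk_untreated: float,
                           relative_risk: float) -> float:
    """Number needed to treat = 1 / absolute risk reduction.

    Assumed model: absolute risk reduction is the probability of preterm
    birth times the untreated risk of respiratory distress syndrome in
    infants born preterm times the relative risk reduction (1 - RR).
    """
    for name, value in (("p_birth", p_birth),
                        ("rds_risk_untreated", rds_risk_untreated),
                        ("relative_risk", relative_risk)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    arr = p_birth * rds_risk_untreated * (1 - relative_risk)
    if arr <= 0:
        raise ValueError("no absolute risk reduction; NNT is undefined")
    return 1 / arr


# Hypothetical inputs: a low probability of birth without testing and a
# higher post-test probability after a positive result
nnt_without_test = number_needed_to_treat(0.05, 0.5, 0.6)
nnt_after_positive = number_needed_to_treat(0.25, 0.5, 0.6)
print(round(nnt_without_test), round(nnt_after_positive))
```

Under this model, multiplying the probability of birth by a given factor divides the number needed to treat by the same factor, which is why a positive test result lowers it.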
We thank Fujian Song, Malgorzata Adamcyzck, and Pavlina Jungova for their help in extracting relevant data from Chinese, Polish, and Czech manuscripts, respectively. We also thank Julie Glanville and Stephen Duffy at the NHS Centre for Reviews and Dissemination at York for contribution to the database searches. We are grateful to Professor R Zimmermann, Professor M J Whittle, and Mr H Gee for their critical review of the manuscript and for suggestions for improvement.
Contributors: KSK, JKG, and JK conceived the review. HH, LMB, and KSK collected, analysed, and interpreted the data and drafted the manuscript. JK and JKG made critical revisions. KSK, LMB, and HH are the guarantors.
Funding: WellBeing grant No K2/00.
Competing interests: None declared.