- Nerys F Woolacott, research fellow1,
- Milo A Puhan, research fellow2,
- Johann Steurer, director2,
- Jos Kleijnen, director1
- 1 Centre for Reviews and Dissemination, University of York, York YO10 5DD
- 2 Horten Centre, University of Zurich, Zurich, Switzerland
- Correspondence to: N F Woolacott
- Accepted 6 April 2005
Objective To assess the accuracy and effectiveness of the screening of all newborn infants for developmental dysplasia of the hip (DDH) using ultrasound imaging, as is standard practice in some European countries but not in the United Kingdom, the United States, or Scandinavia.
Design Systematic review.
Data sources Twenty three medical, economic, and grey literature databases (to March 2004), with no limitations of design or language; some references were provided by experts.
Selection of studies Only diagnostic accuracy studies and comparative studies conducted in an unselected newborn population were eligible for the review. Two reviewers independently selected the studies and performed the quality assessment.
Results The review identified one diagnostic accuracy study, and this was of limited quality. In this study the reference standard was treatment up to age of 8 months or an abnormal ultrasound finding at age 8 months. Ultrasound screening had a sensitivity of 88.5% (95% confidence interval 84.1% to 92.1%), specificity of 96.7% (96.4% to 97.4%), a positive predictive value of 61.6% and a negative predictive value of 99.4%. Ten studies evaluated the impact of ultrasound in screening, but these too had various methodological weaknesses, limiting the reliability of their findings. Compared with clinical screening, general ultrasound screening in newborns may increase overall treatment rates, but ultrasound screening seems to be associated with shorter and less intrusive treatment.
Conclusions Clear evidence is lacking either for or against general ultrasound screening of newborn infants for DDH. Studies that investigate the natural course of the disorder, the optimal treatment for DDH, and the best strategy for ultrasound screening are needed.
The term developmental dysplasia of the hip (DDH) refers to an abnormal relation between the femoral head and the acetabulum. At birth the femoral head and the acetabulum are mainly cartilaginous, and a normal adult hip joint depends on their correct development. During the newborn period unstable hips are common, but most of these develop normally.1 If subluxation or dislocation persists, anatomic changes develop, and eventually the correct positioning of the femoral head within the acetabulum (reduction) can be achieved only with surgery. Early detection of DDH can enable less invasive and potentially more effective corrective procedures.
Various screening strategies are available for early detection and treatment of DDH. Clinical screening of newborns includes ascertainment of the medical history (family history, pregnancy) and a clinical examination using Ortolani and Barlow manoeuvres. With ultrasound screening, an imaging technique developed by, in particular, Graf,2 Harcke,3 and Terjesen,4 5 cartilage can be visualised, and this allows detection of abnormal positioning of the femoral head within the acetabulum, instability, and dysplasia at a very young age. The timing of the ultrasound screening is an ongoing focus of debate6: some argue that all newborns should be screened within the first week of life,7 whereas others favour screening after two or three months because at an earlier age most hips with abnormal ultrasound findings subsequently develop normally.8 Early non-invasive interventions in newborns or infants suspected of being at risk of DDH after clinical or ultrasound screening include broad diapering, splinting, overhead extensions, or the Pavlik harness.9 10 However, evidence on the effectiveness of these interventions is scarce.11
Some believe that DDH detected on ultrasonography should be treated very early or should be followed up intensively. The assumption of proponents of ultrasound screening is that untreated cases will have an adverse outcome,7 whereas others believe that the risk of overtreatment is considerable and that the cost-benefit equation for ultrasound screening is not favourable enough.10 12 Consequently, the screening of all newborn infants at birth for DDH using ultrasound imaging is standard practice in some European countries, such as Germany and Switzerland, but has not been accepted in the United Kingdom, the United States, or Scandinavia.13 14 Therefore, we conducted a systematic review to determine the diagnostic accuracy of ultrasonography for detecting DDH in an unselected population of newborns and to assess the impact of ultrasound screening of newborn infants.
Literature search and study selection
The literature search using the terms “ultrasonography”, “hip dysplasia”, and “new-born” (with their synonyms and closely related words) involved a range of 23 medical, economic, and grey literature databases including Medline, Embase, Biosis, Science Citation Index, the Cochrane controlled trials register, plus five websites. All searches were last updated in March 2004. The searches were not limited by study design or by language. We identified further studies by examining the reference lists of all included articles. In addition, some literature was provided by the Swiss Federal Office for Social Security (which commissioned this review) and by individual experts. The full list of sources and the search strategy are available from the authors.
Two reviewers (NFW, MAP) independently appraised each reference according to the inclusion and exclusion criteria. Any disagreements were resolved by consensus. Studies eligible for inclusion were diagnostic accuracy studies in an unselected newborn population or studies comparing an ultrasound screening regimen with another screening strategy that reported on outcomes such as overall treatment rates, rates of operative intervention, rates of abduction splinting, rate of delayed diagnosis, time to treatment, duration of treatment, rate of treatment complications, false diagnostic labelling, and any long term functional outcomes (such as osteoarthritis). To avoid any spectrum bias that may arise from the selection of participants15 we aimed to review only studies of an unselected population of newborns, rather than infants with suspected or frank DDH or notable risk factors for DDH.
Data extraction and analysis
We extracted data on to predesigned forms. All relevant data were extracted by one reviewer (NFW) and independently checked for accuracy by a second reviewer (MAP). We did not have a general policy of contacting authors for study details because the time allowed by the commissioning body was limited. We did, however, request specific data for two trials where the total for the screened population was required,16 17 but these data were unavailable. Diagnostic accuracy studies were assessed for quality using the QUADAS checklist.18 For studies evaluating the impact of ultrasound screening on therapeutic decisions or patient outcomes, or on both, we created a checklist, which related to very general issues of study quality; this was done by combining the main elements of the checklists for cohort and randomised controlled studies given in a report by the NHS Centre for Reviews and Dissemination.19 Two reviewers independently assessed the quality of included studies and agreed on quality scoring in consensus. The included studies were combined in a narrative synthesis and treatment differences calculated (mean differences or absolute risk differences) with 95% confidence intervals. Findings were not pooled statistically because of the diversity of study designs, ultrasound techniques, and therapeutic management.
The search strategy generated 787 references. We selected 188 studies for full text assessment, of which 10 met the inclusion criteria. Of the excluded studies, about three quarters had not been conducted in a general (unselected) population of newborn infants, and about a quarter included unselected newborns but had no control group.
We identified one study that evaluated the diagnostic accuracy of ultrasound (table 1).20 The index test was ultrasonography at the age of 1, 2, and 3 months, and the reference standard was defined by the decision to treat or by an abnormal ultrasound finding at the age of 8 months. The quality of the study (see table A on bmj.com) was limited because the reference test might not have correctly classified patients and was not independent of the index test. Because the reference test was applied at the end of follow-up and therefore encompassed a decision to treat at any age, the condition in some treated infants might have resolved spontaneously without treatment; such cases represent overtreatment. The calculated sensitivity of ultrasonography was 88.5% (95% confidence interval 84.1% to 92.1%), the specificity 96.7% (96.4% to 97.4%), the positive likelihood ratio 29.1, the negative likelihood ratio 0.12, the positive predictive value 61.6%, and the negative predictive value 99.4%.
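The accuracy measures quoted above all derive from a single 2×2 table of index test result against reference standard. The sketch below shows the standard definitions only. The cell counts are hypothetical, chosen for illustration, and are not the data of the included study, so the printed values are not expected to match the published figures. The names `TwoByTwo` and `accuracy_measures` are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of index test against reference standard."""
    tp: int  # test positive, reference positive
    fp: int  # test positive, reference negative
    fn: int  # test negative, reference positive
    tn: int  # test negative, reference negative


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return the usual diagnostic accuracy measures as proportions."""
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Likelihood ratios depend only on sensitivity and specificity.
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        # Predictive values also depend on prevalence in the screened group.
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
    }


# Hypothetical counts (assumption): 100 affected and 900 unaffected infants.
example = TwoByTwo(tp=90, fp=30, fn=10, tn=870)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With a low prevalence, as in a screening population, even a high specificity leaves a sizeable share of false positives among positive tests, which is why the positive predictive value can be modest while the negative predictive value stays close to 100%.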
Impact of ultrasound screening
We identified two randomised controlled trials (RCTs)21 22 and eight non-randomised studies comparing ultrasound screening of newborns with another screening regimen (table 1). One of these studies was the diagnostic accuracy study described earlier.20 The ultrasonography was done with Graf's basic technique in six studies,17 20 23–26 with a modified technique after Terjesen27 in three,21 22 28 and with a modified technique after Harcke29 in one study.16 The level of experience of the examiners could not be compared between the studies because experience was described in only two studies.16 21 The overall quality of the included studies was limited. Even the two RCTs21 22 were of limited quality: one was found to have an allocation to treatment that was not truly random,21 and in neither RCT were assessors blind to screening group. The main biases inherent in the studies are summarised in table 1 (further details of the quality assessment are in table B on bmj.com). The main findings of the studies are given in table 2.
Both RCTs21 22 and all but one of the other five studies that reported overall treatment rate17 20 24–26 found an increase associated with general ultrasound screening. However, ultrasound screening was associated with a reduction in surgical procedures or inpatient treatment for the correction of DDH.16 17 20 23
Duration of treatment
Two studies reported effects on treatment duration. One, conducted in Poland, used broad diapering, splinting, and, where necessary, overhead extensions as treatment and reported a reduction in treatment duration from 11.6 (standard deviation 6.5) months to 7.8 (3.7) months after the introduction of ultrasonography.24 The other study, conducted in Jordan, involved treatment with the Pavlik harness; it found that ultrasound screening at birth was associated with a shorter mean treatment duration (1.16 months) than screening at age 3-4 months (mean treatment duration 2.9 months).25
Rate of developmental dysplasia of the hip diagnosed late
Three studies defined “late” diagnosis as diagnosis after age 1 month.21 22 28 In two of these studies the rate of late diagnosed DDH after clinical screening plus ultrasonography was compared with that seen with clinical screening alone, with prevalences per 1000 of 1.4 (95% confidence interval 0.18 to 3.39) versus 2.6 (1.0 to 4.19),21 and 0.7 (0 to 1.41) versus 2.6 (1.8 to 3.39).28 Two of the studies (both RCTs) compared general ultrasound screening with clinical screening plus selective ultrasound screening and reported higher rates with selective screening, but in neither study was the difference significant.21 22 The differences between studies may be explained partly by the small absolute number of cases from which the rates are calculated, but they may also be a reflection of an increasing level of expertise with ultrasound imaging over time (the study with the lowest rates being the most recent study).
In the study by Roovers et al, in which “late” was defined as at or after age 8 months, the number of cases of DDH missed by the two screening programmes (that is, those identified only at the reference test) was 17 (0.8%) with clinical screening compared with 31 (0.6%) with ultrasound screening20; this difference was not significant (-0.2%; -0.75% to 0.17%).
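An absolute risk difference with its 95% confidence interval, as reported for the comparison above, can be sketched as follows. This is a minimal illustration using the simple Wald (normal approximation) interval, which may not be the method used in the review. The event counts 17 and 31 are taken from the text, but the group sizes of 2000 and 5000 are hypothetical placeholders, so the output is not expected to reproduce the published interval. The function name `risk_difference` is illustrative.

```python
import math


def risk_difference(events_a, n_a, events_b, n_b, z=1.96):
    """Risk in group B minus risk in group A, with a Wald confidence interval.

    Returns (difference, lower limit, upper limit) as proportions.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_b - p_a
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Group A: clinical screening; group B: ultrasound screening.
# Denominators are assumptions for illustration only.
diff, lower, upper = risk_difference(17, 2000, 31, 5000)
print(f"difference {diff:.2%} (95% CI {lower:.2%} to {upper:.2%})")
```

An interval that spans zero, as here, corresponds to a difference that is not significant at the 5% level. With so few events, an exact or score based interval may be preferable to the Wald approximation.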
Our systematic review identified three important findings. Firstly, there is insufficient evidence for the diagnostic accuracy of ultrasound imaging as a screening tool. Secondly, ultrasound screening is likely to increase overall treatment rates, which could represent overtreatment. Finally, duration and intrusiveness of interventions are likely to be lowered with ultrasound screening.
Major methodological shortcomings of the available studies, however, limit these findings. The one diagnostic accuracy study that was performed in an unselected population of newborns provided only limited information. The reference standard was flawed because it ignored the fact that early detected DDH is known to resolve spontaneously in many cases.1 Therefore, many of the “true” cases of DDH identified in this study may have been cases of overtreatment, so the accuracy may have been overestimated. The study by Malkawi et al hinted that an initial screen at 4 months might prevent this happening, but the quality of that study was limited and the results may not be reliable.25
The objective of screening for DDH is to prevent it being diagnosed late, when treatment is more invasive and can be less successful. The two best designed and reported studies (that is, the RCTs21 22) did report this as an outcome measure, but, unfortunately, both had short follow-up periods and defined a late detected case as one detected after age 1 month. As a basis for assessing the relative benefits of screening programmes this end point presumes that it is essential to detect and treat as many cases of DDH as possible within the first month of life. However, the clinical validity of this outcome is debatable as DDH identified at 1 month is often not true disease.30 When late was defined as at or after age 8 months,20 there was no significant difference between the proportion of cases that were detected late with clinical screening compared with ultrasound screening.
Data from RCTs indicate that ultrasound screening that is started in the first few days of life is associated with an increased rate of treatment compared with clinical screening, and the most recent observational study by Roovers et al indicates that ultrasound screening started at age 1 month is also associated with an increased rate of treatment but achieved with a greatly reduced referral rate.20 Studies do suggest that the number and severity of surgical procedures for the correction of hip dysplasia are reduced under a regimen of general ultrasound screening.16 17 20 23 The importance of overall treatment rate as an outcome measure is debatable. Increased treatment rates can be taken as an indication that fewer cases of DDH are missed. They can also be interpreted, however, as a measure of overtreatment. Clearly the reduction in surgical procedures associated with ultrasound screening seems to be an important benefit, but the risk-benefit ratio of an increase in less invasive forms of treatment has not yet been clearly established.
The use of historical controls in many studies reviewed here means that the effects of ultrasonography cannot be differentiated from the effect of changing treatment practice. Also, in most of the studies of screening programmes treatment outcome was not reported. Our review was not of studies of the effectiveness of treatment for DDH, but it is acknowledged that the evidence base is not strong.11 Generally, abduction therapy (for example, use of the Pavlik harness) is considered to be an effective and benign intervention. However, a systematic review of English language observational studies reported that 20% to 100% of infants who had had abduction therapy eventually required surgery.10 Recently published surveillance data collected over five years in Germany showed that although the incidence of first operative procedures for DDH was low (at 0.26 per 1000 live births), 55% of children having a first operative procedure had been detected by the early ultrasound screening programme31; these children therefore represent a degree of failure of the available conservative treatment. This experience is reflected in that reported in a UK study, which found that all children with abnormal hip radiographs at age 2 years had started treatment before the age of 8 weeks and that overall 12% of all children treated with abduction splinting before the age of 8 weeks subsequently required surgery.11 These data would suggest some publication bias in observational studies of ultrasound screening in which the reported success rates of treatment are much higher.32
Our review has been unable to provide information on the adverse effects of general ultrasound screening—either of the treatment or of the screening programme as a whole. Of the 10 studies we identified, none properly assessed adverse events. This is an important omission as avascular necrosis has been reported in 1-4% of all treated infants.10 Pressure sores, epiphysitis, femoral nerve palsy, inferior dislocation of the hip, and medial instability of the knee joint have also been reported,10 and potential psychological problems must be considered.33 34
Our review has confirmed the conclusions reached by the Canadian Task Force10 and the American Academy of Pediatrics12 that ultrasound screening cannot yet be recommended. To date, a huge body of literature describes ultrasound imaging as a useful and accurate diagnostic tool for DDH, but it fails to provide clear evidence either for or against its use in the general screening of newborn infants. A recently published decision model acknowledges the lack of evidence to support universal screening for DDH in newborns.35 This decision model—which used prevalence estimates based on historical data and treatment rates derived from observational studies—predicted that compared with clinical screening or selective use of ultrasound imaging, universal ultrasound screening would achieve the highest number of favourable outcomes and the lowest occurrence of avascular necrosis. Another decision model36 considered three different ultrasound screening strategies: general screening at age 1, 2, or 3 months; general screening at 1 and 3 months; and selective screening at 1 month. These were compared with clinical screening at 1 month (as currently practised in the Netherlands), and general screening at 3 months was found to perform best.
What is already known on this topic
Ultrasound imaging has become an accepted tool for accurately diagnosing developmental dysplasia of the hip (DDH) and for monitoring the development and treatment of the condition
Debate continues over whether DDH that is detected by ultrasonography is necessarily clinically relevant
Ultrasound screening at birth for DDH in all newborn infants is standard practice in some European countries but not in the United Kingdom, the United States, or Scandinavia
What this study adds
The diagnostic accuracy of ultrasound imaging for DDH in the screening population has not been investigated adequately
Evidence is insufficient to support or reject general ultrasound screening of newborns for DDH
Studies that investigate the natural course of the disorder, the optimal treatment for DDH, and the best strategy for ultrasound screening are needed
Good quality trials to establish the optimum treatment and management for DDH are needed. A randomised controlled trial incorporating optimum treatment and management and comparing general ultrasound screening at 1 month and at 3 months is warranted. In the meantime, the current status of the evidence for the general screening of newborn infants for DDH provides us with a good example of how early acceptance of an intervention or technology can inhibit or even preclude good quality research, resulting in long term if not permanent uncertainty.
Two further tables of data are on bmj.com
Contributors All authors conceived and designed the study. Kate Misso of the Centre for Reviews and Dissemination designed and did the searches of electronic databases. NFW and MAP collected the data. All authors analysed and interpreted the data. NFW and MAP drafted the manuscript, and all authors revised it. JK and JS obtained the funding. JK is the guarantor.
Funding This study was funded by Bundesamt für Sozialversicherung (BSV, the Swiss Federal Office for Social Security). The initial proposal for the study was initiated by BSV, and BSV received the full study report on which this manuscript is based. BSV has seen a draft of this manuscript but has made no contribution to its content.
Competing interests None declared.
Ethical approval Not needed.