The effectiveness of glucocorticoids in treating croup: meta-analysis
BMJ 1999; 319 doi: http://dx.doi.org/10.1136/bmj.319.7210.595 (Published 04 September 1999) Cite this as: BMJ 1999;319:595
- Monica Ausejo, visiting fellow (a),
- Antonio Saenz, visiting fellow (a),
- Ba' Pham, statistician (a),
- James D Kellner, assistant professor (b),
- David W Johnson, assistant professor (b),
- David Moher, director (a),
- Terry P Klassen, director, children's hospital (a)
- (a) Thomas C Chalmers Center for Systematic Reviews, Children's Hospital of Eastern Ontario Research Institute, 401 Smyth Road, Ottawa, Ontario K1H 8L1, Canada
- (b) Department of Pediatrics, University of Calgary, Alberta Children's Hospital, 1820 Richmond Road, Calgary, Alberta T2T 5C7, Canada
- Correspondence to: TP Klassen Department of Pediatrics, University of Alberta, 2C3.67 Walter C Mackenzie Health Sciences Center, Edmonton, Alberta T6G 2R7
- Accepted 2 June 1999
Objective: To determine the effectiveness of glucocorticoid treatment in children with croup.
Design: Meta-analysis of randomised controlled trials that examine the effectiveness of glucocorticoid treatment in children with croup.
Main outcome measures: Score on scale measuring severity of croup, use of cointerventions (adrenaline (epinephrine), antibiotics, or supplemental glucocorticoids), length of stay in accident and emergency or in hospital, and rate of hospitalisation.
Results: Twenty four studies met the inclusion criteria. Glucocorticoid treatment was associated with an improvement in the croup severity score at 6 hours with an effect size of −1.0 (95% confidence interval −1.5 to −0.6) and at 12 hours −1.0 (−1.6 to −0.4); at 24 hours this improvement was no longer significant (−1.0, −2.0 to 0.1). There was a decrease in the number of adrenaline treatments needed in children treated with glucocorticoids: a decrease of 9% (95% confidence interval 2% to 16%) among those treated with budesonide and of 12% (4% to 20%) among those treated with dexamethasone. There was also a decrease in the length of time spent in accident and emergency (−11 hours, 95% confidence interval −18 to −4 hours), and for inpatients hospital stay was reduced by 16 hours (−31 to −1 hour). Publication bias seems to play a part in these results.
Conclusions: Dexamethasone and budesonide are effective in relieving the symptoms of croup as early as 6 hours after treatment. Fewer cointerventions are used and the length of time spent in hospital is decreased in patients treated with glucocorticoids.
- Most trials evaluating the treatment of croup are of high methodological quality and hence have a low risk of bias.
- Publication bias, however, seems to be a problem, making the results of this meta-analysis somewhat less certain.
- Glucocorticoids seem to bring about clinical improvement in children with croup within 6 hours.
- Nebulised budesonide or dexamethasone, given either orally or intramuscularly, is effective in treating croup.
- The use of glucocorticoids is associated with a lower rate of use of cointerventions and shortens the time spent in hospital.
Croup (laryngotracheobronchitis) is a common cause of upper airway obstruction in children and is characterised by hoarseness, a barking cough, and inspiratory stridor. These symptoms are thought to occur as a result of oedema of the larynx and trachea which has been triggered by a recent viral infection. Parainfluenza virus type 1 is the agent most commonly identified in cases of croup.1
Although croup is a self limiting illness, it is a large burden on healthcare systems because of the frequent visits made to doctors and accident and emergency departments and, when necessary, hospitalisations. The annual incidence of croup in children younger than 6 years ranges from 1.5% to 6%.2 Admission rates for croup in children seen in outpatient settings range from 1.5% to 31% of cases seen; these figures vary widely, depending on hospital admission practices and the severity of the disease in the population being assessed.3 4
The standard management of croup includes mist treatment (that is, treatment with humidified air), although there is little evidence that this is effective.5 Racemic adrenaline (epinephrine), or L-adrenaline, has been shown to provide temporary relief to patients with croup but is not thought to have longer term benefits.6 Since the late 1980s it has been recognised that glucocorticoids provide some clinical benefit in children with croup. In 1989, Kairys et al published a meta-analysis of clinical trials examining the benefit of glucocorticoids.7 However, since then a number of randomised trials have been published, and there has been increasing interest in the use of glucocorticoids to treat outpatients with croup. The objective of this meta-analysis was to provide evidence to guide clinicians in their treatment of patients with croup, to examine the effectiveness of glucocorticoids in these patients, and to identify areas of uncertainty for future research.
We searched Medline from January 1966 to August 1997, exploding glucocorticoid treatment (and each of the terms for corticosteroids) and croup; we restricted the search to randomised controlled trials using a previously validated strategy (see appendix 1 on the BMJ's website). We searched Excerpta Medica and Embase from January 1974 to August 1997 (appendix 1). The Controlled Trials Register of the Cochrane Library was also searched; it includes studies identified by the Acute Respiratory Infection Review Group through the hand searching of key journals. We also sent letters to the authors of trials published in the past five years to enquire whether they knew of any other published or unpublished trials. Two researchers (TPK, MA) then selected studies as being potentially relevant based on a review of the titles and abstracts, if available. The complete text of these studies was then retrieved.
All studies that had been retrieved were reviewed independently by two reviewers (AS, TPK). To be eligible for inclusion in this review a study had to meet all of the following criteria: it had to have studied patients with croup; an intervention with glucocorticoid had to have been compared with either placebo or any other active treatment; clinically relevant outcome measures had to have been used, such as the clinical score, hospitalisation rate (in outpatient studies only), length of time in hospital, or additional interventions used; and patients had to have been randomly assigned to treatment groups. Studies written in any language were eligible for inclusion. The weighted κ score was used to measure interrater agreement. Differences over which studies should be included were resolved by consensus reached after discussion.
Once we identified studies as being relevant for review, they were masked by obscuring the authors' names and institutions, the location of the study, reference lists, and any other potential identifiers; this was done by an independent research assistant who was not involved with the abstraction of data. Data were extracted using a structured form that captured patient status (inpatient or outpatient), the intervention and its control, the name of drug, the route of administration, and the dose. Additionally, data were collected on the primary outcome measure; clinical croup score at baseline and at any subsequent assessment times; length of stay in hospital or accident and emergency in hours; whether the patient had improved (coded yes or no); and the use of additional interventions such as adrenaline, supplemental glucocorticoids, mist treatment, intubation, or antibiotic treatment. Data were extracted by one reviewer (MA) and checked for accuracy by a second reviewer (TPK).
Quality assessment of trials
We assessed quality using empirically derived items. We used the previously validated Jadad 5 point scale to assess randomisation (0-2 points), double blinding (0-2 points), and withdrawals and dropouts (0-1 point).8 For component assessment, concealment of allocation was described either as adequate, inadequate, or unclear.9 Sponsorship of studies was coded as either pharmaceutical company, other sources, or not mentioned.10 Two observers independently assessed quality (MA, JK), and interrater agreement was measured by the intraclass correlation.11 Differences were resolved by consensus.
All comparisons were performed between treatment and control groups thus preserving randomisation. The main outcome measure was the difference between treatment groups in the mean change from croup score at baseline. We derived the outcome measures from cross sectional summaries (for example, at baseline, 6 and 12 hours) in cases in which outcome measures were not reported directly. The variance of an effect size was derived from the common variance of a single croup score assuming a correlation of 0.5 between pre-treatment and post-treatment scores. Other variance imputations were performed according to the work of Follman et al.12 Variances of a single score were derived from the P values of the Mann-Whitney testw8 w9 w23 and from the measurement of confidence intervals.w4
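The role of the assumed correlation can be made concrete with a short sketch. If the pre-treatment and post-treatment scores share a common variance and correlate at 0.5, the variance of the change from baseline works out to the variance of a single score. The function name and the example numbers below are illustrative assumptions, not values taken from the included trials.

```python
def change_score_variance(sd_single: float, rho: float = 0.5) -> float:
    """Variance of (post - pre) when both scores have SD `sd_single`
    and the two measurements correlate at `rho`.

    Var(post - pre) = sd^2 + sd^2 - 2 * rho * sd^2 = 2 * sd^2 * (1 - rho)
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie between -1 and 1")
    return 2.0 * sd_single ** 2 * (1.0 - rho)


# Hypothetical single-score SD of 2 points on a croup scale.
# With rho = 0.5 the change-score variance equals the single-score variance.
variance_at_half = change_score_variance(2.0)        # 4.0
variance_uncorrelated = change_score_variance(2.0, 0.0)  # 8.0
```

A lower assumed correlation inflates the variance of the change score, so the choice of 0.5 directly affects the weight each trial receives.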
The croup score was reported inconsistently because of the different scales used in each study, hence trial effect sizes were used in the pooled estimates.13 A treatment effect divided by its measurement variation (for example, a pooled standard deviation) gives an effect size. To aid in the interpretation of pooled results reported by standardised effect size, we converted the effect size scale back to the croup score using a subset of trials in which such scores were available. Another way to express the croup score is by determining a clinically important change in the score in the individual patient and then calculating the proportion of patients who had significant improvement among the patients treated with glucocorticoids or placebo.
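As a minimal sketch of the standardisation described above, the following divides the between-group difference in mean change by a pooled standard deviation. The pooled standard deviation formula used here is the conventional two-group one and is an assumption about the exact computation; all input numbers are hypothetical.

```python
import math


def pooled_sd(sd_t: float, n_t: int, sd_c: float, n_c: int) -> float:
    """Conventional pooled standard deviation of two independent groups."""
    numerator = (n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2
    return math.sqrt(numerator / (n_t + n_c - 2))


def standardised_effect_size(mean_change_t: float, mean_change_c: float,
                             sd_t: float, n_t: int,
                             sd_c: float, n_c: int) -> float:
    """Difference in mean change from baseline, in pooled-SD units."""
    return (mean_change_t - mean_change_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


# Hypothetical trial: scores fall by 3 points on treatment and by 1 point
# on placebo, with an SD of 2 points in each arm of 20 children.
es = standardised_effect_size(-3.0, -1.0, 2.0, 20, 2.0, 20)  # -1.0
```

Because the result is unit free, trials that used different croup scales can be combined on the same footing.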
In addition to funnel plots, we used the rank correlation test14 and a graphical method15 for the detection of publication bias.16 Adjustment for publication bias in the pooled estimates was performed using the graphical method,15 a selection model approach,17 and the trim and fill method.18 We used more than one method since the relative merits of the methods are not well established. Tests of homogeneity were performed with the χ2 statistic for between study variation.13 For the analyses of croup scores and secondary outcomes, fixed effect models were used to combine treatment effects if there was no evidence of heterogeneity across studies; otherwise, the more conservative estimates from random effect models were reported. For binary data (such as improvement in signs and symptoms and the presence of various additional interventions), rate differences and the number needed to treat were derived. For the number needed to treat, we inverted the differences in the proportions improved and their 95% confidence intervals.
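The choice between fixed and random effect models can be illustrated with a short sketch of inverse variance pooling. The text does not name the random effects estimator, so the DerSimonian-Laird moment estimator used below is an assumption, and the effect sizes and variances are hypothetical.

```python
import math


def pool_effects(effects, variances):
    """Inverse variance pooling with a heterogeneity statistic.

    Returns the fixed effect estimate, the random effects estimate
    (DerSimonian-Laird, an assumed choice), their standard errors,
    the Q statistic, and its degrees of freedom.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Q is compared with a chi-squared distribution on k - 1 degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    random = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    return {
        "fixed": fixed,
        "se_fixed": math.sqrt(1.0 / sum_w),
        "random": random,
        "se_random": math.sqrt(1.0 / sum_w_re),
        "q": q,
        "df": df,
        "tau2": tau2,
    }


# Two hypothetical trials that disagree strongly with each other.
result = pool_effects([0.0, -2.0], [0.1, 0.1])
```

When the studies disagree, the between-study variance is positive and the random effects standard error is larger than the fixed effect one, which is why the random effects estimate is described as the more conservative.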
Heterogeneity between studies was explored using sensitivity and subgroup analyses performed on the primary outcome of the change in croup scores from baseline at 6 hours. Westley scores were the scores most commonly used in the trials.19 Westley scores use a 17 point scale to assess air entry (2 points), stridor (2 points), intercostal retractions (indrawing of the chest wall between the ribs on inspiration) (3 points), cyanosis (5 points), and level of consciousness (5 points). Treatment differences in Westley scores were calculated in place of effect sizes to provide an approximate conversion between the two scales. Differences between estimates derived from Westley and other scores were assessed.
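The composition of the 17 point scale can be sketched as a simple sum of component scores, using the component maxima given above. The dictionary keys and the range check are illustrative assumptions; the sketch does not reproduce which individual values the published scale permits for each item.

```python
# Component maxima as stated in the text; key names are hypothetical.
WESTLEY_MAX = {
    "air_entry": 2,
    "stridor": 2,
    "retractions": 3,
    "cyanosis": 5,
    "level_of_consciousness": 5,
}


def westley_total(components: dict) -> int:
    """Sum the component scores after checking each is within its range."""
    total = 0
    for name, maximum in WESTLEY_MAX.items():
        value = components[name]  # raises KeyError if a component is missing
        if not 0 <= value <= maximum:
            raise ValueError(f"{name} must be between 0 and {maximum}")
        total += value
    return total


# Hypothetical child with stridor and retractions but no cyanosis.
example_score = westley_total({
    "air_entry": 1,
    "stridor": 2,
    "retractions": 2,
    "cyanosis": 0,
    "level_of_consciousness": 0,
})  # 5
```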
A trial effect size was defined as the difference between the two treatments in the mean change from croup score at baseline. We derived effect sizes from cross sectional summaries (for example, baseline, 6, 12 hours) for trials not reporting effect sizes directly. The standardised effect size (that is, an effect size divided by the common standard deviation of the change from baseline) was used to combine trials reporting different versions of the croup score. Sensitivity analyses were based on the type and dose of glucocorticoid administered. The quality score of the included trials was incorporated into the pooled estimates using the method proposed by Moher.20 In addition, the impact of the concealment of treatment allocation on the pooled estimates was assessed.9
Study identification and characteristics
Ninety seven studies were identified as being potentially relevant and retrieved. Two of these studies were in press at the time of data extraction and have since been published.w11 w13 Forty four studies were excluded because they were reviews or commentaries, 12 did not study croup, nine had inadequate randomisation strategies, four were retrospective studies, two had no control group, one had no outcome of interest, and one was a duplication. Therefore, 24 studies were included (references and full details of these studies can be found in table A on the BMJ's website). The weighted κ score between two reviewers was 0.89, indicating substantial agreement.
Twenty two of the included studies had been published in English, one in French, and one in Spanish. Dexamethasone was evaluated in 17 trials, budesonide in nine, and methylprednisolone in three; some studies examined more than one drug. Five of the trials compared active treatments; 19 were placebo controlled. The mean age of the children in the different studies ranged from 13 months to 45 months; the minimum age was 4 months and the maximum was 12 years. Fourteen trials were conducted on inpatients, and 10 were conducted on outpatients. Studies tended to be small, with a median of 40 (interquartile range 36 to 60) participants. The pooled baseline rates reported below were derived using fixed effect models.
Quality assessment of trials
The intraclass correlation between two reviewers was 0.63 for the Jadad scale, 0.98 for allocation concealment, and 1.0 for sponsorship, indicating at least substantial agreement in all cases. The median Jadad score was 3 (interquartile range 2.75 to 4), or 60% (55% to 80%) of the maximum possible score for quality of reporting. Allocation concealment was adequate in 11 (46%) of the studies, inadequate in one (4%), and unclear in 12 (50%). Pharmaceutical sponsorship was identified in three (13%) studies, support was from other sources in three (13%), and sponsorship was not mentioned in 18 (75%). Overall, the quality of studies was better than has been observed for other diseases.9 20 21
The most frequently used outcome measure, reported in 13 studies, was the clinical croup score based on a 17 point ordinal scale developed by Westley.19 Other scoring systems, none of which have been validated, were used in five studies; in six studies no clinical score was reported.
The improvement in the Westley croup score at 6 hours was 2.8 (95% confidence interval 2.2 to 3.5) for dexamethasone or budesonide versus 1.0 (0.3 to 1.7) for placebo. The difference in improvement in the Westley score between treatment arms at 6 hours was 1.6 (1.1 to 2.2). The pooled standardised effect size was 1 (0.6 to 1.5) at 6 hours and 1 (0.4 to 1.6) at 12 hours. From our data, a standard effect size of 1.2 (0.7 to 1.7) corresponded with an improvement of 1.6 (1.1 to 2.2) in a Westley score (fig 1) (see appendix 2 on the BMJ's website for a list of included trials). This change was not significant at 24 hours; however, fewer patients were evaluated at 24 hours and hence the lack of significance may be a reflection of a lack of statistical power. The magnitude of change of −1 is similar to that seen at earlier evaluation points but the 95% confidence interval crosses 0. A decrease in effect size of 1 from baseline is thought to be a clinically important change.
At 6 hours, the difference between groups in the proportion of patients showing clinical improvement (the risk difference) was 15% (95% confidence interval 2% to 28%) with a number needed to treat of 7 (4 to 50). The baseline rate of clinical improvement was 41% (32% to 50%). At 12 hours the risk difference was 21% (9% to 33%) with a number needed to treat of 5 (3 to 11). The baseline rate of clinical improvement was 68% (58% to 77%). At 24 hours, the risk difference was 12% (3% to 22%) and the number needed to treat was 8 (5 to 33). The baseline rate of clinical improvement was 83% (75% to 91%). Although not all studies contributing to the effect size expressed their results as improved versus not improved, the degree of benefit of a number needed to treat of 5 to 7 patients (at different assessment times) would be sufficient to support the use of glucocorticoids over placebo.
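The relation between a risk difference and its number needed to treat can be sketched as a simple inversion of the point estimate and of each confidence limit. Rounding to the nearest whole patient is an assumption, chosen here because it reproduces the figures reported at 6 and 24 hours; the function name is hypothetical.

```python
def number_needed_to_treat(rd_percent: float,
                           ci_low_percent: float,
                           ci_high_percent: float) -> tuple:
    """Invert a risk difference (in percent) and its 95% confidence limits.

    Returns (nnt, nnt_lower, nnt_upper). The upper confidence limit of the
    risk difference gives the lower limit of the number needed to treat.
    """
    if ci_low_percent <= 0:
        # A confidence interval that reaches zero has no finite upper NNT.
        raise ValueError("lower confidence limit must be positive")
    return (
        round(100 / rd_percent),
        round(100 / ci_high_percent),
        round(100 / ci_low_percent),
    )


nnt_6_hours = number_needed_to_treat(15, 2, 28)    # (7, 4, 50)
nnt_24_hours = number_needed_to_treat(12, 3, 22)   # (8, 5, 33)
```

The wide upper limit at 6 hours (50 patients) follows directly from the lower confidence limit of the risk difference lying close to zero.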
There was no significant increase in the use of antibiotics among those treated with glucocorticoids as compared with those treated with placebo when expressed as the difference in risk. This was consistent for the dexamethasone group (4%, −20% to 27%) and the budesonide group (−2%, −17% to 13%). There was a significant decrease noted in the use of adrenaline in the glucocorticoid groups with a difference in risk of −9% (−16% to −2%) in the budesonide group (number needed to treat 10; baseline rate 16%) and −12% (−20% to −4%) in the dexamethasone group (number needed to treat 8; baseline rate 23%). There was no significant impact on the use of supplemental glucocorticoids among either those treated with dexamethasone (4%, −4% to 13%) or those treated with budesonide (−15%, −32% to 2%).
When any glucocorticoid was compared with placebo (11 studies, 1150 patients) there was no significant difference in the rate of intubation or tracheotomy (rate difference −2%, −14% to 10%; baseline rate 3.2%, 2.9% to 3.5%).
Overall, a significantly shorter time was spent in accident and emergency when children were treated with a glucocorticoid as compared with placebo (5 studies, 596 patients); the weighted mean difference was −11 (−18 to −4) hours. For inpatients, the difference was −16 (−31 to −1) hours.
There was a non-significant decrease of −16% (−39% to 6%) in the rate of hospitalisation for patients treated with budesonide versus patients treated with placebo (baseline rate 32%, 24% to 39%). This was also true for patients treated with dexamethasone as compared with patients treated with placebo (−2%, −31% to 5%) or if any glucocorticoid was compared with placebo (−14%, −12% to 5%). The more conservative random effects model was used to derive the overall estimate of the difference in hospitalisation rates because there was significant heterogeneity between studies. If the fixed effects model estimate was used there was a significant decrease in hospital admissions between patients treated with budesonide and those treated with placebo (−15%, −20% to −10%).
Sensitivity and subgroup analyses
The sensitivity analysis showed that the method of scoring the severity of croup was important (fig 2). An effect size of −1.2 (−1.7 to −0.7) was identified when the Westley croup score was used (9 studies, 569 patients) as compared with an effect size that was 50% smaller when other croup scores were used (4 studies, 497 patients; −0.6, −1.5 to 0.3), an effect that was not significant. The Westley score is the only method that has undergone validation and reliability testing and been shown to be sensitive to important changes in a patient's clinical status. The smaller treatment effect noted with non-Westley scores could be the result of lower sensitivity to change or perhaps a greater degree of variability caused by low reliability.
We were unable to compare the route of administration of glucocorticoids in a meaningful way because of the lack of standardisation of scores between studies. The quality weighting of the effect size did not change the estimate or the width of the 95% confidence interval; this is in part explained by the high methodological quality of the studies. The estimate derived from studies in which allocation was adequately concealed was −1.2 (−1.9 to −0.5), and the estimate for studies in which allocation was inadequately concealed or unclear was −0.9 (−1.4 to −0.3). These differences are probably not clinically or statistically significant.
We identified a marked publication bias, and there is also the possibility that small studies that showed that glucocorticoids had no effect were suppressed from publication. There was a significant correlation between treatment effect and sample size (for example, rank correlation test P=0.013; graphical method P=0.004). The Dear-Begg estimate of this correlation was 0.29. Pooled effect size at 6 hours calculated using the simple graphical method was −1.1 (−1.5 to −0.8); with the selection model it was −1.2 (−2.4 to −0.01); and with the trim and fill method it was −0.2 (−0.8 to 0.4). The trim and fill method suggested that seven small trials were suppressed because their results were not significant.
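The idea behind a rank correlation test for publication bias can be sketched with Kendall's tau computed between trial sample sizes and effect sizes. This is a simplified illustration only: the published test works on standardised effects and their variances and supplies a P value, neither of which is reproduced here, and the trial data below are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y) -> float:
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # Tied pairs count as neither concordant nor discordant.
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical pattern: the smaller the trial, the larger the apparent benefit
# (more negative effect size), as would be expected under publication bias.
sample_sizes = [20, 40, 80, 160]
effect_sizes = [-2.0, -1.5, -1.0, -0.5]
tau = kendall_tau(sample_sizes, effect_sizes)  # 1.0
```

In the absence of publication bias, effect size should not depend systematically on trial size, so tau would be expected to lie near zero.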
Efficacy of steroids
This meta-analysis has shown that treatment with glucocorticoids is effective in improving symptoms of croup in children by as early as 6 hours and for up to at least 12 hours after treatment. This is shown by the significant improvement in scores of croup severity, shorter hospital stays, and the fact that adrenaline was used less often as an additional intervention. Although the decrease in the rate of hospitalisation was not significant, this outcome criterion varies from hospital to hospital and the direction of the change was towards effectiveness. The degree of benefit identified would merit the use of glucocorticoids, since 5 to 7 patients would need to be treated with glucocorticoids for one patient to experience a significant improvement in symptoms.
This finding did not change when the quality of the studies included was incorporated into our pooled estimate. We found a significant improvement even though almost half of the patients included were assessed using scoring tools that have not been validated and may be less sensitive to important changes in the patient's clinical status.
Of more importance is the fact that publication bias seems to be a modifier of this result, and it is likely that our analysis did not include smaller studies that had statistically negative results. Publication bias is an important threat to the validity of systematic reviews and is difficult to combat except through the registration of all randomised controlled trials on human participants. The existence of this bias suggests that this meta-analysis overestimates the effectiveness of treatment with glucocorticoids. The results indicate that the number needed to treat at 12 hours is 5 patients for one patient to experience improvement. If publication bias exists and has exaggerated the benefit of treatment, then the number needed to treat would be greater. Thus, clinicians will have to decide whether it is still worth treating patients for croup. Considering the comparative safety and low cost of dexamethasone, it probably makes sense to continue using glucocorticoids. In cases in which the effect of adopting treatment with glucocorticoids has been examined, there has been evidence for a decline in hospital admission rates, fewer admissions to the intensive care unit, and shorter lengths of stay.22 23
The small numbers of patients in each study and confounding variables make it difficult to make definitive recommendations regarding the superiority of any glucocorticoid, dose, or route of administration. In the absence of further evidence, an oral dose of dexamethasone, probably 0.6 mg/kg, should be preferred because of its safety and efficacy. In a child who is vomiting, nebulised budesonide or intramuscular dexamethasone might be preferable.
Our results are mostly consistent with those of the meta-analysis by Kairys et al, which found that glucocorticoids are beneficial in patients with croup,7 but there are some important differences. We included only randomised controlled trials because of the lower probability of bias in such studies; hence some studies that were included by Kairys et al were not included in our meta-analysis. These excluded studies tended to be older and used techniques of quasirandomisation, such as alternate allocation.24–26 Additionally, 15 randomised controlled trials on this topic have been published since 1989, many of them outpatient trials examining the effectiveness of budesonide or dexamethasone. The differences between inclusion criteria in our meta-analysis and that by Kairys et al may account for Kairys et al's finding that glucocorticoids significantly decrease the risk of intubation, whereas we did not observe a significantly decreased risk.
The quality of the studies included was good. The median Jadad score in our study was 3; in other studies the median is often 2 or less.20 21 In 46% of the trials allocation was adequately concealed whereas in most other studies about 10% to 15% of the trials being assessed have adequate concealment.9 20 Although quality assessment and methods of its incorporation into systematic reviews are controversial, we have recently shown the importance of such assessments in detecting bias and have proposed a method of quality weighting.20
Outcome measures are important in detecting significant change in a patient's clinical status. It is important that this measure is valid (that it measures what it ought to be measuring) and responsive (that it is sensitive to change).27 This meta-analysis supports the importance of using valid and responsive outcome measures since the magnitude of the effectiveness of the treatment in this study was dependent on which scoring method was used. We have shown that the Westley score is valid, responsive, and a reliable measure.28 Although further validation and modification could be made to the Westley score, it should remain the primary outcome measure in trials currently being conducted.
Future trials may want to explore which dose of dexamethasone is most effective: is 0.15 mg/kg really as effective as 0.6 mg/kg? This meta-analysis supports the use of glucocorticoids to treat any patient with croup who has any signs of respiratory distress.
Alison Jones was helpful in retrieving articles and in managing this project. We also thank Jessie McGowen for helping with the electronic searches of bibliographic databases. We thank Jack Vevea for the use of a computer program which implemented the selection model for publication bias.
Contributors: MA and AS were responsible for most of the project management and were responsible for retrieving articles, assessing their relevance and quality, and extracting data. They also helped with writing the paper. BP helped with data management and in the statistical analysis of this project. DM helped in reaching the consensus decisions on relevance and quality assessment and provided methodological support and editorial comments. JDK assessed the quality of trials included and provided methodological support and editorial comments on the paper. TPK helped assess studies for their relevance for inclusion, checked the data for accuracy, and helped write the paper. Annie Walker, of the Child and Youth Clinical Trials Network, assisted in the preparation of this article. TPK is the guarantor for the study.
Funding: This review was supported in part by the Health Research Fund from the government of Spain: MA received FIS grant No 97/5407 and AS received FIS grant No 97/5408.
Competing interests: DWJ received research funding from Astra Pharmaceuticals, the manufacturer of budesonide, to complete a trial comparing treatment with budesonide, dexamethasone, and placebo (reference w12 on the BMJ's website).
Website extra: References for the trials included and other additional information can be found on the BMJ's website (www.bmj.com).