- Giuseppe G L Biondi-Zoccai, interventionist, PhD student12,
- Marzia Lotrionte, cardiologist3,
- Antonio Abbate, internal medicine resident4,
- Luca Testa, cardiology fellow3,
- Enrico Remigi, postdoctoral fellow5,
- Francesco Burzotta, assistant professor3,
- Marco Valgimigli, PhD student6,
- Enrico Romagnoli, interventionist7,
- Filippo Crea, director3,
- Pierfrancesco Agostoni, research fellow8
- 1 Hemodynamics and Cardiovascular Radiology Service, Policlinico San Donato, Via Morandi 30, 20097 San Donato Milanese, Italy
- 2 Institute of Medical Statistics and Biometrics, University of Milan, Milan, Italy
- 3 Institute of Cardiology, Catholic University, Rome, Italy
- 4 Department of Medicine, Virginia Commonwealth University, Medical College of Virginia Campus, Richmond, VA, USA
- 5 School of Engineering, University of KwaZulu-Natal, Durban, South Africa
- 6 Thorax Center, Erasmus University Medical Center, Rotterdam, the Netherlands
- 7 Department of Cardiology, Catholic University Hospital, Campobasso, Italy
- 8 AZ Middelheim Hospital, Antwerp, Belgium
- Correspondence to: G G L Biondi-Zoccai
- Accepted 14 November 2005
Objective To appraise multiple systematic reviews on the same clinical topic, focusing on predictors and correlates of quality of reporting of meta-analysis (QUOROM) scores.
Design Case study.
Setting Reviews providing at least individual quantitative estimates on role of acetylcysteine in the prevention of contrast associated nephropathy.
Data sources PubMed, the database of abstracts of reviews of effects, and the Cochrane database of systematic reviews (updated March 2005).
Main outcome measures Funding, compliance with the QUOROM checklist, scores on the Oxman and Guyatt quality index, and authors' recommendations.
Results 10 systematic reviews, published August 2003 to March 2005, were included. Nine pooled events despite heterogeneity and five recommended routine use of acetylcysteine, whereas the remaining studies called for further research. Compliance with the 18 items on the QUOROM checklist was relatively high (median 16, range 11 to 17), although shorter manuscripts had significantly lower scores (R = 0.73; P = 0.016). Reviewers who reported previous not for profit funding were more likely to score higher on the Oxman and Guyatt quality index. No association was found between QUOROM and Oxman and Guyatt scores (R = −0.06; P = 0.86), mainly because of greater emphasis of the Oxman and Guyatt scores on the appraisal of bias in selection and validity assessment (inadequate in five reviews).
Conclusions Multiple systematic reviews on the same clinical topic varied in quality of reporting and recommendations. Longer manuscripts and previous not for profit funding were associated with higher quality.
Systematic reviews incorporating meta-analytic methods provide a unique means of searching, appraising, and summarising the results of several studies to enable informed decision making.1 2 Since 1976 a large number of such systematic reviews have been published, varying widely in quality and standards.3–8 Anecdotal reports of multiple reviews focusing on the same clinical topic have differed in quality and methods used, leading to conflicting conclusions.7 9 10
In 1999 several authorities proposed the quality of reporting of meta-analysis (QUOROM) statement to improve and standardise reporting, as previously done with randomised controlled trials with the consolidated standards of reporting trials (CONSORT) statement.11 The aim of the QUOROM checklist was to improve the standards of reporting, not strictly to avoid duplicated systematic reviews.
Since then many journal editors and reviewers, including those involved in the Cochrane Collaboration, have pursued compliance with the QUOROM checklist to provide sound and reproducible results.1 The effect of the QUOROM guidelines on the design, conduct, and reporting of systematic reviews is, however, unclear.
Several randomised controlled trials have investigated the role of acetylcysteine in the prevention of contrast associated nephropathy, but with conflicting results. In light of this, a systematic review was carried out to provide more comprehensive and robust conclusions and reported in August 2003.12 In the following months other systematic reviews on the same topic were published, with different findings.
We appraised the cluster of duplicate systematic reviews on the role of acetylcysteine to prevent contrast nephropathy published in peer reviewed journals, focusing on predictors and correlates of QUOROM quality scores. We explored the association between compliance with the QUOROM statement and characteristics of the manuscripts, such as length, number of trials and patients, and funding source.
We searched for systematic reviews in PubMed according to a defined strategy,13 and in the Cochrane database of systematic reviews and the database of abstracts of reviews of effects (updated March 2005). Key terms included “acetylcysteine”, “contrast”, “meta-analysis”, “nephropathy”, and “systematic review”.14 We included systematic reviews of randomised controlled trials that compared acetylcysteine with control for the prevention of contrast associated nephropathy, reviews published as complete reports, and reviews that reported quantitative estimates at least for individual studies. We excluded systematic reviews embedded in editorials, correspondence, or reports of randomised controlled trials and those published only in abstract form. We also excluded incompletely reported reviews (for example, those reported in an editorial or in the discussion section of another study type or reported as part of a controlled trial) to avoid limiting the chance of the article complying with the QUOROM statement and thus undermining the quality of reporting of such reviews. We applied no language restrictions.
Study abstraction and appraisal
We abstracted data on the journal in which the systematic reviews were published, dates of initial submission and publication, number of authors, article length in pages, search strategies (including databases searched and language restrictions), study design (randomised and non-randomised controlled trials), number of trials and patients, journal impact factor,15 and whether estimates were pooled (with pertinent methods, summary estimates, and results). We also checked for use of funnel plots to appraise small sample size effects, whether heterogeneity was explored and discussed, methods used, and whether formal metaregression had been carried out and the control event rate had been used as covariate.2 16 We contacted the corresponding authors for details of funding sources (any previous for profit or not for profit funding or specific funding for the review).
As a measure of study quality we appraised the compliance of each systematic review with the QUOROM checklist.11 We considered that a review had complied with any one of the 18 specific items if over 50% of the requirements for that item had been met. We also used the Oxman and Guyatt index for scientific quality of research reviews, a validated score with established inter-rater agreement (see www.bmj.com).2 17 Finally, we assessed the main findings and conclusions and distinguished these as supporting the role of acetylcysteine in preventing contrast associated nephropathy, not supporting its role, or calling for further research. We then extracted excerpts from the authors' recommendations. Two unblinded reviewers (GGLB-Z, PA) independently appraised the studies. Discrepancies were resolved by consensus.
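The compliance rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example review are hypothetical, and the number of requirements per item is invented rather than taken from the QUOROM checklist itself.

```python
from fractions import Fraction


def item_complies(met: int, total: int) -> bool:
    """An item counts as complied with when over 50% of its requirements are met."""
    if total <= 0:
        raise ValueError("an item needs at least one requirement")
    # Exact fractions avoid floating point surprises at the 50% boundary.
    return Fraction(met, total) > Fraction(1, 2)


def quorom_score(items: list[tuple[int, int]]) -> int:
    """Number of checklist items judged compliant for one review."""
    return sum(item_complies(met, total) for met, total in items)


# Hypothetical review: 18 items, each given as (requirements met, requirements listed).
example_review = [(2, 2)] * 10 + [(1, 2)] * 3 + [(2, 3)] * 4 + [(0, 1)]
print(quorom_score(example_review), "of", len(example_review), "items")
```

Note that an item with exactly half of its requirements met does not count as compliant, because the rule asks for over 50%.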
We used Fisher's exact test to compare the proportions of categorical variables (for example, funding source and authors' recommendations) and Mann-Whitney U or Kruskal-Wallis tests to compare the means (ranges) for continuous variables (for example, QUOROM scores according to funding source). Pearson and Spearman tests were used to assess correlations. We used Cohen κ and φ to determine the agreement between QUOROM scores, funding source, and authors' recommendations before consensus.18 Analyses were carried out using SPSS version 11.0.
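For readers unfamiliar with Fisher's exact test, the sketch below shows one common way to compute a two sided P value for a 2×2 table using only the Python standard library. It is not the SPSS routine used in the study, and the example counts are hypothetical; the two sided definition used here (summing all tables no more probable than the observed one) is one convention among several.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more probable than the observed one.
    total = sum(
        p
        for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical counts: rows are two funding categories,
# columns are "recommends routine use" and "calls for further research".
print(round(fisher_exact_two_sided(3, 1, 1, 3), 3))
```

With samples as small as 10 reviews, only a handful of tables are possible, so the attainable P values are coarse and modest differences in proportions rarely reach significance.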
Figure 1 shows the flow of reviews through the study. After exclusions from the 52 citations initially identified (double hits, non-systematic or incompletely reported reviews,19 reviews that lacked quantitative effect estimates, and an ongoing Cochrane review (due for publication in late 2006)20), 10 articles were included (table 1).12 21–29 All were originally submitted between May 2003 and July 2004, published between August 2003 and March 2005, and with no overlap in authorship (49 investigators). Table 2 gives details of the randomised controlled trials that were pooled for each included systematic review. Reviews carried out later could identify and pool randomised controlled trials reported after the publication of the earliest reviews. In three cases the journal in which articles appeared was a general medicine journal, in three cases a cardiology journal, and in four cases a nephrology journal. The median impact factor was 3.9 (range 0 to 18.3). All the studies used the same primary end point of contrast associated nephropathy (defined as > 44.2 μmol/l or > 25% increase in serum creatinine concentration from baseline to 48 hours).
Funding was from for profit sources in five cases, not for profit sources in seven cases, and specific for the review in one case (table 1).
Search strategies varied from extensive (five databases in one case) to relying on only Medline. Summary estimates used for pooling were odds ratios (three reviews), relative risks (six), and risk differences (one), by means of random effects (eight) or fixed effects (one) methods (table 3). Quantitative results were reported using both P values and confidence intervals in all cases, a practice supported by most reviewers.1 2 Exploratory metaregression was carried out in light of statistical heterogeneity in six studies. Although most metaregressions included covariates likely to accommodate variability in underlying risk, none formally tested the control group event rate as a covariate, a practice advocated by some authors and criticised by others.16 30
Five reviews recommended routine, or almost routine, use of acetylcysteine in the prevention of contrast nephropathy. The other five were more cautious and, although not dismissing the potential benefit and limited toxicity and costs of acetylcysteine, called for further well designed randomised controlled trials on the subject (table 4). These discrepancies occurred despite similar pooled effect estimates.
Compliance with the QUOROM statement
Agreement before consensus between reviewers was high for compliance with the QUOROM items (174/180 (91%), κ = 0.89, 95% confidence interval 0.80 to 0.97; φ = 0.89), and 100% (10 reviews) for authors' recommendations. The median compliance with the QUOROM checklist was 16 (range 11 to 17; table 5).
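As an illustration of the two agreement statistics quoted above, the sketch below computes Cohen's κ and the φ coefficient from a 2×2 table of paired yes/no judgments by two raters. The cell counts are hypothetical (the per cell breakdown of the reviewers' judgments is not reported here), and the function name is an assumption made for this example.

```python
from math import sqrt


def kappa_and_phi(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Cohen's kappa and phi for two raters making yes/no judgments.

    a: both raters yes; b: rater 1 yes, rater 2 no;
    c: rater 1 no, rater 2 yes; d: both raters no.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance from the raters' marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (observed - expected) / (1 - expected)
    phi = (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return kappa, phi


# Hypothetical table for 180 item level judgments (10 reviews x 18 items).
kappa, phi = kappa_and_phi(120, 5, 7, 48)
print(f"kappa = {kappa:.2f}, phi = {phi:.2f}")
```

Because κ corrects raw agreement for the agreement expected by chance, it is always lower than the simple proportion of matching judgments unless agreement is perfect.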
In only three cases could the report be identified from the title as a meta-analysis or systematic review of randomised controlled trials. A structured abstract was available in seven reviews, and objectives and conclusions were explicitly stated in 10 and nine, respectively. In the abstract sections, data sources, review methods, and results complied with the QUOROM statement in only six cases.
All the studies fulfilled the QUOROM requirements for the introduction and discussion sections, as they reported search strategies, selection criteria, and methods for quantitative data synthesis employed for the systematic review. Most of the studies complied with the recommendations for reporting of the assessment of the methodological quality of the primary studies (seven reviews), data abstraction (eight), and study characteristics (nine).
Despite thorough reporting of methods by most studies, no review quantified the agreement between reviewers on selection and appraisal of the methodological quality of the primary studies. A diagram showing the flow of the trial was provided in only three cases. Details on study characteristics and quantitative data synthesis in the results section were, however, provided in enough detail in all the reviews.
Quantitative analyses and predictors of QUOROM score
A significant association was found between number of published pages and overall QUOROM score (Pearson R = 0.73, P = 0.016, Spearman ρ = 0.64, P = 0.046), suggesting that the longer the manuscript, the greater the likelihood of complying with the QUOROM statement (fig 2). Although this finding might have relevant implications for editorial policies, it should be interpreted with caution given the small sample size and the fact that two studies largely drive the hypothesis tests.
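The caution about a few studies driving a correlation in a sample of 10 can be made concrete with a leave one out check. The sketch below computes Pearson and Spearman coefficients in plain Python and then recomputes Pearson's R with each review dropped in turn. The page counts and scores are hypothetical values chosen for illustration, not the data from the included reviews.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Average ranks (1 based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(ranks(x), ranks(y))


def leave_one_out(x, y):
    """Pearson R recomputed with each observation removed in turn."""
    return [pearson(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(len(x))]


# Hypothetical page counts and QUOROM scores for 10 reviews.
pages = [4, 5, 6, 6, 7, 8, 8, 9, 10, 13]
scores = [11, 12, 14, 16, 15, 16, 16, 17, 16, 17]
print("Pearson R:", round(pearson(pages, scores), 2))
print("Spearman rho:", round(spearman(pages, scores), 2))
print("Leave one out R:", [round(r, 2) for r in leave_one_out(pages, scores)])
```

A wide spread in the leave one out values signals that the overall coefficient depends heavily on individual reviews.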
Language restrictions were associated with fewer trials analysed (7, range 5-11 v 15, range 7-21, P = 0.040) and fewer patients analysed (805, range 643-1207 patients v 1539, 805-2420; P = 0.033) (tables 1 and 3).
Studies from reviewers reporting previous not for profit funding were more likely to score higher on the Oxman and Guyatt index (6, range 3-7 v 2, range 1-4; P = 0.037), to search more databases for original articles (4, 3-5 v 1.3, 1-2; P = 0.014), and to be published in a journal with a greater impact factor (4.9, 3.1-18.3 v 1.3, 1.2-1.5; P = 0.020). Conversely, funding was not significantly associated with authors' recommendations supporting the routine use of acetylcysteine (43% (three of seven) for studies reporting previous not for profit funding compared with 66% (two of three) for the others; P = 0.42) or QUOROM score (16, range 11-17 for studies with previous not for profit funding compared with 14, range 12-16 for the others; P = 0.38).
Journal readership on the basis of type of journal was significantly associated with quality of reporting, with a significant trend towards increasing QUOROM scores from reviews published in cardiology journals compared with those published in general medicine and nephrology journals. Moreover, QUOROM scores and Oxman and Guyatt scores were not associated (Pearson R = −0.06, P = 0.86, Spearman ρ = −0.11, P = 0.76). In most cases this was due to lack of assessment of bias in selecting and abstracting studies (six reviews) or lack of appraisal of internal validity of individual randomised controlled trials (five).
Limits of the included studies
Clinical and statistical heterogeneity are a major source of discordance in meta-analyses. A thorough appraisal of this issue is paramount in any research synthesis. Moreover, some authorities consider heterogeneity (P < 0.05 to 0.10 with hypothesis tests or I² > 50%) a potential hurdle to the completion of a formal meta-analysis.1 2 Heterogeneity was appraised in most of the included meta-analyses (nine reviews), by means of Cochran Q, DerSimonian-Laird, Breslow-Day, or χ² tests. In all cases evidence for statistical heterogeneity was present (P values ranging from 0.05 to < 0.001), and in all studies except one23 reviewers computed pooled effect size, on the basis of random effects methods (seven reviews), on fixed effects (one), and on both fixed and random effects (one).
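The heterogeneity measures named above are related by simple formulas, sketched below in plain Python: Cochran Q from inverse variance weights, I² as the share of Q in excess of its degrees of freedom, and the DerSimonian-Laird estimate of between study variance (τ²) used for random effects pooling. The input log odds ratios and variances are hypothetical, and the function name and returned keys are assumptions made for this example.

```python
def heterogeneity(effects, variances):
    """Cochran Q, I squared (%), DerSimonian-Laird tau squared, and pooled effects."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    pooled_fixed = sum(w * e for w, e in zip(weights, effects)) / sum_w
    # Cochran Q: weighted squared deviations from the fixed effect estimate.
    q = sum(w * (e - pooled_fixed) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    scale = sum_w - sum(w * w for w in weights) / sum_w
    tau_squared = max(0.0, (q - df) / scale) if scale > 0 else 0.0
    # Random effects weights add tau squared to each within study variance.
    re_weights = [1 / (v + tau_squared) for v in variances]
    pooled_random = sum(w * e for w, e in zip(re_weights, effects)) / sum(re_weights)
    return {
        "Q": q,
        "df": df,
        "I2": i_squared,
        "tau2": tau_squared,
        "fixed": pooled_fixed,
        "random": pooled_random,
    }


# Hypothetical log odds ratios and their variances for seven trials.
log_odds_ratios = [-1.9, -1.2, -0.9, -0.3, 0.1, 0.2, -2.1]
variances = [0.60, 0.35, 0.40, 0.15, 0.10, 0.20, 0.70]
for name, value in heterogeneity(log_odds_ratios, variances).items():
    print(name, round(value, 3))
```

When τ² is greater than zero the random effects weights are more nearly equal than the fixed effect weights, so small trials gain influence on the pooled estimate.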
A major source of bias in meta-analyses is small study size, which predisposes to a higher probability of publishing, quoting, and disseminating the results of randomised controlled trials with significant or positive results. This leads to pooled effect estimates being skewed to more positive findings, as small but significant randomised controlled trials are not counterbalanced in the statistical pooling procedure by similarly small but negative or non-significant studies, as shown graphically by funnel plots.1 2 Bias due to small study size was acknowledged and explicitly tested in most (eight) reviews, using Begg, Egger, or Rosenthal tests. Such tests provided significant results, suggesting the potential presence of such bias in five cases, prompting reporting and discussion of funnel plots in four of the five, plus another review that had none the less reported a non-significant test (table 1). Despite checking for small study bias among the same pool of seven studies, Birck et al12 and Isenbarger et al21 reached disparate conclusions, respectively in favour and against the likely presence of such bias, a finding that can be partly explained by the use of risk ratios by Birck et al and odds ratios by Isenbarger et al. Moreover, despite such likelihood (P < 0.10) for small study bias at specific tests, all meta-analyses went forward with the pooled estimates thus providing potentially biased results. Duong et al carried out two separate analyses, the first limited to randomised controlled trials published as complete reports and the second also including randomised controlled trials available only as abstracts.29 They thus showed that bias due to small study size was likely when selecting studies published only as complete reports (P = 0.02), but that it was no longer evident when including studies reported only as abstracts (P = 0.22).29
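Of the tests mentioned above, Egger's is the simplest to sketch: the standardised effect of each trial (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and an intercept far from zero suggests funnel plot asymmetry. The code below is a minimal unweighted version in plain Python under stated assumptions: the function name is hypothetical, the example data are invented, and the P value is omitted because it requires the t distribution with n − 2 degrees of freedom, which the standard library does not provide.

```python
from math import sqrt


def egger_regression(effects, standard_errors):
    """Ordinary least squares fit of effect/SE on 1/SE (Egger's regression).

    Returns (intercept, slope, t statistic of the intercept).
    """
    n = len(effects)
    if n < 3:
        raise ValueError("need at least three studies")
    x = [1 / se for se in standard_errors]  # precision
    y = [e / se for e, se in zip(effects, standard_errors)]  # standardised effect
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = sqrt(residual_ss / (n - 2) * (1 / n + mx**2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, t_stat


# Hypothetical log risk ratios and standard errors: smaller trials (larger SE)
# show more extreme protective effects, the pattern that small study bias produces.
log_rr = [-1.6, -1.3, -1.0, -0.7, -0.4, -0.3, -0.2]
se = [0.80, 0.70, 0.55, 0.45, 0.30, 0.25, 0.20]
intercept, slope, t_stat = egger_regression(log_rr, se)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}, t = {t_stat:.2f}")
```

Because the regression works on the scale of the chosen effect measure and its standard error, the same trials can yield different results when analysed as risk ratios or as odds ratios.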
Overall compliance with the QUOROM checklist was relatively good by the investigators of 10 overlapping systematic reviews on the role of acetylcysteine in the prevention of contrast associated nephropathy. Shorter manuscripts were of significantly lower quality and previous not for profit funding was associated with higher quality.
Less rigorous reviews have reported positive conclusions more often than more robust reviews focusing on the same topic,10 even if other studies did not corroborate the hypothesis.5 31 32 More recently a qualitative comparison of overlapping yet heterogeneous systematic reviews on complementary medicine showed that reviews differed greatly, mainly because of subjective decisions during the planning, carrying out, and interpretation of the reviews.33 Despite these data, the phenomenon of overlapping systematic reviews seems to be becoming more prevalent, as highlighted by the present study series. The time has come for an open debate on this issue.
Systematic reviews have the inherent drawback that several independent meta-analysts may conceive, carry out, and submit a review on the same topic at the same time. This phenomenon of overlapping systematic reviews has already occurred.34 35 Some may welcome this as a means of highlighting the subjective choices and different perspectives available in carrying out research synthesis. The risk of wasting resources and providing the general medical readership with contradictory conclusions should not, however, be dismissed.
Potential reasons for the publication of 10 overlapping systematic reviews in less than two years are the preference given by investigators, editors, and readers to systematic reviews,36 the lack of a pivotal large randomised controlled trial, and the potential for a dose-response gradient.37
In the present study series, 49 investigators were involved in analysing the same clinical topic, exemplifying the waste of time and duplication of efforts in such research. Too many systematic reviews are being produced for diseases with a limited global burden, whereas many other potentially more relevant clinical conditions are not undergoing a thorough and systematic research synthesis.38 Constraints on space reflected by manuscript length (number of pages in our study) and abstract length are one of the most important hurdles faced by meta-analysts wanting to document their work thoroughly. Our findings suggest that no meta-analysis can achieve optimal quality scores unless sufficient space is provided for all its sections. Countermeasures for this problem seem straightforward, and range from increasing limits on space for systematic overviews of randomised controlled trials to the provision of online data supplements whenever room is limited. The Cochrane Collaboration does not impose any space limit for its systematic reviews.
Funding might influence outcomes and quality of research.39 Reviewers who reported previous not for profit funding were more likely to carry out higher quality systematic reviews. The association is not, however, synonymous with causality, and other explanations for our results include greater experience and expertise among review teams that have already carried out systematic reviews.
Whereas the QUOROM statement concerns reporting quality, the Oxman and Guyatt index was designed to focus on methodological quality.17 The relative disagreement between the scores in our study stems from the different purposes of the checklists. Indeed, lack of reporting on the assessment of the internal validity of single randomised controlled trials may restrict Oxman and Guyatt scores to the low 1-3 range, even for well reported meta-analyses.2
One major difference between the systematic reviews we appraised was the number of primary studies included, with some reviews analysing as few as five studies,27 despite being submitted for publication almost one year after publication of a review of 12 studies.22 Potential explanations for such discrepancies might be the extent of database searches (whether or not conference proceedings were handsearched), language restrictions, whether selection was restricted to randomised controlled trials or included non-randomised studies, and timing of review. We cannot exclude the possibility that some search strategies missed eligible trials.
Finally, almost all reviewers carried out and reported formal meta-analyses in the presence of heterogeneity and small study size. Although metaregression and sensitivity analyses were employed to tackle these issues, the conflicting recommendations may in part be a consequence of such discrepancies and, more likely in the present case, biased publication of primary studies.
What is already known on this topic
Multiple systematic reviews on the same clinical topic may produce conflicting results
The quality of reporting of meta-analysis (QUOROM) statement was developed to improve the quality of reviews, yet its effect is uncertain
What this study adds
Ten overlapping systematic reviews on acetylcysteine to prevent contrast nephropathy varied in quality of reporting and recommendations
Longer manuscripts and previous not for profit funding were associated with higher quality
Preventive strategies for multiple overlapping reviews
The best approach to avoid multiple overlapping reviews is probably that of the Cochrane Collaboration, which mandates prospective registration of titles and protocols.1 Another option might be online registration of meta-analyses already accepted for publication by a journal, as this would not disseminate findings before completion of the study but would forewarn other researchers involved in the same topic and allow investigators to refocus on their work. Such recommendations come from personal experience, as our group has already been involved in several potential cases of overlapping meta-analyses.40–42 Nonetheless, these preventive measures are likely to impose logistical and financial hurdles potentially proving counterproductive. Finally, some of the problems raised by our work are likely inherent practical drawbacks of systematic reviews.
Practical implications for the clinical use of acetylcysteine
Although the clinical use of acetylcysteine was not a formal goal of our work, some broad recommendations can be gleaned from the available evidence. The studies found no evidence of toxicity or adverse effects from using acetylcysteine. However, the real size of the preventive effect cannot be definitively inferred, given the clinical and statistical heterogeneity and the lack of clear explanations for such an inconsistency.12 21 23 24 26 29
Limitations of our study include its design, its scope for generating the hypothesis, and the risk of α and β errors.43 Moreover, most of this study involved the appraisal of published meta-analyses using the QUOROM checklist. Although scientific evidence supports only a subset of the items on the checklist, we believe that it can and should be used as a standard in the preparation, reporting, and appraisal of meta-analyses of randomised controlled trials on the basis of its wide dissemination and endorsement and the relative ease with which compliance can be achieved. It should be borne in mind that the QUOROM checklist was not developed as a quality measurement tool, and validation of this novel application is lacking.
Description of Oxman and Guyatt index is on www.bmj.com
Competing interests None declared.
Contributors GB-Z and PA participated in the study design, data acquisition, analysis, and interpretation, and provided statistical expertise. GB-Z drafted the manuscript and is guarantor. PA critically revised the manuscript. AA, FB, ML, ER, LT, ER, FC, and MV participated in data analysis, interpretation, and critical revisions of the manuscript. AA and FB provided statistical expertise.
Ethical approval Not required.