- a National Health and Medical Research Council Clinical Trials Centre, Mallet Street Campus, University of Sydney, Sydney NSW 2006, Australia
- Correspondence to: Professor Simes
- Accepted 26 August 1997
Objectives: To determine the extent to which publication is influenced by study outcome.
Design: A cohort of studies submitted to a hospital ethics committee over 10 years was examined retrospectively by reviewing the protocols and by questionnaire. The primary method of analysis was Cox's proportional hazards model.
Setting: University hospital, Sydney, Australia.
Studies: 748 eligible studies submitted to Royal Prince Alfred Hospital Ethics Committee between 1979 and 1988.
Main outcome measures: Time to publication.
Results: Response to the questionnaire was received for 520 (70%) of the eligible studies. Of the 218 studies analysed with tests of significance, those with positive results (P<0.05) were much more likely to be published than those with negative results (P≥0.10) (hazard ratio 2.32 (95% confidence interval 1.47 to 3.66), P=0.0003), with a significantly shorter time to publication (median 4.8 v 8.0 years). This finding was even stronger for the group of 130 clinical trials (hazard ratio 3.13 (1.76 to 5.58), P=0.0001), with median times to publication of 4.7 and 8.0 years respectively. These results were not materially changed after adjusting for other significant predictors of publication. Studies with indefinite conclusions (0.05≤P<0.10) tended to have an even lower publication rate and longer time to publication than studies with negative results (hazard ratio 0.39 (0.13 to 1.12), P=0.08). For the 103 studies in which outcome was rated qualitatively, there was no clear-cut evidence of publication bias, although the number of studies in this group was not large.
Conclusions: This study confirms the evidence of publication bias found in other studies and identifies delay in publication as an additional important factor. The study results support the need for prospective registration of trials to avoid publication bias and also support restricting the selection of trials to those started before a common date in undertaking systematic reviews.
- This retrospective cohort study of clinical research projects confirms the evidence of publication bias found in previous studies
- Delay in the publication of studies with negative results has been identified as an additional important factor in publication bias
- With the recognised importance of evidence based medicine, these results have important implications for the selection of studies included in systematic reviews
- Prospective registration of clinical research projects will avoid many of the problems associated with publication bias
- However, it is also important to restrict inclusion in systematic reviews to studies started before a certain date to allow for the delay in completing studies with negative results
In evaluating the effectiveness of treatments, the highest level of evidence is believed to be obtained from a systematic review or meta-analysis of all randomised controlled trials. However, evidence of treatment effectiveness even from randomised trials can still be biased owing to a number of factors: bias within individual trials that are not properly randomised or not analysed according to intention to treat1; bias in selecting trials for inclusion in a meta-analysis,2 3 particularly when only published trials are included4; and bias in selecting treatment questions after examining the data.5 In particular, if trials with a positive effect of treatment are more likely to be published, a review limited only to published trials would give a more positive effect of treatment than a review based on all trials (published and unpublished).
The selection of trials for inclusion in a systematic review may be biased if selection is restricted to published trials (publication bias),4 to trials published in English language journals (language bias),6 to trials published in prestigious journals,7 or to trials cited by other authors such as in review articles (reference bias).8 Furthermore, even if all relevant trials are eventually published, the selection of trials may still be biased if it is restricted to trials that are published early.
Evidence for publication bias has been shown in many studies,9 10 11 12 13 14 15 16 17 18 19 and particularly in three cohort studies of protocols submitted to institutional ethics committees.20 21 22 All three studies showed a significantly greater likelihood for trials with significant results to be published than for those with negative results. These studies did not, however, examine whether there was a delay in publication for trials that were eventually published.
We determined the extent of publication bias for studies submitted to an Australian ethics committee and whether publication was delayed for studies with negative results in comparison with those with positive results.
Between September 1979 and December 1988, 801 submissions were received by the Royal Prince Alfred Hospital Ethics Committee for proposed medical research. Each was accompanied by a protocol providing a detailed outline of the proposed research, from which was obtained information on approval, details of the research design, the planned sample size, and the nature of any intervention. Research design was classed primarily as observational study, clinical trial, or non-trial experiment according to the definitions of Easterbrook et al.20
In July 1992 the principal investigator for each study was asked to complete a questionnaire providing information on the current status; starting date, closure of recruitment, and finishing date; sample size reached; the nature of funding (none, pharmaceutical, government, other (external), or other (internal)); the rating of scientific importance of the study; the status and date of the most recent analysis; the main research questions posed by the study at the outset; the results for the main research questions; and the publication status and date of initial publication as an article in a peer reviewed journal.
Eligible studies were defined as single studies approved by the Royal Prince Alfred Hospital Ethics Committee between September 1979 and December 1988 with more than one patient and with protocol information available.
Classification of study outcome
For quantitative studies, in which the main study outcome was assessed by using statistical methods with tests of significance, outcome was classed as significant (P<0.05), as showing a non-significant trend (0.05≤P<0.10), or as non-significant or null (P≥0.10). Examples of such studies include those comparing treatments and epidemiological studies examining evidence of association for risk factors.
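The three-way classification above maps directly onto a small helper function. This is purely an illustration of the thresholds stated in the text, not code from the study:

```python
def classify_outcome(p_value: float) -> str:
    """Classify a quantitative study's primary result by its P value,
    using the thresholds defined in the text."""
    if p_value < 0.05:
        return "significant"            # positive result
    elif p_value < 0.10:
        return "non-significant trend"  # indefinite conclusion
    else:
        return "null"                   # negative result
```

For example, `classify_outcome(0.07)` falls in the indefinite band, which the results below show fared worst of all for publication.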
For qualitative studies, in which the main study outcome was assessed subjectively by the principal investigator, the study was classed as showing striking, important and definite, or unimportant and negative findings. Such studies included uncontrolled phase 2 clinical trials examining response rate to treatment and studies with descriptive statistics of a population.
Potential factors predictive of time to publication were examined in a Cox regression analysis, in which time to publication was the time from the date that the study was approved by the ethics committee to the date of first publication. Unpublished studies were censored at the date the questionnaire was completed; studies that were in press were analysed as if published on the date of completion of the questionnaire. Studies for which no analysis had yet been undertaken and two studies whose early findings had first been published before approval from this ethics committee were excluded from the survival analysis. Funding sources were compared in two ways to allow comparison with earlier studies20 21: pharmaceutical v non-pharmaceutical and external v internal or none. We made no allowance for multiple comparisons. With the exception of the investigator's rating of scientific importance of the study, which we judged to be largely influenced by study results, all other factors were examined in a multivariate Cox regression to determine the relative importance of study results on time to publication adjusted for any other significant factors. Quantitative and qualitative studies were examined in the same model by also including a factor for method of assessment of study outcome. A subsidiary logistic regression analysis was performed to permit a direct comparison of the results of this study with earlier cohort studies of clinical research projects, which were analysed using odds ratios for publication.
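Time-to-publication data of this kind, with unpublished studies treated as censored observations, can be summarised non-parametrically with a Kaplan-Meier product-limit estimate (the survival curves in figures 1 and 2 are of this form; the regression itself used the Cox model). A minimal sketch with invented data:

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.
    times:  years from ethics approval to publication (or to censoring)
    events: 1 if published, 0 if censored (e.g. unpublished at questionnaire)
    Returns a list of (event time, S(t)) pairs."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = sum(e for tt, e in data if tt == t)    # publications at time t
        c = sum(1 for tt, e in data if tt == t)    # all leaving risk set at t
        if d > 0:
            surv *= (n_at_risk - d) / n_at_risk    # product-limit step
            curve.append((t, surv))
        n_at_risk -= c
        while i < len(data) and data[i][0] == t:   # advance past ties
            i += 1
    return curve

# Invented example: three studies, one censored at 2 years
curve = kaplan_meier([1.0, 2.0, 3.0], [1, 0, 1])
```

The median time to publication reported in the results is the time at which such a curve crosses 0.5.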
The association between the various study characteristics and study status was examined by calculating the χ2 statistic for an r×c contingency table.
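The Pearson χ2 statistic for an r×c table compares observed counts with the counts expected under independence of rows and columns. A self-contained sketch (illustrative only; the counts below are invented):

```python
def chi_squared(table):
    """Pearson chi-squared statistic for an r x c contingency table,
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (obs - expected) ** 2 / expected
    return stat

# Invented 2x2 example: study design (rows) by completion status (columns)
stat = chi_squared([[10, 20], [20, 10]])
```

The statistic is referred to a χ2 distribution with (r−1)(c−1) degrees of freedom to obtain the P values quoted in the results.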
A total of 801 protocols for 810 separate studies were submitted for approval between September 1979 and December 1988. Of these, 748 separate studies in 741 submissions were approved during this period and included in the study. We excluded studies that had not been approved during this period, 19 duplicate study submissions, four submissions that were not formal studies, five studies on one patient alone, and five studies whose submission protocols or relevant data were not available.
Questionnaires were completed for 520 (70%) of the eligible studies, with a larger proportion of questionnaires completed for the more recently approved studies (P=0.0007). Apart from this, the returned questionnaires were representative of the total sample of eligible studies for all factors for which data were available from the study protocols.
Table 2 shows the status of the studies at the time of questionnaire completion. The main reasons for not starting in 100 studies were lack of funding (45), lack of support from colleagues (9), departure of staff (6), and impractical methods (6). Seventy studies (of which 50 were clinical trials) were abandoned after starting, mainly because of difficulties with patient accrual (23), funding problems (6), and technological problems (6) and because the early results that had been obtained were negative (6, all clinical trials).
Association between study characteristics and completion of study
Analysis of the association between the various study characteristics and completion of the 520 studies for which questionnaires were completed showed a higher rate of not starting for non-trial experiments (31/114; 27%) than for observational studies (23/129; 18%) or for clinical trials (43/277; 16%), and a higher completion rate for observational studies (85/129; 66%) than for clinical trials (156/277; 56%) or for non-trial experiments (46/114; 40%) (P<0.0001). There was also a far higher completion rate for studies that were undertaken as part of a degree compared with those that were not (85/107; 79% v 202/396; 51%) (P<0.0001). Single centre studies were more likely never to have started or to have been abandoned than were multicentre studies (123/337; 36% v 28/163; 17%) (P<0.0001). Studies with a small sample size reached had a higher rate of abandonment (<100 (40/231; 17%) v ≥100 (0/141)) (P<0.0001), which may simply reflect being abandoned because of poor accrual. Clinical trials with a more rigorous study design were far more likely to be completed (randomised (87/130; 67%) v non-randomised (69/147; 47%), P=0.002; placebo controlled (31/44; 70%) v non-placebo controlled (125/233; 54%), P=0.06; and blinded (43/58; 74%) v non-blinded (113/219; 52%), P=0.004).
Study outcome and method of analysis
Of the 520 studies with completed questionnaires, 321 had had analysis undertaken with results available and were included in further analysis of the association between study outcome and time to publication.
Table 3 summarises study outcome and method of analysis. Of the quantitative studies, 67% had a significant result for the primary research question, whereas only 26% of qualitative studies were in the most positive category. Accordingly, quantitative and qualitative studies were considered separately in subsequent analyses.
Study characteristics and publication
Table 4 shows the hazard ratios for publication for selected variables. Factors predictive of publication were significant results; research design using non-trial experiments; a high scientific importance rating of the study by the investigator; external funding; and studies with non-comparative study groups and clinical trials that were non-randomised. Other variables tested and found not to be significant were whether the study was undertaken as part of a degree; the number of data collection sites (single v multicentre); pharmaceutical funding; the research department undertaking the study; the year of study approval; the classification of study outcome (qualitative v quantitative); and, for clinical trials, placebo control and blinding. For the 218 quantitative studies, positive studies were much more likely to be published than negative studies (hazard ratio 2.32 (95% confidence interval 1.47 to 3.66), P=0.0003). This finding was even stronger for the subgroup of 130 quantitative clinical trials (hazard ratio 3.13 (1.76 to 5.58), P=0.0001).
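As a rough intuition for what a hazard ratio measures here (this is not the Cox model the paper fitted), one can compare crude publication rates per study-year of follow-up between two groups. Under proportional hazards with approximately constant rates, the ratio of incidence rates approximates the hazard ratio:

```python
def crude_hazard_ratio(events_a, persontime_a, events_b, persontime_b):
    """Ratio of incidence rates (publications per study-year of follow-up).
    A crude approximation to the hazard ratio; the paper's estimates
    come from Cox proportional hazards regression."""
    rate_a = events_a / persontime_a
    rate_b = events_b / persontime_b
    return rate_a / rate_b

# Invented example: 10 publications in 100 study-years (positive studies)
# v 5 publications in 100 study-years (negative studies)
hr = crude_hazard_ratio(10, 100, 5, 100)
```

Unlike an odds ratio for publication by a fixed date, the hazard ratio uses the full time-to-publication information, which is what allows the delay in publishing negative studies to be detected.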
Figure 1 shows the survival curve plots for time to publication for quantitative studies and figure 2 those for quantitative clinical trials. The median time to publication was 4.82 (3.87 to 5.72) years for studies with significant results, 7.99 (6.91 to ∞) years for studies with null results, and not reached for studies with indefinite conclusions. The median time to publication was 4.69 (3.75 to 5.72) years for clinical trials with significant results and 7.99 (7.02 to ∞) years for clinical trials with null results.
The survival curve plots for sample size (not shown) show that studies with a larger sample size (≥100) continued to be published right until the end of the study period, at which time about 90% of studies and clinical trials had been published, compared with only about 64% of studies and clinical trials with a sample size <100.
The odds ratio for publication of quantitative studies was 2.66 (1.32 to 5.35) (P=0.003) for significant compared with null studies. For quantitative clinical trials it was 4.19 (1.71 to 10.32) (P=0.0004).
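An odds ratio of this kind comes from a 2×2 table of study result against publication status; a common approximate confidence interval is Woolf's logit method. A minimal sketch with invented counts (the paper's own estimates come from logistic regression):

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (Woolf's method) for a 2x2 table:
                    published   unpublished
    significant         a            b
    null                c            d
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)  # SE of log odds ratio
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

# Invented counts for illustration
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
```

The odds ratio ignores when publication occurred, which is why the paper's primary analysis used hazard ratios from the Cox model instead.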
Figure 3 shows the results of multivariate analysis. For quantitative studies the adjusted hazard ratio for publication for positive compared with negative studies was 2.34 (1.47 to 3.73) (P=0.0004); for studies with intermediate results it was 0.43 (0.15 to 1.24) (P=0.12). For quantitative clinical trials the adjusted hazard ratio for publication for positive compared with negative trials was 3.29 (1.84 to 5.90) (P=0.0001); for trials with intermediate results it was 0.50 (0.14 to 1.74) (P=0.27). For quantitative studies the adjusted odds ratio for publication for positive compared with negative studies was 2.93 (1.49 to 5.74); for studies with intermediate results it was 0.34 (0.17 to 0.67). For quantitative clinical trials the adjusted odds ratio for publication for positive compared with negative trials was 4.57 (1.96 to 10.63); for trials with intermediate results it was 0.44 (0.10 to 1.91).
Quantitative studies with significant study outcomes were much more likely to be submitted for publication than studies with a null study outcome (114/146; 78% v 28/52; 54%) (P=0.0009). By contrast, 130 of the 148 quantitative studies submitted had been published or were in press, and the publication rate for those studies with significant outcomes (99/114; 87%) was similar to that for those with null results (23/28; 82%) (P=0.54).
We found publication bias, after allowing for confounding factors, in a cohort of studies that were approved by an Australian ethics committee, and our results confirm those of American and British studies.20 21 22 To our knowledge, ours is the first study to show the delay in publication of studies with negative results. The evidence was definite for quantitative studies, with a publication rate 2.3 times greater for studies with significant compared with null results. Within the subgroup of clinical trials, the evidence of publication bias was even stronger, with a publication rate 3.3 times greater. These figures represent a higher estimated publication rate at 5 years of 52% v 27% respectively for quantitative studies, and 54% v 23% respectively for clinical trials. Although the trend for qualitative studies was not significant (P=0.21), this may relate to the smaller number of qualitative studies. Other factors in a multivariate analysis that were associated with publication were research design (observational studies being more likely to be published than clinical trials and non-trial experiments) and source of funding (externally funded studies being more likely to be published). The impact of study results on publication was essentially the same after adjusting for these factors. The impact of other factors on publication rate was examined only for the studies with analysed results. Interestingly, clinical trials that used better study designs (randomisation, placebo control, or blinding) were not published more often than other studies but were more likely to be completed and hence more likely to be analysed than other studies. Consequently, these factors may still positively influence the publication rate when all studies, whether analysed or not, are considered.
Other studies reporting on publication bias have analysed quantitative and qualitative studies as a single group,20 21 22 but we considered them separately. This was because the categories were not comparable in terms of outcomes and because there was evidence of interaction between these subgroups and the degree of publication bias (P=0.02).
The validity of our study was not affected by including studies with only interim findings because the findings of publication bias were not materially altered by excluding these studies. Consequently, the bias relates to delay in publication of studies with negative results rather than just the premature publication of positive results.
Quantitative studies and clinical trials with an indefinite study outcome (0.05≤P<0.10) were less likely to be published than studies and clinical trials with a non-significant study outcome (P≥0.10). A similar trend was seen in the study of Easterbrook et al, who found an adjusted odds ratio of 0.61 (0.26 to 1.59) for non-significant trend studies compared with null studies.20 Thus an indefinite study outcome may be even more likely than a definitely negative outcome to deter researchers from submitting their studies for publication, or editors from accepting them.
The data also imply that publication bias is primarily due to the failure of researchers to submit negative studies for publication rather than to the failure of journal editors to publish them after they have been submitted. However, we cannot exclude that the failure of investigators to submit negative studies for publication may partly relate to editorial bias; previous experience in the submission of negative studies may have conditioned them to expect rejection of such studies for publication.
Berlin et al have argued that large scale studies are eventually published and that a meta-analysis restricted to large studies may be free of publication bias.23 They also argue that small studies should be excluded from meta-analyses on the grounds that they have greater random fluctuation in their estimates and that they may be more subject to bias than large studies.23 We found that studies with a sample size ≥100 continue to be published, with more than 90% published at the end of the study period. However, at any time point, since some studies are still in progress, there is always publication bias due to a delay in the publication of negative studies, provided that the results from large studies are not systematically different from other studies and that large negative studies are subject to the same delay in publication as other studies. We did not find significant interaction between study results and sample size; in addition, publication bias was still evident when analysis was restricted to the 103 quantitative studies with a sample size ≥100, with a hazard ratio for publication of positive compared with negative studies of 2.00 (1.09 to 3.66) (P=0.02). Hence, a strategy that excludes small trials is not a solution to the problem of publication bias.
Implications for the conduct of meta-analysis
The use of meta-analysis in medical research is becoming more common; its results seem to be precise and convincing, and it is beginning to have an impact on clinical practice and on the planning of future research. Consequently, it is important that modest differences in outcome can be validly attributed to the effect of treatment rather than bias. Although an increasing number of meta-analyses are based on controlled clinical trials, a computerised literature search for 1982-9 by Easterbrook et al20 found that about two thirds of meta-analyses are based on observational studies, which are more prone to bias than controlled clinical trials.
Our results have four important implications for the conduct of meta-analysis.
Firstly, the studies used in any meta-analysis should not comprise solely published studies. This is not an unbiased sample, and meta-analysis based on published studies may result in bias in favour of positive results.
Secondly, meta-analyses should be restricted to studies that have started before a certain date. This avoids the problem of the delayed but eventual publication of studies with negative results. To include all studies, regardless of starting date, is likely to result in a selection that is biased in favour of studies with positive results.
Thirdly, retrospective collection of data is difficult—the lower questionnaire completion rate for studies that were undertaken in the early years confirms the findings of Hetherington et al.24 Meta-analyses that are based on retrospective data collection are likely to be subject to significant selection bias.
Finally, restricting meta-analyses to large scale studies, even if they are all eventually published, does not by itself solve the problem of publication bias, since it does not take into account the delay in publishing negative studies.
Prospective registration—a solution to the problem of publication bias
A strategy that provides a solution to publication bias is to register prospectively all trials before their results are known and to select trials from such a registry when undertaking any systematic review.4 This approach has been supported by the Cochrane Collaboration and has led to the statement on the need for prospective registration of all controlled trials developed at the second annual Cochrane colloquium in 1994. For evidence based medicine, the support of government bodies to ensure that mechanisms and funding are provided to ensure universal registration is essential and can be argued to be in the interests of developing the highest level of evidence. There are also strong ethical arguments on behalf of participating patients to ensure that the results of all studies are eventually published.25 26 The mechanisms for the establishment of universal registration already exist—it has been mandatory since 1985 in Australia, as in many other countries, for all research projects on human subjects to be approved by an institutional ethics committee.27 The identification of studies for national registration at the time of initial approval by each institutional ethics committee is therefore a fairly simple matter. The need for universal prospective registration was recognised over 10 years ago and is being actively pursued by the Cochrane Collaboration, but it is yet to be implemented. With the increasing recognition of the importance of evidence based medicine, the establishment of universal prospective registration is an important and urgent priority.
We thank the ethics committee and investigators from the Royal Prince Alfred Hospital for their support, and Davina Ghersi for insightful comments on the manuscript.
Funding: This study was supported by a unit grant from the National Health and Medical Research Council, Australia.
Conflict of interest: None.