Understanding why evidence from randomised clinical trials may not be retrieved from Medline: comparison of indexed and non-indexed records
BMJ 2012; 344 doi: https://doi.org/10.1136/bmj.d7501 (Published 03 January 2012) Cite this as: BMJ 2012;344:d7501
- L Susan Wieland, research associate (1, 2),
- Karen A Robinson, assistant professor (3, 4),
- Kay Dickersin, professor and director (2)
- (1) University of Maryland School of Medicine, Center for Integrative Medicine, Baltimore, MD 21201, USA
- (2) US Cochrane Center, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- (3) Department of Medicine, School of Medicine, Johns Hopkins University, Baltimore, MD, USA
- (4) Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Correspondence to: L Susan Wieland
- Accepted 7 November 2011
Objective To explore why reports that seem to describe randomised controlled trials are sometimes not indexed (“tagged”) with RCT (randomised controlled trial) [pt] (publication type) in Medline.
Design Cross sectional study.
Setting The Cochrane Collaboration and US National Library of Medicine worked together to identify and retag records of randomised controlled trials with RCT [pt], 1994 to 2006.
Data source Published reports entered into Medline in 2005.
Main outcome measures Type of trial information presented (for example, main results, design, and methods), trial design, and other Medline indexing terms applied.
Results 572/591 (97%) untagged records and 578/594 (97%) tagged records contained information from randomised controlled trials. Type of trial information and design differed between untagged and tagged reports. Fewer than half (234/572, 41%, 95% confidence interval 37% to 45%) of untagged reports but most tagged reports (526/578, 91%, 89% to 93%) described the main results of the trial. Untagged reports were more likely than tagged reports to contain information on design and methods, baseline characteristics, long term follow-up, and secondary analyses. Untagged reports of main results were more likely than tagged reports to be from trials using a crossover design (36% v 10%, difference 25%, 95% confidence interval 19% to 32%). The Medical Subject Heading “Randomized Controlled Trials” was the most common clinical trial term applied to untagged reports, although more than half of untagged reports had no indexing related to trials.
Conclusion Based on the results for 2005, at least 3000 records describing randomised controlled trials but not indexed using RCT [pt] may have been entered into Medline between 2006 and 2011. Researchers and healthcare decision makers relying on using RCT [pt] may be missing important evidence in their searches, particularly for design and methods, baseline characteristics, long term follow-up, and secondary data analyses.
Complete identification of randomised controlled trials is critical for the consideration of all relevant evidence from such trials for systematic reviews,1 2 the decision to initiate a new randomised controlled trial, and the characterisation of the conduct and reporting of all trials in a specific area. Identification of randomised controlled trials has been aided immensely by the US National Library of Medicine’s introduction of Medline indexing terms for publication type, specifically RCT (randomised controlled trial) [pt] (publication type) introduced in 1991 and CCT (controlled clinical trial) [pt] introduced in 1995.
Medline’s application of publication types to relevant records has been important in the identification of trials for Cochrane reviews.3 All records tagged with RCT [pt] or CCT [pt] in Medline and indexed as human studies are regularly downloaded from Medline for inclusion in the Cochrane central register of controlled trials (CENTRAL), available through the Cochrane Library.4 In addition, people who are affiliated with the Cochrane Collaboration may also contribute records to the register that they identify through hand searching journals or by other means. (The Cochrane Collaboration is developing the Cochrane register of studies, which will link all records from one trial and will replace CENTRAL in 2012.)
To ensure that searches of Medline and CENTRAL are comprehensive, the Cochrane Collaboration carried out a project with the US National Library of Medicine during 1994 to 2006, electronically retagging randomised controlled trials in Medline that had not been indexed with RCT [pt].5 6 The goal was to capture both randomised controlled trials indexed before the introduction of RCT [pt] in 1991 and randomised controlled trials indexed after 1991 but not tagged with RCT [pt] by US National Library of Medicine indexers. The US Cochrane Center identified untagged randomised controlled trials for the Cochrane retagging project for the publication years 1966 to 1984 and 1998 to 2006 by implementing phases I and II of the Cochrane highly sensitive search strategy,1 examining the titles and abstracts of citations.4 The UK Cochrane Centre carried out this work for the publication years 1985-97. The advantage of using validated search strategies, such as the Cochrane highly sensitive search strategy, to identify trials (for example, for systematic reviews) is that the strategies have been tested against a reference standard to obtain optimal performance characteristics.7 8 In the case of the Cochrane highly sensitive search strategy, search terms beyond the RCT [pt] were included to allow capture of trial reports not indexed under the term.
Between 1994 and 2006 the Cochrane retagging project identified 39 189 Medline records for randomised controlled trials that were not indexed with RCT [pt], and these were forwarded to the US National Library of Medicine for retagging.4 No retagging activities have been carried out by the Cochrane Collaboration since 2006 and the classification of RCT [pt] in Medline now relies solely on indexing done by the US National Library of Medicine. We explored why reports that seem to describe randomised controlled trials are not being indexed with RCT [pt] in Medline, to provide information that may be useful to those using publication type to identify randomised controlled trials and to those at the US National Library of Medicine responsible for indexing by publication type.
Records were eligible for our study if they had been added to Medline between 1 January and 31 December 2005; indexing had been completed by the US National Library of Medicine; and the studies were about humans, included an abstract, and contained the term “random” or a variant in the title or abstract. Untagged records had been identified through the Cochrane retagging project as randomised controlled trials but not indexed with RCT [pt]. Tagged records were indexed by the US National Library of Medicine with RCT [pt].
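The eligibility criteria above can be read as a simple record filter. The following is a minimal sketch only: the `Record` class and its field names are hypothetical and are not Medline field tags, and the regular expression is one assumed way of matching “random” or a variant.

```python
import re
from dataclasses import dataclass
from datetime import date

# Matches words beginning with "random": random, randomised, randomized, randomly, ...
RANDOM_VARIANT = re.compile(r"\brandom", re.IGNORECASE)


@dataclass
class Record:
    # Hypothetical field names chosen for illustration; not Medline field tags.
    title: str
    abstract: str
    date_added: date
    indexing_complete: bool
    is_human_study: bool


def is_eligible(rec: Record) -> bool:
    """Apply the eligibility criteria described in the text to one record."""
    added_in_2005 = date(2005, 1, 1) <= rec.date_added <= date(2005, 12, 31)
    has_abstract = bool(rec.abstract.strip())
    mentions_random = bool(
        RANDOM_VARIANT.search(rec.title) or RANDOM_VARIANT.search(rec.abstract)
    )
    return (
        added_in_2005
        and rec.indexing_complete
        and rec.is_human_study
        and has_abstract
        and mentions_random
    )
```

As the text notes, passing this filter only makes a record eligible; whether it counts as tagged or untagged depends on the RCT [pt] indexing and on the Cochrane reviewers' assessment.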
Identification of tagged and untagged records
As part of the Cochrane and US National Library of Medicine’s Medline retagging project, we carried out searches of PubMed during January to October 2006. We used phases I and II of the Cochrane highly sensitive search strategy1 9 to identify records added to PubMed during 2005 that were likely to be randomised controlled trials (see web extra for search strategy). We removed from our search results all records already indexed in Medline with RCT [pt] or CCT [pt] (see box 1 for definitions of publication types), and records without abstracts. Trained staff at the US Cochrane Center read all titles and abstracts and assessed whether the record described a randomised controlled trial. One reviewer initially assessed each record for randomised controlled trial status. A second reviewer then checked all the records identified by the first reviewer as definite or possible randomised controlled trials, together with a sample comprising one tenth of all records identified as not being randomised controlled trials. The records were assessed on the basis of abstract and title alone, and the decision of the second reviewer was final. The term “random” or a variant in the record title or abstract was necessary, but not sufficient, to indicate that the record described a randomised controlled trial.
For the purposes of this study we deleted from our set of records describing randomised controlled trials all records for which indexing by the US National Library of Medicine was “in progress” (that is, had not been completed). The records classified by Cochrane staff as definite randomised controlled trials that had completed indexing by the US National Library of Medicine constituted our sample of untagged records.
Box 1: Definitions of publication types and Medical Subject Headings [MeSH]*
Randomized Controlled Trial [pt]
Work consisting of a clinical trial that involves at least one test treatment and one control treatment, concurrent enrolment and follow-up of the test- and control-treated groups, and in which the treatments to be administered are selected by a random process, such as the use of a random numbers table.
This heading is used as a publication type; for original report of the conduct or results of a specific randomized controlled trial; a different heading Randomized Controlled Trials [MeSH] is used for general design, methodology, economics, etc, of randomized controlled trials.
Controlled Clinical Trial [pt]
Work consisting of a clinical trial involving one or more test treatments, at least one control treatment, specified outcome measures for evaluating the studied intervention, and a bias-free method for assigning patients to the test treatment. The treatment may be drugs, devices, or procedures studied for diagnostic, therapeutic, or prophylactic effectiveness. Control measures include placebos, active medicine, no-treatment, dosage forms and regimens, historical comparisons, etc. When randomization using mathematical techniques, such as the use of a random numbers table, is used to assign patients to test or control treatments, the trial is characterized as a RCT [pt]. This heading is used as a Publication Type; for original report of the conduct or results of a specific controlled clinical trial; a different heading Controlled Clinical Trials [MeSH] is used for general design, methodology, economics, etc of clinical trials.
Clinical Trial [pt]
Work that is the report of a pre-planned clinical study of the safety, efficacy, or optimum dosage schedule of one or more diagnostic, therapeutic, or prophylactic drugs, devices, or techniques in humans, selected according to predetermined criteria of eligibility and observed for predefined evidence of favorable and unfavorable effects. This heading is used as a publication type; for original report of the conduct or results of a specific clinical trial; a different heading Clinical Trials [MeSH] is used for general design, methodology, economics, etc, of clinical trials.
Randomized Controlled Trials [MeSH]
Clinical trials that involve at least one test treatment and one control treatment, concurrent enrollment and follow-up of the test- and control-treated groups, and in which the treatments to be administered are selected by a random process, such as the use of a random numbers table. For general design, methodology, economics, etc, of randomized controlled trials; a different heading RCT [pt] is used for reports of a specific randomized trial.
Clinical Trials [MeSH]
Pre-planned studies of the safety, efficacy, or optimum dosage schedule (if appropriate) of one or more diagnostic, therapeutic, or prophylactic drugs, devices, or techniques selected according to predetermined criteria of eligibility and observed for predefined evidence of favorable and unfavorable effects. For general design, methodology, economics, etc, of clinical trials; a different heading Clinical Trial [pt] is used for reports of a specific clinical trial.
Random Allocation [MeSH]
A process involving chance used in therapeutic trials or other research endeavor for allocating experimental participants, human or animal, between treatment and control groups, or among treatment groups. It may also apply to experiments on inanimate objects. Do not add with Randomized Controlled Trial [pt] or other Clinical Trial [pt].
Cross-Over Studies [MeSH]
Studies comparing two or more treatments or interventions in which the subjects or patients, upon completion of the course of one treatment, are switched to another. In the case of two treatments, A and B, half the subjects are randomly allocated to receive these in the order A, B and half to receive them in the order B, A. When pertinent add Clinical Trial [pt] or Randomized Controlled Trial [pt].
*Definitions are quoted directly from the US National Library of Medicine scope note and annotation for each term, found by searching the MeSH database at www.nlm.nih.gov/mesh/MBrowser.html (2011)
To identify a comparison sample of tagged records, we carried out a search of PubMed in September 2009 for records that were indexed with RCT [pt], and we used the Entrez Date search field to restrict results to records added to PubMed in 2005 (see web extra). We downloaded all retrieved records into a reference management database, alphabetised records by first author, and systematically selected 1/k records (see Results section for detailed explanation) to obtain a sample of tagged records of about the same size as the number of untagged records.
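The 1/k selection described above is a form of systematic sampling. A minimal sketch follows, assuming the records are first sorted by first author and that every k-th record is then taken from a fixed starting position; the starting position and the example records are assumptions, since the text does not specify them.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def systematic_sample(records: Sequence[T], k: int, start: int = 0) -> List[T]:
    """Return every k-th record, beginning at index `start` (0 <= start < k)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not 0 <= start < k:
        raise ValueError("start must satisfy 0 <= start < k")
    return list(records[start::k])


# Hypothetical (first author, identifier) pairs, for illustration only.
records = [("Smith", 3), ("Adams", 1), ("Lee", 4), ("Brown", 2), ("Zhou", 6), ("Moore", 5)]

# Alphabetise by first author, then take one record in every three.
ordered = sorted(records, key=lambda r: r[0].lower())
sample = systematic_sample(ordered, k=3)
```

Choosing k as roughly the number of retrieved records divided by the target sample size gives a sample of about the intended size.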
Classification of records
Two authors (LSW and KD) worked collaboratively, each using the first 100 untagged records ordered alphabetically by first author, to develop a classification scheme for type of trial information presented in each record. The types of information were main results; design and methods (for example, study protocols, details of interventions); baseline trial data; secondary analyses of trial data in which intervention and control groups were compared for effectiveness (for example, cost effectiveness analyses, subgroup analyses); observational studies utilising trial data and in which no comparison was made between the intervention and control groups (for example, calculation of baseline risks, nested case-control studies); and long term follow-up of trial participants (see box 2 for examples of types of data other than main results). Trials presenting data from diagnostic test interventions were also noted, as previous research has indicated that it may be difficult to identify diagnostic randomised controlled trials.15
Box 2: Text examples (verbatim) of types of information (other than main results) in reports from randomised controlled trials
Design and methods
The study design and methods of this multicentre pragmatic randomized parallel-group open trial are presented here.10
Baseline data
METHODS: A description of the baseline characteristics of patients randomized in the CARE-HF trial.11
Secondary analyses
The Multicenter InSync Randomized Clinical Evaluation (MIRACLE) investigators assessed the efficacy of CRT [cardiac resynchronization therapy] in patients with CHF [congestive heart failure] with QRS durations > or = 130 ms and found that CRT lead [sic] to improvement in several measures of functional capacity and exercise tolerance . . . We divided patients enrolled in the MIRACLE trial into three subgroups according to conduction abnormality—LBBB, [left bundle branch block] right bundle-branch block (RBBB), and nonspecific interventricular conduction delay (IVCD)—and compared the response among and within these groups to CRT or no CRT at baseline and 6-months follow-up.12
Observational studies using trial data
This nested case-control study is drawn from the 11 incident PVD [peripheral vascular disease] events reported in the cohort of the Secondary prevention with antioxidants of cardiovascular disease in end-stage renal disease (SPACE): a randomized placebo-controlled trial.13
Long term follow-up data
Originally, 52 women with urodynamic stress urinary incontinence were randomly assigned to home or intensive exercise. After 6 months, 60% in the intensive group were almost or completely continent, compared with 17% in the home group. Fifteen years later, all original study subjects were invited to complete a postal questionnaire assessing urinary symptoms (using validated outcome tools) and current pelvic floor muscle training.14
Once the classification scheme was set, two authors (LSW and KD) independently classified the remainder of the untagged records (n=491). Most classifications were based on title and abstract, with the full text consulted in cases of uncertainty, when the two raters disagreed on a classification, and when either rater believed that the record was not related to a randomised controlled trial in any way. Each record was first classified as describing or not describing an individual randomised controlled trial. For each record classified as describing an individual randomised controlled trial we answered a series of questions to classify the type of information contained in the record (see web extra figure). Two authors (LSW and KR) classified all tagged records using the same scheme.
Two authors (LSW and KR) also assigned a “type of randomised study design” (that is, n-of-1, split body, cluster, crossover, or parallel) to all untagged and tagged records that we classified as describing the main results of randomised controlled trials.
For each untagged and tagged record one author (LSW) coded whether the journal of publication was a general medical or specialty journal and downloaded year and language of publication. Finally, we examined untagged records to see whether alternative MeSH terms had been applied by indexers. We downloaded the following indexing terms when they were present: Clinical Trial [pt], Randomized Controlled Trials [MeSH], Clinical Trials [MeSH], Random Allocation [MeSH], and Cross-Over Studies [MeSH] for all records in the untagged sample (see box 1 for definitions of publication types and MeSH terms).
Identification of samples
Staff of the US Cochrane Center identified 1176 records added to PubMed in 2005 and not tagged with RCT [pt] or CCT [pt] that they classified as describing randomised controlled trials. The US National Library of Medicine had completed indexing 591 (50%) of these records, constituting the untagged records in this study. Medline was searched using the strategy RCT [pt] AND 2005[Entrez Date] to identify a comparison group of tagged records. Overall, 16 039 records were retrieved and we sampled one in 27 records to create a sample of 591 tagged records. An error in the search was noticed, however: 150 of the 591 sampled records were not about humans, did not include abstracts, or did not include the word “random” in the title or abstract, so only 441 records met the inclusion criteria. A second search of Medline was therefore carried out using the second strategy outlined in the web extra. This search retrieved 11 769 records that were not in the first sample. We sampled one in 77 of these records to obtain 153 additional records, giving the 594 tagged records for this study.
Nearly all identified records were reports from individual randomised controlled trials: 572/591 (97%, 95% confidence interval 95% to 98%) of the untagged records and 578/594 (97%, 96% to 98%) of the tagged records. The most common type of non-randomised controlled trial report in the untagged records was a brief mention of a randomised controlled trial (for example, a remark that a randomised controlled trial was being planned), and in the tagged records it was a report of findings from a non-randomised study (table 1).
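The text does not state which method was used for these confidence intervals. As an illustration only, the sketch below applies the normal approximation (Wald) interval to the untagged proportion 572/591; the function name and the choice of method are assumptions.

```python
import math
from typing import Tuple


def wald_ci(successes: int, total: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Proportion with a normal approximation (Wald) confidence interval.

    Assumed method for illustration; the source does not name the one it used.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to [0, 1] because the Wald interval can overshoot near the boundaries.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


p, low, high = wald_ci(572, 591)
print(f"{p:.0%} ({low:.0%} to {high:.0%})")  # prints "97% (95% to 98%)"
```

The Wald interval is known to behave poorly for proportions close to 0% or 100%, so a Wilson or exact interval may be preferable for figures such as these; the rounded limits can differ by a percentage point between methods.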
Because the first 100 untagged records were used to establish classification categories, agreement on type of trial information was examined only for the remaining records (101 to 591). Initial agreement was 85%, including the 55/491 (11%) cases in which both raters thought that reading the full text was necessary to reach a decision on classification. Initial agreement was 92% for the tagged records (n=594), with the reviewers reading the full text of 32 (5%) reports.
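Initial agreement of this kind is a simple proportion of records on which both raters chose the same category. The sketch below shows that calculation under stated assumptions: the labels are invented for illustration and are not study data, and a shared “needs full text” judgment could be treated as one more category, as the text describes.

```python
from typing import Sequence


def percent_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Fraction of records on which two raters assigned the same category."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must classify the same records")
    if not rater_a:
        raise ValueError("at least one record is required")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical classifications of five records (illustration only, not study data).
rater_a = ["main results", "design and methods", "baseline data", "main results", "needs full text"]
rater_b = ["main results", "design and methods", "secondary analysis", "main results", "needs full text"]

agreement = percent_agreement(rater_a, rater_b)  # 4 of the 5 labels match
```

Raw agreement does not correct for agreement expected by chance; a chance-corrected statistic would need the full cross-classification, which the text does not report.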
The information presented in the untagged and tagged records differed. Fewer than half (41%, 37% to 45%) of reports in the untagged records presented main results from randomised controlled trials compared with almost all (91%, 89% to 93%) of reports in the tagged records (table 2). A total of 38 untagged records and three tagged records concerned a diagnostic test intervention; all were classified as presenting main trial results. Among records classified as presenting the main results from a randomised controlled trial, 38/234 (16%, 12% to 21%) of untagged records and 3/526 (<1%, 0% to 1%) of tagged records concerned a diagnostic test intervention (difference 16%, 11% to 20%).
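The difference between two proportions and its interval can be sketched as follows, using the diagnostic test counts given above (38/234 untagged v 3/526 tagged). The normal approximation for two independent proportions is an assumption made for illustration; the text does not say how the interval was calculated.

```python
import math
from typing import Tuple


def risk_difference_ci(
    x1: int, n1: int, x2: int, n2: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Difference p1 - p2 with a normal approximation confidence interval.

    Assumed method for illustration; not necessarily the authors' calculation.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("both groups must contain at least one record")
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error for the difference of two independent proportions.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


diff, low, high = risk_difference_ci(38, 234, 3, 526)
print(f"difference {diff:.0%} ({low:.0%} to {high:.0%})")
```

With these counts the rounded result agrees with the reported difference of 16% (11% to 20%), although agreement after rounding does not confirm which method was used.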
For both tagged and untagged records, more than 90% of reports describing main results from randomised controlled trials reported using a parallel or crossover design, although untagged records reported a crossover design more often than tagged records (36% v 10%; difference 25%, 19% to 32%; table 3).
Evidence was lacking of a difference between untagged and tagged records as to whether they were published in English (552/572, 97%, 95% to 98% v 544/578, 94%, 92% to 96%; difference 2%, 0% to 5%) or in a specialty medical journal (525/572, 92%, 90% to 94% v 533/578, 92%, 90% to 94%; difference 0%, −4% to 3%). Both untagged and tagged records had publication dates between 2003 and 2006 (citations sometimes enter Medline before publication date).
Publication type tagging and MeSH indexing of untagged records
Overall, 19/572 (3%, 2% to 5%) untagged records were indexed as Clinical Trial [pt], indicating that the record was identified by US National Library of Medicine indexers as a publication from a clinical trial but not a publication from a randomised controlled trial (table 4). The most common indexing term applied was Randomized Controlled Trials [MeSH], which indicates the topic, not the publication type, of the record. In addition, 24/572 (4%, 3% to 6%) records were indexed with Clinical Trials [MeSH], but no records were indexed with both Clinical Trial [pt] and Clinical Trials [MeSH]. For half of all data reports and more than two thirds of reports on main results, none of the alternative clinical trials indexing terms that were examined had been applied (table 4). No randomised controlled trials reporting diagnostic tests were tagged with Clinical Trial [pt], Clinical Trials [MeSH], or Randomized Controlled Trials [MeSH], although 3/38 (8%, 0% to 16%) records were tagged with Random Allocation [MeSH] and 1/38 (3%, 0% to 14%) records were tagged with Cross-Over Studies [MeSH].
We found 572 records entered into Medline in 2005 that described randomised controlled trials but were not tagged with RCT [pt]. Only half of the untagged reports of randomised controlled trials had any clinical trial indexing terms applied. Thus the reports would be difficult for systematic reviewers and others to identify. This finding underscores the advantage of using validated search strategies when seeking comprehensive identification of randomised controlled trials (for example, for systematic reviews), since our searches identified randomised controlled trials beyond those indexed with RCT [pt] by the US National Library of Medicine. Other researchers have tested several Medline search strategies for studies of treatments and identified strategies that maximise search sensitivity, which may be most appropriate for systematic reviewers and others requiring comprehensive retrieval, and strategies that maximise search specificity, which may be more appropriate for clinicians or others who wish to minimise retrieval of non-relevant material.16
Nearly half of the untagged citations to randomised controlled trials that we identified (234/572; 41%, 37% to 45%) appear to report the trials’ main results, and a similar proportion (245/572; 43%, 39% to 47%) describe the trials’ design and methods, baseline data, long term follow-up, or secondary outcome analyses, all of which are important to those identifying randomised controlled trials for possible inclusion in a systematic review. In contrast, only 52/578 (9%, 7% to 11%) records tagged with RCT [pt] by the US National Library of Medicine describe something other than the trial’s main results. Thus reports of randomised controlled trials containing specific types of information (for example, design and rationale) seem more likely not to be tagged with RCT [pt], and unless or until the application of the RCT [pt] changes, use of validated search strategies may be particularly important for identifying these types of reports.
Why were these reports of trials not tagged as RCT [pt]? We think it likely that the reasons for not tagging main results differ from the reasons for not tagging other types of reports from randomised controlled trials. For one thing, untagged articles reporting main results are indexed less often with other MeSH terms related to clinical trials than are other types of untagged randomised controlled trial reports (see table 4), perhaps indicating a difference of opinion on whether certain types of trial design are randomised controlled trials. US National Library of Medicine indexers may place crossover studies, and randomised tests or experiments that are related to health but not immediately concerned with diagnosis, treatment, or prevention of disease, in a different category from randomised controlled trials. Consistent with this explanation, a higher proportion of crossover studies was identified in the untagged records describing main results compared with tagged records. Moreover, we used the US National Library of Medicine definition of RCT [pt] (see box 1) and are not aware of any explicit indexing rules that would contradict our classification of a study as a randomised controlled trial.
Untagged reports of design and methods are most often tagged with the MeSH term Randomized Controlled Trials (91% of the time, 85% to 97%), which indicates that US National Library of Medicine indexers consider that the record content is about a randomised controlled trial but do not regard these reports to be in the form of a randomised controlled trial publication (see box 1 for definitions). Because papers with design and methods are an important resource for those seeking information about a trial protocol or detailed reporting on the organisation and conduct of a trial, they can be viewed as an important extension of the main results of a paper, in that they provide information needed to assess the validity of the findings. The identification of design and methods papers is essential until trial protocols themselves are standardised and widely accessible.17 18 19 20
Analyses other than main results are important in the portfolio of reports emanating from a randomised controlled trial and also should be tagged with RCT [pt]. Reports of baseline data are important for assessing the applicability or generalisability of results from randomised trials17; they were tagged with the MeSH term Randomized Controlled Trials almost half the time.
Publications describing findings from longer term follow-up of participants of randomised trials provide important information on the benefits and harms of an intervention that may not be evident from a trial of relatively brief duration. Indeed, randomised controlled trials may be designed with relatively short term primary outcomes not because they are the most important outcomes but because there are funding limitations, a short term outcome is all that is required for regulatory approval, or investigators wish to minimise missing data for the primary analysis. The emphasis of comparative effectiveness research is on “real world” clinical problems, and this includes follow-up for both benefits and harms beyond the short term.
Secondary data analyses, such as cost effectiveness analyses of trial interventions or comparisons of intervention effects across trial subgroups, provide information about the effects of interventions in specific population subgroups or the relative value of the interventions. Comparative effectiveness research emphasises finding out which treatments work best in which individuals, and understanding effectiveness and harm in subpopulations is a key component of a thorough analysis of evidence from randomised controlled trials.
We understand better why studies utilising data or participants from randomised controlled trials but analysing the data without randomised comparisons (that is, using an observational design) are not indexed using RCT [pt]. These reports are related to randomised controlled trials and some provide detailed information about trial participants or procedures (for example, a report detailing the quality control programme for human papillomavirus DNA testing in a trial of cervical cancer screening21). It would be beneficial if MeSH indexing were able to accommodate observational studies from randomised controlled trials in such a way as to provide notice that a randomised controlled trial provided the data used in the reported analysis. However, the US National Library of Medicine might want also to consider tagging with RCT [pt] those reports presenting observational data from randomised controlled trials when they provide detailed information about trial participants or procedures, as they may fulfil functions similar to design and methods papers.
Limitations of the study
Our study has some limitations. Our sample of untagged records excludes records identified by Cochrane staff as randomised controlled trials but for which indexing had not been completed (50% of the records identified by the retagging project for 2005). It is possible that the records we examined in this study are from journals that are typically indexed quickly by the US National Library of Medicine, and our results may not be applicable to records from journals that are indexed more slowly. Our sample of untagged records was based on what we retrieved from the Cochrane highly sensitive search strategy,9 was limited to records with abstracts, and required the term “random” or a variant in the title or abstract; thus we may have missed untagged records in the Medline database that would have altered our findings. In addition, unlike the US National Library of Medicine indexers we did not read the full text of each tagged and untagged record in our sample. We did, however, obtain the full text of all records that we thought could not be accurately assessed by reading the title and abstract alone, and we obtained the full text of each record that we classified as not a randomised controlled trial (with the exception of one tagged record in Japanese). Finally, we did not undergo training by the US National Library of Medicine for indexing or have full access to its training materials, and so we cannot be sure whether records were not tagged owing to indexer errors or to US National Library of Medicine rules. If records were not indexed because of US National Library of Medicine definitions or rules, we cannot be certain of the nature of those rules.
Implications of the findings
CENTRAL, the register of published trial reports developed and maintained by the Cochrane Collaboration for the Cochrane Library, includes all Medline records that are tagged with RCT [pt] or CCT [pt]. The register contains untagged reports of trials only if they have been identified (for example, by hand searching of journals) by people associated with the Cochrane Collaboration and specially submitted to the register. Because the Medline retagging project was suspended in 2006, only two of the 572 untagged randomised controlled trial records discussed in this study have been included in the Cochrane register to date.
Although researchers have reported that for those years in which the Cochrane retagging project had been completed, use of the RCT [pt] in Medline provided adequate sensitivity and precision for those seeking a rapid method to identify randomised controlled trials in Medline,2 the situation has now changed. The lack of tagging of randomised controlled trials with the relevant publication type means that published evidence, including the main results, design and methods, baseline findings, long term follow-up, and secondary analyses, may be difficult to find in Medline. Published evidence on subgroup analysis and long term follow-up that is difficult to identify is especially troubling for comparative effectiveness researchers. In addition, those carrying out rapid searches of Medline for randomised controlled trials, searches utilising Clinical Queries (some of which depend on RCT [pt]), or searches unassisted by a trained information professional, should be aware that certain types of trial evidence may not be retrieved through use of RCT [pt] for the years since the retagging project ceased.16 22 Editors and others who review systematic reviews for journals should continue to examine the systematic review search strategies to ensure that the authors have not relied on simple searches such as that using RCT [pt].
Reports of randomised controlled trials identified by the Cochrane retagging project were more likely than reports tagged by the US National Library of Medicine to contain information on trial design and methods, baseline characteristics of participants, long term follow-up of participants, and secondary data analyses. Based on our identification of over 500 records of randomised controlled trials added to PubMed in 2005 and not tagged with RCT [pt], we estimate that at least 500 records each year, or a total of 3000 records describing randomised controlled trials but not indexed as such, may have been entered into Medline between 2006 and 2011. This estimate assumes that the US National Library of Medicine procedures for assigning RCT [pt] have not changed substantively since 2005. The US National Library of Medicine periodically revises indexing guidance and it would be informative to update our study to see if the patterns we observed here are also present in more recent years. If these patterns continue, those at the US National Library of Medicine responsible for indexing should consider whether there are specific changes that could improve the indexing of randomised trials. The US National Library of Medicine and the Cochrane Collaboration may also wish to jointly re-establish the Medline retagging project. Finally, clinicians, researchers, and healthcare decision makers may be missing important evidence from randomised controlled trials, even when published, if they do not use validated search strategies.
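The extrapolation above is simple arithmetic and can be reproduced as follows. This is a minimal sketch that assumes, as the text does, a constant lower bound of 500 untagged records a year and counts 2006 to 2011 inclusive; the function name is hypothetical.

```python
def extrapolate_untagged(records_per_year: int, first_year: int, last_year: int) -> int:
    """Project a constant yearly count over an inclusive range of years."""
    number_of_years = last_year - first_year + 1  # 2006 to 2011 inclusive is 6 years
    return records_per_year * number_of_years


# Lower bound of 500 untagged records a year, taken from the 2005 sample
total = extrapolate_untagged(500, 2006, 2011)
print(total)  # prints 3000
```

Because the yearly figure is a lower bound ("over 500") and the sample was restricted to records retrieved by the search strategy, the total is best read as a conservative estimate.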
When searching for information from randomised controlled trials, it should not be assumed that publications of main results are the only published information relevant to the conduct of the trial or the details of the intervention. Because the Cochrane Collaboration’s Medline retagging project has ceased, reliance on publication type alone is unwise, and validated search strategies should be used when seeking complete identification of reports from randomised controlled trials.
What is already known on this topic
Identification of randomised trials is important for systematic reviews, for decisions on a trial, and for characterising the conduct or reporting of trials
The indexing tag RCT [pt] in Medline aids in retrieval of randomised controlled trials from Medline and is a component of some validated search strategies
The Cochrane Collaboration and the US National Library of Medicine teamed up in 1994-2006 to identify and tag reports from randomised controlled trials not already tagged with RCT [pt] in Medline
What this study adds
Over 500 randomised controlled trial reports entered into Medline in 2005 were not tagged with RCT [pt]
Fewer than half of the untagged trial reports had any clinical trial indexing terms applied, so they may be difficult for systematic reviewers and others searching for trials to identify
Untagged reports from randomised controlled trials were more likely than tagged reports to contain information on trial design and methods, baseline characteristics of participants, long term follow-up, or secondary data analyses
Cite this as: BMJ 2012;344:d7501
Part of this material was presented in poster form at the 16th Cochrane Colloquium, Freiburg, Germany, 2008 (Wieland S, Dickersin K. Why were they missed? Randomized controlled trials (RCTs) identified through the Medline Retagging Project but not the US National Library of Medicine (NLM)) and the Sixth International Congress on Peer Review and Biomedical Publication, Vancouver, Canada, 2009 (Wieland S, Dickersin K. Understanding why the US National Library of Medicine (NLM) fails to properly index the publication type [pt] of a number of randomized controlled trials (RCTs)). We thank Carol Lefebvre, senior information specialist at the UK Cochrane Centre, for her comments on an earlier version of this manuscript.
Contributors: LSW is the guarantor of the study, had full access to all the data in the study, and takes responsibility for the integrity of the data and the accuracy of the data analysis. LSW contributed to the study conception and design, acquisition of data, analysis and interpretation of data, and drafting the article. KD and KR contributed to the study design, analysis and interpretation of data, and drafting the article.
Funding: This study received no funding.
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no financial relationships with any organizations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.
Ethical approval: Not required.
Data sharing: No additional data available.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.