BMJ  2004;328:1040 (1 May), doi:10.1136/bmj.38068.557998.EE (published 8 April 2004)

Paper

Optimal search strategies for retrieving scientifically strong studies of diagnosis from Medline: analytical survey

R Brian Haynes, professor1, Nancy L Wilczynski, research associate1, the Hedges Team

1 Health Information Research Unit, Department of Clinical Epidemiology and Biostatistics, McMaster University Faculty of Health Sciences, 1200 Main Street West, Hamilton, ON, Canada, L8N 3Z5

Correspondence to: R B Haynes bhaynes{at}mcmaster.ca

Abstract

Objective To develop optimal search strategies in Medline for retrieving sound clinical studies on the diagnosis of health disorders.

Design Analytical survey.

Setting Medline, 2000.

Participants 170 journals for 2000 of which 161 were indexed in Medline.

Main outcome measures The sensitivity, specificity, precision ("positive predictive value"), and accuracy of 4862 unique terms in 17 287 combinations were determined by comparison with a hand search of all articles (the "gold standard") in 161 journals published during 2000 (49 028 articles).

Results Only 147 (18.9%) of 778 articles about diagnostic tests met basic criteria for scientific merit. Combinations of search terms reached peak sensitivities of 98.6% at a specificity of 74.3%. Compared with best single terms, best multiple terms increased sensitivity for sound studies by 6.8% (absolute increase), while also increasing specificity (absolute increase 6.0%) when sensitivity was maximised. When terms were combined to maximise specificity, the single term, specificity.tw. (98.4%), outperformed combinations of terms. The strategies newly reported in this paper outperformed other validated search strategies except for one strategy that had slightly higher sensitivity (99.3% v 98.6%) but lower specificity (54.7% v 74.3%).

Conclusion New empirical search strategies in Medline can optimise retrieval of articles reporting high quality clinical studies of diagnosis.

Introduction

Clinical research, usually widely accessible first in the biomedical journal literature, provides quantitative information about the sensitivity, specificity, and predictive value of many diagnostic tests. This information, however, is buried in a much larger biomedical literature.

Finding the current best evidence in Medline for a diagnostic process is daunting, given that Medline has over 11 million articles from over 4500 journals, covering all aspects of biomedical and health research.1 A recent qualitative study found that two of the six obstacles to answering clinical questions with evidence were the time required to find information and the difficulty in selecting an optimal search strategy.2

Search filters ("hedges") can improve the retrieval of clinically relevant and scientifically sound studies from Medline and similar databases.3-7 More sophisticated search filters can be created by combining disease content terms with medical subject headings, explosions, publication types, subheadings, and textwords (see box).

In the early 1990s our group at McMaster University developed search filters on a small subset of 10 journals and for four types of article (therapy, diagnosis, prognosis, and causation (aetiology)).8 9 These strategies have been adapted for use in the Clinical Queries interface of Medline (www.ncbi.nlm.nih.gov/entrez/query/static/clinical.html). This research is being updated and expanded with data from 161 journals indexed in Medline from 2000. We report on the information retrieval properties of single terms and combinations of terms in Medline for identifying methodologically sound studies on the diagnosis of health disorders.

Methods

We developed search strategies by using methodological search terms and phrases in a subset of Medline records matched with a hand search of the contents of 161 journal titles for 2000. The search strategies were treated as diagnostic tests for sound studies, and the manual review of the literature was treated as the gold standard. It is potentially confusing to use the terminology of diagnostic testing for assessing strategies for retrieving articles about diagnostic tests, especially when some of the search terms are the same. Nevertheless, the principles for retrieval are the same as those for diagnosis. Thus we determined the sensitivity, specificity, accuracy, and precision (a library science term equivalent to the diagnostic test term "positive predictive value") of single term and multiple term Medline search strategies (box). Sensitivity and specificity are not affected by the proportion of high quality articles in the database; precision depends on this proportion, and so does accuracy, but to a lesser extent (see bmj.com).
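
To illustrate why precision depends on the proportion of high quality articles while sensitivity and specificity do not, the following minimal Python sketch recomputes precision and accuracy at several prevalences with sensitivity and specificity held fixed. The function name and the two larger prevalences are illustrative assumptions and not part of the study's methods; the sensitivity and specificity are the peak values quoted in the abstract.

```python
def precision_and_accuracy(sensitivity, specificity, prevalence):
    """Return (precision, accuracy) for a search strategy.

    All arguments are proportions between 0 and 1. `prevalence` is the
    proportion of high quality articles in the database.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    precision = true_pos / (true_pos + false_pos)
    accuracy = true_pos + true_neg
    return precision, accuracy


# Sensitivity and specificity held fixed at the values quoted in the abstract.
SENSITIVITY, SPECIFICITY = 0.986, 0.743

# 147 of 49 028 articles met the criteria; the other prevalences are hypothetical.
for prevalence in (147 / 49028, 0.05, 0.20):
    precision, accuracy = precision_and_accuracy(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.4f} precision={precision:.3f} accuracy={accuracy:.3f}")
```

In this sketch precision changes far more than accuracy as prevalence changes, which is consistent with accuracy depending on the proportion "to a lesser extent."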

After extensive attempts only 2% (n = 968) of the handsearch items did not match citations in Medline. Unmatched citations that were detected by a search strategy led to slight underestimates of the precision, specificity, and accuracy of the search strategy. Similarly, unmatched citations that were not detected by a search strategy led to slight overestimates of specificity and accuracy.

Manual review
Six research assistants reviewed all issues of 170 journals for 2000 of which 161 were indexed in Medline. The journal titles were regularly reviewed for content for four evidence based journals prepared by our group, according to an explicit process that assesses the scientific merit and clinical relevance of original and review articles for health care (www.acpjc.org/shared/purpose_and_procedure.htm).


Terms and definitions for search strategies

  • Sensitivity—proportion of high quality articles retrieved
  • Specificity—proportion of low quality diagnosis studies or non-diagnosis studies not retrieved
  • Precision—proportion of retrieved articles of high quality
  • Accuracy—proportion of all articles correctly categorised
  • "ANDed"—combined with
  • di—diagnosis subheading
  • du—diagnostic use subheading
  • exp—explosion
  • fs—floating subheading
  • MeSH—medical subject heading
  • mp—multiple posting (term in title, abstract, or MeSH heading)
  • pt—publication type
  • sh—MeSH subject heading
  • tw—textword
  • xs—exploded subheading
  • :—truncation
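
The four measures defined in the box can be computed from a two by two comparison of a search strategy against the hand search. The sketch below is illustrative only: the function, its argument names, and the example counts are assumptions, not data from the study.

```python
def search_metrics(tp, fp, fn, tn):
    """Operating characteristics of a search strategy versus the hand search.

    tp: high quality articles retrieved
    fp: other articles retrieved
    fn: high quality articles not retrieved
    tn: other articles not retrieved
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts for one strategy in a database of 10 000 articles.
metrics = search_metrics(tp=90, fp=400, fn=10, tn=9500)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```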


Methodological criteria for evaluating studies of diagnosis were: inclusion of a range of participants, some but not all of whom have the disorder or derangement of interest; use of an objective diagnostic ("gold") standard or current clinical standard for diagnosis; all participants receiving the new test and some form of the diagnostic standard; interpretation of diagnostic standard without knowledge of test result, and vice versa; and analysis consistent with study design. These criteria were developed for critical appraisal of the healthcare literature, and the second to fourth criteria have been empirically validated.10 11 The research assistants were rigorously calibrated and periodically checked for application of criteria to determine if each article was methodologically sound for any of six categories of purpose (diagnosis and screening, treatment and prevention, prognosis, aetiology and harm, clinical prediction guides, and economics).12

Collecting search terms and data
To construct a comprehensive set of possible search terms, we listed MeSH terms and textwords related to study criteria and then sought input from clinicians and librarians, review of published and unpublished searching strategies from other groups, and requests to Medline experts. Individuals were asked what terms or phrases they used when searching for each category. We compiled a list of 5395 terms, of which 4862 were unique. All terms were tested in all purpose categories using the Ovid Technologies searching system.

Data collection forms were used to record handsearched data for each article found in each issue of the 161 journal titles. After verification of the data online, the handsearch data were written to a database. Each journal title was searched in Medline for 2000, and the full Medline records were captured for all articles in the journals. Medline data were then linked with the handsearch data.

Testing strategies
We calculated the sensitivity, specificity, precision, and accuracy for each term. Individual search terms with a sensitivity of more than 25% and a specificity of more than 75% were incorporated into the development of search strategies that included a combination of two or more terms. All combinations of terms used the Boolean OR.
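
The screening and combination step can be sketched as follows, treating each term as the set of article identifiers it retrieves, so that Boolean OR becomes a set union. The toy article identifiers, term names, and helper function are hypothetical; only the thresholds (sensitivity above 25%, specificity above 75%) and the use of OR come from the text above.

```python
from itertools import combinations


def sens_spec(retrieved, sound, all_ids):
    """Sensitivity and specificity of a retrieved set against the hand search."""
    not_sound = all_ids - sound
    sensitivity = len(retrieved & sound) / len(sound)
    specificity = len(not_sound - retrieved) / len(not_sound)
    return sensitivity, specificity


# Toy data: article ids 1-20, of which 1-4 are methodologically sound.
all_ids = set(range(1, 21))
sound = {1, 2, 3, 4}

# Hypothetical terms mapped to the ids they retrieve.
term_hits = {
    "term_a": {1, 2, 5},
    "term_b": {3, 4, 6, 7},
    "term_c": {1, 2, 8, 9, 10, 11, 12, 13},  # fails the specificity threshold
    "term_d": {14},                           # fails the sensitivity threshold
}

# Keep terms with sensitivity > 25% and specificity > 75%.
candidates = []
for term, hits in term_hits.items():
    sensitivity, specificity = sens_spec(hits, sound, all_ids)
    if sensitivity > 0.25 and specificity > 0.75:
        candidates.append(term)

# Combine surviving terms with Boolean OR, i.e. the union of their hits.
two_term = {}
for first, second in combinations(candidates, 2):
    combined_hits = term_hits[first] | term_hits[second]
    two_term[f"{first} OR {second}"] = sens_spec(combined_hits, sound, all_ids)

print(candidates)
print(two_term)
```

Because OR can only add retrieved articles, a combined strategy is never less sensitive than its component terms, but it may be less specific.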

For the development of multiple term search strategies to optimise either sensitivity or specificity, we tested the combination of individual terms with all two term search strategies with sensitivity at least 75% and specificity at least 50%. For optimising accuracy, two term search strategies with accuracy of more than 75% were considered for multiple term development. Overall, we tested 17 287 multiple term search strategies. Search strategies were also developed that optimised combined sensitivity and specificity.
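
A minimal sketch of how two term strategies might be extended to three terms is given below. It assumes that each strategy is represented by the set of article identifiers it retrieves and that the final choice is the most sensitive strategy with specificity of at least 50%; the data, term names, and selection rule are hypothetical and are not the study's actual procedure or software.

```python
def sens_spec(retrieved, sound, all_ids):
    """Sensitivity and specificity of a retrieved set against the hand search."""
    not_sound = all_ids - sound
    return (len(retrieved & sound) / len(sound),
            len(not_sound - retrieved) / len(not_sound))


all_ids = set(range(1, 21))
sound = {1, 2, 3, 4}

# Hypothetical single terms and the ids each retrieves.
term_hits = {
    "term_a": {1, 2, 5},
    "term_b": {3, 6, 7},
    "term_c": {4, 8},
}

# Hypothetical two term strategies already built with OR.
two_term = {
    ("term_a", "term_b"): term_hits["term_a"] | term_hits["term_b"],
    ("term_a", "term_c"): term_hits["term_a"] | term_hits["term_c"],
    ("term_b", "term_c"): term_hits["term_b"] | term_hits["term_c"],
}

three_term = {}
for pair, hits in two_term.items():
    sensitivity, specificity = sens_spec(hits, sound, all_ids)
    # Only sufficiently good two term strategies are extended.
    if sensitivity >= 0.75 and specificity >= 0.50:
        for term, extra_hits in term_hits.items():
            if term not in pair:
                key = tuple(sorted(pair + (term,)))
                three_term[key] = sens_spec(hits | extra_hits, sound, all_ids)

# Most sensitive strategy among those keeping specificity at 50% or more.
eligible = {k: v for k, v in three_term.items() if v[1] >= 0.50}
best = max(eligible, key=lambda k: eligible[k][0])
print(best, eligible[best])
```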

Results

Overall, 49 028 articles were included in the analysis. Of these, 778 (1.6% of original studies and review articles, case reports, or general interest papers) were classified as original studies evaluating a diagnosis question, of which 147 (18.9%) met the methodological criteria.

See bmj.com for the operating characteristics for the single terms with the highest sensitivity and specificity. The best accuracy when keeping sensitivity to 50% or more was seen with the term "specificity.tw." (.tw. is Ovid search system's syntax for searching all words in the title and abstract of an article).

See bmj.com for the strategies yielding the highest combined sensitivity and specificity based on testing of all strategies for combinations up to three terms. Some one term and two term strategies outperformed multiple term strategies. Because of the low prevalence of diagnosis articles, the accuracy of search terms is driven by their specificity, and thus the three search strategies yielding the highest accuracy are the same as those yielding the highest specificity. Table 1 shows the search strategies best optimising sensitivity and specificity.
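
The link between accuracy and specificity can be seen by writing accuracy as a prevalence weighted average of sensitivity and specificity. The notation below is introduced only for illustration and does not appear in the original analysis: p is the proportion of methodologically sound diagnosis articles (147 of 49 028), Se is sensitivity, and Sp is specificity, with the peak values from the abstract used as a worked example.

```latex
\begin{align*}
\text{accuracy} &= p \cdot \mathrm{Se} + (1 - p) \cdot \mathrm{Sp} \\
p &= \frac{147}{49\,028} \approx 0.003 \\
\text{accuracy} &\approx 0.003 \times 0.986 + 0.997 \times 0.743 \approx 0.744
\end{align*}
```

With p this small, the specificity term carries almost all of the weight, so accuracy is nearly equal to specificity.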


Table 1 Top search strategies for optimising sensitivity (while keeping specificity >=50%) and specificity (while keeping sensitivity >=50%). Values are percentages (95% confidence intervals)

 

Discussion

Our study documents search terms with best sensitivity, specificity, accuracy, and balance of sensitivity and specificity for retrieving high quality studies of diagnostic tests from Medline. This research updates our previous study, published in 1994.9 When the 1991 strategies for diagnosis articles were tested in the 2000 database, the performance of the 2000 strategies was consistently better (table 2). We did not have enough data to do an independent validation of our diagnostic test strategies and thus risked overestimating their performance. We did independent validations for studies of therapy, however, with the greatest statistically significant difference being 1.1% for one set of specificities. Furthermore, by double checking only articles that initially seemed to pass criteria, we may have underestimated performance: a few articles that met our criteria may have been missed in the hand search.


Table 2 Comparison of performance of strategies from 1991 and 2000, compiled using 2000 dataset. Values are percentages

 

Searchers who want retrieval with less non-relevant material can choose strategies with high specificity. For those interested in comprehensive retrievals or in searching for clinical topics with few citations, strategies with higher sensitivity may be more appropriate. The strategies that optimised the balance of sensitivity and specificity provided the best separation of eligible studies from others but did so without regard for whether sensitivity or specificity was affected. Regardless of the strategy used, we foresee that the most effective way to harness these strategies is to have them embedded within searching systems, either as clinical queries in PubMed or as stored searches that can be invoked at the user's request. The US National Library of Medicine has updated its Clinical Queries site for searching Medline for studies of diagnostic tests and other clinical topics, and the strategies are available free (web.ncbi.nlm.nih.gov/entrez/query/static/clinical.shtml). Further, the new strategies have been incorporated into Ovid's main search engine for Medline (www.ovid.com), with the high specificity strategies being incorporated into Skolar (www.skolar.com).
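
As a rough sketch of what a stored search might look like, the Python snippet below keeps named filters and ANDs one of them with a user's content query. The registry, the function name, and the example topic are hypothetical; only the term specificity.tw. comes from this paper, and the output is simply a query string in Ovid-like syntax, not a call to any real search system.

```python
# Hypothetical registry of stored methodological filters.
STORED_FILTERS = {
    "diagnosis_specific": "specificity.tw.",
}


def apply_stored_filter(content_query, filter_name):
    """AND a user's content query with a stored methodological filter."""
    methods_filter = STORED_FILTERS[filter_name]
    return f"({content_query}) AND ({methods_filter})"


print(apply_stored_filter("deep vein thrombosis.tw.", "diagnosis_specific"))
```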


What is already known on this topic

Information on the accuracy of diagnostic tests abounds in the medical literature but is often unknown to, or forgotten by, clinicians

The medical literature is accessible through large internet databases such as Medline, but few clinicians know how to search them well

What this study adds

Special Medline search strategies were developed and tested that retrieved up to 99% of scientifically strong studies of diagnostic tests

These strategies have been automated for use in PubMed Medline at a special screen, Clinical Queries, and in Ovid Technologies' Medline and Skolar services


Other investigators have attempted to find strategies that outperform those we previously published, with some success.4-7 9 Our new strategies have set the bar higher, but there is still considerable room for improvement, particularly for the precision of searches.


Full titles of journals indexed in Medline are on bmj.com

This is the abridged version of an article that was posted on bmj.com on 8 April 2004: http://bmj.com/cgi/doi/10.1136/bmj.38068.557998.EE

The Hedges Team includes Angela Eady, Brian Haynes, Susan Marks, Ann McKibbon, Doug Morgan, Cindy Walker-Dilks, Stephen Walter, Stephen Werre, Nancy Wilczynski, and Sharon Wong, all at McMaster University Faculty of Health Sciences.

Contributors: See bmj.com

Funding: This study was funded by the US National Institutes of Health (grant No 1 RO1 LM06866).

Competing interests: None declared.

Ethical approval: Not required.

References

  1. NLM Fact Sheet. www.nlm.nih.gov/pubs/factsheets/bsd.html (accessed 7 Dec 2003).
  2. Ely JW, Osheroff JA, Ebell MH, Chambliss ML, Vinson DC, Stevermer JJ, et al. Obstacles to answering doctors' questions about patient care with evidence: qualitative study. BMJ 2002;324: 710.
  3. Haynes RB, McKibbon KA, Fitzgerald D, Guyatt GH, Walker CJ, Sackett DL. How to keep up with the medical literature. V. Access by personal computer to the medical literature. Ann Intern Med 1986;105: 810-6.
  4. Bachmann LM, Coray R, Estermann P, Ter Riet G. Identifying diagnostic studies in MEDLINE: reducing the number needed to read. J Am Med Inform Assoc 2002;9: 653-8.
  5. Deville WL, Bezemer PD, Bouter LM. Publications on diagnostic test evaluation in family medicine journals: an optimal search strategy. J Clin Epidemiol 2000;53: 65-9.
  6. Van der Weijden T, IJzermans CJ, Dinant GJ, van Duijn NP, de Vet R, Buntinx F. Identifying relevant diagnostic studies in MEDLINE. The diagnostic value of the erythrocyte sedimentation rate (ESR) and dipstick as an example. Fam Pract 1997;14: 204-8.
  7. Vincent S, Greenley S, Beaven O. Clinical Evidence diagnosis: developing a sensitive search strategy to retrieve diagnostic studies on deep vein thrombosis: a pragmatic approach. Health Info Libr J 2003;20: 150-9.
  8. Wilczynski NL, Walker CJ, McKibbon KA, Haynes RB. Assessment of methodologic search filters in MEDLINE. Proc Annu Symp Comput Appl Med Care 1993: 601-5.
  9. Haynes RB, Wilczynski N, McKibbon KA, Walker CJ, Sinclair JC. Developing optimal search strategies for detecting clinically sound studies in MEDLINE. J Am Med Inform Assoc 1994;1: 447-58.
  10. Jaeschke R, Guyatt GH, Sackett DL, et al for the Evidence Based Medicine Working Group. Users' guides to the medical literature: III-How to use an article about a diagnostic test. A. Are the results of the study valid? JAMA 1994;271: 389-91.
  11. Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999;282: 1061-6.
  12. Wilczynski NL, McKibbon KA, Haynes RB. Enhancing retrieval of best evidence for health care from bibliographic databases: calibration of the hand search of the literature. Medinfo 2001;10(Pt 1): 390-3.
(Accepted 18 March 2004)







