BMJ 2003;326:41-44 (4 January)

Education and debate

Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative

Patrick M Bossuyt, professor of clinical epidemiology (a); Johannes B Reitsma, clinical epidemiologist (a); David E Bruns, editor (b); Constantine A Gatsonis, professor of medical science (biostatistics) and applied mathematics (c); Paul P Glasziou, professor of evidence based practice (d); Les M Irwig, professor of epidemiology (e); Jeroen G Lijmer, clinical epidemiologist (a); David Moher, director (f); Drummond Rennie, deputy editor (g); Henrica C W de Vet, professor of epidemiology (h); for the STARD steering group.

(a) Department of Clinical Epidemiology and Biostatistics, Academic Medical Center, University of Amsterdam, PO Box 22700, 1100 DE Amsterdam, Netherlands; (b) Clinical Chemistry, University of Virginia, Charlottesville, VA 22903-0757, USA; (c) Center for Statistical Sciences, Brown University, Providence, RI 02912, USA; (d) School of Population Health, University of Queensland, Brisbane, Queensland 4006, Australia; (e) Department of Public Health and Community Medicine, University of Sydney, Sydney, NSW 2006, Australia; (f) Thomas C Chalmers Centre for Systematic Reviews, Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON K1H 8L1, Canada; (g) JAMA, 515 N State St, Chicago, IL 60610, USA; (h) Institute for Research in Extramural Medicine, VU University Medical Center, 1081 BT Amsterdam, Netherlands

Correspondence to: P Bossuyt stard@amc.uva.nl

Objective: To improve the accuracy and completeness of reporting of studies of diagnostic accuracy, to allow readers to assess the potential for bias in a study, and to evaluate a study's generalisability.
Methods: The Standards for Reporting of Diagnostic Accuracy (STARD) steering committee searched the literature to identify publications on the appropriate conduct and reporting of diagnostic studies and extracted potential items into an extensive list. Researchers, editors, and members of professional organisations shortened this list during a two day consensus meeting, with the goal of developing a checklist and a generic flow diagram for studies of diagnostic accuracy.
Results: The search for published guidelines about diagnostic research yielded 33 previously published checklists, from which we extracted a list of 75 potential items. At the consensus meeting, participants shortened the list to a 25 item checklist, using evidence whenever it was available. A prototype of a flow diagram provides information about the method of patient recruitment, the order of test execution, and the numbers of patients undergoing the test under evaluation, the reference standard, or both.
Conclusions: Evaluation of research depends on complete and accurate reporting. If medical journals adopt the STARD checklist and flow diagram, the quality of reporting of studies of diagnostic accuracy should improve to the advantage of clinicians, researchers, reviewers, journals, and the public.
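
The counts that the flow diagram in the Results is meant to convey can be illustrated with a small tally. The following is only a minimal sketch under stated assumptions, not part of the STARD statement: the `Participant` record, its field names, the `flow_counts` helper, and the four-patient cohort are all hypothetical and serve only to show how patients undergoing the test under evaluation, the reference standard, or both might be counted.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """One eligible patient. Field names are hypothetical, not from STARD."""
    had_index_test: bool          # received the test under evaluation
    had_reference_standard: bool  # received the reference standard


def flow_counts(participants):
    """Tally how many patients received each test, both, or neither."""
    counts = Counter()
    for p in participants:
        if p.had_index_test and p.had_reference_standard:
            counts["both"] += 1
        elif p.had_index_test:
            counts["index_test_only"] += 1
        elif p.had_reference_standard:
            counts["reference_standard_only"] += 1
        else:
            counts["neither"] += 1
    return dict(counts)


# Invented cohort, purely for illustration.
cohort = [
    Participant(True, True),
    Participant(True, True),
    Participant(True, False),
    Participant(False, False),
]
print(flow_counts(cohort))
```

Keeping the four groups mutually exclusive means the tallies sum to the number of eligible patients, which makes it easy to see where patients left the study.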



© 2003 BMJ Publishing Group Ltd






