Sample sizes of studies on diagnostic accuracy: literature survey

BMJ 2006; 332 doi: (Published 11 May 2006) Cite this as: BMJ 2006;332:1127
  1. Lucas M Bachmann, senior research fellow (lucas.bachmann{at},
  2. Milo A Puhan, research fellow 2,
  3. Gerben ter Riet, clinical epidemiologist 3,
  4. Patrick M Bossuyt, professor 4
  1. Division of Epidemiology and Biostatistics, Department of Social and Preventive Medicine, University of Bern, Switzerland
  2. Horten Centre, University of Zurich, CH-8091 Zurich, Switzerland
  3. Department of General Practice, Academic Medical Centre, 1105 AZ Amsterdam, Netherlands
  4. Department of Clinical Epidemiology and Biostatistics, Academic Medical Centre, Amsterdam, Netherlands
Correspondence to: L M Bachmann
  • Accepted 7 March 2006


Objectives To determine sample sizes in studies on diagnostic accuracy and the proportion of studies that report calculations of sample size.

Design Literature survey.

Data sources All issues of eight leading journals published in 2002.

Methods Sample sizes, number of subgroup analyses, and how often studies reported calculations of sample size were extracted.

Results 43 of 8999 articles were non-screening studies on diagnostic accuracy. The median sample size was 118 (interquartile range 71-350) and the median prevalence of the target condition was 43% (27-61%). The median number of patients with the target condition—needed to calculate a test's sensitivity—was 49 (28-91). The median number of patients without the target condition—needed to determine a test's specificity—was 76 (27-209). Two of the 43 studies (5%) reported a priori calculations of sample size. Twenty articles (47%) reported results for patient subgroups. The number of subgroups ranged from two to 19 (median four). No studies reported that sample size was calculated on the basis of preplanned analyses of subgroups.
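Only two of the 43 studies reported an a priori calculation of sample size. As a rough illustration of what such a calculation can look like, the sketch below applies the common normal-approximation formula for estimating a proportion to a chosen precision, then scales the result by prevalence. The expected sensitivity (0.80) and the half-width of the confidence interval (0.10) are assumed values chosen for illustration, not figures from the survey; only the 43% prevalence is taken from the results above. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def cases_needed(expected_proportion, half_width, confidence=0.95):
    """Patients needed so the CI around a proportion has the given half-width.

    Normal-approximation formula: n = z^2 * p * (1 - p) / d^2.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * expected_proportion * (1 - expected_proportion) / half_width ** 2
    return math.ceil(n)


def total_needed_for_sensitivity(expected_sensitivity, half_width, prevalence):
    """Total sample size so that enough patients have the target condition."""
    return math.ceil(cases_needed(expected_sensitivity, half_width) / prevalence)


# Assumed inputs (illustrative only): sensitivity 0.80, half-width 0.10.
# Prevalence 0.43 is the median reported in the survey.
diseased = cases_needed(0.80, 0.10)
total = total_needed_for_sensitivity(0.80, 0.10, 0.43)
print(diseased, total)
```

Under these assumptions the sketch gives 62 patients with the target condition and 145 patients in total, both above the survey's medians of 49 and 118. Different assumed values would change the result considerably.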

Conclusion Few studies on diagnostic accuracy report considerations of sample size. The number of participants in most studies on diagnostic accuracy is probably too small to analyse variability of measures of accuracy across patient subgroups.
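To see why subgroup analyses strain these sample sizes, the sketch below computes a Wilson score interval for sensitivity, first with the median 49 patients with the target condition and then with those patients split evenly across four subgroups (the median number of subgroups reported). The observed sensitivity of 0.80 and the even split are assumptions for illustration; the survey reports neither.

```python
import math
from statistics import NormalDist


def wilson_interval(proportion, n, confidence=0.95):
    """Wilson score confidence interval for a proportion observed in n patients."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    denom = 1 + z ** 2 / n
    centre = (proportion + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(
        proportion * (1 - proportion) / n + z ** 2 / (4 * n ** 2)
    ) / denom
    return centre - half, centre + half


assumed_sensitivity = 0.80               # illustrative value, not from the survey
cases_overall = 49                       # median number with the target condition
cases_per_subgroup = cases_overall // 4  # assumes an even split over four subgroups

for label, n in [("overall", cases_overall), ("per subgroup", cases_per_subgroup)]:
    low, high = wilson_interval(assumed_sensitivity, n)
    print(f"{label}: n={n}, 95% CI {low:.2f} to {high:.2f}, width {high - low:.2f}")
```

The interval for a single subgroup is considerably wider than the overall interval, which is the practical meaning of "too small to analyse variability across patient subgroups" under these assumed inputs.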


  • This article was posted on 20 April 2006.

  • Contributors All members of the SUBIRAR (subjectivity rationality and reasoning) research collaboration (Klaus Eichler, Madlaina Scharplatz, and Johann Steurer, Horten Centre, University of Zurich, Switzerland; Ulrich Hoffrage, Max Planck Institute for Human Development and Cognition, Berlin, Germany; Alfons G Kessels, Hans Severens, Maastricht University, Netherlands; Khalid S Khan, University of Birmingham, UK; Jos Kleijnen, Centre for Reviews and Dissemination, University of York, UK) were involved in the design and critical review of the study. LMB, MAP, and GtR developed the protocol. LMB and MAP acquired the data. All authors interpreted the data and helped prepare the manuscript. LMB was guarantor.

  • Funding LMB was supported by the Swiss National Science Foundation (grants 3233B0-103182 and 3200B0-103183).

  • Competing interests None declared.
