
Education and debate: Evidence base of clinical diagnosis

Designing studies to ensure that estimates of test accuracy are transferable

BMJ 2002;324 (Published 16 March 2002). Cite this as: BMJ 2002;324:669
  1. Les Irwig, professor (a),
  2. Patrick Bossuyt, professor of clinical epidemiology (b),
  3. Paul Glasziou, professor of evidence based practice (c),
  4. Constantine Gatsonis, professor (d),
  5. Jeroen Lijmer, clinical researcher (b)
  1. a Screening and Test Evaluation Program, Department of Public Health and Community Medicine, University of Sydney, NSW 2006, Australia
  2. b Department of Clinical Epidemiology and Biostatistics, Academic Medical Centre, PO Box 22700, 1100 DE Amsterdam, Netherlands
  3. c School of Population Health, University of Queensland Medical School, Herston, QLD 4006, Australia
  4. d Center for Statistical Sciences, Brown University, Providence, RI 02192, USA

    This is the third in a series of five articles

    Measures of test accuracy are often thought of as fixed characteristics determinable by research and then applicable in practice. Yet even when tests are evaluated in a study of adequate quality—one including such features as consecutive patients, a good reference standard, and independent, blinded assessments of tests and the reference standard1—performance of a diagnostic test in one setting may vary significantly from the results reported elsewhere.28 In this paper, we explore the reasons for this variability and its implications for the design of studies of diagnostic tests.

    Summary points

    Test accuracy may vary considerably from one setting to another

    This may be due to the target condition, the clinical problem, what other tests have been done, or how the test is carried out

    Larger studies than those usually done for diagnostic tests will be needed to assess transferability of results

    These studies should explore the extent to which variation in test accuracy between populations can be explained by patient and test features

    True variability in test accuracy

    Interpreting a test's results in a different setting requires an understanding of whether and why the test's accuracy varies. Measures of accuracy fall into two broad categories: measures of discrimination between people who are and who are not diseased, and measures of prediction used to estimate the post-test probability of disease.
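To make the idea of a measure of prediction concrete, the following minimal sketch converts a pre-test probability into a post-test probability using the standard likelihood ratio form of Bayes' theorem. The function name, parameter names, and the numbers in the usage lines are illustrative assumptions, not values from the article.

```python
def post_test_probability(pre_test_prob, sensitivity, specificity, positive=True):
    """Post-test probability of disease after a positive or negative result.

    Illustrative helper (not from the article). Assumes 0 < pre_test_prob < 1,
    0 < specificity <= 1, and specificity < 1 when positive=True.
    """
    if positive:
        # Likelihood ratio of a positive result: sens / (1 - spec)
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        # Likelihood ratio of a negative result: (1 - sens) / spec
        likelihood_ratio = (1.0 - sensitivity) / specificity

    pre_test_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical test with sensitivity 0.9 and specificity 0.9,
# applied in two settings that differ only in pre-test probability.
high_prevalence = post_test_probability(0.5, 0.9, 0.9)
low_prevalence = post_test_probability(0.1, 0.9, 0.9)
```

With the same hypothetical sensitivity and specificity, the post-test probability after a positive result is much lower in the low prevalence setting, which is one simple way a measure of prediction can differ between settings even if discrimination is unchanged.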

    Measures of discrimination

    Global measures of test accuracy assess only the ability of the test to discriminate between people with and without a disease. Common examples are the area under the receiver operating characteristic curve (ROC), and the odds ratio (OR), sometimes also referred to as the diagnostic odds ratio. Such results may suffice for some broad health policy decisions—for example, to decide whether a new test is in general better than an existing test for …
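As a rough illustration of the two global measures named above, the sketch below computes a diagnostic odds ratio from a 2×2 table and an area under the ROC curve from raw test scores. It uses the standard textbook definitions; the function names and the example counts and scores are hypothetical and are not taken from the article.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """Diagnostic odds ratio from a 2x2 table: (TP * TN) / (FP * FN).

    Illustrative helper. Undefined when fp or fn is zero; a continuity
    correction would be needed in that case and is not applied here.
    """
    return (tp * tn) / (fp * fn)


def area_under_roc(diseased_scores, healthy_scores):
    """Area under the ROC curve as the probability that a randomly chosen
    diseased person scores higher than a randomly chosen non-diseased
    person, counting ties as one half.

    Assumes higher scores indicate disease and both lists are non-empty.
    """
    concordant = 0.0
    for d in diseased_scores:
        for h in healthy_scores:
            if d > h:
                concordant += 1.0
            elif d == h:
                concordant += 0.5
    return concordant / (len(diseased_scores) * len(healthy_scores))


# Hypothetical 2x2 table: 90 true positives, 10 false positives,
# 10 false negatives, 90 true negatives.
dor = diagnostic_odds_ratio(tp=90, fp=10, fn=10, tn=90)

# Hypothetical test scores for three diseased and three non-diseased people.
auc = area_under_roc([3, 4, 5], [1, 2, 3])
```

Both quantities summarise discrimination in a single number and do not depend on the prevalence of disease in the sample, which is why they say nothing by themselves about post-test probability.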
