Original Article
Evaluation of Diagnostic Imaging Tests: Diagnostic Probability Estimation
Introduction
Any evaluation of a diagnostic test has to do with a particular generic context of its potential application: concern to learn about the presence of a particular illness in a particular domain of presentation for testing. Thus, for ventilation-perfusion (V-Q) scanning of the lungs, the evaluation might focus on the diagnosis of pulmonary embolism (PE) in the patient giving rise to a suspicion for this illness by a specified set of domain-defining criteria.
For whichever context, evaluation must focus on a particular conceptual variant of the test. Thus, as for V-Q scanning in this context, the concept of the test without further specifications is so vague that one does not know even the broadest nature of its results: is it images per se, descriptive readings or data based on these (such as number of mismatched defects), or interpretation of the images or data with respect to presence of the illness (such as “low probability” of PE)? In other words, without such specification, it is unclear, even, where the test ends and the interpretation of its result begins. The choice among these three conceptualizations of an imaging test is, in and of itself, already a major basis for divergent outlooks on the evaluation of imaging tests.
Another important basis for divergence of outlooks relates to the theoretical framework for diagnosis and, hence, for diagnostic research. It was the radiologist Lusted who, in collaboration with Ledley, introduced the Bayes’ theorem framework for this [1]. Yet, an alternative theoretical framework [2] deserves attention, one that in the context of diagnostic tests has particular merit with respect to imaging tests on the grounds that they produce descriptive readings or data on multiple aspects of the image(s).
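For orientation, the Bayes' theorem framework can be sketched in a few lines of Python. This is only an illustration of the theorem applied to a single test result; the function name and the numeric inputs are hypothetical and are not taken from the article or from PIOPED.

```python
def posterior_probability(prior, p_result_given_ill, p_result_given_not_ill):
    """Bayes' theorem for one test result.

    prior: probability of the illness before the test result is known.
    p_result_given_ill: probability of this result when the illness is present.
    p_result_given_not_ill: probability of this result when it is absent.
    """
    numerator = p_result_given_ill * prior
    denominator = numerator + p_result_given_not_ill * (1.0 - prior)
    return numerator / denominator


# Hypothetical inputs, chosen only to show the arithmetic.
example = posterior_probability(
    prior=0.30, p_result_given_ill=0.40, p_result_given_not_ill=0.05
)
```

Under this framework each distinct result needs its own pair of conditional probabilities, which becomes unwieldy when the result consists of readings on several aspects of the image.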
In what follows, we outline very briefly the outlook that now prevails in the evaluation of diagnostic imaging tests, present critical questions about it, and then outline and justify the proposed alternative approach to setting diagnostic probabilities. We illustrate the prevailing outlook by the Prospective Investigation of Pulmonary Embolism Diagnosis (PIOPED) [3] and the alternative by reanalysis of the PIOPED data.
Section snippets
The prevailing outlook
The PIOPED was an eminent, multicenter study about the presence of PE in the domain of adults in whom symptoms suggestive of PE were present within the most recent 24 hours and prompted a request for radiologic assessment. The radiologic test at issue was V-Q scanning in conjunction with chest roentgenography [3, 4].
The definition of the V-Q test under evaluation involved three sequential elements:
1. Production of the images (imaging proper): when to produce them (recency of symptoms) and how
Critical questions
Taking some distance from this prevailing outlook and culture in the evaluation of diagnostic imaging tests, two important, interrelated questions arise. First, would it not be much more natural to take the development of categories of illness probability (“high probability,” etc.)—insofar as they are of interest at all—to be the first-order objective of the study rather than an a priori constraint for it? In other words, why define the readings-based categories of illness probability in
The alternative outlook: elements
The PIOPED “interpretation categories” were defined on the basis of the following input readings/data [3]:
- Number of large segmental (i.e., 75% or more of a segment) perfusion defects that were mismatched (i.e., without corresponding ventilation or roentgenographic abnormalities or substantially larger than these)
- Number of moderate segmental (i.e., 25%–75%) mismatched perfusion defects
- Number (0, 1–3, 4+) of small segmental (i.e., 25% or less) mismatched perfusion defects with normal roentgenogram
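Readings of this kind could, in principle, enter a logistic function directly as predictors of the probability of PE. The Python sketch below assumes such a model form purely for illustration; the function name, the coding of the small-defect categories, and every coefficient value are hypothetical and are not estimates from the PIOPED data.

```python
import math

# Hypothetical coefficients, for illustration only (not fitted to any data).
HYPOTHETICAL_COEF = {
    "intercept": -2.0,
    "large": 1.0,         # per large mismatched segmental defect
    "moderate": 0.5,      # per moderate mismatched segmental defect
    "small_1_3": 0.2,     # 1-3 small mismatched defects, versus none
    "small_4_plus": 0.4,  # 4 or more small mismatched defects, versus none
}


def pe_probability(n_large, n_moderate, small_category, coef=HYPOTHETICAL_COEF):
    """Probability of PE as a logistic function of the descriptive readings.

    small_category must be one of "0", "1-3", or "4+".
    """
    small_terms = {
        "0": 0.0,
        "1-3": coef["small_1_3"],
        "4+": coef["small_4_plus"],
    }
    logit = (
        coef["intercept"]
        + coef["large"] * n_large
        + coef["moderate"] * n_moderate
        + small_terms[small_category]
    )
    return 1.0 / (1.0 + math.exp(-logit))


# Example call with hypothetical readings.
p_example = pe_probability(n_large=2, n_moderate=0, small_category="0")
```

With a function of this form, the probability follows from the readings themselves rather than from a category assigned to them beforehand.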
The alternative outlook: extensions
In accordance with the spirit of the PIOPED, the situation addressed above was that in which the radiologist expresses diagnostic probability on the basis of the radiologic data alone. Yet, ultimately the diagnostic probability that guides the decision about intervention is based on added inputs from the patient’s history and physical examination as well as tests other than imaging. Some aspects of history are relevant to differential risks for the illness at issue and its
Discussion
Our orientational proposition is that diagnostic interpretation of the readings from a (set of) diagnostic image(s) should not be construed as part of the test itself. Instead, the test should be construed as ending with the readings (descriptive) constituting the test result.
Given this conceptualization of an imaging test in diagnosis, we strongly propose that a priori definition of a (unidimensional) scale of result interpretation should be replaced by logistic regression analysis of the data
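As a rough illustration of what a logistic regression analysis of such data involves, the Python sketch below fits a one-predictor model by gradient ascent on the log-likelihood. The records are hypothetical pairs (number of large mismatched defects, PE present) invented for this example, not PIOPED data, and the fitting routine is a minimal stand-in for the standard statistical software one would use in practice.

```python
import math

# Hypothetical records: (number of large mismatched defects, PE present 0/1).
records = [
    (0, 0), (0, 0), (0, 0), (0, 0), (0, 1),
    (1, 0), (1, 0), (1, 1), (1, 0), (1, 1),
    (2, 0), (2, 1), (2, 1), (2, 1), (2, 0),
    (3, 1), (3, 1), (3, 1), (3, 0), (3, 1),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, learning_rate=0.1, iterations=5000):
    """Maximize the Bernoulli log-likelihood by plain gradient ascent."""
    intercept, slope = 0.0, 0.0
    n = len(data)
    for _ in range(iterations):
        grad_intercept = 0.0
        grad_slope = 0.0
        for x, y in data:
            residual = y - sigmoid(intercept + slope * x)
            grad_intercept += residual
            grad_slope += residual * x
        # Step along the averaged gradient of the log-likelihood.
        intercept += learning_rate * grad_intercept / n
        slope += learning_rate * grad_slope / n
    return intercept, slope


intercept, slope = fit_logistic(records)

# Fitted probability of PE for 0, 1, 2, and 3 large mismatched defects.
fitted = [sigmoid(intercept + slope * x) for x in range(4)]
```

The fitted function assigns a probability to each value of the reading, so any grouping into probability categories can be derived afterward from the fit instead of being fixed in advance.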
Acknowledgements
We thank H. Dirk Sostman, M.D., for providing us with access to the Prospective Investigation of Pulmonary Embolism Diagnosis database and for helpful discussions on the manuscript.
References (14)
- et al. Improved noninvasive diagnosis of acute pulmonary embolism with optimally selected clinical and chest radiographic findings. Acad Radiol (1996)
- et al. A multivariate analysis of the risk of coronary heart disease in Framingham. J Chron Dis (1967)
- et al. Reasoning foundations of medical diagnosis. Science (1959)
- Properties of diagnostic data distributions. Biometrics (1976)
- The PIOPED Investigators. Value of the ventilation/perfusion scan in acute pulmonary embolism: Results of the...
- et al. Ventilation-perfusion scintigraphy in the PIOPED study. I. Data collection and tabulation. J Nucl Med (1993)
- A primer on the precision and accuracy of the clinical examination. JAMA (1992)