John W Ely a Department of Family Medicine, 01291-D PFP,
University of Iowa College of Medicine, 200 Hawkins Drive, Iowa City,
IA 52242-1097, USA, b Praxis Press, 36 W 25th Street, 7th Floor, New York, NY 10010, USA, c Division of Medical Informatics and Outcomes Research, Oregon
Health Sciences University, 3181 SW Sam Jackson Park Road, Portland, OR
97201, USA, d Department of Family Practice, Michigan State University, B101
Clinical Center, East Lansing, MI 48824-1315, USA, e Moses Cone Family
Medicine Residency, 1125 N Church Street, Greensboro, NC 27401, USA, f Presbyterian Medical
Center, 39th and Market Streets, Medical Arts Building, Suite 102, Philadelphia, PA 19104, USA, g School of Information Resources and
Library Science, University of Arizona, 1515 East First Street, Tucson,
AZ 85719, USA
Correspondence to: J W Ely john-ely{at}uiowa.edu
Abstract
Objective:
To develop a taxonomy of doctors'
questions about patient care that could be used to help answer such questions.
Doctors often have questions about care as they see their
patients: "How soon should I mobilise a patient with a deep vein thrombosis?" "How common is depression after infectious
mononucleosis?" "Should a pregnant woman at full term with
spontaneous rupture of membranes but not in labour come to the hospital
now (3 am) or could she wait four hours?" Doctors answer only a
minority of such questions authoritatively by consulting information
resources.1-3
Answers might be more readily available if the authors of such
resources knew what information needs arise in practice. In a previous
study of doctors' questions, we developed a scheme to classify 1101 questions collected from 103 Iowa family doctors.2 The
purpose was to determine whether the essence of clinical questions could be captured by a limited number of generic question types. Questions with nearly identical structures (such as "How should I
treat her Paget's disease?" and "How should I treat his
epididymitis?") were placed into a single generic type ("How should
I treat condition x?"). Through an iterative process of coding and
revision, we developed a taxonomy of 69 generic types. This taxonomy
may have limited applicability, however, because it was based on
questions from a homogeneous group of doctors and because its
interrater reliability was measured among a small group of investigators.
Therefore, in the current study, we modified this previously developed
taxonomy to accommodate a different set of questions, and we measured
interrater reliability in a more heterogeneous group of coders. Our
goal was to produce a logical and concise classification scheme that
could be applied reproducibly to the full range of questions that occur
in primary care. We believe that such a scheme could increase the
likelihood of finding answers to primary care questions. For
example, the scheme could be used to identify frequently asked but
problematic question types, enabling authors to develop better answers
and more effective strategies for linking questions to their answers.
Design:
Use of 295 questions asked by Oregon primary care doctors to modify previously developed taxonomy of 1101 clinical questions asked by Iowa family doctors.
Setting:
Primary care practices in Iowa and Oregon.
Participants:
Random samples of 103 Iowa family
doctors and 49 Oregon primary care doctors.
Main outcome measures:
Consensus among seven
investigators on a meaningful taxonomy of generic questions; interrater
reliability among 11 individuals who used the taxonomy to classify a
random sample of 100 questions: 50 from Iowa and 50 from Oregon.
Results:
The revised taxonomy, which comprised 64 generic question types, was used to classify 1396 clinical questions. The three commonest generic types were "What is the drug of choice for condition x?" (150 questions, 11%); "What is the cause of symptom x?" (115 questions, 8%); and "What test is indicated in situation x?" (112 questions, 8%). The mean interrater reliability among 11 coders was moderate (κ=0.53, agreement 55%).
Conclusions:
Clinical questions in primary care can be categorised into a limited number of generic types. A moderate degree
of interrater reliability was achieved with the taxonomy developed in
this study. The taxonomy may enhance our understanding of doctors'
information needs and improve our ability to meet those needs.
Introduction
Participants and methods
Original taxonomy
In our previous study, 1101 questions about patient care were
collected from 103 randomly selected Iowa family doctors.2
The investigators visited doctors in their offices and recorded
questions between patients' visits. Participants were asked to report
everything from "clear cut questions (What's the dose of
metformin?)" to the "vague, fleeting uncertainties" that they
would normally keep to themselves. The purpose was to describe the
questioning and answer seeking behaviour of family doctors and to
develop a taxonomy of generic questions.
Additional questions
In a separate study, Gorman and Helfand collected 295 questions
from 49 Oregon primary care doctors (29 family doctors, 14 general
internists, and 6 general paediatricians).3 The participants were asked to report questions about diagnosis or management. The purpose was to determine how doctors decide which questions to pursue and which to leave unanswered.
Generic question taxonomy
In our current study, seven investigators coded a random sample of
100 of the additional questions using the previously developed
taxonomy.2 Working independently, each investigator
suggested changes to better accommodate the questions. We added new
generic question types, changed the wording of existing types, and
combined closely related types. A fourth option was to use an existing
type to make a plausible, albeit imperfect, match. These decisions were
based on consensus and guided by our goal of producing a concise,
intuitive, reliable taxonomy.
Taxonomy reliability
To measure the interrater reliability of the final taxonomy,
the seven investigators coded a final random sample of 100 questions,
50 from Iowa and 50 from Oregon. In addition, four volunteers who were
not familiar with the taxonomy coded the same 100 questions. The κ
statistic4 was used to estimate interrater reliability.
This is a measure of agreement, which corrects for agreement that
occurs by chance. It can be defined as
$\kappa = (P_o - P_e)/(1 - P_e)$, where
$P_o$ is the observed agreement and $P_e$ is the
agreement expected by chance.4 From this formula, it can be seen that when the number of categories is large, as in this study,
the agreement expected by chance ($P_e$) will be close to zero, and κ
will be close to the observed percentage agreement ($P_o$). We used a z test based on the
κ values and their
standard errors to compare reliability between groups of coders
(investigators v volunteers) and between groups of
questions (Iowa v Oregon). We chose a two tailed
significance level of 0.05 and performed all analyses with Stata (Stata
Corporation, College Station, TX).
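The definition of κ and the z test described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the Stata analysis used in the study: it computes the two-rater form of κ, whereas the study pooled 11 coders, and the codes, κ values, and standard errors in the example are made up for illustration. The function names `cohen_kappa` and `z_test` are hypothetical.

```python
from collections import Counter
from math import erfc, sqrt


def cohen_kappa(codes_a, codes_b):
    """Two-rater kappa, (Po - Pe) / (1 - Pe), for two equal-length code lists."""
    n = len(codes_a)
    # Po: proportion of questions given the same code by both coders.
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Pe: agreement expected by chance from each coder's category frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def z_test(kappa_1, se_1, kappa_2, se_2):
    """Compare two independent kappa estimates; returns (z, two tailed P)."""
    z = (kappa_1 - kappa_2) / sqrt(se_1 ** 2 + se_2 ** 2)
    p = erfc(abs(z) / sqrt(2))  # two tailed P value from the standard normal
    return z, p


# Illustrative (made-up) codes from two coders for eight questions.
coder_a = ["drug", "cause", "test", "drug", "cause", "test", "drug", "drug"]
coder_b = ["drug", "cause", "test", "cause", "cause", "drug", "drug", "drug"]
print(round(cohen_kappa(coder_a, coder_b), 2))  # 0.6 (Po = 0.75, Pe = 0.375)

# Hypothetical kappa values and standard errors, not taken from the study.
z, p = z_test(0.60, 0.03, 0.50, 0.04)
print(round(z, 2), round(p, 3))
```

With only three categories in this toy example, the chance agreement is large (0.375), so κ falls well below the raw agreement of 0.75. With many categories, as in the taxonomy, chance agreement shrinks and κ approaches the observed agreement.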
Results
Demographic data
The mean age of the 103 Iowa doctors was 48 years, 23 (22%) were
women, and 54 (52%) practised in a rural area.2 The mean
age of the 49 Oregon doctors was 45 years, 6 (12%) were women, and 24 (49%) practised in a rural area.3 The investigators
comprised three academic family doctors, three internists with interest
and training in medical informatics, and a medical information
scientist. The four volunteer coders comprised three family doctors at
the University of Iowa and a medical librarian.
Generic questions
The generic questions were categorised using four hierarchical
levels of specificity (see extra table on the BMJ website
for details). The first level consisted of five broad areas: diagnosis,
treatment, management, epidemiology, and non-clinical questions.
Management questions asked what steps to take without distinguishing
between diagnostic steps and treatment steps. A branching structure of
secondary, tertiary, and quaternary levels further characterised the
generic questions. Each quaternary category was exemplified by one or
more closely related generic questions. For example, the question "Is
there a way to continue lovastatin in patients with side effects of
headache or indigestion (such as reduce the dose)?" would be coded as
"treatment" (primary), "drug prescribing" (secondary),
"adverse effects" (tertiary), and "administration in the face of
adverse effects" (quaternary). The generic question corresponding to
this quaternary category is "How can drug x be administered without
causing adverse effect y?"
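The four level coding of the lovastatin example can be sketched as a small data structure. This is only an illustration of the hierarchy described above, assuming a simple record with one field per level. The class name `GenericCode`, the helper `agree_at`, and the second coder's "dosage" label are hypothetical and are not taken from the published taxonomy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenericCode:
    """One taxonomy assignment: four hierarchical levels plus the generic question."""
    primary: str
    secondary: str
    tertiary: str
    quaternary: str
    generic_question: str

    def path(self, depth=4):
        """Return the first `depth` levels joined into a readable path."""
        levels = (self.primary, self.secondary, self.tertiary, self.quaternary)
        return " > ".join(levels[:depth])


def agree_at(code_a, code_b, depth):
    """True if two coders' assignments match down to the given level."""
    return code_a.path(depth) == code_b.path(depth)


# Coding of the lovastatin question, as given in the text.
lovastatin = GenericCode(
    primary="treatment",
    secondary="drug prescribing",
    tertiary="adverse effects",
    quaternary="administration in the face of adverse effects",
    generic_question="How can drug x be administered without causing adverse effect y?",
)

# Hypothetical second coder who diverges at the tertiary level.
other = GenericCode(
    primary="treatment",
    secondary="drug prescribing",
    tertiary="dosage",  # made-up label for illustration
    quaternary="not elsewhere classified",
    generic_question="What is the dose of drug x?",
)

print(lovastatin.path(2))              # treatment > drug prescribing
print(agree_at(lovastatin, other, 2))  # True
print(agree_at(lovastatin, other, 3))  # False
```

Comparing assignments at a chosen depth mirrors the way agreement can be assessed at the primary level alone, at the primary and secondary levels, or across the full taxonomy.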
Question frequency
After combining the Iowa and Oregon questions (n=1396), we found
that the three most common generic types were "What is the drug of
choice for condition x?" (150 questions, 11%), "What is the cause
of symptom x?" (115 questions, 8%), and "What test is indicated in
situation x?" (112 questions, 8%) (see table). Eight (0.6%) of the
1396 questions could not be classified beyond the primary level. To
accommodate these questions, each primary level included a "not
elsewhere classified" category.
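The reported percentages follow directly from the counts and the combined total. A short Python check, using only the figures quoted above (the helper name `percent` is hypothetical):

```python
IOWA_QUESTIONS = 1101
OREGON_QUESTIONS = 295
TOTAL = IOWA_QUESTIONS + OREGON_QUESTIONS  # 1396 questions in the combined set

# Counts for the three most common generic types, as reported in the text.
counts = {
    "What is the drug of choice for condition x?": 150,
    "What is the cause of symptom x?": 115,
    "What test is indicated in situation x?": 112,
}


def percent(count, total=TOTAL, digits=0):
    """Share of all questions, as a percentage rounded to `digits` places."""
    return round(100 * count / total, digits)


for question, count in counts.items():
    print(f"{question} {count} ({percent(count):.0f}%)")

# Questions that could not be classified beyond the primary level.
print(f"Unclassified beyond primary level: 8 ({percent(8, digits=1)}%)")
```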
Interrater reliability
The combined κ statistic for all 11 coders was 0.53 (55%
agreement, indicating "moderate" reliability5). Agreement was slightly higher for the 50 Iowa questions than for the 50 Oregon questions (κ values 0.54 v 0.51, P<0.001). When only the five broad areas in the primary level of the taxonomy were
considered (diagnosis, treatment, management, epidemiology, non-clinical), agreement was "substantial"5 (κ=0.70),
and agreement remained substantial when the primary and secondary
levels (26 categories) were considered (κ=0.62). Agreement among the investigators was similar to that among the volunteers (κ values 0.55 v 0.54, P=0.33). Agreement among the investigators with previous coding experience was higher than that among the other seven coders (κ values 0.68 v 0.47, P<0.001).
Discussion
Main findings
In this study, we modified a taxonomy of generic clinical
questions and measured how reproducibly 11 coders could assign
questions to it. We found that a large number of questions could be
categorised using a limited number of generic types. Coding
reproducibility was moderate and was highest among the most experienced coders.
Comparison with other studies
Cimino matched clinicians' natural language inquiries with
generic types for which computerised retrieval strategies had been
previously developed.6 The query types were based on
semantic relations drawn from the National Library of Medicine's
Unified Medical Language System.7 8 The generic types were applied to questions that would be submitted to computerised retrieval systems rather than to the "on the spot" questions that we collected. These investigators did not describe a comprehensive taxonomy of generic queries.
Implications of study
Our goal was to build a taxonomy that was valid, reliable,
concise, intuitive, comprehensive, and useful. We have identified four
areas of potential usefulness. Firstly, taxonomies such as ours, that
are based on the generic type of information needed, could be used to
organise large collections of clinical questions for efficient
retrieval. Authors who want to produce clinically relevant material
should address real questions that occur in practice. But there is an
infinite number of such questions, and even frequently asked questions
would be unmanageable without some way to organise them.
Limitations of study
The kinds of questions collected in studies of doctors'
information needs seem to depend on the methods used to collect them.
For example, the Iowa doctors in this study were more likely than the
Oregon doctors to ask about the cause of symptoms and physical
findings. But the Oregon doctors were asked to report questions about
"diagnosis or management," whereas Iowa doctors were asked to
report everything from "clear cut questions" to "vague, fleeting
uncertainties." Both datasets used office observations to collect
questions, but other methods, such as the critical incident
technique,12 have been used and might influence the kinds
of questions collected.
The questions in this study were "user based"; that is, questions were collected without regard to the most appropriate method for answering them. Other question sets are "system based," focusing, for example, on questions submitted to
computerised information retrieval systems.13 Some user
based questions ("What is causing her abdominal pain?") would
require a system based modification before they could be answered by a general information resource ("What is the differential diagnosis of
right lower quadrant pain in adolescent females?").
We achieved only moderate interrater reliability. However, the coding
reliability in this study compares favourably with other attempts to
categorise medical topics. For example, highly trained Medline indexers
achieve "consistency percentages" of only 43% when assigning
medical subject headings and subheadings (MeSH terms) to journal
articles.14
Conclusions
Doctors do not pursue answers to most of their questions, partly
because they believe the answers are not readily available,3
a belief that is often correct.1 3 15
Doctors need rapid, accurate, and
accessible answers to on the spot questions as they see their
patients.16 By learning about doctors' questions, we hope
to influence the content of clinical information resources. By
organising the full range of information needs that occur in practice,
we can begin to address the most common types.
What is already known on this topic
In a previous study, the essence of 1101 clinical questions asked by family doctors was captured in 69 generic types (such as "What is the drug of choice for condition x?")
The applicability of this generic question taxonomy may be limited because of the homogeneous nature of the participants
What this study adds
After revision of the original taxonomy, questions asked by a different group of 49 primary care doctors could be classified with moderate reliability among 11 coders
The taxonomy has four potential uses: to organise large numbers of real questions, to route questions to appropriate knowledge resources by using automated interfaces, to characterise and help remedy areas where current resources fail to address specific question types, and to set priorities for research by identifying question types for which answers do not exist
Acknowledgments
Contributors: JWE collected the Iowa questions, coordinated the development of the taxonomy, and wrote the first draft of the paper. JAO had the original idea of organising clinical questions by generic type. He helped with constructing the taxonomy and with writing the paper. PNG collected the Oregon questions, guided the early development of the study and helped design the taxonomy. MHE, MLC, EAP, and PZS helped plan the study, and each coded 250 Oregon questions and 50 Iowa questions. They modified the first draft of the taxonomy to better accommodate these questions and approved the final version. All authors contributed to editing the paper. Dedra Diehl, Susan Langbehn, Robert Garrett, and Mark Graber helped test the taxonomy, and Jeffrey Dawson provided statistical support. JWE and JAO are guarantors for the study.
Footnotes
Funding: This study was supported by a grant (G9518) from the American Academy of Family Physicians Foundation.
Competing interests: None declared.
A list of the taxonomy of generic
clinical questions appears on the BMJ's website
References
1. Covell DG, Uman GC, Manning PR. Information needs in office practice: are they being met? Ann Intern Med 1985; 103: 596-599.
2. Ely JW, Osheroff JA, Ebell MH, Bergus GR, Levy BT, Chambliss ML, et al. Analysis of questions asked by family doctors regarding patient care. BMJ 1999; 319: 358-361.
3. Gorman PN, Helfand M. Information seeking in primary care: how physicians choose which clinical questions to pursue and which to leave unanswered. Med Decis Making 1995; 15: 113-119.
4. Fleiss JL. Statistical methods for rates and proportions. 2nd ed. New York: Wiley, 1981.
5. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977; 33: 159-174.
6. Cimino J. Generic queries for meeting clinical information needs. Bull Med Libr Assoc 1993; 81: 195-206.
7. Humphreys BL, Lindberg DA, Schoolman HM, Barnett GO. The unified medical language system: an informatics research collaboration. J Am Med Inform Assoc 1998; 5: 1-11.
8. McCray AT, Miller RA. Making the conceptual connections: the unified medical language system (UMLS) after a decade of research and development. J Am Med Inform Assoc 1998; 5: 129-130.
9. Graesser AC, Lang K, Horgan D. A taxonomy for question generation. Questioning Exchange 1988; 2: 3-15.
10. Osheroff JA, Bankowitz RA. Physicians' use of computer software in answering clinical questions. Bull Med Libr Assoc 1993; 81: 11-19.
11. Curley SP, Connelly DP, Rich EC. Physicians' use of medical knowledge resources: preliminary theoretical framework and findings. Med Decis Making 1990; 10: 231-241.
12. Northup DE, Moore-West M, Skipper B, Teaf SR. Characteristics of clinical information-searching: investigation using critical incident technique. J Med Educ 1983; 58: 873-881.
13. Lindberg DA, Siegel ER, Rapp BA, Wallingford KT, Wilson SR. Use of MEDLINE by physicians for clinical problem solving. JAMA 1993; 269: 3124-3129.
14. Funk ME, Reid CA. Indexing consistency in MEDLINE. Bull Med Libr Assoc 1983; 71: 176-183.
15. Chambliss ML, Conley J. Answering clinical questions. J Fam Pract 1996; 43: 140-144.
16. Huth EJ. "In the balance": weighing the evidence. Ann Intern Med 1994; 120: 889.
(Accepted 22 May 2000)