Assessing the quality of research
BMJ 2004; 328 doi: http://dx.doi.org/10.1136/bmj.328.7430.39 (Published 01 January 2004) Cite this as: BMJ 2004;328:39
- Paul Glasziou (firstname.lastname@example.org), reader (1),
- Jan Vandenbroucke, professor of clinical epidemiology (2),
- Iain Chalmers, editor, James Lind library (3)
- (1) Department of Primary Health Care, University of Oxford, Oxford OX3 7LF
- (2) Leiden University Medical School, Leiden 9600 RC, Netherlands
- (3) James Lind Initiative, Oxford OX2 7LG
- Correspondence to: P Glasziou
- Accepted 20 October 2003
Inflexible use of evidence hierarchies confuses practitioners and irritates researchers. So how can we improve the way we assess research?
The widespread use of hierarchies of evidence that grade research studies according to their quality has helped to raise awareness that some forms of evidence are more trustworthy than others. This is clearly desirable. However, the simplifications involved in creating and applying hierarchies have also led to misconceptions and abuses. In particular, criteria designed to guide inferences about the main effects of treatment have been uncritically applied to questions about aetiology, diagnosis, prognosis, or adverse effects. So should we assess evidence the way Michelin guides assess hotels and restaurants? We believe five issues should be considered in any revision or alternative approach to helping practitioners to find reliable answers to important clinical questions.
Different types of question require different types of evidence
Ever since two American social scientists introduced the concept in the early 1960s,1 hierarchies have been used almost exclusively to determine the effects of interventions. This initial focus was appropriate but has also engendered confusion. Although interventions are central to clinical decision making, practice relies on answers to a wide variety of clinical questions, not just those about the effects of interventions.2 Other hierarchies might be necessary to answer questions about aetiology, diagnosis, disease frequency, prognosis, and adverse effects.3 Thus, although a systematic review of randomised trials would be appropriate for answering questions about the main effects of a treatment, it would be ludicrous to attempt to use it to ascertain the relative accuracy of computerised versus human reading of cervical smears, the natural course of prion diseases in humans, the effect of carriership of a mutation on the risk of venous thrombosis, or the rate of vaginal adenocarcinoma in the daughters of pregnant women given diethylstilboestrol.4
To answer their everyday questions, practitioners …