Review Article
Quality assessment of observational studies is not commonplace in systematic reviews

https://doi.org/10.1016/j.jclinepi.2005.12.010

Abstract

Background and Objective

To review current practice in the assessment of the quality of original observational studies included in systematic reviews.

Materials and Methods

Examination of all systematic reviews identified by a basic PubMed search for the years 1999–2000 (32 reviews) and 2003–2004 (98 reviews). The setting was English-language systematic reviews published in peer-reviewed journals. Each review was evaluated for the use of quality assessment of original observational studies and, where quality assessment occurred, for the type of assessment used.

Results

Quality assessment occurred in 22% of systematic reviews identified for 1999–2000, compared with 50% of reviews identified for 2003–2004. All of the earlier reviews that assessed quality devised their own assessment criteria, whereas in 2003–2004 10 different quality assessment techniques were identified.

Conclusions

Quality assessment does not routinely occur in systematic reviews of observational studies. Where it does occur, there is no clear consensus on the method used.

Introduction

The systematic review has underpinned evidence-based medicine, in which health care professionals are encouraged to use up-to-date and relevant scientific evidence to aid the clinical decision-making process. Systematic reviews are often viewed as a “gold standard” of research [1], and have instant appeal to clinicians who may not have the time, access, or skills to interpret individual studies. The use of a comprehensive, reproducible search strategy can reassure clinicians that all relevant literature is likely to have been identified. However, it is the extension of this systematic methodology to the critical appraisal and synthesis of original studies that arguably represents the greatest advantage over traditional narrative reviews. The judgment of external and internal validity of original studies, and the selection and weighting of results based on these features, is critical if the conclusions and recommendations of systematic reviews are to be believed by clinicians. This assessment of internal and external validity may, however, be made more difficult by the absence of complete and accurate reporting of original studies.

Quality assessment is well established in systematic reviews of randomized controlled trials (RCTs). A large number of different assessment tools exist; a review by Moher et al. [2] identified 25 scales and nine checklists developed to assess the quality of RCTs. A more recent estimate suggests that between 50 and 60 quality scales exist [3]. Despite the large number of tools available, there is no universally accepted gold standard. Some attempts have been made to improve the validity of quality assessment scales, such as the use of a Delphi technique [4], [5] to aid consensus, but the reliability of many lists is currently unknown [3]. QUADAS (Quality Assessment of Diagnostic Accuracy Studies) has recently been developed, using a Delphi technique and literature search, to assist in the quality assessment of studies of diagnostic accuracy [6].

The use of quality assessment tools to appraise observational studies included in systematic reviews is less well established than in systematic reviews of RCTs [7]. As no accepted gold standard exists, researchers either ignore the issue or develop their own tools [7]. Using quality assessment tools has several advantages: it allows original research to be systematically appraised and evaluated, and the results can be used to rank, weight, or score studies. The components of such tools are often generic, allowing even novice researchers to appraise original material. However, each study is unique, and a quality checklist may not include items that are relevant to that particular study, or may include irrelevant or useless items. This lack of flexibility could result in misclassification of study quality. Creating additional, study-specific criteria may overcome this problem, although this approach carries inherent problems of its own. Even when a scale is used, subjective assessment is still required, as most checklists use terms such as “adequate” and “appropriate” to describe criteria [7].

Methodologic quality refers to the extent to which all aspects of a study's design and conduct can be shown to protect against systematic bias, nonsystematic bias, and inferential error [8]. It is therefore not surprising that checklists used to appraise observational studies tend to concentrate on issues of external and internal validity, although no empirical evidence exists to support this emphasis. Core domains include the comparability of subjects, details of interventions and exposures, outcome measurement, statistical analysis, and funding or sponsorship [9].

Considering observational studies is important, as clinicians often have questions that cannot be answered by randomized controlled trials or diagnostic studies. These include issues of prognosis, etiology, and risk, which are often best addressed with an observational design. Whereas the quality assessment of original studies included in systematic reviews of RCTs is widely accepted as constituting good practice, less is known about the status of quality assessment of observational studies. Although a variety of quality checklists for such studies have been developed [10], [11] (see also www.york.ac.uk/inst/crd/pdf/crd4_ph5.pdf), it is unclear whether these tools are currently being used in the assessment of quality in systematic reviews. Given this, how is quality currently being assessed?

We adopted the perspective of a busy clinician seeking to identify a recent systematic review of observational studies. Our aim was to describe what level of quality assessment the clinician might encounter.


Methods

A PubMed search was performed on July 14, 2005, using the key words “systematic review” and “observational studies.” The abstracts of all papers published in the years 1999–2000 and 2003–2004 were screened for eligibility. Articles meeting the full inclusion criteria were then analyzed for the presence and nature of any quality assessment. Any article with a description of a quality assessment process was included. This search was not intended to be systematic, but rather was designed to reflect the search a busy clinician might realistically perform.
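For reproducibility, a search of this form might be expressed in PubMed syntax as follows; the field tags and date limits shown are assumptions, as the exact query syntax of the original search is not reported:

    "systematic review"[All Fields] AND "observational studies"[All Fields] AND ("1999/01/01"[pdat] : "2000/12/31"[pdat])

The date range would be adjusted accordingly for the 2003–2004 period.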

Results

A total of 32 systematic reviews were identified for the period 1999–2000, of which 27 met the inclusion criteria. For the period 2003–2004, 98 reviews were identified, of which 78 met the inclusion criteria.

Six (22%) of the 1999–2000 articles described quality assessment of original studies, compared with 39 (50%) of articles published in 2003–2004 (see Table 1).

In 1999–2000, all reviews that assessed quality devised their own quality assessment criteria. In 2003–2004, 10 different tools were identified.

Discussion

This study has examined the current practice of quality assessment in systematic reviews of original observational studies. There were three main findings: the increase in quality assessment activity from 1999–2000 to 2003–2004, the absence of quality assessment in half of the most recently published reviews, and a lack of consensus on the choice of quality assessment tool in the remainder.

Of the 78 most recently published reviews that were analyzed, 39 (50%) did not report using any form of quality assessment of the original studies.

References (22)

  • S. West et al. Systems to rate the strength of scientific evidence. Evidence Report/Technology Assessment No. 47 (prepared by the Research Triangle Institute–University of North Carolina Evidence-based Practice Center under Contract No. 290-97-0011). AHRQ Publication No. 02-E016 (2002).