Papers

Volume and outcome in coronary artery bypass graft surgery: true association or artefact?

BMJ 1995; 311 doi: http://dx.doi.org/10.1136/bmj.311.6998.151 (Published 15 July 1995) Cite this as: BMJ 1995;311:151
  1. Amanda J Sowden, research fellow (a),
  2. Jonathan J Deeks, research fellow (a),
  3. Trevor A Sheldon, director (a)
  (a) NHS Centre for Reviews and Dissemination, University of York, York YO1 5DD
  Correspondence to: Dr Sowden.
  • Accepted 1 May 1995

Abstract

Objectives: To examine the evidence for a relation between volume of coronary artery bypass graft surgery and hospital death rates, and to assess the degree to which this could be due to confounding because of differences in case mix.

Subjects: People receiving coronary artery bypass graft surgery in the United States.

Design: A systematic review of empirical studies examining the relation between volume and outcome of coronary artery bypass graft surgery. Studies were scored according to degree of adjustment for case mix. More than 200 procedures a year was regarded as high volume.

Results: Fifteen studies were identified, all of which used observational data from the United States for 1972-92. Six were included in the analysis, one was included in a sensitivity analysis, and eight were excluded because of duplicate analyses of data sources and methods of reporting results. The seven studies analysed reported a reduced mortality with increased volume. Studies with better adjustment for case mix, however, indicated less reduction in mortality with increased volume (P=0.04). The apparent advantages of higher volume also decreased over time (P<0.001).

Conclusions: The evidence for reduced mortality in hospitals with a high volume of coronary artery bypass graft surgery is based entirely on observational studies. These studies may have overestimated the benefit of increased volume because of poor adjustment for case mix. This finding signals the need for caution in interpreting the results of observational studies that examine the relation between volume and outcome.

Key messages

  • This evidence comes from observational studies in the United States that compared routine data from hospitals with high and low volumes of the operation

  • Results from observational studies, because they are not randomised, are subject to confounding due, for example, to differences in the case mix of patients

  • This study shows that the more differences in case mix are taken into account, the smaller are the apparent benefits of increased volume of surgery

  • Policy makers should not assume that concentrating surgical services into larger and more active units will improve outcomes.

Introduction

There has been considerable interest over recent years in the concentration of health care services into larger units. This reflects both a concern to exploit perceived economies of scale and a belief that the delivery of care in larger specialist units will improve quality by increasing volumes of activity.1

Much research has been carried out to compare outcomes (principally mortality in hospital) in units with different volumes of activity.2 Most of these studies have used data from routine administrative or clinical databases in the United States. It has been assumed that units with higher death rates provide poorer quality care.3

One of the major problems with such routine observational data is that differences in outcomes from various units might be due to differences in case mix between the units.4 Differences in case mix may reflect factors such as age, demographic characteristics, severity of the primary diagnosis, and complexity of comorbidities.

In other words, because such studies are observational rather than experimental, they are susceptible to confounding.5 Because of potential confounding it is difficult to attribute differences in mortality to characteristics of care such as volume of activity. For example, it might be that patients receiving a procedure in a high volume facility are less severely ill than those referred to smaller centres. In this case, differences in patients' health outcomes across units will overestimate the advantages of larger units. Statistical adjustment to control for the effects of confounding can improve the reliability of estimates of the effect of treatment, though it is uncertain how adequately confounding factors have been taken into account.6 Another concern arises from inaccuracies and omissions in retrospective discharge data. Organised prospective collection of risk factors is probably both more reliable and more complete and will lead to better risk adjustment.

If volume effects are real, it is unclear what they mean in practice and whether changes in hospital volumes will lead directly to reductions in mortality. It is also uncertain whether volume effects are constant or change over time as (say) new procedures become more widely adopted. We assessed the degree to which the reported relation between volume of activity and mortality may be an artefact which reflects differences in case mix and is thus affected by the extent of adjustment.

We used coronary artery bypass graft surgery as it has been extensively studied, and there is also wide variability in the adjustments made for case mix. It is a technique which has been widely adopted since it was first described in 1967, and there has been a steady decline in overall mortality for this type of surgery over the past 20 years.7 8 There are also independent reports which identify those patient factors that accurately predict survival. Theoretically a significant yet spurious relation could thus be reported because of inadequate adjustment for risk of mortality or case mix.

Two recent studies have identified preoperative risk factors which predict postoperative mortality.9 10 Included are age, sex, previous open heart operations, ejection fraction, diabetes, previous myocardial infarction, dependence on dialysis, “disasters,” cardiac catheterisation, unstable angina and intractable congestive heart failure, emergency procedure, creatinine concentrations over 168 μmol/l, severe left ventricular dysfunction, chronic pulmonary disease, previous vascular surgery, second or subsequent operation, and mitral valve insufficiency. Clinical severity scores based on this type of detailed clinical information have been shown to be good predictors of mortality and useful in the comparison of the performance of different hospitals.11 It has also been shown that good adjustment with clinical data from specialised databases can produce similar results to those of randomised controlled trials.12

Methods

The literature was reviewed to identify studies which examined the relation between volume of surgery and outcome in patients. The review was based on a search of Medline (from 1985 to 1994) and of the Science Citation Index on the Bath Information and Data Service (BIDS) (from 1993 to 1994). The reference lists of identified articles were also searched. Key relevant journals (Medical Care) were also hand searched from 1971 to 1994.

Data from each study were extracted by using the cut off point closest to 200 procedures a year to define high and low volume hospitals. A figure of 200 was used as it was the only cut off point that was common to all studies and thus allowed comparison between the results. In addition, several authors have suggested that there is a threshold of about 200 such procedures a year.13

One study, in which it was unclear how the cut off points related to hospital volume, was excluded from the main analysis but was included in a sensitivity analysis. Studies in which volume had been analysed as a continuous rather than categorical measure were excluded from the analysis as it was not possible to extract the required data.

Care was taken not to include data from the same source and time period more than once. Where duplication was possible but not clear from the published study the authors were contacted for clarification.

Numbers of patients and adjusted mortalities were extracted from each study along with the variables used to adjust for patient mix. In some studies the expected rather than adjusted death rates were presented. In these instances the crude death rate was retained for the high volume group while the low volume mortality was adjusted by multiplying by the ratio of the expected death rates in the high volume compared with the low volume group.
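The conversion described above, for studies that reported expected rather than adjusted death rates, can be sketched as follows. This is a minimal illustration only: the function names and all the rates are hypothetical and are not taken from any of the reviewed studies.

```python
def adjust_low_volume_rate(crude_low, expected_high, expected_low):
    """Rescale the low volume crude death rate to the high volume case mix.

    Follows the rule in the text: multiply the low volume mortality by the
    ratio of expected death rates (high volume / low volume).
    """
    return crude_low * (expected_high / expected_low)


def odds_ratio(rate_high, rate_low):
    """Odds of death in high volume relative to low volume hospitals."""
    return (rate_high / (1 - rate_high)) / (rate_low / (1 - rate_low))


# Hypothetical study: these numbers are invented for illustration.
crude_high, crude_low = 0.040, 0.060        # observed death rates
expected_high, expected_low = 0.040, 0.050  # rates predicted from case mix

# The crude rate is retained for the high volume group, as in the text.
adjusted_low = adjust_low_volume_rate(crude_low, expected_high, expected_low)

print(f"adjusted low volume rate: {adjusted_low:.3f}")
print(f"crude odds ratio:    {odds_ratio(crude_high, crude_low):.2f}")
print(f"adjusted odds ratio: {odds_ratio(crude_high, adjusted_low):.2f}")
```

In this invented example the low volume hospitals treat sicker patients (higher expected rate), so adjustment moves the odds ratio towards 1, which is the pattern the review goes on to examine.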

Each study was given a score from 0 to 3 indicating the adequacy of adjustment based on the evidence of prognostic factors discussed above (table I). The assessor was blind to the results of each study when the scores were assigned. The estimates of benefit associated with higher volume (odds ratio) for each study were plotted against the degree of adjustment used in the study on the four point classification. This was repeated for the year of study.

TABLE I

Scoring of adjustment for case mix in the coronary artery bypass graft studies

A statistical model was developed to investigate whether there was a systematic change in the estimates of the volume effect as the degree of adjustment for patient mix was improved and also as the year of data collection increased. Logistic regression was used to model the reported risks of death in high (>200) and low (≤200) volume hospitals in each study. A covariate indicating high and low volume was included to estimate the effect of volume on mortality. All models also included a covariate for each study, so that volume effects were estimated on the basis of pooled comparisons within studies.

Logistic regression conveniently allows effects between and within studies to be modelled, together with their interaction terms. Whitehead and Whitehead discussed regression models as a method of investigating possible explanations of “treatment interactions” (heterogeneity in treatment effects between studies) in meta-analysis.14 They proposed estimating covariate-treatment interactions, where the covariates are study features, such as characteristics of the subjects or study design, which vary between the studies. Thompson, for example, used a logistic regression model to investigate the effect of differences in the follow up periods in meta-analysis of trials of reductions in serum cholesterol concentration.15

The model presented in this paper included interaction terms which measured the modification of any volume effect according to the degree of adjustment for case mix and the modification of any volume effect related to the year of data collection. It is these interaction terms which are of primary interest in the analysis.
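The structure of such a model can be sketched in code. The following is an illustration of the linear predictor only, not a fit to the review's data and not the authors' GLIM code; the parameter names and numerical values are assumptions chosen for the example. It shows why a study-specific intercept leaves the volume odds ratio unchanged while the interaction term shifts it with the adjustment score.

```python
import math


def predicted_risk(study_intercept, high_volume, adjustment_score,
                   beta_volume, gamma_interaction):
    """Risk of death from a logistic model with a volume-by-score interaction.

    logit(p) = study_intercept
               + high_volume * (beta_volume + gamma_interaction * score)
    where high_volume is 1 for high volume hospitals and 0 otherwise.
    """
    logit = study_intercept + high_volume * (
        beta_volume + gamma_interaction * adjustment_score)
    return 1.0 / (1.0 + math.exp(-logit))


def volume_odds_ratio(study_intercept, adjustment_score,
                      beta_volume, gamma_interaction):
    """Odds ratio (high v low volume) implied by the model for one study."""
    p_high = predicted_risk(study_intercept, 1, adjustment_score,
                            beta_volume, gamma_interaction)
    p_low = predicted_risk(study_intercept, 0, adjustment_score,
                           beta_volume, gamma_interaction)
    return (p_high / (1 - p_high)) / (p_low / (1 - p_low))


# Hypothetical parameter values, for illustration only.
beta = math.log(0.6)   # log odds ratio for volume at adjustment score 0
gamma = 0.1            # change in log odds ratio per point of adjustment

for intercept in (-3.5, -2.5):          # two studies with different baseline risk
    for score in range(4):              # four point adjustment scale
        ratio = volume_odds_ratio(intercept, score, beta, gamma)
        print(f"intercept={intercept} score={score} odds ratio={ratio:.3f}")
```

The printed odds ratio depends on the score but not on the intercept, which is what estimating volume effects "within studies" amounts to in this formulation.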

The statistical models were initially fitted to data from the six studies with the division of high and low volume near 200 cases a year. The simplest model estimated the change in the odds of death in high compared with low volume hospitals (model A). This estimate gives the effect of volume averaged over the six studies. The interaction of the adjustment score for case mix with volume was added to this model. The adjustment score was treated as a linear trend in the interaction (model B). A similar model was fitted which included an interaction term with year (model C). Overdispersion (residual heterogeneity) was accounted for in the models by appropriately scaling the standard errors (see appendix).16 The significance of the estimates and their confidence intervals were calculated by using scaled standard errors. The statistical analysis was performed with GLIM statistical software.17

Data from the study that used a cut off point which could not be directly linked to hospital volumes were included in a sensitivity analysis. The models were refitted to the data from all seven studies and changes in parameter estimates and significance levels were examined to investigate the robustness of the findings.

Results

Fifteen studies were identified that examined the relation between volume and outcome in coronary artery bypass graft surgery (tables II and III).7 8 9 13 18 19 20 21 22 23 24 25 26 27 28 All studies used observational data from the United States. Several studies used data from the same source and time period (table II), and the data were included only once for each set of studies.7 8 18 20 21 25 26 A further study was excluded as all hospitals performed more than 200 procedures a year. This study, which compared only five hospitals, found no significant relation between volume and outcome.13

TABLE II

Studies included in review

TABLE III

Studies excluded from review

Hospital discharge abstracts were the main data source for all studies. The prognostic variables controlled for in these 15 studies varied from simple age and sex to some clinical risk factors. There were large differences between the numbers of hospitals and patients included in the studies and between categories of volume (table II). The cut off points used to define high and low volume varied between 150 and 223 cases a year. One study presented data on a 20% sample of elderly beneficiaries of Medicare.20 As it was unclear how these volumes of patients related to hospital volumes the results of this study were included in the sensitivity analysis.

All of the studies included in the analysis reported a positive relation between volume and outcome, with five of the seven showing this result as significant (one being included in the sensitivity analysis). Of the three studies which included volume of physicians, one found a positive relation between volume and outcome26 and two did not.13 24

Figure 1 shows the estimates of the benefit (odds ratios of mortality) associated with carrying out more than 200 procedures a year compared with fewer than 200 procedures a year for each study plotted against the four point adjustment scale for case mix. The blocks indicate the estimate of the odds ratio, and their size relates to the size of the study. As can be seen, studies with adequate adjustment for case mix have higher odds ratios and so lower estimates of the benefit of high volume. Figure 2 repeats this for year of study.

FIG 1

Estimated effect on mortality of high volume hospital compared with low volume hospital by degree of adjustment for case mix

FIG 2

Estimated effect on mortality of high volume hospital compared with low volume hospital by year of data collection

Table IV gives details of the statistical modelling of these trends. Model B shows that the interaction term between volume and the degree of adjustment is significant and greater than one. This means that, as the degree of adjustment for case mix increases, the estimate of the advantage of increased volume is significantly diminished. Model C shows that year is also related to the estimate of the volume effect: recent studies have found weaker effects of volume than the early reports. Baseline risk was added as an explanatory variable but did not materially alter the estimate of the adjustment-volume interaction. The degree of adjustment for case mix has improved with time, and so year and adjustment variables are highly correlated (Spearman's rank correlation coefficient=0.79: n=6; P<0.1).

TABLE IV

Statistical modelling of trends depending on adjustment for volume or year, or both

Discussion

APPARENT BENEFIT DECREASES WITH ADJUSTMENT FOR CASE MIX

These results indicate that even though most of the studies have suggested a positive relation between volume and outcome this might be confounded by differences in case mix between high and low volume hospitals. The analysis indicates that there is a significant association between the degree of adjustment in volume-outcome studies and their reported effect of volume on mortality. In the main analysis the odds ratio associated with treatment in high rather than low volume hospitals changed from 0.54 to 0.84 as the adjustment changed from a score of 0 to a score of 3. In other words, as the degree of adjustment is increased the estimated net beneficial effect of increased volume is reduced (odds ratio moves nearer to 1).
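The two odds ratios quoted above can be turned into an approximate per-point effect. The sketch below assumes, as the methods describe for model B, that the adjustment score enters as a linear trend on the log odds scale, so each extra point multiplies the odds ratio by a constant factor. The function names are hypothetical, and because the calculation starts from the rounded figures 0.54 and 0.84 the result may differ slightly from the fitted estimate in table IV.

```python
import math


def per_point_multiplier(or_at_start, or_at_end, steps):
    """Constant factor by which the odds ratio changes per step,
    assuming a linear trend in the log odds ratio."""
    return math.exp((math.log(or_at_end) - math.log(or_at_start)) / steps)


def odds_ratio_at(score, or_at_zero, multiplier):
    """Odds ratio implied at a given adjustment score."""
    return or_at_zero * multiplier ** score


# Odds ratios quoted in the text for adjustment scores 0 and 3.
theta = per_point_multiplier(0.54, 0.84, 3)
print(f"multiplier per adjustment point: {theta:.2f}")

for score in range(4):
    print(score, f"{odds_ratio_at(score, 0.54, theta):.2f}")
```

A multiplier above 1 corresponds to the interaction term "greater than one" reported for model B: each improvement in adjustment moves the volume odds ratio closer to 1.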

This relation could not be explained by looking at differences between studies in the baseline risk in low volume units. The inclusion of an additional study which used different definitions of high and low volume weakened the apparent relation with case mix but not with year. It is unclear whether more extensive and detailed adjustment for the effects of case mix would further reduce or indeed increase this effect as such data are not available.

The overdispersion evident in models A and B indicates unexplained heterogeneity between the studies, which suggests that there may well be other factors accounting for the apparent effect of volume on hospital mortality. The analysis presented here has been restricted by the amount of available data. In addition, linear trends have been assumed, but in the absence of more studies the form of the relations could not be explored in more detail.

APPARENT BENEFIT DECREASES OVER TIME

The analysis also showed that the size of the estimated benefit of high volume reduced over time. If the year of study was treated as a linear trend in the logistic analysis the odds ratio associated with treatment in high rather than low volume hospitals changed from 0.54 in 1972 (the earliest data included) to 0.95 in 1991 (the most recent data included).
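As a rough worked example, and assuming the linear trend acts on the log odds ratio as in a logistic model, the two figures above imply an approximate change per calendar year. The symbol $\gamma_{\text{year}}$ is introduced here for illustration, and the calculation uses the rounded odds ratios quoted in the text, so it is indicative only.

$$\gamma_{\text{year}} \approx \frac{\ln(0.95) - \ln(0.54)}{1991 - 1972} = \frac{\ln(0.95/0.54)}{19} \approx 0.030, \qquad e^{\gamma_{\text{year}}} \approx 1.03$$

On this reading the odds ratio for high versus low volume rose by about 3% a year towards 1 over the period studied.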

It is not possible to determine the relative importance of the adjustment for effects of case mix and year; both may be working at the same time. Large volume hospitals may achieve their high rate of activity by lowering their threshold for treatment and operating on patients who are less severely ill. At the same time, general experience of coronary artery bypass surgery is increasing and surgeons as a whole might have advanced along the learning curve, so reducing earlier apparent differentials.

These results are based on comparisons between studies, but several of the reports contain comparisons within studies which support the between study findings. One study contained data from both 1972 and 1982 and showed that the volume effect decreased throughout the decade.7 Five studies presented both crude and adjusted rates: three studies with low adjustment scores showed no or relatively small reductions in the volume effect with increased adjustment.7 18 21 Two of the analyses of the New York State data, however, found that adjustment actually increased the estimated volume effect.9 26 In these studies few patients were treated in low volume hospitals and so were given low weights in the analysis.

This analysis is based exclusively on published studies which have examined the relation between volume and mortality. As with all reviews, the analysis might be subject to publication bias, where, for example, researchers finding no relation after adjusting for case mix might not submit for publication.29 Inasmuch as there may be publication bias, the results presented in this paper are likely to underestimate the impact of variability of adjustment for case mix.

There have been several recent reviews of the literature exploring the relation between volume and outcome.30 31 32 33 One review focused entirely on the volume-outcome relation in coronary artery bypass graft surgery.34 Although the issue of case mix and other possible sources of confounding have been noted, usually they are ignored when conclusions are drawn from the literature. This paper is possibly the first published attempt to analyse the possible size of these biases by estimating the effect of increasing the adequacy of adjustment.

VOLUME EFFECT OFTEN ASSUMED BY POLICY MAKERS

Our finding is of importance to both researchers and policy makers. A positive relation between volume and outcome seems to have been assumed by many policy makers and used to justify the concentration of health care delivery into larger units. For example, in the Netherlands coronary artery bypass graft surgery has been regionalised by regulation.34 A hospital must obtain a licence to carry out this procedure, after which a minimum of 600 procedures a year must be performed.

There are several problems in the interpretation of research, such as deciding the direction of any causal relation—whether volume affects quality or whether better units and clinicians attract more patients.35 36 There is little evidence about the degree to which any volume advantages operate at the hospital or the clinician level,33 37 and the results of studies examining the effect of volume of surgeons are contradictory and complex.13 24 26 38 The overreliance on mortality as an indicator of quality also limits the analyses.

LITTLE SUPPORT FOR AUTOMATIC EFFECT OF VOLUME ON OUTCOME

It is not our purpose to argue for or against the existence of a relation between volume of activity and patient outcome. The analysis has shown, however, that estimates of benefit suggested in the literature are likely to be biased because of inadequate adjustment for case mix. In other specialties, for example cancer services, higher volumes have generally also been associated with lower mortality but rarely is there any adjustment for case mix.39 The results of the analysis presented here cannot necessarily be generalised to other specialties. It should, however, signal the need for caution in interpreting observational studies, especially when there has been little attempt at considering the effects of confounding or the accuracy of the data used for adjustment.40

The degree of adjustment for medical history and concomitant medical conditions crucially affects estimates of the volume-outcome relation in coronary artery bypass graft surgery. Observational studies with routinely available data may have overestimated the effect of high volume of activity on the quality of care.

We acknowledge the helpful comments of Professors Alan Maynard and Alan Williams and Dr Ian Watt and the statistical referee.

Appendix

Taking account of overdispersion

When the residual deviance (unexplained variation) is substantially greater than the degrees of freedom in a logistic model this indicates that there is overdispersion or heterogeneity between the studies being combined which is not adequately explained by the model. In such cases the estimated standard errors are too small and significance inflated. To take account of this overdispersion and calculate more conservative confidence intervals the standard errors are adjusted by multiplying by a scaling factor calculated as41

$$\sqrt{\frac{\text{residual deviance}}{\text{residual degrees of freedom}}}$$
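The scaling described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the GLIM procedure used in the analysis: the function names are hypothetical, the deviance, degrees of freedom, estimate, and standard error are invented, and the interval uses a normal approximation with z = 1.96.

```python
import math


def overdispersion_scale(residual_deviance, residual_df):
    """Scaling factor: square root of residual deviance over residual df."""
    return math.sqrt(residual_deviance / residual_df)


def scaled_confidence_interval(log_odds_ratio, standard_error, scale, z=1.96):
    """Confidence interval for an odds ratio using the scaled standard error."""
    se = standard_error * scale
    return (math.exp(log_odds_ratio - z * se),
            math.exp(log_odds_ratio + z * se))


# Hypothetical model output, for illustration only.
deviance, df = 20.0, 5
estimate, se = math.log(0.70), 0.05

scale = overdispersion_scale(deviance, df)
unscaled = scaled_confidence_interval(estimate, se, 1.0)
scaled = scaled_confidence_interval(estimate, se, scale)

print(f"scale factor: {scale:.2f}")
print(f"unscaled 95% CI: {unscaled[0]:.2f} to {unscaled[1]:.2f}")
print(f"scaled 95% CI:   {scaled[0]:.2f} to {scaled[1]:.2f}")
```

With a deviance well above its degrees of freedom the scale factor exceeds 1, so the scaled interval is wider and the corresponding significance test more conservative.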

Analysis of overdispersion for the fitted models is as follows:

Footnotes

  • Source of funding The authors are supported by the Department of Health and received some support for this work from the Yorkshire Collaborating Centre for Health Services Research.

  • Conflict of interest None.

References
