BMJ 2003;327:13-17 (5 July), doi:10.1136/bmj.327.7405.13
Ben Bridgewater, consultant cardiac surgeon1, Anthony D Grayson, regional clinical information analyst2, Mark Jackson, head of clinical governance2, Nicholas Brooks, consultant cardiologist1, Geir J Grotte, consultant cardiac surgeon3, Daniel J M Keenan, consultant cardiac surgeon3, Russell Millner, consultant cardiothoracic surgeon4, Brian M Fabri, consultant cardiac surgeon2, Mark Jones, consultant cardiothoracic surgeon1
1 South Manchester University Hospital, Manchester M23 9LT, 2 Cardiothoracic Centre, Liverpool L14 3PE, 3 Manchester Royal Infirmary, Manchester M13 9WL, 4 Blackpool Victoria Hospital, Blackpool FY3 HNR
Correspondence to: Ben Bridgewater ben.bridgewater{at}smuht.nwest.nhs.uk
Design Retrospective analysis of prospectively collected data.
Setting All NHS centres in the geographical north west of England that undertake cardiac surgery in adults.
Participants All patients undergoing isolated bypass graft surgery for the first time between April 1999 and March 2002.
Main outcome measures Surgeon specific postoperative mortality and predicted mortality by EuroSCORE.
Results 8572 patients were operated on by 23 surgeons. Overall mortality was 1.7%. Observed mortality between surgeons ranged from 0% to 3.7%; predicted mortality ranged from 2% to 3.7%. Eighty five per cent (7286) of the patients had a EuroSCORE of 5 or less; 49% of the deaths were in this lower risk group. A large proportion of the variability in predicted mortality between surgeons was due to a small but differing number of high risk patients.
Conclusions It is possible to collect risk stratified data on all patients undergoing coronary bypass surgery. For most the predicted mortality is low. The small proportion of high risk patients is responsible for most of the differences in predicted mortality between surgeons. Crude comparisons of death rates can be misleading and may encourage surgeons to practise risk averse behaviour. We recommend a comparison of death rates that is stratified by risk and based on low risk cases as the national benchmark for assessing consultant specific performance.
The Society of Cardiothoracic Surgeons of the United Kingdom and Ireland had been planning to undertake an analysis on low risk patients, but because of a lack of an appropriate dataset it is now planning to publish individual surgeons' crude mortality data later this year.2 Two possible datasets could be analysed in the United Kingdom: hospital episode statistics, which are known to be inaccurate at the level of individual clinicians, and returns of crude mortality that have been made to the Society of Cardiothoracic Surgeons of the United Kingdom and Ireland's annual register on the basis of individual surgeons since 1997. Neither dataset has been subjected to rigorous validation, and the society's returns have no mechanism for risk stratification. Although hospital episode statistics can be partly adjusted for case mix by age, sex, and urgency, these are known to be only a few of the many patient specific factors that contribute to predicted operative mortality.3-5 The Society of Cardiothoracic Surgeons has collected a more comprehensive dataset since 1996,1 but this has been voluntary, and not all hospitals and surgeons have contributed. It cannot be used for a comprehensive comparative analysis in the United Kingdom.
We have a long track record of collecting cardiac surgery audit data and validating risk prediction models in Manchester.6-8 Recently we have collected a full dataset on all patients undergoing adult cardiac surgery in NHS institutions in the north west of England since April 1997.8 9 We have analysed this database to explore differences between crude mortality and risk stratified results for surgeon specific publication.
We collected data prospectively on a total of 8572 consecutive patients undergoing isolated coronary artery bypass graft surgery for the first time between 1 April 1999 and 31 March 2002 in the north west of England. Data collection methods and definitions have been described in detail previously.8 9 Each patient had a dataset collected, which included data from before and after the operation, to enable a predicted mortality to be calculated. Data were collected in each institution and returned to a central source for analysis. Each centre conducted validation of activity and analysis. Mortality was defined as any in-hospital death. Every patient's record contained an anonymised identifier for each consultant surgeon. Data were analysed for all consultants who were operating in the region on 1 April 2003.
Statistical analysis
Categorical data are shown as a percentage whereas continuous data are
shown as a mean with a range. We determined crude mortality for each surgeon.
We calculated predicted mortality for each patient by using the additive
EuroSCORE,10 a scoring system derived from an analysis of 19 000 patients throughout Europe
that was reported in 1999. The EuroSCORE ascribes additive points to several
risk factors related to the patient and the procedure, to generate a predicted
mortality for each patient. It has been shown to be a good overall predictor
of mortality for both European and North American surgery.11 12 If a patient related
factor necessary to calculate the EuroSCORE was missing in the record that
factor was assumed to be absent (this occurred in less than 2% of cases). We
examined the distribution of patients in each EuroSCORE group and compared the
predicted mortality for each surgeon. Owing to the non-normal distribution of
predicted mortality between surgeons, we determined variability by the
interquartile range. We used the EuroSCORE to determine low (≤ 5) and high (> 5) risk
groups10 and compared observed mortality and variability between the two groups. We
calculated the C statistic (equivalent to the area under the receiver
operating characteristic curve) to assess the performance of the EuroSCORE. A
C statistic of greater than 0.7 indicates a reasonable ability to discriminate
between patients who died and those who did not.13 14 We calculated the C
statistic for the total population, low and high risk groups. We examined
surgeon specific mortality in the low risk group by comparing each surgeon's
death rates, plotted with 95% confidence intervals, against the mean
performance for the region in this group of patients. We analysed the effect
of volume of cases on mortality in the low risk group by the χ² test for trend
after rank ordering surgeons and categorising them as either
low, middle, or high volume thirds. We used SAS for Windows version 8.2 to
perform all analyses.
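The C statistic described above can be illustrated with a minimal sketch: it is the probability that a randomly chosen patient who died had a higher score than a randomly chosen survivor, with ties counted as one half. The function name and the toy data are illustrative assumptions; this is not the SAS procedure used for the study and the numbers are not study data.

```python
def c_statistic(scores, died):
    """Rank-based C statistic (area under the ROC curve).

    scores: predicted risk score per patient (for example, additive EuroSCORE)
    died:   matching booleans, True if the patient died in hospital
    """
    deaths = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")

    concordant = 0.0
    for score_death in deaths:
        for score_survivor in survivors:
            if score_death > score_survivor:
                concordant += 1.0      # the death was ranked as higher risk
            elif score_death == score_survivor:
                concordant += 0.5      # ties count as half
    return concordant / (len(deaths) * len(survivors))


# Hypothetical toy data, not taken from the study.
toy_scores = [1, 2, 3, 6, 8, 4]
toy_died = [False, False, False, True, True, False]
auc = c_statistic(toy_scores, toy_died)
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of deaths from survivors. The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch but would normally be replaced by a rank-based calculation for a dataset of several thousand records.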
Predicted and observed mortality
Figure 2 shows the number of
patients in each EuroSCORE group. Predicted mortality in most patients was
low. Figures 3 and
4 show the number of deaths and
observed percentage mortality in each EuroSCORE group. A large number of
patients were in the low EuroSCORE groups, but the percentage mortality was
low (less than 2%). In general, observed mortality increased with increasing
EuroSCORE. For lower risk patients the EuroSCORE overpredicted observed
mortality. In EuroSCORE groupings of 14 and above, the observed mortality was
substantially in excess of the EuroSCORE. The EuroSCORE was a good predictor
of overall mortality as shown by a C statistic of 0.75. The mean predicted
mortality was 3.0% (range 2.0% to 3.7%), indicating a difference of nearly
100% between surgeons at the outer limits of the group. The overall
variability between surgeons was high, as shown by an interquartile range of
0.64.
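The two summary measures used here can be checked with a short sketch. The relative difference uses the range reported in the text; the per-surgeon values fed to the interquartile range are hypothetical, and the quartile method (the default of Python's `statistics.quantiles`) is an assumption that may differ from the one used in the original analysis.

```python
import statistics


def relative_spread(lowest, highest):
    """Difference between the extremes as a fraction of the lowest value."""
    return (highest - lowest) / lowest


def interquartile_range(values):
    """Distance between the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


# Range of surgeons' mean predicted mortality reported in the text (%).
spread = relative_spread(2.0, 3.7)

# Hypothetical per-surgeon predicted mortalities (%), for illustration only.
hypothetical = [2.0, 2.4, 2.7, 2.9, 3.0, 3.1, 3.3, 3.7]
iqr = interquartile_range(hypothetical)
```

With the reported range, the highest predicted mortality is 85% greater than the lowest, which is the basis for the "nearly 100%" description.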
Comparison of low and high risk patients
Eighty five per cent of the total number of patients had a EuroSCORE of 5
or less. Almost half of all observed deaths were in the low risk group (49%).
The remaining 51% of deaths were in the 15% of cases in the high risk group
(EuroSCORE > 5). The proportion of individual surgeons' practices that are
high risk ranged from 5.6% to 23.9%. For the low risk group the observed
mortality was 1.0% (range 0% to 2.9% between surgeons), with a mean predicted
mortality of 2.3% (range 1.7% to 2.7% between surgeons). For the high risk
group the observed mortality was 5.7% (range 0% to 13.6% between surgeons),
with a mean predicted mortality of 7.4% (range 6.6% to 8.3% between surgeons).
The C statistic indicating predictive ability of the EuroSCORE for the low
risk and high risk groups was 0.72 and 0.62, respectively, indicating a
satisfactory predictive ability for low risk patients but an unsatisfactory
ability for those having high risk surgery.13 14
The variability in predicted mortality between surgeons according to the interquartile range in the low and high risk patients was 0.32 and 0.67, respectively, showing that the low risk patients in each surgeon's practice are a relatively homogeneous group, but there is much greater variation between surgeons in the high risk population. Figure 5 shows for the low risk patients that the 95% confidence intervals around mortality for each surgeon operating in the north west overlap the mean mortality for the region, indicating no surgeon is experiencing mortality results that are different from the peer group. We found a strong univariate association between the volume of operations that each surgeon had performed and observed mortality in the low risk group (P < 0.001) (table 2).
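The comparison of each surgeon's low risk mortality with the regional mean can be sketched as follows. The text does not state which method was used for the 95% confidence intervals, so the Wilson score interval here is an assumption, and the surgeon's counts are hypothetical; only the regional low risk mortality of 1.0% comes from the text.

```python
import math


def wilson_interval(deaths, cases, z=1.96):
    """Wilson score confidence interval for a mortality proportion."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    p = deaths / cases
    denom = 1 + z * z / cases
    centre = (p + z * z / (2 * cases)) / denom
    half = z * math.sqrt(p * (1 - p) / cases + z * z / (4 * cases * cases)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def differs_from_region(deaths, cases, regional_rate):
    """True if the regional rate lies outside the surgeon's interval."""
    low, high = wilson_interval(deaths, cases)
    return not (low <= regional_rate <= high)


# Hypothetical surgeon: 5 deaths in 300 low risk cases, compared with the
# regional low risk mortality of 1.0% reported in the text.
low, high = wilson_interval(5, 300)
flagged = differs_from_region(5, 300, 0.010)
```

In this hypothetical case the interval contains the regional rate, so the surgeon would not be identified as different from the peer group. With a few hundred cases and a mortality near 1%, the intervals are wide, which is one reason why overlap with the regional mean is the expected finding.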
Strengths and weaknesses of the study
Our study has been conducted on a large population of patients undergoing
surgery over three years. The average number of cases was 372 per surgeon,
which is a reasonable size to allow comparisons to be made. The study has been
conducted in the north west of England and includes all patients undergoing
surgery in NHS hospitals in a defined geographical area.8 9 This is about one
eighth of all cardiac surgical activity in the United Kingdom. The data have
the confidence of clinicians, which should reassure patients that benchmarking
between surgeons is meaningful, helps surgeons believe any differences that
emerge, and encourages changes in practice to be made where necessary. This
project shows that where there is clinical and management commitment,
collecting robust, comprehensive data is possible and useful.
The dataset we have used undergoes local validation in each centre but has not been subjected to external validation, which is a weakness of our study. It has been shown previously that some problems arise with the completeness and reliability of this type of data.15 We have addressed issues of incomplete data by assuming that any risk factor that has a missing field is negative for that risk factor. The incidence of missing data in our study was less than 2%, but this would lead to a small overall underestimate of predicted risk.
We have used low and high risk groupings to allow meaningful comparisons to be made10; the low risk group contains most patients and has a low variability in predicted risk between surgeons. Using the low risk group for mortality analysis and benchmarking excludes higher risk patients from comparisons. Clinically, high risk patients are a heterogeneous group, ranging from stable patients with multiple comorbidities to patients who come to surgery as emergencies, often directly from the cardiac catheter laboratory. We have not compared surgeons' death rates in the high risk group as predicted mortality differs between surgeons, the proportion of individual surgeons' patients who are high risk varies, and the EuroSCORE is a poor predictor for this population. However, half of the deaths in the population of patients are seen in this high risk group, and politicians and the public may be wary of excluding this many deaths from comparative analysis. It is also possible that by not analysing the high risk group we may be losing important messages about performance, which may be useful for improving quality.
Strengths and weaknesses compared with other studies
Healthcare outcomes can be benchmarked in several different ways. An
approach that has been suggested and used elsewhere is to allocate a predicted
mortality for each surgeon on their total practice of coronary artery surgery
by using a mortality prediction tool, and comparing predicted with observed
mortality to generate an adjusted death rate.16 In our
study patients in the highest EuroSCORE groups (14 and above) have an observed
mortality in excess of 50% (fig 5), and a small number of these patients in a surgeon's practice
would affect their adjusted mortality adversely. This number varies markedly
between surgeons. Because the EuroSCORE is a poor predictor in the high risk
group as shown by the C statistic, we think that using adjusted death rates
may produce erroneous conclusions.
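The sensitivity of an adjusted death rate to a few high risk patients can be illustrated with a minimal sketch. It assumes the common form of adjustment, the ratio of observed to expected deaths multiplied by the population death rate; the text does not give the exact formula, and the practices below are hypothetical. The population rate of 1.7% is the overall mortality reported in this study.

```python
def adjusted_death_rate(observed_deaths, predicted_risks, population_rate):
    """Observed/expected ratio scaled by the population death rate.

    predicted_risks: per-patient predicted probability of death (0 to 1)
    """
    expected_deaths = sum(predicted_risks)
    if expected_deaths <= 0:
        raise ValueError("expected deaths must be positive")
    return observed_deaths / expected_deaths * population_rate


# Hypothetical practice: 100 patients, each with a 2% predicted risk,
# of whom 2 die.
base_risks = [0.02] * 100
base = adjusted_death_rate(2, base_risks, 0.017)

# The same practice plus two very high risk patients whose predicted risk
# (20% each, hypothetical) understates their true risk; both die.
with_high_risk = adjusted_death_rate(4, base_risks + [0.20, 0.20], 0.017)
```

In this example two additional patients raise the adjusted rate from 1.7% to about 2.8%, because the model credits the surgeon with only 0.4 expected deaths for two observed deaths.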
Although the EuroSCORE is generally regarded as a good overall predictor of mortality for patients undergoing heart surgery,10-12 it has been noticed previously that it underpredicts risk in high risk patients,17 but the effects of this observation on the publication of surgeon specific mortality have not been described. The EuroSCORE working group have addressed underprediction of the additive score by producing a logistic regression model,17 the logistic EuroSCORE, which may be a better predictor in high risk patients, but this has not yet been fully validated and is not widely used. We studied the widely used additive model for our investigation, but failure to examine all available predictive scoring systems is a further limitation of this work.
Several studies have looked at outcomes of individual surgeons or institutions and their relation to volume of surgery.18-20 Some of these have been on crude mortality data and others corrected for case mix. Although we have observed a strong association between volume and outcome in our data, we did not design our investigation to look for this as a primary end point. We believe that this observation should be treated with caution as there are numerous possible effects, including time and learning curve effects, which were not controlled for by our study design.
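The univariate test for trend across low, middle, and high volume thirds can be sketched as below. This uses the Cochran-Armitage form of the chi-squared test for trend with equally spaced group scores, which is an assumption about the exact variant; the study analysis was run in SAS, and the counts here are hypothetical.

```python
import math


def chi_square_trend(events, totals, scores=None):
    """Chi-squared test for linear trend in proportions (1 degree of freedom).

    events: deaths in each ordered group
    totals: cases in each ordered group
    scores: numeric score per group; defaults to 0, 1, 2, ...
    Returns (chi-squared statistic, two sided P value).
    """
    if scores is None:
        scores = list(range(len(totals)))
    n = sum(totals)
    r = sum(events)
    p = r / n
    sum_x = sum(t * x for t, x in zip(totals, scores))
    sum_xx = sum(t * x * x for t, x in zip(totals, scores))
    # Observed minus expected sum of scores among the deaths.
    t_stat = sum(e * x for e, x in zip(events, scores)) - r * sum_x / n
    variance = p * (1 - p) * (sum_xx - sum_x * sum_x / n)
    if variance <= 0:
        raise ValueError("variance must be positive")
    chi2 = t_stat ** 2 / variance
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical deaths and cases in low, middle, and high volume thirds.
deaths = [30, 20, 10]
cases = [1000, 1000, 1000]
chi2, p_value = chi_square_trend(deaths, cases)
```

Because the test is univariate, a small P value shows only that mortality changes monotonically across the volume groups; it does not adjust for case mix, time, or learning curve effects.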
Meaning of the study
We believe that publishing surgeon specific, crude mortality
data,2 as is planned
in the United Kingdom, is not in the best interests of patients, and our study
shows that surgeons cannot be compared fairly in this way. Cardiac surgeons
already work in a stressful environment, and the perception that a "bad
run" might jeopardise their career or result in suspension and
investigation may lead to a tendency to turn down high risk cases. The easiest
way to obtain low mortality is to do only straightforward operations (so
called risk averse behaviour). This has already been identified as a potential
problem after a survey of all cardiac surgeons in the United Kingdom in 2000,
where 94% of responders agreed that high risk patients were being turned down
for surgery.1 Death
rates in these patients often approach 100% if the patients are denied surgery
and patients at heightened risk from surgery are, in general, those who have
the most to gain from a successful
operation.21 Our
recommendation of benchmarking only low risk patients seems scientifically
justified and pragmatic and should help to prevent risk averse behaviour.
Unanswered questions and future research
Some evidence from North America sheds light on the effects of publication
of surgeon specific data on patients, cardiologists, and
surgeons,1 22 23 but we do not know to
what extent initiatives to publish crude mortality data for individual
surgeons will actually deny operations to high risk patients, and what
implications this will have on patients' survival, quality of life, and use of
healthcare resources. This is an important area for future studies. Further
investigations are also needed on high risk patients, to improve the quality
of risk prediction in this group, and to understand variability in outcomes
following high risk surgery for quality improvement purposes.
This study has been conducted on behalf of the North West Quality Improvement Programme in Cardiac Interventions, and the participating consultant surgeons are listed as follows: John Au, Ben Bridgewater, Colin Campbell, John Carey, John Chalmers, Walid Dhimis, Abdul Deiraniya, Andrew Duncan, Brian Fabri, Elaine Griffiths, Geir Grotte, Ragheb Hasan, Tim Hooper, Mark Jones, Daniel Keenan, Neeraj Mediratta, Russell Millner, Nick Odom, Brian Prendergast, Mark Pullan, Abbas Rashid, Paul Waterworth, Nizar Yonan. We would like to acknowledge the assistance of the audit officers working in each centre for their hard work in collecting and validating the data.
Contributors: BB had the idea for the study and with ADG and MJ was responsible for the study design. Data analysis was performed by ADG and MJ. The manuscript was prepared by BB and ADG. All authors contributed to writing the paper, which was written on behalf of the North West Quality Improvement Programme in Cardiac Interventions. BB will act as guarantor.
Funding: All primary care trusts in the north west of England.
Competing interests: None declared.
Ethical approval: The project was conducted on routinely collected prospective data. All patient identifiers were anonymised. The study therefore did not need ethical approval.