Paris P Tekkis a Academic Department of
Surgery, King's College Hospital, London SE5 9RS, b Academic Unit of Surgery, University of Liverpool, University
Hospital Aintree, Liverpool L9 7AL, c Department of
Surgery, University Hospital Lewisham, London SE13 6LH, d Department
of Public Health Sciences, St George's Hospital, London SW17 0QT Correspondence to: P P Tekkis ptekkis{at}blueyonder.co.uk
Abstract
Objective:
To design and validate a statistical
method for evaluating the performance of surgical units that adjusts for case volume and case mix.
Design:
Validation study using routinely collected data on in-hospital mortality.
Data sources:
Two UK databases, the ASCOT prospective
database and the risk scoring collaborative (RISC) database, covering
1042 patients undergoing surgery in 29 hospitals for gastro-oesophageal cancer between 1995 and 2000.
Statistical analysis:
A two level hierarchical
logistic regression model was used to adjust each unit's operative
mortality for case mix. Crude or adjusted operative mortality was
plotted on mortality control charts (a graphical representation of
surgical performance) as a function of number of operations. Control
limits defined as 90%, 95%, and 99% confidence intervals identified
units whose performance diverged significantly from the mean.
Results:
The mean in-hospital mortality was
12% (range 0% to 50%). The case volume of the units ranged from one
to 55 cases a year. When crude figures were plotted on the mortality control chart, four units lay outside the 90% control limit, including two outside the 95% limit. When operative mortality was adjusted for
risk, three units lay outside the 90% limit and one outside the 95%
limit. The model fitted the data well and had adequate discrimination
(area under the receiver operating characteristics curve 0.78).
Conclusions:
The mortality control chart is an
accurate, risk adjusted means of identifying units whose surgical
performance, in terms of operative mortality, diverges significantly
from the population mean. It gives an early warning of divergent
performance. It could be adapted to monitor performance across various specialties.
What is already known on this topic
Mortality control charts are another way to compare the performance of healthcare providers, particularly for outcomes of surgery.
What this study adds
Mortality control charts have a "buffer zone" for indicating divergence from the mean mortality and are particularly useful for specialties with a low volume of surgery.
Introduction
For some major types of surgery, operative mortality is an important measure of performance. To reflect performance accurately, however, mortality must be adjusted for the effect of pre-existing comorbid disease, and existing models of risk stratification have several problems.
Gastrectomy and oesophagectomy have the highest mortality among
elective operations in Britain, and evidence about the relation between
case volume and outcome conflicts.1-3 The subspecialty of
upper gastrointestinal cancer surgery exemplifies the general problem
of quantifying surgical risk with adjustment for case mix and volume.
We developed statistical techniques for evaluating surgical performance
on a continuous scale and applied the techniques to data on upper
gastrointestinal cancer surgery.
Data and methods
Data sources
We took data on outcomes of gastro-oesophageal cancer
surgery from two databases on upper gastrointestinal surgery: the
stomach and oesophageal cancer outcome and techniques (ASCOT) prospective database and the risk scoring collaborative (RISC) database. There was no population overlap between the databases. Both
databases provided comprehensive POSSUM (physiological and operative
severity score for the enumeration of mortality and morbidity) data on
large cohorts of gastro-oesophageal surgery patients.4
The ASCOT database
This database on
gastro-oesophageal cancer surgery, which was developed by the British
Oesophago-Gastric Cancer Group, collects a comprehensive dataset on
cases of gastro-oesophageal cancer referred to surgeons, whether or not
an operation actually took place.5 For this study the
database's coordinator used an independent source (hospital episode
statistics) to validate a sample of 157 cases. From January 1999 to
December 2000 the 31 hospitals across the United Kingdom that joined
this voluntary collaboration submitted data on 1036 cases.
The RISC database
This database recorded data on 601 patients undergoing oesophageal and gastric surgery in five hospitals
in the South East and Thames Region, which included cases from general and thoracic surgical units. Of the cases, 351 were recorded
retrospectively from pre-existing databases, case notes, theatre books,
and operating lists, and 250 were prospectively collected from January
1999 to January 2001. The data were independently validated against other hospital data sources (medical records or mortuary registers).
Inclusion and exclusion criteria
We included data on oesophageal and gastric operations for
malignant and benign disease with palliative or curative intent. We
excluded cases where patients were treated medically or by endoscopic
techniques (n=572) and cases with missing notes (n=23).
End point and risk factors
The primary end point was in-hospital mortality (any death during
the same hospital admission as the operation), which can be more
reliably quantified than 30 day mortality and includes patients with
complications who remained in hospital beyond 30 days. Risk factors
studied were age; sex; POSSUM score; surgical procedure; mode of
surgery (emergency or elective); tumour staging; and malignancy
(according to POSSUM category).
Statistical analysis
We used univariate analysis to identify risk factors for
mortality. Continuous variables were grouped into subcategories, and
unifactorial logistic regression was used to compare these with a
reference level. We used the χ² test to analyse
categorical variables. We used a multifactorial logistic regression
model to adjust for case mix. See bmj.com for details of the model and
statistical analysis.
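One common way to turn a fitted case mix model into a risk adjusted mortality for each unit is indirect standardisation: divide the observed deaths by the deaths the model expects, then scale by the group mean. The sketch below illustrates that idea only. It is an assumption, not the authors' hierarchical model (whose details are on bmj.com), and the function name `risk_adjusted_mortality` and the example figures are hypothetical.

```python
from math import fsum

def risk_adjusted_mortality(outcomes, predicted_risks, group_mean):
    """Indirectly standardised mortality for one unit.

    outcomes        -- 1 for an in-hospital death, 0 for survival, per patient
    predicted_risks -- model-predicted probability of death, per patient
                       (assumed to come from a previously fitted case mix model)
    group_mean      -- mean mortality across all units, as a proportion
    """
    if len(outcomes) != len(predicted_risks):
        raise ValueError("one predicted risk is needed per patient")
    observed = sum(outcomes)
    expected = fsum(predicted_risks)
    if expected == 0:
        raise ValueError("expected deaths must be greater than zero")
    # Observed/expected ratio, rescaled to the group's mean mortality.
    return observed / expected * group_mean

# Hypothetical unit: 10 high-risk patients (20% predicted risk each), 1 death.
unit_outcomes = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
unit_risks = [0.2] * 10
print(risk_adjusted_mortality(unit_outcomes, unit_risks, group_mean=0.12))
```

In this hypothetical unit the crude mortality is 10%, but because the case mix predicts two deaths and only one occurred, the adjusted figure falls to about half the group mean.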
Mortality control chart
This graphical method for
monitoring surgical performance plots units' mortality as a function of
number of operations. The exact binomial distribution is used to
construct control limits (90%, 95%, and 99% confidence intervals)
around the mean operative mortality for the group. These control limits indicate whether a particular unit's operative mortality differs significantly from the mean at 10%, 5%, and 1% significance levels. Each unit's operative mortality (unadjusted or adjusted for case mix)
can be plotted as a single point representing the total mortality or as
a running mean as a function of the number of operations done.
Underperforming units will lie above the upper control limits, while
units with unusually good results will lie below the lower control
limits. Units lying within the 95% control limits have an operative
mortality that is statistically consistent with the group mean.
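The construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names (`binom_cdf`, `control_limits`, `outside_limits`) are hypothetical, and splitting the tail probability equally between the upper and lower limits is an assumption, since the abridged text does not give that detail. The group mean of 12% is the figure reported in the abstract.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def control_limits(n, p, level=0.95):
    """Exact binomial control limits for a unit with n operations.

    Returns (lower, upper) as mortality proportions. A unit is outside the
    limits when its mortality is below `lower` or above `upper`.
    Assumption: the tail probability (1 - level) is split equally in two.
    """
    tail = (1 - level) / 2
    # Smallest death count whose cumulative probability exceeds the lower tail.
    lower = next(k for k in range(n + 1) if binom_cdf(k, n, p) > tail)
    # Smallest death count whose cumulative probability reaches 1 - tail.
    upper = next(k for k in range(n + 1) if binom_cdf(k, n, p) >= 1 - tail)
    return lower / n, upper / n

def outside_limits(deaths, n, p, level=0.95):
    """Classify a unit as 'above', 'below', or 'within' the control limits."""
    lower, upper = control_limits(n, p, level)
    rate = deaths / n
    if rate > upper:
        return "above"
    if rate < lower:
        return "below"
    return "within"

# Limits narrow as the number of operations grows (group mean 12%).
for n_ops in (5, 20, 55, 200):
    for level in (0.90, 0.95, 0.99):
        print(n_ops, level, control_limits(n_ops, 0.12, level))
```

Plotting these limits against the number of operations gives the funnel shape of the chart: wide for low volume units and narrowing towards the group mean as volume increases.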
Results
Of 1637 cases, 1042 (63.7%) satisfied the inclusion criteria: 497 of 1036 cases (47.9%) in the ASCOT database and 545 of 601 cases (90.7%) in the RISC database. Although 36 hospitals contributed data to the study, the analysis was based on data from 29 centres, as seven units did not contribute operated cases. The cases comprised 538 oesophagectomies (51.6%), 443 gastrectomies (42.5%), and 61 palliative bypass procedures (5.9%). Of the operations, 828 (79.5%) were elective and 78 (7.5%) were emergencies; in 136 cases (13.1%) the mode of surgery was not recorded. Nine hundred and nineteen operations (93.7%) were for cancer. The overall in-hospital operative mortality was 12% (9.4% in patients having an elective procedure and 26.9% in patients having an emergency procedure). No evidence of systematic under-reporting of risk factors was shown, and missing data were distributed evenly among the hospitals.
The final multifactorial model used age, POSSUM score, POSSUM malignancy category, and mode of surgery as risk factors (table). Mode of surgery was retained in the model as it is clinically highly relevant and has been reported as an important predictor of outcome.1 The model fitted the data well and had adequate discrimination (area under the receiver operating characteristics curve 0.78 (standard error 0.02)).
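For readers unfamiliar with the discrimination statistic, the area under the receiver operating characteristics curve equals the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The sketch below computes it directly from that definition. The function name `roc_auc` and the example data are hypothetical and are not drawn from the study.

```python
def roc_auc(outcomes, predicted_risks):
    """Area under the ROC curve by pairwise comparison.

    outcomes        -- 1 for death, 0 for survival, per patient
    predicted_risks -- model-predicted probability of death, per patient
    """
    died = [r for y, r in zip(outcomes, predicted_risks) if y == 1]
    survived = [r for y, r in zip(outcomes, predicted_risks) if y == 0]
    if not died or not survived:
        raise ValueError("need at least one death and one survivor")
    # Score 1 when the death had the higher risk, 0.5 for a tie, 0 otherwise.
    score = sum(
        1.0 if d > s else 0.5 if d == s else 0.0
        for d in died
        for s in survived
    )
    return score / (len(died) * len(survived))

# Hypothetical example: two deaths and three survivors.
example_outcomes = [1, 1, 0, 0, 0]
example_risks = [0.9, 0.3, 0.4, 0.2, 0.1]
print(roc_auc(example_outcomes, example_risks))
```

A value of 0.5 indicates no discrimination and 1.0 perfect discrimination, so the reported 0.78 means the model ranked a death above a survivor in roughly four of every five such pairs.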
Units reported between one and 55 operations a year, with mortality ranging from 0% to 50%. The mortality control chart for unadjusted operative mortality shows that four units lay outside the 90% control limit. When operative mortality was adjusted for case mix, however, no unit was shown to underperform at the 95% control limit, and the individual values regress towards the mean (figure 1). Two units had better results than the group average, with risk adjusted operative mortalities of 4.2% and 3.8%. Figure 2 shows the running means of the risk adjusted operative mortality for two of the units (31 and 33), representing two consecutive series of 102 and 166 cases. Despite fluctuations, unit 31 remained within the central part of the graph, whereas unit 33 repeatedly crossed the lower 99% control limit and thus could be said to be a truly outlying unit and a consistently good performer.
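A running mean of the kind plotted for individual units can be computed as cumulative deaths divided by cumulative operations. The sketch below shows the unadjusted version only; a risk adjusted running mean would additionally need each patient's predicted risk. The function name `running_mortality` and the outcome sequence are hypothetical.

```python
from itertools import accumulate

def running_mortality(outcomes):
    """Running mean mortality after each consecutive operation.

    outcomes -- 1 for an in-hospital death, 0 for survival, in the order
                the operations were performed
    """
    # accumulate() yields cumulative deaths; enumerate() supplies the
    # number of operations done so far.
    return [deaths / n for n, deaths in enumerate(accumulate(outcomes), start=1)]

# Hypothetical consecutive series of eight operations with one death.
series = [0, 0, 1, 0, 0, 0, 0, 0]
for n_ops, rate in enumerate(running_mortality(series), start=1):
    print(n_ops, round(rate, 3))
```

Each point of the running mean can then be compared with the control limits for the corresponding number of operations, which is how a unit can be seen to cross a limit repeatedly over a series.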
Discussion
The mortality control chart improves on current methods of evaluating surgical units' performance. It is an accurate, risk adjusted means of identifying outlying units while giving an early warning of units approaching divergence from the mean. Comprehensive, accurate, and reliable data are essential if operative mortality is to be used to compare providers' quality of care.
Validity of the data
The information in the study was a combination of prospective data
and medical records. Centres voluntarily contributed data, and at
present there is no formal system for externally validating the
completeness of the database. Internal validity was established by
comparing the operative mortality for a random sample of five
participating hospitals (157 patients) with hospital episode statistics
obtained independently from the hospitals' information departments. The
two databases reported similar overall mortality (14% in the ASCOT
data and 13.8% in the hospital episode statistics), but they differed
in the individual hospitals' volumes of operations and in the
variability of mortality. Although overall operative mortality in the
units in our study was consistent with recently published data from the
West Midlands region, our units were not randomly selected, and we
cannot be sure how representative they are of all UK hospitals.
However, although the quality of our data is limited, implementation of
such a monitoring system in hospitals should lead to an increased
awareness of the data that need to be collected, with subsequent
improvement in the quality of the data.
Quality of the statistical analysis
Hierarchical regression models are particularly useful in
modelling observations with a hierarchical or clustered structure, such
as patients in different hospitals or pupils in different
schools.6 We used confidence intervals around the providers' performances to compare each unit's performance with the
average, with wider confidence intervals for low volume units. If these
wider limits are not allowed for, low volume providers are more likely
to be ranked misleadingly at the top or bottom of the group. Confidence
intervals can be placed around a unit's rank, thus emphasising "the
caution with which any league tables must be
treated."7
Control limits in the mortality chart define outlying units and give an early warning when a unit's performance starts to diverge from the population mean. The less extreme control limits delineate an early warning "buffer zone" to trigger examination of practice. Because the control limits are much wider for low volumes, a high (risk adjusted) operative mortality in these hospitals may require longer monitoring to establish a meaningful estimate of mortality.
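To give a feel for how much longer monitoring a low volume unit might need, the sketch below searches for the smallest number of operations at which a sustained observed mortality would lie above the upper control limit. It is an illustration under stated assumptions, not part of the study: the function names `upper_tail` and `operations_needed` are hypothetical, the tail probability is split equally between upper and lower limits, and the observed death count is taken as the sustained rate multiplied by the number of operations, rounded down.

```python
from math import comb, floor

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def operations_needed(observed_rate, group_mean, level=0.95, max_n=2000):
    """Smallest number of operations at which a unit sustaining
    `observed_rate` would lie above the upper control limit.

    Returns None if no such number is found up to max_n.
    """
    tail = (1 - level) / 2
    for n in range(1, max_n + 1):
        deaths = floor(observed_rate * n)  # assumption: round deaths down
        # Above the limit when seeing this many deaths or more is
        # sufficiently unlikely under the group mean.
        if deaths > 0 and upper_tail(deaths, n, group_mean) <= tail:
            return n
    return None

# Group mean 12%; hypothetical units sustaining 50% and 25% mortality.
print(operations_needed(0.50, 0.12))
print(operations_needed(0.25, 0.12))
```

Under these assumptions a unit doing only a handful of operations a year could need several years of data before even a markedly raised mortality lies outside the 95% limit.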
Usefulness of mortality control charts
Mortality control charts can be extended to any surgical specialty
that uses risk adjusted outcomes. Similar graphical methods have been
used to investigate the effect of case volume on unadjusted operative
mortality in paediatric cardiac surgery.8 The mean
operative mortality and corresponding control limits for any population
will need to be reviewed periodically to reflect changes over time. The
mortality control chart is intended to add to existing statistical
methods for monitoring surgical performance rather than replace them.
Acknowledgments
We thank all the consultants who contributed data to the study, the data collection officers for their help, and the research staff at the Centre for Multilevel Modelling, University of London, for their invaluable help in developing the hierarchical models. The hospitals and trusts that contributed data are listed at bmj.com
Contributors: See bmj.com
Footnotes
Funding: The Hue Falwasser Fellowship of the Royal College of Surgeons of England. The guarantors accept full responsibility for the conduct of the study, had access to the data, and controlled the decision to publish.
Competing interests: None declared.
Ethical approval: The multicentre research ethics committee for Wales.
This is an abridged version; the
full version is on bmj.com
References
1. Gillison EW, Powell J, McConkey CC, Spychal RT. Surgical workload and outcome after resection for carcinoma of the oesophagus and cardia. Br J Surg 2002; 89: 344-348.
2. Swisher SG, Deford L, Merriman KW, Walsh GL, Smythe R, Vaporciyan A, et al. Effect of operative volume on morbidity, mortality, and hospital use after esophagectomy for cancer. J Thorac Cardiovasc Surg 2000; 119: 1126-1132.
3. Birkmeyer JD, Siewers AE, Finlayson EVA, Stukel TA, Lucas FL, Batista I, et al. Hospital volume and surgical mortality in the United States. N Engl J Med 2002; 346: 1128-1137.
4. Copeland GP, Jones D, Walters M. POSSUM: a scoring system for surgical audit. Br J Surg 1991; 78: 355-360.
5. Cummins J, McCulloch P. ASCOT: a comprehensive clinical database for gastro-oesophageal cancer surgery. Eur J Surg Oncol 2001; 27: 709-713.
6. Goldstein H, Thomas S. Using examination results as indicators of school and college performance. J R Statist Soc (Ser A) 1996; 159: 149-163.
7. Marshall EC, Spiegelhalter DJ. Reliability of league tables of in vitro fertilisation clinics: retrospective analysis of live birth rates. BMJ 1998; 316: 1701-1704.
8. Stark J, Gallivan S, Lovegrove J, Hamilton JR, Monro JL, Pollock JC, et al. Mortality rates after surgery for congenital heart defects in children and surgeons' performance. Lancet 2000; 355: 1004-1007.
(Accepted 6 February 2002)