BMJ 2003;326:786-788 (12 April)

Papers

Mortality control charts for comparing performance of surgical units: validation study using hospital mortality data

Paris P Tekkis, research fellow of the Royal College of Surgeons of England (a); Peter McCulloch, senior lecturer in surgery (b); Adrian C Steger, consultant surgeon (c); Irving S Benjamin, professor of surgery (a); Jan D Poloniecki, senior lecturer in biostatistics (d)

(a) Academic Department of Surgery, King's College Hospital, London SE5 9RS; (b) Academic Unit of Surgery, University of Liverpool, University Hospital Aintree, Liverpool L9 7AL; (c) Department of Surgery, University Hospital Lewisham, London SE13 6LH; (d) Department of Public Health Sciences, St George's Hospital, London SW17 0QT

Correspondence to: P P Tekkis ptekkis{at}blueyonder.co.uk


    Abstract

Objective: To design and validate a statistical method for evaluating the performance of surgical units that adjusts for case volume and case mix.
Design: Validation study using routinely collected data on in-hospital mortality.
Data sources: Two UK databases, the ASCOT prospective database and the risk scoring collaborative (RISC) database, covering 1042 patients undergoing surgery in 29 hospitals for gastro-oesophageal cancer between 1995 and 2000.
Statistical analysis: A two level hierarchical logistic regression model was used to adjust each unit's operative mortality for case mix. Crude or adjusted operative mortality was plotted on mortality control charts (a graphical representation of surgical performance) as a function of number of operations. Control limits defined as 90%, 95%, and 99% confidence intervals identified units whose performance diverged significantly from the mean.
Results: The mean in-hospital mortality was 12% (range 0% to 50%). The case volume of the units ranged from one to 55 cases a year. When crude figures were plotted on the mortality control chart, four units lay outside the 90% control limit, including two outside the 95% limit. When operative mortality was adjusted for risk, three units lay outside the 90% limit and one outside the 95% limit. The model fitted the data well and had adequate discrimination (area under the receiver operating characteristics curve 0.78).
Conclusions: The mortality control chart is an accurate, risk adjusted means of identifying units whose surgical performance, in terms of operative mortality, diverges significantly from the population mean. It gives an early warning of divergent performance. It could be adapted to monitor performance across various specialties.

What is already known on this topic
League tables are an established technique for ranking the performance of organisations such as healthcare providers

Mortality control charts are another way to compare the performance of healthcare providers, particularly for outcomes of surgery

What this study adds
Mortality control charts can be adjusted for case mix and case volume and are better than league tables for monitoring surgical performance

Mortality control charts have a "buffer zone" for indicating divergence from the mean mortality and are particularly useful for specialties with a low volume of surgery




    Introduction

For some major types of surgery, operative mortality is an important measure of performance. To reflect performance accurately, however, mortality must be adjusted for the effect of pre-existing comorbid disease, and existing models of risk stratification have several problems.

Gastrectomy and oesophagectomy have the highest mortality among elective operations in Britain, and evidence about the relation between case volume and outcome conflicts.1-3 The subspecialty of upper gastrointestinal cancer surgery exemplifies the general problem of quantifying surgical risk with adjustment for case mix and volume. We developed statistical techniques for evaluating surgical performance on a continuous scale and applied the techniques to data on upper gastrointestinal cancer surgery.


    Data and methods

Data sources
We took data on outcomes of gastro-oesophageal cancer surgery from two databases on upper gastrointestinal surgery: the stomach and oesophageal cancer outcome and techniques (ASCOT) prospective database and the risk scoring collaborative (RISC) database. There was no population overlap between the databases. Both databases provided comprehensive POSSUM (physiological and operative severity score for the enumeration of mortality and morbidity) data on large cohorts of gastro-oesophageal surgery patients.4

The ASCOT prospective database--- This database on gastro-oesophageal cancer surgery, which was developed by the British Oesophago-Gastric Cancer Group, collects a comprehensive dataset on cases of gastro-oesophageal cancer referred to surgeons, whether or not an operation actually took place.5 For this study the database's coordinator used an independent source (hospital episode statistics) to validate a sample of 157 cases. From January 1999 to December 2000 the 31 hospitals across the United Kingdom that joined this voluntary collaboration submitted data on 1036 cases.

The RISC database--- This database recorded data on 601 patients undergoing oesophageal and gastric surgery in five hospitals in the South East and Thames Region, which included cases from general and thoracic surgical units. Of the cases, 351 were recorded retrospectively from pre-existing databases, case notes, theatre books, and operating lists, and 250 were prospectively collected from January 1999 to January 2001. The data were independently validated against other hospital data sources (medical records or mortuary registers).

Inclusion and exclusion criteria
We included data on oesophageal and gastric operations for malignant and benign disease with palliative or curative intent. We excluded cases where patients were treated medically or by endoscopic techniques (n=572) and cases with missing notes (n=23).

End point and risk factors
The primary end point was in-hospital mortality (any death during the same hospital admission as the operation), which can be more reliably quantified than 30 day mortality and includes patients with complications who remained in hospital beyond 30 days. Risk factors studied were age; sex; POSSUM score; surgical procedure; mode of surgery (emergency or elective); tumour staging; and malignancy (according to POSSUM category).

Statistical analysis
We used univariate analysis to identify risk factors for mortality. Continuous variables were grouped into subcategories, and unifactorial logistic regression was used to compare these with a reference level. We used the chi 2 test to analyse categorical variables. We used a multifactorial logistic regression model to adjust for case mix. See bmj.com for details of the model and statistical analysis.
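The case mix adjustment can be illustrated with a minimal Python sketch. It assumes a single level logistic model with made-up coefficients (the fitted two level model is described only on bmj.com) and, as a further assumption, uses indirect standardisation (observed deaths divided by expected deaths, multiplied by the group mean) to turn predicted risks into an adjusted mortality. The function names and all numbers are hypothetical.

```python
from math import exp

def predicted_risk(intercept, coefs, covariates):
    """Logistic model: risk of death = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return 1.0 / (1.0 + exp(-z))

def adjusted_mortality(deaths, predicted_risks, group_mean):
    """Indirect standardisation (an assumed method, not taken from the paper):
    observed / expected deaths, rescaled by the group mean mortality."""
    expected = sum(predicted_risks)
    return deaths / expected * group_mean

# Hypothetical coefficients for age (per year) and emergency surgery (0 or 1).
intercept, coefs = -6.0, [0.05, 1.2]
patients = [(62, 0), (71, 0), (80, 1), (55, 0), (77, 1)]
risks = [predicted_risk(intercept, coefs, p) for p in patients]

# A unit with 1 death among these 5 patients, in a group with 12% mean mortality.
print(round(adjusted_mortality(1, risks, 0.12), 3))
```

A unit treating high risk patients has more expected deaths, so its adjusted mortality falls below its crude figure; a unit with a low risk case mix moves in the opposite direction.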


                              
Table. Two level hierarchical logistic regression model for upper gastrointestinal surgery in all 29 hospitals



Fig 1.   Operative mortality in 29 hospitals, adjusted for case mix (with unadjusted mortality for three hospitals shown)

Mortality control chart--- This graphical method for monitoring surgical performance plots units' mortality as a function of number of operations. The exact binomial distribution is used to construct control limits (90%, 95%, and 99% confidence intervals) around the mean operative mortality for the group. These control limits indicate whether a particular unit's operative mortality differs significantly from the mean at 10%, 5%, and 1% significance levels. Each unit's operative mortality (unadjusted or adjusted for case mix) can be plotted as a single point representing the total mortality or as a running mean as a function of the number of operations done. Underperforming units will lie above the upper control limits, while units with unusually good results will lie below the lower control limits. Units lying within the 95% control limits have an operative mortality that is statistically consistent with the group mean.
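As a sketch of how such control limits might be computed, the Python below takes the central quantiles of an exact binomial distribution around a group mean of 12% (the overall mortality reported in the abstract). The function names and the convention for choosing the quantiles are assumptions for illustration; the paper does not give its exact construction.

```python
from math import comb

def binomial_quantile(n, p, q):
    """Smallest death count k with P(X <= k) >= q, for X ~ Binomial(n, p)."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p) ** (n - k)
        if cumulative >= q:
            return k
    return n

def control_limits(n, p, level):
    """Lower and upper control limits, as mortality proportions, for a unit
    with n operations when the group mean mortality is p."""
    alpha = 1.0 - level
    lower = binomial_quantile(n, p, alpha / 2)
    upper = binomial_quantile(n, p, 1.0 - alpha / 2)
    return lower / n, upper / n

# Limits narrow towards the group mean as the number of operations grows.
for n in (10, 25, 50, 100, 200):
    limits = [control_limits(n, 0.12, level) for level in (0.90, 0.95, 0.99)]
    print(n, limits)
```

Plotting these limits against the number of operations gives the funnel shape of the chart: a unit is flagged only if its mortality falls outside the limits for its own case volume.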


    Results

Of 1637 cases, 1042 (63.7%) satisfied the inclusion criteria: 497 of 1036 cases (47.9%) in the ASCOT database and 545 of 601 cases (90.7%) in the RISC database. Although 36 hospitals contributed data to the study, the analysis was based on data from 29 centres, as seven units did not contribute operated cases. The cases comprised 538 oesophagectomies (51.6%), 443 gastrectomies (42.5%), and 61 palliative bypass procedures (5.9%). Of the operations, 828 (79.5%) were elective and 78 (7.5%) were emergencies; in 136 cases (13.1%) the mode of surgery was not recorded. Nine hundred and nineteen operations (93.7%) were for cancer. The overall in-hospital operative mortality was 12% (9.4% in patients having an elective procedure and 26.9% in patients having an emergency procedure). No evidence of systematic under-reporting of risk factors was shown, and missing data were distributed evenly among the hospitals.

The final multifactorial model used age, POSSUM score, POSSUM malignancy category, and mode of surgery as risk factors (table). Mode of surgery was retained in the model as it is clinically highly relevant and has been reported as an important predictor of outcome.1 The model fitted the data well and had adequate discrimination (area under the receiver operating characteristics curve 0.78 (standard error 0.02)).
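Discrimination of this kind can be computed directly from predicted risks and observed outcomes. The sketch below uses the rank (Mann-Whitney) interpretation of the area under the receiver operating characteristics curve: the probability that a randomly chosen patient who died had a higher predicted risk than a randomly chosen survivor. The function name and the data are illustrative and are not taken from the study.

```python
def roc_auc(predicted_risks, died):
    """Area under the ROC curve by pairwise comparison; ties count as half."""
    deaths = [r for r, d in zip(predicted_risks, died) if d]
    survivors = [r for r, d in zip(predicted_risks, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    score = 0.0
    for risk_dead in deaths:
        for risk_alive in survivors:
            if risk_dead > risk_alive:
                score += 1.0
            elif risk_dead == risk_alive:
                score += 0.5
    return score / (len(deaths) * len(survivors))

# Illustrative predicted risks and outcomes (1 = died in hospital).
risks = [0.10, 0.40, 0.35, 0.80]
outcomes = [0, 0, 1, 1]
print(roc_auc(risks, outcomes))  # 0.75
```

A value of 0.5 means the model ranks deaths and survivors no better than chance, and 1.0 means every death was given a higher risk than every survivor.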

Units reported between one and 55 operations a year, with mortality ranging from 0% to 50%. The mortality control chart for unadjusted operative mortality shows that four units lay outside the 90% control limit. When operative mortality was adjusted for case mix, however, no unit was shown to underperform at the 95% control limit, and the individual values regress towards the mean (figure 1). Two units had better results than the group average, with risk adjusted operative mortality of 4.2% and 3.8%. Figure 2 shows the running means of the risk adjusted operative mortality for two of the units (31 and 33), representing two consecutive series of 102 and 166 cases. Despite fluctuations, unit 31 remained within the central part of the graph, whereas unit 33 repeatedly crossed the lower 99% control limit and thus could be said to be a truly outlying unit and a consistently good performer.



Fig 2.   Operative mortality in units 31 (n=102) and 33 (n=166), plotted as running means (adjusted for case mix)
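A running mean of this kind can be built from a unit's consecutive outcomes. The sketch below is a simplified, unadjusted version that works on raw deaths rather than case mix adjusted values; the function name and the outcome series are made up for illustration.

```python
def running_mortality(outcomes):
    """Cumulative mortality after each consecutive operation.

    outcomes: sequence of 0 (survived) or 1 (died in hospital), in date order.
    """
    deaths = 0
    series = []
    for operations, died in enumerate(outcomes, start=1):
        deaths += died
        series.append(deaths / operations)
    return series

# Hypothetical series of ten consecutive operations with two deaths.
outcomes = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
for operations, mortality in enumerate(running_mortality(outcomes), start=1):
    print(operations, round(mortality, 3))
```

Early points swing widely because each death has a large effect on a small denominator, which is why the running mean has to be read against control limits that are wide at low case numbers.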



    Discussion

The mortality control chart improves on current methods of evaluating surgical units' performance. It is an accurate, risk adjusted means of identifying outlying units while giving an early warning of units approaching divergence from the mean. Comprehensive, accurate, and reliable data are essential if operative mortality is to be used to compare providers' quality of care.

Validity of the data
The information in the study was a combination of prospective data and medical records. Centres voluntarily contributed data, and at present there is no formal system for externally validating the completeness of the database. Internal validity was established by comparing the operative mortality for a random sample of five participating hospitals (157 patients) with hospital episode statistics obtained independently from the hospitals' information departments. The two databases reported similar overall mortality (14% in the ASCOT data and 13.8% in the hospital episode statistics), but they differed in the individual hospitals' volumes of operations and in the variability of mortality. Although overall operative mortality in the units in our study was consistent with recently published data from the West Midlands region, our units were not randomly selected, and we cannot be sure how representative they are of all UK hospitals. However, although the quality of our data is limited, implementation of such a monitoring system in hospitals should lead to an increased awareness of the data that need to be collected, with subsequent improvement in the quality of the data.

Quality of the statistical analysis
Hierarchical regression models are particularly useful in modelling observations with a hierarchical or clustered structure, such as patients in different hospitals or pupils in different schools.6 We used confidence intervals around the providers' performances to compare each unit's performance with the average, with wider confidence intervals for low volume units. If these wider limits are not allowed for, low volume providers are more likely to be ranked misleadingly at the top or bottom of the group. Confidence intervals can be placed around a unit's rank, thus emphasising "the caution with which any league tables must be treated."7

Control limits in the mortality chart define outlying units and give an early warning when a unit's performance starts to diverge from the population mean. The less extreme control limits delineate an early warning "buffer zone" to trigger examination of practice. Because the control limits are much wider for low volumes, a high (risk adjusted) operative mortality in these hospitals may require longer monitoring to establish a meaningful estimate of mortality.
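One way to picture the buffer zone is to classify a unit by its exact binomial tail probability. The sketch below is an assumed reading of the zones, not the authors' published procedure: a unit beyond the 90% or 95% limits but inside the 99% limits is treated as an early warning. The function names and zone labels are hypothetical.

```python
from math import comb

def tail_probability(deaths, n, p):
    """Smaller of P(X <= deaths) and P(X >= deaths) for X ~ Binomial(n, p)."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    return min(sum(pmf[: deaths + 1]), sum(pmf[deaths:]))

def zone(deaths, n, group_mean):
    """Label the control band in which a unit's result falls (two sided)."""
    tail = tail_probability(deaths, n, group_mean)
    if tail <= 0.005:
        return "outside 99% limits"
    if tail <= 0.025:
        return "outside 95% limits (early warning)"
    if tail <= 0.05:
        return "outside 90% limits (early warning)"
    return "within 90% limits"

# Four deaths in 10 operations is flagged; one death in 10 is not.
print(zone(4, 10, 0.12))
print(zone(1, 10, 0.12))
```

With few operations only extreme results leave the central band, which matches the point that low volume units may need longer monitoring before a meaningful estimate emerges.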

Usefulness of mortality control charts
Mortality control charts can be extended to any surgical specialty that uses risk adjusted outcomes. Similar graphical methods have been used to investigate the effect of case volume on unadjusted operative mortality in paediatric cardiac surgery.8 The mean operative mortality and corresponding control limits for any population will need to be reviewed periodically to reflect changes over time. The mortality control chart is intended to add to existing statistical methods for monitoring surgical performance rather than replace them.



    Acknowledgments

We thank all the consultants who contributed data to the study, the data collection officers for their help, and the research staff at the Centre for Multilevel Modelling, University of London, for their invaluable help in developing the hierarchical models. The hospitals and trusts that contributed data are listed at bmj.com

Contributors: See bmj.com

    Footnotes

Funding: The Hue Falwasser Fellowship of the Royal College of Surgeons of England. The guarantors accept full responsibility for the conduct of the study, had access to the data, and controlled the decision to publish.

Competing interests: None declared.

Ethical approval: The multicentre research ethics committee for Wales.

This is an abridged version; the full version is on bmj.com
    References

1. Gillison EW, Powell J, McConkey CC, Spychal RT. Surgical workload and outcome after resection for carcinoma of the oesophagus and cardia. Br J Surg 2002; 89: 344-348.
2. Swisher SG, Deford L, Merriman KW, Walsh GL, Smythe R, Vaporciyan A, et al. Effect of operative volume on morbidity, mortality, and hospital use after esophagectomy for cancer. J Thorac Cardiovasc Surg 2000; 119: 1126-1132.
3. Birkmeyer JD, Siewers AE, Finlayson EVA, Stukel TA, Lucas FL, Batista I, et al. Hospital volume and surgical mortality in the United States. N Engl J Med 2002; 346: 1128-1137.
4. Copeland GP, Jones D, Walters M. POSSUM: a scoring system for surgical audit. Br J Surg 1991; 78: 355-360.
5. Cummins J, McCulloch P. ASCOT: a comprehensive clinical database for gastro-oesophageal cancer surgery. Eur J Surg Oncol 2001; 27: 709-713.
6. Goldstein H, Thomas S. Using examination results as indicators of school and college performance. J R Statist Soc (Ser A) 1996; 159: 149-163.
7. Marshall EC, Spiegelhalter DJ. Reliability of league tables of in vitro fertilisation clinics: retrospective analysis of live birth rates. BMJ 1998; 316: 1701-1704.
8. Stark J, Gallivan S, Lovegrove J, Hamilton JR, Monro JL, Pollock JC, et al. Mortality rates after surgery for congenital heart defects in children and surgeons' performance. Lancet 2000; 355: 1004-1007.

(Accepted 6 February 2002)


© 2003 BMJ Publishing Group Ltd
