Surgeon specialization and operative mortality in United States: retrospective analysis
BMJ 2016; 354 doi: https://doi.org/10.1136/bmj.i3571 (Published 21 July 2016) Cite this as: BMJ 2016;354:i3571
- Nikhil R Sahni, fellow1 2,
- Maurice Dalton, researcher3,
- David M Cutler, Otto Eckstein professor of applied economics1 3,
- John D Birkmeyer, executive vice president and chief academic officer4 5,
- Amitabh Chandra, Malcolm Wiener professor of social policy3 6
- 1Department of Economics, Harvard University, Cambridge, MA, USA
- 2McKinsey and Company, Boston, MA, USA
- 3National Bureau of Economic Research, Cambridge, MA, USA
- 4Dartmouth-Hitchcock Health System, Lebanon, NH, USA
- 5Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
- 6Harvard Kennedy School, Cambridge, MA, USA
- Correspondence to: N R Sahni, 280 Congress Street #1100, Boston, MA 02210, USA
- Accepted 24 June 2016
Objective To measure the association between a surgeon’s degree of specialization in a specific procedure and patient mortality.
Design Retrospective analysis of Medicare data.
Setting US patients aged 66 or older enrolled in traditional fee for service Medicare.
Participants 25 152 US surgeons who performed one of eight procedures (carotid endarterectomy, coronary artery bypass grafting, valve replacement, abdominal aortic aneurysm repair, lung resection, cystectomy, pancreatic resection, or esophagectomy) on 695 987 patients in 2008-13.
Main outcome measure Relative risk reduction in risk adjusted and volume adjusted 30 day operative mortality between surgeons in the bottom quarter and top quarter of surgeon specialization (defined as the number of times the surgeon performed the specific procedure divided by his/her total operative volume across all procedures).
Results For all four cardiovascular procedures and two out of four cancer resections, a surgeon’s degree of specialization was a significant predictor of operative mortality independent of the number of times he or she performed that procedure: carotid endarterectomy (relative risk reduction between bottom and top quarter of surgeons 28%, 95% confidence interval 0% to 48%); coronary artery bypass grafting (15%, 4% to 25%); valve replacement (46%, 37% to 53%); abdominal aortic aneurysm repair (42%, 29% to 53%); lung resection (28%, 5% to 46%); and cystectomy (41%, 8% to 63%). In five procedures (carotid endarterectomy, valve replacement, lung resection, cystectomy, and esophagectomy), the relative risk reduction from surgeon specialization was greater than that from surgeon volume for that specific procedure. Furthermore, surgeon specialization accounted for 9% (coronary artery bypass grafting) to 100% (cystectomy) of the relative risk reduction otherwise attributable to volume in that specific procedure.
Conclusion For several common procedures, surgeon specialization was an important predictor of operative mortality independent of volume in that specific procedure. When selecting a surgeon, patients, referring physicians, and administrators assigning operative workload may want to consider a surgeon’s procedure specific volume as well as the degree to which a surgeon specializes in that procedure.
Hundreds of studies have shown that surgeons with higher volumes have better outcomes across a variety of procedures.1 2 3 4 5 6 Researchers have identified several factors contributing to this association, including experience and technical skill.7 8 9 10 The ease of measuring this volume-outcomes relation, coupled with the strength of the association, makes it a powerful way to ascertain surgeons’ quality.
At the same time, the degree to which a surgeon specializes in a specific procedure may be as important as the number of times that he or she performs it.11 12 13 14 A surgeon who specializes in one operation may have better outcomes owing to muscle memory built from repetition, higher attention and faster recall as a result of less switching between different procedures, and knowledge transfer of outcomes for the same procedure performed in different patients.8 15 16 17 18 If this specialization hypothesis holds true, a surgeon performing 20 procedures of which all 20 are valve replacements (denoting 100% specialization in the procedure) would have lower operative mortality rates than a surgeon who performs 100 operations of which 40 are valve replacements (denoting 40% specialization in the procedure). In contrast, the volume-outcomes hypothesis would suggest that selecting the surgeon who performs 40 valve replacements would lead to superior outcomes for patients. To the best of our knowledge, no study has described a statistical association between a surgeon’s degree of specialization in a specific procedure and patients’ mortality.
Our objective was to test the hypothesis of a specialization-outcomes relation independent of a surgeon’s volume in that specific procedure. We looked at the same eight procedures originally studied for the volume-outcomes relation—carotid endarterectomy, coronary artery bypass grafting, valve replacement, abdominal aortic aneurysm repair, lung resection, cystectomy, pancreatic resection, and esophagectomy—to estimate the specialization-outcomes relation.1 We determined surgeons’ specialization by using US Medicare data rather than a surgeon’s self reported specialty or board certification.6 9 19 For each procedure, we compared risk adjusted 30 day mortality between surgeons who performed the same volume of the specific procedure but varied in the degree of specialization in that procedure.
We examined all eight conditions that were studied by Birkmeyer et al: four cardiovascular procedures (carotid endarterectomy, coronary artery bypass grafting, valve replacement, and abdominal aortic aneurysm repair) and four cancer resections (lung resection, cystectomy, pancreatic resection, and esophagectomy).1 We identified all patients undergoing one of these procedures and the associated surgeons in the Medicare Inpatient file from 2008 to 2013, the latest year of data available to us at the time of this study.
Defining surgeons’ volume and specialization
For a given inpatient claim, we defined a procedure by using the ICD-9 (international classification of diseases, ninth revision) procedure code listed in the principal procedure field. We attributed each surgery to the surgeon listed in the operating physician field of the inpatient claim. We defined “total operative volume” (v) as all procedures attributed to a surgeon and “procedure specific volume” (vj) as the number of cases attributed to a surgeon for the specific procedure being examined (see appendix for code list).1 2 We defined surgeon specialization (sj) as procedure specific volume divided by total operative volume across all procedures (sj=vj/v).
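The definition sj=vj/v can be illustrated with a minimal Python sketch. The function name `specialization` and the procedure labels are hypothetical, chosen only for illustration; the two example surgeons are the hypothetical ones described in the introduction, not data from the study.

```python
from collections import Counter


def specialization(procedure_codes, target):
    """Return (v_j, v, s_j) for one surgeon.

    procedure_codes: one entry per case attributed to the surgeon
    target: the procedure being examined
    """
    v = len(procedure_codes)  # total operative volume (v)
    if v == 0:
        raise ValueError("surgeon has no attributed procedures")
    v_j = Counter(procedure_codes)[target]  # procedure specific volume (vj)
    return v_j, v, v_j / v  # specialization sj = vj / v


# Hypothetical surgeons from the introduction's example
focused = ["valve"] * 20
mixed = ["valve"] * 40 + ["other"] * 60

print(specialization(focused, "valve"))  # (20, 20, 1.0)
print(specialization(mixed, "valve"))    # (40, 100, 0.4)
```

The second surgeon has twice the procedure specific volume but less than half the specialization, which is the contrast the two hypotheses disagree about.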
Using National Plan and Provider Enumeration System data, we identified each surgeon’s self reported specialty and included only surgeons in clinically appropriate specialties (see appendix for provider taxonomy codes).20 We also excluded surgeons with fewer than three consecutive years of available inpatient claims.
We divided surgeons into quarters based on their specialization. We also divided surgeons into quarters based on procedure specific volume and verified that a similar volume-outcomes relation existed in our data as in the original research (see appendix figure A1).1
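As a rough sketch of how surgeons might be grouped into quarters, the Python below ranks surgeons by a value (specialization or procedure specific volume) and splits the ranking into four equally sized groups. The function name `assign_quarters`, the example values, and the handling of ties (by sort order) are assumptions; the paper does not describe these details.

```python
def assign_quarters(values):
    """Map each value to a quarter from 1 (lowest) to 4 (highest) by rank.

    Ties are broken by original order, which is an assumption of this sketch.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    quarters = [0] * n
    for rank, i in enumerate(order):
        quarters[i] = rank * 4 // n + 1
    return quarters


# Hypothetical specialization values for eight surgeons
spec = [0.05, 0.40, 0.10, 0.90, 0.25, 0.60, 0.02, 0.75]
print(assign_quarters(spec))  # [1, 3, 2, 4, 2, 3, 1, 4]
```

Applying the same function separately to specialization and to procedure specific volume gives the two groupings that are cross tabulated later in the results.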
We included patients aged 66 or older who were continuously enrolled in traditional fee for service Medicare starting 12 months before the admission month through four months after the index admission or death.21 We required that at least one year had passed before we identified a new procedure as a separate case for the same patient.
We further limited patients on the basis of specific diagnoses and other procedures performed (see appendix for ICD-9 codes). To avoid potential adverse mortality effects due to delays in performing surgery, we limited patients to those with procedures performed within three days of admission.22 23 We used SAS version 9.3 to build the study cohort.
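The patient inclusion rules above could be sketched in Python as follows. This is not the authors' SAS code: the `Claim` fields, the `continuously_enrolled` flag, the reading of "within three days" as zero to three days inclusive, and the choice to measure the one year gap from the last retained case are all assumptions made for illustration. Diagnosis based restrictions from the appendix are not modelled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Claim:
    patient_id: str
    age: int
    admission: date
    surgery: date
    continuously_enrolled: bool  # hypothetical flag for fee for service enrollment


def eligible(claim):
    """Age 66 or older, continuously enrolled, surgery within three days of admission."""
    days_to_surgery = (claim.surgery - claim.admission).days
    return (
        claim.age >= 66
        and claim.continuously_enrolled
        and 0 <= days_to_surgery <= 3
    )


def separate_cases(claims):
    """Keep a claim only if at least 365 days have passed since the
    patient's last retained case (assumed interpretation)."""
    kept = []
    last_kept = {}
    for c in sorted(claims, key=lambda c: (c.patient_id, c.surgery)):
        prev = last_kept.get(c.patient_id)
        if prev is None or (c.surgery - prev).days >= 365:
            kept.append(c)
            last_kept[c.patient_id] = c.surgery
    return kept
```

A cohort would then be built by filtering with `eligible` and passing the result to `separate_cases`.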
For each patient, we compiled age, sex, and race data from the Medicare Master Beneficiary Summary file. Using the method developed by Krumholz et al, which mapped the 189 hierarchical condition categories developed by Pope et al for the Centers for Medicare and Medicaid Services (CMS) into 17 groups (cerebrovascular disease, chronic liver disease, chronic obstructive pulmonary disease, dementia, diabetes, dialysis, hypertension, major psychiatric disorder, metastatic cancer, paralysis, peripheral vascular disease, pneumonia, protein calorie malnutrition, renal failure disease, stroke, substance abuse, trauma), we measured the comorbidity profile for each patient based on all Medicare Provider Analysis and Review (MedPAR) claims occurring 365 days before the index hospital admission, not including the index hospital admission event.24 25 For hospitals, we used hospital listings in the 2010 Dartmouth Atlas to identify academic medical centers.26
Analysis and outcome measures
The outcome variable of interest was death occurring within 30 days of the initial hospital admission date. We used a multilevel mixed logit model to examine the specialization-outcomes relation. We controlled for age, sex, race, year of surgery, comorbidity profile, day of the week, procedure type (see appendix for details on groupings), days between admission and surgery, and whether the hospital was an academic medical center. We included surgeon random effects to account for unobserved characteristics of surgeons, such as a surgeon’s technical skill or the experience of the surgeon, as well as repeated observations for the same surgeon. We also controlled for procedure specific volume in quarters.
We included hospital random effects to account for hospital specific factors, such as staffing ratios, the hospital’s financial health, and electronic health record systems, that may mediate the relation between specialization and mortality. Including these effects allowed us to interpret the estimated relation between surgeons’ specialization and outcomes as the opportunity to improve mortality through superior matching of patients to surgeons within a hospital, as opposed to matching patients to surgeons in different hospitals. We converted odds ratios to relative risks by using standard methods as developed by Zhang and Yu and specified for Stata by Cummings.27 28 We estimated the relative risk reduction from greater specialization as the difference in mortality between an average surgeon in the bottom quarter versus top quarter of surgeon specialization. We also estimated the absolute risk reduction and the number needed to treat (NNT).
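The conversion and the derived measures can be sketched in Python. The odds ratio to relative risk step uses the Zhang and Yu approximation, RR = OR / ((1 − P0) + P0 × OR), where P0 is the risk in the reference group. Treating the bottom quarter as the reference group, the function names, and rounding the NNT up to a whole patient are assumptions of this sketch, and the example inputs are illustrative rather than study estimates.

```python
import math


def odds_ratio_to_relative_risk(odds_ratio, baseline_risk):
    """Zhang and Yu approximation; baseline_risk is the reference group risk (P0)."""
    return odds_ratio / ((1 - baseline_risk) + baseline_risk * odds_ratio)


def number_needed_to_treat(absolute_risk_reduction):
    """NNT = 1 / ARR, rounded up to a whole patient (assumed convention)."""
    return math.ceil(1 / absolute_risk_reduction)


def risk_summary(odds_ratio, baseline_risk):
    """Return relative risk, relative and absolute risk reduction, and NNT."""
    rr = odds_ratio_to_relative_risk(odds_ratio, baseline_risk)
    rrr = 1 - rr                 # relative risk reduction
    arr = baseline_risk * rrr    # absolute risk reduction
    nnt = number_needed_to_treat(arr) if arr > 0 else math.inf
    return {"rr": rr, "rrr": rrr, "arr": arr, "nnt": nnt}


# Illustrative inputs only: odds ratio 0.5 against a 5% baseline mortality
print(risk_summary(0.5, 0.05))
```

Because operative mortality is low for most of these procedures, the relative risk stays close to the odds ratio; the correction matters more as the baseline risk rises.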
We estimated a second set of regressions with surgeons’ specialization and procedure specific volume as continuous rather than ordinal variables. This specification has the benefits of estimating only one coefficient for surgeon specialization and being less subject to small sample sizes, but it is more restrictive than the specification with indicator variables for the degree of specialization. We used Stata 14 for all analyses and regressions.
Our interest in this study was driven by finding ways to improve outcomes for surgical patients. Following previous research, we chose our main measure of 30 day mortality to focus on a patient’s survival after surgery.1 6 Countries such as the United States and United Kingdom have also begun to report this measure so that patients can assess surgeons’ quality.29 30 No patients were involved in the design of this study.
We examined 39 157 344 Medicare fee for service inpatient claims between 2008 and 2013. Our study cohort consisted of 695 987 patients operated on by 25 152 surgeons. Tables 1 and 2 summarize the characteristics of surgeons and patients for each of the eight procedures arranged by surgeon specialization. For most of the procedures, we found no clinically important association between surgeons’ specialization and patients’ age, sex, race, or comorbidity profile (see appendix for details on comorbidities). Average surgeon specialization for a procedure ranged from 6% for esophagectomy to 40% for coronary artery bypass grafting. Within a procedure, the difference in specialization between the least specialized and most specialized surgeons ranged from 43 percentage points (0.1-43%) for esophagectomy to 94 percentage points (0.1-94%) for coronary artery bypass grafting.
By definition, surgeons’ specialization might increase as a result of greater procedure specific volume (numerator) or lesser total operative volume across all procedures (denominator). In certain procedures, such as abdominal aortic aneurysm repair and cystectomy, increased surgeon specialization was associated with greater procedure specific volume and relatively stable total operative volume. For other procedures, such as coronary artery bypass grafting and esophagectomy, we observed similar trends in procedure specific volume but also lesser total operative volume. A greater percentage of procedures were performed at academic medical centers in the higher compared with lower quarters of surgeon specialization for all procedures except carotid endarterectomy and coronary artery bypass grafting.
Tables 3 and 4 illustrate the extent to which a surgeon’s degree of specialization was related to the number of times he or she performed that specific procedure. Overall, we found that surgeons in the top quarter of procedure specific volume varied in their specialization in the same procedure. For example, for carotid endarterectomy nearly two thirds of surgeons in the top quarter of procedure specific volume were also in the top quarter of surgeon specialization, but for coronary artery bypass grafting less than 50% of surgeons in the top quarter of procedure specific volume were in the top quarter of surgeon specialization.
Relative risk reduction by quarters
We observed a statistically significant reduction in risk adjusted 30 day operative mortality between the bottom and top quarters of surgeon specialization for all four cardiovascular procedures and two of the four cancer resections independent of procedure specific volume (fig 1). Among the four cardiovascular procedures, the relative risk reduction in mortality from greater specialization ranged from 15% for coronary artery bypass grafting to 46% for valve replacement (table 5). Among the four cancer resections, the relative risk reduction in mortality ranged from 28% for lung resection to 48% for esophagectomy. Among the six procedures for which we found a statistically significant relative risk reduction, the absolute risk reduction ranged from 0.3% (NNT=334 patients) for carotid endarterectomy to 2.8% (NNT=36 patients) for abdominal aortic aneurysm repair. For the two cancer resections (pancreatic resection and esophagectomy) for which we did not find a statistically significant effect, the numbers of patients sampled were the smallest among all eight procedures reviewed (less than 11 000 patients for each procedure).
Specialization-outcomes relation versus volume-outcomes relation
Because we included procedure specific volume in our regressions, we were able to estimate the relative risk reduction in mortality as a result of selecting a surgeon in the top quarter of surgeon specialization compared with a surgeon in the top quarter of procedure specific volume. For two of the four cardiovascular procedures (carotid endarterectomy and valve replacement) and three of the four cancer resections (lung resection, cystectomy, and esophagectomy), we observed a greater relative risk reduction due to surgeon specialization than due to procedure specific volume (see appendix for full regression results).
To estimate the role of surgeon specialization on the volume-outcomes relation, we estimated the relative risk reduction between the bottom and top quarters of procedure specific volume with and without accounting for surgeon specialization (table 6). When we excluded surgeon specialization from the model, we found a statistically significant association between procedure specific volume and mortality for all four cardiovascular procedures and for two of the four cancer resections (lung resection and pancreatic resection). When we included surgeon specialization, the observed volume-outcomes relation was attenuated by between 9% (coronary artery bypass grafting) and 100% (cystectomy).
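One way to read the comparison of the two models is as the share of the volume related risk reduction that disappears once specialization is controlled for. The Python sketch below assumes that share is computed as (RRR without specialization − RRR with specialization) / RRR without specialization; the paper does not state the exact formula, and the function name and input values are hypothetical.

```python
import math


def share_explained_by_specialization(rrr_without, rrr_with):
    """Fraction of the volume related relative risk reduction that is
    accounted for once surgeon specialization enters the model
    (assumed formula, not taken from the paper)."""
    if rrr_without <= 0:
        raise ValueError("no volume related risk reduction to explain")
    return (rrr_without - rrr_with) / rrr_without


# Hypothetical values: volume effect falls from 20% to 15% after adjustment
partial = share_explained_by_specialization(0.20, 0.15)
# Hypothetical values: volume effect disappears entirely after adjustment
full = share_explained_by_specialization(0.20, 0.0)

print(math.isclose(partial, 0.25), full == 1.0)  # True True
```

Under this reading, a share of 1.0 means that procedure specific volume carries no remaining association with mortality once specialization is included.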
We ran a second version of the regressions with surgeon specialization and procedure specific volume as continuous variables. For all four cardiovascular procedures and two of the four cancer resections (lung resection and esophagectomy), we observed a statistically significant association between surgeon specialization and mortality (see appendix for full results). For carotid endarterectomy and abdominal aortic aneurysm repair, we also observed a statistically significant association between procedure specific volume and mortality.
Our objective in this study was to quantify a possible association between a surgeon’s degree of specialization in a specific procedure and patients’ mortality. For six of the eight procedures examined (carotid endarterectomy, coronary artery bypass grafting, valve replacement, abdominal aortic aneurysm repair, lung resection, and cystectomy), we found surgeon specialization to be an important predictor of mortality. Given the well documented volume-outcomes relation in healthcare, we also controlled for the number of times the surgeon performed the specific procedure. Our results showed that the observed specialization-outcomes relation was independent of the surgeon’s volume in that specific procedure.
The volume-outcomes relation is generally attributed to “learning by doing,” which shows diminishing returns after a certain level of “doing.” We found a similar result for certain procedures such as abdominal aortic aneurysm repair, suggesting some minimal cut-off threshold of surgeon specialization. However, for procedures such as valve replacement, our results showed a linear trend from the bottom to top quarter of surgeon specialization, suggesting the absence of diminishing returns from specialization.
Furthermore, for five procedures (carotid endarterectomy, valve replacement, lung resection, cystectomy, and esophagectomy), the relative risk reduction in mortality from selecting a surgeon in the top quarter of surgeon specialization was greater than that from selecting a surgeon in the top quarter of procedure specific volume. Additionally, surgeon specialization accounted for at least some portion (if not all) of the observed volume-outcomes relation.
Several factors may explain the observed association between specialization and outcomes. Repetition of tasks has been shown to improve mortality and could be manifested in surgeons as muscle memory and dexterity.8 15 Greater familiarity with a medical device and its components has also been associated with better survival rates.18 Reduced mortality has been linked to performance of the same procedure under varying patient related circumstances, allowing a surgeon to transfer relevant knowledge and skills between patients. Furthermore, focusing on a single procedure reduces the cognitive demands of switching tasks.8 11 17 These potential mechanisms might result from both greater volume for a specific procedure (for example, task repetition) and from less total operative volume across all procedures (for example, academic research). We did not explore these mechanisms, as this was beyond the scope of this paper.
Strengths and limitations of study
Our study should be interpreted in the context of its limitations. One potential limitation may be unobserved choices by surgeons in selecting patients. For example, in a given hospital, a surgeon who specializes in only coronary artery bypass grafts may be performing these surgeries on unobservably healthier patients because he or she is less skilled than a colleague who also performs other procedures. However, given the relative comparability of patients’ characteristics between quarters of surgeon specialization, this explanation is unlikely to be true.
A second limitation may be unobserved characteristics of surgeons, such as a surgeon’s technical skill, the age and experience of the surgeon, or a less specialized surgical training program from which the surgeon did not become particularly skilled in a specific procedure. We attempted to account for these factors by including surgeon random effects.
Other limitations relate to the data involved. We generated our results on the basis of only the Medicare fee for service population. Although using these data guarantees that all patients were similarly insured and enabled us to measure outcomes effectively, we cannot factor in non-Medicare cases when measuring total operative volume, procedure specific volume, and surgeon specialization, nor can we account for the surgical team. We attempted to account for this discrepancy by using hospital random effects, which accounted for differences in payer mix across hospitals but not those across physicians. The restrictions of claims data, including unmeasured case mix differences and inaccurate coding, also limited our study. These limitations are also present in the volume-outcomes literature.
Areas for further study
Several aspects of these results warrant further study. We examined eight procedures; additional results for other procedures would be valuable to determine external validity. Further studies could examine whether spillover effects exist from specializing in similar surgeries to the specified procedure, the effect of team continuity and specialization, and the relation to morbidity metrics and episode costs.16 31 Furthermore, the balance of surgeon specialization and department specialization could be studied to determine optimal surgical case distribution to maximize outcomes for patients. Finally, specialization could be examined in non-surgical settings, such as the degree of a primary care physician’s panel of patients with a specific chronic condition.
Conclusion and policy implications
Our findings may have implications for policy makers, administrators, physicians, and patients, especially as surgeons’ specialization is measurable using data available to health systems. Policy makers considering how to improve the quality of rural or smaller hospitals in which surgeons cannot meet minimal volume thresholds could use surgeon specialization to assign patients to surgeons. At larger facilities, administrators determining case distribution across surgeons might consider not only a given surgeon’s volume in that procedure but also his or her degree of specialization. A physician might use a measure of surgeon specialization to refer his or her patient to the most appropriate surgeon, possibly improving patients’ outcomes.32 Finally, if these data are made available to a patient, he or she could choose a surgeon who specializes in the relevant procedure to possibly improve his or her chance of survival. We have been careful not to suggest that surgeons should specialize more, as that would require establishment of a causal relation. In all of our examples, surgeons’ specialization is used as a surrogate for surgeons’ quality, much like procedure specific volume is used as a proxy for surgeons’ quality. Whether the degree of specialization causally improves surgical quality remains a topic for future work.
What is already known on this topic
Cohort studies and systematic reviews have shown that surgeons with higher volumes have better outcomes across a variety of procedures up to a certain threshold
In other sectors such as airlines, banking, and automobile assembly, specialization has been shown to have a positive association with outcomes
Previous work has looked at the relation between outcomes and self reported specialty or board certification but not the association between a surgeon’s degree of specialization in a specific procedure and mortality
What this study adds
A statistically significant relative risk reduction in operative mortality was observed between the bottom and top quarters of surgeon specialization for six procedures
The relative risk reduction in operative mortality due to surgeon specialization was greater than that due to procedure specific volume for five procedures
The specialization-outcomes relation accounted for some portion, if not all, of the volume-outcomes relation
The observed specialization-outcomes relation suggests a new, easily measured metric of surgeons’ quality that builds on the volume-outcomes relation to inform the way healthcare is organized and delivered
We thank William Gordon and Atheendar Venkataramani for their clinical contributions. We also thank Ishani Ganguli for her clinical contributions and editing of the manuscript.
Contributors: NRS contributed to the literature review, study question and design, and data analysis and interpretation; prepared the first draft of the report; and contributed to subsequent versions. MD, JDB, DMC, and AC contributed to study design, data analysis and interpretation, and drafting of the final report. All authors approved the final version. NRS is the guarantor.
Funding: This report represents independent research funded in part by the National Institute on Aging (NIA) for JDB (P01-AG019783-07S1), DMC (P01-AG031098), and AC (P01-AG005842-24 and P01-AG19783). The views expressed are those of the authors and not necessarily those of the NIA. The funders had no role in study design, in the analysis or interpretation of data, in the writing of the report, or in the decision to submit the paper for publication.
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organization for the submitted work other than the NIA as above; no financial relationships with any organizations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.
Ethical approval: Not needed.
Data sharing: No additional data available.
Transparency declaration: The lead author affirms that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/.