- Antoine Duclos, assistant professor of public health (1, 2, 11),
- Jean-Louis Peix, professor of surgery (3),
- Cyrille Colin, professor of public health (1, 2),
- Jean-Louis Kraimps, professor of surgery (4),
- Fabrice Menegaux, professor of surgery (5),
- François Pattou, professor of surgery (6, 7),
- Fréderic Sebag, professor of surgery (8),
- Sandrine Touzet, senior epidemiologist (1, 2),
- Stéphanie Bourdy, project manager (1, 2),
- Nicolas Voirin, biostatistician (9, 10),
- Jean-Christophe Lifante, professor of surgery (3)
- The CATHY Study Group*
- (1) Hospices Civils de Lyon, Pôle Information Médicale Evaluation Recherche, Lyon F-69003, France
- (2) Université de Lyon, Equipe d’Accueil Mixte 4128 Santé-Individu-Société, Lyon F-69002
- (3) Hospices Civils de Lyon, Centre Hospitalier Lyon Sud, Service de Chirurgie Générale et Endocrinienne, Pierre Bénite, Lyon
- (4) Department of Endocrine Surgery, Poitiers University, Jean Bernard Hospital, Poitiers, France
- (5) Assistance Publique-Hôpitaux de Paris, Hôpital la Pitié-Salpêtrière, Service de Chirurgie Générale, Viscérale et Endocrinienne, Paris, France
- (6) Centre Hospitalier Régional Universitaire de Lille, Chirurgie Générale et Endocrinienne, Lille, France
- (7) Université Lille Nord de France, Institut National de la Santé et de la Recherche Médicale (INSERM), Lille
- (8) Assistance Publique-Hôpitaux de Marseille, Centre Hospitalier Universitaire la Timone-Adulte, Marseille, France
- (9) Hospices Civils de Lyon, Hôpital Edouard Herriot, Service d’Hygiène, Epidémiologie et Prévention, Lyon
- (10) Centre National de la Recherche Scientifique (CNRS), Unite Mixte de Recherche 5558, Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Lyon
- (11) Center for Surgery and Public Health, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Correspondence to: A Duclos
- Accepted 20 October 2011
Objective To determine the association between surgeons’ experience and postoperative complications in thyroid surgery.
Design Prospective cross sectional multicentre study.
Setting High volume referral centres in five academic hospitals in France.
Participants All patients who underwent a thyroidectomy undertaken by every surgeon in these hospitals from 1 April 2008 to 31 December 2009.
Main outcome measures Presence of two permanent major complications (recurrent laryngeal nerve palsy or hypoparathyroidism), six months after thyroid surgery. We used mixed effects logistic regression to determine the association between length of experience and postoperative complications.
Results 28 surgeons completed 3574 thyroid procedures during a one year period. Overall rates of recurrent laryngeal nerve palsy and hypoparathyroidism were 2.08% (95% confidence interval 1.53% to 2.67%) and 2.69% (2.10% to 3.31%), respectively. In a multivariate analysis, 20 years or more of practice was associated with increased probability of both recurrent laryngeal nerve palsy (odds ratio 3.06 (1.07 to 8.80), P=0.04) and hypoparathyroidism (7.56 (1.79 to 31.99), P=0.01). Surgeons’ performance had a concave association with their length of experience (P=0.036) and age (P=0.035); surgeons aged 35 to 50 years had better outcomes than their younger and older colleagues.
Conclusions Optimum individual performance in thyroid surgery cannot be passively achieved or maintained by accumulating experience. Factors contributing to poor performance in very experienced surgeons should be explored further.
The operating room is a high risk environment because of the volume1 and complexity of procedures undertaken.2 Adverse events during hospitalisation are often linked to surgical care,3 with wide variation in mortality rates across hospitals.4 Although most efforts to improve surgical care focus on technological advances,5 it seems unlikely that the use of robotics or attempts to standardise work procedures to improve efficiency would erase errors and costs completely.6 The impact of human factors on patient safety is crucial, and the surgeon’s performance is a core element of successful surgery.
Professional expertise results from a gradual improvement in performance within a specialty. Typically, experts reach their peak performance in various domains at age 30 to 50 years—that is, after about 10 years’ experience in their specialty.7 Older doctors who have been in practice for a long time after graduation might have less factual knowledge and be less likely to adhere to evidence based medicine,8 which risks the safety of care. However, few studies have explored the association between clinicians’ experience and performance using objective outcome measurements.
In surgery, performance can be assessed by the occurrence of postoperative complications, and experience can be measured by the surgeon’s age or time spent in practice. The importance of the education and training used to gain experience among young surgeons is obvious.9 Acquiring the necessary technical background can take time before individuals are able to master routine procedures and achieve better outcomes.10 11 Although complication rates can vary greatly during a surgeon’s career, the potential decline in performance among very experienced surgeons remains unclear. Previous studies have used retrospective analyses of hospital databases without considering the clustered nature of the data.12 13 We aimed to model the association between the experience and performance of surgeons on postoperative complications in thyroid surgery.
Study design and population
We conducted a prospective cross sectional study between 1 April 2008 and 31 December 2009, in five academic hospitals (centres A-E). The participating hospitals were high volume referral centres in endocrine surgery, each undertaking at least 500 thyroidectomies a year. All 28 endocrine surgeons who did thyroid surgery in these centres participated in the study. Patient recruitment for one year after surgery began in April 2008 in centres A to C, May 2008 in centre D, and July 2008 in centre E. All patients who underwent a thyroid procedure were eligible for inclusion.
We considered two major complications of thyroid surgery: permanent recurrent laryngeal nerve palsy14 and hypoparathyroidism. Based on the available evidence, participating centres used an active follow-up protocol.15 16 Postoperative outcomes were systematically assessed during hospitalisation within 48 hours after the thyroid procedure. A second assessment was planned at least six months after the thyroidectomy to diagnose permanent complications.
The proportion of recurrent laryngeal nerve palsy was measured in patients who underwent unilateral or bilateral thyroid procedures. Postoperative vocal cord mobility was assessed by laryngoscopy.17 Exclusion criteria for recurrent laryngeal nerve palsy analysis included pre-existing nerve palsy before the intervention, previous thyroid surgery with unknown pre-existing nerve palsy status, voluntary resection of nerves during intervention due to invasive carcinoma, and patient death during follow-up.
The proportion of hypoparathyroidism was measured in patients who underwent a bilateral thyroidectomy. We defined postoperative hypoparathyroidism as a serum calcium concentration of lower than 2 mmol/L or a requirement for vitamin D or calcium supplementation (or both) to maintain healthy calcium concentrations after thyroidectomy.18 Exclusion criteria for hypoparathyroidism analysis included unilateral lobectomy, pre-existing hypocalcaemia or substitutive treatment with calcium or vitamin D (or both) before intervention, previous thyroid surgery with unknown pre-existing hypoparathyroidism status, and patient death during follow-up.
Owing to organisational constraints at centre E, systematic laryngoscopy was not feasible after each thyroid procedure. Therefore, vocal cord mobility was objectively assessed by laryngoscopy only in instances of postoperative voice alterations. For this reason, we did not include centre E in the analysis for recurrent laryngeal nerve palsy.
Local investigators and staff received handouts and training to become familiar with data collection and procedures for monitoring surgical complications. After each thyroidectomy, the surgeon would systematically complete a patient report form, which included items regarding surgical indication and procedures, and the number of interventions that the surgeon had undertaken that day. Clinical research assistants completed data collection using medical records. These data included patient demographics and information on previous thyroid surgeries, thyroid specimen weights, postoperative supplementation with calcium or vitamin D (or both), calcium assay values, and assessments of vocal cord mobility.
Surgeons also completed individual questionnaires regarding their demographics, previous background, and professional experience, using a one off survey. The surgeon’s length of experience was measured as the number of years spent in practice since graduation (that is, the end of residency). Before the analysis, we separated the experience variable into four categories to depict the successive steps in a surgeon’s career in France: less than 2 years (that is, a very young surgeon starting a surgical fellowship), 2 to 4 years (junior surgeon ending a surgical fellowship), 5 to 19 years (senior surgeon), and 20 years and over (very experienced surgeon at the head of a surgical department).
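The four career stages described above can be expressed as a small lookup. This is only an illustrative sketch: the function name and category labels are hypothetical, and the handling of fractional years (for example, 4.5 years falling in the 2 to 4 year group) is an assumption, since the paper reports experience in whole years.

```python
def experience_category(years_in_practice: float) -> str:
    """Map years in practice since the end of residency to one of four career stages.

    Cut points follow the categories described in the text; treating the
    upper limits as exclusive bounds (2, 5, 20) is an assumption.
    """
    if years_in_practice < 0:
        raise ValueError("years_in_practice must be non-negative")
    if years_in_practice < 2:
        return "<2 years"      # very young surgeon starting a fellowship
    if years_in_practice < 5:
        return "2-4 years"     # junior surgeon ending a fellowship
    if years_in_practice < 20:
        return "5-19 years"    # senior surgeon
    return ">=20 years"        # very experienced surgeon


if __name__ == "__main__":
    for years in (1, 3, 10, 25):
        print(years, experience_category(years))
```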
We obtained the structural characteristics of each participating hospital from the French annual statistics on healthcare facilities in 2008.19 We calculated the proportion of cases included in the study as the ratio of the number of thyroidectomies with a fully completed patient form to the number of eligible thyroidectomies recorded in the hospital administrative databases.
The Research Committee for the Protection of Persons allowed the study in accordance with ethical directives. The National Advisory Committee on Information Processing in Material Research in the Field of Health also approved the study, regarding the anonymous processing of personal health information. The participating centres approved the study protocol without giving incentives to surgeons for their participation. Before surgery, patients received written information about personal data use, and gave verbal consent for sharing their data.
We described the characteristics of the hospitals and surgeons using absolute frequencies with percentages for categorical variables. We calculated mean values with standard deviations and median values with minimum-maximum intervals for continuous variables. We calculated 95% confidence intervals of complication rates using an exact method based on the binomial distribution, and used Spearman’s rank correlation test to investigate the correlation between a surgeon’s length of experience and age, as well as the overall number of thyroidectomies previously undertaken. To identify variables associated with recurrent laryngeal nerve palsy and hypoparathyroidism, we compared thyroid procedures with and without complications. We used χ² and Mann-Whitney tests to compare categorical and continuous variables, respectively.
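As an illustration of an exact binomial confidence interval, the sketch below computes a Clopper-Pearson interval using only the Python standard library. The paper does not state which exact method was used, so Clopper-Pearson is an assumption here, and the output need not reproduce the reported intervals. All function names are hypothetical.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def _solve_increasing(g, target: float, tol: float) -> float:
    """Bisection on [0, 1] for g(p) = target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k: int, n: int, alpha: float = 0.05, tol: float = 1e-10):
    """Clopper-Pearson interval for k events out of n trials."""
    if not 0 <= k <= n or n <= 0:
        raise ValueError("require 0 <= k <= n and n > 0")
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2, tol)
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2, tol)
    return lower, upper


if __name__ == "__main__":
    # 49 events in 2357 procedures, the nerve palsy counts given in the results
    low, high = exact_binomial_ci(49, 2357)
    print(f"{100 * low:.2f}% to {100 * high:.2f}%")
```

Bisection on the binomial tail probabilities avoids any dependency on a statistics library; a beta quantile function from such a library would give the same limits more directly.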
We identified the factors associated with the probability of each complication by calculating crude and adjusted odds ratios using a mixed effects logistic regression. This approach was appropriate for the hierarchical design of this study, in which patients were nested in surgeons, who in turn were nested in hospitals. In the models, we classified a patient’s case mix as a fixed effect (that is, the patient’s age, sex, body mass index, weight of thyroid specimen, and complexity of surgical case), the surgeon as both random and fixed effects (that is, the length of experience and number of surgical procedures done by the surgeon on the same day), and the hospital as a random effect.
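One way to write the three level model described above is sketched below. The notation is hypothetical (it does not appear in the paper), and normally distributed random intercepts are assumed, as is standard for generalised linear mixed models. Here i indexes patients, j surgeons, and k hospitals; x collects the patient case mix variables and z the surgeon level fixed effects (length of experience and number of procedures done that day).

```latex
\operatorname{logit}\,\Pr(Y_{ijk} = 1)
  = \beta_0
  + \boldsymbol{\beta}^{\top}\mathbf{x}_{ijk}
  + \boldsymbol{\gamma}^{\top}\mathbf{z}_{ijk}
  + u_{jk} + v_{k},
\qquad
u_{jk} \sim \mathcal{N}(0, \sigma_u^{2}),\quad
v_{k} \sim \mathcal{N}(0, \sigma_v^{2})
```

In this form, the exponentiated fixed effect coefficients are the adjusted odds ratios, while the surgeon and hospital intercepts absorb the clustering of patients.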
Since patients with high risk of complications might have been cared for by more experienced surgeons, we adjusted outcomes for the complexity of the surgical case. We based complexity measurement on patients’ variables that were significantly associated with complications in bivariate analyses. Consequently, we defined a complex surgical case as a procedure involving a patient with Graves’ disease or a malignant neoplasm, an extended thyroidectomy, or a thyroidectomy with lymph node dissection. Model estimates and their 95% confidence intervals were obtained by using the glmmPQL function20 of the R free software, which fits generalised linear mixed models by an approximate method for facilitating inference.21 We also calculated a measure of the explained variation in final multivariate models using a variant of R², based on absolute deviations.22 23 Percentages of total variance attributable to the patient, surgeon, and hospital centre levels were calculated by dividing the R² of a model with the level of interest by the R² of the final model with all levels included.
Furthermore, we plotted individual surgeons’ performances according to their age or length of experience. Surgeons who operated on fewer than 30 patients were not considered in this performance curve analysis. Accordingly, the individual performances of 15 and 22 surgeons were plotted for recurrent laryngeal nerve palsy and hypoparathyroidism analyses, respectively. For each surgical case, we calculated the predicted probability of complication using parameter estimates from a multivariate logistic regression. The variables used were identical to those used previously for assessing the risk of complications, except for the experience of the surgeon, which was the variable of interest.
For each surgeon, we calculated the expected and observed number of complications as the sum of the predicted complication probabilities and the sum of the observed complications for all the procedures that each surgeon had undertaken, respectively. We measured performance as the difference between the expected and observed complication rates, using the equation ((E − O)/number of procedures) × 100, where E and O are the expected and observed number of complications, respectively. A value above zero indicated good performance, whereas a value below zero indicated poor performance.
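The performance measure can be illustrated with a short sketch, reading the equation as the difference between expected and observed complications, divided by the number of procedures and multiplied by 100. The function name and the example figures are hypothetical and are not study data.

```python
def surgeon_performance(predicted_probs, observed):
    """Return (E - O) / n * 100 for one surgeon.

    predicted_probs: model predicted complication probability per procedure
    observed: 1 if the complication occurred, 0 otherwise, per procedure
    """
    if len(predicted_probs) != len(observed) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    expected = sum(predicted_probs)   # E: sum of predicted probabilities
    observed_count = sum(observed)    # O: number of observed complications
    return (expected - observed_count) / n * 100


if __name__ == "__main__":
    # Hypothetical surgeon: 100 procedures, each with a 2% predicted risk,
    # and one observed complication, giving E = 2 and O = 1.
    probs = [0.02] * 100
    outcomes = [0] * 99 + [1]
    print(surgeon_performance(probs, outcomes))  # positive: fewer complications than expected
```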
We tested whether a trend existed in the performance curves by adjusting a linear model with experience entered both as a linear and a quadratic term. A significant quadratic term suggested that surgeons’ performances did not vary linearly. Instead, performance might have increased, reached a peak, and then decreased with age or length of experience. All tests were two tailed, and P<0.05 was considered significant.
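The trend test rests on fitting performance against experience with both a linear and a quadratic term. The sketch below fits such a model by ordinary least squares using only the standard library. It estimates the coefficients only and does not compute the P value for the quadratic term; the function name and the example data are hypothetical. A negative quadratic coefficient corresponds to a concave curve, with its peak at −b1/(2·b2).

```python
def fit_quadratic(x, y):
    """Least squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    rows = [[1.0, xi, xi * xi] for xi in x]
    # Augmented 3x4 system [X'X | X'y]
    aug = [
        [sum(r[a] * r[b] for r in rows) for b in range(3)]
        + [sum(r[a] * yi for r, yi in zip(rows, y))]
        for a in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("need at least three distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    b0, b1, b2 = (aug[i][3] / aug[i][i] for i in range(3))
    return b0, b1, b2


if __name__ == "__main__":
    # Hypothetical performance scores by years of experience (not study data)
    years = [1, 3, 8, 12, 18, 25, 30]
    scores = [-1.5, -0.4, 0.9, 1.2, 0.8, -0.6, -2.0]
    b0, b1, b2 = fit_quadratic(years, scores)
    print("concave" if b2 < 0 else "not concave")
    if b2 < 0:
        print("estimated peak at", -b1 / (2 * b2), "years")
```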
Of 3679 eligible procedures, 3574 (97%) were completed during the study period. In accordance with the exclusion criteria and complete follow-up, 2357 (66%) procedures were selected for the analysis of recurrent laryngeal nerve palsy (fig 1), and 2904 (81%) for the analysis of hypoparathyroidism (fig 2). We found no significant difference in case mix between patients who were included in analysis and those who were lost to follow-up. Overall rates of permanent nerve palsy and hypoparathyroidism were 2.08% (49/2357, 95% confidence interval 1.53% to 2.67%) and 2.69% (78/2904, 2.10% to 3.31%), respectively. We did not record any bilateral injuries of recurrent nerves.
Table 1 summarises characteristics of the 28 participating surgeons and their distribution among the centres. The surgeons’ mean age was 41 years, with a mean length of experience of 10 years. Length of experience correlated strongly with age (ρ=0.94, P<0.0001) as well as with the overall number of thyroidectomies undertaken by surgeons during their career (ρ=0.85, P<0.0001). Surgeons reported a mean working week of 66 hours. During the one year inclusion period, six (21%) surgeons did fewer than 30 thyroid procedures, seven (25%) did between 30 and 99 procedures, nine (32%) did between 100 and 199 procedures, and six (21%) did at least 200 procedures.
Univariate analysis showed that patients had permanent hypoparathyroidism more frequently after being operated on by surgeons who had spent the shortest or the longest time in practice since graduation (table 2). Postoperative hypoparathyroidism was associated with younger patients, female patients, and heavier thyroid specimens; Graves’ disease or malignant neoplasm; and extended thyroidectomy or lymph node dissection. The frequency of complex surgical cases handled by surgeons was not associated with their experience. Surgeons in practice for at least 20 years did not have a higher proportion of high risk patients with Graves’ disease, thyroid cancer, extended thyroidectomy, or lymph node dissection than their less experienced colleagues (21.1% (194/921) v 22.8% (315/1384) for nerve palsy analysis; 25.9% (286/1105) v 28.3% (476/1680) for hypoparathyroidism analysis).
Table 3 shows multivariable analyses of factors independently associated with complications after adjustment for both patient and surgeon variables. Experience of 20 years or more was the only factor associated with an increased probability of both recurrent laryngeal nerve palsy (odds ratio 3.06, P=0.04) and hypoparathyroidism (7.56, P=0.01). Occurrence of nerve palsy was linked with female patients (2.74, P=0.03), whereas hypoparathyroidism was associated with young patients (0.84 per 10 year increase in age, P=0.04) and inexperienced surgeons (5.93, P=0.02). Patients accounted for the largest proportion of total variability in surgical outcome. Variation attributable to surgeons was greater for hypoparathyroidism (32%) than for recurrent laryngeal nerve palsy (10%).
We found no association between recurrent laryngeal nerve palsy and a surgeon’s length of experience and age (fig 3, web appendix). However, we saw a concave association between hypoparathyroidism and the length of experience or age of the surgeon (P=0.036 and P=0.035, respectively; fig 3). Surgeons between 35 and 50 years old (that is, with 5-20 years of practice since graduation) had better outcomes than their younger and older colleagues.
In this multicentre study, patients were at increased risk of permanent complications after a thyroidectomy when operated on by inexperienced surgeons or by those who had spent the longest time in practice since graduation. We observed a concave association between surgeons’ experience and their case mix adjusted performance, suggesting that surgeons aged 35 to 50 years provided the safest care. Thyroidectomies by surgeons in practice for 20 years or more increased the probability of permanent complications considerably. Large effect sizes might raise important issues about patient safety and surgeons’ experience, but these findings should be interpreted cautiously in the light of wide confidence intervals, owing to the scarcity of monitored complications.
Comparison with other studies
Few studies have specifically assessed the influence of surgeons’ experience on surgical outcomes. Older surgeons who have been in practice for a long time have been found to be associated with increased mortality rates after coronary artery bypass grafting12 or carotid endarterectomy.13 Increased experience, especially among doctors with more than 20 years’ experience in practice,24 might also be associated with a raised risk of inpatient death related to medical care.8 However, compared with medical specialties, poor outcomes among very experienced surgeons could be explained by a weariness related to a high volume of procedures rather than improper technical skills or a lack of compliance with evidence based practice. Indeed, the organisation of healthcare provision in French academic hospitals is hierarchical, with most experienced surgeons at the head of surgical departments and responsible for the education and training of their young colleagues. Attending surgeons within the same team often reproduce similar techniques based on knowledge and skills that they have learned from older surgeons. Thyroid surgery is a highly reproducible and well defined procedure that has not changed substantially over the past decades.25 Furthermore, most experienced surgeons generally spend more time on academic and administrative duties than their younger colleagues, who can focus more on surgical activities. The additional burden on workload related to non-clinical tasks could affect surgeons’ attention in the operating room and jeopardise patient safety.26
According to the surgeons’ performance curves, they are expected to reach peak performance in mid-career after their education ends and their techniques are well mastered. Following this period, and without appropriate training and ongoing challenges, a surgeon’s performance could decline over time because of mental fatigue related to the repetition of a specific procedure over a long time. Physiological factors such as reduced stress with age27 or habit might also lead to poor compliance with new techniques, and then to increased complication rates. Talent and experience are not enough to guarantee safe surgery if a surgeon does not possess the motivation and willingness to progress.28 To maintain a high level of performance for the rest of their careers, surgeons should continuously strive to critically examine the care delivered.29 Based on those assumptions, recertification of surgeons older than 50 years could focus on mental coaching and raising awareness of their own performance. Such an approach needs the implementation of outcome monitoring systems that have proved to be useful in reducing surgical complications in both local30 and national31 initiatives.
Previous analyses based on administrative databases have also established that surgeons with high volume workloads could provide better care in thyroid surgery.32 33 34 In this study, we did not identify any volume threshold or a particular time of day that was related to increased risks of complication. The daily operation schedule seemed to have marginal effects on the occurrence of complications, compared with surgeon experience and patient case mix variables. Similar to previous results, patient unit related factors accounted for the majority of variability in surgical outcome.35 Irrespective of the lower proportion of variability attributable to surgeon and hospital centre levels, meaningful variation in performance could still be present even among a homogeneous population of highly specialised surgeons.
Strengths and limitations of study
Our study had several strengths: it was designed a priori to model the nature of the association between surgeons’ experience and performance in thyroid surgery; patients were recruited and data were recorded prospectively with great thoroughness; objective measurements of performance were based on rigorous follow-up of permanent complications that were systematically assessed and collected (and not self reported by surgeons); the patient case mix and other surgeon factors were partly controlled; and we considered the clustering of patients within surgeon and hospital centre levels.
However, this work had several limitations. The applicability of our results to other surgical fields is questionable, in view of the limited sample size of endocrine surgeons in academic referral centres. Our study had few middle aged surgeons with an intermediate length of experience, suggesting that the cohort of surgeons examined might not be representative of the average population of surgeons. The validity of our results also depends on how surgeons’ experience was measured and whether the risk adjusted rates of complications showed surgical outcomes adequately. The length of experience was highly correlated with the surgeon’s age or volume of thyroidectomies previously undertaken, but our analysis did not account for the possibility that a surgeon took a short career break, or that successive generations of surgeons might have had different experiences of training before entering independent practice.
Furthermore, we cannot exclude that other unknown or unmeasured factors might have explained part of the variation seen in complication rates. Despite adjusting for high risk patients and very complex surgical cases between the various surgeons’ age categories, the singularity of thyroid diseases occasionally required procedures in which complexity might not have been sufficiently captured in our multivariate models. Surgical outcome depends on interactions between many factors that are still poorly understood and difficult to characterise.36 These factors include the combination of manual and intellectual skills acquired during a surgeon’s academic and professional training. The physical and mental condition of the surgeon on a given day is also essential, and a sudden or more insidious fatigue could reduce the surgeon’s vigilance and increase the risk of complications.37 38 39 The number of procedures undertaken,40 41 excessive workloads,30 42 or resident intraoperative participation43 have also been suggested to influence surgical outcome. Although technical skills are a prerequisite for successful surgery, collective factors and a surgeon’s leadership are essential for effective teamwork within the operating room.44 45
Future studies should be conducted with larger populations of surgeons in various settings and other surgical specialties to corroborate the potential link between experience and performance. Since a cross sectional study might be inappropriate to resolve a dynamic phenomenon, a recommended design would be to follow a particular cohort of surgeons over time. Indeed, poor outcomes among older surgeons could be because their training is outdated rather than because of a declining performance with age. Despite practical constraints and possible confounding related to secular trends,8 longitudinal outcome monitoring of surgeons can be implemented to explore changes in performance during their careers.
Conclusions and policy implications
Working in a high volume academic hospital does not compensate for the probable lack of experience of newly appointed endocrine surgeons. Our findings also suggest that a surgeon cannot achieve or maintain top performance passively by accumulating experience, which raises concerns about ongoing training and motivation throughout a career that extends several decades. Solutions to help surgeons avoid poor outcomes could include simulation and proctoring in the early years of their careers, continuous monitoring of performance, and targeted retraining if appropriate. Individual feedback based on outcome indicators might increase awareness about performance and improve safety in surgical practice.
What is already known on this topic
Experts reach their peak performance in various specialties between the ages of 30 and 50 years
Doctors who have been in practice for a long time since graduation possess less factual knowledge and are less likely to adhere to evidence based care
Rates of postoperative complications can vary greatly during a surgeon’s career
What this study adds
Patients were at an increased risk of permanent complications after a thyroidectomy by inexperienced surgeons and by those in practice for 20 years or more
Surgeons aged 35 to 50 years could have better postoperative outcomes than their younger and older colleagues, which raises concerns about ongoing training and motivation of surgeons during their careers
Variability in performance over the course of a surgeon’s career is caused by unknown factors that should be further explored
Cite this as: BMJ 2012;344:d8041
Members of the CATHY Study Group: Laurent Arnalsteen, Robert Caizzo, Bruno Carnaille, Guelareh Dezfoulian, Carole Eberle, Ziad El Khatib, Emmanuel Fernandez, Antoine Lamblin, François Pattou, and Marie-France Six (Lille); Stéphanie Bourdy, Laetitia Bouveret, Cyrille Colin, Antoine Duclos, Benoît Guibert, Marie-Annick Le Pogam, Jean-Christophe Lifante, Jean-Louis Peix, Gaétan Singier, Pietro Soardo, Sandrine Touzet, and Nicolas Voirin (Lyon); Pascal Auquier, Jean-François Henry, Claire Morando, Frédéric Sebag, and Sam Van Slycke (Marseille); Inès Akrout, Fares Benmiloud, Jean-Paul Chigot, Isabelle Colombet, Gaëlle Godiris-Petit, Pierre Leyre, Fabrice Ménégaux, Séverine Noullet, Benoît Royer, and Christophe Tresallet (Paris); Thibault Desurmont, Claudia Dominguez, Jean-Louis Kraimps, Chiara Odasso, and Laetitia Rouleau (Poitiers); Yves-Louis Chapuis, Pierre Durieux, Alain Lepape, and Frédéric Triponez (scientific committee). We thank Kathy Corso (Boston) for reviewing the English language in the paper.
Contributors: AD and JCL obtained funding and supervised the study. AD, CC, FM, FP, FS, JCL, JLK, JLP, and ST conceived and designed the study. CC, FM, FP, FS, JCL, SB, JLK, and JLP were responsible for data acquisition and provided administrative, technical, or material support. AD, JCL, NV, and SB analysed and interpreted the data. AD, JCL, and NV drafted the manuscript. AD, CC, FM, FP, FS, JCL, JLK, JLP, NV, SB, and ST revised the manuscript critically. All authors approved the final version of the manuscript to be published. AD is guarantor. All authors had full access to the data and take responsibility for its integrity and the accuracy of the analysis.
Funding: This study was supported by a grant from the Programme de Recherche en Qualité Hospitalière 2007 of the French Ministry of Health (Ministère chargé de la Santé, Direction de l’Hospitalisation et de l’Organisation des Soins), Hospices Civils de Lyon, Lyon. The funding source had no involvement in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the article for publication. Researchers were independent from the funder.
Competing interests: All authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: that all authors received support from the French Ministry of Health for the submitted work; no relationships with any company that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work.
Ethical approval: This study was approved by the Research Committee for the Protection of Persons and the National Advisory Committee on Information Processing in Material Research in the Field of Health in accordance with ethical directives, in France.
Patient consent: Informed consent was obtained from participating surgeons and all patients for sharing their data.
Data sharing: No additional data available.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.