Education And Debate

Multisource feedback: a method of assessing surgical practice

BMJ 2003; 326 doi: (Published 08 March 2003) Cite this as: BMJ 2003;326:546
  1. Claudio Violato, professor (violato{at},
  2. Jocelyn Lockyer, associate professora,
  3. Herta Fidler, coordinator—evaluation, research and special projectsb
  1. a Department of Community Health Sciences, Faculty of Medicine, University of Calgary, 3330 University Drive NW, Calgary, AB T2N 4N1, Canada
  2. b Office of Continuing Medical Education and Professional Development, University of Calgary
  1. Correspondence to: C Violato
  • Accepted 10 December 2002

New methods are needed for assessing surgeons' performance across a wide range of competencies. Violato and colleagues describe the development of a programme based on feedback from medical colleagues, coworkers, and patients for the assessment of surgeons throughout Alberta, Canada

The assessment and maintenance of competence of physicians have received worldwide attention,1-4 partly in response to concerns about poor performance by physicians and the safety of patients 5 6 and partly as a result of demands for accountability to patients and funding agencies.2-4 New approaches to quality improvement have resulted, as have initiatives focusing on identifying and assessing poor performance.7-9

Throughout the Western world, thinking about competence has shifted. Medical expertise and clinical decision making are increasingly recognised as only some of the components of competence. Communication skills, interpersonal skills, collegiality, professionalism, and a demonstrated ability to improve continuously must also be considered when assessing physicians. 2-4 7 8 10 11

Multisource feedback, using questionnaire data from patients, medical colleagues, and coworkers, is gaining acceptance and credibility as a means of providing primary care physicians with quality improvement data as part of an overall strategy of maintaining competence and certification. 1 7 8 Work with Canadian, American, and Scottish generalist physicians shows that this method is reliable, valid, and feasible. 7 8 12-15 Research in both industry and medicine shows that multisource feedback systems (or 360° feedback) can result in individual improvement and the adoption of new practices. 12 16-18

The College of Physicians and Surgeons of Alberta, the statutory medical registration body for the province of Alberta, adopted a performance appraisal or multisource feedback system for all physicians in its jurisdiction—the physician achievement review program. This system focuses on quality improvement and operates entirely separately from the complaints and disciplinary procedures. Medical colleagues, coworkers (for example, nurses, pharmacists, and psychologists), patients, and the physician (self) all provide survey based data, which are summarised by item and category and compared with the physician's specialty group. The instruments for family physicians were psychometrically tested and adopted. 7 8 19 As part of its overall goal of ensuring that all physicians in the province participate in a multisource feedback process every five years, the college asked a committee of surgeons and social scientists to design and test instruments that could be used for the surgical specialties. This paper describes the development and evaluation of a multisource feedback system for surgeons designed to assess a broad range of competencies.

Summary points

The general competencies of generalist physicians (family physicians and internists) can be assessed by medical peers, coworkers, and patients

Valid and reliable multisource feedback questionnaires are a feasible means of assessing the competencies of practising surgeons in communication, interpersonal skills, collegiality, and professionalism

These quality improvement data can be used to supplement information provided through traditional sources of hospital surgical outcome data

Many surgeons in this study used the feedback to contemplate or initiate changes to their practice


Development of the instrument

The committee of surgeons from the major surgical disciplines developed questionnaires that could be used for all surgical specialties. Their work was based on

  • The generic performance template previously developed for family physicians (medical knowledge and skills, attitudes and behaviour, professional responsibilities, practice improvement activities, administrative skills, and personal health)8

  • Copies of the instruments being used for family physicians19

  • The seven roles or competencies that the Royal College of Physicians and Surgeons of Canada had identified as integral to specialty practice (medical expert-clinical decision maker, communicator, health advocate, manager, professional, collaborator, and scientist-scholar).11

The committee was asked to develop instruments for physician colleagues, coworkers, and patients. The self assessment instrument would use the items in the medical colleague instrument, rewritten in the first person.

The medical colleague and self assessment questionnaires consisted of 34 items that rated the physician on a five point scale (1=among the worst; 5=among the best) or unable to assess. The items examined communication, diagnostic and treatment skills, medical records, transfer and coordination of care, respect for patients, collaboration, professionalism, ability to assess the medical literature, continuing learning, and stress management. The 19 item coworker questionnaire used the same scale and focused on communication, collaboration, respect for patients and colleagues, accessibility, and support for colleague and coworker learning. The 39 item patient questionnaire asked patients for their level of agreement with statements about selected aspects of care and used a five point scale (1=strongly disagree; 5=strongly agree). The items focused on communication, respect, the office and office staff, and information received.
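As an illustration of how responses on such a scale might be summarised for one item, the following minimal sketch averages the ratings that were given and reports the share of raters who chose "unable to assess". Storing "unable to assess" as `None`, the function name `item_summary`, and the example ratings are all assumptions made for this sketch, not details of the programme's own software.

```python
from statistics import mean

UNABLE = None  # assumption: "unable to assess" is stored as None


def item_summary(responses):
    """Return (mean rating, share unable to assess) for one item."""
    if not responses:
        raise ValueError("no responses for this item")
    rated = [r for r in responses if r is not UNABLE]
    share_unable = 1 - len(rated) / len(responses)
    # The mean is undefined when nobody was able to rate the item
    item_mean = mean(rated) if rated else None
    return item_mean, share_unable


# Hypothetical ratings from eight medical colleagues on one item
ratings = [5, 4, 4, None, 5, 3, 4, None]
item_mean, share_unable = item_summary(ratings)
```

Excluding "unable to assess" from the mean, rather than scoring it as zero, keeps the item mean on the 1 to 5 scale; the share of such responses is reported separately so that items few raters can judge remain visible.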

Testing the instrument

We selected a proportionate stratified (by surgical specialty) random sample of 252 surgeons. We invited up to 25 surgeons from each of vascular surgery, obstetrics and gynaecology, plastic surgery, otolaryngology, orthopaedics, general surgery, cardiovascular and thoracic surgery, neurosurgery, ophthalmology, urology, and general practice surgery to participate. Some specialties contributed fewer than 25 surgeons, and all were included.
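The sampling step can be sketched roughly as follows, assuming a cap of 25 surgeons per specialty and assuming that specialties at or below the cap contribute all of their surgeons. The register contents, identifiers, and the function name `sample_by_specialty` are hypothetical; the article does not describe the actual selection procedure in this detail.

```python
import random


def sample_by_specialty(register, cap=25, seed=0):
    """Draw up to `cap` surgeons at random from each specialty.

    `register` maps a specialty name to a list of surgeon identifiers.
    Specialties with `cap` or fewer surgeons contribute everyone.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = {}
    for specialty, surgeons in register.items():
        if len(surgeons) <= cap:
            sample[specialty] = list(surgeons)
        else:
            sample[specialty] = rng.sample(surgeons, cap)
    return sample


# Hypothetical register; the real one is not given in the article
register = {
    "general surgery": [f"GS{i}" for i in range(60)],
    "vascular surgery": [f"VS{i}" for i in range(5)],
}
sample = sample_by_specialty(register)
```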

In this type of study, generalisability (with a goal of Eρ² > 0.70) is a key consideration. 7 8 12-14 The generalisability coefficient (Eρ²) is calculated to determine what modifications can be made to an instrument, by examining both the number of items and the number of raters needed to achieve stable data. Adding items and adding raters both increase generalisability. Instruments that are too short have reduced content validity, whereas instruments that are too long produce redundancy and inefficiency. Similarly, if too many raters are requested it can be difficult to find enough people able to assess a given surgeon, and the quality of the data suffers. On the basis of our previous generalisability analyses of data stability (Eρ² > 0.70), 7 8 we asked surgeons to identify eight coworkers and eight medical colleagues to whom the survey would be sent. We instructed the surgeons to ask 25 consecutive patients to complete surveys and place them in sealed envelopes. Each surgeon completed a self assessment survey.
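The trade-off between the number of raters and data stability can be illustrated with a minimal sketch. It assumes the simplest one-facet design, in which the coefficient for the mean of n ratings is the surgeon variance divided by the surgeon variance plus the residual variance over n. The variance components below are hypothetical, and the article's own analyses may have used a more elaborate design that also includes items.

```python
def g_coefficient(var_person, var_residual, n_raters):
    """Generalisability coefficient for the mean of n_raters ratings.

    Assumes a one-facet design in which the only error source is the
    rater-related residual variance, averaged over n_raters.
    """
    return var_person / (var_person + var_residual / n_raters)


def raters_needed(var_person, var_residual, target=0.70, max_raters=100):
    """Smallest number of raters whose coefficient reaches the target."""
    for n in range(1, max_raters + 1):
        if g_coefficient(var_person, var_residual, n) >= target:
            return n
    return None  # target not reachable within max_raters


# Hypothetical variance components, not taken from the article
var_person, var_residual = 0.10, 0.25
n = raters_needed(var_person, var_residual)
```

Under these made-up components the coefficient rises with each added rater, which is the reasoning behind fixing a minimum number of raters per surgeon rather than accepting whatever number responds.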

Response rate and results of factor analysis

We enhanced content validity (sampling of appropriate content and skills) by using a table of specifications based on the list of core competency areas provided by the College of Physicians and Surgeons of Alberta and asking the working group of surgeons to ensure that each competency was covered within the instruments. The surgeon committee's endorsement of the items confirmed face validity (appearance).

We did exploratory factor analyses for each instrument to ensure that the items grouped into factors consistent with the competencies identified as critical for this quality improvement initiative. We used principal component analysis with varimax rotation to identify internal relations in the ratings and extract the factors (that is, the groups of items on each instrument that were most closely correlated with one another). These became the factors used to develop summary scores (subscales) to provide surgeons with aggregate data. We confirmed the number of scales for each instrument on empirical grounds (eigenvalues greater than 1) and assessed for concordance with previous empirical work. 7 8 We used Cronbach's α to determine internal consistency reliability. We conducted a three month follow up survey to assess whether surgeons had contemplated or initiated changes to their practice on the basis of the multisource feedback.
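For readers unfamiliar with Cronbach's α, the following minimal sketch computes it from the standard formula: the number of items k, divided by k minus 1, multiplied by 1 minus the ratio of the summed item variances to the variance of respondents' total scores. The ratings are invented for illustration and the function name `cronbach_alpha` is an assumption; this is not the software used in the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    `rows` holds one list per respondent, each with one score per item.
    """
    k = len(rows[0])
    items = list(zip(*rows))                        # scores grouped by item
    item_var = sum(variance(col) for col in items)  # sum of item variances
    total_var = variance([sum(r) for r in rows])    # variance of total scores
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical ratings: four respondents, three items
rows = [[5, 5, 4], [4, 4, 4], [3, 4, 3], [5, 4, 5]]
alpha = cronbach_alpha(rows)
```

Values close to 1 indicate that the items rise and fall together across respondents, which is the sense in which an instrument is described as internally consistent.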


We received feedback about the draft instruments from 99 (15.6%) of the 635 surgeons in the province. We made relatively few adjustments.

A total of 201 surgeons provided data for the study. Participants comprised 25 general surgeons, 25 orthopaedic surgeons, 24 obstetricians and gynaecologists, 24 otolaryngologists, 24 ophthalmologists, 20 plastic surgeons, 20 urologists, 15 cardiovascular and thoracic surgeons, 13 neurosurgeons, 6 general practice surgeons, and 5 vascular surgeons. This represented 31.7% of the surgeons in the province. The table presents the response rates and percentage of the possible total for each of the instruments. The response rates for all instruments exceeded 80%. Response rates for each surgeon by type of rater were similarly high (table). For most (67 of 92) of the items on the coworker, patient, and medical colleague instruments, fewer than 20% of respondents reported being unable to assess the physician on that item.

The factors derived from the exploratory factor analyses (table) were consistent with the intent of each of the instruments and the overall areas identified for assessment. The eigenvalues for each of the factors were greater than 1, and the factors accounted for 69.0% of the total variance for the medical colleague instrument, 65.1% for the self assessment instrument, 69.8% for the coworker instrument, and 73.7% for the patient instrument.
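The eigenvalue criterion and the "percentage of total variance" figures can be illustrated with a minimal sketch that counts the principal components of the item correlation matrix with eigenvalues above 1 and sums the share of variance they carry. The data are simulated, the function name `kaiser_factors` is hypothetical, and the varimax rotation used in the study is omitted because rotation does not change the number of retained factors or their combined variance.

```python
import numpy as np


def kaiser_factors(data):
    """Count components with eigenvalue > 1 and the variance they explain.

    `data` is a (respondents x items) array of ratings.
    """
    corr = np.corrcoef(data, rowvar=False)        # item correlation matrix
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted largest first
    retained = eigenvalues[eigenvalues > 1.0]
    explained = retained.sum() / eigenvalues.sum()
    return len(retained), explained


# Simulated ratings (hypothetical): two latent traits drive six items
rng = np.random.default_rng(0)
traits = rng.normal(size=(200, 2))
loadings = np.array([[1, 1, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]])
data = traits @ loadings + 0.5 * rng.normal(size=(200, 6))
n_factors, explained = kaiser_factors(data)
```

Because the eigenvalues of a correlation matrix sum to the number of items, an eigenvalue above 1 marks a component that carries more variance than any single item would on its own.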

The mean ratings on all of the instruments were between 4.0 and 5.0. Overall, the surgeons rated themselves less highly than their medical colleagues, coworkers, and patients rated them.

All of the Cronbach's α reliability indices were >0.90, indicating internally consistent instruments. In the three month follow up survey 144 (71.6%) of the surgeons contemplated or initiated change on the basis of the multisource feedback provided to them (range of changes 1-30; mean (SD) 12.6 (3.3)). These changes focused on communication with patients and colleagues, collaboration, office systems, and stress management.


Our results indicate that multisource feedback is feasible for assessing surgeon competencies for quality improvement purposes. Recruitment and response rates were high, consistent with the mandatory nature of the programme, although participation was not enforced during the development stage. Relatively few surgeons reported difficulty acquiring sufficient patient surveys or identifying sufficient numbers of coworkers and medical colleagues.

The factor analysis indicated that the instruments had theoretically meaningful and cohesive factors consistent with the overall intent of the competency areas determined by the College of Physicians and Surgeons of Alberta and consistent with previous research. 7 8 The high Cronbach's α levels confirmed the reliability of the instruments.

These results indicate that multisource feedback systems can be used to assess key competencies such as communication skills, interpersonal skills, collegiality, medical expertise, and ability to continually learn and improve, which medical organisations and the public believe need attention. 2-4 10 11 Moreover, the feedback from the assessment provoked contemplation or initiation of change in many surgeons. Research on the relation between multisource feedback ratings and direct observation of surgeons' performance or results from objective structured clinical examinations, for example, could be used to confirm the validity of our method. Meanwhile, procedural competence or surgical outcomes that are routinely monitored in hospitals by annual appointment procedures, morbidity and mortality reviews, and critical incident investigations should be used in conjunction with our multisource feedback techniques to enhance performance.


Contributors: All authors contributed to the concept of the study. CV and JL wrote the paper, with contributions from HF. HF supervised the collection and preparation of the data. CV did the data analysis and is the guarantor. Tina Vonhof, project coordinator, managed the data collection. John Swiniarski, assistant registrar, College of Physicians and Surgeons of Alberta, provided ongoing direction and support for the study. Ray Lewkonia provided editorial advice.


  • Funding: Contract with the College of Physicians and Surgeons of Alberta.

  • Competing interests: None declared.

