Improving oral examinations: selecting, training, and monitoring examiners for the MRCGP
BMJ 1995; 311 doi: https://doi.org/10.1136/bmj.311.7010.931 (Published 07 October 1995) Cite this as: BMJ 1995;311:931
- Richard Wakeford, staff development officer (a)
- Lesley Southgate, professor of general practice (b)
- Val Wass, general practitioner (c)
- (a) The Old Schools, Cambridge University, Cambridge CB2 1TT
- (b) Medical Colleges of St Bartholomew's and the Royal London Hospitals, London EC1M 6BQ
- (c) Chislehurst, Kent BR7 5AX
- Correspondence to: Mr Wakeford.
- Accepted 14 July 1995
Unless examiners are carefully selected, trained, and monitored, examinations may become haphazard. This is perhaps most true of oral or viva voce (“viva”) examinations, which can generate marks unrelated to competence. To help other bodies to short circuit some years of experiment in connection with the oral component of the Royal College of General Practitioners' membership examination (MRCGP), this paper describes the selection, training, guidance, and monitoring arrangements that have been developed.
The oral or viva voce examination (“viva”)—a general non-patient based encounter between a candidate and one or more examiners—has held an important place in medicine for centuries.1 Tradition aside, it is used for its flexibility, its apparent fidelity (much medicine concerns oral encounters over issues of diagnosis and management), and its potential for testing higher order cognitive skills.
Unfortunately oral examinations are prone to many errors.2 These include errors relating to halo effects (a judgment of one attribute influences judgments of others); errors of central tendency (judgments cluster in the middle); so called errors of logic (mistakes); a general tendency towards leniency; and errors of contrast (judgments of a candidate are influenced by impressions of preceding candidates). Oral examinations tend to test at a low taxonomic level—for example, factual knowledge rather than problem solving.3 Scores are related to irrelevant attributes of candidates such as appearance or confidence.4 Agreement between examiners is often poor.4 It is, moreover, difficult to establish in any formal way how valid an oral examination is.5
Largely abandoned in North America, oral examinations are still widely used in undergraduate and postgraduate examinations in the United Kingdom. In 1990, 19 out of 27 medical schools used vivas in their final qualifying examinations, 11 as a major assessment method.6
The membership examination of the Royal College of General Practitioners (MRCGP) also uses oral examinations. Hitherto, a practical clinical component has not been thought feasible as there are some 2000 candidates each year. Given the centrality of the consultation in general practice, an oral examination component has been regarded as appropriate, often being referred to informally as a “clinical by proxy.”
Evidence suggests that oral examinations can be improved by the careful selection7 and training7 8 of examiners. Much effort has been expended towards enhancing the reliability and validity of the MRCGP, especially by addressing the selection, training, and monitoring of examiners. We describe our approaches and make general recommendations from our experience.
The MRCGP examination currently comprises three written components—a multiple choice test, a modified essay question paper, and a critical reading question paper—and two half hour oral vivas, each conducted by two examiners. Major changes to the examination, planned for 1996, will not affect the vivas.
Poor performance on the aggregate of the written papers excludes some candidates from the oral examinations. A pass is achieved on the aggregate of the five marks. Statistical monitoring ensures that each component contributes equally to the total. The examination and its impact on candidates' learning behaviour have been described.9 10
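The paper does not say how the statistical monitoring is carried out. One common way to make components contribute equally to a total, shown here purely as an illustration and not as the college's method, is to rescale each component to the same mean and spread before adding. The function names and the sample marks below are hypothetical.

```python
from statistics import mean, pstdev


def standardise(marks):
    """Rescale one component's marks to mean 0 and standard deviation 1."""
    centre, spread = mean(marks), pstdev(marks)
    if spread == 0:
        raise ValueError("a component with identical marks cannot be standardised")
    return [(mark - centre) / spread for mark in marks]


def equal_weight_totals(components):
    """Sum standardised marks per candidate.

    `components` maps a component name to a list of marks, one per
    candidate, with candidates in the same order in every list.
    """
    scaled = [standardise(marks) for marks in components.values()]
    return [sum(candidate_marks) for candidate_marks in zip(*scaled)]


# Hypothetical marks for three candidates on two differently scaled components.
sample = {
    "multiple_choice": [40, 50, 60],  # wide numeric spread
    "oral_1": [4, 5, 6],              # narrow numeric spread
}
totals = equal_weight_totals(sample)
```

Without rescaling, the multiple choice marks in this sample would dominate the total simply because their numeric spread is ten times larger.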
Selection of examiners
Examiners need knowledge and skills relating to their subject and towards participation in the design and conduct of the examination. The ability to conduct oral examinations effectively (and to participate in the planning of the written components) requires three further attributes: an approach to the practice of medicine and the delivery of health care that is within the limits of that acceptable to the examiners as a whole; effective interpersonal skills; and the ability to act as a productive member of a small team.
Examiners have consistent and contrasting marking behaviours, for example, hawk or dove and restrained or theatrical.11 Unless extreme (for instance, an examiner who gives the same mark to all candidates), such behaviours are containable, as they can be changed either statistically or by training.12 13 What cannot readily be changed, however, is disagreement with fellow examiners about which answers are better and which are worse: in other words, a low rank order correlation with colleagues' marks.
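The distinction between a hawk and an examiner who ranks candidates differently can be made concrete with a rank order (Spearman) correlation. The sketch below is illustrative only: the paper does not specify which coefficient the college uses, and the function names and marks are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(marks_a, marks_b):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    if len(marks_a) != len(marks_b) or len(marks_a) < 2:
        raise ValueError("need two equally long lists of at least two marks")
    ranks_a, ranks_b = average_ranks(marks_a), average_ranks(marks_b)
    mean_a, mean_b = mean(ranks_a), mean(ranks_b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ranks_a, ranks_b))
    var_a = sum((x - mean_a) ** 2 for x in ranks_a)
    var_b = sum((y - mean_b) ** 2 for y in ranks_b)
    if var_a == 0 or var_b == 0:
        # An examiner who gives every candidate the same mark has no rank order.
        raise ValueError("rank correlation is undefined for constant marks")
    return cov / (var_a * var_b) ** 0.5


# Hypothetical marks given to the same five candidates.
colleague = [5, 7, 6, 8, 4]
hawk = [3, 5, 4, 6, 2]        # two marks lower throughout, same ordering
discordant = [6, 3, 5, 2, 7]  # similar level, opposite ordering

hawk_agreement = spearman(colleague, hawk)
discordant_agreement = spearman(colleague, discordant)
```

With these marks the hawk correlates perfectly with the colleague (1.0) despite marking lower, whereas the discordant examiner's coefficient is -1.0. The first pattern is the kind the text describes as containable; the second is the kind that is hard to change.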
To ensure that examiners are of high quality the college requires that potential examiners must be members of the college and in active general practice. If they took the examination more than 10 years ago, they must retake it. They are required to undertake two activities before they are assessed: answering a selection of recent written questions and marking 20 selected examination papers. They also sit half of a recent multiple choice paper.
The potential examiners' marks on the written work are analysed, together with estimates of how well their marking agrees with one another's and with that of the original examiners. The Examination Board's educational consultant advises whether any of these should give cause for concern.
On the day of assessment, potential examiners are asked to undertake two further tasks: (a) simulated group work on setting and marking written questions and (b) simulated oral examinations, acting in turn as examiner, candidate, and observer. Over a day potential examiners are thus observed undertaking interactive procedures that constitute the core of activities relating to the examination. Experienced examiners independently judge them in terms of their approach to general practice and their skills in interpersonal communication and teamwork. At the end of the day the performance of each potential examiner is reviewed at a meeting chaired by an independent person; the names of those about whom there is serious concern are noted.
When concern about a potential examiner emerges from both parts of this procedure he or she is not invited to join the panel of examiners. The letter sent emphasises that examining skills and clinical skills are different and that rejection as an examiner is no criticism of the person's work as a doctor. (This is analogous to comparisons between the skills needed to interview patients and those needed to interview candidates for a job.14)
Examiners are first accepted for a probationary period of two years. Training takes place as described below, but after 18 months, after a routine video training session, a new examiner's performance is formally reviewed against a set of criteria derived from analyses of examiners' tasks (see box 1). A poorly performing examiner might be counselled to undertake additional, specified training or not to seek formal reappointment.
Examiners' tasks in the oral examination
The parts of an examination need to be defined in terms of their function and content. The function of the oral component of the MRCGP examination is to judge candidates' approach to practice, their decision making skills, and their justification for their decisions.
More difficult, though, when there is no detailed syllabus for the examination, is to define the boundaries of the content of the oral examinations. This is now undertaken by means of a modified Delphi technique, codifying the views of the panel of examiners as a whole. Such a study was first undertaken in 1985 and has been repeated in 1994.9 The attributes identified could be clustered under seven headings, which were used to create a grid (fig 1) to encourage examiners to balance their oral examination and constrain it within the agreed limits, and to be more systematic generally.
A final constraint is the so called high case specificity of performance in medicine generally: a doctor's performance in solving different problems may vary substantially as doctors are often good at some things and bad at others.15 It is vital that as many topics as possible (at least six) are covered in each viva. Such considerations have resulted in a specification for the tasks of examiners during an oral examination (box 1).
Box 1—Examiners' tasks in oral examinations
To mark at least six different topics in each half hour viva
To include adequate exploration of all agreed areas of competence specified for the examination—for example, diagnosis, therapeutics, communication skills, ethics
To explore the candidate's approach to the practice of medicine, searching for coherence, rationality, and consistency
To obtain justification of reported behaviours, approaches, opinions, and attitudes—for example, by reference to published work
To attempt to link stated behaviours and approaches to performance
To avoid topics included in current written examination papers, as appropriate
To grade the candidate on each topic (to include recording each grade)
To make and record an overall judgment, with weighting of individual topics
To conduct the examination with respect for the candidate and fairly with regard to equal opportunities for all
Problems and strategies for oral examiners
Discussions with examiners over 10 years and watching video recorded examinations have identified a variety of practical problems which examiners face. Little published work applies to oral examinations in higher or medical education, but a related subject is the selection interview, which forms a focus in occupational psychology research.16 A summary of problems derived from both sources is shown in box 2.
Box 2—Problems for oral examiners
Practical problems experienced in practice
Dysfunctional start to the oral
Difficulty in covering the ground fast enough
Problem candidates (for example, slow spoken, slow witted, or garrulous)
Losing control to the candidate
Candidate talking about unmarkable yet cognate issues—for example, training experiences
Coexaminer overrunning on a topic
An uncomfortable or dysfunctional end to the oral
Disagreeing with coexaminer about the overall grade
Problems adduced from published work on selection interviews
First impressions will be overly influential on a final judgment
The appearance (attractiveness, particularly) of a candidate will influence the grade given
The contrast with previous candidates can affect an examiner's judgment—for example, after two poor candidates a moderate candidate may seem very good
Examiners will tend to treat preferentially people like themselves—for example, those holding similar values—and people they like
Examiners may be especially critical of faults in candidates which they know they also have
Examiners are trying to make global unidimensional judgments of people such as “good” or “poor”; in practice, most candidates will have good and bad aspects
Reviewing the tasks of examiners in the light of these problems and with the benefit of experience has led us to identify strategies for planning oral examinations (box 3) and practical techniques to assist in their conduct (box 4).
Box 3—Strategies in planning vivas
Spend a few moments initially putting a candidate at ease, shaking hands, and inviting a comment about transport or weather; this develops rapport and avoids initial dysfunction
Introduce each topic and define its area. For example, “I'm going to ask you about juvenile onset diabetes, but I want to concentrate on issues of doctor-patient communication”
Because you are limited to five minutes per topic, go to the core of the question quickly (“what I'm really getting at…”), using short questions and avoiding the verisimilitude of detailed scenarios, which waste time
Avoid factual questions and unmarkable questions. Factual knowledge cannot be reliably tested in a viva and must be the focus of a written test. Unmarkable questions produce information on which you cannot make a judgment, for example, about a candidate's previous colleagues
If you plan to use props (letters, pictures, electrocardiograms, etc) make sure that their function is clear and that they enhance the testing process and do not waste time. (Our experience in the MRCGP examination is that they rarely add much and often waste time and confuse)
There are often no clear cut right and wrong answers in medicine. Because of this, it may be helpful to use a model when presenting a question of choice—for example, the “options, implications, decision, justification” model, asking: What are the options open to you now? What are the implications of each? What would you do? What is the justification for this decision?
Plan tactics for difficult candidates: for example, a slow candidate may be encouraged non-verbally and with specific questions (“Give me three advantages of…”), and such questions can also be used to control an overbearing, bulldozing candidate. A garrulous candidate may be slowed by asking for clarification and interrupted and controlled with body language
Poor candidates may need to be encouraged. But for legal reasons, avoid using terms that may be taken as a statement that they are doing well, such as “that's good.” It is best to use non-verbal encouragement
Arrange a code for communicating with your coexaminer if he or she overruns
When you feel you can award a grade to a topic, do so and finish. The more topics covered in an oral examination the better
When the bell goes, let the candidate finish his or her sentence before closing the examination. Otherwise, examiners may seem abrupt to the point of rudeness when stopping candidates at the bell
Grading and marking
Two main difficulties confront the examiner when marking an examination: the varying attributes given to numbers under different marking systems that examiners may be used to—for example, 55 may be the pass mark under one system and 53 a good pass under another—and how to reach an overall mark from several component marks.
To avoid these problems and to encourage examiners to think about a candidate's performance in the oral examination as a whole, we have developed a grading scale based on simple epithets and more extended descriptions of these (fig 2). General guidance given about grading and marking is shown in box 5.
Box 5—Grading and marking
Each topic should be graded by reference to the list of grades and descriptions. If you are unhappy about a grade's accuracy, annotate it—for example with brackets. Make a note of reasons for giving a grade
Use some questions regularly for calibration purposes. Note on your record card characteristics of answers from poor, average, and good candidates
If you are giving a fairly high grade to a topic ask yourself what the candidate would have to have done better to get a better grade? We find that in this way, examiners may extend their use of the grading scale
At the end of the viva review the list of grades given to each topic. Refer to the list of grades and descriptions: which fits the candidate best? Is an average obvious? If not, consider the firmness of each mark: does this help? Otherwise, are there any examination policies—for example, to err on the side of generosity or caution—which will help you?
When considering your overall grade, review the list of hidden problems in box 2. Would these on balance be tending to push your mark inappropriately high or low?
Beware of the common feeling that candidates improve towards the end of a viva, and of raising a grade because of it. This feeling is more likely to reflect true variations in a candidate's ability among the topics discussed than the candidate's true ability
Do not let your coexaminer browbeat you into changing your grade. Independent judgments are required. Unless it transpires that you have missed a catastrophic or brilliant answer, maintain your judgment
Training of examiners
Examiners for the MRCGP examination are trained by participating in preliminary practice vivas; by observing vivas; by regularly reviewing themselves on video and receiving feedback from others; and by attending an annual workshop.
When they are appointed, new examiners undertake and review practice vivas with volunteer trainees. Supernumerary examiners regularly observe oral examinations, and review them afterwards with the actual examiners. In this way, questions are refined and standards discussed in a way that is helpful to both the observer and the observed, with exchanges of ideas and approaches. At the end of a morning of observing, a structured review session enables observers (and any visitors) to discuss issues with members of the royal college's oral development group.
Formal review of and feedback from videos is provided roughly every year to all examiners. Two pairs of examiners who see the same candidates are recorded on video during a morning examination session of six oral examinations. The afternoon is set aside for them to review the recordings with an educational consultant (a psychologist). There are two sessions. The agenda for the first is set by the consultant (RW), who identifies teaching points for each examiner and identifies them by means of excerpts from the video. The second session is devoted to a consideration of the vivas that caused the examiners the most concern or interest and which they wish to review. These sessions are guided by rules for feedback which ensure that it is supportive at the same time as being effective.
Each year, examiners are invited to attend a three day workshop, which serves various functions. In particular, it affords an opportunity to discuss with the panel as a whole possible developments and directions for change. Training can be targeted at specific groups of examiners. And new approaches can be practised by everyone, together, and difficulties collectively resolved.
The reliability of oral examinations can best be estimated from the extent of agreement between pairs of examiners. The correlation between the two examiners' judgments has steadily risen, and the percentage of grades from the two that are within one grade of each other is now 94%. This encouraging position is seen as resulting from careful selection, monitoring, and training of examiners, including defining the function of the oral examination within the overall examination and specifying its process.
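The "within one grade" figure is a simple proportion, and a minimal sketch of the calculation follows. It assumes each grade on the scale has been coded as a consecutive integer; the coding, the function name, and the sample grades are hypothetical and are not taken from the college's records.

```python
def percent_within_one_grade(grades_a, grades_b):
    """Percentage of candidates whose two overall grades differ by at most one step.

    Grades are assumed to be coded as consecutive integers, so a
    difference of 1 means adjacent grades on the scale.
    """
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("need two equally long, non-empty lists of grades")
    close = sum(1 for a, b in zip(grades_a, grades_b) if abs(a - b) <= 1)
    return 100.0 * close / len(grades_a)


# Hypothetical overall grades from a pair of examiners for eight candidates.
examiner_1 = [3, 5, 6, 4, 7, 2, 5, 8]
examiner_2 = [3, 6, 6, 2, 7, 3, 4, 8]

agreement = percent_within_one_grade(examiner_1, examiner_2)
```

In this sample, seven of the eight pairs of grades are within one step of each other, giving 87.5%. A measure of this kind says nothing about whether both examiners are systematically lenient or severe, so it complements the correlation between examiners' judgments and does not replace it.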
Conducting an effective oral examination requires a great deal of commitment and effort. Without commitment and effort you are likely to generate something approaching random numbers. This should be noted by examining bodies that give equal weight to marks in written papers, clinical examinations, and vivas to obtain an overall mark. We believe that five key elements provide a defensible oral examination:
Identifying the main tasks of examiners, and selecting examiners for these tasks
Careful planning of each oral examination as a whole
Contingency planning for difficult candidates
Providing preliminary and ongoing training of a supportive nature, and ensuring the participation of all examiners in continuing discussions about the oral component and its development
Monitoring the examiners and the examination overall, both statistically and within the training process.
RW is consultant to the Examination Board of Council of the Royal College of General Practitioners, LS is convenor of the Panel of Examiners, and VW is the convenor of the board's Oral Development Group. Much of the content of this paper has been generated in discussions with college examiners, and we thank them for their contributions, especially past and present members of the Oral Development Group, Peter Tate and George Smerdon in particular. This article expresses our views and not those of the college.