Education And Debate

Evaluating and researching the effectiveness of educational interventions

BMJ 1999; 318 doi: (Published 08 May 1999) Cite this as: BMJ 1999;318:1267
  1. Linda Hutchinson, senior registrar in paediatrics (I.hutchinson{at}
  1. Department of Child Health, St George's Hospital Medical School, London SW17 0RE

    Members of the medical profession seem reluctant to value research into the effectiveness of educational interventions.1 One reason for this reluctance may be that there is a fundamental difficulty in addressing the questions that everyone wants answered: what works, in what context, with which groups, and at what cost? Unfortunately, there may not be simple answers to these questions. Defining true effectiveness, separating out the part played by the various components of an educational intervention, and clarifying the real cost:benefit ratio are as difficult in educational research as they are in the evaluation of a complex treatment performed on a sample group of people who each have different needs, circumstances, and personalities.

    Summary points

    • Health professionals are often reluctant to value research into the effectiveness of educational interventions

    • As in clinical research, the need for an evidence base in the practice of medical education is essential

    • Choosing a methodology to investigate a research question in educational research is no different from choosing one for any other type of research

    • Rigorously designed research into the effectiveness of education is needed to attract research funding, to provide generalisable results, and to elevate the profile of educational research within the medical profession


    Choosing a methodology to investigate a research question is no different in educational research from choosing one in any other type of research. Careful attention must be paid to the aims of the research and to the validity of the method and tools selected. Educational research uses two main designs: naturalistic and experimental.2 3

    Naturalistic design

    Naturalistic designs look at specific or general issues as they occur—for example, what makes practitioners change their practice, how often is feedback given in primary care settings, what processes are occurring over time in an educational course, what are the different experiences and outcomes for participants, and can these differences be explained? Like case reports, population surveys, and other well designed observational methodologies, naturalistic studies have a place in providing generalisable information to a wider audience.4

    Experimental design

    In contrast to a naturalistic research design, experimental designs usually involve an educational intervention. The parallels with clinical interventions highlight the three main areas of difficulty in performing experimental research into educational interventions.

    Complex nature of education

    An educational event, from reading a journal article to completing a degree course, is an intervention; it is a complex intervention, and often several components act synergistically. Interventions that have been shown to be effective in one setting may, quite reasonably, not translate to other settings. Educational events are multifaceted interactions occurring in a changing world and involving the most complex of subjects. Many factors can influence the effectiveness of educational interventions (fig 1).

    Fig 1 Examples of factors which may influence the effectiveness of educational interventions

    Sampling

    The randomised controlled trial is regarded as essential to proving the effectiveness of a clinical intervention. In educational research, especially in postgraduate and continuing medical education, the numbers that can be enrolled in a study may not be large enough to allow researchers to achieve statistically significant quantitative results. Comparable control groups may be susceptible to cross contamination from access to some of the elements of the intervention under scrutiny (for example, students may pass their handouts to other students).
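
    The sample size problem described above can be made concrete with a rough calculation. The sketch below is not taken from the article: it assumes a simple two-group comparison of mean scores and uses the standard normal approximation for the number of participants needed in each group. The function name `participants_per_group` and the default significance level and power are illustrative assumptions.

```python
import math
from statistics import NormalDist


def participants_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate number of learners needed in each arm of a two-group study.

    Assumption (not from the article): a two-sided comparison of mean scores,
    using the normal approximation
        n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2
    where d is the standardised difference between the group means.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Small, medium, and large standardised effects (illustrative values only).
    for d in (0.2, 0.5, 0.8):
        print(d, participants_per_group(d))
```

    Under these assumptions a medium effect (d = 0.5) needs about 63 participants in each group and a small effect (d = 0.2) needs several hundred, which is more than many postgraduate or continuing education courses can enrol.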

    Purposive sampling may give different but valid perspectives.3 For instance, if 25 out of 30 people are shown to have benefited from a course, the most interesting question might be “why didn't the other five people benefit?” Were they demographically different? Did they have different learning styles? More detailed interviews with those five students may be more informative about the philosophy and utility of a course than the simple statistics. Analysing the deviant cases as a project proceeds is important for the validity of results.5

    Outcome measures

    Kirkpatrick described four levels of evaluation in which the complexity of the behavioural change increases as evaluation strategies ascend to each higher level (fig 2).6 The length of time needed for the evaluation, the lack of reliable objective measures, and the number of potential confounding factors all increase with the complexity of the change. Researchers in medical education are aware that the availability of funds for research and development is limited unless a link can be made between the proposed intervention and its impact on patient care, yet this is the most difficult link to make.

    Fig 2 Kirkpatrick's hierarchy of levels of evaluation. Complexity of behavioural change increases as evaluation of intervention ascends the hierarchy

    When does evaluation become research?

    Evaluations of the effectiveness of educational interventions may not reach the rigour required for research even though they may use similar study designs, methods of data collection, and analytical techniques. The evaluation of an educational event may have many purposes; each evaluation should be designed for the specific purpose for which it is required and for the stakeholders involved.7 8 For a short educational course, for example, the purpose might be to assist organisers in planning improvements for the next time it is held. The systematic collection of participants' opinions using a specifically designed questionnaire may be appropriate. Just as a well designed audit, although not true research, may be informative and useful to a wider audience, so an evaluation may be useful if it is disseminated through publication. But that does not mean that it is research.

    For an evaluation of an educational intervention to be considered as research, rigorous standards of reliability and validity must be applied regardless of whether qualitative or quantitative methodologies are used.9 10 For example, if the questionnaires used are not standardised and have not been validated for use in a population similar to the one being studied, they should be piloted and should include checks on their internal consistency. A high response rate (80-100%) is required.11 Numerical scores will be meaningless if they are derived from poorly designed scoring systems. If a control group is used, randomisation or case control procedures should meet accepted standards.11
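
    Two of the checks mentioned above, internal consistency and response rate, can be sketched in a few lines. The article does not name a particular statistic, so the use of Cronbach's alpha here is an assumption (it is one widely used measure of internal consistency). The function names and the sample data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a questionnaire.

    responses: one list of item scores per respondent, all the same length.
    Assumption: alpha is used as the internal consistency check; the article
    does not specify which statistic to use.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    item_columns = list(zip(*responses))  # regroup scores by item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary between respondents")
    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)


def response_rate(returned, distributed):
    """Fraction of distributed questionnaires that were returned."""
    if distributed <= 0:
        raise ValueError("distributed must be positive")
    return returned / distributed


if __name__ == "__main__":
    # Hypothetical pilot data: four respondents, three items scored 1 to 5.
    pilot = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [3, 2, 3]]
    print(round(cronbach_alpha(pilot), 2))
    print(response_rate(24, 30) >= 0.80)  # lower bound quoted in the text
```

    Such a calculation says nothing about whether the questionnaire measures what it is meant to measure; it only indicates whether the items vary together in the pilot sample.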

    The Cochrane Collaboration module on effective professional practice details quality assessment criteria for randomised studies, interrupted time series, and controlled before and after studies that have been designed to evaluate interventions aimed at improving professional practice and the delivery of effective health care.11 Educational strategies and events for healthcare professionals fall into this remit.

    Conclusion

    As in clinical research, the need for an evidence base in the practice of medical education is essential for the targeting of limited resources and for informing development strategies. The complexity of the subject matter and the limited availability of reliable, meaningful outcome measures are challenges that must be faced. These difficulties are not enough to excuse complacency. Rigorous research design and application are needed to attract research funding, to provide valid generalisable results, and to elevate the profile of educational research within the medical profession.


    Competing interests: None declared.