Analysis And Comment Research methodology

Assessment of generalisability in trials of health interventions: suggested framework and systematic review

BMJ 2006;333:346 (Published 10 August 2006)
  1. C Bonell, senior lecturer (chris.bonell{at},
  2. A Oakley, professor2,
  3. J Hargreaves, lecturer1,
  4. V Strange, research officer2,
  5. R Rees, research officer2
  1. 1 London School of Hygiene and Tropical Medicine, London WC1E 7HT
  2. 2 Social Science Research Unit, Institute of Education, University of London, London
  1. Correspondence to: C Bonell
  • Accepted 19 May 2006

Most evaluations of new treatments use highly selected populations, making it difficult to decide whether they would work elsewhere. Systematic evaluation and reporting of applicability is required

Randomised trials of health interventions generally describe outcomes among participants with little consideration of whether the effects can be generalised. However, generalisability cannot be assumed with either biomedical interventions or more complex social interventions.w1 If their results are to be translatable into policy and practice decisions, trials must provide evidence about how relevant the interventions might be to other sites and populations.1 w2 Such information is particularly crucial for resource poor settings.2

Although CONSORT criteria for reporting randomised trials include assessment of generalisability,3 a framework for empirically assessing and reporting this is lacking. We consider the factors affecting generalisability using examples from HIV and sexual health, examine how a sample of trials looked at generalisability, and suggest how to improve evaluation.

Can the intervention be delivered elsewhere?

Several factors affect whether an intervention can be delivered and received in other sites. Firstly, an intervention must be feasible. Providers will vary in their capacity to implement an intervention,w3 as will institutions in being suitable places for an intervention.w4 The presence of local “champions” may influence feasibility in a particular site.4 Some interventions require the existence of other health services4—for example, services for treating sexually transmitted infections require microbiology laboratories to target the right patients. Interventions may also require adequacy in other sectors such as transport. Feasibility has a cost dimension: an unaffordable intervention lacks general feasibility.

Secondly, an intervention must achieve adequate coverage. This may depend on the overall comprehensiveness of health systems or on whether providers can reach people in other ways—for example, through outreach. Adequate coverage may be more difficult in some sites or sub-populations.

Finally, an intervention generally must be acceptable to be effective. Acceptability refers to participants' assessment of their experience of an intervention and will influence whether recipients adhere to treatment plans, act on health advice, or return for follow-up.4 For example, condom promotion has proved acceptable and subsequently effective in urban Tanzania but not in rural regions.w5 Acceptability will vary between populations as it depends on cultural norms and can have economic dimensions. For example, HIV voluntary counselling and testing services that require clients to attend clinics twice (first for testing and then for results) may be acceptable in high income settings but not low income settings because transport or opportunity costs are too great.w3

Factors relating to delivery of an intervention are best documented by embedding an evaluation of process in trials.5 Such an evaluation collects quantitative and qualitative data on planning, delivery, and uptake, and on how context affects them.

Does the intervention meet recipients' needs?

To be effective an intervention must meet recipients' needs—that is, the recipients must have capacity to benefit from an intervention. Thus potential recipients of an intervention should have similar needs to those of the original study participants. Trial participants may be untypical of the general population even in the study site, let alone in other sites. Trials tend to under-represent certain groups, such as minority ethnic and low income groups, women, and older people, whose needs may differ from those of people included in trials.6 Trials should therefore describe the sociodemographic profile of participants and report the extent to which they are representative of the target population.
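One way to report representativeness is to compare the sociodemographic profile of trial participants with that of the target population. The sketch below is illustrative only: the proportions are hypothetical, the function names are our own, and the 0.1 cut-off for the standardised difference is a commonly used rule of thumb rather than a requirement of any reporting guideline.

```python
import math


def standardized_difference(p_trial: float, p_target: float) -> float:
    """Standardised difference between two proportions.

    Compares the share of trial participants with a characteristic
    against the share in the target population.
    """
    pooled_var = (p_trial * (1 - p_trial) + p_target * (1 - p_target)) / 2
    if pooled_var == 0:
        # Both proportions are exactly 0 or 1.
        return 0.0 if p_trial == p_target else math.inf
    return (p_trial - p_target) / math.sqrt(pooled_var)


def representativeness_report(trial: dict, target: dict, threshold: float = 0.1) -> dict:
    """Return {characteristic: (difference, flagged)} for each characteristic.

    `threshold` is an assumed rule-of-thumb cut-off, not a fixed standard.
    """
    report = {}
    for name, p_trial in trial.items():
        diff = standardized_difference(p_trial, target[name])
        report[name] = (round(diff, 2), abs(diff) > threshold)
    return report


# Hypothetical proportions, for illustration only (not taken from any trial).
trial_sample = {"white": 0.90, "degree_educated": 0.60, "aged_over_50": 0.05}
target_population = {"white": 0.70, "degree_educated": 0.35, "aged_over_50": 0.20}

if __name__ == "__main__":
    for name, (diff, flagged) in representativeness_report(
        trial_sample, target_population
    ).items():
        note = "differs from target population" if flagged else "similar"
        print(f"{name}: standardised difference {diff} ({note})")
```

A table of this kind does not show that effects will differ in under-represented groups, but it makes explicit where the trial sample and the target population diverge.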

If the needs of future potential recipients differ from those of the study participants, interventions may not work in a new population or may have to be adapted. For example, provision of antiretroviral drugs in low income countries, or to certain sub-populations, may have to be accompanied by support to promote adherence in order to achieve outcomes similar to those achieved among trial participants.w6

This is also true of public health interventions. The extent to which a factor contributes to the incidence of a particular disease, and therefore needs intervention, varies across populations. For example, treating ulcerative sexually transmitted infections may have a significant effect on HIV incidence in an HIV epidemic localised within high risk groups but not in a more generalised epidemic.w1 Assessing whether an intervention has met recipients' needs, or will meet those of future recipients, requires investigators to be explicit about the causal pathways through which an intervention is expected to act and to measure relevant pathway variables.
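The point that a factor's contribution to incidence varies across populations can be illustrated with the standard population attributable fraction (Levin's formula). This is a simplified sketch, not a model of the HIV example above: it assumes a single binary risk factor and an unconfounded relative risk that is the same in both populations, and all the numbers are hypothetical.

```python
def population_attributable_fraction(prevalence: float, relative_risk: float) -> float:
    """Levin's formula: share of incidence attributable to a risk factor.

    prevalence     -- proportion of the population exposed to the factor
    relative_risk  -- risk in exposed people divided by risk in unexposed people
    """
    if not 0 <= prevalence <= 1:
        raise ValueError("prevalence must lie between 0 and 1")
    if relative_risk <= 0:
        raise ValueError("relative_risk must be positive")
    excess = prevalence * (relative_risk - 1)
    return excess / (1 + excess)


# Hypothetical values: the same relative risk in two populations that
# differ only in how common the risk factor is.
RELATIVE_RISK = 3.0
populations = {"population A": 0.30, "population B": 0.02}

if __name__ == "__main__":
    for name, prevalence in populations.items():
        paf = population_attributable_fraction(prevalence, RELATIVE_RISK)
        print(f"{name}: {paf:.1%} of incidence attributable to the factor")
```

With these assumed inputs, removing the factor could avert a far larger share of cases in population A than in population B, even though the factor is equally harmful to each exposed person in both.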

Current assessment of generalisability

We reviewed whether trials of HIV prevention targeting homosexually active men explored generalisability or factors affecting this. We obtained and examined all available evaluation reports of eight interventions that a recent systematic review reported to have rigorously evaluated outcomes.7 Two reviewers independently assessed whether the studies had empirically examined local factors affecting feasibility, coverage, and acceptability; evaluated process; assessed needs; and assessed the potential generalisability of interventions.

Six of the eight trials had integral process evaluations,8-14 16 18 but only three of these collected quantitative and qualitative data on the planning, delivery, and receipt of the intervention (table 1).9 12 18 Only one process evaluation stated that consideration of generalisability was an aim.10 Six trials gave some information about participants' ethnicity (usually the proportion described as white).8 9 15-18 Seven trials provided data on educational level.8 9 12 14-17 None commented on the extent to which study samples were representative of the populations being targeted.

Table 1

Interventions and process evaluation in eight studies of HIV prevention

Only those trials incorporating process evaluations identified contextual factors influencing the feasibility, coverage, and acceptability of their intervention (table 2). Elford et al, for example, reported that recruitment and retention of peer educators to provide HIV prevention in gyms was extremely difficult because of educators' low confidence.10

Table 2

Discussion of contextual factors and generalisability in eight studies of HIV prevention

Only one study reported on needs (table 2).19 Although other studies reported baseline sexual behaviour8 9 12 14-17 or sexual health related attitudes or knowledge8 12 16 of the target population or participants, the purpose was to check for baseline differences between intervention and comparison groups rather than to describe normative need.

Most of the studies speculated about the potential generalisability of their intervention to other sites but did not consider this empirically. Rosser et al, for example, wondered whether their intervention might prove more effective among populations with more risky sexual behaviour.17 The trials that examined contextual barriers and facilitators to delivering the intervention could make more considered assessments of generalisability. Two reports referred to sociological theory to hypothesise what contextual factors might have influenced the effect of the intervention in the study site compared with other sites.10 12 However, these trials both reported on interventions previously reported as effective in other contextsw7 that were largely ineffective in their own sites. Therefore, rather than consider the scope for transferring the interventions to new sites, they (reasonably) considered the contextual reasons for failure of transfer.

Systematic evaluation

To make informed decisions about whether they should implement interventions, providers require more information than simply whether interventions are effective in original study sites. They need information on context and needs. However, most of the studies we looked at did not empirically examine generalisability. Phase III trials should be judged not only in terms of the designs and methods they use to examine outcomes3 but also in terms of how they assess generalisability. To enable this, trials should:

  • Include process evaluations as integral elements5

  • Develop evidence based theories about how intervention processes are influenced by contextw8 and how processes might differ if interventions are implemented in other sitesw9

  • Report the extent to which their participants are representative of the population being targeted6

  • Describe the prevalence of the needs being met by the intervention, informed by clear hypotheses about the intervention's mechanism.

We believe that these elements are essential to comply with the existing CONSORT requirement to report on “clinical characteristics” of participants if clinical is interpreted as meaning need for health intervention.

The most useful information on the potential for, as well as the barriers to, transfer of interventions comes from studies that compare an intervention in one site with similar interventions provided elsewhere, as in the study by Elford et al.10 Future phase III research might build on such work by setting out to examine interventions implemented across diverse contexts in multisite studies. These would examine differential effects by site and explore contextual determinants of success to generate hypotheses for future research and guidelines for the implementation of interventions outside trials.w9 This approach is compatible with a phased approach to intervention trials. Assessing generalisability in phase III should inform choice of sites for phase IV replicability research.20 However, such multi-site evaluations are unlikely unless funding for such work is increased.
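Examining differential effects by site can start with a simple test of heterogeneity. The sketch below computes a log odds ratio for each site, an inverse-variance pooled estimate, Cochran's Q, and the I² statistic. It is a minimal illustration under stated assumptions: the site counts are hypothetical, a fixed-effect model is assumed, and the function names are our own. With only a few sites such tests have low power, so the results are better used to generate hypotheses than to confirm them.

```python
import math


def log_odds_ratio(events_tx: int, n_tx: int, events_ctl: int, n_ctl: int):
    """Return (log odds ratio, variance) for one site's 2x2 table."""
    a, b = events_tx, n_tx - events_tx
    c, d = events_ctl, n_ctl - events_ctl
    if 0 in (a, b, c, d):
        # Continuity correction so the log odds ratio is defined.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    estimate = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return estimate, variance


def heterogeneity(site_estimates):
    """Return (pooled log OR, Cochran's Q, I squared) across sites.

    site_estimates -- list of (estimate, variance) pairs, one per site.
    """
    weights = [1 / var for _, var in site_estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, site_estimates)) / sum(weights)
    q = sum(w * (est - pooled) ** 2 for w, (est, _) in zip(weights, site_estimates))
    df = len(site_estimates) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i_squared


# Hypothetical counts per site:
# (events in intervention arm, n, events in control arm, n).
sites = {
    "site A": (20, 200, 40, 200),
    "site B": (40, 200, 40, 200),
    "site C": (22, 200, 41, 200),
}

if __name__ == "__main__":
    estimates = [log_odds_ratio(*counts) for counts in sites.values()]
    pooled, q, i2 = heterogeneity(estimates)
    print(f"pooled odds ratio: {math.exp(pooled):.2f}")
    print(f"Cochran's Q: {q:.2f} on {len(estimates) - 1} degrees of freedom")
    print(f"I squared: {i2:.0%}")
```

Where heterogeneity is found, process data on feasibility, coverage, and acceptability at each site are what allow it to be interpreted.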

Finally, systematic reviews should consider generalisability. Currently, many do not examine intervention process or context and do not comment on the potential for and limits to intervention effects being generalised to other settings and populations.w10

Summary points

Few randomised trials assess the generalisability of their results

Such information is essential to decisions about adopting new interventions

Trials should include evaluations of the feasibility, coverage, and acceptability of interventions

They should also examine exactly for whom and what interventions are effective


  • References w1-w10 are on

  • Contributors and sources This article is based on an analysis of trials of HIV prevention for men who have sex with men that were identified in a systematic review. The authors have experience and expertise in primary evaluations and systematic reviews of public health interventions and the integrated analysis of outcome and process data. All authors contributed to the conception and design of the analysis presented and to analysis and interpretation of the studies reviewed. All contributed to drafting and revising the intellectual content of the article. CB is the guarantor.

  • Competing interests None declared.

