Bias in identifying and recruiting participants in cluster randomised trials: what can be done?

BMJ 2009; 339 doi: (Published 9 October 2009)
Cite this as: BMJ 2009;339:b4006
  1. Sandra Eldridge, professor of biostatistics1,
  2. Sally Kerry, senior lecturer in medical statistics2,
  3. David J Torgerson, director 3
  1. 1Centre for Health Sciences, Barts and the London School of Medicine and Dentistry, Queen Mary, University of London, London E1 2AT
  2. 2Department of Community Health Sciences, St George’s, University of London, London SW17 0RE
  3. 3York Trials Unit, Department of Health Sciences, University of York, York YO10 5DD
  1. Correspondence to: D J Torgerson djt6{at}
  • Accepted 13 July 2009

Blinded recruitment of participants presents particular challenges for cluster trials, but careful design can minimise the risk of selection bias

Concealment of allocation is regarded as crucial for individually randomised controlled trials. The process ensures that the people selecting participants for randomisation do not know their allocation and avoids selective recruitment. In cluster randomised trials, groups (or clusters) of participants are randomised rather than individuals, yet data are collected on individual participants. Selective recruitment of individual participants can occur in these trials if the people recruiting participants know the participants’ allocation, even when allocation of clusters has been adequately concealed.

Summary points

  • Poor design of cluster trials risks bias in selection of participants

  • Ideally participants should be identified before the cluster is randomised

  • When this is not possible recruitment should be by someone masked to the cluster allocation

  • Statistical solutions to selection bias are less satisfactory than design solutions

  • Cluster trials need to report cluster sizes to enable the readers to ascertain any differences in recruitment between treatment groups

Two recent reviews found that up to 40% of cluster trials published in major medical journals may be biased.1 2 Yet articles and books on cluster randomised trials have tended to focus on statistical issues rather than the need to ensure unbiased sampling of individuals within the cluster.3 4

Cluster randomised trials are well established in education, where their use is relatively straightforward.5 6 Children are usually identified from school or class registers with little or no problem of selection bias being introduced after randomisation, and no consent is required from the children or their parents. Many healthcare trials can be designed along similar lines, with potential participants being all those who belong to a cluster (box 1).7 We describe how bias may occur when individual participants are identified or recruited in cluster randomised trials and discuss how it can be avoided.

Box 1 Individual participants identified and recruited before randomisation: Kumasi Stroke Prevention Trial

  • Clusters—12 villages in Ghana

  • Eligible participants—All residents aged 40 to 75

  • Methods of selecting participants—1896 selected from village census using random sampling, stratified for age and sex

  • Recruitment method—Selected participants invited to attend temporary field station for health screening in their village; people with serious illness and pregnant or lactating women were excluded; 1013 (53%) consented

  • Intervention—Community education to reduce dietary salt

  • Outcome—Reduction in blood pressure after 6 months

  • Comment—After two villages had completed recruitment these were randomised, one to intervention and one to control. As villages varied in the age and sex of their populations stratified sampling ensured balance between intervention groups
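The sampling step in box 1 can be sketched in code. This is a minimal illustration, not the Kumasi investigators' procedure: it assumes the census is a list of records with hypothetical fields `id`, `age_band`, and `sex`, and that a fixed number of residents is drawn from each age-sex stratum so that villages with different population structures yield similar samples.

```python
import random
from collections import defaultdict


def stratified_sample(census, per_stratum, seed=0):
    """Draw up to `per_stratum` residents at random from each age-sex stratum.

    `census` is a list of dicts with the hypothetical keys
    "id", "age_band" and "sex" (assumed field names, not from the trial).
    """
    rng = random.Random(seed)

    # Group residents by stratum.
    strata = defaultdict(list)
    for person in census:
        strata[(person["age_band"], person["sex"])].append(person)

    # Sample without replacement within each stratum.
    selected = []
    for key in sorted(strata):
        members = strata[key]
        selected.extend(rng.sample(members, min(per_stratum, len(members))))
    return selected


# Hypothetical village census: 10 residents in each of 6 strata.
census = []
for band in ("40-49", "50-59", "60-75"):
    for sex in ("F", "M"):
        for _ in range(10):
            census.append({"id": len(census), "age_band": band, "sex": sex})

sample = stratified_sample(census, per_stratum=3)
print(len(sample))  # 18 (3 from each of 6 strata)
```

Because the sample is drawn from the census before the village is randomised, the person drawing it cannot be influenced by the allocation.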

Trials in which individual participants are not recruited

Even in today’s increasingly restrictive research environment it may sometimes be ethically acceptable not to recruit individual patients. For example, in an ongoing trial to increase the identification and referral of women who are victims of domestic violence (box 2), the intervention takes place at the cluster level, outcomes are collected from routine data with adequate regard to confidentiality and data protection, and there is no change in the care arrangements of participants.

Box 2 Individual participants are not recruited: identification and referral intervention to improve safety of women8

Clusters—48 general practices in east London and Bristol

Eligible participants—All registered women over 16

Intervention—Educational package to practices, enhanced referral systems, feedback to practices

Outcome—Referral rates per 100 women

Comment—Outcome data were obtained from practice notes by researchers and no patient identifiable data taken outside practices. Additional screening of women for domestic violence in consultations was not expected to affect usual care negatively

In trials without individual patient recruitment, bias can still occur if individual participants are identified by people who know the allocation status. In a trial randomising general practices,9 patients consulting for ischaemic heart disease were identified by the general practitioners and data collected by people who knew which group their practice was in, even though there was no patient recruitment.

Some people think that individual recruitment with consent is an ethical requirement for all trials.10 However, the balance between scientific considerations and the need for consent can be difficult in cluster randomised trials.11 Given the wide variety of interventions and designs each study needs to be considered on a case by case basis.

Trials that identify and recruit individual participants before randomisation

For trials of management of chronic disease, it may be possible to identify patients and obtain their consent before clusters are randomised. This is often not done, however, even when possible. Kannus and colleagues identified and randomised 22 clusters where older people are supported to live in the community.12 After randomisation, participants were asked if they would be prepared to take part in a study evaluating the use of hip protectors. In the control group 91% agreed to participate and allow the researchers to collect data on hip fractures, but only 69% of the intervention group participated, possibly because they did not want to wear the hip protectors. Studies using this approach tend to show a benefit of hip protectors whereas individually randomised trials do not; this difference might be due to selection bias.13

The key to preventing this sort of problem is trial design.6 Although statistical methods, such as propensity scores, are used to correct for observed group imbalances,14 analytical solutions are unsatisfactory because we can never be sure that we can fully correct for unobserved covariate imbalances.

Even for acute conditions, recruitment before randomisation is possible if risk factors for the condition are well established. For example, women who have had a urinary tract infection have a higher risk of developing another acute episode.15 Investigators wishing to conduct a cluster randomised trial of a treatment for urinary tract infection could recruit a cohort of women who have had an infection and then randomise clusters. The drawbacks of this approach are that more people are recruited than are needed and, depending on the latent time to symptom development, the trial may need to be relatively long, both of which will increase costs.

More generally, one drawback of identifying and recruiting participants before randomisation is the possibility of a long delay between recruitment and implementation of the intervention, which is generally undesirable. To avoid this, individual clusters or a block of clusters can be randomised as soon as all their patients have been recruited. The advantage of the block approach is that balance in the number of clusters in each intervention group can be assured while keeping allocation status concealed from the last cluster to enter the trial. The Kumasi trial (box 1) used blocks of two, but larger blocks also work.7 In a trial of patient held records for adults with learning difficulties, the number of patients per cluster was small and blocks of 10 practices were randomised simultaneously.16
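Randomising a block of clusters once all of them have completed recruitment might look like the following sketch. The function name, the cluster labels, and the use of a seeded generator are assumptions made for illustration; the trials cited may have generated their allocations differently.

```python
import random


def randomise_block(cluster_ids, seed=None):
    """Allocate a block of fully recruited clusters, half to each arm.

    All clusters in the block are allocated at the same moment, so no
    cluster's allocation can be predicted from those already revealed.
    """
    if len(cluster_ids) % 2 != 0:
        raise ValueError("block size must be even to balance the two arms")

    rng = random.Random(seed)
    arms = ["intervention", "control"] * (len(cluster_ids) // 2)
    rng.shuffle(arms)
    return dict(zip(cluster_ids, arms))


# Hypothetical block of two villages that have both finished recruitment.
allocation = randomise_block(["village_A", "village_B"], seed=2009)
print(sorted(allocation.values()))  # ['control', 'intervention']
```

With blocks of two, each block contributes exactly one cluster to each arm; with a block of 10, five clusters go to each arm.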

Trials that have to identify or recruit individual participants after randomisation

If it is not possible to identify and recruit before cluster randomisation, then masked recruiters can be used. Ideally they should recruit outside the cluster setting to avoid unmasking by contact with staff aware of allocation. In the East London Randomised Controlled Trial for High Risk Asthma (ELECTRA) patients with an acute episode were recruited from a clinic in an accident and emergency department by researchers blind to allocation status (box 3).17 In this example, all eligible patients were attending secondary care. In some trials, this method might bias the sample towards more serious cases or result in recruitment of patients not belonging to study clusters.

Box 3 Individual participants are recruited outside the cluster setting: East London Randomised Controlled Trial for High Risk Asthma17

Clusters—44 general practices in east London

Eligible participants—All registered patients attending or admitted to the Royal London Hospital or general practice out of hours service with acute asthma attack

Methods of selecting participants—Researcher, blind to allocation status of patients, recruited patients in accident and emergency department

Allocation to intervention—All patients saw the liaison nurse, who informed them which group they were in

Intervention—Patient review in a nurse led clinic and liaison with general practitioners and practice nurses comprising educational outreach, promotion of guidelines for high risk asthma, and ongoing clinical support

Outcome—Percentage of participants receiving unscheduled care for acute asthma over one year and time to first unscheduled attendance

Comments—Although control patients saw the nurse briefly, the contamination was minor and thought to be better than risking bias in recruitment

Even when participants are recruited within cluster settings, masked recruitment may still be possible if potential participants can be identified by masked recruiters outside the clinical consultation (box 4).18

Box 4 Participants recruited within cluster setting by masked recruiters: effect of counselling on care seeking behaviour in families with sick children18

Clusters—12 primary healthcare centres in rural India

Eligible participants—Children under 5 presenting for curative care and their mothers

Methods of selecting participants—Field workers masked to allocation status enrolled mothers and children attending health centres

Intervention—Doctors had 5 day training on counselling, communication, and clinical skills. They were also given cards to help them, copies of which could be given to mothers

Outcome—Care seeking behaviour of mothers

Comments—The recruiting field workers did not know the precise objectives of the study. They were told that the study aimed to assess children’s illness load and how families respond to illness. Field workers provided similar information to the families when seeking their consent

When masked recruitment is not an option, some effort can be made to standardise recruitment across intervention groups. King and colleagues evaluated an educational intervention to manage patients with incident depression.19 The trialists trained reception staff to recruit the participants. Because the control and intervention recruitment staff had the same training the possibility of recruitment bias was reduced. Nevertheless, reception staff may still introduce selective recruitment, and using non-clinical recruiters may result in some ineligible participants being recruited. In this instance we should include the ineligible participants in the analysis (that is, intention to treat) as the dilution effects are not as serious a threat as selection bias.

Another proposal is to try to mask allocation status by using a partial split plot design.20 Clinicians are masked to the allocation of their clusters, and to maintain blinding a small proportion of patients in the control clusters are randomised to the intervention group and a small proportion of intervention patients to the control group. A study in the Netherlands suggests that this design can be used successfully to mask recruiting clinicians with cluster sizes of up to 10 participants.19 However, the design cannot be used in trials evaluating interventions aimed at cluster staff, and the masking is likely to be maintained only if cluster sizes are small; when cluster sizes are large health professionals are likely to be able to guess which group they are in. In addition, the presence of people receiving the intervention within the control clusters could contaminate the rest of the cluster. This will dilute estimates of effect size. Some of the dilution effects may be offset by using latent variable analytical methods, such as complier average causal effect analysis, although these approaches tend to increase the width of the confidence intervals.21
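To illustrate how a complier average causal effect relates to the diluted intention to treat estimate, the sketch below uses the simple ratio (instrumental variable) form of the estimator. It is an assumption-laden illustration rather than the method of any trial cited here: it takes arm-level summaries as given, ignores clustering, and relies on the usual assumptions that allocation affects outcome only through receipt of the intervention and that nobody does the opposite of their allocation. All names and numbers are hypothetical.

```python
def cace_estimate(mean_outcome_int, mean_outcome_ctl, uptake_int, uptake_ctl):
    """Ratio estimate of the complier average causal effect.

    mean_outcome_*: mean outcome among everyone in each arm (intention to treat).
    uptake_*: proportion in each arm who actually received the intervention.
    """
    itt_effect = mean_outcome_int - mean_outcome_ctl
    uptake_difference = uptake_int - uptake_ctl
    if uptake_difference <= 0:
        raise ValueError("uptake must be higher in the intervention arm")
    # Scaling by the difference in uptake undoes the dilution.
    return itt_effect / uptake_difference


# Hypothetical figures: 75% of the intervention arm and 25% of the
# control arm received the intervention; the diluted effect is -2.0.
print(cace_estimate(-2.0, 0.0, 0.75, 0.25))  # -4.0
```

Dividing by a difference in uptake that is less than 1 enlarges the estimate and its standard error together, which is consistent with the wider confidence intervals noted above.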

Bias may also occur if clusters allocated to an undesired treatment group withdraw between randomisation and recruitment of first patient. To avoid this, clusters can be randomised after an index case has been recruited.14 22 Implementing the intervention when a patient presents is likely to cause practical difficulties for interventions aimed at cluster staff and, more importantly, does not necessarily prevent future selective recruitment beyond the index case. However, it may be useful when the number of eligible patients per cluster is likely to be small.

In a pilot study of an intervention for back pain, doctors who had been trained in the intervention recruited twice as many incident cases, with lower severity, as doctors in the control group.23 One solution might have been to use masked researchers to identify patients from the practice computer and gain their consent to participate. An alternative could have been to recruit people with chronic back pain from practice lists before randomisation of the clusters. But in the event, investigators abandoned a clustered design for this part of their study.

Patient information

Whatever the timing of recruitment, both intervention and control groups should be given similar information about the trial before consent. However, fully informing the controls of the intervention could dilute the intervention effect, causing bias. In the Kumasi trial (box 1), the information sheet did not mention salt but referred to changing diet, and all participants were asked to attend health education sessions on a variety of subjects with the addition of salt reduction in the intervention clusters.

How much information control patients should receive in these circumstances has been the subject of debate.24 This problem is not restricted to cluster randomised trials,25 but in a cluster trial there may be a temptation to have very different information sheets for the intervention and control groups. This should be avoided.

Participants may also receive information from cluster staff, who may know the aim of the intervention but be unaware of the importance of masked recruitment and possibility of contamination. All staff must be adequately trained, and information about the trial’s aims and the cluster allocation should be on a “need to know” basis.

Measuring possible selection bias

When there is no design solution to masking recruitment, authors should report sufficient information to enable readers to judge for themselves whether differential recruitment has taken place. Firstly, authors should report the size of the potential eligible population as well as numbers recruited. When eligible patients can be identified from practice computers before consent this is fairly straightforward. If participants are recruited as a result of a consultation or event, it may be possible to check retrospectively for eligible patients who were not recruited. A trial in pregnant women to reduce use of baby walkers reported the number of participants as a percentage of live births in each intervention group (box 5).26 If it is not possible to obtain an unbiased estimate of the total eligible population, total cluster size could be used instead.21

Box 5 Trial giving data on recruitment rates: baby walker trial

Clusters—64 general practices in Nottinghamshire

Eligible participants—All pregnant women of at least 28 weeks’ gestation

Methods of selecting participants—Midwives recruited in practices after training to standardise recruitment

Intervention—Educational package aimed at discouraging owning and using a baby walker, delivered by midwives and health visitors

Outcome—Percentage of women possessing and using a baby walker when baby was 9 months old

Recruitment rate—Number of women recruited as a percentage of live births was 21.4% in the intervention group and 22.9% in the control group. Percentage of women intending to get a baby walker at baseline was 24.9% in the intervention practices and 36.7% in the control practices

Comments—Women in the intervention group may have been exposed to the intervention before recruitment
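Reporting recruitment as a percentage of the eligible population in each arm, as in box 5, is a simple calculation once the counts are available for each cluster. The sketch below assumes a hypothetical record layout of (arm, number recruited, number eligible) per cluster; the counts are invented for illustration and are not data from the baby walker trial.

```python
from collections import defaultdict


def recruitment_rates(clusters):
    """Return recruited as a percentage of eligible, by trial arm.

    `clusters` is a list of (arm, recruited, eligible) tuples, one per
    cluster (an assumed layout for this example).
    """
    recruited = defaultdict(int)
    eligible = defaultdict(int)
    for arm, n_recruited, n_eligible in clusters:
        recruited[arm] += n_recruited
        eligible[arm] += n_eligible
    return {arm: 100.0 * recruited[arm] / eligible[arm] for arm in eligible}


# Hypothetical counts for four practices.
clusters = [
    ("intervention", 20, 100),
    ("intervention", 30, 100),
    ("control", 25, 100),
    ("control", 35, 100),
]
rates = recruitment_rates(clusters)
print(rates)  # {'intervention': 25.0, 'control': 30.0}
```

Similar rates in the two arms do not rule out selection bias, since the groups may still differ in make-up, but a marked difference is a warning sign.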

Even if there are no differences in recruitment rate, there may be differences in the make-up of the groups. This could be examined, although differences may occur in unmeasured covariates. Statistical testing for baseline imbalances has been recommended to detect possible selection bias due to subversion,27 although we would strongly caution against this because significant differences may occur by chance.

Reports of cluster trials should be clear about how clusters were recruited. This should include how participants were identified and at what stage, and the total potential population within each intervention arm, where possible. This will facilitate the assessment of the internal validity of a cluster trial.




  • We thank Tim Peters and Carol Coupland for helpful comments.

  • Competing interests: None declared.

  • Provenance and peer review: Not commissioned; externally peer reviewed.