The intracluster correlation coefficient in cluster randomisation
BMJ 1998;316:1455 doi: https://doi.org/10.1136/bmj.316.7142.1455 (Published 09 May 1998)
- a Division of General Practice and Primary Care, St George's Hospital Medical School, London SW17 0RE
- b Department of Public Health Sciences
- Correspondence to: Mrs Kerry
We have described the calculation of sample size when subjects are randomised in groups or clusters in terms of two variances: the variance of observations taken from individuals in the same cluster, s_w², and the variance of true cluster means, s_c².1 We described how such a study could be analysed using the sample cluster means. The variance of such means would be s_c² + s_w²/m, where m is the number of subjects in a cluster. We used this to estimate the sample size needed for a cluster randomised trial.
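As a rough illustration of how the two components of variance combine, the short sketch below (Python) computes the variance of a sample cluster mean, s_c² + s_w²/m, and puts it into the usual normal-approximation formula for comparing two groups of cluster means. The variance figures are those quoted below for the cholesterol example; the cluster size, difference to be detected, significance level, and power are assumed values for illustration only, not taken from the text.

```python
# Minimal sketch: sample size based on cluster means (cluster size and
# target difference are assumed values, not from the trial).
s_w2 = 1.28     # within-cluster (between-patient) variance, cholesterol example
s_c2 = 0.0046   # between-cluster variance of true cluster means
m = 50          # assumed number of subjects per cluster
d = 0.1         # assumed difference in mean cholesterol to be detected (mmol/l)

# Variance of a sample cluster mean
var_cluster_mean = s_c2 + s_w2 / m

# Normal-approximation two-group formula applied to cluster means:
# clusters per group = 2 * (z_alpha + z_beta)^2 * variance / d^2
z_alpha = 1.96  # two-sided 5% significance (assumed)
z_beta = 0.84   # 80% power (assumed)
clusters_per_group = 2 * (z_alpha + z_beta) ** 2 * var_cluster_mean / d ** 2

print(f"variance of cluster means = {var_cluster_mean:.4f}")
print(f"clusters per group (approx) = {clusters_per_group:.1f}")
```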
This sum of two components of variance is analogous to what happens with measurement error, where we have the variance within the subject, also denoted by s_w², and between subjects (s_b²).2 One way of summarising the relation between these two components is the intraclass correlation coefficient, the correlation which we expect between pairs of observations made on the same subject. This is equal to s_b²/(s_b² + s_w²).2 We can calculate a similar intraclass correlation coefficient for pairs of subjects in the same cluster, r_I = s_c²/(s_c² + s_w²). This is also called the intracluster correlation coefficient.
For cholesterol concentration in the Medical Research Council thrombosis prevention trial the two components of variance were s_w² = 1.28 and s_c² = 0.0046.1 3 This gives the intracluster correlation coefficient r_I = 0.0046/(0.0046 + 1.28) = 0.0036. Such intracluster correlations are typically small. This trial had an intervention aimed directly at the patient and an outcome measurement for which the variance between practices is low compared with the variability between patients within a practice. Studies where the intervention is aimed at changing the doctor's behaviour may have a greater intracluster correlation. For example, in a trial of guidelines to improve the appropriateness of general practitioners' referrals for x ray examinations, the intracluster correlation was 0.0190.4 5 We might expect the intracluster correlation to be higher in a trial where the intervention is directed at the doctor rather than the patient, because it includes the variation in the doctors' responses.
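The calculation is simple to reproduce. A minimal sketch in Python, using the components of variance quoted above for the thrombosis prevention trial:

```python
def intracluster_correlation(s_c2: float, s_w2: float) -> float:
    """Intracluster correlation r_I = s_c^2 / (s_c^2 + s_w^2)."""
    return s_c2 / (s_c2 + s_w2)

# Components of variance for cholesterol in the MRC thrombosis prevention trial
r_I = intracluster_correlation(s_c2=0.0046, s_w2=1.28)
print(f"r_I = {r_I:.4f}")  # 0.0036
```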
The design effect is the ratio of the total number of subjects required using cluster randomisation to the number required using individual randomisation.1 It can be expressed neatly in terms of the intracluster correlation and the number of subjects in a single cluster, m: D = 1 + (m − 1)r_I. If there is only one observation per cluster, m = 1, the design effect is 1.0, and the two designs are the same. Otherwise, the larger the intracluster correlation (that is, the more important the variation between clusters), the bigger the design effect and the more subjects we will need to achieve the same power as a simply randomised study. Even a small intracluster correlation will have an impact if the cluster size is large. A trial with the same intracluster correlation as the x ray guidelines study, 0.019, and m = 50 referrals per practice would have design effect D = 1 + (50 − 1) × 0.019 = 1.93. Thus it would require almost twice as many subjects as a trial in which patients were randomised to treatment individually.
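The design effect and the resulting inflation of the sample size can be sketched in the same way. In the code below the design effect uses the figures quoted for the x ray guidelines example, but the individually randomised sample size of 400 is an arbitrary assumed figure, used only to show how the design effect would be applied.

```python
def design_effect(m: int, r_I: float) -> float:
    """Design effect D = 1 + (m - 1) * r_I for clusters of size m."""
    return 1 + (m - 1) * r_I

D = design_effect(m=50, r_I=0.019)
print(f"design effect = {D:.2f}")  # 1.93

# Assumed example: inflate an individually randomised sample size
n_individual = 400                 # assumed figure, for illustration only
n_cluster = D * n_individual
print(f"subjects needed with cluster randomisation (approx) = {n_cluster:.0f}")
```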
The main difficulty in calculating sample size for cluster randomised studies is obtaining an estimate of the between-cluster variation or intracluster correlation. Estimates of variation between individuals can often be obtained from the literature, but even studies that use the cluster as the unit of analysis may not publish their results in such a way that the between-practice variation can be estimated. Recognising this problem, Donner recommended that authors should publish the cluster-specific event rates observed in their trial. This would enable other workers to use this information to plan further studies.
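Where cluster-specific rates or means are published, one common way of recovering an intracluster correlation (not described in the text, so offered only as a sketch) is the one-way analysis of variance estimator. The example below assumes clusters of equal size and entirely hypothetical 0/1 referral outcomes for three practices.

```python
from typing import Sequence

def anova_icc(clusters: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA estimate of the intracluster correlation.

    Assumes k clusters of equal size m; `clusters` holds the individual
    observations (for event rates, code each subject as 0 or 1).
    """
    k = len(clusters)
    m = len(clusters[0])
    grand_mean = sum(sum(c) for c in clusters) / (k * m)
    cluster_means = [sum(c) / m for c in clusters]

    # Between- and within-cluster mean squares
    msb = m * sum((cm - grand_mean) ** 2 for cm in cluster_means) / (k - 1)
    msw = sum((y - cm) ** 2
              for c, cm in zip(clusters, cluster_means)
              for y in c) / (k * (m - 1))

    return (msb - msw) / (msb + (m - 1) * msw)

# Hypothetical example: three practices, ten patients each, 0/1 referral outcomes
example = [
    [1, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 1, 0, 0, 1, 0],
]
print(f"estimated r_I = {anova_icc(example):.3f}")
```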
In some trials where the intervention is directed at individual subjects and the number of subjects per cluster is small, we may judge that the design effect can be ignored. On the other hand, where the number of subjects per cluster is large, an estimate of the variability between clusters will be important.