Statistics notes: Sample size in cluster randomisation
BMJ 1998; 316 doi: https://doi.org/10.1136/bmj.316.7130.549 (Published 14 February 1998) Cite this as: BMJ 1998;316:549
- a Division of General Practice and Primary Care, St George's Hospital Medical School, London SW17 0RE
- b Department of Public Health Sciences
- Correspondence to: Mrs Kerry
Abstract
Techniques for estimating sample size for randomised trials are well established,1 2 but most texts do not discuss sample size for trials which randomise groups (clusters) of people rather than individuals. For example, in a study of different preparations to control head lice all children in the same class were allocated to receive the same preparation. This was done to avoid contaminating the treatment groups through contact with control children in the same class.3 The children in the class cannot be considered independent of one another and the analysis should take this into account.4 5 There will be some loss of power due to randomising by cluster rather than individual and this should be reflected in the sample size calculations. Here we describe sample size calculations for a cluster randomised trial.
For a conventional randomised trial assessing the difference between two sample means, the number of subjects required in each group, $n$, to detect a difference of $d$ using a significance level of 5% and a power of 90% is given by $n = 21s^2/d^2$, where $s$ is the standard deviation of the outcome measure. Other values of power and significance can be used.1
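The multiplier 21 in this formula can be recovered, approximately, from standard normal quantiles as 2(z + z')^2, where z and z' are the quantiles for a two-sided 5% significance level and for 90% power. The following is a minimal sketch using Python's standard library; the function names and the example values of s and d are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def multiplier(alpha=0.05, power=0.90):
    """Return 2 * (z_{1 - alpha/2} + z_{power})^2 for a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_power) ** 2


def subjects_per_group(s, d, alpha=0.05, power=0.90):
    """Unrounded number of subjects per group needed to detect a difference d."""
    return multiplier(alpha, power) * s ** 2 / d ** 2


# With the defaults the multiplier is approximately 21.
print(round(multiplier(), 2))

# Hypothetical example: standard deviation 1.0, difference to detect 0.5.
print(subjects_per_group(s=1.0, d=0.5))
```

Passing other values of `alpha` and `power` gives the corresponding multiplier for other significance levels and powers.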
For a trial using cluster randomisation we need to take the design into account. For a continuous outcome measurement such as serum cholesterol values, a simple method of analysis is based on the mean of the observations for all subjects in the cluster and compares these means between the treatment groups. We will denote the variance of observations within one cluster by $s_w^2$ and assume that this variance is the same for all clusters. If there are $m$ subjects in each cluster then the variance of a single sample mean is $s_w^2/m$. The true cluster mean (unknown) will vary from cluster to cluster, with variance $s_c^2$. The observed variance of the cluster means will be the sum of the variance between clusters and the variance within clusters, that is, variance of outcome $= s_c^2 + s_w^2/m$. Hence we can replace $s^2$ by $s_c^2 + s_w^2/m$ in the formula for sample size above to obtain the number of clusters required in each intervention group. To do this we need estimates of $s_c^2$ and $s_w^2$.
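The decomposition of the variance of the cluster means can be checked by simulation. The sketch below assumes normally distributed cluster means and observations, which the text does not require, and uses hypothetical variances and cluster size chosen only to make the check quick; the function name is also an assumption.

```python
import random
from statistics import variance


def simulate_cluster_means(sc2, sw2, m, n_clusters, seed=0):
    """Simulate observed cluster means for clusters of m subjects each."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_clusters):
        # True (unknown) cluster mean, varying between clusters with variance sc2.
        true_mean = rng.gauss(0.0, sc2 ** 0.5)
        # m observations within the cluster, each with variance sw2.
        observations = [rng.gauss(true_mean, sw2 ** 0.5) for _ in range(m)]
        means.append(sum(observations) / m)
    return means


# Hypothetical values, not taken from any trial.
sc2, sw2, m = 0.5, 2.0, 10

cluster_means = simulate_cluster_means(sc2, sw2, m, n_clusters=4000)
observed = variance(cluster_means)
expected = sc2 + sw2 / m

print(f"observed variance of cluster means: {observed:.3f}")
print(f"between + within/m:                 {expected:.3f}")
```

The two printed values should agree up to sampling error, which shrinks as the number of simulated clusters increases.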
For example, in a proposed study of a behavioural intervention in general practice to lower cholesterol concentrations practices were to be randomised into two groups, one to offer intensive dietary intervention by practice nurses using a behavioural approach and the other to offer usual general practice care. The outcome measure would be mean cholesterol values in patients attending each practice one year later. Estimates of between practice variance and within practice variance were obtained from the Medical Research Council thrombosis prevention trial6 and were $s_c^2 = 0.0046$ and $s_w^2 = 1.28$ respectively. The minimum difference considered to be clinically relevant was 0.1 mmol/l. If we recruit 50 patients per practice, we would have $s^2 = s_c^2 + s_w^2/m = 0.0046 + 1.28/50 = 0.0302$. The number of practices is given by $n = 21 \times 0.0302/0.1^2 = 63$ in each group. We would require 63 practices in each group to detect a difference of 0.1 mmol/l with a power of 90% using a 5% significance level, a total of 3150 patients in each group.
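The arithmetic of this example can be reproduced in a few lines. The sketch below uses the variance estimates and cluster size quoted above; the function name is hypothetical, and the result is rounded to the nearest whole practice to match the figure in the text (rounding up would be the more cautious choice).

```python
def clusters_per_group(sc2, sw2, m, d, multiplier=21):
    """Unrounded number of clusters per group for 5% significance and 90% power."""
    s2 = sc2 + sw2 / m  # variance of the observed cluster means
    return multiplier * s2 / d ** 2


sc2, sw2 = 0.0046, 1.28  # between and within practice variances from the text
m, d = 50, 0.1           # patients per practice, difference in mmol/l

n = clusters_per_group(sc2, sw2, m, d)
practices = round(n)     # nearest whole number, as in the text

print(f"variance of practice means: {sc2 + sw2 / m:.4f}")  # 0.0302
print(f"practices per group: {n:.2f} -> {practices}")      # 63.42 -> 63
print(f"patients per group: {practices * m}")              # 3150
```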
It can be seen from the formula for the variance of the outcome that when the number of patients within a practice, $m$, is very large, $s_w^2/m$ will be very small and so the overall variance is roughly the same as the variance between practices. In this situation, increasing the number of patients per practice will not increase the power of the study. The table shows the number of practices required for different values of $m$, the number of subjects per practice. In all situations the total number of subjects required is greater than if simple random allocation had been used.
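The levelling off described here can be seen by evaluating the formula over a range of cluster sizes. The sketch below uses the variance estimates from the cholesterol example; the cluster sizes listed are illustrative assumptions and are not necessarily those in the published table, and the function name is hypothetical.

```python
def clusters_per_group(sc2, sw2, m, d, multiplier=21):
    """Unrounded number of clusters per group for 5% significance and 90% power."""
    return multiplier * (sc2 + sw2 / m) / d ** 2


sc2, sw2, d = 0.0046, 1.28, 0.1
cluster_sizes = [10, 25, 50, 100, 500, 5000]  # illustrative values of m

rows = [(m, clusters_per_group(sc2, sw2, m, d)) for m in cluster_sizes]

# As m grows the within practice term vanishes, leaving only the between practice variance.
floor = 21 * sc2 / d ** 2

for m, n in rows:
    print(f"m = {m:>5}: {n:7.2f} practices per group, {n * m:9.0f} patients per group")
print(f"limit as m grows: {floor:.2f} practices per group")
```

The number of practices falls towards the limit as the cluster size increases, while the total number of patients keeps rising.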
The ratio of the total number of subjects required using cluster randomisation to the number required using simple randomisation is called the design effect. Thus a cluster randomised trial which has a large design effect will require many more subjects than a trial of the same intervention which randomises individuals. As the number of patients per practice increases so does the design effect. In the table, the design effect is very small when $m$ is less than 10. A design with about 10 patients per practice, however, would involve recruiting a total of 558 practices, and the nature of the intervention and difficulties in recruiting practices made this impractical. Thus it was decided to recruit fewer practices. The design effect of using 126 practices with 50 patients from each practice was 1.17. This design requires the total sample size to be inflated by 17%. If the study involves training practice based staff it may be cost effective to reduce the number of practices even further. If we chose to use 32 practices then we would need 500 patients from each practice and the design effect would be 2.98. Thus the cluster design with 32 practices would require the total sample size to be trebled to maintain the same level of power.
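The design effect can be computed directly from its definition as a ratio of total sample sizes. The sketch below assumes that under simple randomisation the variance of a single observation is the sum of the between and within practice variances, which is an assumption of this illustration and is not stated in the text; the function name is hypothetical. The results come out close to the figures quoted above, and the second decimal place can shift depending on how the individually randomised sample size is computed and rounded.

```python
def design_effect(sc2, sw2, m, d, practices_total, multiplier=21):
    """Total subjects under cluster randomisation divided by the total
    under simple randomisation, both for 5% significance and 90% power."""
    # Assumption: an individual observation has variance sc2 + sw2.
    individuals_per_group = multiplier * (sc2 + sw2) / d ** 2
    individuals_total = 2 * individuals_per_group
    clustered_total = practices_total * m
    return clustered_total / individuals_total


sc2, sw2, d = 0.0046, 1.28, 0.1

de_50 = design_effect(sc2, sw2, m=50, d=d, practices_total=126)
de_500 = design_effect(sc2, sw2, m=500, d=d, practices_total=32)

print(f"126 practices x 50 patients:  design effect {de_50:.2f}")
print(f" 32 practices x 500 patients: design effect {de_500:.2f}")
```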
We shall discuss the use of the intracluster correlation coefficient in these calculations in a future statistics note.