Bias in randomised controlled trials: comparison of crossover group and parallel group designs
BMJ 2015; 351 doi: https://doi.org/10.1136/bmj.h4283 (Published 07 August 2015) Cite this as: BMJ 2015;351:h4283
- Philip Sedgwick, reader in medical statistics and medical education
- Correspondence to: P Sedgwick
The efficacy and side effects of the synthetic cannabinoid nabilone were compared with those of the weak opioid dihydrocodeine in the treatment of chronic neuropathic pain. A randomised, double blind, crossover trial study design was used. Participants received a maximum daily dose of 2 mg nabilone or 240 mg dihydrocodeine. The trial lasted for 14 weeks and comprised two treatment periods, each of six weeks’ duration, separated by a two week washout period. Participants were patients with chronic neuropathic pain aged 23-84 years, who were recruited using convenience sampling from outpatient units at three hospitals in the United Kingdom. The sample comprised 96 patients, who were randomised to nabilone (n=48) and dihydrocodeine (n=48) in the first treatment period.1
The main outcome was pain as measured on a visual analogue scale over the final two weeks of each treatment period. Side effects were measured by a questionnaire. The researchers reported that dihydrocodeine provided better pain relief than nabilone and had slightly fewer side effects, although no major adverse events occurred for either drug.
Which of the following statements, if any, are true?
a) The crossover design is referred to as a “within subjects” study design
b) The sample was prone to selection bias
c) The crossover trial was prone to a carryover effect
d) The crossover trial would be expected to take longer than the equivalent parallel groups design
e) The crossover trial design would be expected to promote internal validity compared with the equivalent parallel groups design
Statements a, b, c, d, and e are all true.
The aim of the trial was to compare the efficacy and side effects of nabilone with dihydrocodeine in the treatment of chronic neuropathic pain. A randomised double blind crossover trial was conducted. Participants received both treatments in succession, with the order—nabilone followed by dihydrocodeine, or vice versa—determined at random. Each treatment was delivered over a six week period. Because participants received both treatments the comparison between treatments was made within the participants. The trial is therefore referred to as a “within subjects” study design (a is true). Crossover trials have been described in more detail in a previous question.2
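The random determination of treatment order can be illustrated with a short sketch. The trial report summarised here does not say how the randomisation was generated, so the shuffled list below is only an assumed method; the function name, the sequence labels, and the seed are hypothetical. It allocates 96 participants in equal numbers to the two treatment orders, matching the 48 and 48 described above.

```python
import random


def allocate_sequences(n_participants, seed=None):
    """Randomly allocate participants to the two treatment orders in equal numbers.

    Assumed method for illustration only; not taken from the trial report.
    """
    if n_participants % 2 != 0:
        raise ValueError("equal allocation needs an even number of participants")
    half = n_participants // 2
    # Build a list with each treatment order appearing equally often ...
    sequences = (["nabilone then dihydrocodeine"] * half
                 + ["dihydrocodeine then nabilone"] * half)
    # ... then shuffle it so that the order each participant receives is random.
    rng = random.Random(seed)
    rng.shuffle(sequences)
    return sequences


allocation = allocate_sequences(96, seed=1)
print(allocation.count("nabilone then dihydrocodeine"))  # 48
print(allocation.count("dihydrocodeine then nabilone"))  # 48
```

Each participant still receives both treatments; only the order differs between the two groups.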
When conducting a trial it is important that potential biases are minimised, in particular biases associated with recruiting participants, recording the outcome measurements, the loss of patients to follow-up, and patients not providing outcome measurements. By minimising any potential biases, external validity and internal validity are promoted. External validity is the extent to which the study results can be generalised to the population that the sample represents.3 Internal validity is the extent to which differences in outcome can be ascribed to differences in treatment and not differences in characteristics or prognostic factors at baseline, thereby permitting the inference of causality to be ascribed to a treatment.3
Some of the biases that can exist for randomised controlled trials with a crossover study design are those that may also be found in a randomised controlled trial with a “parallel groups” study design (as described in a previous question4). These include selection bias (b is true), ascertainment bias, and resentful demoralisation, which are described further below. If a trial with a parallel groups study design had been used to compare nabilone with dihydrocodeine in the treatment of chronic neuropathic pain, participants would have been allocated to a treatment and then received that treatment for the entire study period. The treatment groups would be followed alongside each other, hence the term parallel groups. The design is sometimes referred to as “between subjects” because the outcomes would be compared between independent groups of patients, that is, between subjects.
The trial participants were patients with chronic neuropathic pain in outpatient units at three hospitals in the UK. As for any study, the sampling method and the availability of participants invited to take part will influence the extent of selection bias. Selection bias would have occurred in the above trial if the sample was systematically different from the population it represented. The sample was obtained using convenience sampling,5 so it would have been expected to be systematically different from the population with respect to its demographics and health outcomes. Hence the sample would be prone to selection bias (b is true). If selection bias exists it results in a lack of external validity. Selection bias is a general term used to describe a group of biases and effects, and it includes non-response bias and volunteer bias. Non-response bias describes the potential differences between those who accepted the invitation to be in the trial (responders) and those who did not (non-responders). Volunteer bias describes the potential differences between those who volunteered to be in the sample and the population.
Ascertainment bias would have occurred if the measurements of the outcomes were systematically different from the participants’ experiences. Such biases in data collection can be conscious or unconscious and can originate from the investigators or participants. When ascertainment bias originates from the participants it is referred to as response bias, and when it originates from the researchers it is referred to as assessment bias. Ascertainment bias is more likely to occur when the participants and the researchers are not blind to the treatment allocation. Ascertainment bias may occur for a variety of reasons. For example, the researchers may favour one of the treatments and wish to show that it is the more effective treatment. If the participants are aware of this they might report their pain in a way they perceive would please the researchers. If the researchers measure the outcomes, they may record the participants’ responses inaccurately or encourage the participants to respond in a way that favours a particular treatment. Because the above trial was double blind, ascertainment bias would have been minimised.
When participants are aware of their treatment allocation a trial is typically prone to resentful demoralisation. It is possible that participants’ responses to treatment might be influenced by knowledge of their treatment allocation. Although participants volunteer to take part in a trial and are aware that they will be randomised to a treatment group, they might still have a preference for one of the treatments, particularly if they had received it previously. Patients who receive their preferred treatment are more likely to be better motivated and show greater adherence to their treatment regimen. By contrast, patients who do not receive their preferred treatment might exhibit resentful demoralisation, whereby they comply poorly and possibly withdraw from the trial. In particular, patients’ preferences for a particular treatment might have an important effect on the perceived benefit of and reporting of side effects for their allocated treatment. Because the above trial was double blind, resentful demoralisation would have been minimised.
Compared with a parallel groups study, the use of the crossover trial design in the above study would have minimised certain biases while making the study prone to others. In particular, when compared with a parallel groups study the use of a crossover trial design is expected to minimise confounding. However, the crossover trial study design is prone to a period effect and also a carryover effect (c is true). These biases and effects are described further below. Because participants must take both treatments in a crossover trial the study period is expected to be longer than for the equivalent parallel groups design, where participants take only one treatment (d is true). Because the crossover trial design makes greater demands on patient time, a higher proportion of participants would be expected to drop out of the trial or be lost to follow-up. Compared with the equivalent parallel groups design, the outcome measurements in a crossover trial are therefore more likely to be incomplete, which presents challenges in statistical analysis.6
A crossover trial design is used when the condition being investigated is chronic and treatment is for the short term relief of symptoms rather than a cure. Furthermore, the patient’s underlying condition and potential to respond to treatment must remain unchanged from the first to the second treatment period. If not, a period effect may exist—that is, a systematic difference between the treatment periods in the outcome scores for a treatment. If a period effect existed in the above trial, the average pain score for patients who received, for example, dihydrocodeine in the first treatment period, would have differed (better or worse) from that seen for those who received dihydrocodeine in the second period. If a period effect had occurred, hopefully it would have applied equally to both treatment regimens—dihydrocodeine followed by nabilone, or vice versa. Therefore, by randomising patients to a treatment regimen any period effect would potentially have been averaged out and minimised in the comparison of treatments.
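The averaging out of a period effect can be shown with a small worked calculation. All numbers below are hypothetical and are not taken from the trial. The sketch assumes that the treatment effect and the period effect simply add together, that the two treatment orders contain equal numbers of patients, and that there is no carryover effect.

```python
# Hypothetical mean changes in pain score (0-100 visual analogue scale).
BASELINE = 60.0
TREATMENT_EFFECT = {"nabilone": -10.0, "dihydrocodeine": -16.0}
PERIOD_EFFECT = {1: 0.0, 2: -5.0}  # scores are 5 points lower in period 2


def mean_pain(treatment, period):
    """Mean pain score under an assumed additive model."""
    return BASELINE + TREATMENT_EFFECT[treatment] + PERIOD_EFFECT[period]


def within_subject_difference(first, second):
    """Dihydrocodeine minus nabilone for patients given `first`, then `second`."""
    scores = {first: mean_pain(first, 1), second: mean_pain(second, 2)}
    return scores["dihydrocodeine"] - scores["nabilone"]


# Each treatment order on its own is distorted by the period effect ...
diff_nabilone_first = within_subject_difference("nabilone", "dihydrocodeine")
diff_dihydrocodeine_first = within_subject_difference("dihydrocodeine", "nabilone")

# ... but the distortion has opposite signs, so it cancels in the average.
estimate = (diff_nabilone_first + diff_dihydrocodeine_first) / 2
true_difference = TREATMENT_EFFECT["dihydrocodeine"] - TREATMENT_EFFECT["nabilone"]

print(diff_nabilone_first)        # -11.0
print(diff_dihydrocodeine_first)  # -1.0
print(estimate, true_difference)  # -6.0 -6.0
```

If the period effect differed between the two treatment orders, or the orders were of unequal size, the cancellation would no longer be exact.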
The trial was prone to a carryover effect (c is true), both pharmacological and psychological, of the treatment received in the first period into the second treatment period. It was therefore imperative that the trial incorporated a washout period between treatment periods, the purpose of which was to minimise any carryover effects. It would have been unethical to expect patients not to receive treatment for their pain during this period. In the washout period patients were weaned off the treatment received in the first treatment period and subsequently allowed to take only paracetamol or codeine. However, a carryover effect may still have existed at the beginning of the second treatment period. To minimise any such effect, the analyses were based on data from the last two weeks of each treatment period.
Because the participants received both treatments, the outcome measurements were paired. Each participant acted as his or her own control. The characteristics of the treatment groups would therefore have been similar, if not the same, at baseline. The crossover trial design was therefore less prone to confounding than the equivalent parallel groups design. Confounding is a difference between treatment groups in the characteristics that influence the association between the treatments and outcomes. These include demographics, prognostic factors, and other characteristics that may influence someone to participate in or withdraw from a trial. If confounding is minimised at baseline, any differences between treatment groups in outcomes at the end of the trial are more likely to be due to differences in treatment received than differences in baseline characteristics. The comparison between treatments was also made more precise because the comparison within subjects removed the natural biological variation between participants from the measurement of the outcomes. Hence, the use of a crossover trial design is expected to promote internal validity compared with the equivalent parallel groups design (e is true).
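The gain in precision from a within subjects comparison can be illustrated with simulated data. The sketch below is not an analysis of the trial: the standard deviations, the size of the treatment difference, and the random seed are all assumed for illustration. Each simulated participant has a usual level of pain that is shared by both of his or her measurements, and the same data are then summarised in two ways: as paired differences, and as if the two sets of scores had come from independent groups.

```python
import math
import random
import statistics

rng = random.Random(2015)  # arbitrary seed so that the run is repeatable

n = 96                   # same number of participants as in the trial
between_sd = 15.0        # assumed variation between participants
within_sd = 5.0          # assumed variation within a participant
true_difference = -6.0   # assumed dihydrocodeine minus nabilone difference

nabilone, dihydrocodeine = [], []
for _ in range(n):
    usual_pain = rng.gauss(60.0, between_sd)  # shared by both measurements
    nabilone.append(usual_pain + rng.gauss(0.0, within_sd))
    dihydrocodeine.append(usual_pain + true_difference + rng.gauss(0.0, within_sd))

# Paired analysis: the participant's usual pain cancels in each difference.
differences = [d - nb for d, nb in zip(dihydrocodeine, nabilone)]
paired_se = statistics.stdev(differences) / math.sqrt(n)

# Same scores treated as two independent groups: variation between
# participants remains in the standard error.
unpaired_se = math.sqrt(statistics.variance(nabilone) / n
                        + statistics.variance(dihydrocodeine) / n)

print(f"paired standard error:   {paired_se:.2f}")
print(f"unpaired standard error: {unpaired_se:.2f}")
```

Under these assumptions the paired standard error is the smaller of the two, because the variation between participants does not contribute to the differences.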
Competing interests: None declared.