CONSORT 2010 statement: extension checklist for reporting within person randomised trials
BMJ 2017; 357 doi: https://doi.org/10.1136/bmj.j2835 (Published 30 June 2017) Cite this as: BMJ 2017;357:j2835
- Nikolaos Pandis, senior lecturer1,
- Bryan Chung, clinical instructor2,
- Roberta W Scherer, senior scientist3,
- Diana Elbourne, professor of healthcare evaluation4,
- Douglas G Altman, professor of statistics in medicine5
- 1University of Bern, Medical Faculty, School of Dental Medicine, Department of Orthodontics and Dentofacial Orthopedics, Bern, Switzerland
- 2Division of Plastic Surgery, University of British Columbia, Victoria, BC, Canada
- 3Johns Hopkins Bloomberg School of Public Health, Epidemiology Mailroom E6138 Baltimore, MD, USA
- 4London School of Hygiene and Tropical Medicine, Department of Medical Statistics, London, UK
- 5Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK OX3 7LD
- Correspondence to: N Pandis
- Accepted 13 May 2017
Many journals now require that reports of randomised controlled trials (RCTs) conform to the recommendations in the Consolidated Standards of Reporting Trials (CONSORT) statement.1 The CONSORT statement includes a checklist of items that should be included in the trial report. The most recent version of the checklist was published in 2010.1 These items are based on evidence whenever possible. The statement also recommends including a flow diagram to show the flow of participants from before enrolment to final analysis. Explanation and elaboration of the rationale for checklist items is provided elsewhere.2
The primary focus of the CONSORT statement is the most common type of RCT, with two treatment groups using an individually randomised parallel group design.2 Almost all elements of the CONSORT statement apply equally to RCTs with other designs, but some elements need adaptation, and in some cases additional matters need to be discussed. Members of the CONSORT group have published several extension papers3 4 5 6 7 8 9 that augment the CONSORT statement. Extensions of CONSORT 2010 to different trial designs have been published for cluster randomised trials,10 non-inferiority and equivalence trials,11 and N-of-1 trials.12 As part of that series, in this paper we extend the CONSORT 2010 recommendations to RCTs in which participants receive two or more treatments applied to different body sites.
In some RCTs the unit of randomisation is not the individual person but an organ, such as an eye, or other body site, such as a venous ulcer.13 These RCTs do not have a generally accepted name, although some specialties have specific terms; for example, a “split mouth” design is used in oral health, “contralateral” study in ophthalmology, and “split face” or “split body” in dermatology. To encompass all possible medical specialties, we call these trials “within person” randomised trials. They are not to be confused with trials in which randomisation and treatment are at the participant level, with multiple organs or body sites contributing to the outcome assessment. These are a type of cluster randomised trial that is discussed elsewhere.10 Within person trial designs have some similarities with N-of-1 and crossover trials. Within person trials differ from crossover trials, however, because the interventions are delivered at the body site level rather than the patient level. This extension will not cover N-of-1 and crossover trial designs; a specific CONSORT extension for N-of-1 trials has already been published,12 and an extension is under development for crossover trials.
Scope of this paper
Within person randomised trials present some particular challenges. One problem is the potential for a “carry across effect,” whereby, for example, an intervention applied to one eye or in an area of the mouth can affect the other eye, systemically,14 or other areas of the mouth, locally.15 16 Success or failure of the first replacement hip in a patient requiring bilateral hip replacement can affect the success or failure of the second hip operation.17 A related problem is the possibility of participants dropping out of the trial if the two interventions are not applied concurrently.
In the simplest within person randomised trials two interventions (one of which may be a control or usual care) are applied to each participant at two separate body sites, either concurrently or sequentially. More complicated designs include trials with more than two interventions, more than two sites within the same participants, and a mixture of patients with bilateral and unilateral disease.
Here, we summarise the key methodological features of within person randomised trials. We consider the empirical evidence about how common such trials are and summarise published studies of the quality of reporting of such trials. Following these literature reviews, we make suggestions for additions and amendments to the CONSORT checklist adapted for within person RCTs and give examples of good reporting. This guideline will focus on the simplest form of the within person randomised trial where all participants receive two interventions, with each intervention applied to one of the two randomised sites. Most of the recommendations also apply to the more complicated designs, and we discuss some specific issues later in this paper.
Methodological features of within person randomised trials
In a within person trial treatments are randomly assigned to two organs, body parts, or body sites, such as arms, eyes, or breasts, or to two sites of a single organ, body part, or body site, such as teeth or sides of the mouth, warts, burns, or bedsores. Key design questions for within person trials are shown in box 1.
Box 1: Key design questions for within person trials
Is the within person design appropriate (ie, carry across effects are unlikely)?
Will the treatments be administered concurrently or sequentially?
Are the sites for each participant similar in terms of baseline characteristics such as location, anatomy (eg, tooth type), and severity of disease?
If treatments are given sequentially will baseline information be recorded at the time of randomisation or at the time of treatment administration?
How will the order of treatments and allocation to body sites be determined (eg, right versus left)?
Will there be any provision to monitor that the assigned treatment was actually applied to the correct site?
Will the outcome evaluator be blind (masked) to the treatment assignment of each site, and if so how?
A crucial question is whether the within person design is suitable for the circumstances. It is appropriate for conditions that occur in at least two body sites within the same person, if the stage of the condition or disease is similar in the sites to be randomised, and for treatments that can be tested locally without influencing the outcome on the matching site—that is, without carry across effect. When the interventions are not applied simultaneously, the participant’s condition should have underlying stability. Surgical wound closure and tendon repair, for example, are non-stable conditions that require a concurrent design.
The carry across effect has been of concern in trials using within person designs in several specialties.14 15 17 It can lead to bias and tends to dilute the treatment effect. It is similar to the temporal carry over effect in crossover trials, in which lingering effects of the first intervention may require adjustment for different baselines before the second intervention or the use of wash-out periods (which are more difficult to handle in a within person trial). A within person design is unlikely to be appropriate if there is an expectation of a substantial carry across effect.
Sequential and concurrent treatment
In a within person trial, the interventions can be applied sequentially or concurrently. The sequential approach is common in trials in which it is either undesirable or infeasible to administer the interventions at the same time. Examples are bilateral hand surgeries that render the patient unable to perform basic activities of daily living and bilateral eye interventions that would render the patient without functional vision for an unacceptable period of time.
Concurrent treatments can be applied when they do not substantially affect participants’ lives (for example, for skin conditions) or where the natural history of the disease might change too drastically in the time between interventions if applied sequentially. With concurrent treatment, loss to follow-up will automatically be matched across treatment arms, but harms (unintended effects) may be difficult to attribute to a specific treatment. Another concern in concurrent treatment trials is the potential for confusion as to which site receives which treatment, particularly when there is a long treatment period. Traditional methods for monitoring compliance might be insufficient in within person trials when the participant is responsible for administering the treatment.
As with crossover and cluster trials, an efficient sample size calculation requires an estimate of a correlation coefficient. For within person trials the expected within person correlation of outcomes with the two treatment options must be incorporated into the sample size estimation. In practice, for many trials it is unlikely that there will be data to support a realistic estimate of this value, yet ignoring it is likely to result in an overestimation of the sample size. Some attempt to estimate a correlation coefficient is desirable.
Key questions relating to sample size thus include whether the sample size calculation should take into account the expected within person correlation of outcomes; if so, how this correlation coefficient will be estimated; and how sensitive the sample size calculation is to deviations from the postulated correlation coefficient.
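The sensitivity question can be explored with a short calculation. The following is a minimal sketch, not a method prescribed by this guideline: it assumes a continuous outcome, the same standard deviation at both sites, and a normal approximation (which slightly understates the number needed when samples are small). The function name `pairs_needed` and the input values (SD 15, target difference 10) are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def pairs_needed(sd, delta, r, alpha=0.05, power=0.80):
    """Participants (pairs of sites) needed for a paired comparison of means.

    Normal approximation. Assumes the same SD at both sites and a
    within person correlation r between the two site outcomes.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    var_diff = 2 * sd**2 * (1 - r)  # variance of the within person difference
    return ceil(z**2 * var_diff / delta**2)


# Illustrative inputs only (assumed, not taken from a specific trial).
sensitivity = {
    r: pairs_needed(sd=15, delta=10, r=r) for r in (0.0, 0.2, 0.4, 0.6, 0.8)
}
for r, n in sensitivity.items():
    print(f"postulated r = {r:.1f}: {n} participants")
```

Under these assumptions the required number falls from 36 participants at r=0 to 8 at r=0.8, so a trial planned with an optimistic correlation can be badly underpowered if the true correlation is lower.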
Appropriate statistical methods that consider the correlation between sites should be used. These methods can be quite simple, such as a paired t test. Other considerations include losses to follow-up and handling of missing data, which can include both sites in each participant or just a single site.
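To illustrate why the pairing matters in the analysis, the sketch below computes a paired t statistic and, for contrast, an unpaired (Welch) statistic on the same numbers. The data are hypothetical values invented for illustration, and the function names are assumptions; a statistics package would normally be used to obtain P values and confidence intervals.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(site_a, site_b):
    """Paired t statistic and degrees of freedom for two sites per participant."""
    diffs = [a - b for a, b in zip(site_a, site_b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def unpaired_t(x, y):
    """Welch t statistic, which wrongly treats the two sites as independent groups."""
    se = sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(x) - mean(y)) / se


# Hypothetical outcomes: one value per site for each of six participants.
treated_site = [12, 15, 9, 20, 17, 11]
control_site = [10, 14, 8, 17, 15, 10]

t_paired, df = paired_t(treated_site, control_site)
t_unpaired = unpaired_t(treated_site, control_site)
print(f"paired t = {t_paired:.2f} on {df} df; unpaired t = {t_unpaired:.2f}")
```

In this made-up example the paired statistic is about 5.0 and the unpaired statistic about 0.76: ignoring the within person correlation discards most of the information in the design.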
In concurrent trials, when harms affect participants in a way that is not specific to a site, such as headache or nausea, attributing the symptom to a specific intervention can be difficult or impossible.
How common are within person randomised trials?
Because no common terminology exists for within person randomised trials, they are difficult to identify using traditional electronic search methods.
A PubMed sample of 1360 randomised controlled trials published in 2012 found that 24 (1.8%) were labelled as “split body” or used a within person design (D Altman, personal communication). Two earlier samples yielded prevalences of 1.7% (9/519) of trials published in 2000 and 2.6% (16/616) of trials published in 2006.18 19 Overall, about 2% of published RCTs seem to use a within person design. Within person randomised trials are more common in some specialties (ophthalmology, dentistry, and dermatology) than in others (rheumatology), and are apparently not done at all in others (cardiovascular medicine, hepatology).
A recent study identified 43 split mouth designs in a sample of 413 RCTs (10%) published in eight oral health journals with high impact factors from 1992 to 2012.20 Another study found that 67 of 276 (24%) RCTs published between 1989 and 2011 in implant dentistry journals used the split mouth design.21
Lee et al found that 13% (9/69) of a sample of ophthalmology RCTs had a within person design in which the two eyes of an individual were randomly assigned different treatments.22
What is the quality of reporting of within person trials?
Although articles on the quality of reporting of RCTs in relation to CONSORT are relatively common, only two investigators have specifically examined the quality of reporting of within person trials. Lesaffre et al examined the reports of 34 split mouth studies published in 2004.23 Just over half of the trials reported an appropriate statistical method for a within person design, and only 15% included comments on the potential correlation and treatment carry across effect that could occur with this study design. To assess quality of reporting, the authors adapted the checklist for the cluster RCTs extension to the CONSORT guidelines.10 Overall reporting was poor, with only 41% of split mouth trials reporting the method of random sequence generation and 26% reporting an allocation concealment mechanism.
Scherer et al in 2012 found that only 42% of 60 within person ophthalmology trials reported a rationale for using that design.24 Only 18% reported an adequate method of allocation concealment, and 52% reported that the person measuring the outcome was masked. Other studies indicated that most within person trials do not take into account the within person correlation in sample size calculations22 23 25 26 or in the statistical analysis.17 22 23 26 27 28
Methods used to develop this CONSORT extension
This CONSORT extension was first discussed by Doug Altman, Diana Elbourne, Bobbi Scherer, and Barbara Hawkins in 2003, when the main focus was trials in ophthalmology. Subsequently Bryan Chung expressed an interest from the perspective of hand surgery. The work did not progress until 2013, when Nikolaos Pandis raised the matter from the dental perspective, and a “virtual” group comprising the authors of this paper was convened in 2013. This group met many times over the intervening years, mainly by teleconference, with occasional face-to-face meetings of two or more authors.
CONSORT checklist for within person RCTs
Initial work on this extension to the CONSORT checklist preceded the 2010 update of the CONSORT statement but was mainly conducted between 2013 and 2016. The checklist and explanatory text were informed by reviews of published randomised trials (as cited) and completed through teleconferences over several years. In the absence of any specific funding we were unable to follow all of the recommended procedures of the EQUATOR group,29 such as a face-to-face consensus meeting.
Fig 1 shows the standard CONSORT checklist and our suggested modifications for within person randomised trials. In this section we discuss each of these checklist items, explain the background, and provide one or more examples of good reporting. We also discuss several checklist items for which we do not suggest any modification but for which implementation requires specific considerations for within person RCTs. For some items there are different considerations for concurrent and sequentially delivered interventions.
Title and abstract
Item 1a: Title
Standard CONSORT item—Identification as a randomised trial in the title.
Extension for within person trials—Identification as a within person randomised trial in the title.
Example 1—“A comparison of anterior and posterior chamber lenses after cataract extraction in rural Africa: a within patient randomised trial.”30
Example 2—“Effects of intra-alveolar placement of 0.2% chlorhexidine bioadhesive gel on dry socket incidence and postsurgical pain: a double blind split mouth randomised controlled clinical trial.”31
Example 3—“Randomised, double blind, split face study of small-gel-particle hyaluronic acid with and without lidocaine during correction of nasolabial folds.”32
Example 4—“Randomised, double blind, contralateral eye comparison of myopic LASIK with optimized aspheric or prolate ablations.”33
Explanation—Identification of the trial as a within person randomised trial ensures that readers will start thinking of the implications of the design in relation to sample size and analysis. We recognise that different terms to describe those designs are used depending on the specialty. Terms such as split mouth, split face, split body and contralateral convey the same within person design in different specialties and are suitable alternatives. In addition, it is desirable, even though not permitted due to length by all journals, to include in the title information on participants, interventions, comparators, and outcomes.
A review of split mouth trials published in 2004 showed that only two of 33 identified the trial as a split mouth in the title.23 A more recent review of published RCTs (1992-2012) in the eight oral health specialty journals with the highest impact factors found that only seven of 43 (16%) trials with a split mouth design identified the trial as split mouth in the title.20
Item 1b: Abstract
Standard CONSORT item—Structured summary of trial design, methods, results, and conclusions (for specific guidance see CONSORT for abstracts3).
Extension for within person trials—Specify a within person design and report all information outlined in table 1.
Example—See fig 2.
Explanation—Clear, transparent, and sufficiently detailed abstracts are important. Some readers might have access only to the abstract, and many others will skim it before deciding whether to read further. A well written abstract also helps retrieval of relevant reports from electronic databases. In 2008 a CONSORT extension on reporting abstracts was published,3 and those recommendations were incorporated into CONSORT 2010.
Abstracts for within person RCTs should indicate the paired or within person nature of the trial. Table 1 shows the minimum information that should be included in the abstract of a within person trial, in addition to the items recommended for all trials.
We were not able to find examples of good reporting that tackled all the items required. We therefore developed an example abstract by enhancing a published abstract (fig 2).
Item 3a: Trial design
Standard CONSORT item—Description of trial design (such as parallel, factorial) including allocation ratio.
Extension for within person trials—Rationale for using a within person design and identification of body sites.
Example 1—“In this study, we present the results of a contralateral eye study in which patients were randomised to undergo implantation with either the Tecnis ZM900 silicone multifocal intraocular lenses (MIOL) or the Tecnis ZMA00 acrylic multifocal IOL [intraocular lens]. Using a contralateral study model, we are able to reduce many of the variables that can occur between patient groups.”34
Example 2—“The GDPs recruited children who had caries affecting pairs of primary molar teeth, which were matched for tooth type, arch and extent of caries.”35
Explanation—The within person design avoids possible imbalance between interventions on participant level variables. The within person design is efficient, because a smaller sample size is required than for a standard design, and losses to follow-up are usually equal between treatment groups. However, carry across effects might reduce efficiency and bias the trial results—such a design should not be implemented if a carry across effect is expected. All alternatives must be considered, and if a within person trial design is used it must be made clear why it was judged to be the most appropriate and robust design. The treatment of the body sites can be concurrent or sequential (see item 5).
In within person designs baseline characteristics are balanced at the participant level, but imbalances can occur for site specific variables, notably severity of disease. The identification and selection process of the included sites should be described, as shown in example 2, if applicable.
Item 4a: Eligibility criteria for participants
Standard CONSORT item—Eligibility criteria for participants.
Extension for within person trials—Eligibility criteria for body sites.
Example—“The inclusion criterion was uncomplicated age related bilateral cataract with the potential to see 20/40 or better in each eye. Exclusion criteria were any concurrent medication apart from ocular lubricants, any coexisting ocular pathology, unilateral amblyopia, previous intraocular surgery or laser treatment, retinal complications, pupil dilatation <7 mm, any surgical complications or inability to co-operate or maintain follow-up.”36
Explanation—In within person trials two sets of eligibility criteria are needed: the eligibility of the individual participant and the eligibility of the body site (such as limb or eye). For participants to be eligible in a within person design, they must be able to provide at least two body sites to be treated, one to receive each intervention. Eligibility criteria for the body sites should include criteria related to the comparability between sites within a person.
Item 5: Interventions for each group
Standard CONSORT item—The interventions for each group with sufficient details to allow replication, including how and when they were actually administered.
Extension for within person trials—Whether interventions were given sequentially or concurrently.
Example 1: Concurrent application of interventions—“Patients were given simultaneous injections of buffered and unbuffered 2% lidocaine with epinephrine 1:100 000. The needles were inserted simultaneously and the anesthesia was injected for a 20 second count for a total volume of 1.0 ml per injected side.”37
Example 2: Sequential application of interventions—“An investigator with no clinical involvement in the trial used the list to prepare directions assigning one of the intraocular lenses (IOLs) (iMics1 NY-60 IOL or AcrySof SN60WF IOL) for placement into the patient’s right eye, the first eye to be operated. The directions for each operation were placed in sequentially numbered and sealed envelopes. The surgeon opened the envelopes in sequence on the day of surgery after hydrodissection and phacoemulsification and implanted the randomly assigned IOL specified into the patient’s first eye. The second eye was implanted with the other IOL one week later.” 38
Explanation—In addition to the standard CONSORT explanation of detailed reporting of interventions for the purposes of reproducibility, it is important to describe whether the intervention was applied to different body sites concurrently or sequentially. There are several reasons for this. Firstly, the intervention on one site may dilute the effect on the contralateral site because of a potential carry across effect (although ideally trials with likely carry across effects would not have used a within person design). Secondly, in sequential designs the time between interventions might be long, so the baseline state of the untreated side might change in the period between interventions. Thirdly, loss to follow-up is not necessarily balanced in people who did not receive both interventions concurrently. Measures taken to avoid a potential carry across effect should be described along with the reasons for taking these measures.
We recommend that trial authors consult the template for intervention description and replication (TIDieR) checklist39 for a list of intervention details that authors should include in their reports. Authors might also find helpful the CONSORT extensions for non-pharmacological interventions,7 for herbal interventions,5 and for acupuncture,4 if applicable.
Item 6a: Outcomes
Standard CONSORT item—Completely defined prespecified primary and secondary outcome measures, including how and when they were assessed.
Extension for within person trials—Outcomes should be clearly defined as per site or per person.
Example 1: Efficacy end points—“The primary efficacy end point was complete mild actinic keratosis (AK) lesion response rate per side at week 12. Additionally, lesions that had a complete response after one session (at week 12) were followed up until week 24 to observe their maintained response rate. Complete lesion response rate for all lesions (mild and moderate lesions) at week 12 was a secondary efficacy end point.”40
Example 2: Safety end points—“The primary safety end point was the subject’s assessment of maximal pain reported just after the treatment session at the baseline visit. Secondary safety end points included the investigator’s local tolerance preference (one week after baseline session) and incidence of adverse events (AEs) throughout the study. Also, the investigator performed a clinical assessment of each lesion achieving complete response regarding the following signs and symptoms: scarring, atrophy, induration, redness, or change in pigmentation.” 41
Example 3: Patient preference outcome—“The order of needle sticks was randomized according to side of the hand (volar vs dorsal) and order of long fingers (right vs left). All needlesticks were performed with a standard technique by only two investigators. Participants were instructed to look away during the needlesticks. Following both needlesticks they had to rank the discomfort associated with each needlestick on a scale of 0 (no pain) to 10 (worst pain imaginable). The participant was then asked to rank ‘which hand they would prefer to receive an injection in if it was required in the future.’” 41
Example 4: Patient preference outcome—“Preference of the eyelid warming techniques (eye mask, eye bag, or no preference) was also recorded after treatment.”42
Explanation—Complete definition of outcomes should include the timing and method of the measurement. In trials with concurrent interventions, the timing of outcome measurement will not differ much from conventional parallel group trials. Authors should explain how outcomes for each site were measured independently.
When interventions are sequentially administered, however, site specific outcome measurements can be made at the same time after each intervention or simultaneously after the second intervention. Simultaneous case measurements might be affected by lag time bias, as the first treated site has a longer recovery time than the second site, thereby possibly seeming superior (or inferior) as a result of the time difference. In a randomised trial this effect is expected to be balanced out across participants unless there is an interaction between treatment and time, where, for example, the first site always does better, regardless of treatment. If such an interaction is expected, however, a sequential design should be avoided.
In the sequential design pretreatment baseline measurements might differ between the time before either site is treated and the time when the second site is treated, and this may be problematic, particularly in diseases that might progress or evolve (eg tumour size, arthritis). The preferred option is to report baseline values at the time of randomisation; a second option is to report baseline values at the time of treatment allocation. It should be clear whether the sites were similar in terms of baseline characteristics such as location, anatomy (eg tooth type), and severity of disease, and at what time (at randomisation or at treatment allocation) the values chosen to represent baseline were recorded.
Investigators should report outcome measurement timing per site, as well as participant follow-up schedules and should clarify which value was used in the analysis (eg a six month participant follow-up or a six month body site follow up).
For any outcomes reported per person, authors should explain how their measurement is affected by each participant being exposed to two interventions despite the single measurement value. Per person outcomes are less relevant for between treatment comparisons but should be reported as they contribute to evidence. Participant level outcomes can include those related to participant’s preferences, harms, and quality of life.
Item 7a: Sample size
Standard CONSORT item—How sample size was determined.
Extension for within person trials—Report the correlation between body sites.
Example 1—“A sample size calculation was performed based on the assumptions that the main outcome measurement (changes in sum score between baseline and end of treatment on visual analogue scale) is continuous in nature, fairly normally distributed, and that an additional improvement in the intervention side of 10 percentage points (standard deviation=15 percentage points) is considered clinically relevant. If the incidence of the carpal tunnel syndrome on one wrist could be considered completely independent from the incidence on the other wrist, 36 independent observations in each group would be necessary to detect that difference at the 5% level (α=0.05) with an 80% chance (β=0.2).”43
Example 2—“To estimate sample size for the primary outcome—pain felt during insertion of the needle and injection of the anaesthetic according to the VAS [visual analogue scale]—we took into account the correlation induced by the paired nature of the data. In a previous trial, the corresponding SD [standard deviation] in the VAS score could be estimated at 1.2. Assuming that the SD is equal in the two randomisation groups and that the correlation between the pain scores for the same patient in the first and second treatment is 0.6, the difference in VAS scores would have a SD of 1.10. With a type I error risk of 0.05, we would need 30 patients to guarantee 80% power to detect a minimum true difference of 0.6 points in mean pain experienced during conventional infiltration and intraosseous anaesthesia.”44
Explanation—A key advantage of the within person design is the smaller sample size required than for a design in which the randomisation unit is the participant. This is because each participant acts as their own control, so the interindividual variability is reduced, resulting in increased study power and a decrease in the number of participants required compared with a study in which participants receive only one intervention.
For a continuous outcome, the reduction in sample size of using a within person design compared with a parallel group design increases as the within person correlation increases. As the coefficient of correlation (r) gets closer to one, the required sample size (N) can be dramatically reduced, as indicated by the following formula45: $N_{\text{paired}} = (1 - r)\,N_{\text{parallel}}/2$. So for r=0.8, $N_{\text{paired}}/N_{\text{parallel}}$ is 0.2/2=0.1 (10%). Reported correlation coefficients in ophthalmology,46 dermatology,47 and orthodontics48 were 0.80, 0.80, and 0.50, respectively. Balk et al49 calculated correlation coefficients for 811 within group correlation values from 123 studies with 281 study groups. The median within group correlation value across all studies was 0.59 (interquartile range 0.40-0.81). No heterogeneity of correlation values across outcome types and clinical domains was observed.49 It is important that trial authors report the usual quantities required for sample size calculation, including expected means (and standard deviations) for each treatment group, significance level, and power, but also the assumed correlation coefficient as shown in example 2 and the source of the correlation coefficient used. In example 1, the sample size calculation was performed without accounting for the potential correlation between the paired treatment outcomes. This approach will result in a larger sample size than is needed if the outcomes at the two sites are positively correlated. The correlation coefficient is often not reported in published within person trials.27
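The relation between the paired and parallel sample sizes can be tabulated for the correlation coefficients quoted above. This is a minimal sketch of the arithmetic only; the function name is an assumption, and the parallel group total of 72 is an illustrative figure (two groups of 36, the per group number in example 1) rather than a recommendation.

```python
from math import ceil


def paired_to_parallel_ratio(r):
    """N_paired / N_parallel for a continuous outcome: (1 - r) / 2."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation must lie between -1 and 1")
    return (1 - r) / 2


n_parallel = 72  # illustrative total for a two group parallel trial (2 x 36)

reported = [
    ("ophthalmology", 0.80),
    ("dermatology", 0.80),
    ("orthodontics", 0.50),
    ("median across studies (Balk et al)", 0.59),
]
for label, r in reported:
    ratio = paired_to_parallel_ratio(r)
    # round() guards against floating point noise before rounding up
    n_paired = ceil(round(ratio * n_parallel, 9))
    print(f"{label}: r = {r:.2f}, ratio = {ratio:.3f}, participants = {n_paired}")
```

Even at r=0 the ratio is 0.5, because each participant contributes two sites; any positive correlation reduces the number of participants further.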
With a binary outcome, not considering the paired nature of the data will result in a sample size that is the same as for a non-paired design and is thus conservative. Accounting for the paired design during sample size calculation is complicated. Authors are encouraged to report whether they have taken any steps to account for the paired design during sample size calculation and to give enough detail for the calculation to be replicated.50
Any allowance in the sample size calculation for losses to follow-up of individuals or sites (or both) should also be reported.
Item 8b: Sequence generation
Standard CONSORT item—Type of randomisation; details of any restriction (such as blocking and block size).
Extension for within person trials—Methods used to determine the allocation sequence of body sites and treatments within an individual (eg how first site to be treated was decided).
Example—“The eye to be operated upon first was selected by a computer generated table of random numbers by one of the authors (VV). The second eye underwent cataract surgery after a gap of at least 2 weeks following surgery in the first eye . . . Patients were randomized to either receive enoxaparin in the intraocular infusion fluid (Group I) or not receive enoxaparin (Group II). The randomization code was allocated inside the operating room just before the surgery on the first eye. The second eye received alternate treatment.”51
Explanation—In within person RCTs interventions can be administered concurrently or sequentially. Randomisation is used to determine which intervention is applied to which body site and, in trials with sequential interventions, also to determine which site is treated first. Thus both how the site to be treated first was determined and which treatment was administered should be reported.
In the concurrent approach the two treatments are delivered at the same time, whereas in the sequential design there is a “non-trivial” time lag between the two interventions. In both designs which site will receive what treatment must be determined. A sensible approach would be to use one random allocation to determine which site is to be treated first and a second random allocation for which treatment will be administered first. Another approach would be to randomise in a single step to both body site and treatment. Under this scenario the randomisation list would require the allocation to all possible combinations of site and treatment, as in a four arm trial: site one-treatment one, site one-treatment two, site two-treatment one, and site two-treatment two. The method of minimisation,52 where future allocations are based on previous allocations, with site and treatment as the factors, would also be suitable.
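The two approaches described above can be sketched as follows. This is a minimal illustration under stated assumptions: the site and treatment labels, the function names, and the fixed seed are all hypothetical, and a real trial would normally use a pre-generated, concealed allocation list (possibly with blocking) rather than drawing allocations on the spot.

```python
import random

SITES = ("site one", "site two")
TREATMENTS = ("treatment one", "treatment two")


def _other(options, chosen):
    """Return the element of a two-item tuple that was not chosen."""
    return options[1] if chosen == options[0] else options[0]


def two_step_allocation(rng):
    """First draw picks the site treated first, second draw picks its treatment."""
    first_site = rng.choice(SITES)
    first_treatment = rng.choice(TREATMENTS)
    return [(first_site, first_treatment),
            (_other(SITES, first_site), _other(TREATMENTS, first_treatment))]


def single_step_allocation(rng):
    """One draw from the four site-treatment combinations, as in a four arm trial."""
    combinations = [(s, t) for s in SITES for t in TREATMENTS]
    first_site, first_treatment = rng.choice(combinations)
    return [(first_site, first_treatment),
            (_other(SITES, first_site), _other(TREATMENTS, first_treatment))]


rng = random.Random(2017)  # fixed seed only so the sketch is reproducible
for participant in range(1, 4):
    print(participant, two_step_allocation(rng))
```

Both functions return the same structure: the first tuple is the site treated first with its treatment, and the second tuple is the remaining site with the remaining treatment.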
Item 10: Implementation of randomisation
Standard CONSORT item—Who generated the random allocation sequence, who enrolled participants, and who assigned participants to interventions?
Extension for within person trials—replaced by item 10a.
Item 10a: Extension for within person trials: Who generated the random allocation sequence, who enrolled participants, and who assigned body sites to interventions.
Example—“The clinical research coordinator for this trial generated a randomization code with equal numbers (1:1 ratio) using computer software and assigned each patient to one of the two groups according to the computer generated randomization code. The group to which the patients were assigned was directly communicated by the coordinator to a member of the operating room staff who prepared the intraocular lens (IOL). The surgeon was informed about the type of surgery just before surgery . . . To ensure allocation concealment, the coordinator kept the assignment schedule until all data were collected.”53
Explanation—Reporting of how the random sequence was implemented—specifically, who generated the allocation sequence, who enrolled participants, and who assigned participants to trial groups—is recommended. In the given example only two eyes were available per participant. It is, however, important to explain how sites were selected when many were available.
Item 12a: Statistical methods
Standard CONSORT item—Statistical methods used to compare groups for primary and secondary outcomes.
Extension for within person trials—Statistical methods appropriate for within person design.
Example 1—“Statistical analyses included the paired t test and McNemar test. Onset was defined as the time to improve by at least 1 scale point. A paired Wilcoxon signed rank test was used to compare the differences in onset of action for abobotulinumtoxin A and onabotulinumtoxin A.”54
Example 2—“A Wilcoxon sign-rank test was used to compare treatment and control sides of the nasal cavity for both pain and discomfort.”55
Explanation—In line with recommendations made by the International Committee of Medical Journal Editors and the CONSORT group, analytical methods should be described “with enough detail to enable a knowledgeable reader with access to the original data to verify the reported results.”56 Identification of the within person design and the statistical methods used allows readers to evaluate the methods of analysis. In examples 1 and 2 a McNemar’s test (proportions) and parametric and non-parametric tests for matched/within person designs were applied, which are appropriate.57
When treatments are received sequentially, problems can arise from carry across effects and a baseline adjustment may be required. For example, in split mouth trials baseline values and failure of dental implants loaded at different time points might be influenced by the time interval between the two interventions and the status of the early loaded implant. For example, if the early loaded implant results in a poor outcome or the time between operations is long, or both, the patient might rely excessively on the other side of the mouth, which might be related to the late loaded implant. This additional burden on the second implant can have a negative effect on that implant as well. Conversely, if the outcome in the first implant is good and the burden on the second implant is small, a satisfactory outcome in that implant can be more likely. For the sequential design, baseline values used for the adjustment should be preferably those collected at the time of randomisation and not at the time of treatment.
Item 13a: Participant flow (a flow diagram is strongly recommended)
Standard CONSORT item—For each group, the numbers of participants who were randomly assigned, received intended treatment, and were analysed for the primary outcome.
Extension for within person trials—Number of participants and number of body sites at each stage (fig 3⇓).
Example—see fig 2⇑.
Explanation—The flow diagram is a key element of the CONSORT Statement and has been widely adopted. For within person trials it is important to understand the flow of both participants and body sites. Although we recommend a flow diagram for communicating the flow of participants and body sites throughout the study, the exact form and content can vary in relation to the specific features of a trial.
Item 13b: Losses and exclusions
Standard CONSORT item—For each group, losses and exclusions after randomisation, together with reasons.
Extension for within person trials—Number of participants and number of body sites lost or excluded at each stage, with reasons.
Example—“The 93 subjects enrolled in the study were considered the Safety Dataset and underwent adverse event analysis. Twelve of these 93 subjects did not complete the primary endpoint, non-weight bearing passive flexion, 12 months after surgery. One subject withdrew consent 10 months after the bilateral surgeries; the withdrawal was unrelated to either implant. There were two protocol violations in which the STD [standard] components (femurs with lugs) were not listed in the study protocol, but the appropriate HF [high flexion] devices were implanted on the contra-lateral side. One of the two subjects with a protocol violation had their eligible HF knee complete the primary endpoint; therefore, that knee was used in unpaired analyses. There were two revision TKA [Total Knee Arthroplasty] procedures, one HF device and one STD device. The HF device was revised six months after the index surgery and the STD at seven months; both revisions were secondary to deep infection, and were performed at different centers. The remaining seven subjects were lost to follow-up, leaving 81 bilateral subjects available for the primary efficacy dataset and respective analyses and provides a 92.5% (86 of 93) subject follow-up compliance rate.” 59
Explanation—When interventions are delivered sequentially, a participant who drops out part way through the trial may have only one body site assessed for outcome. With concurrent interventions, when a participant drops out of the trial all included body sites also drop out. But it is also possible for a single randomised body site to drop out while the other body site from the same person remains within the trial. For example, in a concurrent split mouth design, even though both interventions are applied to both sites, one site may later drop out due to an unexpected event such as abscess or an extraction that does not allow outcome recording on that site. The event may or may not be related to the intervention.
Authors should indicate the loss of body sites for each intervention, preferably in the flow diagram.
Item 15: Baseline data
Standard CONSORT item—A table showing baseline demographic and clinical characteristics for each group.
Extension for within person trials—Baseline characteristics for body sites and individual participants as applicable.
Example—See table 2⇓.
Explanation—Random assignment by individual person ensures that any differences in group characteristics at baseline are the result of chance rather than some systematic bias.61 For within person randomised trials, the risk of chance imbalance is lower as all participants receive both interventions, so the baseline characteristics are identical between groups. But treatment sites can have different characteristics at baseline. Although important differences can be controlled for in the analysis, reporting of baseline values for both the person and the site enables the reader to judge whether any observed differences owing to chance might have clinical relevance.
Item 16: Numbers analysed
Standard CONSORT item—For each group, number of participants (denominator) included in each analysis and whether the analysis was by original assigned groups.
Extension for within person trials—Number of randomised body sites in each group included in each analysis.
Example—“Forty five patients with bilateral carpal syndrome (90 wrists) fulfilled all inclusion criteria; 11 (24%) of these patients discontinued treatment after randomisation (eight patients early after randomisation because of non-compliance in keeping appointments, and three patients because of excessive pain requiring additional therapeutic measures). Thus 34 patients—that is, 34 actively treated and 34 sham treated wrists—completed the study . . . Thirty of them (67% of the initial 45 patients) completed a follow-up at six months.” 43
Explanation—The number of participants and sites that contribute to the analysis of a trial is essential to interpreting the results. But the analysis of each outcome might not include all participants or all participant sites. If participants do not contribute to the analysis in a within person trial, the corresponding sites might be lost. One site, however, can contribute to the data if the other site is lost. Because the sample size, and hence the power of the study, is calculated on the assumption that all sites and participants will provide information, the number of participants and sites contributing to a particular analysis should be reported so that any potential drop in statistical power can be assessed. In addition, and as explained in detail in the CONSORT 2010 guideline,2 it should be specified whether the analysis was per protocol or intention-to-treat, with specific details on how the selected analysis approach was implemented. In the included example it is not explicitly stated how many wrists were analysed, but it is implied that possibly 30 of 45 patients were analysed at six months.
Item 17a: Outcomes and estimation
Standard CONSORT item—For each primary and secondary outcome, results for each group, and estimated effect size and its precision (such as 95% confidence interval).
Extension for within person trials—Observed correlation between body sites for continuous outcomes and/or matched pair tabulation for binary outcomes.
Example 1—See table 3⇓.
Example 2—See table 4⇓.
Example 3—“Thirty-five (65%) of 54 patients reported that the buffered lidocaine was less painful than the unbuffered lidocaine on initial injection. Seven patients (13%) distinguished no difference, and 12 patients (22%) felt less pain with the unbuffered lidocaine.”37
Explanation—The standard CONSORT guideline should be followed when reporting the results of within person randomised trials: point estimates with confidence intervals should be reported for primary and secondary outcomes. Given the effect of the within person correlation on the power of the study, the correlation coefficient for each primary outcome being analysed should also be provided. However, if the mean difference and standard deviation of the differences between treatment groups are reported, then the sample size of a future trial can be calculated without the need for the correlation coefficient.
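The link between the correlation coefficient and the standard deviation of the differences follows from the standard identity var(A−B) = var(A) + var(B) − 2r·sd(A)·sd(B). The sketch below shows why reporting either quantity is enough for planning a future trial. The function names and the numbers in the usage line are hypothetical.

```python
import math


def sd_of_differences(sd_a, sd_b, r):
    """SD of within person differences from the two SDs and their correlation."""
    variance = sd_a ** 2 + sd_b ** 2 - 2.0 * r * sd_a * sd_b
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding error


def correlation_from_sds(sd_a, sd_b, sd_diff):
    """Back-calculate r when the SD of the differences is reported instead."""
    return (sd_a ** 2 + sd_b ** 2 - sd_diff ** 2) / (2.0 * sd_a * sd_b)


# Hypothetical values: both treatments have SD 10 and the correlation is 0.8
print(sd_of_differences(10.0, 10.0, 0.8))
```

When the two standard deviations are equal, the variance of the differences is 2σ²(1−r), which is where the factor (1−r) in the sample size formula for paired designs comes from.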
For binary outcomes, a presentation using the matched tabulation format (table 5⇓ and example 2) is desirable, as it allows the reader to see the concordant and discordant pairs. The matched tabulation facilitates the use of such trials in future meta-analyses as it allows using appropriate formulas to adjust the between treatment variance downwards by accounting for the within person correlation, even when not explicitly presented.646566 Presentation of the 2×2 table of results from a within person design in a parallel trial format does not allow for appropriate adjustments of the between treatment variance.66 The paired presentation is also helpful for future sample size calculations.
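To illustrate why the matched tabulation matters, the sketch below runs a McNemar test directly from the discordant cells of a matched table. The counts are hypothetical and are not taken from any cited trial; the function name `mcnemar` is likewise an assumption for this example. The exact P value uses the binomial distribution of the discordant pairs.

```python
from math import comb


def mcnemar(a_only, b_only):
    """McNemar chi-square statistic and exact two-sided P value.

    a_only: pairs where only the site given treatment A had the event
    b_only: pairs where only the site given treatment B had the event
    Concordant pairs do not enter the test.
    """
    n = a_only + b_only
    if n == 0:
        raise ValueError("no discordant pairs")
    chi_square = (a_only - b_only) ** 2 / n
    k = min(a_only, b_only)
    # Binomial(n, 0.5) probability of a split at least this extreme, doubled
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return chi_square, min(1.0, 2.0 * tail)


# Hypothetical matched table for 50 participants
table = {"both": 20, "a_only": 15, "b_only": 5, "neither": 10}
chi_square, p_exact = mcnemar(table["a_only"], table["b_only"])
print(chi_square, round(p_exact, 4))
```

A parallel format 2×2 table would show only the marginal totals (35/50 and 25/50 here), from which the discordant counts cannot be recovered.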
Ideally, patient preference outcomes should also be reported at the participant level, as in example 3.
Item 19: Harms
Standard CONSORT item—All important harms or unintended effects in each group (for specific guidance see CONSORT for harms9).
Extension for within person trials—Harms or unintended effects reported by participant and by body site.
Example 1—“Pimecrolimus cream was generally well tolerated. No severe adverse events were encountered during the study, although 20% of patients experienced an adverse event (three of the 15 patients who completed the study). The most common side effects were application site reactions (burning, stinging), which were self limiting. One patient complained of hyperpigmentation in an initially severely inflamed area, which was considered to be post inflammatory hyperpigmentation. No patient reported exacerbation of rosaceiform eruption after the use of pimecrolimus.” 22
Example 2—“Treatment related adverse events (AEs) occurred in 41/48 (85.4%) patients. All treatment related AEs were application site reactions, most commonly irritation. The majority of AEs were of mild-to-moderate severity. Almost all treatment related AEs occurred during the split face phase of the study, with only 11 patients (22.9%) having a treatment related AE during full face treatment with clindamycin 1%/benzoyl peroxide 5% gel (C/BPO). Three patients developed severe cutaneous AEs during the split face phase of the study, all of which subsided during continued treatment, treatment interruption or dose reduction. One patient discontinued treatment due to moderate application site irritation. There were no serious AEs. A post hoc analysis, which was conducted to determine on which side of the face AEs occurred, indicated that treatment related AEs, including irritation, dryness, and erythema, were significantly less common with C/BPO than the adapalene 0.1%/benzoyl peroxide 2.5% gel (A/BPO) (P≤0.01).” 67
Example 3—“Minor and transient adverse reactions included herpes simplex virus reactivation (confined to the lips) (one patient, 3.5%).”68
Explanation—Presentation of harms or unintended effects at the body site (examples 1 and 2) and at the participant level (example 3) is important for within person randomised trials. In concurrent trials in which harms have affected participants in a way not specific to a site (eg headache, nausea), it will often be impossible to attribute the symptom to a specific intervention. In this case no attempt should be made for those outcomes to be attributed to a particular intervention.
Item 20: Limitations
Standard CONSORT item—Trial limitations, addressing sources of potential bias, imprecision, and, if relevant, multiplicity of analyses.
Example—“Thirty two patients (23 women, 9 men) were initially enrolled for the study. Out of those, complete data were available for 26 patients (18 women, eight men). Two of the remaining six patients failed to attend for their second side operation. Both were women who had an open release on one side and reported complete relief of symptoms at two weeks: both then failed to attend any further appointments despite repeated reminders. Another, who reported good relief of her symptoms after a Knifelights release, refused to have the second side done under local anaesthesia.”69
Explanation—A limitation of the within person design is that the treatment of one member of the pair of organs or sites can affect the other member of the pair, either to improve the outcome with the other intervention or to suppress the effect. This carry across effect could potentially render a within person trial invalid, and such a limitation is unlikely to be reported given that it would invalidate the trial results. Possible limitations that should be reported include losses to follow-up before the second intervention is applied in sequential designs and mixing up of the interventions, such as which eye gets which eye drops.
Item 21: Generalisability
Standard CONSORT item—Generalisability (external validity, applicability) of the trial findings.
Example—“It is also interesting to note that the binocular vision was not affected by the conjunction of corneal and intraocular refractive procedures. This seems logical, as ametropias and accommodative effort were almost symmetrical in both eyes of the patients, before and after surgery. The visual improvement had no effect on binocularity; however, this may have been because the patients in this study had no previous asthenopic troubles. Perhaps a different result would have been obtained if the previous state of binocularity had been more fragile (eg in high unilateral myopia). Nevertheless, one can remark that the only patient with previous strabismus had no change in the postoperative tests. In this study, it has been shown that in symmetrical myopia, if preoperative binocular vision is correct, the use of the two different techniques (corneal or intraocular refractive surgery) on either of the eyes had no effect on binocularity, and the tremendous difference in keratometry power was well tolerated.” 70
Explanation—Generalisability refers to the applicability of trial findings to other settings; therefore, a question for within person trials is whether the findings are externally valid to patients with unilateral disease or who receive the same intervention to both sites. Bilateral disease can sometimes indicate poorer clinical status than unilateral disease. For example, diabetic neuropathy is a systemic consequence of diabetes that is considered worse if multiple limbs are affected, and the need for multiple dental implants is indicative of a worse dental condition.
Giving the participant the interventions with a time difference, eg early and late loaded implants or one hip replacement at a time, can potentially influence the outcome. The outcome of the first intervention could affect the outcome of the second intervention and hence the applicability of the within person trial findings in other settings. In some cases, however, the sequential approach is standard clinical practice (eg cataract surgery).
More complicated trial designs
We have largely discussed reporting of the simple within person design, where each participant has two sites that receive the two competing interventions either concurrently or sequentially. Here we briefly discuss more complicated variations of the simple within person design.
Asymmetric conditions (multiple lesions)
Some conditions (such as warts, bedsores, leg ulcers, psoriasis, and dental caries) can occur in multiple sites concurrently. Trials of such conditions require careful consideration of study design, with strong implications for data analysis and the presentation of results.
Study design and treatment allocation—Suppose, for example, we want to design a randomised trial to see which of two treatments leads to better outcome for treating lesions of some sort (eg faster resolution). Participants in a trial are likely to have varying numbers of lesions or affected body sites. Several designs are possible:
Include just one lesion per patient either randomly selected or perhaps the most severe lesion.
Example: Watson et al71 compared high frequency ultrasonography for up to 12 weeks plus standard care with standard care alone to treat venous ulcers. The primary outcome was time to healing of the largest eligible leg ulcer.
Example: Rajak et al72 compared recurrence of trachomatous trichiasis in Ethiopia using either absorbable or silk sutures by randomising only one eye per participant.
Comment: This design avoids potential carry across effect by turning the trial into a parallel group design but loses the efficiencies of a within person design.
Choose exactly two lesions per patient (disregarding any additional lesions and patients with only one lesion), at random or in relation to severity. Select at random which treatment each lesion will receive and carry out a simple within person paired analysis.
Example: “The GDPs [general dental practitioners] recruited children who had caries affecting pairs of primary molar teeth, which were matched for tooth type, arch, and extent of caries. Where more than one pair of matched carious lesions were present in a child’s mouth, the dentist chose which pair should be part of the study. Any carious teeth outwith the study were managed as per the dentists’ normal treatment regime.”35
Comment: This turns the design into the simple within person design covered in this guideline; however, site selection can be a source of bias.
Randomise patients to a treatment that is then applied to all their lesions and consider whether all the lesions disappeared or not.
Example: “[Participants] were 101 hands (79 patients) treated in the department. ECTR [endoscopic carpal tunnel release] was performed in 51 hands (40 patients), and OCTR [open carpal tunnel release] was performed in 50 hands (39 patients).”73
Example: “Fifty one consecutive patients (44 women and 7 men) with unilateral or bilateral hallux valgus gave their informed consent before entering the trial. The type of osteotomy for each patient was randomised by the use of a computer generated list. In bilateral cases, both feet had the same selected operation during the same operating session. The Wilson group included 42 feet in 26 patients (three with rheumatoid arthritis) . . . The chevron group comprised 45 feet in 25 patients.” 74
Comment: This design is a cluster randomised trial in which the clusters are the individual participants, and it uses the maximum potential amount of data. In this design a combined severity index across all lesions can be calculated for a patient, such as the total lesion area. This converts the design into an individually randomised trial.
Randomise each lesion separately, possibly using blocking within patients to make sure that each patient receives both treatments (patients with just one lesion could be excluded).
Example: Stender et al75 randomised individual warts (1-19 per patient) using blocks of size two within patients.
Comment: This design avoids choosing only some affected lesions for participants who have many and is similar to a matched clustered randomised design with variable cluster size as some patients with more lesions contribute disproportionately to the overall result. Proper analysis can downplay the effect of a single patient with multiple lesions accordingly.
Group each patient’s lesions by site, eg by limb or by side of body, and randomise the sites within patients, so that all lesions in one site receive the same treatment.
Example: Helsing et al76 compared fractional CO2 laser assisted photodynamic therapy versus laser alone in 10 organ transplant recipients with a total of 680 actinic keratosis and 409 wart-like lesions on the dorsal hands.
Comment: This also resembles a matched clustered randomised design. As for cluster RCTs, including a small number of patients each with a larger number of lesions is less desirable than a large number of patients with few lesions each. Because of intra-individual correlations the greatest power comes from having more patients with fewer lesions. Not only is there much less impact of the intra-individual correlations, but there is better generalisability.
Within person trials can evaluate more than two treatments if all included patients have three or more lesions.
Example: “To be included in the study, participants were required to have at least three radiographically observed caries proximal lesions in the posterior teeth, with a score of 3 or 4 in the following modified radiographic scoring system: 0, no radiolucency; 1, outer half of enamel; 2, inner half of enamel; 3, around the enamel dentin junction; 4, outer third of dentin; 5, middle third of dentin; 6, inner third of dentin; and 7, not assessable. The three lesions were randomly allocated (in randomly permuted blocks generated by SPSS) to one of three groups undisclosed to the participants: A, infiltration; B, sealing; C, placebo.” 77
Comment: This is a three arm trial that allows the comparison of three treatments using a within person design.
Analysis—Statistical analysis will also vary according to the design, using methods appropriate for binary outcomes (eg disappearance of wart), time to event (eg time to heal), or continuous outcomes (eg reduction in size of lesion). Multilevel modelling can be implemented when multiple sites in participants are analysed.
A common error is to ignore the design and analyse data at the level of the lesion—that is, to assume that each lesion is from a different person. This leads to spurious precision. For example, Stender et al75 randomised individual warts (1-19 per participant) but analysed the data at the level of the wart not the participant.
For designs with multiple sites the data can be reduced to one observation per intervention by combining across multiple lesions. For example, the RECIST criteria78 are used to get an overall measure of severity for patients with multiple solid tumours (eg mesothelioma). Another approach is to take for each patient the proportion of lesions successfully treated (eg Wiegell et al79). The disadvantages of these approaches include loss of information and assignment of equal weight to all patients regardless of the number of affected lesions. Whether treatments may be less effective for patients with more lesions can be considered in a subsidiary analysis.
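The second approach, one proportion per patient and intervention, can be sketched as below. The lesion records, patient identifiers, and the function name `proportion_cleared` are hypothetical and serve only to show the data reduction step.

```python
from collections import defaultdict

# Hypothetical lesion-level records: (patient, treatment, lesion cleared?)
lesions = [
    ("p1", "A", True), ("p1", "A", False), ("p1", "B", False),
    ("p2", "A", True), ("p2", "B", True), ("p2", "B", False), ("p2", "B", False),
]


def proportion_cleared(records):
    """Reduce lesion-level data to one proportion per patient and treatment."""
    counts = defaultdict(lambda: [0, 0])  # (patient, treatment) -> [cleared, total]
    for patient, treatment, cleared in records:
        counts[(patient, treatment)][0] += int(cleared)
        counts[(patient, treatment)][1] += 1
    return {key: cleared / total for key, (cleared, total) in counts.items()}


summary = proportion_cleared(lesions)
for key in sorted(summary):
    print(key, round(summary[key], 2))
```

The resulting per-patient proportions can then be compared with a paired method. As noted above, each patient now carries equal weight whether they contributed two lesions or twenty.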
Presentation of results—Authors should report the distribution of the number of affected lesions across patients separately for each treatment.
Mixture of participants with unilateral and bilateral disease
Example—“If both eyes had high risk prethreshold ROP [retinopathy of prematurity], one eye was randomized to treatment at the prethreshold level, and the other (the control eye) was followed and managed conventionally. If the control eye reached threshold severity of ROP, and this was confirmed by a second examiner, the eye underwent peripheral retinal ablation. Otherwise, it was observed. When only one eye had high risk prethreshold ROP and the fellow eye had milder disease, a separate randomization scheme assigned such children with asymmetric ROP to one of the two study groups (early treatment of the high risk eye versus conventional management of the high risk eye, with treatment at threshold if needed). Restricted randomization was performed within each study center using a block size of 2-6. The exact block size was unknown to study center personnel. This ensured that after every block was completed, an equal number of infants with asymmetric disease would be in each study group. The small block size was necessary since only 20% of all children within a study center who meet the criteria for randomization were expected to have asymmetric disease. If the less severe fellow eye subsequently progressed, it was managed conventionally.”80
Comment: This design is a mix of a simple parallel design in which participants with unilateral disease receive a single treatment selected at random and a within person design in which the two available sites per individual are randomised to receive one of two treatments. The unilateral and bilateral datasets should be analysed separately using methods appropriate for independent and paired data, respectively. The two results can possibly be combined using meta-analysis methods to give an overall effect.8182
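One common way to combine two such results is fixed effect inverse variance pooling, sketched below. This is an assumption about which meta-analysis method might be used, not a description of the method in the cited references. The estimates and standard errors are hypothetical and must be on the same scale (eg log odds ratios), with the bilateral standard error already reflecting the paired analysis.

```python
import math


def inverse_variance_pool(estimates):
    """Fixed effect pooling of (estimate, standard_error) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci


# Hypothetical treatment effects on a common scale (eg log odds ratios)
bilateral = (-0.60, 0.20)   # from the paired analysis
unilateral = (-0.30, 0.40)  # from the independent groups analysis
pooled, pooled_se, ci = inverse_variance_pool([bilateral, unilateral])
print(round(pooled, 3), round(pooled_se, 3))
```

The more precise estimate receives more weight, so a correctly analysed bilateral dataset with a high within person correlation will tend to dominate the overall effect.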
Tables 6 and 7⇓, adapted from the ETROP trial,82 show baseline data and estimates with associated 95% confidence intervals separately for the bilateral and unilateral cases. Some of the included values are fictional.
Reports of RCTs should include key information on the methods and findings to allow readers to accurately interpret the results. Similarly, replication of methods and results requires complete reporting.83 This information is particularly important for meta-analysts attempting to extract data from such reports. The CONSORT 2010 statement provides the latest recommendations from the CONSORT group on essential items to be included in the report of an RCT. We have described an extension of the CONSORT checklist specific to reporting within person randomised trials.
Use of the CONSORT statement for the reporting of parallel trials with two groups is associated with improved reporting quality.8485 We think that the routine use of this proposed extension to the CONSORT statement will eventually result in similar improvements to within person designs. When reporting a within person randomised trial, authors should address all items on the CONSORT checklist using this extension document in conjunction with the main CONSORT guidelines.2
The CONSORT statement can help researchers to design trials and can guide peer reviewers and editors in their evaluation of manuscripts. Many journals recommend adherence to the CONSORT recommendations in their instructions to authors. We encourage journals, especially those that publish trials from the fields of dentistry, dermatology, hand surgery, and ophthalmology, to direct authors to this and to other extensions of CONSORT for specific trial designs. The most up to date versions of all CONSORT recommendations can be found at www.consort-statement.org.
We thank Barbara Hawkins for her contribution to the early drafts of this paper.
Contributors: DGA, DE, RWS and Barbara Hawkins initiated the work. NP, BC, RWS, DE and DGA drafted the manuscript and all authors reviewed it and approved the final version. DGA is the guarantor.
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.