Randomised study of n of 1 trials versus standard practice
BMJ 1996; 312 doi: http://dx.doi.org/10.1136/bmj.312.7038.1069 (Published 27 April 1996) Cite this as: BMJ 1996;312:1069
- Jeffrey Mahon, assistant professor (a)
- Andreas Laupacis, associate professor (a)
- Allan Donner, professor and chairman (b)
- Thomas Wood, associate professor (c)
- (a) Department of Medicine and Epidemiology, University of Western Ontario, London, Ontario, Canada
- (b) Department of Epidemiology and Biostatistics, University of Western Ontario
- (c) Department of Medicine, University of Western Ontario
- Correspondence and requests for reprints to: Dr Jeffrey Mahon, Room 60F-11, University Hospital, PO Box 5339, London, Ontario, Canada N6A 5A5.
- Accepted 11 March 1996
Objective: To compare outcomes between groups of patients with irreversible chronic airflow limitation given theophylline by n of 1 trials or standard practice.
Design: Randomised controlled study of n of 1 trials versus standard practice.
Setting: Tertiary care centre outpatient department.
Subjects: 31 patients with irreversible chronic airflow limitation who were unsure that theophylline was helpful after an open trial.
Interventions: n Of 1 trials (single patient randomised multiple crossover comparisons of theophylline against placebo) followed published guidelines. For standard practice patients theophylline was stopped and resumed if their dyspnoea worsened; if their dyspnoea then improved theophylline was continued. For both groups a decision to continue or stop the drug was made within three months of randomisation.
Main outcome measures: Exercise capacity as measured by six minute walking distance, quality of life as measured by the chronic respiratory disease questionnaire at baseline and six months after randomisation, and proportions of patients taking theophylline at six months.
Results: 26 patients completed follow up. 47% fewer n of 1 trial patients than standard practice patients were taking theophylline at six months (5/14 versus 10/12; 95% confidence interval of difference 14% to 80%) without differences in exercise capacity or quality of life.
Conclusions: n Of 1 trials led to less theophylline use without adverse effects on exercise capacity or quality of life in patients with irreversible chronic airflow limitation. These data directly support the presence of a clinically important bias towards unnecessary treatment during open prescription of theophylline for irreversible chronic airflow limitation. Confirmation in a larger study and similar studies for other problems appropriate for n of 1 trials are needed before widespread use of n of 1 trials can be advocated in routine clinical practice.
Several common clinical problems suit n of 1 trials, including prescription of theophylline for irreversible chronic airflow limitation, yet they are rarely used
Among patients with chronic airflow limitation randomised to receive theophylline by an n of 1 trial or standard practice 47% fewer n of 1 trial patients were taking theophylline after six months without difference in exercise capacity or quality of life
There seems to be a clinically important bias towards unnecessary treatment in standard practice in this setting; n of 1 trials may limit this bias
In their usual form n of 1 trials are randomised, double blind multiple crossover comparisons of an active drug against placebo in a single patient.1 2 3 They limit the biases of standard practice or open before and after trials of treatment. These biases are thought to lead to false conclusions that the treatment is effective and include the placebo effect, the tendency for physicians and patients to want the treatment to work, and the effect of regression to the mean.1 2 We hypothesised that the objectivity of n of 1 trials in determining treatment in a single patient would lead to a better outcome over standard practice—including the use of less medication—when n of 1 trials are used in groups of patients. Randomised studies confirming this hypothesis would support the wider use of n of 1 trials. At present n of 1 trials are rarely used despite their suitability for many problems.2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 We report a randomised study of n of 1 trials versus standard practice for theophylline for irreversible chronic airflow limitation.
Patients and methods
We chose to study treatment with theophylline for irreversible chronic airflow limitation because the disease is common and the treatment and disease meet prerequisites for n of 1 trials.1 Specifically, chronic airflow limitation is comparatively stable; theophylline acts and, once withdrawn, stops acting quickly; and theophylline does not change the natural course of the disease. In addition, though the efficacy of theophylline for irreversible chronic airflow limitation has been established in conventional randomised controlled trials, its efficacy in individual patients is often in doubt.18 19
PATIENT CHARACTERISTICS AND ETHICS
Patients were recruited from a chronic airflow limitation clinic and the outpatient practice of a general physician. Irreversible chronic airflow limitation required a forced expiratory volume in one second <70% of predicted and a ratio of forced expiratory volume in one second to forced vital capacity <70% of predicted on two occasions within two weeks. Twenty five of 31 randomised patients had forced expiratory volumes in one second that did not increase by more than 15% (or 200 ml) after inhaled salbutamol. The other six patients (four randomised to n of 1 trials, two randomised to standard practice) did not have spirometry before and after salbutamol at entry but were judged clinically to have non-significant reversibility.
All patients had taken theophylline for one to five years before entry with a dosing schedule established clinically and by monitoring blood concentrations of the drug, and all but two were taking the drug at the time of recruitment. These two patients (one randomised to n of 1 trials, the other randomised to the standard practice group) had stopped theophylline within three months before first contact by study personnel because of lack of apparent benefit. For both patients an open trial of theophylline was given for two weeks at the previously used dose and a predose theophylline concentration determined to be in the therapeutic range (50-110 µmol/l). All patients were uncertain that theophylline was helpful while taking it openly. This was established by the patient not answering yes to the question, “Are you certain that theophylline is helping you?” Patients fulfilling the entry criteria were randomised by coin toss to either n of 1 trials or standard treatment by a person unaware of their baseline characteristics.
The study was approved by the local ethics committee and patients' informed consent obtained after full explanation.
N OF 1 TRIALS
n Of 1 trials followed published guidelines.20 Patients identified their most troubling symptoms. Those symptoms likely to be improved by theophylline were incorporated into a diary rating severity of symptoms on a Likert scale of 1 to 7. In all cases the symptom was dyspnoea during routine activities. For example, if dyspnoea while climbing stairs at home was identified the question read, “How short of breath were you today when you were climbing the stairs?” Responses ranged from 1 (extremely short of breath) to 7 (no shortness of breath).
Sustained release theophylline (Theo-Dur) and identical placebo were dispensed in pairs of treatment periods (10 days for theophylline, 10 days for placebo). The starting dose of Theo-Dur was the same dose used before entry (if the patient had been taking Theo-Dur) or its pharmacological equivalent if another theophylline preparation had been used. The order of theophylline and placebo within pairs was randomly determined according to a computer generated scheme held in the participating pharmacy. The physician supervising n of 1 trials (JM) was blinded, and diaries were completed on days 5, 7, 9, and 10 of each treatment period. Patients could contact the physician before completing a treatment period if their symptoms became unacceptable.20 If the deterioration could not be explained clinically by an event other than withholding theophylline (for example, a respiratory infection), then patients were asked immediately to switch to the other treatment. If the deterioration occurred while patients were receiving the second drug in a treatment pair they were asked to stop the treatment and return to the clinic for review. For early switching or stopping treatment the event and its circumstances were recorded.
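The within pair randomisation described above can be illustrated with a short sketch. It assumes an independent random ordering for each pair; the function name, the seed, and the labels are illustrative only and do not describe the scheme actually held by the pharmacy.

```python
import random

def allocation_schedule(n_pairs=4, seed=None):
    """Return the treatment order for each pair of 10 day periods.

    Hypothetical helper: every pair contains theophylline and placebo
    once each, and the order inside the pair is chosen at random.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_pairs):
        pair = ["theophylline", "placebo"]
        rng.shuffle(pair)  # random order within this pair
        schedule.append(tuple(pair))
    return schedule

# Example: a schedule for the maximum of four pairs (seed chosen arbitrarily).
for number, (first, second) in enumerate(allocation_schedule(seed=1996), start=1):
    print(f"pair {number}: {first} then {second}")
```

Randomising within pairs, rather than across all periods, guarantees that each completed pair yields one theophylline period and one placebo period for comparison.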
The main argument for early switching or stopping treatment is to limit the ethical problem of a patient becoming and remaining severely symptomatic during placebo treatment. Whether early switching or stopping occurs depends on the effectiveness of the treatment and how much discomfort the patient is willing to tolerate. The main impact of early switching on interpretation of an n of 1 trial result is to reduce the quantitative data (that is, diary scores) available for analysis. If the switch occurs before collection of any diary information, then analysis based on personal diaries cannot be performed for that treatment period. However, the use of blinding and randomisation still allows qualitative interpretation of the n of 1 trial result that may be less biased than the result of a standard open before and after trial of treatment.
Blood was drawn for measurement of the predose theophylline concentration once during the last five days of each 10 day period. The results were provided to the study physician without the date of collection and the theophylline dose adjusted for the next treatment period if the concentration was outside the therapeutic range.
Patients were reviewed at the end of each treatment pair. Up to four pairs could occur. However, the patient or physician could stop the n of 1 trial earlier for any reason. On stopping the trial the code was opened and the mean symptom score for each 10 day treatment period determined. The mean difference and 90% confidence interval of the mean symptom score between treatment pairs (theophylline minus placebo) were also determined when possible. Confidence intervals were based on Student's t distribution.20
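As a sketch of this calculation, the code below takes the diary scores for each treatment period, forms one theophylline minus placebo difference per pair, and derives a 90% confidence interval from Student's t distribution. The diary scores are invented for illustration and are not study data; the function name is hypothetical, and the tabulated critical values cover only the two to four pairs that the design allows.

```python
from math import sqrt
from statistics import mean, stdev

# Two sided 90% critical values of Student's t for 1 to 3 degrees of
# freedom, that is, for 2 to 4 completed treatment pairs.
T_CRIT_90 = {1: 6.314, 2: 2.920, 3: 2.353}

def pair_difference_ci(pairs):
    """pairs: list of (theophylline_scores, placebo_scores), one per pair.

    Returns (mean difference, lower limit, upper limit) for the
    theophylline minus placebo difference in mean symptom score.
    """
    if not 2 <= len(pairs) <= 4:
        raise ValueError("needs two to four completed treatment pairs")
    # One difference in mean symptom score per treatment pair.
    diffs = [mean(active) - mean(placebo) for active, placebo in pairs]
    n = len(diffs)
    centre = mean(diffs)
    half_width = T_CRIT_90[n - 1] * stdev(diffs) / sqrt(n)
    return centre, centre - half_width, centre + half_width

# Invented diary scores for days 5, 7, 9, and 10 of each period.
example = [
    ([5, 5, 6, 5], [4, 4, 5, 4]),
    ([5, 6, 5, 5], [4, 5, 4, 5]),
    ([6, 5, 5, 6], [5, 4, 5, 4]),
]
centre, lower, upper = pair_difference_ci(example)
print(f"mean difference {centre:.2f} (90% CI {lower:.2f} to {upper:.2f})")
```

With only two pairs the critical value is 6.314, so the interval is wide unless the two pairwise differences agree closely.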
If one treatment pair or fewer was completed, a recommendation to stop or continue theophylline was made according to clinical judgment. If two or more treatment pairs were completed the mean difference in mean symptom scores between theophylline and placebo and its associated 90% confidence interval were used to make a recommendation according to the scheme shown in figure 1. This scheme was modified from that of Guyatt et al2 and accepted their arguments that (a) the minimal clinically important difference (either benefit or harm) for n of 1 trials shown by a seven point Likert scale was a mean change of 0.5 per symptom21 and (b) serial correlation in mean responses within a patient could safely be ignored.22
A “statistically conclusive” n of 1 trial result required that the 90% confidence interval of the mean difference in symptom score should not include 0; a “statistically inconclusive” trial result meant that the confidence interval included 0. Our criteria for deciding on treatment were weighted towards stopping theophylline, in the sense that beneficial trends in favour of theophylline (that is, point estimates on the mean difference in mean symptom score between 0 and 0.5) did not usually lead to a recommendation to continue the drug (see fig 1). The rationale for more stringent criteria for continuing theophylline was that the potential side effects of the drug justified not giving it to patients having only marginal symptomatic improvement. We accepted 90% confidence intervals in deciding about a treatment effect within an n of 1 trial as a compromise between the inherent low power of n of 1 trials having four or fewer treatment pairs and the need to avoid type I (false positive) errors.2 4
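The definitions above can be expressed as a small labelling function. This is a sketch only: it does not reproduce the decision scheme of figure 1, and the function and key names are hypothetical. The two intervals used are those reported in the results for cases 6 and 7.

```python
MCID = 0.5  # minimal clinically important difference on the 7 point scale

def describe_result(lower, upper, mcid=MCID):
    """Summarise a 90% confidence interval for theophylline minus placebo.

    Positive differences favour theophylline. The labels are descriptive
    only; they are not the treatment recommendation itself.
    """
    return {
        # "Statistically conclusive" means the interval does not include 0.
        "conclusive": lower > 0 or upper < 0,
        # Does the interval still allow a benefit or harm of at least the MCID?
        "important_benefit_possible": upper >= mcid,
        "important_harm_possible": lower <= -mcid,
    }

case_6 = describe_result(0.41, 1.37)   # reported as conclusive for theophylline
case_7 = describe_result(-0.25, 1.09)  # reported as a favourable trend
print(case_6)
print(case_7)
```

For case 7 the interval includes 0, so the result is statistically inconclusive, yet it allows an important benefit and excludes an important harm, which is consistent with the reasoning given for that patient.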
Patient and physician confidence in the treatment decision on completion of the n of 1 trial was assessed independently by the question, “How confident are you that theophylline should be stopped?” or “…continued?” with responses ranging from 1 (not at all confident) to 7 (extremely confident).
Patients treated according to standard practice stopped theophylline. They were asked to contact the study physician if their dyspnoea became worse. The drug was resumed if this deterioration was deemed clinically not to be due to a respiratory infection or heart failure. If the dyspnoea improved after resumption of theophylline the patient was asked to continue the drug. All standard practice patients were reviewed in the clinic three months after randomisation, when confidence in the decision about theophylline was assessed in the same way as for the n of 1 trial patients.
We determined the proportion of patients taking theophylline at six months. The chronic respiratory disease questionnaire and six minute walking distance were also assessed at baseline and six months. The chronic respiratory disease questionnaire is a responsive, valid quality of life index specific for chronic airflow limitation.19 23 It measures four domains, which can be combined into two larger domains of physical function (combining dyspnoea and fatigue) and emotional function (combining emotional function and mastery). The six minute walking test has been used to measure functional capacity in several chronic diseases, including chronic airflow limitation.19 Minimal clinically important differences have been suggested for both indices: for the chronic respiratory disease questionnaire it is a mean change of 4.5 in the physical function score and 5.5 in the emotional function score21; for the six minute walking test it is 30 m.19 Personnel administering outcome measures were blind to treatment group allocation and patients were instructed to maintain this blinding.
SAMPLE SIZE AND STATISTICAL ANALYSIS
One aim of the study was to evaluate the feasibility of a large scale randomised trial. A sample size of 30 randomised patients was judged to be sufficient to examine feasibility issues. The difference between the groups in the proportion of patients taking theophylline at six months was calculated, with 95% confidence intervals assessed by the normal approximation to the binomial.24 Between group differences (n of 1 trial minus standard treatment group) in the within group changes over six months in the chronic respiratory disease questionnaire scores and six minute walking distance were compared by 95% confidence intervals based on Student's t distribution.24 Analyses of covariance controlling for age, sex, and baseline six minute walking distance and chronic respiratory disease questionnaire scores were also performed and yielded very similar results to those based on the within group changes over six months. Physician and patient confidence for the n of 1 trial and standard practice groups were compared by unpaired t tests.
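The confidence interval for the difference in proportions can be sketched as follows, using the counts given in the abstract (5/14 n of 1 trial patients and 10/12 standard practice patients taking theophylline at six months). The unpooled standard error shown here is an assumption, as the exact variant of the normal approximation is not specified; the function name is illustrative.

```python
from math import sqrt

def risk_difference_ci(x1, n1, x2, n2, z=1.96):
    """Difference in proportions (group 2 minus group 1) with a 95% CI.

    Uses the normal approximation to the binomial with an unpooled
    standard error (assumed variant).
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p2 - p1
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se

# Group 1: n of 1 trials (5/14); group 2: standard practice (10/12).
diff, lower, upper = risk_difference_ci(5, 14, 10, 12)
print(f"difference {diff:.0%} (95% CI {lower:.0%} to {upper:.0%})")
```

The result is close to the reported 47% (14% to 80%); a discrepancy of about one percentage point at the lower limit may reflect rounding or a slightly different variant of the approximation.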
Results
PATIENT CHARACTERISTICS AND FOLLOW UP
Over three months 16 patients were randomised to n of 1 trials and 15 to standard practice. Of the 16 patients randomised to n of 1 trials, one had an exacerbation of chronic airflow limitation after completing baseline studies but before starting the trial and chose not to continue. A second patient completed the n of 1 trial but died of end stage respiratory failure four months after entry. Of the 15 patients randomised to standard practice, two withdrew shortly after randomisation but before stopping theophylline (one had “second thoughts” and one was admitted to hospital for acute myocardial infarction) and one withdrew at three months on the discovery of metastatic liver cancer. Table 1 shows the baseline characteristics of the 26 (84%) patients followed up to six months.
N OF 1 TRIALS
Table 2 summarises the n of 1 trial results. Switching or stopping the assigned drug before completion of the 10 day treatment period occurred in three instances. All other 10 day treatment periods (73 in total) were completed without early switching or stopping. A decision to continue theophylline was made in five cases. In one (case 1) the patient withdrew after four days of the first treatment period (active drug), having become sure on reconsideration that theophylline was helpful. In case 3 the patient chose to continue with theophylline after completing one treatment pair. The mean difference in mean symptom score (-2.15) strongly favoured placebo but the n of 1 trial was confounded by the introduction of prednisone by another physician for an apparent exacerbation of chronic airflow limitation (worsening dyspnoea without fever or a change in sputum production). The prednisone was taken for 10 days, beginning on day 9 of the first treatment period (active drug) and continued through most of the second (placebo) treatment period. The patient refused further treatment pairs.
In case 5 the patient became dyspnoeic when taking placebo. This was not explained by a respiratory tract infection and resolved within two days of early crossover to theophylline. In case 6 the patient had a conclusive result in favour of theophylline after two treatment pairs (mean difference in symptom score 0.89; 90% confidence interval 0.41 to 1.37). Lastly, one patient (case 7) chose to continue with theophylline because the trend favoured a clinically important benefit (mean difference in symptom score 0.42; -0.25 to 1.09) and clinically important harm was unlikely.
Seventy six measurements of predose serum theophylline concentration were required in the n of 1 trials (based on 38 treatment pairs in 15 patients) and 53 were obtained. No patient had theophylline detectable in the serum (lower limit of detection 14 µmol/l) while taking placebo (27 measurements). The other 26 samples were drawn during theophylline treatment. The mean concentration in these samples was 66 µmol/l (range 26-143 µmol/l).
COMPARISON OF THE TWO GROUPS OF PATIENTS
Of the 12 standard practice patients followed up to six months, 10 resumed theophylline within one month of stopping and continued the drug for six months. In all cases resumption of theophylline followed patients' unsolicited reports of worse dyspnoea, which resolved within one week of restarting the drug. Significantly fewer n of 1 trial patients (5/14) than standard practice patients (10/12) were taking theophylline at six months (absolute difference 47%; 95% confidence interval 14% to 80%; table 3). The groups showed no significant differences in changes in the six minute walking distance and chronic respiratory disease questionnaire scores over the six months (table 3).
For the six minute walking distance and chronic respiratory disease questionnaire emotional function score the point estimates on these changes favoured the n of 1 trial patients—that is, they improved more over six months among n of 1 trial patients than among standard practice patients. In addition, the lower 95% confidence limits of the differences were well removed from the minimal values that would indicate clinically important declines among n of 1 trial patients relative to standard practice patients (30 m for the six minute walk,19 5.5 for the emotional function score21). For the chronic respiratory disease questionnaire physical function score the point estimate for the difference in the within group change over six months slightly favoured the standard practice group (-0.3) and the lower 95% confidence limit (-4.5) included the value judged to be of minimal clinical importance.21 There were no differences between the groups at six months in the use of other bronchodilators and home oxygen (data not shown).
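The comparison of a lower confidence limit with the minimal clinically important difference can be written as a one line check. The sketch below uses the thresholds quoted above and the one lower limit given in the text (-4.5 for the physical function score); the dictionary keys and function name are hypothetical.

```python
# Minimal clinically important differences quoted in the text.
MCID = {"six_minute_walk_m": 30.0, "crq_physical": 4.5, "crq_emotional": 5.5}

def important_decline_excluded(lower_limit, mcid):
    """True if the lower 95% limit lies strictly above -mcid.

    lower_limit is the lower confidence limit of the between group
    difference (n of 1 trial minus standard practice).
    """
    return lower_limit > -mcid

# Physical function score: the lower limit of -4.5 reaches the threshold.
print(important_decline_excluded(-4.5, MCID["crq_physical"]))  # False
```

A result of False means that a clinically important decline cannot be ruled out for that outcome.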
Physician confidence in the decision about theophylline was significantly stronger in the n of 1 trial group than in the standard treatment group (mean scores 5.2 and 4.4 respectively; P=0.02) whereas patient confidence in the decision was stronger in the standard treatment group, though not significantly so (mean scores 6.3 and 5.7; P=0.38).
Discussion
n Of 1 trials have been used to prescribe histamine receptor blockers for non-ulcer dyspepsia4; tricyclic antidepressants for fibromyalgia5; inhaled bronchodilators, inhaled steroids, and theophylline for chronic airflow limitation6; non-steroidal anti-inflammatory drugs and paracetamol for osteoarthritis3 16; antihistamines for atopic dermatitis7; and enalapril for hypertension.17 Other problems that suit n of 1 trials include oral steroids for chronic airflow limitation, gut motility agents for irritable bowel syndrome, antihistamines for allergic rhinitis, and anticonvulsants for epilepsy. These common problems lead to huge numbers of encounters between patients and physicians. Despite this and despite endorsements of the technique25 26 27 28 29 30 31 n of 1 trials are rarely used. This is probably because of the extra effort they demand from patients and physicians—which could be justified if randomised studies showed that n of 1 trials result in clinically important benefits over standard practice. This study is the first attempt at such a trial.
The difference in theophylline use at six months between the n of 1 trial and standard practice groups—without significant changes in exercise capacity and quality of life—suggests that the suspected bias of standard practice towards unnecessary treatment is real1 2 and that its impact is minimised by n of 1 trials. The potential clinical importance of this bias is underscored by the much greater use of theophylline among standard practice patients (difference 47%), the fact that the patients had taken the drug for up to five years, and the side effects and costs associated with theophylline.32 We doubt that bias in the decision about theophylline in the two groups accounted for the large difference in theophylline use at six months. Bias favouring n of 1 trials and leading to more patients in the standard practice group being told to continue the drug was unlikely because resumption of the drug in these patients followed an unsolicited complaint of worse dyspnoea. The low level of physician confidence (4.4) and high level of patient confidence (6.3) in the treatment decision in the standard practice group also argues against this. The decision about theophylline in the n of 1 trial group was usually governed by an objective, statistical result and not clinical judgment.
IMPORTANCE OF CONFIRMATION
Several qualifications make it important that our finding should be confirmed. Firstly, we cannot rule out that a clinically important decrease in at least one aspect of quality of life (physical function as measured by the chronic respiratory disease questionnaire) resulted from less theophylline use in the n of 1 trial group. This is because the 95% confidence interval of the difference between the two groups over six months for this domain included the value previously identified to be of minimal clinical importance. Secondly, the 95% confidence interval of the six month difference in theophylline use between the two groups was wide (14% to 80%). A larger study will improve the precision for this confidence interval and also fully ensure that stopping theophylline during n of 1 trials does not lead to an important decline in the chronic respiratory disease questionnaire physical function score.
Thirdly, follow up did not go beyond six months. Longer follow up (for example, to one year) would better establish the permanence and clinical importance of treatment decisions made through n of 1 trials in this setting. Fourthly, we did not compare the economic costs and savings of n of 1 trials relative to standard practice. Though it is likely that the higher initial costs of n of 1 trials (including those of extra physician and patient time, preparing and dispensing treatment, and preparing diaries) would be outweighed by longer term savings through reducing the use of ineffective treatment,1 2 3 prospective collection and comparison of these costs are needed for confirmation. Finally, these data cannot be generalised beyond the problem of theophylline for irreversible chronic airflow limitation nor to other physicians.
Notwithstanding these qualifications, we have established a strong rationale for further randomised studies of n of 1 trials versus standard practice in many settings. The burden from unnecessary drug use for the problems noted above, as well as other problems for which n of 1 trials have successfully been applied,8 9 10 11 12 13 14 15 is potentially very large. Randomised studies of n of 1 trials versus standard practice need to be done for these clinical problems, and in other patient populations by other physicians, to confirm and establish the generalisability of our results. If such studies show that the benefits of n of 1 trials are worth the effort, then their current limited role in routine clinical practice needs re-evaluation.
We acknowledge Dr Brian Larocque for help in recruiting subjects; Mrs Cindy Cartwright for preparing the manuscript; Mrs Linda Moyer, study nurse; Drs Gordon Guyatt and Ian Stiell for reviewing the manuscript; and Astra Pharma Incorporated (Mississauga, Ontario) for the study medication. AL is a scientist of the Medical Research Council of Canada.
Funding: The Ontario Ministry of Health (grant No 04348F).
Conflict of interest: None.