Sham device v inert pill: randomised controlled trial of two placebo treatments
(Published 16 February 2006)
Cite this as: BMJ 2006;332:391
- Ted J Kaptchuk, assistant professor of medicine1,
- William B Stason, lecturer in health policy and management2,
- Roger B Davis, associate professor of medicine3,
- Anna R T Legedza, instructor of medicine3,
- Rosa N Schnyer, research associate1,
- Catherine E Kerr, instructor in medicine1,
- David A Stone, director4,
- Bong Hyun Nam, fellow in medicine1,
- Irving Kirsch, professor of psychology5,
- Rose H Goldman, associate professor of medicine6
- 1 Osher Institute, Harvard Medical School, Boston, MA, 02215 USA
- 2 Harvard School of Public Health, Boston, MA 02215, USA
- 3 Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA
- 4 South-East European Research Centre, University of Sheffield, Thessaloniki, 54624 Greece
- 5 School of Applied Psychosocial Studies, University of Plymouth, Plymouth, Devon PL4 8AA
- 6 Cambridge Health Alliance, Cambridge Hospital, Harvard Medical School, Cambridge, MA 02139, USA
- Correspondence to: T J Kaptchuk
- Accepted 8 November 2005
Objective To investigate whether a sham device (a validated sham acupuncture needle) has a greater placebo effect than an inert pill in patients with persistent arm pain.
Design A single blind randomised controlled trial created from the two week placebo run-in periods for two nested trials that compared acupuncture and amitriptyline with their respective placebo controls. Comparison of participants who remained on placebo continued beyond the run-in period to the end of the study.
Setting Academic medical centre.
Participants 270 adults with arm pain due to repetitive use that had lasted at least three months despite treatment and who scored ≥3 on a 10 point pain scale.
Interventions Acupuncture with sham device twice a week for six weeks or placebo pill once a day for eight weeks.
Main outcome measures Arm pain measured on a 10 point pain scale. Secondary outcomes were symptoms measured by the Levine symptom severity scale, function measured by Pransky's upper extremity function scale, and grip strength.
Results Pain decreased during the two week placebo run-in period in both the sham device and placebo pill groups, but changes were not different between the groups (−0.14, 95% confidence interval −0.52 to 0.25, P = 0.49). Changes in severity scores for arm symptoms and grip strength were similar between groups, but arm function improved more in the placebo pill group (2.0, 0.06 to 3.92, P = 0.04). Longitudinal regression analyses that followed participants throughout the treatment period showed significantly greater downward slopes per week on the 10 point arm pain scale in the sham device group than in the placebo pill group (−0.33 (−0.40 to −0.26) v −0.15 (−0.21 to −0.09), P = 0.0001) and on the symptom severity scale (−0.07 (−0.09 to −0.05) v −0.05 (−0.06 to −0.03), P = 0.02). Differences were not significant, however, on the function scale or for grip strength. Reported adverse effects were different in the two groups.
Conclusions The sham device had greater effects than the placebo pill on self reported pain and severity of symptoms over the entire course of treatment but not during the two week placebo run in. Placebo effects seem to be malleable and depend on the behaviours embedded in medical rituals.
Questions and debate surround the scientific understanding of placebo effects.1 A recent National Institutes of Health conference declared that determining how placebo effects are modulated is an urgent priority,2–4 while a meta-analysis has cast doubt over whether placebo effects even exist in clinical settings.5 Devices, such as injections, invasive procedures, and acupuncture, are thought to have enhanced placebo effects but poor methods in the available research preclude definitive conclusions.6 Bioethicists at the National Institutes of Health have called for research “to test [whether] some treatments produce enhanced placebo effects.”7
We investigated whether a validated sham acupuncture device has a greater placebo effect than an inert pill in people with persistent upper extremity pain due to repetitive use, often called repetitive strain injury. This condition is the modern equivalent of “weaver's hand,” “sprout picker's thumb,” and “scrivener's palsy.”8
We carried out a parallel arm, single blind, randomised controlled trial created from the placebo run-in periods for two nested randomised controlled trials, one comparing acupuncture with a validated acupuncture sham device and the other comparing amitriptyline with placebo pill. Our primary investigation was the comparison of sham device with placebo pill during run in. We followed both placebo groups beyond the run-in period to examine the time course of placebo effects.
Participants in the acupuncture group received treatments twice a week. In the amitriptyline group, participants took one pill every day and a research assistant called them every other week to examine their progress and respond to questions. We deliberately compared placebo treatments as total entities and therefore did not control for each component of the intervention—for example, participants allocated to the device group had more contact with the practitioner, while those in the pill group took the placebo daily at home. At the end of the run-in period, participants were randomised again within each treatment group to receive either continued placebo or active treatment of the same type. Those in the acupuncture group were given two treatments each week over an additional four weeks (eight treatments was the minimum “dose” acupuncturists thought was necessary to effect improvement). The pill group received 25 mg of amitriptyline or continued taking the placebo pill daily for an additional six weeks. The longer treatment period for the pill group allowed adequate time for amitriptyline to achieve and maintain a steady state blood concentration for four weeks.
The study included adults (> 18 years old) with distal pain in the arms that had lasted for at least three months and resulted from repetitive use or prolonged static postures. Intensity of pain at enrolment had to be ≥3 on a 10 point numerical pain scale. Participants could have any of a range of clinical diagnoses involving the tendons, soft tissues, and nerves of the arm or non-specific symptoms related to repetitive movement or overuse. We excluded people with systemic connective tissue or muscular diseases, neurological disorders, or acute trauma to the arm and those using medications that interact with amitriptyline or who had had any previous treatment with acupuncture for arm pain or previous use of acupuncture within one year for any problem. Participants were allowed to continue any anti-inflammatory and non-excluded drugs but were discouraged from starting new treatments during the study.
We recruited participants from the community through advertisements and referrals from health professionals. Eligibility and willingness to participate were determined during telephone screening interviews. Candidates were then scheduled for enrolment visits during which they gave informed consent, completed questionnaires, and underwent testing of grip strength. A study physician performed a targeted physical examination to identify excluded conditions and to assign a clinical diagnosis for the arm pain according to preset criteria.9 If the person had bilateral symptoms, outcomes were measured in the arm with the more severe pain at baseline. If the pain was equal in both arms, the dominant arm was used.
During the informed consent process, potential participants were told they had a 50% chance of receiving inactive treatment for the entire study and a 50% chance of receiving active treatment at some time during the study. They were explicitly told about the most common side effects of each treatment: temporary aggravation of pain with acupuncture and sleepiness, dry mouth, dizziness, and restlessness with amitriptyline. As a recruitment incentive, they were told they could receive either acupuncture or amitriptyline free of charge after participation if they received only placebo treatment during the study.
We randomly assigned participants to either the sham acupuncture or placebo pill group using permuted block randomisation with variable block sizes and assignments in sequentially numbered opaque sealed envelopes. An administrative assistant not otherwise involved in the study opened the next envelope in the sequence and recorded the assignment in a confidential log. Participants who completed the run-in period were randomised to continue their initial placebo treatment or to begin active treatment of the same type.
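The allocation procedure described above can be illustrated with a minimal sketch of permuted block randomisation with variable block sizes. The block sizes (2, 4, and 6), the seed, and the function name are assumptions made for illustration; the trial's actual block sizes and software are not reported here.

```python
import random


def permuted_block_sequence(n_participants, block_sizes=(2, 4, 6),
                            arms=("sham device", "placebo pill"), seed=None):
    """Build an allocation list from randomly sized, internally balanced blocks.

    block_sizes are hypothetical; each must be a multiple of the number of arms.
    """
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        size = rng.choice(block_sizes)              # variable block size
        block = list(arms) * (size // len(arms))    # equal numbers of each arm
        rng.shuffle(block)                          # random order within the block
        sequence.extend(block)
    # The final block may be cut short, so the arms can differ by a few participants.
    return sequence[:n_participants]


# Sequentially numbered "envelopes": (envelope number, assignment)
envelopes = list(enumerate(permuted_block_sequence(270, seed=2001), start=1))
```

Varying the block size makes it harder for anyone who sees earlier assignments to predict the next one, while each completed block keeps the two groups balanced.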
During the placebo run in, participants assigned to the acupuncture group received two treatments a week with a sham acupuncture device that looks exactly like a real acupuncture needle. When the needle is “inserted into the skin” participants see and feel the needle penetration. In fact, the needle has a blunt tip and retracts into a hollow shaft handle. The real needle is identical in appearance. Both sham and genuine needle are held in place with a plastic ring and surgical tape so the procedure looks identical. This sham device has been validated in several studies.10–13 After the run-in period, the acupuncturists followed identical protocols for administering real or continued sham acupuncture. They used at least five and a maximum of 10 sham needles in the upper extremities (depending on where the pain was located) and always placed one needle in the foot. If the patient had bilateral symptoms in both arms, the acupuncturists treated both arms even though we analysed outcomes for the more painful arm only.
Participants in the pill group were instructed to take one capsule each evening to minimise daytime drowsiness. The hospital's research pharmacy custom designed identical opaque blue placebo or amitriptyline capsules. The placebo capsule contained cornstarch, and the amitriptyline capsule contained cornstarch plus 25 mg of amitriptyline. If participants in the pill group complained of side effects during the study, the physician could “reduce” the dose by half or more.
Participants were blinded to whether they were receiving sham/placebo or active treatment. Research assistants were also blinded to treatment assignment. Participants received a written description of the assigned treatment regimen that was neutral on whether the treatments were effective. Acupuncturists were trained to maintain “neutral” communications with participants and to avoid providing cues that might reveal whether they were performing real or sham acupuncture. A research assistant routinely monitored acupuncture sessions to ensure adherence to the protocol.
The primary outcome was self reported intensity of pain in the most severely affected arm during the preceding week measured on a 10 point numerical rating scale ranging from no pain (1) to the most severe pain imaginable (10).14 Secondary outcome measures were the Levine symptom severity scale for upper extremity, with 11 items for frequency, severity, and duration of such symptoms as pain, numbness, or weakness each scored on a 5 point Likert scale with higher scores indicating worse symptoms and reported as the mean score on items applicable to the subject15; the Pransky upper extremity function scale, which rates the impact of symptoms on eight types of activities (including sleeping, writing, and lifting) on 10 point scales where the total score can range from 8 to 80 with higher scores indicating more functional impairment16; and grip strength measured by a Jamar hand dynamometer (Sammons Preston, Bolingbrook, IL). Participants reported on side effects of treatment using checklists plus a space for recording “other” effects.
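The scoring rules for the two questionnaire outcomes can be sketched as follows. This is an illustration of the description above, not the published scoring instructions: the function names are hypothetical, and the 1 to 5 coding of the Likert items and the use of None for "not applicable" are assumptions.

```python
def levine_symptom_score(items):
    """Mean of the applicable items among the 11 symptom items.

    items: 11 ratings, each 1-5 (assumed coding) or None if not applicable.
    Higher scores indicate worse symptoms.
    """
    if len(items) != 11:
        raise ValueError("expected 11 items")
    applicable = [x for x in items if x is not None]
    if not applicable:
        raise ValueError("no applicable items")
    if any(not 1 <= x <= 5 for x in applicable):
        raise ValueError("ratings must lie between 1 and 5")
    return sum(applicable) / len(applicable)


def pransky_function_score(items):
    """Total of eight activity ratings, each 1-10, giving a range of 8 to 80.

    Higher scores indicate more functional impairment.
    """
    if len(items) != 8:
        raise ValueError("expected 8 items")
    if any(not 1 <= x <= 10 for x in items):
        raise ValueError("ratings must lie between 1 and 10")
    return sum(items)
```

Averaging only the applicable items keeps the symptom score on the same 1 to 5 range for every participant, whereas the function score is a simple total.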
This trial was prospectively designed to assess and compare placebo effects of a sham device and placebo pill.6 Evaluations of the effects of real acupuncture and amitriptyline were secondary objectives. We calculated that we needed 135 participants in each placebo group to provide 80% power to detect a one point difference between group changes in the pain scores between baseline and the end of the two week run in. We estimated changes in pain scores using Student's t tests in intention to treat analyses. To adjust for drop outs, we used a last value carried forward approach, which implicitly assumes no change from baseline for participants who dropped out.
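Two steps in this paragraph can be sketched in code: the sample size calculation and the last value carried forward rule. The sketch uses the usual normal approximation for comparing two means and assumes a two sided α of 0.05; the standard deviation of the change scores is not reported in this section, so `sd` is a free parameter, and both function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants per group needed to detect a mean difference delta.

    Normal approximation to the two-sample t test; sd is an assumed common
    standard deviation of the change scores.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = z.inv_cdf(power)            # desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def locf_change(baseline, followup):
    """Change from baseline with the last value carried forward.

    A missing follow-up (None) is replaced by the baseline value, so the
    change for a participant who dropped out before follow-up is zero.
    """
    last = baseline if followup is None else followup
    return last - baseline
```

Under this approximation a standard deviation of about 2.9 points gives a figure close to 135 per group for a one point difference, but that value is back calculated for illustration and is not taken from the trial protocol.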
We assessed longitudinal trends in outcomes using baseline and two week data for all patients and mid-treatment and end of treatment data for participants who were randomised to continue on sham or placebo during the remainder of the study. Dependent variables in these regression models were pain scores and the three secondary outcome measures. Independent variables were study week, treatment group, and the interaction between them. We used generalised estimating equations17 18 to account for the correlation within patients of the repeated measures and to use all available data. These models estimated weekly changes in each outcome for each group defined according to intention to treat. We report P values for the interaction term to evaluate the difference in slopes for the two treatment groups. Baseline pain scores were added as an additional independent variable in a sensitivity analysis.
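The regression model described above can be written out explicitly. The notation is an assumption introduced here for illustration: Y_ij is the outcome for participant i at measurement j, and device_i is coded 1 for the sham device group and 0 for the placebo pill group.

```latex
\mathrm{E}[Y_{ij}] = \beta_0 + \beta_1\,\mathrm{week}_{ij} + \beta_2\,\mathrm{device}_i
                   + \beta_3\,(\mathrm{week}_{ij} \times \mathrm{device}_i)
```

With this coding the weekly slope is \beta_1 in the placebo pill group and \beta_1 + \beta_3 in the sham device group, so the P value for the interaction term tests whether \beta_3 = 0. For pain, the slopes reported in the results (−0.15 and −0.33 points per week) would correspond to \beta_1 = −0.15 and \beta_3 ≈ −0.18.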
We enrolled participants from June 2001 to April 2003. Figure 1 shows details. A total of 1110 people completed telephone screening, 817 of whom were not eligible or refused to participate, leaving 293 people who attended the enrolment visit. During this visit, 23 people were found to be ineligible or refused to enrol. Hence, 270 people were randomised into the placebo run-in phase. Two participants in each group were later found to be ineligible and were excluded from the analysis. Eight participants in the device group and 10 in the pill group discontinued treatment and were not evaluated at two weeks. At the time of re-randomisation into phase 2 of the study, 60 continued on the sham device and 59 continued on the placebo pill. Table 1 shows the characteristics of participants at baseline and at the two week re-randomisation. Despite appropriately conducted randomisation, participants in the sham device group had more pain at baseline than those in the placebo pill group (difference on the 10 point scale 0.44, 95% confidence interval 0.05 to 0.83). As a sensitivity analysis, we used linear regression models to adjust for baseline pain scores. Otherwise the groups were well balanced. Results were similar for the second randomisation.
Table 2 shows the mean changes in outcomes at the end of the run-in period. The only significant difference between the sham device and pill groups was on the arm function scale and favoured the placebo pill group. Most of the difference in improvement was due to improved ability to sleep, open jars, and write. (The ability to sleep improved by 0.52 units in the pill group v 0.17 units in the sham device group.) These results went from marginally significant (P = 0.04) to marginally nonsignificant (P = 0.08) when we adjusted the analyses for differences in baseline pain scores.
Table 3 shows the results of longitudinal regression analyses for participants as long as they remained in their respective placebo arms. Pain scores per week decreased significantly more in the sham device group than in the pill group (−0.33 (95% confidence interval −0.40 to −0.26) v −0.15 (−0.21 to −0.09), P < 0.001). Similarly, scores on the symptom severity scale decreased more in the sham device group (−0.07 (−0.09 to −0.05) v −0.05 (−0.06 to −0.03), P = 0.02). Differences were not significant for arm function or grip strength. These findings persisted in significance, direction, and magnitude after we adjusted for baseline pain scores. Figures 2, 3, 4, 5 plot time trends for each outcome measure from baseline until the end of treatment period. At a subsequent one month follow-up visit, pain scores remained significantly lower than at baseline in both groups (−1.58, SD 2.06, P < 0.001, and −1.20, SD 1.64, P < 0.002), but the difference between groups was not significant (−0.38, −1.06 to 0.30, P = 0.27). By descriptive analysis, changes in pain scores at end of the placebo run in and at the end of treatment did not differ among participants in the three main diagnostic subgroups (tendonitis/epicondylitis, neuropathic/neuralgia, or other diagnoses) at any time point.
Nocebo effects of placebo treatments
The types of side effects were totally different in the two study groups and clearly mimicked the information given at informed consent (table 4). At two weeks, a quarter of the participants receiving the sham device reported one or more side effects compared with nearly a third in the pill group (P = 0.30). No reported effect was serious even though three participants withdrew from the placebo pill group because of fatigue or dry mouth.
The sham acupuncture device was the more credible treatment. At two weeks, 75% of participants who answered (93/124) in the device group believed they were receiving active treatment compared with 48% (59/123) in the pill group (P < 0.001). This difference continued to the end of the study. Within each treatment group, participants who believed they were receiving active treatment tended to improve more than those who thought they were getting inactive treatment, but the difference was significant only for the Levine arm symptom scale at the end of the placebo run in (pill group 0.25 v 0.12, P = 0.04; device group 0.17 v 0.02, P = 0.07). At the end of the treatment period, there was no significant difference on any outcome measure between “believers” and “non-believers.”
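The comparison of credibility between the groups can be checked from the reported counts. The test used for this comparison is not stated here, so the pooled two proportion z test below is an assumption chosen for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z test for the difference between two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                      # proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))        # two-sided P value
    return p1, p2, z, p_value


# Counts reported at two weeks: 93/124 (device) v 59/123 (pill)
p_device, p_pill, z, p_value = two_proportion_z_test(93, 124, 59, 123)
```

With these counts the proportions are 75% and 48%, and the resulting P value falls below 0.001, in line with the reported result.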
In this large prospective randomised controlled trial we found no evidence for an enhanced effect with placebo devices compared with placebo pills during the two week placebo run-in period, though an effect did become evident in participants who remained on placebo during the subsequent trials of active treatment. This result applied to the primary pain outcome and to severity of symptoms but not to other outcomes.
This finding of an enhanced placebo effect with a placebo device over time on self reported pain ratings has implications for current debates on the existence of placebo effects over and beyond the evolution of the course of disease, spontaneous remission, and regression to the mean. Recent studies of placebo in analgesia, in which intervention and assessment take place within minutes or hours, have shown genuine short term placebo effects beyond no treatment when the placebo dose is accompanied with deceptive expectations—for example, when participants are told that placebo is a “potent pain medication.”19–22 The results of our study provide evidence that a placebo effect exists over time, even when instructions are neutral. If the evolution of the disease alone accounted for the observed decreases in arm pain and severity of symptoms, the type of placebo should have made no difference, and we should not have been able to show a significant difference between device and pill placebos. That the differential placebo effect was confined to self reported measures (and not to grip strength) suggests an effect that may be confined to subjective outcomes. In this trial, the magnitude of this effect was small.
The placebo pill had a greater effect during the first two weeks on the function outcome. This may be due primarily to improvement in sleep, which in turn may have been due to the fact that sleepiness was explicitly emphasised in the informed consent as a possible side effect of the amitriptyline. This finding disappeared by the end of the treatment phase.
Our findings contribute to the debate on the influence of information provided at informed consent and subsequent reported adverse effects. Results of previous smaller prospective randomised controlled trials of different types of information in informed consent have been contradictory, with some reports finding a positive correlation23–25 and some finding none.26–28 We found that reported side effects entirely mirrored the information provided to participants.
What is already known on this topic
Placebo devices are thought to have enhanced placebo effects compared with oral pills but rigorous evidence is lacking
Controversy exists over the existence of placebo effects over and beyond the natural evolution of a disease and whether information provided by informed consent influences reports of adverse events
What this study adds
A validated sham acupuncture device has a greater placebo effect on subjective outcomes than oral placebo pills
A placebo analgesia effect beyond the natural evolution of disease is detectable over time
Adverse events and nocebo effects are linked to the information provided to patients
Limitations of our study need to be acknowledged. Firstly, we did not have a group of participants who remained on a waiting list and had no treatment, which would have helped to clarify the role of the natural evolution of the disease. Without this, the differential placebo effects we found should be interpreted cautiously. None the less, our comparison of two different placebos has the advantage of being less susceptible to bias than an unblinded waiting list control group. A second limitation was the relatively short placebo run-in period we used. We originally planned a four week run-in period.6 Concern for burdening participants led us to shorten this to two weeks before the first patient was randomised. A longer placebo run in might well have permitted more definitive conclusions. Thirdly, we chose a longer treatment period during the second phase for the amitriptyline arm than the acupuncture arm (six v four weeks). The rationale was based on the time required for amitriptyline to reach steady state blood concentrations. While this was reasonable for evaluating real treatment, the net effect was to create complexity for the analysis of placebo effects and to require use of reductions per week in levels of outcome measures rather than outcomes measured at the end of the treatment period. Finally, we chose to compare the placebos as unified entities and not to examine how components of the interventions such as daily treatment versus twice weekly treatment, attention from the practitioner, or investment of the patient's time may have influenced results.
We thank Peter Goldman, Fred Mosteller, and John Hedley-Whyte for scientific mentorship; Melbeth Marlang, Lin Nulman, Patricia Muldoon, and Pat Wilkinson for research assistance; Joe Kay, Claire McManis, Bella Rosner, Ellen Highfield, Jonathan Ammen, Nancy Jenkins, Marks Mills, Karen Kirchoff, and Dinah Shatz for performing the sham acupuncture treatments; Sidney Klawansky, Dan Moerman, Robert Scholten, Frank Miller, Julie Buring, Marnie Naeser, and the university seminar on effective and affordable health care at Harvard University for input on study design and analysis; and Jacqueline Savetsky, Andrea Hrbek, Robb Scholten, and Sally Andrews for administrative coordination.
Contributors TJK is guarantor and led the conception, design, and analysis of the study. WBS, RHG, RBD, RNS, CEK, DAS, and IK contributed to the conception and design. RHG, WBS, RNS, and CEK were especially important for developing treatments and assessments. RBD and ARTL performed statistical analysis with input from WBS, RHG, BHN, and IK.
Funding This study was made possible by Grant No 1RO1 AT00402-01 from the National Center for Complementary and Alternative Medicine (NCCAM) at the NIH.
Competing interests TJK works as a consultant for Kan Herbal Co, Scotts Valley, CA.
Ethical approval The institutional review boards of Cambridge Health Alliance, Beth Israel Deaconess Medical Center, Harvard Medical School, and Harvard School of Public Health approved the study.