Rapid response to:

Primary Care

Acupuncture for chronic headache in primary care: large, pragmatic, randomised trial

BMJ 2004; 328 doi: (Published 25 March 2004) Cite this as: BMJ 2004;328:744

Rapid Response:

Response from the authors

Our trial has clearly raised a great deal of comment. We will try to
address the issues raised by each respondent. A number of correspondents
seem to have misunderstood the question we were seeking to answer in this
trial. This was a pragmatic rather than explanatory trial. The research
question was ‘Does acupuncture make a worthwhile difference to primary
care patients with headache?’, not ‘Is acupuncture better than a placebo?’
The intervention was one which could easily be applied in the NHS, and
used a group of acupuncturists with common professional background and
training, while allowing them judgment in precisely what intervention they
applied. We believe that we have shown that this intervention is
effective and that the conclusions are directly generalisable to NHS
primary care.

Dr Klimek complains that we claimed lower resource use (e.g. GP
visits) in the acupuncture group despite lack of statistical significance.
Whilst this point is well taken, our statement is correct as written.
Moreover, we think that statistical significance has a questionable role
as regards resource use. Our analytic strategy was to A) determine whether
acupuncture has clinical effects such as reduced headache; B) if so,
estimate the resource implications of acupuncture. Statistical
significance is an important part of A, but not B, because your decision
is a clinical one. Now you might find out that the resource implications
of a clinical decision might make you change that decision, for example,
if a treatment worked, but was far too expensive. But you don't start by
looking at cost, otherwise you would use ineffective drugs just because
they were cheap.

Klimek also argues that no patients dropped out of the control group
because treatment was ineffective. This is simply because these patients
did not receive on-study treatment and we did not offer them this as an
option when asking about reasons for withdrawal. It is not true that "this
bias was not addressed" as we undertook careful sensitivity analysis as
described in the paper. Note that 96% of patients provided
post-randomization data. We also recommend that Klimek consult table 2 to
answer questions about concurrent medication.

We agree with Professor Ernst that pragmatic trials do not stand
alone and require other types of data. Were our study to be the only trial
ever conducted on acupuncture for headache we agree that it would not form
a basis for policy decisions. However, our trial can be put in the context
of a Cochrane review. This review included a total of 11 trials comparing
acupuncture to placebo acupuncture in patients with migraine. Two found no
effects over sham acupuncture, three showed trends in favor of acupuncture
and five trials reported that patients in the acupuncture group did
significantly better than those in the sham acupuncture group. The final
trial reported a positive trend but was judged to be uninterpretable due
to the high drop-out rate.

We would like to assure Dr Ramey that comparisons of clinically
reasonable alternatives, such as standard care versus standard care plus
a new therapy, are routine in the medical literature. Ramey claims that
acupuncture is not "cost effective" on the grounds that it "doubled the
cost of care". This is despite our careful analysis of whether acupuncture
was worth the additional costs using standard techniques and varying the
threshold of willingness to pay. Ramey argues that acupuncture might be
less effective than other interventions, and gives banjo lessons as an
example. We completely agree. This is, of course, an argument against any
trial. For example, one could look at the recent topiramate trial and
claim "okay, topiramate was better than placebo. But was it better than
banjo playing?"
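As a rough illustration of what varying the willingness-to-pay threshold
involves, the sketch below computes incremental net monetary benefit at
several thresholds. The cost and effect figures are invented for the
example and are not the trial's results; the function and variable names
are likewise hypothetical.

```python
def incremental_net_benefit(delta_cost, delta_effect, threshold):
    """Net monetary benefit of the new policy versus the comparator.

    delta_cost   : extra cost per patient (hypothetical units)
    delta_effect : extra health gain per patient (hypothetical units)
    threshold    : willingness to pay for one unit of health gain
    """
    return threshold * delta_effect - delta_cost


# Invented inputs, for illustration only.
DELTA_COST = 200.0
DELTA_EFFECT = 0.02

# Break-even threshold: the extra cost per unit of health gain.
icer = DELTA_COST / DELTA_EFFECT

for threshold in (5_000, 10_000, 20_000, 30_000):
    inb = incremental_net_benefit(DELTA_COST, DELTA_EFFECT, threshold)
    verdict = "worth the extra cost" if inb > 0 else "not worth the extra cost"
    print(f"threshold {threshold:>6}: net benefit {inb:8.2f} ({verdict})")
```

The point of the sketch is that a higher cost alone does not settle the
question: whether the extra cost is worthwhile depends on the health gain
and on the threshold applied.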

We repudiate Ramey's suggestion that we engaged in statistical
sleight-of-hand. Ramey misreads figure 2: he questions the results of the
control group "at 28 weeks", when the X axis is "headache days at
baseline", thus turning his own mistake into a deliberate attempt to
mislead on the part of the authors. Ramey engages in further ad hominem
argument when claiming that we repeatedly state: "the NHS should consider
further funding of acupuncture" (which we never do) because our goal was
to obtain funding for a therapy. This reduces scientific debate to
accusations about motives, a datum, of course, to which we have no access.

Numerous respondents, including De Prato, Mansfield, Patterson,
Brookan and Brinkman ask, in short, could acupuncture have been a placebo?
Morris puts it well when he points out that "the acupuncture intervention
actually included referral and transportation to a specialist, the
touching of skin with needles, and the penetration of the acupuncture
needles". We agree that, taken alone, our results do not address the
extent to which the effects of acupuncture are due to "penetration of the
acupuncture needles". This is because our trial did not set out to answer
this question. Nonetheless, our trial, like all clinical trials, needs to
be put in the context of other research, such as the Cochrane review cited
above. We do accept the criticism that our choice of words was not always
optimal. We randomised patients to policies of either "use acupuncture" or
"avoid acupuncture" and our conclusions might better have been stated in
terms of "a policy of use acupuncture leads to …." rather than
"acupuncture leads to …". The clinical implications of our study are
unaffected by this semantic point.

Morris is simply incorrect to state that we required powerful
mathematical manipulation to demonstrate differences between groups. Our
main analysis, ANCOVA, is not only absolutely standard, but has been
widely recommended by statistical groups such as the ICH and FDA.
Moreover, as shown in our sensitivity analysis, an unadjusted t test finds
highly significant differences between groups. Morris (and also Rienks)
is also incorrect to state that we conducted an effectiveness trial before
evidence from an efficacy trial indicated the intervention to be
efficacious. We refer to the results of the Cochrane review, discussed
above. Morris repeatedly describes the effect of the intervention as
"small". We will leave it to patients to decide whether an additional
three weeks per year free of headache is a small improvement.
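To make the distinction concrete, here is a minimal sketch of how an
ANCOVA-style baseline adjustment differs from an unadjusted comparison of
group means. It relies on the textbook result that, for the model
outcome = a + b * baseline + c * group, the adjusted group difference
equals the raw difference in means minus the pooled within-group slope
times the difference in baseline means. The data are invented and
noise-free, and the function names are hypothetical; this is not the
trial's analysis code.

```python
from statistics import fmean


def unadjusted_difference(y_treated, y_control):
    """Raw difference in mean outcome (treated minus control)."""
    return fmean(y_treated) - fmean(y_control)


def ancova_difference(x_treated, y_treated, x_control, y_control):
    """Baseline-adjusted difference in outcome (treated minus control)."""
    sxy = 0.0
    sxx = 0.0
    for xs, ys in ((x_treated, y_treated), (x_control, y_control)):
        x_bar, y_bar = fmean(xs), fmean(ys)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        sxx += sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx  # pooled within-group slope of outcome on baseline
    raw = unadjusted_difference(y_treated, y_control)
    return raw - slope * (fmean(x_treated) - fmean(x_control))


# Invented data: outcome = 2 + 0.5 * baseline - 3 * treated, with the
# treated group happening to start from higher baseline scores.
x_control = [10, 12, 14, 16]
x_treated = [14, 16, 18, 20]
y_control = [2 + 0.5 * x for x in x_control]
y_treated = [2 + 0.5 * x - 3 for x in x_treated]

raw_diff = unadjusted_difference(y_treated, y_control)
adjusted_diff = ancova_difference(x_treated, y_treated, x_control, y_control)
print(raw_diff, adjusted_diff)
```

In a randomised trial the baseline means are balanced on average, so the
two estimates usually agree closely and the adjustment mainly improves
precision.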

Ronellenfitsch makes a reasonable point: only patients interested in
acupuncture would take part in the trial, they might be different from
other migraine patients and hence the results may not be applicable to
"every migraine patient". However, we see no reason why we might want to
apply the results of an acupuncture trial to a patient who would not want
to undergo acupuncture.

Ng and Parkinson complain that different treatments were given by
different practitioners. The key point is made by Ng in the assertion
that: "Prescriptions of acupuncture by different methods give
significantly different outcomes". If this is true, then Parkinson would
be correct in asserting that our trial was the equivalent of assessing
"any drug for chronic headache". But Ng does not provide any data or give
an argument to support the claim that the effects of acupuncture are
highly dependent on exactly which one of several reasonable point
prescriptions is used: it is simply an unsubstantiated assertion.
Conversely, we analyzed our data to determine whether there was
heterogeneity between practitioners, that is, whether different
practitioners obtained different results. Our I squared statistic, which
measures how much the observed variation is due to differences in
practitioners and how much due to chance, was zero. This suggests to us
that differences in prescriptions, as well as other aspects of acupuncture
practice, given by well-trained practitioners do not have a large impact
on outcome. We deliberately selected acupuncturists from one “school” of
acupuncture, AACP training. This means we are assessing one particular
group of acupuncturists which is statutorily regulated and which therefore
would meet clinical governance criteria for referral by other NHS
purchasers. If you take the attitude that only a standardised protocol can
be researched, this would be the least generalisable type of study as not
even individual practitioners work in a completely standardised way.
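For readers unfamiliar with the statistic, the following sketch computes
I squared in its usual form, I2 = max(0, (Q - df) / Q), where Q is
Cochran's heterogeneity statistic calculated from per-practitioner effect
estimates with inverse-variance weights. The effect sizes and standard
errors are invented, and it is an assumption that this standard formula
matches the computation used in the trial.

```python
def i_squared(effects, std_errors):
    """Return (Q, I2) for a set of per-group effect estimates."""
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two effects with matching errors")
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # Variation beyond what chance alone would produce, as a share of Q.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


# Invented per-practitioner effects (change in headache score) and
# standard errors: the spread is no larger than chance would produce.
similar_q, similar_i2 = i_squared([-4.0, -5.0, -4.5, -5.5],
                                  [2.0, 2.5, 2.0, 3.0])

# Two precisely estimated but very different effects, for contrast.
mixed_q, mixed_i2 = i_squared([0.0, 10.0], [1.0, 1.0])
print(similar_i2, mixed_i2)
```

A value of zero means that Q did not exceed its degrees of freedom, that
is, the estimates differed no more than sampling error would lead one to
expect.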

Ng also claims that lack of a standard acupuncture protocol renders
our study "irreproducible" and therefore "invalid scientifically". This is
untrue. We have a sample of practitioners in the trial. Even if the
effectiveness of different acupuncturists varies, it remains true that a
patient visiting an acupuncturist in the is likely to have results close
to the mean that we reported in the trial. This is basic statistics:
height varies, but the mean height of a sample is likely to be close to
the population mean. Two experiments assessing the effects of X on Y are
expected to have similar results whether X is constant or a sample of X is
made where X varies. Finally on this point, as we remark above, we saw no
evidence that the effects of treatment do vary between practitioners.
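The sampling argument can be illustrated with a small simulation. In the
sketch below each practitioner has a different true effect, two "trials"
each draw their own random sample of practitioners, and both trial means
are compared with the population mean. All numbers (effect sizes, spread,
sample sizes, the seed) are invented for illustration.

```python
import random
from statistics import fmean


def simulate_trial(population_effects, n_practitioners, patients_each, rng):
    """Mean patient outcome when practitioners are sampled at random."""
    sampled = rng.sample(population_effects, n_practitioners)
    outcomes = [
        rng.gauss(effect, 5.0)  # patient-level noise (assumed SD of 5)
        for effect in sampled
        for _ in range(patients_each)
    ]
    return fmean(outcomes)


rng = random.Random(2004)

# Hypothetical population: practitioners vary in their true effect.
population_effects = [rng.gauss(-4.0, 1.0) for _ in range(1000)]
population_mean = fmean(population_effects)

# Two independent trials, each with its own sample of practitioners.
trial_a = simulate_trial(population_effects, 12, 15, rng)
trial_b = simulate_trial(population_effects, 12, 15, rng)
print(population_mean, trial_a, trial_b)
```

Although the two trials use different practitioners, both means should
land near the population mean, which is the sense in which a trial of
this kind is reproducible.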

Ng has misunderstood my previous paper. I did not claim that the high
proportion of positive results of acupuncture trials conducted in China
was related to methodologic rigor; I merely noted the phenomenon and gave
methodologic rigor, along with several other factors, as a possible
explanation. Ng claims that our "positive results [can be] attributed to
lack of methodologic rigor" but does not state clearly what it is about our
study methodology that makes it likely that our positive results were a
false positive.

Park argues that the difference between groups may have been due to
the disappointment felt by patients allocated to the control group. The
question is ultimately the same as the question whether the acupuncture
was a placebo (see above). Though this is not an unreasonable suggestion,
it is essentially a speculation, against which stand several lines of
evidence. First, headache scores improved in the control group. Second,
the difference between groups was larger than empirical estimates of the
type of bias Park describes. Third, our results were very similar to the
prior placebo-controlled trial of Vincent. Moreover, it might be argued
that being told there is nothing new that can be done (disappointment)
compared to a possible novel intervention (placebo) mirrors closely the
real world of clinical decision making.

Van den Burg suggests that "If a pharmaceutical medicine had been
studied in this way there would be an outburst of indignation in relation
to the conclusions." We can assure Dr Van den Burg that pharmaceuticals
are compared with no treatment controls routinely. Indeed, I work in
oncology and a typical trial compares chemotherapy for metastatic cancer
with "best supportive care" or adjuvant therapy with surgery alone. We
would also like to restate that our results do not stand alone, and must
be put in the context of other placebo-controlled trials of acupuncture.

Schoonman et al. make a large number of remarks. They first claim
that "the observed differences versus control were clinically irrelevant".
As we have already remarked, we will leave it to patients to tell us
whether 22 fewer days with headache pain each year are worth having. They
go on to state that "the control group was doing much worse than to be
expected rather than the acupuncture group doing better". This is very
hard to square with the actual data we presented: patients in the control
group experienced a considerable improvement in headache, with approximately 30%
reporting clinically relevant improvement during the trial.

The respondents raise doubts over our primary efficacy score as it is
"not recommended by accepted [International Headache Society] guidelines".
The differences between our 6 point measure (adopted to ensure
comparability with the placebo-controlled trial) and the International
Headache Society's four point measure are hardly drastic. In addition, an
unequivocally accepted endpoint is days with headache. The correlation in
our data set between our primary endpoint and days with headache is close
to 0.8 and the results of the two endpoints are highly comparable. So if
Schoonman et al. are suspicious of our primary endpoint, we invite them to
ignore it and use days with headache instead: their conclusions should be
highly similar. For example, they reinterpret our difference between
groups in terms of the maximum on the scale, rather than the mean reported
in the trial. This is equivalent to claiming that a 35% reduction in the
cost of a $20,000 car is "irrelevant" on the grounds that $7,000 is
only 2 or 3% of the cost of the most expensive cars. With respect to the
average headache severity of 1.7, we can assure Schoonman et al. that
patients in the study did report high scores, 3 - 5, during migraine.
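The arithmetic behind the car analogy can be checked directly. The sketch
below uses only the figures given in the text ($20,000, 35%, and 2 or 3%);
the helper names are hypothetical.

```python
def share_of(amount, reference):
    """Express amount as a percentage of reference."""
    return 100 * amount / reference


def price_for_share(amount, percent):
    """Reference price at which amount would be the given percentage."""
    return amount * 100 / percent


car_price = 20_000
reduction_pct = 35
saving = car_price * reduction_pct / 100  # 7000.0

# Measured against the car actually bought, the saving is 35%.
share_of_own_price = share_of(saving, car_price)

# The same saving looks like only 3% or 2% when it is measured against a
# much more expensive reference car.
price_at_3_pct = price_for_share(saving, 3)
price_at_2_pct = price_for_share(saving, 2)
print(saving, share_of_own_price, price_at_3_pct, price_at_2_pct)
```

The saving is the same in each case; only the denominator changes, which
is why rescaling a difference against the maximum of a scale rather than
the observed mean makes it look small.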

Schoonman et al. make a number of reading errors. They claim that "out
of 9 SF-36 health status scales, only one showed a statistically
significantly improvement". In fact, there were statistically significant
differences for three scales at the primary endpoint of one year, with a
further four showing statistical trends. They state that they were unsure
as to whether patients completed diaries throughout the year or just
for one week at 12 months. As explicitly stated under "outcome assessment",
patients completed the diary for four weeks at the one year follow-up. The
respondents also suggest that we exclude patients without a migraine
diagnosis from the analysis. Such a subgroup analysis is presented in the
results section.

Andrew Vickers and the study team.

Competing interests:
Andrew Vickers is the first author of the paper

Competing interests: No competing interests

20 May 2004
Andrew J Vickers
Assistant Attending Research Methodologist
Rebecca W Rees, Catherine E Zollman, Rob McCarney, Claire M Smith, Nadia Ellis, Peter Fisher, Robbert Van Haselen, David Wonderling, Richard Grieve
Memorial Sloan-Kettering Cancer Center, NY, NY 10021