Alteplase for stroke: money and optimistic claims buttress the “brain attack” campaign
Commentary: Who pays the guideline writers?
Commentary: Thrombolysis in stroke: it works!
BMJ 2002; 324 doi: https://doi.org/10.1136/bmj.324.7339.723 (Published 23 March 2002) Cite this as: BMJ 2002;324:723
All rapid responses
I have written a number of rapid response letters to bmj.com
questioning the validity of the NINDS trialists' interpretation of the
results of the 91-180 minutes arm of the NINDS study -- on the basis of an
imbalance in baseline stroke severity between the treated and placebo
groups. I could never precisely quantify the "degree" to which this
imbalance problem affected the "correct" interpretation of the NINDS
study's results, because I did not have access to the study's patient-
level raw data.
I recently received the NINDS study's patient-level raw data. The raw
data enabled me to more thoroughly test my theory that imbalances in
baseline stroke severity between the treatment and placebo groups of the
91-180 minute arm of the NINDS study invalidated the "official"
interpretation of the NINDS study's results. After studying the raw data,
I became increasingly convinced of that fact. I therefore decided to write
a new manuscript that deeply analyses the NINDS study's results.
The manuscript is called "A personal analysis of the NINDS study
using patient-level data".
It is available at
http://www.homestead.com/emguidemaps/files/NINDSpersonalanalysis.html
If you cannot access that website file, go to http://www.homestead.com/emguidemaps/JeffMannEMguidemaps.html, and click on the manuscript's title in the soapbox section.
My general conclusion is that the imbalance in baseline stroke
severity between the treated and placebo groups in the 91-180 minutes arm
of the NINDS study accounted for approximately 50% of the estimated
"apparent" efficacy of tPA therapy for stroke patients treated after 90
minutes. What do you think of my "theoretical" estimation?
Jeff Mann. MD.
Competing interests:
None declared
It is interesting to discover how the NINDS investigators deal with
my criticism of the NINDS trial's result-analysis. I have repeatedly
questioned the validity of the results of the 91-180 minute arm of the
NINDS trial, because of the imbalance in baseline stroke severity between
the tPA-treated patients and placebo patients in that arm of the trial.
In a talk called "Accomplishments in Stroke Care" presented at the
NINDS stroke symposium in December 2002 (freely
available online at reference number 1), Patrick Lyden presents a
slide (slide number 17) that demonstrates that the OR is still 1.7 if one
leaves out patients with a baseline NIHSS score of <5 or >20.
However, look at the wide confidence intervals which cross an OR figure of
1.0 (unity). That "fact" demonstrates that the NINDS trial was too small
in size to allow a person to be confident in the validity of the results
for those stroke patients treated between 91-180 minutes.
In another slide (slide number 7) Patrick Lyden suggests that one
doesn't need a larger study sample size than 600 patients for tPA-for-
stroke trials because the ARR is 12% (compared to 2% for cardiac tPA
trials). I think that his argument is without merit, because the 12% ARR
figure is only based on one controversial study -- the NINDS trial. That
would be like arguing that one only needs a sample size of a few hundred
patients for clinical trials of GP IIb/IIIa inhibitors in ACS patients, because
the EPISTENT trial showed a significant mortality reduction at one year.
However, the EPISTENT study was an "outlier", and all the other trials of
GP IIb/IIIa inhibitors in ACS patients showed no mortality reduction (Read
pages 69-73 in reference number 2 to see how all the experts agreed that
there was no mortality reduction in all those trials).
Another issue that Patrick Lyden doesn't deal with is the "strange"
results in the NIHSS 11-15 subgroup (91-180 minutes arm). If you read my
rapid response letter to the bmj [3], you will note that the placebo group
only had a 14% excellent stroke outcome rate. That 14% figure is very low,
and it makes the overall OR result of 1.7 for the 91-180 minute arm
(excluding stroke patients with a NIHSS score <5 or >20) appear
much better than it probably really is -- compared to a more realistic
situation where the placebo group is more representative of community
patients with a stroke severity in the NIHSS 11-15 range (who can be
expected to have an excellent stroke outcome rate of ~30%). That is why I
would like to get the NINDS study's patient-level data to examine that
particular subgroup's results.
I have previously suggested that the NINDS investigators should make
all the patient-level raw data from the NINDS
study publicly available. However, they have apparently failed to do so,
despite repeated requests from multiple people over an extended period of
time. How can they ethically justify not making their patient-level raw
data publicly available?
References:
1. "Accomplishments in Stroke Care" talk by Patrick Lyden. Presented
at the December 2002 NINDS Stroke Symposium. The
narrated talk and PowerPoint slideshow are freely available online at the
following website: http://www.ferne.org.
2. FDA Advisory meeting transcripts. Available from the FDA's website
=> Go to the list of Cardiovascular and Renal
Drugs Advisory Committee meetings for 1999 => Go to October 14th 1999
=> Choose Microsoft Word document number
3555t1.rtf.
Also available at the following webpage:
http://www.fda.gov/ohrms/dockets/ac/cder99t.htm
3. Rapid response letter -- Mann J. The difference between the
"apparent" and "true" efficacy of tPA in the NINDS trial.
Available online at http://bmj.com/cgi/eletters/324/7339/723#23927.
Competing interests:
None declared
I have read the above discussion with great interest. While I cannot
pretend to sort out the statistical and procedural arguments I think one
thing is clear: further discussion alone is not going to resolve the
various questions and doubts presented. It would appear to me that there
is at least a reasonable possibility that we are doing our patients more
harm than good using thrombolysis for stroke. It is clear that some
patients are harmed by thrombolysis, a statement that can be made about
many therapeutic endeavors. However, as opposed to most therapies the
number needed to harm here is quite low, not even one order of magnitude
greater than the number needed to treat.
For me this brings up a serious
ethical dilemma. Were it the case that those patients with the most
severe strokes and terrible outcomes were the only ones who might suffer
intracerebral hemorrhage from thrombolysis one might justify the low treat
to harm ratio as their outcome would have been terrible in any situation.
But are we not taking patients, some of whom would otherwise have had a
reasonable outcome, and converting them to patients with terrible or
lethal outcomes? Do we really have the right to do so? This is usually
not an ethical dilemma since either the risk of doing harm is so small
compared to the benefit that it is not an ethical question or the disease
process is so lethal that it is worth the risk (e.g. bone marrow
transplant for leukemia.) But in this case I think we need to be able to
fall back on a whole lot more than a P value. We each need to know that
if that patient who right now is a little dysarthric with some
lateralizing weakness is suddenly aphasic, hemianopic, hemiparetic and
bedridden, or possibly dead due to the therapy we advised, that we know
that it was still the right therapy to have advised. I don't think many
of us feel that secure about the data. Let's repeat the studies and
examine the raw data from the NINDS study as well.
Mark Miller MD
Diplomate, ABIM, ABEM
Seattle
Competing interests:
None declared
Many educational events are sponsored by pharmaceutical companies;
postgraduate tutors in general practice have an uneasy relationship
with them because of the absence of any funding for their meetings. There seems to
be an increasing tendency for Primary Care Organisations to use drug
company sponsorship for "educational meetings", some of which will be more
about policy changes than "pure education", e.g. a county diabetic group
changing its policy on the use of newer oral hypoglycaemics, discussing
this over dinner at a hotel, all paid for by both manufacturers of the
said drugs. The need for HAs and PCOs to be aware of this and perhaps
change their policy on industry sponsorship should be tackled in the same
way as for larger national organisations.
Competing interests: No competing interests
As a result of recent e-mail discussions with a number of tPA-
proponents, I have come to realize that they cannot
appreciate the validity of my criticism of the NINDS trialists'
interpretation of the NINDS trial, and I have thought of another method of
making my critical points more vividly real and also easier to understand.
If one reviews Dr. Grotta's presentation of data in table 1 in his
rapid response letter to the bmj [1], one will note that the efficacy of
tPA for patients treated between 91-180 minutes in the NINDS trial was
calculated by dividing the total number of patients having a favorable
response by the total number of patients in that arm of the trial. The
figure for the treated patients was 70/153 (46%) and the figure for the
placebo patients was 42/167 (25%) so that the calculated efficacy of tPA
was 21%. I have argued that the calculated efficacy figure is flawed
because there were 22 more tPA patients than placebo patients in the NIHSS
1-5 subgroup and 18 more placebo patients than tPA patients in the
NIHSS >20 subgroup. Many people apparently have difficulty understanding
how that imbalance could affect the validity of the NINDS trial's efficacy
figure of 21%, and the following theoretical demonstration may make it
much easier to understand how that imbalance affected the calculation of
tPA's efficacy.
Was it fair to have 18 more placebo patients than tPA patients in the
most severe stroke severity subgroup (NIHSS >20), and 22 more treated
patients than placebo patients in the mildest stroke severity group (NIHSS
1-5) in the NINDS trial? The answer may become much clearer if one simply
corrected the situation and ensured that there were equal numbers of tPA
and placebo patients in each of the NIHSS 1-5 and NIHSS >20 subgroups, while leaving all
the other patient numbers and response rates the same. Then a
"hypothetically fairer" trial would have the following results.
The results from Dr. Grotta's table 1 would look like this:
Baseline NIHSS subgroup -- tPA patients with favorable outcome -- placebo patients with favorable outcome
1-5 subgroup -- 24/29 (83%) -- 25/29 (86%)
6-10 subgroup -- 23/37 (62%) -- 23/46 (50%)
11-15 subgroup -- 10/26 (38%) -- 5/35 (14%)
16-20 subgroup -- 9/33 (27%) -- 6/33 (18%)
>20 subgroup -- 4/28 (14%) -- 1/28 (4%)
ALL patients -- 70/153 (46%) -- 60/171 (35%)
Calculated efficacy of tPA = 11%.
The calculated efficacy of tPA would be 11% (and not 21% as was
originally calculated in the NINDS trial). In other
words, the "apparent" therapeutic benefit of tPA would only be 50% of the
figure calculated by the NINDS trialists. Also, note that the placebo
patients would have an overall rate of favorable stroke outcome of 35%,
and that figure is much closer to the results obtained by the placebo
patients in the other tPA-for-stroke RCTs, than the NINDS trialists'
placebo figure of 25%.
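The arithmetic behind the 11% and 21% figures can be checked with a short sketch. The Python below is illustrative only: the subgroup counts are those listed in the hypothetically balanced table above, the 42/167 placebo figure is the one quoted from Dr. Grotta's table 1, and the names used (pooled_rate, placebo_balanced) are hypothetical.

```python
# Favorable outcomes and patient counts per baseline NIHSS subgroup
# (91-180 minute arm), as listed in the hypothetically balanced table.
tpa = {"1-5": (24, 29), "6-10": (23, 37), "11-15": (10, 26),
       "16-20": (9, 33), ">20": (4, 28)}
placebo_balanced = {"1-5": (25, 29), "6-10": (23, 46), "11-15": (5, 35),
                    "16-20": (6, 33), ">20": (1, 28)}


def pooled_rate(groups):
    """Pooled favorable-outcome rate: total favorable outcomes over total patients."""
    favorable = sum(good for good, _ in groups.values())
    patients = sum(n for _, n in groups.values())
    return favorable / patients


tpa_rate = pooled_rate(tpa)                          # 70/153
balanced_placebo_rate = pooled_rate(placebo_balanced)  # 60/171
reported_placebo_rate = 42 / 167                     # figure quoted from table 1

balanced_diff = tpa_rate - balanced_placebo_rate
reported_diff = tpa_rate - reported_placebo_rate
print(f"balanced: {balanced_diff:.1%}  reported: {reported_diff:.1%}")
print(f"ratio balanced/reported: {balanced_diff / reported_diff:.2f}")
```

Because the letter works from rounded percentages, the unrounded differences come out slightly below 11 and 21 percentage points, but the ratio between them stays close to one half.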
Is there a tangential way to prove that the "apparent" efficacy
figure of 11% is more likely to be accurate than the "apparent" efficacy
figure of 21%?
Consider the calculated efficacy results of tPA for each subgroup
from table 1 in Dr. Grotta's letter.
NIHSS 1-5 subgroup = minus 3% (83% - 86%)
NIHSS 6-10 subgroup = 12% (62%-50%)
NIHSS 11-15 subgroup = 24% (38%-14%)
NIHSS 16-20 subgroup = 9% (27%-18%)
NIHSS >20 subgroup = 10% (14%-4%)
Note that if one removed the NIHSS 1-5 subgroup from consideration
(because patients with very mild strokes regularly have an excellent
recovery rate due to the natural course of the disease), then there would
be 4 subgroups remaining with the following efficacy results.
NIHSS 6-10 subgroup = 12%
NIHSS 11-15 subgroup = 24%
NIHSS 16-20 subgroup = 9%
NIHSS >20 subgroup = 10%
Note that the NIHSS 11-15 subgroup is a statistical outlier, and that
if one removed its results from consideration, then the "average" efficacy
result would be between 10-11%. How does one explain the wayward efficacy
result of 24% for the
NIHSS 11-15 subgroup? Salim Yusuf in his article on the analysis and
interpretation of subgroup results [2] states "Even when treatments have
similar effects in different subgroups, the play of chance may well
exaggerate, dilute, or occasionally reverse the results in a particular
subgroup." In other words, the wayward results of that particular subgroup
could be due to chance. Another method of gauging that chance probably
affected the results of the NIHSS 11-15 subgroup is to examine the
efficacy results of the subgroups immediately above and below that
particular subgroup's results. Physiological expectations, based on the
expectation that tPA should have similar therapeutic effects in all
subgroups in the middle of the stroke severity range, suggest that the
NIHSS 11-15 subgroup should have efficacy results between 9-12%. In fact,
the NIHSS 11-15 subgroup's results would be between 9-12% if the placebo
patients' rate of favorable stroke outcome result was in the range of 26-
29% (rather than 14%). I have previously argued [3] that the TOAST graph
predicted that the NIHSS 11-15 placebo group's rate of favorable stroke
results should be around 30%. Therefore, if one corrected for that fact,
then the 91-180 minute subgroup results would appear as follows:-
NIHSS 6-10 subgroup = 12%
NIHSS 11-15 subgroup = 9-12%
NIHSS 16-20 subgroup = 9%
NIHSS >20 subgroup = 10%
Then the calculated "average" efficacy of tPA would be 10-11%, which
is identical to the figure obtained by the standard methodology (as used
by the NINDS trialists) -- after correcting for stroke severity
imbalances.
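The "averaging" step described above can be sketched as follows. This is only an illustration, assuming a simple unweighted mean of the subgroup differences quoted in the letter (it ignores subgroup size); the variable names are hypothetical.

```python
from statistics import mean

# Absolute differences (tPA minus placebo, in percentage points) for the
# subgroups that remain after setting aside NIHSS 1-5, as quoted above.
diff = {"6-10": 12, "11-15": 24, "16-20": 9, ">20": 10}

all_four = mean(diff.values())
without_outlier = mean(v for k, v in diff.items() if k != "11-15")

# Placebo rate the NIHSS 11-15 subgroup would need for its difference to
# fall in the 9-12 point range, given the observed tPA rate of 38%.
tpa_11_15 = 38
needed_placebo = [tpa_11_15 - d for d in (12, 9)]

print(f"mean of all four subgroups: {all_four:.1f}")
print(f"mean without NIHSS 11-15:   {without_outlier:.1f}")
print(f"placebo rate needed in NIHSS 11-15: {needed_placebo[0]}-{needed_placebo[1]}%")
```

A mean weighted by subgroup size would be an equally defensible choice and would give a somewhat different figure; the unweighted version is used here only because it matches the reasoning in the letter.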
Is this type of statistical correction for wayward subgroup results
acceptable? Yusuf in his article on the analysis of
subgroup results [2] stated that one should "interpret the results in the
context of similar data from other trials, from the architecture of the
entire set of data on all patients, and from principles of biological
coherence". Therefore, I think that it is entirely appropriate to make
that type of correction, which is both physiologically coherent and
consistent with the data from the other NIHSS subgroups and other stroke
trials. Some people would still argue that post hoc subgroup analyses are
fraught with potential inaccuracies because of small sample sizes. That
argument is entirely valid. However, isn't it interesting that the
"average" efficacy figure of 10-11%, which was determined by the
"averaging" methodology of post hoc
subgroup analysis, is identical to the efficacy figure obtained using the
standard methodology -- after making an appropriate correction for stroke
severity imbalances? As an aside -- the figures are not really identical,
because one would also need to make the same statistical correction for
the NIHSS 11-15 placebo group when using the standard methodology, and
that would decrease the "apparent" efficacy results by 3% [3], so that the
"apparent" efficacy of tPA using the standard methodology (after
correction for stroke severity imbalances) would only be 8% (46%-38%).
If the "true" efficacy of tPA is approximately 8-11%, then it is less
than 50% of the calculated "apparent" efficacy of tPA (21%) as determined
by the NINDS trialists using their standard methodology (without
correcting for imbalances in
baseline stroke severity).
If the "true" efficacy of tPA is only 8-11% for patients treated
between 91-180 minutes, and the NINDS trialists hypothesize that tPA's
efficacy wanes throughout the 91-180 minute time period [4], then what is
the likely efficacy of tPA for stroke patients treated between 150-180
minutes? Would the likely efficacy rate be greater-or-less than the likely
rate of a symptomatic ICH (~7%)? The "estimated" answer may markedly
affect one's calculation of the risk:benefit ratio of tPA, and one's
decision whether to utilize the drug for stroke patients who can only be
treated >150 minutes from the time of stroke onset.
Jeffrey Mann.
References:
1. The NINDS Stroke Study Group response. bmj rapid response letter.
2 July 2002.
Available online at http://bmj.com/cgi/eletters/324/7339/723#23369
2. Yusuf S, Wittes J, Probstfield J, Taylor HA. Analysis and
Interpretation of Treatment Effects in Subgroups of Patients in Randomized
Clinical Trials. JAMA 1991; 266:93-98.
3. Mann J. The raw data of the NINDS trial should be made public. bmj
rapid response letter. 8 July 2002.
Available online at http://bmj.com/cgi/eletters/324/7339/723#23369
4. Representative copy of figure 2 from the Marler article.
Available online at
http://emguidemaps.homestead.com/files/marlergraph.html
Competing interests: No competing interests
I appreciate Dr Grotta's willingness to submit a detailed response on
behalf of the NINDS study group, and welcome his provision of more
detailed subgroup data for further analysis. The NINDS study group's rapid
response letter is a testament to the social value of the bmj's rapid
response section, because it demonstrates that the rapid response section
can serve as a valuable forum for serious scientific discussion. The
readers of the bmj (and the wider public) are much better served if they
can study and analyse the arguments of multiple discussants, who obviously
have different points-of-view. All the discussants obviously "spin" their
analysis of the data, and the independently-thinking reader is hopefully
able to determine the likely "scientific truth" by carefully perusing the
different points-of-view.
Dr. Grotta stated "Dr. Mann writes that in order to obtain valid
results, critical prognostic variables have to be
prespecified, and corrected for, in the design of any randomized
controlled trial. This is true, but Dr. Mann’s
choice of prognostic variables is different than ours. What we think is
important is predicting who will respond to TPA. Dr. Mann is most
concerned with who will do well despite therapy." Dr Grotta is entirely
mistaken if he presumes that I am more concerned with who will do well
despite therapy. Surely, both factors have to be carefully considered when
designing a tPA-for-stroke trial? I can appreciate the fact that the NINDS
investigators mainly focused their attention on predicting who will most
likely respond to tPA when designing their stroke trial -- along the
practical lines suggested by David Sackett [1] who stated that confidence
in a trial's results is greater when the signal/noise ratio of the trial
is enhanced. According to Sackett the "signal describes the differences
between the effects of the experimental and control treatments". By
deliberately choosing patients who would most likely have a substantial
response to tPA, the NINDS trialists would obviously maximise the
"signal", which would subsequently increase one's confidence in tPA's
efficacy if the trial's results turned out to be positive. However, surely
it is equally important to decrease the "noise" in order to be confident
in the validity of the NINDS trial's results? Consider Sackett's
definition of "noise": "noise (or uncertainty) in an
RCT is the sum of all the factors ("sources of variation") that can affect
the absolute risk reduction or absolute difference". In the case of
tPA-for-stroke trials that are poorly balanced for baseline stroke severity,
the expected rate of a favorable stroke outcome (due to the
natural course of the disease) varies according to the degree of imbalance
in baseline stroke severity. Significant imbalances in baseline stroke
severity between treated and placebo patients create considerable "noise"
-- because they produce a significant chance-variability in the rate of a
favorable stroke outcome that may obscure (magnify/diminish) the "true"
efficacy of tPA and cause the "apparent" efficacy of tPA to be greater-or-
less than the "true" efficacy of tPA.
Now that Dr. Grotta has published the favorable stroke outcome
results for each subgroup, it is much easier to
demonstrate the degree of "noise" caused by stroke severity imbalances in
the NINDS trial by simply reviewing the results presented in table 1 in
Dr. Grotta's rapid response letter. The NINDS investigators' own data-
presentation shows that the "apparent" efficacy of tPA for patients
treated between 91-180 minutes is 21% (46% minus 25%), and only 14% (37%
minus 23%) when the NIHSS 0-5 subgroup's results are eliminated from
consideration. That represents a one-third reduction in tPA's "apparent"
efficacy due to the elimination of "noise" from the biggest single source
of confounding due to stroke severity imbalances between treated and
placebo patients in the NINDS trial -- the stroke outcome results from the
NIHSS 0-5 subgroups. The 7% absolute difference (due to the recruitment of
such a large percentage of very mild stroke patients) was chalked up as
being due to tPA therapy, when it was obviously due to the natural course
of the disease. It is ironic that such
a high proportion of very mild stroke patients were recruited into the
NINDS trial. Patients with very mild
strokes (NIHSS 0-5) represented about 20% of the total number of tPA
patients treated between 91-180 minutes. Recruiting patients with very
mild strokes is contrary to Sackett's basic principle of mainly recruiting
high risk
patients, who would more likely show a substantial response to tPA
therapy. It was also contrary to the NINDS investigators' own policy of
discouraging the recruitment of patients with a NIHSS score of <4 [2].
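The effect of setting aside the mildest subgroup can be reproduced with a rough sketch. The counts are assumptions taken from figures quoted elsewhere in this correspondence (24/29 tPA patients with a favorable outcome in the mildest subgroup, and 36/160 placebo patients outside it); the function and variable names are hypothetical.

```python
def rate(favorable, patients):
    """Favorable-outcome rate as a fraction."""
    return favorable / patients


# 91-180 minute arm: (favorable outcomes, patients)
tpa_all = (70, 153)
tpa_mild = (24, 29)            # mildest subgroup, tPA
placebo_all = (42, 167)
placebo_excl_mild = (36, 160)  # placebo patients outside the mildest subgroup

# Remove the mildest subgroup from the tPA totals.
tpa_excl_mild = (tpa_all[0] - tpa_mild[0], tpa_all[1] - tpa_mild[1])

diff_all = rate(*tpa_all) - rate(*placebo_all)
diff_excl = rate(*tpa_excl_mild) - rate(*placebo_excl_mild)
relative_drop = 1 - diff_excl / diff_all

print(f"all patients:         {diff_all:.1%}")
print(f"excluding mildest:    {diff_excl:.1%}")
print(f"relative reduction:   {relative_drop:.0%}")
```

With unrounded rates the relative reduction comes out a little under the one-third obtained from the rounded 21% and 14% figures, which does not change the direction of the argument.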
There may be another significant source of "noise" due to stroke
severity imbalances that may cause the "apparent" efficacy of tPA to be
different from the "true" efficacy of tPA -- "noise" that would be
generated if the placebo and tPA patients within EACH subgroup were not
near-perfectly balanced for baseline stroke severity (even though the
total number of placebo and tPA patients in each subgroup was near-equal).
I explored that particular issue at great length in a letter to the CMAJ
[3] and I wonder to what degree stroke severity imbalances within each/all
of the NINDS subgroups could be a confounding factor. The true answer to
that question will only become fully apparent when the NINDS investigators
make all the raw data from the NINDS
trial publicly available -- so that the public can much more accurately
determine how well-balanced EACH of the subgroups was for baseline stroke
severity. According to the TOAST graph [4] a two-point difference in the
"average" baseline NIHSS score between treated and placebo patients in the
NIHSS range of 16-20 can cause a 3-5% absolute difference in the rate of a
favorable stroke outcome, which could alter the "apparent" efficacy of
tPA for those patients by 30-50% in relative terms. The figure of 30-50% is
obtained by simply examining the NIHSS 16-20 subgroups' results in table
1, which showed that the "apparent" absolute efficacy of tPA for those
patients was 9% (27% minus 18%).
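As a quick check of that proportion, the sketch below divides the 3-5 point shift attributed to a two-point NIHSS imbalance by the 9-point difference observed in the NIHSS 16-20 subgroup. The numbers are the ones quoted above; the variable names are hypothetical.

```python
observed_diff = 9              # percentage points (27% minus 18%)
shift_low, shift_high = 3, 5   # points attributed to a two-point NIHSS imbalance

relative_low = shift_low / observed_diff
relative_high = shift_high / observed_diff

print(f"share of the observed difference: {relative_low:.0%} to {relative_high:.0%}")
```

The result is roughly a third to a little over a half, consistent with the approximate 30-50% range given in the text.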
Another probable example of "noise" due to stroke severity imbalances
within the NINDS trial's subgroups can be ascertained by looking at the
NIHSS 11-15 and NIHSS 16-20 placebo subgroups' rate of favorable stroke
outcome results in table 1 in Dr. Grotta's letter. The rate of favorable
stroke outcome for the NIHSS 11-15 placebo subgroup was 14%, which was
less than the figure of 18% for the NIHSS 16-20 placebo subgroup. That
result is obviously surprising because patients with a baseline NIHSS
stroke severity score of 11-15 are naturally expected to have a much
better stroke outcome result than patients with a baseline stroke severity
NIHSS score of 16-20. The figure of 14% seems to be extraordinarily low
for untreated patients with a baseline NIHSS score in that stroke
severity range and it is much less than would be predicted. It would be
very informative if the NINDS trialists would publish the rate of
favorable stroke outcome results for the NIHSS 11-15 placebo subgroups
from the 0-90 minute arm of the NINDS trial, ECASS trial, ECASS II trial
and ATLANTIS trial (including their "average" baseline stroke severity
scores). It would be extremely useful to know whether the "average" rate
of favorable stroke outcome of the NIHSS 11-15 placebo patients from those
other trials is closer to the 34% figure predicted by the TOAST graph
[4], and whether the comparable NINDS placebo result from the 91-180
minute cohort is a statistical outlier that artefactually inflates the
"apparent" efficacy of tPA in that subgroup of patients. How much of an
effect could this particular imbalance have if the "true" rate of a
favorable stroke outcome for placebo patients in the NIHSS 11-15 subgroup
was 30%? The answer is that an additional 5 patients would have a
favorable stroke outcome.
Finally, there is another element of "noise" due to stroke
severity imbalances in the NINDS trial that should
be considered. Note that there were 18 more patients in the placebo group
in the NIHSS >20 subgroup (compared to the tPA group), and that those
patients only had a 4% likelihood of a favorable stroke outcome. If
those 18 patients were equally distributed between the NIHSS 6-10, 11-15
and 16-20 subgroups, then an additional 6 patients would have a favorable
stroke outcome. Adding that figure of 6 patients to the 5 additional
patients from the NIHSS 11-15 subgroup means that an additional 11 placebo
patients would have a favorable stroke outcome. Then the computed figure
for the placebo group (excluding the NIHSS 0-5 subgroup) would be 47/160
and not 36/160, which translates to 29% and not 23%. That means that the
"apparent" efficacy of tPA would be reduced by another 6%, and it would
only be 8% and not 14%.
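The step from 14% to 8% can be traced with a minimal sketch. It takes the letter's own estimates as given (5 additional favorable outcomes from the NIHSS 11-15 correction and 6 from redistributing the 18 surplus NIHSS >20 placebo patients) and assumes a tPA count of 46/124 outside the mildest subgroup, derived from the totals quoted elsewhere in this correspondence. The names are hypothetical.

```python
# Placebo patients excluding the NIHSS 0-5 subgroup, as quoted in the letter.
placebo_favorable, placebo_patients = 36, 160

extra_from_11_15 = 5           # if the NIHSS 11-15 placebo rate were about 30%
extra_from_redistribution = 6  # 18 surplus NIHSS >20 patients spread over 3 subgroups

# Assumed tPA figure outside the mildest subgroup: (70 - 24) / (153 - 29).
tpa_rate = 46 / 124

adjusted_favorable = placebo_favorable + extra_from_11_15 + extra_from_redistribution

diff_before = tpa_rate - placebo_favorable / placebo_patients
diff_after = tpa_rate - adjusted_favorable / placebo_patients

print(f"adjusted placebo: {adjusted_favorable}/{placebo_patients}")
print(f"difference before: {diff_before:.1%}  after: {diff_after:.1%}")
```

The sketch only reproduces the letter's arithmetic; it says nothing about whether the two corrections are themselves justified, which is the question the raw data would have to settle.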
How useful is this post hoc conjecturing about the NINDS trial's
subgroup data? Dr. Grotta stated "It is also very
important to remember that these post-hoc analyses involving subgroups
without sufficient statistical power to answer the meaningful question
should be considered as “hypothesis generating” and providing a rationale
for further study only." I agree with Dr. Grotta, and I think that my post
hoc conjecturing about the "noise" influence of stroke severity imbalances
within the NINDS trial's subgroups is simply a "hypothetical" explanation,
which only becomes a "realistic" explanation if the raw data supports my
theory's basic tenets. That is why I have requested that the NINDS
investigators make the pooled raw data from all the tPA-for-stroke trials
publicly available [5], so that the raw data can be independently
examined. The pooled results from all the tPA-for-stroke trials should be
examined for EACH level of baseline stroke severity (from a baseline NIHSS
score of 1-25) for different times-to-treatment, so that the "noise" due
to stroke severity imbalances can be eliminated as a confounding factor.
By also examining the favorable stroke outcome results for different
times-to-treatment, it will immediately become clear to what degree delays in
time-to-treatment affect the "apparent" efficacy of tPA -- without having
to depend on the hypothetical model constructed by the NINDS investigators
[6].
A number of other statements made by Dr. Grotta deserve further
commentary. He stated "Contrary to Dr. Mann’s
assertion, NIH Stroke Scale was a prespecified variable that was known to
predict outcome and it was corrected for in the usual way in the original
publication." I have parsed that original publication countless times and
I have never read any statement that implied that the NINDS investigators
had corrected for imbalances in baseline stroke severity between treated
and placebo patients. Hopefully, the NINDS investigators, or other bmj
readers, could point out the particular "statement/statements" in that
original publication that I must have missed. Dr. Grotta also stated "The
facts are that the baseline imbalance of stroke severity DOES NOT explain
the entire results of the trial. The imbalance explains some of the
difference, but there is no question that there is still benefit from TPA
91-180 minutes after stroke onset." I wholeheartedly agree with Dr. Grotta
-- the imbalance only explains some of the difference. However, the
critical question is how much of the difference is due to stroke severity
imbalances and how much is due to the "true" efficacy of tPA? That
question has presently not been answered, and I strongly suspect that an
accurate answer will only become apparent when all of the NINDS trial's
raw data is made available to the public. Do the NINDS investigators have
a valid reason for not making the raw data available -- considering that
the study was funded with public money through the NIH? There is a
disturbing dissonance between the refusal of the trial's investigators to
make patient-level data publicly available, and the NIH's traditional
stance on the dissemination of the results of NIH-sponsored research.
Indeed, this is unequivocally explicated in a draft policy statement
concerning data sharing released on March 1, 2002 [7]. The statement
states "There are many reasons to share data from NIH-supported studies.
Sharing data reinforces open scientific inquiry, encourages diversity of
analysis and opinion, promotes new research, makes possible the testing of
new or alternative hypotheses and methods of analysis -----". In fact, the
NIH draft statement makes some definite recommendations and it explicitly
states "The NIH will expect investigators supported by NIH funding to make
their research data available to the scientific community for subsequent
analyses."
Finally, Dr. Grotta also stated "The results of the NINDS study have
been confirmed by numerous independent reports from both academic and
community hospitals." How is that possible if those studies did not have a
placebo arm? By what means could post-marketing tPA-for-stroke studies
determine the "true" efficacy of tPA if they did not have an absolute or
relative comparator? In the absence of an absolute comparator (equally
balanced group of placebo patients in a RCT), one could theoretically only
determine that the "other" study had a similar degree of efficacy as the
NINDS trial if the "other" study had tPA patients with an identical stroke
severity distribution as the original NINDS trial. Does anyone know of
such a study? In the absence of knowledge of such a study, I took the
Multicentre Stroke Survey's group of >1,000 tPA patients, who had an
"average" rate of a favorable stroke outcome of 33%, and I calculated the
likelihood of a similar group of untreated stroke patients having a
favorable stroke outcome due to the natural course of the disease (using
data from the NINDS trial and not the TOAST study). The calculated results
were reported in my rapid response letter [3] and the estimated "average"
figure was 31.7%. That figure suggests that tPA was probably not
significantly efficacious in those patients. Although the results are only
based on a relative comparison, which is not universally regarded as being
statistically valid, I would be interested in knowing if anyone has a
better means of demonstrating how the results of post-marketing studies
(which are not RCTs) can accurately confirm or refute the positive
results of the NINDS trial.
Jeffrey Mann.
References:
1. Sackett, David L. Why randomized controlled trials fail but
needn't: 2. Failure to employ physiological statistics, or the only
formula a clinician-trialist is ever likely to need (or understand!) CMAJ:
Canadian Medical Association Journal. 165(9):1226-1237, October 30,
2001.
Available online at http://www.cmaj.ca/cgi/content/full/165/9/1226
2. Comment by Patrick Lyden at the FDA Advisory Committee meeting -
June 6th, 1996. From the meeting's transcripts - lines 15-16 on page 183.
3. Mann J. To what degree do stroke severity imbalances affect the
"apparent" efficacy of tPA. Canadian Medical
Association Journal rapid response letter. June 25th 2002.
Available online at http://www.cmaj.ca/cgi/eletters/166/13/1652#113
Also available at http://emguidemaps.homestead.com/files/cmaj-reply2.html
4. Adams HP, Davis PH, Leira EC, Chang KC, Bendixen BH, Clarke W, et
al. Baseline NIH stroke scale score strongly predicts outcome after
stroke: a report of the Trial of Org 10172 in Acute Stroke Treatment
(TOAST). Stroke 1999; 30 (11): 2496.
5. Mann J. An open letter to the stroke interventionist community.
bmj rapid response letter. 19 May 2002.
Available online at http://bmj.com/cgi/eletters/324/7339/723#22326
6. Representative copy of figure 2 from the Marler article.
Available online at
http://emguidemaps.homestead.com/files/marlergraph.html
7. NIH announces draft statement on sharing research data. Release
Date: March 1, 2002.
Available online at http://grants.nih.gov/grants/guide/notice-files/NOT-OD-02-035.html
Competing interests: No competing interests
The NINDS investigators would like to respond to the recent articles
in the British Medical Journal (1) and Western Journal of Medicine (2-4).
Please excuse the necessary length of our combined reply.
First Dr. Mann’s comments (2). We also encourage physicians to think
independently and would like to provide further information in response to
Dr. Mann’s four main points. This is done in the spirit of discussion and
not debate. We hope the readers of the BMJ and WJM will take time to
consider our response. We thought our previous peer-reviewed reports had
addressed Dr. Mann’s questions.
Dr. Mann writes that in order to obtain valid results, critical
prognostic variables have to be prespecified, and corrected for, in the
design of any randomized controlled trial. This is true, but Dr. Mann’s
choice of prognostic variables is different than ours. What we think is
important is predicting who will respond to TPA. Dr. Mann is most
concerned with who will do well despite therapy. There is good evidence
from the NINDS study that over a broad range of the baseline NIH stroke
scale, patients treated with TPA do better than those who are not treated
with TPA. Contrary to Dr. Mann’s assertion, NIH Stroke Scale was a
prespecified variable that was known to predict outcome and it was
corrected for in the usual way in the original publication (5). The data
in the two trials composing the study confirm that NIHSS does not reliably
predict response to therapy (6). The only group in which the benefit is
not apparent without correcting for other variables is the group with very
mild strokes having 0-5 on their baseline NIHSS (9% of the patients in the
study, 7% at 0-90 minutes, 11% at 91-180 minutes). In this case, there are
few patients to evaluate and the outcome variables may not have been
sensitive to different degrees of minimal disability. As Dr. Mann points
out, almost all of these patients had minimal or no disability at three
months whether or not they were treated with TPA. Even though it makes
the argument more complicated, we would also like to point out that there
are other baseline variables that predict outcome. Age of the patient is
just one example.
Dr. Mann writes that randomization into the TPA and placebo groups
was flawed in the National Institute of Neurological Disorders and Stroke
(NINDS) trial. There is no evidence that the randomization process was
flawed. The result of the randomization, however, was not an equal
assignment of baseline stroke scale in every small subgroup. This happens
in all trials. Fortunately, as you will see below, the effect of the drug
is so large that it overpowers these imbalances. In the 91-180 minute
group, the average baseline NIHSS score was lower for the patients
assigned randomly (in a process that was not flawed) to the placebo group.
We feel that our published analyses did account for this imbalance in our
assertion that the drug reduced disability at three months. We hope that
the data provided here will make the validity of the statistics more
clear.
We understand how Dr. Mann could at first glance suspect that this
imbalance in randomization could alone account for the apparent
effectiveness of TPA shown in the NINDS trial. We had recommended as a
group that there be a goal of one hour door to needle time because it was
so apparent and logical to us that treating early was critical (7).
Several years after the results of the trial were published, though, it
was clear that the door to needle time varied widely and that physicians
were delaying treatment of patients even within the three hour window.
Therefore we looked more thoroughly to see if time to treatment predicted
a better response to treatment. In our post-hoc analysis (8), it did
appear that door to needle time was important even within three hours from
stroke onset. What we failed to make clear was that the baseline
difference in stroke scale in the 91-180 minute group did NOT account for
the effectiveness of TPA in the entire trial. In other words, there were
not so many more patients in the TPA treated group with baseline NIHSS 0-5
that were treated 91-180 minutes that their predictably good outcome could
have explained the entire effect of the drug. To make this as clear as
possible, we provide the following data tables with the qualification that
they do not account for other significant variables that predict outcome
such as age. It is also very important to remember that these post-hoc
analyses involving subgroups without sufficient statistical power to
answer the meaningful question should be considered as “hypothesis
generating” and providing a rationale for further study only (9). The
facts are that the baseline imbalance of stroke severity DOES NOT explain
the entire results of the trial. The imbalance explains some of the
difference, but there is no question that there is still benefit from TPA
91-180 minutes after stroke onset. Until now we had thought that this
benefit had been made clear in our previous publications (5,6,8).
Table 1 Rankin (Good outcome = 0, 1) at 3 months for patients treated 91-180 minutes from stroke onset
Baseline NIHSS (patients treated 91 to 180 minutes) | Good Rankin outcome, TPA | Good Rankin outcome, Placebo | Relative Risk (95% CI)
1-5 | 24/29 (83%) | 6/7 (86%) | 1.0 (0.7,1.4)
6-10 | 23/37 (62%) | 23/46 (50%) | 1.2 (0.8,1.8)
11-15 | 10/26 (38%) | 5/35 (14%) | 2.7 (1.0,6.9)
16-20 | 9/33 (27%) | 6/33 (18%) | 1.5 (0.6,3.7)
>20 | 4/28 (14%) | 2/46 (4%) | 3.3 (0.6,16.8)
All Patients | 70/153 (46%) | 42/167 (25%) | 1.8 (1.3,2.5)
>5 (All, excluding 1-5) | 46/124 (37%) | 36/160 (23%) | 1.6 (1.1,2.4)
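The relative risks and intervals in a table of this kind can be checked directly from the counts. The sketch below is an illustration only: the reply does not say how its intervals were computed, so the log-scale normal approximation used here is an assumption, and the function name `relative_risk_ci` is hypothetical. Applied to the "All Patients" row of Table 1, it gives values that round to the reported 1.8 (1.3, 2.5).

```python
import math


def relative_risk_ci(events_a, total_a, events_b, total_b, z=1.96):
    """Relative risk of group A versus group B with a log-scale normal CI.

    Assumed method (not stated in the letter): the standard error of log(RR)
    is sqrt(1/a - 1/n_a + 1/b - 1/n_b). It is undefined when either group
    has zero events, so None is returned in that case.
    """
    if events_a == 0 or events_b == 0:
        return None  # corresponds to the "Undefined" rows in the tables
    rr = (events_a / total_a) / (events_b / total_b)
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# "All Patients" row of Table 1: 70/153 on TPA, 42/167 on placebo.
rr, lower, upper = relative_risk_ci(70, 153, 42, 167)
print(f"RR {rr:.1f} (95% CI {lower:.1f}, {upper:.1f})")
```

The same call can be repeated for each NIHSS row; rows with a zero count return None rather than a number.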
Table 2 Symptomatic hemorrhage within 36 hours of treatment for patients treated 91-180 minutes from stroke onset
Baseline NIHSS (patients treated 91 to 180 minutes) | Symptomatic hemorrhage, TPA | Symptomatic hemorrhage, Placebo | Relative Risk (95% CI)
1-5 | 0/29 (0%) | 0/7 (0%) | Undefined
6-10 | 2/37 (5%) | 1/46 (2%) | 2.5 (0.2,26.4)
11-15 | 2/26 (8%) | 0/35 (0%) | Undefined
16-20 | 2/32 (6%) | 1/33 (3%) | 2.1 (0.2,21.6)
>20 | 4/28 (14%) | 0/46 (0%) | Undefined
All Patients | 10/152 (7%) | 2/167 (1%) | 5.5 (1.2,24.7)
>5 (All, excluding 1-5) | 10/123 (8%) | 2/160 (1%) | 6.5 (1.5,29.1)
Table 3 Number of patients treated 91-180 minutes after onset who died within 90 days post-treatment
Baseline NIHSS (patients treated 91 to 180 minutes) | Died within 90 days, TPA | Died within 90 days, Placebo | Relative Risk (95% CI)
1-5 | 0/29 (0%) | 0/7 (0%) | Undefined
6-10 | 0/37 (0%) | 4/46 (9%) | Undefined
11-15 | 5/26 (19%) | 7/35 (20%) | 1.0 (0.3,2.7)
16-20 | 7/33 (21%) | 8/33 (24%) | 0.9 (0.4,2.1)
>20 | 12/28 (43%) | 16/46 (35%) | 1.2 (0.7,2.2)
All Patients | 24/153 (16%) | 35/167 (21%) | 0.7 (0.5,1.2)
>5 (All, excluding 1-5) | 24/124 (19%) | 35/160 (22%) | 0.9 (0.6,1.4)
These data suggest that, although the numbers in each subgroup are too
small to support firm conclusions, the trend in every NIHSS subgroup
except the NIHSS 0-5 group (the “mild” strokes that were, in fact, more
prevalent in the TPA group treated from 91-180 minutes) is toward a more
favorable outcome (relative risk > 1.0 for a 3 month Rankin score of 0 or
1) in the TPA group than in the placebo group.
In the total group, and in the subgroup that most
concerns Dr. Mann, those with NIHSS > 5 treated 91-180 minutes post
stroke, the data significantly favor TPA. As found in our previous
publications (5), the risk of hemorrhage is higher in the TPA groups, but
nevertheless there is no difference in mortality between TPA and placebo
in any subgroup based on baseline NIHSS or treatment interval.
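One conventional way to summarise a stratified table such as Table 1 in a single number is a Mantel-Haenszel pooled relative risk, which compares TPA with placebo within each baseline NIHSS stratum before combining. The sketch below is an illustration under that assumption. It is not the analysis the investigators report having done, and the function name `mantel_haenszel_rr` is hypothetical.

```python
def mantel_haenszel_rr(strata):
    """Pooled relative risk across strata using Mantel-Haenszel weights.

    Each stratum is (events_treated, n_treated, events_control, n_control).
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator


# Table 1 counts (good Rankin outcome, treated 91-180 minutes) by baseline NIHSS.
table1 = [
    (24, 29, 6, 7),    # NIHSS 1-5
    (23, 37, 23, 46),  # NIHSS 6-10
    (10, 26, 5, 35),   # NIHSS 11-15
    (9, 33, 6, 33),    # NIHSS 16-20
    (4, 28, 2, 46),    # NIHSS >20
]

crude_rr = (70 / 153) / (42 / 167)
pooled_rr = mantel_haenszel_rr(table1)
print(f"crude RR {crude_rr:.2f}, severity-stratified RR {pooled_rr:.2f}")
```

With these counts the stratified estimate (about 1.44) is smaller than the crude one (about 1.82) but remains above 1.0. The sketch gives no confidence interval and ignores other predictors such as age, so it cannot settle the question of significance.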
We have tried to briefly address Dr. Mann’s charges. There were
numerous other technical errors and incorrect assumptions that he made.
We will be submitting a more detailed analysis to a peer-reviewed journal
shortly. Dr. Mann does not question the benefit of TPA less than 91
minutes after stroke onset. We are confident that benefit remains for
patients treated 91-180 minutes, but that the benefit from treatment is
less as time from onset increases. We would like to reemphasize the
importance of the one hour door to needle time and seek the advice of the
community of physicians on how to deliver this beneficial treatment to
appropriate patients as soon as possible within the three hour limit.
Dr. Trotter is misinformed (3). The NINDS trial was not sponsored by
Genentech. The greatest possible distance from their influence was
maintained during the conduct of the trial and immediately following the
publication of the results. Genentech did provide the drug and did do the
extra documentation required to apply for approval for use in treatment of
acute stroke. They did not control the data from the trial and were not
provided the data until after the final analysis had been completed. None
of the NINDS investigators were employees or consultants to Genentech
either during the conduct of the trial or at the time of the FDA hearings
that occurred 6 months after the publication of the results. We cannot
speak to the actions of the AHA and the financial relationship of that
organization to Genentech, but the original guidelines recommending
TPA(10) were formulated not by the AHA, but by independent “clinician-
interpreters” empanelled by the AHA, 8/13 of whom had nothing to do with
the NINDS trial, and none of whom personally were receiving funds from
Genentech at the time the guidelines were formulated. The guidelines that
seem to bother Drs. Mann and Trotter recommending the use of TPA for
stroke as “standard of care” were formulated much later and for Emergency
Physicians (11). We cannot deny that we are convinced by the data from
the NINDS TPA stroke study. The results of the NINDS study have been
confirmed by numerous independent reports from both academic and community
hospitals. There is one report from Cleveland of bad experience in a
community hospital setting (12). A recent abstract (13) reported outcomes
in the same Cleveland hospitals consistent with those from the NINDS trial
after educational efforts resulted in better protocol adherence. TPA is
presently the only available effective drug treatment for stroke. We hope
community physicians will join the effort to find better treatments in the
future.
The NINDS investigators agree with Dr. Trotter on one important
point—that sensationalistic journalism often presents a biased viewpoint.
An excellent example of this is Jeanne Lenzer’s article in the BMJ (1).
In this article, Ms. Lenzer raises many of the same issues as does Dr.
Mann and that we have just addressed in the preceding paragraphs, but she
makes no effort to balance her “investigative journalism” with any of the
abundant evidence supporting the use of TPA. She is also inaccurate in
the evidence she presents. For example, in her first paragraphs, what is
her evidence that the treatment recommendations to use TPA “could cost
more lives than the disease itself”? This is a wild exaggeration,
considering that of the more than 400,000 acute ischemic strokes occurring
in the U.S. yearly, about 15%, or 60,000 patients, will die in the first
month without treatment, and that there was no excess mortality in TPA
treated patients in the NINDS trial. Even in the worst case scenario as
published in the “Cleveland study” (12), where there was an excess
mortality of 10% in treated patients, this would amount to 800 excess
deaths nationwide, considering that such poor results occurred in a
setting where less than 2% of stroke patients were treated.
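The figure of 800 follows from the numbers quoted in this paragraph if the 2% treatment rate is applied nationwide. That scaling is the assumption spelled out below, and the variable names are illustrative only.

```python
strokes_per_year = 400_000      # acute ischemic strokes in the U.S., as quoted
first_month_mortality = 0.15    # untreated first-month mortality, as quoted
treated_fraction = 0.02         # upper bound quoted for the Cleveland setting
excess_mortality = 0.10         # excess mortality quoted for treated patients

# Assumption: the Cleveland treatment rate and excess mortality apply nationwide.
deaths_without_treatment = strokes_per_year * first_month_mortality
treated_patients = strokes_per_year * treated_fraction
excess_deaths = treated_patients * excess_mortality

print(round(deaths_without_treatment), round(treated_patients), round(excess_deaths))
```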
In response to Dr. Wardlaw (4), we appreciate her assertion that the
NINDS trial results are valid and that the results are supported by
findings in other trials involving over 2400 patients. We would like
to correct her statement that the patients in the trial were treated at
tertiary referral centers. Most of the patients in the NINDS trial were
treated at community hospitals. Dr. Wardlaw also questions the blinding of
the investigators in determining the primary 3-month outcome measures.
She may not be aware of the extra efforts made in the NINDS trials. By
protocol, the persons ascertaining the 3-month outcomes were persons who
were not present at the time the patients were randomized. Like us,
she is concerned that few people are receiving the treatment, but please
note that she expresses no doubt that TPA has a beneficial effect. Her
solution to the problem of limited use of TPA for stroke in practice is to
be satisfied with an even smaller benefit for the larger group of patients
in the 0-6 hour time window. Since she would be satisfied with a smaller
effect, a much larger trial will be needed. If she wishes to stratify her
trial by NIHSS, then she may do so, however, the need for stratification
would seem to be even less important in a larger trial. The NINDS
investigators would hope that convincing evidence of the benefit of TPA
beyond three hours is eventually proven in a prospective trial. However,
in the meantime, let’s focus on maintaining the shortest possible door to
needle time and encouraging more patients to recognize stroke and come
immediately to the hospital for emergency care. The recent delayed
diagnosis of President Gerald Ford’s stroke should stand as a reminder to
all of us how far we have to go in the treatment of this major disabling
disease.
The NINDS rt-PA Stroke Trial was an unbiased study. Its results are
highly significant and have been replicated in practice. These two
consecutive randomized trials each showed there was less than 1% chance
that the positive result was due to chance so that the likelihood that the
positive results in both trials were due to chance is vanishingly small.
We still think efforts to make this treatment available to as many
patients as possible are warranted. Failing to learn how to carry out this
therapy for appropriate patients at appropriate centers where acute stroke
patients are brought, on the pretext that the NINDS study was
biased, cannot be defended. Dr. Wardlaw is correct that “patients with
stroke….are unable to fight for their rights”. It is up to those
entrusted with their care to put away hyperbole and resistance to change.
We are open to suggestions on how we can work with the larger community of
physicians to make the promise of thrombolytic therapy a reality for more
of the millions of patients who have a stroke each year.
References
1. Lenzer J. Alteplase for stroke: money and optimistic claims
buttress the “brain attack” campaign. BMJ 2002;324:723-729.
2. Mann J. “An open letter to the stroke interventionalist
community”, and “Truths about the NINDS study:setting the record
straight.” West J Med 2002;176:192-194.
3. Trotter G. “Why were the benefits of TPA exaggerated?” West J
Med 2002;176:194-197.
4. Wardlaw JA, Linley RI. “Thrombolysis for acute ischemic
stroke: still a treatment for the few by the few.” West J Med 2002;176:198-199.
5. The National Institute of Neurological Disorders and Stroke
rt-PA Stroke Study Group. Tissue Plasminogen Activator for Acute Ischemic
Stroke. N Engl J Med 1995;333:1581-1587.
6. The National Institute of Neurological Disorders and Stroke
rt-PA Stroke Study Group. Generalized efficacy of rt-PA for acute stroke:
subgroup analysis of the NINDS rt-PA stroke trial. Stroke 1997;28:2119-2125.
7. Marler JR, Emr M, Jones P, editors. Proceedings of a National
Symposium on Rapid Identification and Treatment of Acute Stroke. The
National Institute of Neurological Disorders and Stroke (NINDS), National
Institutes of Health. 1997. Also available online at
http://www.ninds.nih.gov (NINDS) and http://www.stroke-site.org (Brain
Attack Coalition).
8. Marler JR, Tilley BC, Lu M, Brott TG, Lyden PC, Grotta JC,
Broderick JP, Levine SR, Frankel MP, Horowitz SH, Haley EC Jr, Lewandowski
CA, Kwitkowski TP for NINDS rt-PA Stroke Study Group. Early stroke
treatment associated with better outcome: the NINDS rt-PA Stroke Study.
Neurology 2000; 55:1649-1655.
9. Yusuf S, Wittes J, Probstfield J, Tyroler H. Analysis and
interpretation of treatment effects in subgroups of patients in randomized
clinical trials. JAMA 1991;266:93-98.
10. Guidelines for Thrombolytic Therapy for Acute Stroke: A
Supplement to the Guidelines for the Management of Patients With Acute
Ischemic Stroke Circulation. 1996;94:1167-1174.
11. American Heart Association in Collaboration with the
International Liaison Committee on Resuscitation. Guidelines 2000 for
cardiopulmonary resuscitation and emergency cardiovascular care. Part 7:
The era of reperfusion: Section 2: acute stroke. Circulation 2000; 102(8
suppl I): I204-I216
12. Katzan IL, Furlan AJ, Lloyd LE, Frank JI, Harper DL, Hinchey JA
et al. Use of tissue-type plasminogen activator for acute ischemic
stroke: the Cleveland Area Experience. JAMA 2000; 283:1151-1158.
13. Anthony Furlan, personal reference.
Competing interests: No competing interests
In his commentary [1] on Jeanne Lenzer's article, Philip Gorelick
stated "Scientific debate is the crucible in which to test evolving
hypotheses and to evaluate claims for benefit of new treatment strategies.
The most fruitful discussions are conducted between informed scientists
who may agree or disagree with the new observations and the conclusions.
The acceptable milieu is the scientific meeting or the peer-reviewed pages
of the leading scientific journals." I agree with that statement, but I
also feel that online discussions between informed scientists can
supplement those traditional venues as a serious forum for scientific
debate. Critics, who are not personally involved in a particular area of
research, often do not attend scientific meetings frequented by super-
specialists, and that phenomenon robs those super-specialists of the
opportunity to become aware of contrasting points-of-view. The rapid
response section of the bmj is a wonderful idea,
because it allows for an ongoing debate in the "open" forum of public
space, thus giving interested readers the opportunity to view the facts
from two (or more) conflicting points-of-view.
I applaud Dr. Jeffrey Saver for taking the trouble to reply to my
criticism of his views [2] in his recent rapid response letter to the bmj
[3]. I think that it is eminently fair that Dr. Saver be given the
opportunity to reply to his critics, because it gives interested readers
an opportunity to hear both sides of an argument. Being able to view
"reality" through the prism of contrasting viewpoints gives the reader an
opportunity to more clearly discern the "scientific truth". However, the
subject of tPA-for-stroke trial methodology is very complicated and many
readers may not be able to clearly appreciate the nuances of the debate
without further explication. It is obvious that each countering reply
presents the best face on the author's argument -- and subtle, but
important, facts may not be obvious to the uninformed reader without
further explication.
Dr. Saver states [3] "Dr. Jeffrey Mann wildly overstates his case. In
theory, it is best to adjust for baseline variables based on pretrial
knowledge of influence on outcome. In practice, sufficiently detailed
knowledge regarding the influence of baseline variables on clinical
outcome in populations precisely resembling those being enrolled in a
clinical trial is almost never available. As a result, virtually all
clinical trials adjust for influences actually observed in the enrolled
control population, as a cursory glance at trials published in leading
medical journals, including the BMJ itself, will attest. Moreover, it is
scientifically unsound to adjust for baseline imbalances using estimates
of influence upon clinical outcome that are known to be inappropriate for
the enrolled clinical trial population. Yet this well-recognized error is
just what
Mann commits in his posting and in his Western Journal of Medicine essay."
I agree with Dr. Saver that sufficiently
detailed, and appropriate, knowledge regarding the influence of baseline
variables on clinical outcome is not always
available and that trialists therefore have to rely on a less optimal
approach -- using a post hoc statistical adjustment to correct for the
most critically important baseline prognostic variables. Dr. Saver states
further "it is solid evidence based medicine, indeed obligatory EBM, to
adjust for influences of baseline variables using effects observed in the
control group actually enrolled in the clinical trial." If that EBM
practice is indeed obligatory - and I fully agree that it is - then why
did the NINDS trialists not make the obligatory adjustment in their
original analysis of the NINDS trial, as initially reported in the NEJM in
1995? Also, why did all the other tPA-for-stroke RCTs (ECASS, ECASS II,
ATLANTIS) fail to adjust for the influence of baseline stroke severity
variations when interpreting their trial's results? Even more recent re-
analyses in the medical literature, such as Gregory Albers' review of the
ATLANTIS trial's < 3 hours patient group [4] and Marc Fisher's meta-
analysis of the NINDS, ECASS, ECASS II and ATLANTIS trials [5] make no
attempt to correct for imbalances in baseline stroke severity between
treated and placebo patients when analysing the raw data. Has the stroke
research community been scientifically delinquent -- if it has willfully
ignored the confounding variable of "variations in baseline stroke
severity" when analysing the raw data of tPA-for-stroke trials? What is
wilder -- a total failure of clinical trialists to make any statistical
adjustment for a critical prognostic variable (baseline stroke severity
variations in tPA-for-stroke trials), or the imperfect use of a
statistical adjustment tool to better understand the implications of that
failure?
Is it essential to correct for imbalances in baseline stroke severity
in tPA-for-stroke trials? My dialogue essay published in the wjm [6]
suggests that it is critically important because of the steep slope
of the curve relating baseline stroke severity to the rate of
excellent stroke outcome in untreated patients -- each one-point change in
baseline NIHSS stroke severity score could cause a 5-10% difference in the
absolute rate of an excellent stroke outcome due to the natural course of
the disease. If the slope of the curve were not that steep, then
the interpretive error that would occur
if that prognostic variable were ignored would not be that significant.
However, baseline stroke severity has a major impact on the final
prognosis, and any stroke severity imbalances between treated and placebo
patients must be corrected for in ALL stroke trials.
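As an illustration of the kind of correction argued for here, the counts in Table 1 of the NINDS investigators' reply above can be used for a simple direct standardisation: apply the placebo group's stratum-specific rates of good outcome to the TPA group's baseline NIHSS distribution. This is a rough sketch, not a method endorsed by either side of the debate. The function name `expected_untreated_rate` is hypothetical, and the NIHSS 1-5 placebo stratum contains only 7 patients, so its rate is unstable.

```python
def expected_untreated_rate(treated_counts, control_rates):
    """Expected good-outcome rate in the treated group if each stratum had
    the outcome rate observed in the corresponding control stratum."""
    expected_events = sum(n * rate for n, rate in zip(treated_counts, control_rates))
    return expected_events / sum(treated_counts)


# Table 1, patients treated 91-180 minutes, strata NIHSS 1-5, 6-10, 11-15, 16-20, >20.
tpa_n = [29, 37, 26, 33, 28]
placebo_rates = [6 / 7, 23 / 46, 5 / 35, 6 / 33, 2 / 46]

expected = expected_untreated_rate(tpa_n, placebo_rates)
print(
    f"observed placebo {42 / 167:.1%}, "
    f"expected for TPA case mix {expected:.1%}, "
    f"observed TPA {70 / 153:.1%}"
)
```

On these counts the placebo rates applied to the TPA group's severity mix give roughly 35%, which lies between the observed placebo rate (about 25%) and the observed TPA rate (about 46%).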
Dr. Saver argues that my use of the TOAST graph is invalid. His
reasoning is: "Because spontaneous improvement and
worsening frequently occur early after stroke onset, NIHSS stroke scores
12-24 hours after stroke onset are expected to
correlate much more substantially with final clinical outcome than NIHSS
stroke scores 1-3 hours after stroke onset.[1] The TOAST trial population
was enrolled up to 24 hours after stroke onset, with virtually no patients
enrolled within 3 hours of onset. It is self-evidently an inappropriate
population to use in estimating influence upon outcome of baseline NIHSS
scores in both NINDS TPA Trial 1 and NINDS TPA Trial 2, each enrolling
only under 3 hour patients." I agree with Dr. Saver -- the fact that there
are variations in baseline stroke severity measurements, depending on how
soon after stroke onset the measurements are made, means that the TOAST
graph cannot be utilized as an absolute "gold standard" measuring stick
when it comes to making accurate predictions about final stroke outcome.
However, Dr. Saver fails to mention that I fully acknowledged that fact in
my open letter to the stroke interventionist community [7]. I specifically
acknowledged that any information derived from the TOAST graph only allows
for a "best guess" estimate, and that it cannot necessarily serve as a
"gold standard" measurement. In that rapid response letter, I stated "If
the stroke interventionist community does not agree with the estimated
figures from the TOAST graph, then why does it not perform a
prospective study on 10,000 acute ischemic stroke patients (untreated) and
measure their baseline NIHSS stroke severity
scores and their rate of excellent stroke outcome at 3 months? It would
then be possible to draw a TOAST-like graph
showing the precise relationship between the baseline NIHSS score and the
rate of excellent stroke outcome without
having to use a logistic regression equation to draw a "best-fit" graph
(because the graph would be plotted from actual baseline NIHSS scores for
each level of baseline stroke severity from a NIHSS score of 1-25). By
plotting that graph, the stroke interventionist community would have
established a "gold standard" curve that would allow it to determine the
true benefit of tPA therapy (or any other stroke therapy) -- by comparing
the individual treated patient's results to the "gold standard" graph." In
other words, I fully acknowledged the fact that the TOAST graph can only
be used as a "relative" comparison, and I have already suggested that the
stroke research community should debate the issue further and
"standardize" the required measurements by appropriate means. The stroke
research community can also debate the time-point issue further and decide
at what time-point after an acute stroke, the baseline stroke severity
measurement should be made. However, readers should not miss the essential
point that I was trying to make -- that using information gleaned from the
TOAST graph (which is only "relatively" accurate) is probably a "second-
best" solution, that should only be necessary if stroke RCTs cannot
perfectly randomize the treated and placebo patients for baseline stroke
severity, and if the stroke research community has not established a "gold
standard" post hoc statistical method of accurately adjusting for
imbalances in baseline stroke severity, that is acceptable to the entire
international stroke research community.
Is there a "gold standard" statistical adjustment that should be
applied post hoc to accurately correct for imbalances in baseline stroke
severity between treated and placebo patients in tPA-for-stroke trials?
Has the entire stroke research community debated all the alternative
methods of making that statistical adjustment, and have they jointly
decided on the optimum methodology? Are stroke interventionists even aware
of the statistical methodology that was used by the NINDS trialists to
correct for baseline stroke severity imbalances in the NINDS trial, and do
they agree on its appropriateness and accuracy? Dr. Saver did not mention
the fact that the only tPA-for-stroke trial that has made any post hoc
statistical correction for stroke severity imbalances was the NINDS trial
-- and that very important fact was only reported in the medical
literature [8] for the first time 5 years after the NINDS trial's results
were initially reported in the NEJM. What are the implications of that
post hoc statistical correction on the "accurate" interpretation of the
NINDS trial's raw data? Dr. Saver has skirted that issue completely. In my
open letter [7] I analysed the implications of the NINDS trialists' post hoc
statistical adjustment - accepting, for argument's sake, their published
figures - and I estimated that it implies that tPA has marginal, and
equivocal, efficacy approximately > 150-180 minutes after stroke onset.
That estimate was based on a "guesstimated" relative risk reduction
(RRR) for the 150-180 minute time period that must be much lower than 1.32
(the RRR figure for the entire 0-180 minute time period). Does Dr. Saver know
what the RRR of tPA is for stroke patients treated between 150-180 minutes
after stroke onset in the NINDS trial -- after making the appropriate
statistical correction for imbalances in baseline stroke severity?
Dr. Saver also stated "Dr. Mann appears generally insensitive to
important population distinctions between stroke studies. In a discussion
regarding the hemorrhagic risks of intravenous TPA given within 3 hours of
onset, he invokes our report on predictors of hemorrhage in patients given
intra-arterial thrombolysis within 6 hours of onset. In fact, most
patients in our recent analysis were treated with intra-arterial
thrombolysis because they were NOT eligible for intravenous tPA. It is
inappropriate to extend findings from a different procedure with a
different time window and different baseline characteristics to the NINDS
trials." If interested readers read my actual words from the open letter
[7] they will note that I stated "Kidwell showed that the secondary ICH
rate increased markedly with baseline NIHSS scores > 10, but their
published results are of limited value because the study sample size was
too small, because they used IA tPA, and because they used a subgroup
analysis with subgroups that are too broad." In other words, I readily
acknowledged that one could not accurately extrapolate their results
because they used intra-arterial tPA and not intravenous tPA (among other
reasons). However, does Dr. Saver argue that the results for intravenous
tPA show a markedly different pattern of secondary ICH compared to intra-
arterial tPA? Consider a recent article by Tanne [9] and note in figure 1
(risk of all ICH by baseline stroke severity) that the rate of secondary
ICH goes up dramatically with baseline NIHSS stroke severity levels >
10. The only point that I was trying to make is that one has to carefully
consider variations in baseline stroke severity when trying to estimate
the likelihood of a secondary ICH -- just as one needs to take variations
in baseline stroke severity, and variations in time-to-treatment, into
precise account when judging the likely efficacy of tPA. How else can one
accurately predict the absolute risk:benefit ratio of tPA therapy when
treating different acute ischemic stroke patients with strokes of varying
severity at varying times-to-treatment?
Finally, Dr. Saver stated "We also concur with calls to make public
detailed raw data from the pivotal NINDS-TPA trials and from the pooled
analysis of all 6 major intravenous TPA trials soon to
be published." On this point, there is absolutely no contention, and I
fully support the call to make public the raw (patient-level) data from
the NINDS and other intravenous tPA trials.
Jeff Mann.
References:
1. Philip Gorelick - Alteplase and Acute Stroke.
http://bmj.com/cgi/eletters/324/7339/723#21968, 7 May 2002
2. Saver JL, Kidwell CS, Starkman S. Commentary: Thrombolysis in
stroke: it works! BMJ 2002;324:723-729 (23 March).
3. Jeffrey L Saver - In reply. bmj rapid response letter. 19 May 2002.
Available at http://bmj.com/cgi/eletters/324/7339/723#22339
4. Albers, Gregory W. MD. Clark, Wayne M. MD. Madden, Kenneth P. MD,
PhD. Hamilton, Scott A. PhD. ATLANTIS Trial: Results for Patients Treated
Within 3 Hours of Stroke Onset. Stroke. 33(2):493-496, February 2002.
5. Fisher, Marc MD; Ringleb, P. A. MD; Schellinger, P. D. MD;
Schranz, C. MD; Hacke, W. PhD, MD. Thrombolytic Therapy Within 3 to 6
Hours After Onset of Ischemic Stroke: Useful or Harmful? Stroke.
33(5):1437-1441, May 2002.
6. Mann J. Truth about the NINDS study: setting the record straight.
West J Med 2002;176:192-194.
Available at http://www.ewjm.com/cgi/content/full/176/3/192
7. Mann J. An open letter to the stroke interventionist community.
bmj rapid response letter. 19 May 2002.
Available at http://bmj.com/cgi/eletters/324/7339/723#22326
8. Marler, J. R. MD. Tilley, B. C. PhD. Lu, M. PhD. Brott, T. G. MD.
Lyden, P. C. MD. Grotta, J. C. MD. Broderick, J. P. MD. Levine, S. R. MD.
Frankel, M. P. MD. Horowitz, S. H. MD. Haley, E. C. Jr. MD.
Lewandowski, C. A. Kwiatkowski, T. P. MD. for the NINDS rt-PA Stroke
Study Group. Early Stroke Treatment Associated With Better Stroke
Outcome: The NINDS rt-PA Stroke Study. Neurology 55(11):1649-1655,
December 12, 2000.
9. Tanne, David MD. Kasner, Scott E. MD. Demchuk, Andrew M. MD.
Koren-Morag, Nira PhD. Hanson, Sandra MD. Grond, Martin MD. Levine,
Steven R. MD. the Multicenter rt-PA Stroke Survey Group. Markers of
Increased Risk of Intracerebral Hemorrhage After Intravenous Recombinant
Tissue Plasminogen Activator Therapy for Acute Ischemic Stroke in
Clinical Practice: The Multicenter rt-PA Acute Stroke Survey.
Circulation. 105(14):1679-1685, April 9, 2002.
Competing interests: No competing interests
We thank BMJ readers who have posted responses to our commentary and
are grateful for the opportunity to address them.
Dr. Jeffrey Mann wildly overstates his case. In theory, it is best to
adjust for baseline variables based on pretrial knowledge of influence on
outcome. In practice, sufficiently detailed knowledge regarding the
influence of baseline variables on clinical outcome in populations
precisely resembling those being enrolled in a clinical trial is almost
never available. As a result, virtually all clinical trials adjust for
influences actually observed in the enrolled control population, as a
cursory glance at trials published in leading medical journals, including
the BMJ itself, will attest. Moreover, it is scientifically unsound to
adjust for baseline imbalances using estimates of influence upon clinical
outcome that are known to be inappropriate for the enrolled clinical trial
population. Yet this well-recognized error is just what Mann commits in
his posting and in his Western Journal of Medicine essay. Because
spontaneous improvement and worsening frequently occur early after stroke
onset, NIHSS stroke scores 12-24 hours after stroke onset are expected to
correlate much more substantially with final clinical outcome than NIHSS
stroke scores 1-3 hours after stroke onset.[1] The TOAST trial population
was enrolled up to 24 hours after stroke onset, with virtually no patients
enrolled within 3 hours of onset. It is self-evidently an inappropriate
population to use in estimating influence upon outcome of baseline NIHSS
scores in both NINDS TPA Trial 1 and NINDS TPA Trial 2, each enrolling
only under 3 hour patients. The two NINDS TPA trials were the first
studies ever to characterize well the course of a large population of
under 3 hour stroke patients. In such a case, it is solid evidence-based
medicine, indeed obligatory EBM, to adjust for influences of baseline
variables using effects observed in the control group actually enrolled in
the clinical trial.
Dr. Mann appears generally insensitive to important population
distinctions between stroke studies. In a discussion regarding the
hemorrhagic risks of intravenous TPA given within 3 hours of onset, he
invokes our report on predictors of hemorrhage in patients given intra-
arterial thrombolysis within 6 hours of onset. In fact, most patients in
our recent analysis were treated with intra-arterial thrombolysis because
they were NOT eligible for intravenous tPA. It is inappropriate to extend
findings from a different procedure with a different time window and
different baseline characteristics to the NINDS trials.
Dr. Solomon’s posting contains much opinion, but little actual
analysis. He wishes we were wrong, but is unable to demonstrate that we
are. Two glaring errors in his post are worth pointing out. First, his
statements regarding the “number needed to harm” with TPA are completely
misleading (as are those of Li and colleagues). TPA causes more patients
to bleed – the number needed to treat to cause symptomatic intracerebral
hemorrhage does approximate 17. However, TPA also prevents an
approximately equal number of patients from experiencing symptomatic
worsening from stroke extension, cerebral herniation, and other
complications of large infarcts. Baldly stated, if you receive TPA, the
risk is increased that you may bleed and die. If you don’t receive TPA,
the risk is increased that you may herniate and die. The salient number
needed to harm is the net sum of these two factors, and across all under 3
hour trials, there is no net harm.
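The arithmetic behind this argument can be sketched briefly. A number needed to harm of 17 corresponds to an absolute risk increase of about 1/17, or roughly 5.9%. The example below is a minimal illustration, not an analysis of trial data: it assumes the offsetting reduction in non-hemorrhagic worsening is exactly equal to the hemorrhage increase (the text says only "approximately equal"), and the function names are hypothetical.

```python
def absolute_risk_change(number_needed: float) -> float:
    """Absolute risk difference implied by a number needed to treat or harm."""
    return 1.0 / number_needed


def net_number_needed_to_harm(risk_increase: float, risk_reduction: float) -> float:
    """Net NNH when one harm is offset by a reduction in another harm.

    Returns infinity when the two effects cancel or the reduction dominates,
    i.e. when there is no net harm.
    """
    net = risk_increase - risk_reduction
    if net <= 0:
        return float("inf")
    return 1.0 / net


# NNH of 17 for symptomatic intracerebral hemorrhage (figure quoted in the text).
hemorrhage_increase = absolute_risk_change(17)

# ASSUMPTION for illustration only: the reduction in worsening from stroke
# extension and herniation is exactly equal to the hemorrhage increase.
worsening_reduction = hemorrhage_increase

print(f"Absolute increase in hemorrhage risk: {hemorrhage_increase:.1%}")
print(f"Net NNH: {net_number_needed_to_harm(hemorrhage_increase, worsening_reduction)}")
```

If the two effects were not exactly equal, the net NNH would be finite but could still be far larger than 17; the sign and size of the net difference are what matter.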
A second misrepresentation advanced by Dr. Solomon is his suggestion
that in many communities only a small minority of stroke patients present
via ambulance. We are unaware of any published data that would support
such a claim, and he fails to reference any. In fact, published studies
suggest just the opposite, that 35-70% of all acute stroke patients are
transported to the emergency department by emergency medical services.[2]
This number is likely to be even higher for patients who are candidates
for thrombolytic therapy.
Our UCLA colleague, Dr. Hoffman, makes the rather startling
suggestion that it was unfair of us to refute the incorrect scientific
statements in Ms. Lenzer’s article because Ms. Lenzer is not a scientist.
We could not disagree more. Writings of both journalists and scientists
should be held to one standard—the truth. We are sure Ms. Lenzer would
agree, pace Hoffman. Dr. Hoffman also wishes the BMJ had solicited
commentaries more accurately reflecting the balance of opinion on TPA in
acute stroke. So do we. It is important, however, to realize what such
balanced commentaries would look like. The typical opinion page dichotomy
of one pro opinion and one con opinion would give an entirely misleading
view of the state of informed opinion. Among American stroke experts,
there is overwhelming consensus that TPA is efficacious. A balanced set of
expert commentaries would include an order of magnitude greater number of
positive opinions than negative opinions. We join with Dr. Hoffman in
urging the BMJ to solicit such a representative set of publications.
Lastly, Dr. Hoffman erects a straw man. We never suggested that “it is
impossible to be an expert on a subject about which one does not have a
conflict of interest.” Rather, we merely suggested that many experts will
have minor competing interests. In this regard, it is ironic that Dr.
Hoffman has modified his posting since it originally appeared, adding a
notice of his own financial competing interest with regard to the use of
TPA in stroke that was not included originally. If he were to be true to
the absolutist position on financial conflicts that he has advanced, he
would now absent himself from advising influential and independent
organizations on the TPA in stroke issue.
Contrary to Li and colleagues, we did not count the NINDS TPA trials
twice. We counted NINDS TPA Trial 1 once and NINDS TPA Trial 2 once.
Clearly they would prefer that there had been only one NINDS TPA trial,
but wishing does not make it so. We are mystified by the continued
misreading of the NINDS-TPA Trials NEJM report as representing a single
trial.[3] The first way you can tell that there were two trials in the
study is by their names. The first trial was named Trial 1. The second
trial was named Trial 2. Trial 1 had as its primary prespecified endpoint
early improvement at 24 hours by 4 points or more on the NIHSS. It
narrowly missed reaching statistical significance on this endpoint, but
several final 3 month clinical endpoints were positive. Trial 2 was
launched after the completion of enrollment in Trial 1, enrolled a
completely new set of patients, and was analyzed separately with regard to
its prespecified primary endpoint, a global measure of final functional
outcome 3 months after stroke. Trial 2 was positive on this prespecified
primary endpoint. However devoutly the TPA contrarians wish they only had
a single NINDS-TPA trial with which to contend, the fact is there were two
trials, as the FDA recognized when ascertaining that the evidence for the
benefit of TPA in acute stroke was quite adequate to approve the
indication. Li and colleagues also claim that our pooled analysis of under
3 hour data from intravenous TPA trials is incorrect. Once again, this is
mere assertion—they advance no actual argument—and it is wrong. In
contrast, we can easily point out how the meta-analysis in the table they
provide is itself “contrary to meta-analysis methodology and confounds
rather than promotes the truth.” Their table includes only the 4 trials
with higher mortality in the treatment group than in the placebo group and
leaves out entirely the 3 trials with lower mortality in the treatment
group than in the placebo group. This selective inclusion in a meta-
analysis of only trials with data favorable to one’s argument is a
fundamental violation of meta-analytic methodology. Across all seven
trials with available data (NINDS 1 and 2, ECASS 1 and 2, ATLANTIS A and
B, Haley 1993), death occurred in 83/479 (17.3%) of TPA treated patients
and 83/478 (17.4%) of placebo treated patients (p=0.9). Thus, the number
needed to treat to produce benefit from TPA is as low as 2, while the number
needed to treat to cause net harm approaches infinity. These numbers amply
support the statement that TPA is highly efficacious.
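The pooled mortality figures quoted above can be checked with a few lines of arithmetic. This is a crude calculation on the summed counts as given in the text (83/479 and 83/478), not a formal meta-analysis with trial-level weighting, and the variable names are illustrative.

```python
def risk(events: int, total: int) -> float:
    """Proportion of patients with the event."""
    return events / total


# Pooled death counts quoted in the text for the under 3 hour data.
tpa_deaths, tpa_total = 83, 479
placebo_deaths, placebo_total = 83, 478

tpa_risk = risk(tpa_deaths, tpa_total)
placebo_risk = risk(placebo_deaths, placebo_total)

# Negative value means lower crude mortality in the TPA group.
risk_difference = tpa_risk - placebo_risk

# Patients treated per one-death difference; very large when the risks nearly match.
patients_per_death_difference = 1.0 / abs(risk_difference)

print(f"TPA mortality:     {tpa_risk:.1%}")
print(f"Placebo mortality: {placebo_risk:.1%}")
print(f"Risk difference:   {risk_difference:+.4%}")
print(f"Patients per one-death difference: {patients_per_death_difference:.0f}")
```

On these counts the crude risk difference is a small fraction of one percentage point, so several thousand patients would need to be treated for a one-death difference in either direction, which is the sense in which the number needed to cause net harm "approaches infinity".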
Once again, we will close trying to find common ground. We concur, as
before, with calls to bar experts with major financial competing interests
from service on guideline committees and to require experts with minor
financial competing interests to disclose them publicly. We also concur
with calls to make public detailed raw data from the pivotal NINDS-TPA
trials and from the pooled analysis of all 6 major intravenous TPA trials
soon to be published. We are confident that the effect of TPA therapy,
used rightly, is robust and will stand up to detailed scrutiny. Lastly, we
reiterate our concurrence with policy statements of the Brain Attack
Coalition and the American College of Emergency Physicians that urge
emergency physicians and other acute care providers to become expert in
acute stroke care, including the use of TPA for acute stroke, or to place
their hospitals on standby and divert patients to designated stroke
centers where therapy can be expertly delivered.[4,5]
1. Biller J, Love BB, Marsh EE, et al. Spontaneous improvement after
acute ischemic stroke: a pilot study. Stroke 1990;21:1008-1012.
2. Kidwell CS, Saver JL, Starkman S. The acute stroke patient:
prehospital stroke identification and treatment. In: Cohen SN, ed.
Management of Ischemic Stroke. New York: McGraw-Hill, 2000.
3. Haley EC, Jr., Lewandowski C, Tilley BC: Myths regarding the NINDS
rt-PA Stroke Trial: setting the record straight. Ann Emerg Med
1997;30:676-682.
4. Alberts MJ, Hademenos G, Latchaw RE, Jagoda A, Marler JR, Mayberg
MR, et al. Recommendations for the establishment of primary stroke
centers. Brain Attack Coalition. JAMA 2000; 283: 3102-3109.
5. American College of Emergency Physicians Board of Directors.
Policy statement: Use of intravenous tPA for the management of acute
stroke in the emergency department. Published February 2002. Available at:
http://www.acep.org/1,5006,0.html. Last accessed 5/16/02.
Competing interests: No competing interests
Is an adjusted OR more accurately reflective of reality than an unadjusted OR
Competing interests: No competing interests