Any casualties in the clash of randomised and observational evidence?
BMJ 2001; 322 doi: https://doi.org/10.1136/bmj.322.7291.879 (Published 14 April 2001). Cite this as: BMJ 2001;322:879

Rapid responses
Sir,
We agree with Ioannidis et al (1) that more empirical evidence is
needed on the relative merits of observational studies (OS) versus
randomized studies (RS). However, we would add that less prejudgment and
passion are also needed in order to assess and interpret adequately the
results of studies comparing the two methods.
According to the present paradigm, the best available evidence is
that which is obtained from randomized trials and meta-analyses. Some
advocates of evidence-based medicine with their hierarchical ranking
schemes think that “if you find that a study was not randomized, we’d
suggest that you stop reading it and go on to the next article” (2).
“Unfortunately” for the defenders of this opinion, three recent works
found that OS did not overestimate the size of the treatment effect
compared with RS (3-5). Some of the reactions to these papers are a good
example of the kind of attitude described by T.S. Kuhn in The Structure
of Scientific Revolutions: "When new data arrive that challenge the
accepted knowledge, the scientific community prefers to refute the new
evidence rather than analyze whether the accepted theory may be wrong." For
example, the editorial (6) published in the same issue as the two
NEJM papers, after predicting the potential dangers that the results could
pose to clinical research and patients, tried to demonstrate the flaws
of the papers by suggesting possible biases in the selection of the topics
or by arguing about the lack of detailed information on the individual
studies, among other things. Of course, we are in favor of systematically
testing all kinds of hypotheses, especially if they are unexpected.
However, we were surprised by the lack of comment on the possibility that
the results of both papers could be valid, and also by the lack of mention
of similar (and consistent) work (5).
The denigration of non-experimental methods by defenders of randomized
trials, and the assumption that the two methods represent alternative
rather than complementary approaches, may be an obstacle to the adequate
evaluation of health care interventions. For a long period, the randomized
controlled trial has been enthusiastically regarded as the only method for
assessing the effects of drugs. Thirty years ago, Sir Austin Bradford Hill,
the father of the modern randomized trial, already recognized the dangers
of a blind acceptance of randomized trials and of a loss of credibility in
clinical observations (7). It is time to assess objectively the validity
of the information provided by OS, even if that assessment carries the
risk of a change in the paradigm.
José A. Sacristán, MD, Clinical Pharmacologist
Juan C. Gómez, MD, Psychiatrist
Inés Galende, MD, Clinical Pharmacologist
Members of the Spanish Group for the Study of
Methodology in Clinical Research
Address for correspondence:
José A. Sacristán, MD,
Spanish Group for the Study of Methodology in Clinical Research,
C/ Mariano José de Larra 16, 2º C,
28230 Las Rozas, Madrid, Spain.
REFERENCES
1. Ioannidis JP, Haidich AB, Lau J. Any casualties in the clash of
randomised and observational evidence? BMJ 2001; 322: 879-80.
2. Sackett DL, Richardson WS, Rosenberg W, Haynes RB. Evidence-based
medicine: how to practice and teach EBM. New York: Churchill Livingstone,
1997.
3. Benson K, Hartz A. A comparison of observational studies and
randomized, controlled trials. N Engl J Med 2000; 342: 1878-86.
4. Concato J, Shah N, Horwitz RI. Randomized, controlled trials,
observational studies, and the hierarchy of research designs. N Engl J Med
2000; 342: 1887-92.
5. Britton A, McPherson K, McKee M, Sanderson C, Black N, Bain C.
Choosing between randomized and non-randomized studies: a systematic
review. Health Technol Assess 1998; 2: 1-124.
6. Pocock SJ, Elbourne DR. Randomized trials or observational
tribulations? N Engl J Med 2000; 342: 1907-9.
7. Hill AB. Reflections on the controlled trial. Ann Rheum Dis 1966;
25: 107-13.
Competing interests: No competing interests
Dr. Gale makes a very interesting comment. Proper quantitative
methods are essential in distinguishing whether there is significant
variation between the results of randomized trials and observational
evidence, between randomized trials themselves, or between observational
studies themselves. When detected, variation could reflect either genuine
heterogeneity or bias.1 Both are very important to know, since they may
affect the translation of the research findings to clinical practice. We
agree that in both cases, a careful qualitative assessment of the evidence
is important. In some cases it may be possible to identify some gross
systematic errors, but many biases are subtle or act in opposite
directions, with a combined effect that is difficult to decipher. Quality
assessment of both randomized trials and observational designs is not
straightforward.2 Obviously, preventing bias at the study design stage is
preferable to trying to sort it out after the fact. Nevertheless, meta-
analytic approaches may be helpful even at a late stage, since they allow
us to quantitatively examine the comparative experience from other similar
research.
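As a rough illustration of the kind of quantitative examination meant here, the following is a minimal sketch with invented study values (it is not the specific method of references 1 or 2): pool the randomized and the observational estimates separately, then quantify heterogeneity within each design and the difference between the two pooled estimates.

```python
# Minimal sketch: inverse-variance pooling plus Cochran's Q and I^2 for
# heterogeneity, applied separately to randomized and observational studies.
# All log odds ratios and variances below are invented for illustration.
import math

def pool_fixed_effect(estimates, variances):
    """Inverse-variance (fixed-effect) pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return pooled, 1.0 / sum(weights)

def heterogeneity(estimates, variances):
    """Cochran's Q and I^2 (percentage of variation beyond chance)."""
    pooled, _ = pool_fixed_effect(estimates, variances)
    q = sum((y - pooled) ** 2 / v for y, v in zip(estimates, variances))
    df = len(estimates) - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2

# Hypothetical log odds ratios and their variances
randomized = ([-0.22, -0.35, -0.10], [0.04, 0.06, 0.05])
observational = ([-0.30, -0.28, -0.45, -0.15], [0.02, 0.03, 0.04, 0.03])

for label, (ys, vs) in (("randomized", randomized), ("observational", observational)):
    pooled, var = pool_fixed_effect(ys, vs)
    q, i2 = heterogeneity(ys, vs)
    print(f"{label}: pooled log OR {pooled:.3f} (SE {math.sqrt(var):.3f}), "
          f"Q = {q:.2f}, I^2 = {i2:.0f}%")

# Crude z test for a systematic difference between the two designs
(pr, vr), (po, vo) = pool_fixed_effect(*randomized), pool_fixed_effect(*observational)
z = (pr - po) / math.sqrt(vr + vo)
print(f"difference between designs: z = {z:.2f}")
```

Whether any such difference reflects genuine heterogeneity or bias still requires the qualitative scrutiny discussed above.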
John P.A. Ioannidis
Anna-Bettina Haidich
Department of Hygiene and Epidemiology, University of Ioannina School
of Medicine, Ioannina 45110, Greece
Joseph Lau
New England Medical Center, Tufts University School of Medicine,
Boston, MA 02111, USA
1. Lau J, Ioannidis JP, Schmid CH. Summing up evidence: one answer
is not always enough. Lancet 1998;351:123-7.
2. Ioannidis JP, Lau J. Can quality of clinical trials and meta-analysis
be quantified? Lancet 1998;352:590-1.
Competing interests: No competing interests
This is a highly significant editorial, building on the work that
this team have published elsewhere [1]. However, the source of the
variation cannot be determined purely by quantitative methods. A
qualitative analysis of these issues is part of a critical appraisal of
any paper, for which there is an established methodology [2].
The studies need to be reconsidered, to see if any systematic error or bias
is present within the methodology: these errors would not disappear if
measures are taken to reduce random variation, such as increasing the
numbers in the studies.
Observational studies are a good source of information about potential
biases. One would hope that this editorial will challenge all researchers
to carefully consider the question of bias when they are designing
controlled trials.
1. Ioannidis JPA, Lau J. Evolution of treatment effects over time:
empirical insight from recursive cumulative meta-analyses. Proc Natl Acad
Sci USA 2001; 98(3): 837-841.
2. Sackett DL, Straus SE, Richardson WS, Rosenberg W, Haynes RB. Evidence-based
medicine: how to practice and teach EBM. 2nd ed. Churchill Livingstone,
2000.
Competing interests: No competing interests
Not All Clinical Trials Are Created Equal
Ioannidis, Haidich, and Lau (1) present a lucid discussion of the
relative merits of distinct study designs. It would be difficult to
disagree with the sound advice they offer as a conclusion: "We need more
quantitative evidence to understand what exactly each design can tell us
and how often and why each design may go wrong". However, the phrasing of
this sentence can be interpreted in more than one way.
Is attention
restricted to when the design goes wrong by virtue of the design itself?
Or is it useful to consider also a situation in which the design may be
the best design there is, yet may still give unreliable answers, not by
virtue of the design itself, but rather by virtue of some other flaw? As
an example, it is well-known that a randomized clinical trial will often
provide the most compelling evidence for comparing two or more medical
interventions. Yet it is also true that simply conducting a clinical
trial and reporting the results in no way guarantees the reliability of
these results. All that is guaranteed is that the biases inherent in a
study which is not randomized would not be a factor in this particular
randomized study. To eliminate or minimize all biases, a concerted effort
would still be needed to both enumerate and attempt to prevent the
potential biases that could occur in a randomized clinical trial. Phrased
differently, not all clinical trials are created (designed) equally. Some
are subject to selection bias through a lack of allocation concealment
(2). Others exclude patients from the analyses in an inappropriate way
(3). Still others use analyses that depend on unreasonable assumptions,
such as normality (4).
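To make the last point concrete, the following minimal sketch (with invented outcome values) shows a permutation test, which bases inference on re-randomization of the observed data rather than on a normality assumption.

```python
# Minimal sketch: exact two-sample permutation test of a mean difference.
# The outcome values are invented; inference comes from re-labelling the
# pooled outcomes in every possible way, not from a normal model.
from itertools import combinations

treatment = [4.1, 5.3, 6.0, 5.8, 4.9]
control = [3.2, 4.0, 4.4, 3.9, 4.6]

mean = lambda xs: sum(xs) / len(xs)
observed = mean(treatment) - mean(control)

pooled = treatment + control
n, k = len(pooled), len(treatment)
extreme = total = 0
for idx in combinations(range(n), k):          # every possible "treatment" labelling
    t = [pooled[i] for i in idx]
    c = [pooled[i] for i in range(n) if i not in idx]
    if abs(mean(t) - mean(c)) >= abs(observed) - 1e-12:
        extreme += 1
    total += 1

print(f"observed difference: {observed:.2f}")
print(f"two-sided permutation p-value: {extreme / total:.4f}")
```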
To make matters worse, it is often impossible, for
a variety of reasons, to determine how reliable a given study is. For
example, one cannot conclude that a study was masked, or intent-to-treat,
just because a statement to this effect appears in the publication (3, 5).
In fact, I have even seen published studies that claim to be randomized
yet used a deterministic (perhaps alternating) patient allocation scheme
(although confidentiality prohibits me from providing references).
Perhaps the first step towards remedying these problems would be diligent
documentation of the claims. How exactly was a study randomized? What
set of potential allocation sequences was used? Did each sequence in this
list have the same probability of being selected? Which sequence was
observed and used for the design? How many patients were randomized?
What steps were taken to ensure both masking and allocation concealment?
Was there evidence (from a formal analysis) that no selection bias was
present (6)? Which patients are included in the analyses? What were the
prospectively stated primary endpoint and analyses? What assumptions
underlie the validity of the analyses? Why were these assumptions made?
Were these assumptions checked? What were the results?
Having the
answers to these and other questions provided [perhaps in an accompanying
web site (7) if journal space is limited] would go a long way towards
reinstating confidence that clinical trials provide reliable results, as
opposed to results that are simply less unreliable than those of other
designs.
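As one hedged illustration of the documentation asked for above, the sketch below makes the set of potential allocation sequences explicit for a hypothetical permuted-block scheme (it is not the method of any particular trial):

```python
# Minimal sketch: permuted blocks of size 4 with a 1:1 ratio. Every distinct
# ordering of AABB is listed, and each block of the schedule is drawn from
# that list with equal probability. Purely illustrative.
from itertools import permutations
import random

block = ("A", "A", "B", "B")                    # two allocations per arm in each block
sequences = sorted(set(permutations(block)))    # the 6 distinct within-block orderings
for s in sequences:
    print("".join(s))

rng = random.Random(0)                          # fixed seed so the draw can be documented
schedule = [arm for _ in range(3) for arm in rng.choice(sequences)]
print("first 12 allocations:", "".join(schedule))
```

Documenting the list of sequences, their selection probabilities, and the realized sequence in this way would answer several of the questions raised above.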
1. Ioannidis JPA, Haidich AB, and Lau J. Any casualties in the clash
of randomised and observational evidence? British Medical Journal 2001;
322(7291):879-880.
2. Schulz KF. Unbiased research and the human spirit: The
challenges of randomized controlled trials. Can Med Assoc J 1995;
153:783-786.
3. Hollis S and Campbell F. What is meant by intention to treat
analysis? Survey of published randomised controlled trials. British
Medical Journal 1999; 319:670-674.
4. Berger VW. Pros and cons of permutation tests in clinical trials.
Statistics in Medicine 2000; 19:1319-1328.
5. Ney PG. Double-blinding in clinical trials. Canadian Medical
Association Journal 1989; 140:15.
6. Berger VW, Exner DV. Detecting selection bias in randomized
clinical trials. Controlled Clinical Trials 1999; 20: 319-327.
7. Hutchon DJR. Publishing raw data and real time statistical
analysis on e-journals. British Medical Journal 2001; 322:530.
Competing interests: No competing interests