Rapid Responses to:
Rapid Responses published:
Stephen J Senn, Professor of Statistics University of Glasgow, Glasgow, G12 8QQ
According to Fergusson et al (British Medical Journal, 21 February), they “consider a trial to be double-blind when the patient, investigators, and outcome assessors are unaware of the patient's assigned treatment throughout the conduct of the trial”. They are quite wrong to do so. The whole point of a successful double-blind trial is that there should be un-blinding through efficacy. That is to say that there should be no incidental reasons, apart from efficacy, as to why the treatments are distinguishable but that the treatments should reveal themselves through efficacy. If the treatments are not distinguishable at all, then the treatments have not been proved different. The classic description of a blinded experiment is Fisher’s account of a lady tasting tea to distinguish which cups have had milk in first and which cups have had tea in first, in support of her claim that the taste will be different (1). Here the efficacy of the treatment, order of milk or tea, is ‘taste’ and the lady’s task is to distinguish efficacy. Fisher describes the steps, in particular randomisation, that must be taken to make sure that the lady is blind to the treatment (2). But if he were to adopt the point of view of Fergusson et al, there would be no point in running the trial, since if the lady distinguishes the cups, the trial will be declared inadequate, as she has clearly not been blind throughout the trial. Of course, in a parallel group trial, the patients only have one treatment. But not unreasonably in such a trial every patient who has had a good outcome might guess, with no other grounds to go on than outcome, that she was on active treatment, and every patient with an unsatisfactory outcome might guess that she was on placebo. If the treatment is effective, the guesses will distribute unequally between the arms of the trial and the trial will then be declared ‘not blind’. Fergusson et al appear to have a blinkered view of blinding. Their proposals are illogical and need re-thinking.
References
1. Fisher RA. The Design of Experiments. In: Bennett JH, editor. Statistical Methods, Experimental Design and Scientific Inference. Oxford: Oxford, 1935.
2. Senn SJ. Dicing with Death. Cambridge: Cambridge University Press, 2003.
Competing interests: The author is a consultant to the pharmaceutical industry
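Senn's parallel-group argument can be illustrated with a small simulation. The sketch below is not taken from the letter: the response rates, arm sizes, and the function name `simulate_guesses` are illustrative assumptions. Blinding is perfect by construction, and each simulated patient guesses "active" after a good outcome and "placebo" otherwise.

```python
import random


def simulate_guesses(p_good_active, p_good_placebo, n_per_arm=5000, seed=0):
    """Share of patients in each arm who guess 'active'.

    Assumption: patients have no information except their own outcome,
    so a good outcome leads to the guess 'active' and a poor outcome
    leads to the guess 'placebo'. All numbers are hypothetical.
    """
    rng = random.Random(seed)
    share_guessing_active = {}
    for arm, p_good in (("active", p_good_active), ("placebo", p_good_placebo)):
        # Count good outcomes; each one produces a guess of 'active'.
        good = sum(rng.random() < p_good for _ in range(n_per_arm))
        share_guessing_active[arm] = good / n_per_arm
    return share_guessing_active


# Hypothetical response rates: an effective drug versus no effect at all.
effective = simulate_guesses(p_good_active=0.6, p_good_placebo=0.4)
null = simulate_guesses(p_good_active=0.4, p_good_placebo=0.4)
print("effective treatment:", effective)
print("no treatment effect:", null)
```

Under these assumptions the guesses split unequally between the arms only when the treatment works, so an end-of-trial guessing test would flag the effective trial as "not blind" even though nothing but efficacy was revealed.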
David L. Sackett, Director Kilgore Trout Research & Education Centre at Irish Lake, RR 1, Markdale, Ontario, Canada N0C 1H0
Blindness is important, for all the reasons Dean Fergusson and his colleagues present in their paper. However, asking patients or their clinicians at the end of a trial which drug they think they were taking confounds the success of blinding with hunches about efficacy. When patients or their study clinicians have a hunch about which treatment is superior, patients who have done well will tend to think they were on that treatment, and so will their clinicians. My colleagues and I discovered this phenomenon (to our chagrin) when we were the first group to test aspirin and sulfinpyrazone in the hope that one or both of these drugs might prevent major and fatal strokes in patients with transient ischemic attacks (ref 1). In those early days, we shared the hunch that sulfinpyrazone was probably efficacious, but that aspirin probably wasn't (we did the trial because we were uncertain about these hunches, not indifferent about them). As it happened, our pre-trial hunches were wrong: aspirin turned out to be highly efficacious in our trial, and sulfinpyrazone worthless. At the end of our trial we asked study clinicians to predict which drug each of their patients had been assigned, thinking that we were measuring whether blindness had been successful during the trial. To our confusion, their predictions were statistically significantly WRONG. With 4 regimens in this "double-dummy" trial, we'd expect correct predictions for 25% of patients; our clinicians' predictions were correct for only 18% of them. Our confusion lifted when we thought through the effect of our prior hunches about efficacy. When our patients had done well, their clinicians tended to predict that they had received sulfinpyrazone; when patients had suffered strokes, these same clinicians tended to predict that they had received aspirin or the double-placebo. But what if our pre-trial hunches about efficacy had been correct? 
If patients who had done well were predicted to have received aspirin, and those who had done poorly were predicted to have received sulfinpyrazone or the double-placebo, our end-of-study test for blindness would have led to the incorrect conclusion that blinding was unsuccessful. I'm not smart enough to be able to look at an end-of-study test for blindness and distinguish unsuccessful blinding from correct hunches about efficacy. I hope somebody is. In the meanwhile, both here and in prior personal correspondence, I've encouraged Dean Fergusson and his colleagues to reconsider their study's interpretations and recommendations. To the extent that patients and clinicians were correct in their hunches about the comparative efficacy of the treatment arms in the trials they examined, they would draw the incorrect conclusion that blinding had been unsuccessful, even when it was completely successful. My colleagues and I vigorously test for blindness before our trials, but not during them and never at their conclusion.
Ref 1: The Canadian Cooperative Study Group. A randomized trial of aspirin and sulfinpyrazone in threatened stroke. N Engl J Med 1978;299:53-9.
Competing interests: please see: bmj.com/cgi/content/full/324/7336/539/DC1
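The arithmetic behind "statistically significantly WRONG" can be sketched as follows. The 25% chance rate and the 18% observed rate come from the letter; the number of patients is not given there, so `n_patients = 400` is a purely hypothetical value chosen for illustration, and the helper name is likewise an assumption.

```python
from math import comb


def binomial_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n_patients = 400       # hypothetical; the letter does not state the sample size
chance_rate = 0.25     # four regimens, so 1 in 4 correct by chance (from the letter)
observed_rate = 0.18   # share of correct predictions reported in the letter
n_correct = round(observed_rate * n_patients)

# Probability of doing this badly, or worse, if clinicians were guessing at random.
p_lower = binomial_lower_tail(n_correct, n_patients, chance_rate)
print(f"P(at most {n_correct} of {n_patients} correct by chance) = {p_lower:.5f}")
```

With a sample of this assumed size, a correct-prediction rate of 18% lies well below what random guessing would produce, which fits the letter's point: the predictions tracked the clinicians' mistaken hunches about efficacy, not the allocation.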
D B Double, Consultant Psychiatrist Norfolk Mental Health Care NHS Trust, Norwich NR6 5BE
Fergusson et al have usefully highlighted the problem of unblinding in clinical trials (1). They propose that item 11b of the CONSORT statement (2) should be revised to make assessment of blinding a requirement. Their data suggest that more often than not blinding in clinical trials is compromised. In the circumstances, merely reporting whether a trial is blinded may be insufficient. In fact, there is bias in the wording of item 11b of the CONSORT statement. The item in the checklist is "If done, how the success of blinding was evaluated". In other words, it seems to imply that assessment of blinding will confirm its validity. We need to change our mindset to whether it matters if and when the blind is broken. Reporting of the assessment of blinding is uncommon. Correlation of outcome measures with degree of unblinding is even less common and was only done in a few of the 15 trials reported by Fergusson et al. For example, Sackeim et al found a robust association between relapse status and patients’ guesses (3). As pointed out by Fergusson et al, the direction of the causality of this association may be difficult to ascertain. After all, if treatment is obviously effective it is not technically possible to perform a trial double-blind. This may be the explanation of the association found by Sackeim et al. On the other hand, it could be a reflection of bias introduced through unblinding. How we evaluate the efficacy of active treatment is unclear. Breaking of the double-blinding has been interpreted as the explanation for a positive trial result (4). Why should this not be the case in more trials which conclude that active treatment is effective? Fergusson et al’s suggested minimum set of information for the assessment of blinding includes counts of the correctness of patients’ guesses. As they note, this is particularly beneficial in trials with subjective outcomes or outcomes reported by patients.
However, raters’ guesses may be more important in trials where the outcome measures are dependent on such scores. It is possible for patients to remain blind, but for raters still to be unblinded, leading to significant correlation with outcome measures (5).
Competing interests: None declared
Douglas G. Altman, Professor of Statistics in Medicine Cancer Research UK/NHS Centre for Statistics in Medicine, Oxford OX3 7LF, Kenneth F Schulz, David Moher
Reports of randomised trials should state clearly whether or not blinding was attempted, and if so who was blinded and how this was done.[1] Fergusson and colleagues present interesting findings regarding the important question of whether blinding was effective.[2] As they say, blinding may be ineffective in some trials, making them less sound methodologically than they appear to be. However, asking trial participants (or caregivers) to try to identify the treatment received runs the risk that the ability to guess the treatment received might well be influenced by outcome. For example, the presence of adverse symptoms will surely influence participants asked which treatment they think they were on, especially as the possibility of such symptoms would surely have been mentioned as part of the informed consent procedure.[3] In this case some apparent loss of blinding is likely and should not be considered necessarily to be a weakness of the trial. We might expect to see an apparent breaking of the blind more often in trials where there was a marked treatment effect, for either an intended outcome or adverse effect. Indeed, end-of-trial tests of blindness might actually be tests of hunches for adverse effects or efficacy.[4] Assessments of blinding would clearly be much more reliable in trials where they can be carried out before the clinical outcome has been determined. Furthermore, individuals may camouflage unblinding efforts. If someone deciphers assignments, they may provide responses contrary to their deciphering findings to disguise their unblinding actions.[4] That difficulty and the aforementioned interpretational difficulties lead us to question the usefulness of blinding tests in many circumstances. The CONSORT Statement does not say that “the success of blinding is to be reported in the publication.” Rather, it recommends reporting the findings of an assessment of blinding if it was done.[5]
Fergusson and colleagues suggest that the CONSORT Statement should be amended to suggest that assessment of blinding should be done routinely. We are not convinced that all trialists should carry out such an exercise. Further, we do not agree that the CONSORT Statement should be modified as suggested. CONSORT is a set of reporting recommendations – it does not make statements on how trials should be done, but asks that what was done should be fully and accurately reported.
1 Moher D, Schulz KF, Altman DG for the CONSORT Group. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Lancet 2001;357:1191–4.
2 Fergusson D, Glass KC, Waring D, Shapiro S. Turning a blind eye: the success of blinding reported in a random sample of randomised, placebo controlled trials. BMJ 2004;328:432-4.
3 Schulz KF, Chalmers I, Altman DG. The landscape and lexicon of blinding in randomized trials. Ann Intern Med 2002;136:254-259.
4 Schulz KF, Grimes DA. Blinding in randomised trials: hiding who got what. Lancet 2002;359:696-700.
5 Altman DG, Schulz KF, Moher D, Egger M, Davidoff F, Elbourne D, et al. The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med 2001;134:663-94.
Douglas G Altman, Cancer Research UK Medical Statistics Group, Centre for Statistics in Medicine, Old Road Campus, Oxford
Kenneth F Schulz, Family Health International, Research Triangle Park, NC, USA
David Moher, University of Ottawa, Chalmers Research Group, Ottawa, Ontario, Canada
Competing interests: The authors are the organisers of the CONSORT Group
Michael Goodyear, Assistant Professor Dept. Medicine, Dalhousie University, 1278, Tower Road, Halifax, Nova Scotia, CANADA B3H 2Y9.
The principle of blinding has become entrenched as a manoeuvre to minimise bias in comparative trials (1, 2). An inherent part of that bias lies in the a priori expectations of the investigators. To deny such expectations is unrealistic since they underpin a reasonable expectation of equipoise that makes asking the question both necessary and possible. Blinding is but one way of minimising bias and we would be less than honest if we did not admit that all trials contain bias. This in turn necessitates a determination of the direction and magnitude of the biases and their possible impact on the interpretation of the outcome. Fergusson and colleagues (British Medical Journal, 328: 432-4, 21 February 2004) (3) ask an important question, namely how effective is blinding? However they may be overstating the position when they say that if blinding is ineffective then the protections are lost, in that the efficacy of blinding is unlikely to be a binary process, and the impact of blinding only partly offsets the actual effect size. Thus in a large randomised trial that demonstrates clinically significant and biologically plausible differences in outcomes, some imperfections in blinding are unlikely to fully explain the variance in outcome. The degree of blinding does however add to the credibility of the reported outcome by reducing opportunities for bias to become manifest. Nevertheless if we advocate blinding it is important that we also ask whether the effectiveness can be measured and if so whether differences in that effectiveness can explain variance in the observed outcomes. Since no validated method for measuring the effectiveness of blinding has yet been developed it is perhaps not surprising that Fergusson et al. found the literature wanting in this regard. Sackett (4) raises the interesting and valid question as to whether testing for effectiveness is merely a surrogate for testing for a priori expectations.
Therefore if the effectiveness of blinding is to be measured it may be important to measure such expectations as explanatory variables. Whether the effectiveness can be reliably measured or not, the steps taken to ensure blinding should be reported as a minimum. Bias can be applied at several levels in a trial. These include participant and investigator expectation, outcome evaluation, analysis and interpretation. Each of these steps is potentially subject to estimation of the impact of potential bias. The lexicography of this has been well described (5,6). All commentators to date (7, 4, 8, 9) have pointed out that attempting to measure the effectiveness of blinding is essentially confounded by outcome, whether this be the planned outcome or unintended adverse events. Any attempt to develop a methodology for measuring the success or effectiveness of blinding would need to include not only the a priori expectations but also the effects of chance. Only two trials in Fergusson’s survey used kappa to take this into account. Other considerations are whose estimate of the allocation is being sought and the timing of the assessment, such as whether this was performed prior to the onset of the intended outcome or not. In many cases completely effective blinding is very difficult due to known differences in adverse event profiles. Furthermore participant expectation can be unconsciously reinforced by differing descriptions in the consent form, which itself could be subject to blinding. Thus it is always encouraging to note that in reported results some adverse events or dropouts can be higher amongst placebo patients than amongst those on active medication. While it is difficult to remove all effects of both participant and investigator expectation, it is possible to isolate these factors from evaluation. Independent outcome evaluation can usually be blinded more successfully than assessments and interactions between participants and investigators in the clinic.
Outcome evaluation can be blinded not only to the actual assignments but also to the nature of the investigation itself. For instance radiologists reading films can be potentially blinded to the whole trial and be asked merely to report on the presence or absence of a feature on an individual assessment or to compare pairs of films. A special case is participant reported outcome such as Fisher’s classic taste test, as mentioned by Senn (7). Whether blinding can be effectively measured is perhaps not the whole question. We should be cautious about implying that failure to unequivocally establish perfect blinding invalidates the interpretation of the data. In the land of the unblind, even the partially blinded may be king.
References
1. Sackett D, Haynes R, Guyatt G, Tugwell P: Clinical epidemiology. A basic science for clinical medicine. Little Brown, Boston. 2nd Ed. 1991.
2. Altman DG, Schulz KF, Moher D, Egger M, Davidoff F, Elbourne D, et al. The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med 2001;134:663-94.
3. BMJ, doi:10.1136/bmj.37952.631667.EE (published 22 January 2004)
4. Sackett DL: Why we don't test for blindness at the end of our trials. BMJ Rapid Response February 20 2004.
5. Schulz KF, Chalmers I, Altman DG. The landscape and lexicon of blinding in randomized trials. Ann Intern Med 2002;136:254-259.
6. Schulz KF, Grimes DA. Blinding in randomised trials: hiding who got what. Lancet 2002;359:696-700.
7. Senn SJ: A blinkered view of blinding. BMJ Rapid Response February 20 2004.
8. Double DB: Changing the mindset about unblinding in clinical trials. BMJ Rapid Response February 21 2004.
9. Altman DG et al.: Testing the success of blinding and the CONSORT statement. BMJ Rapid Response February 21 2004.
Competing interests: Until today I believed that the robustness of the effect size could be further validated by establishing the effectiveness of the blind.
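The letter's remark about kappa can be made concrete. The following is a minimal sketch of Cohen's kappa applied to a table of actual allocation against guessed allocation; the counts are invented for illustration and do not come from any trial in the survey.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square table of counts.

    Rows are the actual allocation and columns the guessed allocation,
    listed in the same category order.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: share of guesses on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: rows = (active, placebo), columns = guessed (active, placebo).
guesses = [
    [30, 20],
    [20, 30],
]
print(f"kappa = {cohen_kappa(guesses):.2f}")
```

Kappa removes the agreement expected from chance alone. It does not remove agreement produced by outcome-driven hunches, which is the confounding that the commentators cited above describe.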
Stan Shapiro, Professor Department of Epidemiology and Biostatistics, McGill University, 1020 Pine Ave Montreal QC H3A 1A2, Dean Fergusson, Kathleen Cranley Glass, and Duff Waring
Senn's and Sackett's responses seem to reflect a rather narrow point of view. Both correctly note that there is likely to be a strong relationship between an individual's improvement, real or perceived, and a subsequent guess that they are receiving active treatment. That is why assessing unblinding by simply examining the proportion of correct 'guesses' is not a particularly good choice. However, to argue that we should therefore not attempt to assess whether blinding has been maintained is an even poorer choice. Our paper examines how trialists report on blinding, not only with regard to outcome, but also with regard to process. To be charitable, one might categorize the result as 'not very well'. Since the claim of assay sensitivity for trials with a placebo arm rests on the assumption of appropriate blinding, we do not have the luxury of continuing to avoid the challenging measurement issues involved. It seems contrary to an evidence based approach to avoid obtaining data because we have to struggle with its interpretation. Asking individuals to provide not only their 'guesses' but also the reasons for them may help the process. A variety of additional approaches need to be explored. We applaud the aim of CONSORT and understand Altman and colleagues' reticence to be prescriptive with regard to trial conduct. However, the CONSORT statement does ask trialists to report not only the method used to generate a random allocation sequence, but also the details of its implementation and concealment. Similarly, we feel that there is room and need for better attention and guidance with regard to the reporting of blinding. Caveats regarding the evaluation of new therapies via non-inferiority trials (1,2) are certainly appropriate. However, methodological caveats with regard to placebo controls need similar attention, and it appears that this is being ignored (2). As we note, and Double echoes, the more subjective the outcome the greater the concern.
We believe that licensure of an ineffective agent (a new anti-depressant for example) based on studies with poorly designed and executed placebo controls should be considered as serious an error as licensure on the basis of poorly designed and executed non-inferiority trials. Since details about maintenance of blinding are not being routinely sought, that symmetry appears to be lacking. We think this needs to change.
References
(1) D'Agostino RB Sr. Non-inferiority trials: advances in concepts and methodology. Stat Med 2003;22:166-167.
(2) International Conference on Harmonization of Technical Requirements for the Registration of Pharmaceuticals for Human Uses (ICH). Choice of control group and related issues in clinical trials (E-10).
Competing interests: None declared
D B Double, Consultant Psychiatrist Norfolk Mental Health Care NHS Trust, Norwich NR6 5BE
Other rapid responses to Fergusson et al apart from my own (and Stan Shapiro's rejoinder) seem to have produced a positive gloss on the problem of unblinding in clinical trials. The general thought seems to be that measuring unblinding is difficult, so we may as well give up and carry on with our pretence. This may be to continue "turning a blind eye", as used in the phrase in the title of the original paper. I may be too sceptical but I continue to wonder whether the small effect size in many clinical trials could be totally explained by bias introduced through unblinding. The measured degree of unblinding by guesses at the end of the trial may be greater than would be expected from correct hunches about efficacy. Like David Sackett, I do not think I am clever enough without help to distinguish true unblinding from correct hunches about efficacy, but the advantage of rapid responses means I can share my thoughts without having worked them through properly. I think it may be possible to measure what the degree of unblinding should be from correct hunches from efficacy based on effect size, and if the actual degree of unblinding with correct guesses is significantly greater than this, it would surely imply that bias had been introduced. I am not sure if this makes sense, but I am reluctant to leave the issue and be as negative about the implications as some of my fellow rapid responders. For example, the small effect size in meta-analyses of antidepressant trials should be more widely known (1). Psychological variables, of course, may be particularly susceptible to bias. However, even trials containing hard end-points like mortality often have very small differences between active treatment and controls, and their statistical significance is heightened by the large scale of the trials (2). In psychiatric trials the outcome is commonly determined by raters rather than patients themselves.
Patients may not necessarily be very good at determining the presence of placebo or active treatment. To give another example, the evidence is that patients may not be aware that they are taking lithium, but observers seem to be able to detect it in one way or another (3). If raters are able to be cued in to whether patients are receiving active or placebo treatment, their wish fulfilling expectancies could be affecting outcome ratings. How do we know that small effect sizes in particular are not due to this amplified placebo effect? I think we should stop turning a blind eye to this legitimate question. It does need to be answered to give confidence about the use of many medications that are endorsed in clinical practice.
Competing interests: None declared |
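One way the comparison proposed in this letter might be set up is sketched below. It is only an illustration under strong assumptions that the letter does not make: outcomes are dichotomised into responders and non-responders, a patient relying on efficacy alone guesses "active" exactly when they respond, and the response rates are treated as known. All numbers and function names are hypothetical.

```python
from math import comb


def expected_correct_rate(resp_active, resp_placebo, frac_active=0.5):
    """Share of correct guesses if guesses follow outcome alone.

    Responders guess 'active' and non-responders guess 'placebo', so the
    correct guesses are responders on active plus non-responders on placebo.
    """
    return frac_active * resp_active + (1 - frac_active) * (1 - resp_placebo)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical trial: small effect, 1:1 allocation.
resp_active, resp_placebo = 0.5, 0.4
n_guessers, n_correct = 200, 140   # hypothetical: 70% guessed correctly

expected = expected_correct_rate(resp_active, resp_placebo)
p_excess = binomial_upper_tail(n_correct, n_guessers, expected)
print(f"expected correct from efficacy alone: {expected:.2f}")
print(f"P(at least {n_correct} correct given that rate) = {p_excess:.6f}")
```

In this invented example the observed share of correct guesses is well above what efficacy-driven hunches would produce, which is the pattern the letter suggests would point to unblinding by other routes. One caveat on the sketch: if unblinding has already inflated the measured response rates, the benchmark itself is biased, so the comparison is at best a rough screen.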
Santosh K Chaturvedi, Consultant Psychiatrist North Staffs Combined Healthcare, ST6 5UD
Dear Editor, I read with some amusement the article (1) on blinding and the assessment of the success of blinding. The authors question the current lack of reporting on the success of blinding in randomised placebo controlled trials. I am sure the authors expect the assessment of success of blinding to be done by a blind, unbiased method, and the evaluation of the blind, unbiased method for testing the success of blinding to be done by blind procedures, and so on...
1. Fergusson D, Glass KC, Waring D, Shapiro S. Turning a blind eye: the success of blinding reported in a random sample of randomised placebo controlled trials. BMJ 2004; 328, 432-434, February 21.
Professor SK Chaturvedi, MD
Competing interests: None declared
Andrei SP Brennan, Medical Research Assistant Royal Victoria Hospital, Clinical Epidemiology, R3.35, 87 Pine Ave W, Montreal, QC, Canada, H3A-1A1, JAC Delaney and Caroline Hebert
Fergusson et al bring up an excellent point about the need to formally assess blinding in clinical trials. Blinding is a key component of a clinical trial and, along with randomization, is responsible for the lower susceptibility of a clinical trial to bias when compared to an observational study. When blinding is not properly accounted for, there is the risk of methodological problems compromising the results. This issue was recently discussed by Garbe and Suissa, who noted the possibility of detection bias in the Women’s Health Initiative (WHI) clinical trial (1). WHI is an excellent case where concerns with the effectiveness of the blinding can lead to confusion about the results. There was a partial report in the study on the high rate of unblinding (25% of subjects were unblinded to a clinic gynaecologist) (2). Of special concern was that the unblinding was differential (40% of women in the treatment group vs. 7% in the placebo group) (2). This differential unblinding experienced by gynaecologists creates concerns that study subjects or investigators could have also experienced differential unblinding (1). However, without a formal assessment of the blinding status of participants and investigators it is impossible to assess the magnitude of this bias. Properly assessing the effectiveness of blinding is an important step forward in being able to better interpret the results of clinical trials. Adopting the recommendations of Fergusson et al would be a solid step forward in improving research using clinical trials.
Andrei SP Brennan, MA, Medical Research Assistant
JAC Delaney, MSc, MA, Statistician
Caroline Hebert, MSc, PhD Candidate (Epidemiology)
1. Garbe E and Suissa S. Issues to debate on the Women’s Health Initiative (WHI) study. Human Reproduction 2004; 19(1): 8-13.
2. Writing Group for the Women's Health Initiative Investigators. Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results from the Women's Health Initiative randomized controlled trial. JAMA 2002;288:321-33.
Competing interests: None declared
Raywat S. Deonandan, Epidemiologist Chalmers Research Group, Children's Hospital of Eastern Ontario Research Institute, Nicholas J. Barrowman
While the proposal by Fergusson et al (1), regarding post hoc assessment of the success of blinding in RCTs, is problematic, it does suggest an intriguing dimension of the so-called placebo effect. If the intervention being studied in the RCT is truly more efficacious than placebo, then, as noted in several of the response letters (2,3,4), blinding can be invalidated through patients correctly guessing their allocation by gauging their individual responses to treatment (leaving aside the issue of side effects). In such a scenario, improvement due to the treatment intervention is perhaps compounded by further improvement brought on through the expectation of improvement, once blinding has been effectively broken. This effect might be most pertinent in trials of depression drugs, where continuous subjectively-assessed outcomes are typically used and where placebo effects may therefore occur (5). A patient randomized to receive the drug begins to feel better, correctly guesses his or her allocation, then possibly benefits further from the very expectation of improvement. In short, the positive effects of expectation might be more pronounced in the treatment group and less pronounced in the placebo group. The individual assessing outcomes might be similarly affected, since he or she might also correctly guess the patient's allocation. In studies with serial assessments, future assessments could then be biased. For treatments that show dramatic results, RCTs may overestimate the actual size of the treatment effect relative to placebo. It may then be more appropriate to speak of testing the whole treatment experience, rather than purporting to test the specific effectiveness of the intervention alone.
References
1. Fergusson D, Glass KC, Waring D, Shapiro S. Turning a blind eye: the success of blinding reported in a random sample of randomised, placebo controlled trials. BMJ 2004; 328:432.
2. Altman DG, Schulz KF, Moher D. Turning a blind eye: testing the success of blinding and the CONSORT statement. BMJ. 2004;328:1135.
3. Senn SJ. Turning a blind eye: authors have blinkered view of blinding. BMJ. 2004; 328:1135-6.
4. Sackett DL. Turning a blind eye: why we don't test for blindness at the end of our trials. BMJ. 2004; 328:1136.
5. Hrobjartsson A, Gotzsche PC. Is the placebo powerless? An analysis of clinical trials comparing placebo with no treatment. N Engl J Med 2001; 344:1594-1602.
Competing interests: None declared
Jafar Kolahi, Odopharma research nucleus, Isfahan Science and Technology Town, Isfahan, Iran. No 10. Sayt 180. Shahin Shahr. Isfahan. Iran. Co 83188-65161
We read this article with great interest. In the methods section the authors state that "Our Medline search used publication type "randomised controlled trial" and the MeSH term "placebo-controlled" to identify placebo controlled randomised trials". Yet a simple search of the MeSH Database shows that "placebo-controlled" is not a MeSH term.
Competing interests: None declared