Rapid Responses to:

RESEARCH:
An-Wen Chan, Asbjørn Hróbjartsson, Karsten J Jørgensen, Peter C Gøtzsche, and Douglas G Altman
Discrepancies in sample size calculations and data analyses reported in randomised trials: comparison of publications with protocols
BMJ 2008; 337: a2299

Rapid Responses published:

Statistical Power for Evaluating Adverse Drug Reactions in Randomised Controlled Trials -- Transparency Needed
Chi-Tai Fang, Loreen Y.L. Huang   (27 January 2009)
Re: Statistical Power for Evaluating Adverse Drug Reactions in Randomised Controlled Trials -- Transparency Needed
ZEKRIA IBRAHIMI   (29 January 2009)
A post script--power and protocol: the dangers of manipulating data and methods
ZEKRIA IBRAHIMI   (2 March 2009)

Statistical Power for Evaluating Adverse Drug Reactions in Randomised Controlled Trials -- Transparency Needed 27 January 2009
Chi-Tai Fang,
Assistant Professor
National Taiwan University, Taipei 100, Taiwan,
Loreen Y.L. Huang

In their empirical study on the quality of conducting and reporting clinical trials, Dr. Chan and colleagues [1] found that, "when reported in publications, sample size calculations and statistical methods were often explicitly discrepant with the protocol or not pre-specified." This disturbing observation highlights the need for scrutiny of statistical inference from randomised controlled trials.

In this respect, we would like to point out another often neglected issue. It is common for clinical trial investigators to interpret the lack of a statistically significant difference in the risk of adverse events between compared treatments as proof of safety for the test drug [2,3]. However, a failure to reject the null hypothesis of no increased risk of harm can result from inadequate statistical power, incomplete or biased ascertainment of adverse events, or both [2,3]. Being designed to maximize the probability of proving efficacy rather than proving harm, most phase 3 clinical trials have no pre-specified harms-related hypotheses, and often lack both sufficient sample sizes and systematic data collection for harm-related analyses [2,4].
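The point about inadequate power can be made concrete with a rough calculation. The sketch below is illustrative only: the event rates and arm sizes are hypothetical and not taken from any trial, the function name is invented for this example, and it relies on a normal approximation to a two-sided two-proportion z-test, which becomes crude when events are very rare.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_test, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Uses the normal approximation with equal arm sizes. All inputs are
    hypothetical planning values, not estimates from any real trial.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (p_control + p_test) / 2
    # Standard error under the null (pooled) and under the alternative.
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    se_alt = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_test * (1 - p_test) / n_per_arm
    )
    delta = abs(p_test - p_control)
    # Probability of landing in either rejection region.
    upper = z.cdf((delta - z_crit * se_null) / se_alt)
    lower = z.cdf((-delta - z_crit * se_null) / se_alt)
    return upper + lower


# Hypothetical scenario: an adverse event that doubles from 1% to 2%.
for n in (500, 2000, 5000):
    print(n, round(power_two_proportions(0.01, 0.02, n), 2))
```

Under these assumed rates, a trial with a few hundred patients per arm has little chance of detecting a doubling of risk, so a non-significant difference in such a trial says very little about safety.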

To avoid erroneous claims of drug safety, standards for designing and reporting harm-related analyses in clinical trials need to be enhanced. An extension of the CONSORT statement [4] recommends that when harms are major outcomes of a trial, the investigators should explicitly pre-specify the harm-related hypotheses and the plans for collecting, analyzing, and presenting data. Common limitations in harms-related analyses should also be addressed [4]. Specifically, to ensure transparency in claims of "no increased risk" of harm for the test drug, we recommend that clinical trial reports describe the statistical power of these analyses, so that interested parties (e.g., regulatory agencies, journal reviewers and readers) can better evaluate the reliability of such claims.

Loreen Y.L. Huang, MD. Graduate Institute of Preventive Medicine, National Taiwan University College of Public Health, Taipei, Taiwan

Chi-Tai Fang, MD, PhD. Graduate Institute of Epidemiology, National Taiwan University College of Public Health, Taipei, Taiwan. E-mail: fangct@ntu.edu.tw

References

1. Chan AW, Hrobjartsson A, Jorgensen KJ, Gotzsche PC, Altman DG. Discrepancies in sample size calculations and data analyses reported in randomised trials: comparison of publications with protocols. BMJ 2008;337:a2299.

2. Liu JP. Rethinking statistical approaches to evaluating drug safety. Yonsei Med J 2007;48: 895-900.

3. Rothman KJ, Greenland S, Lash TL. Chapter 10: Precision and statistics in epidemiologic studies. In: Rothman KJ, Greenland S, Lash TL, eds. Modern Epidemiology, 3rd edn. Philadelphia: Lippincott Williams & Wilkins, 2008.

4. Ioannidis JP, Evans SJ, Gotzsche PC, et al. Better reporting of harms in randomized trials: an extension of the CONSORT statement. Ann Intern Med 2004;141:781-8.

Competing interests: None declared

Re: Statistical Power for Evaluating Adverse Drug Reactions in Randomised Controlled Trials -- Transparency Needed 29 January 2009
ZEKRIA IBRAHIMI,
psychiatric patient
Coombs Library UB1 3EU

Welding medicine with mathematics is a task that is far from convenient and easy. The 'join' between the two disciplines is always bound to be problematic.

The comments about reliance only on statistical significance (that is, alpha), without regard to sample size, which is related to power (that is, one minus beta), bring one back to the old dispute between Fisher and Neyman/Pearson (1). Fisher framed significance testing around a null hypothesis alone, which involves only the Type I error. Neyman and Pearson extended significance testing to an alternative hypothesis, which brought in the Type II error as well.

The difficulty is trying to bring the Type I error (a false positive) and the Type II error (a false negative) under one umbrella. For a given sample size and effect size, they are necessarily inversely related.

One tentative proposal would be to always hold the Type I error rate, alpha, at 0.05, then cite it in combination with an estimate of power, viz. 1 - beta. Currently in trials, the significance level cited may be 0.01 or 0.0005, etcetera, according to the particular data. Solely this figure is given, in a very bald manner, so that statistical insight is not truly facilitated. Without power reported alongside statistical significance, we have less idea about sample size and also effect size. Power is related to both effect size and sample size.
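How power moves with effect size, sample size and alpha can be sketched numerically. The example below is a minimal illustration under stated assumptions: a two-sided two-sample z-test with equal group sizes, a standardized effect size d (mean difference divided by the common standard deviation), and a function name made up for this sketch. The values of d and n are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

_Z = NormalDist()


def power_two_sample_z(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    effect_size is the standardized mean difference (an assumed planning
    value); n_per_group is the size of each of two equal groups.
    """
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = abs(effect_size) * sqrt(n_per_group / 2)
    return _Z.cdf(shift - z_crit) + _Z.cdf(-shift - z_crit)


# Alpha held at 0.05: power rises with effect size and with sample size.
for d in (0.2, 0.5, 0.8):
    for n in (20, 50, 200):
        print(f"d={d:.1f} n={n:>3} power={power_two_sample_z(d, n):.2f}")

# Same d and n, stricter alpha: power falls, so beta rises.
for a in (0.05, 0.01, 0.0005):
    print(f"alpha={a} power={power_two_sample_z(0.5, 50, alpha=a):.2f}")
```

The second loop shows the inverse relation directly: with the design fixed, tightening alpha can only be bought at the cost of a larger beta.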

Alpha without Beta is as deficient as curry without rice!

The conundrum is that we may never be able to readily fuse the Fisher and Neyman/Pearson approaches.

REFERENCES:

(1) Chong Ho Yu. History of Science and Statistical Education: Examples from Fisherian and Pearsonian schools. Paper number 300494, presented at the 2004 Joint Statistical Meeting, Toronto, Canada.

Competing interests: None declared

A post script--power and protocol: the dangers of manipulating data and methods 2 March 2009
ZEKRIA IBRAHIMI,
psychiatric patient
Coombs Library UB1 3EU

Randomization was first envisaged as a concept by R. A. Fisher, then applied to practical medicine by A. Bradford Hill (1).

It is not easy to make sure that control and treatment groups are essentially similar, apart from the different drug or other intervention being considered. Thus we have dreadfully complicated methods such as analysis of covariance to address possible nuisance factors such as age. And we also have the not always clear idea of Intention to Treat (ITT). There is some obscurity involved in ITT (2). The basic concept of ITT is to analyse patients according to their initial protocol allocations in a trial, whatever may ultimately happen to them, whether they die or shift to another treatment. Is ITT, then, randomization with an excess of rigour?

The BMJ online article by An-Wen Chan et al (3) deals with deviations in ITT between the initial protocol and the eventual publication. Statistical procedures always have an element of obscurity in them that can be exploited.

For example, we might deploy various rating scales for clinical depression in one and the same trial, according to what makes particular data seem more favourable. This manipulation would not be favourable to balanced medicine. We may be refusing to let the facts and numbers shine forth; instead, we blindfold the protocol, and select scales and methods after the data is collected.

One rapid response to the An-Wen Chan et al article referred to a particular dilemma: statistical power in harms-related trials (4). Trials usually seek to show efficacy of a drug; but a harms-related trial would be seeking the reverse, so that a Type II error (a false negative) would actually be more severe. It would declare a drug to be safe when it was actually dangerous. There would be a temptation for pharma to deliberately underpower harms-related trials, rely more on conventional Type I error thresholds, and not bother to state the alternative harms-related hypothesis.
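How easily a harms analysis ends up underpowered can be seen by comparing required sample sizes. The sketch below is an illustration under assumptions, not a description of any actual trial: the response and adverse-event rates are hypothetical, the function name is invented here, and the calculation uses the standard normal-approximation formula for comparing two proportions with equal arms.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p_control, p_test, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided two-proportion z-test.

    Normal approximation, equal allocation. The rates passed in are
    hypothetical planning values.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    p_bar = (p_control + p_test) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_test * (1 - p_test))
    ) ** 2
    return ceil(numerator / (p_test - p_control) ** 2)


# Hypothetical efficacy endpoint: failure rate falls from 30% to 20%.
efficacy_n = n_per_arm(0.30, 0.20)
# Hypothetical harm endpoint: adverse event rate rises from 1% to 2%.
harm_n = n_per_arm(0.01, 0.02)
print("per arm for efficacy:", efficacy_n)
print("per arm for harm:", harm_n)
```

With these assumed rates, a trial sized for the efficacy endpoint enrols only a fraction of the patients needed to detect the doubling in harm at the same alpha and power, which is why a stated power for the harms analysis matters.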

As Neyman and Pearson wearily concluded, it is very difficult to fit Type I and Type II errors in a single framework. Emphasizing Type I or Type II is a problematic judgement that is as much medical as mathematical. Which is less relevant in our research issue: statistical significance, or statistical power? What are we seeking? Harm, not efficacy? Those researchers who are less than scrupulous may refuse to publish figures and findings according to their own selection bias.

REFERENCES:

(1) Statistical Methods in Medical Research. P. Armitage, G. Berry, J.N.S. Matthews. Blackwell. Pg. 600.

(2) What is meant by Intention to treat Analysis? Survey of published randomised controlled trials. Sally Hollis, Fiona Campbell. BMJ 1999;319:670-674 (11 September)

(3) Discrepancies in sample size calculations and data analyses reported in randomised trials: comparison of publications with protocols. An-Wen Chan, Asbjorn Hrobjartsson, Karsten J Jorgensen, Peter C Gotzsche and Douglas G Altman. BMJ 2008; 337: a2299

(4) BMJ Rapid Response. Statistical Power for Evaluating Adverse Drug Reactions in Randomised Controlled Trials -- Transparency Needed. Chi-Tai Fang, Loreen Y.L. Huang (27 January 2009)

Competing interests: None declared