Interpreting and reporting clinical trials with results of borderline significance
BMJ 2011; 343 doi: https://doi.org/10.1136/bmj.d3340 (Published 04 July 2011) Cite this as: BMJ 2011;343:d3340
- Allan Hackshaw, deputy director1,
- Amy Kirkwood, statistician1
- 1Cancer Research UK and UCL Cancer Trials Centre, University College London, London W1T 4TJ
- Correspondence to: Allan Hackshaw
- Accepted 11 February 2011
The quality of randomised clinical trials and how they are reported have improved over time, with clearer guidelines on conduct and statistical analysis.1 Clinical trials often take several years, but interpreting the results at the end is arguably the most important activity because it influences whether a new intervention is recommended or not. Although researchers have become more familiar with medical statistics, the interpretation and reporting of results of borderline significance remains a problem. We examine the problem and recommend some solutions.
What is the problem?
New interventions used to be compared with minimal or no treatment, so researchers were looking for and finding large treatment effects. Clear recommendations were made because the P values were usually small (eg, P<0.001). However, modern interventions are usually compared with the existing standard treatment, so the effects are often expected to be smaller than before, and it is no longer as easy to get small P values. The cut-off used to indicate a real effect is widely taken as P=0.05, and results with P values below it are called statistically significant. The problem is that although P=0.05 is an arbitrary figure, many researchers still adhere strictly to it when making conclusions about an intervention, and often use it as the sole basis for those conclusions. As a result, researchers and journals sometimes conclude that there is no effect simply because the P value is above the cut-off.
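To illustrate how arbitrary a strict cut-off can be, the following minimal sketch compares two invented trials that differ by only two events in the new treatment arm. The event counts, the function name `relative_risk_summary`, and the choice of a large-sample (Wald) test on the log relative risk are all assumptions made for illustration; they are not taken from any trial or from the analysis method of this article.

```python
import math


def relative_risk_summary(events_new, n_new, events_std, n_std):
    """Return (relative risk, 95% CI lower, 95% CI upper, two-sided P value).

    Uses a Wald test on the log relative risk, a common large-sample
    approximation chosen here for illustration only.
    """
    risk_new = events_new / n_new
    risk_std = events_std / n_std
    rr = risk_new / risk_std

    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / events_new - 1 / n_new + 1 / events_std - 1 / n_std
    )

    z = math.log(rr) / se_log_rr
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))

    z_95 = 1.959964  # 97.5th percentile of the standard normal
    lower = math.exp(math.log(rr) - z_95 * se_log_rr)
    upper = math.exp(math.log(rr) + z_95 * se_log_rr)
    return rr, lower, upper, p_value


if __name__ == "__main__":
    # Hypothetical trials: (events, patients) in the new arm, then the standard arm.
    hypothetical_trials = {
        "Trial A": (60, 400, 80, 400),
        "Trial B": (58, 400, 80, 400),
    }
    for name, counts in hypothetical_trials.items():
        rr, lower, upper, p = relative_risk_summary(*counts)
        print(f"{name}: RR={rr:.3f}, 95% CI {lower:.2f} to {upper:.2f}, P={p:.3f}")
```

With these invented counts, the first trial gives a P value just above 0.05 and the second gives one just below it, although the estimated relative risks (0.75 and 0.725) and their confidence intervals are almost the same. A rule based only on the cut-off would label one trial as showing an effect and the other as showing none.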
The P=0.05 cut-off was first proposed by R A Fisher in 1925 as being low enough to make decisions, and over time has become widely adopted. However, examining interventions with P values just above 0.05 is difficult, especially if the trial is unique. It is incorrect to regard, for example, a relative risk of 0.75 with …