Are these data real? Statistical methods for the detection of data fabrication in clinical trials
BMJ 2005; 331 doi: https://doi.org/10.1136/bmj.331.7511.267 (Published 28 July 2005) Cite this as: BMJ 2005;331:267
- Sanaa Al-Marzouki1, research student,
- Stephen Evans, professor of pharmacoepidemiology, Medical Statistics Unit1,
- Tom Marshall, senior lecturer in medical statistics1,
- Ian Roberts, professor of epidemiology and public health1
- 1 Department of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London WC1E 7HT
- Correspondence to: S Evans
- Accepted 15 July 2005
Objectives To test the application of statistical methods to detect data fabrication in a clinical trial.
Setting Data from two clinical trials: a trial of a dietary intervention for cardiovascular disease and a trial of a drug intervention for the same problem.
Outcome measures Baseline comparisons of means and variances of cardiovascular risk factors; digit preference overall and its pattern by group.
Results In the dietary intervention trial, variances for 16 of the 22 variables available at baseline were significantly different between the groups, and 10 significant differences were seen in means for these variables. Some of these P values were extraordinarily small. Distributions of the final recorded digit were significantly different between the intervention and the control group at baseline for 14 of the 22 variables in the dietary trial. In the drug trial, only five variables were available, and no significant differences between the groups were seen in baseline means, variances, or digit preference.
Conclusions Several statistical features of the data from the dietary trial are so strongly suggestive of data fabrication that no other explanation is likely.
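As an illustration of the kinds of checks named under Outcome measures, the following is a minimal Python sketch and not the authors' code. The abstract does not state which tests were used; the sketch assumes a Welch t statistic for means, a simple ratio of sample variances, and a Pearson chi-square statistic on a 2 x 10 table of final digits by group. All function names are hypothetical, and values are passed as strings for the digit check so that trailing zeros are kept as recorded.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch t statistic for the difference in baseline means (assumed test)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def variance_ratio(group_a, group_b):
    """Ratio of the larger to the smaller sample variance (always >= 1)."""
    va, vb = variance(group_a), variance(group_b)
    return max(va, vb) / min(va, vb)


def final_digit(recorded):
    """Final recorded digit of a value given as a string, e.g. '5.0' -> 0."""
    return int(str(recorded).strip()[-1])


def digit_chi_square(group_a, group_b):
    """Pearson chi-square statistic and degrees of freedom for the table
    of final digit (columns) by trial group (rows)."""
    counts = [Counter(final_digit(v) for v in g) for g in (group_a, group_b)]
    row_totals = [sum(c.values()) for c in counts]
    n = sum(row_totals)
    stat = 0.0
    digits_seen = 0
    for d in range(10):
        col_total = counts[0][d] + counts[1][d]
        if col_total == 0:
            continue  # digit never recorded in either group
        digits_seen += 1
        for c, row_total in zip(counts, row_totals):
            expected = row_total * col_total / n
            stat += (c[d] - expected) ** 2 / expected
    return stat, digits_seen - 1
```

Turning these statistics into P values requires the relevant reference distributions. For example, when all ten digits occur, the digit table has 9 degrees of freedom, for which the 5% critical value of chi-square is about 16.92.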
See also p 281, and Editorial by Smith and Godlee
We thank Tom Meade, who, on behalf of the Medical Research Council, provided the data for the drug trial, and Richard Smith for his encouragement to examine further the data from the diet trial. The BMJ provided the data from the diet trial, which the original author had supplied for further investigation.
Contributors SE and SAM had the ideas for the analysis, and SAM, SE, TM, and IR all contributed to the planning, conduct, and writing of the paper. SAM planned and carried out the statistical analyses. SAM and SE are jointly responsible for the overall content as guarantors. There are no other contributors.
Competing interests None declared.