Why researchers should share their analytic code
BMJ 2019;367 doi: https://doi.org/10.1136/bmj.l6365 (Published 21 November 2019) Cite this as: BMJ 2019;367:l6365
- Ben Goldacre, director,
- Caroline E Morton, researcher,
- Nicholas J DeVito, researcher
- DataLab, Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- Correspondence to: B Goldacre
JAMA recently retracted and replaced an important clinical trial report from 2018 after a serious programming error was discovered.1 Quantitative medical research relies on analytic scripts: a sequence of commands issued to extract, reshape, manage, and then analyse data. In this case, there was a catastrophe. The “randomisation assignment” variable coded the control group “1” and the intervention group “2”; this had to be converted to “0” and “1” for the statistical analysis to run, but an incorrect conversion command resulted in the intervention and control groups being mislabelled. The results of the trial were almost completely reversed.
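The kind of recoding step described above can be sketched in a few lines. This is a minimal illustration only, not the code used in the retracted trial: the function names `recode_arm` and `crosstab`, the mapping `RAW_TO_ANALYSIS`, and the example data are all hypothetical. It assumes the raw variable codes control as 1 and intervention as 2, as described, and shows how an explicit mapping plus a cross-tabulation check can make a mislabelling fail loudly rather than silently.

```python
# Assumed raw coding, following the description above:
# 1 = control, 2 = intervention.
# Analysis coding: 0 = control, 1 = intervention.
RAW_TO_ANALYSIS = {1: 0, 2: 1}


def recode_arm(raw_values):
    """Map raw arm codes to 0/1, raising an error on any unexpected code."""
    recoded = []
    for value in raw_values:
        if value not in RAW_TO_ANALYSIS:
            raise ValueError(f"Unexpected randomisation code: {value!r}")
        recoded.append(RAW_TO_ANALYSIS[value])
    return recoded


def crosstab(raw_values, recoded_values):
    """Count each (raw, recoded) pair so the mapping can be inspected."""
    counts = {}
    for pair in zip(raw_values, recoded_values):
        counts[pair] = counts.get(pair, 0) + 1
    return counts


# Invented example data, for illustration only.
raw_arm = [1, 2, 2, 1, 2, 1]
arm = recode_arm(raw_arm)
table = crosstab(raw_arm, arm)

# Every raw code must land in exactly the intended analysis code.
assert set(table) == {(1, 0), (2, 1)}, f"Recoding mismatch: {table}"
```

Writing the mapping out in one place, and checking the recoded variable against the original, means a reader of the shared script can confirm at a glance which group is which.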
It is laudable that this single error was acknowledged and corrected with a retraction. However, neither the retraction notice nor the accompanying editorial acknowledged the systemic problems and opportunities exemplified by this case.1 2 Sharing analytic code is increasingly the norm across many fields.3 4 5 It provides an unambiguous record of the analytical methods used, aiding reproducibility.6 7 It also allows expert …