Data sharing and reanalysis of randomized controlled trials in leading biomedical journals with a full data sharing policy: survey of studies published in The BMJ and PLOS Medicine

BMJ 2018; 360 doi: (Published 13 February 2018) Cite this as: BMJ 2018;360:k400
  1. Florian Naudet, postdoctoral fellow (1),
  2. Charlotte Sakarovitch, senior statistician (2),
  3. Perrine Janiaud, postdoctoral fellow (1),
  4. Ioana Cristea, visiting scholar (1, 3),
  5. Daniele Fanelli, senior scientist (1, 4),
  6. David Moher, visiting scholar (1, 5),
  7. John P A Ioannidis, professor (1, 6)
  1. Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California, USA
  2. Quantitative Sciences Unit, Division of Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA, USA
  3. Department of Clinical Psychology and Psychotherapy, Babes-Bolyai University, Romania
  4. Department of Methodology, London School of Economics and Political Science, UK
  5. Centre for Journalology, Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada
  6. Departments of Medicine, of Health Research and Policy, of Biomedical Data Science, and of Statistics, Stanford University, Stanford, California, USA
Objectives To explore the effectiveness of data sharing by randomized controlled trials (RCTs) in journals with a full data sharing policy and to describe potential difficulties encountered in the process of performing reanalyses of the primary outcomes.

Design Survey of published RCTs.

Setting PubMed/Medline.

Eligibility criteria RCTs that had been submitted and published by The BMJ and PLOS Medicine subsequent to the adoption of data sharing policies by these journals.

Main outcome measure The primary outcome was data availability, defined as the eventual receipt of complete data with clear labelling. Primary outcomes were reanalyzed to assess to what extent studies were reproduced. Difficulties encountered were described.

Results 37 RCTs (21 from The BMJ and 16 from PLOS Medicine) published between 2013 and 2016 met the eligibility criteria. 17/37 (46%, 95% confidence interval 30% to 62%) satisfied the definition of data availability, and 14 of the 17 (82%, 59% to 94%) were fully reproduced on all their primary outcomes. Of the remaining three RCTs, errors were identified in two, but the reanalyses reached similar conclusions, and one paper did not provide enough information in the Methods section to reproduce the analyses. Difficulties identified included problems in contacting corresponding authors and the authors' lack of resources for preparing the datasets. In addition, data sharing practices varied widely across study groups.
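
As a rough illustration of how a 95% confidence interval for proportions such as 17/37 and 14/17 can be computed, the sketch below uses the Wilson score interval. The abstract does not state which interval method the authors used, so the choice of Wilson is an assumption, and the function name `wilson_interval` is hypothetical; the bounds it produces may differ slightly from the reported ones depending on method and rounding.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval.

    Assumption: the source does not name its interval method; Wilson is
    used here only as one common choice. z=1.96 gives roughly 95% coverage.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, center - half_width, center + half_width


# Counts taken from the Results paragraph above.
for label, k, n in [("data available", 17, 37), ("fully reproduced", 14, 17)]:
    p, lo, hi = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {p:.0%} ({lo:.0%} to {hi:.0%})")
```

With so few trials the intervals are wide, which is why the abstract reports them alongside the point estimates rather than the percentages alone.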

Conclusions Data availability was not optimal in two journals with a strong policy for data sharing. When investigators shared data, most reanalyses largely reproduced the original results. Data sharing practices need to become more widespread and streamlined to allow meaningful reanalyses and reuse of data.

