Publishing raw data and real time statistical analysis on e-journals
BMJ 2001; 322 doi: http://dx.doi.org/10.1136/bmj.322.7285.530 (Published 03 March 2001) Cite this as: BMJ 2001;322:530
- David J R Hutchon, consultant obstetrician and gynaecologist
Authors of medical publications rarely provide their readers with the full raw data from their work but provide only the summarised statistical analysis. Indeed, publishing the raw data in a paper journal would usually be impractical and of little help to readers, because transcribing the data from the printed page to a computer for further analysis would be laborious and error prone. Without raw data, however, peer reviewers are unable to check the statistical analysis, and further work on the data by others is not possible.
I demonstrate a method of including the raw data within a web version of an audit project that includes real time data analysis (see details below). The raw data for this paper amount to only 1526 data items, but even this quantity could not normally be included in a paper journal. The internet and most modern computers can cope with much larger datasets.
The demonstration paper is a simple audit cycle, but any publication involving a considerable amount of raw data could be published in this form with considerable advantage. Potential advantages of providing raw data and statistical software within the web version of a published paper include:
- Raw data remain available in the foreseeable future for other workers to analyse further
- The data can be easily copied into other applications, making analysis by others a practical proposal
- The data are available for effective meta-analysis
- The statistical analysis is available to be checked by peer reviewers and readers
- Internet publication has in practical terms unlimited capacity for data storage
- Most journals will support a web version in the next few years.
Some of the advantages of electronic publishing have been realised with the launch of web versions of major journals such as the BMJ and Lancet. The practical limitations of sharing large amounts of data have been overcome with internet technology. Presently, raw data from most research are likely to be filed away or lost in the depths of a hard disk once the paper is published.
If raw data were published with the original paper they would remain available, with appropriate permission and acknowledgement, to other workers in the specialty. Furthermore, if the data were published within the electronic version of a paper they could not become separated or lost, as they would be an integral part of the paper. Published evidence could be combined more effectively in meta-analyses if the raw data were available. Readers could also easily add to or alter the database and rerun the statistical analysis, knowing that the analysis would be identical with that performed in the published article.
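The idea of rerunning an identical analysis on altered data can be illustrated with a minimal sketch in Python. This is not the software used in the demonstration paper; the dataset, the column name error_days, and the functions load_errors and summarise are all hypothetical, chosen only to show raw data and analysis code travelling together.

```python
import csv
import io
import statistics

# Hypothetical raw data, as it might be embedded in the web version of a paper.
# Each row: case number, error in days between estimated and actual delivery.
# These values are invented for illustration and come from no real audit.
RAW_DATA = """case,error_days
1,-3
2,5
3,0
4,2
5,-4
"""


def load_errors(text):
    """Parse the embedded CSV text into a list of integer errors (days)."""
    reader = csv.DictReader(io.StringIO(text))
    return [int(row["error_days"]) for row in reader]


def summarise(errors):
    """Return the summary statistics a paper might report."""
    return {
        "n": len(errors),
        "mean": statistics.mean(errors),
        "sd": statistics.stdev(errors),  # sample standard deviation
    }


# The analysis as "published" with the paper.
published = summarise(load_errors(RAW_DATA))

# A reader appends one further (hypothetical) case and reruns the same code.
extended = summarise(load_errors(RAW_DATA + "6,1\n"))

print(published)
print(extended)
```

Because the same summarise function is applied to both the original and the extended data, any difference between the two results reflects the added case rather than a change in method.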
The demonstration paper, “A complete audit cycle of ultrasound estimation of the date of delivery,” is available at www.hutchon.freeserve.co.uk/demo.htm
Competing interests: None declared.