Information In Practice Infopoints

Publishing raw data and real time statistical analysis on e-journals

BMJ 2001; 322 doi: http://dx.doi.org/10.1136/bmj.322.7285.530 (Published 03 March 2001) Cite this as: BMJ 2001;322:530
David J R Hutchon, consultant obstetrician and gynaecologist
Memorial Hospital, Darlington, County Durham DL3 6HX

    Authors of medical publications rarely give their readers the full raw data from their work, providing only the summarised statistical analysis. Indeed, publishing the raw data in a paper journal would usually be impractical and of little help to readers, as copying the data from the printed page to a computer for further analysis would be laborious and prone to error. Without raw data, however, peer reviewers cannot check the statistical analysis, and others cannot do further work on the data.

    I demonstrate a method of including the raw data within a web version of an audit project that includes real time data analysis (see details below). The raw data for this paper amount to only 1526 data items, but even this quantity could not normally be included in a paper journal. The internet and most modern computers can cope with much larger datasets.




    In the demonstration version I have included software that displays the database for readers to view. From here the data can easily be copied and pasted into another application. The data can also be viewed within the HTML code with any browser, such as Internet Explorer, that allows users to view source code. The statistical analysis is carried out with JavaScript within the browser software, and all the algorithms are available for inspection by readers within the HTML code if desired.
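As an illustration of the general approach, and not a copy of the code in the demonstration paper, the sketch below shows how raw data might sit in a page as a JavaScript array and be summarised in the browser. The field name `errorDays`, the function names, and the five records are all invented for this example.

```javascript
// Hypothetical raw data as it might be embedded in the page source.
// errorDays = actual minus estimated date of delivery, in days.
// These five values are invented for illustration only.
const rawData = [
  { id: 1, errorDays: -3 },
  { id: 2, errorDays: 2 },
  { id: 3, errorDays: 5 },
  { id: 4, errorDays: -1 },
  { id: 5, errorDays: 7 },
];

// Arithmetic mean of an array of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Sample standard deviation (divides by n - 1).
function sampleSD(values) {
  const m = mean(values);
  const sumSquares = values.reduce((sum, v) => sum + (v - m) ** 2, 0);
  return Math.sqrt(sumSquares / (values.length - 1));
}

// Summarise the embedded records; readers can inspect every step.
function summarise(records) {
  const values = records.map((r) => r.errorDays);
  return { n: values.length, mean: mean(values), sd: sampleSD(values) };
}

const summary = summarise(rawData);
console.log(summary);
```

Because both the data and the functions are ordinary page source, a reader who chooses "view source" sees exactly what was analysed and how.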

    The demonstration paper is a simple audit cycle, but any publication involving a considerable amount of raw data could usefully be published in this form. Potential advantages of providing raw data and statistical software within the web version of a published paper include

    • Raw data remain available in the foreseeable future for other workers to analyse further

    • The data can be easily copied into other applications, making analysis by others a practical proposition

    • The data are available for effective meta-analysis

    • The statistical analysis is available to be checked by peer reviewers and readers

    • Internet publication has, in practical terms, unlimited capacity for data storage

    • Most journals will support a web version in the next few years.
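The point in the list above about copying data into other applications could be met in several ways. One possibility, sketched here under assumptions, is to turn the embedded records into tab separated text, which spreadsheets and statistical packages generally accept when pasted. The function name `toTSV` and the sample records are hypothetical.

```javascript
// Hypothetical records; the values are invented for illustration.
const records = [
  { id: 1, errorDays: -3 },
  { id: 2, errorDays: 2 },
];

// Convert an array of flat objects to tab separated text,
// using the keys of the first record as the header row.
function toTSV(rows) {
  if (rows.length === 0) return "";
  const columns = Object.keys(rows[0]);
  const header = columns.join("\t");
  const lines = rows.map((row) => columns.map((c) => String(row[c])).join("\t"));
  return [header, ...lines].join("\n");
}

const tsv = toTSV(records);
console.log(tsv);
```

The sketch assumes that values contain no tabs or line breaks; data with free text fields would need escaping.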

    Some of the advantages of electronic publishing have been realised with the launch of web versions of major journals such as the BMJ and Lancet. The practical limitations of sharing large amounts of data have been overcome with internet technology. At present, raw data from most research are likely to be filed away or lost in the depths of a hard disk once the paper is published.

    If raw data were published with the original paper they would remain available, with appropriate permission and acknowledgement, to other workers in the specialty. Furthermore, if the data were published within the electronic version of a paper they could not become separated or lost, as they would be an integral part of the paper. Published evidence could be combined more effectively in meta-analyses if the raw data were available. Readers could also add to or alter the database and rerun the statistical analysis in the knowledge that the analysis would be identical with that performed in the published article.
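A minimal sketch of such a rerun follows, assuming the published analysis is a single function that readers call again on an extended copy of the data. The measure used here (the proportion of deliveries within a tolerance of the estimated date), the name `proportionWithin`, and all values are invented for illustration and are not taken from the demonstration paper.

```javascript
// Hypothetical published data: error in days between actual and
// estimated date of delivery. Values are invented.
const published = [-3, 2, 5, -1, 7, 12];

// The same function serves the published analysis and any rerun,
// so a reader's result is computed exactly as the author's was.
function proportionWithin(errors, toleranceDays) {
  if (errors.length === 0) throw new Error("no data to analyse");
  const hits = errors.filter((e) => Math.abs(e) <= toleranceDays).length;
  return hits / errors.length;
}

// Result as it would appear in the published article.
const original = proportionWithin(published, 7);

// A reader adds two further (invented) cases to a copy of the data
// and reruns the identical analysis; the published array is untouched.
const extended = [...published, 10, 0];
const rerun = proportionWithin(extended, 7);

console.log(original.toFixed(3), rerun.toFixed(3));
```

Working on a copy rather than altering the published array keeps the original result reproducible alongside the reader's own.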

    Footnotes

    • The demonstration paper, “A complete audit cycle of ultrasound estimation of the date of delivery,” is available at www.hutchon.freeserve.co.uk/demo.htm

    • Competing interests None declared.