Editorials

BMJ policy on data sharing

BMJ 2010; 340 doi: http://dx.doi.org/10.1136/bmj.c564 (Published 29 January 2010) Cite this as: BMJ 2010;340:c564
  Trish Groves, deputy editor
  BMJ, London WC1H 9JR
  tgroves{at}bmj.com

    New guidance proposes minimum standards to lessen risks to participants’ privacy

    The BMJ asks authors of original research articles to state in their manuscripts whether they are making available any additional unpublished data. These may comprise raw unprocessed data as well as protocols, analyses, statistical codes, images, and ideas (http://resources.bmj.com/bmj/authors/types-of-article/research). We ask this largely because we are keen to maximise the usefulness and usage of data and promote transparency, but also because many research funders now encourage or even mandate data sharing.1 Many authors of BMJ articles simply say “no additional data available,” but a growing repository of positive data sharing statements ranges from “an audit trail of the forest plots and related data is available at www.wolfson.qmul.ac.uk/bptria”2 to “a full list of participants’ quotes and explanations offered by the authors to illustrate each of the four themes are available on request from the corresponding author at rachaelm{at}health.usyd.ed.au.”3 We are delighted that authors have been so willing to share data.

    We appreciate that the acceptability and practicability of this concept will vary among studies and authors. The ethical and legal risks to the privacy of patients and other participants are important and must be taken seriously. Even among those who are willing to share data, some may want to defer this until after a period of fair use, and some may limit sharing only to other researchers, perhaps on personal request or at a password protected website.

    In the linked article (doi: 10.1136/bmj.c181), Hrynaszkiewicz and colleagues advise researchers to seek informed consent to data sharing from research participants upfront, at the recruitment stage. They also point out that until now there has been little information on how such data should be prepared for sharing.4 As well as discussing technical aspects, they list 28 personal and clinical descriptors that could identify patients. These descriptors are derived from a review of policy documents and research guidance from major UK and US funding agencies, governmental health departments and statutes, and three internationally recognised publication ethics resources for editors of biomedical journals. They recommend that direct identifiers such as names should be removed from datasets and urge caution with using indirect identifiers such as age and sex. These items are often needed to make sense of the science and, on their own, pose little risk to confidentiality. In combination, however, they can build a recognisable personal profile.

    So Hrynaszkiewicz and colleagues and the working group they convened (which included TG) are recommending that datasets containing three or more indirect identifiers for any participant should be reviewed—either by an independent researcher or even by an ethics committee—to assess this risk before being shared. This, they say, should be the minimum standard for ensuring that participants’ privacy is not put at unnecessary risk. They also recommend that authors should make explicit statements about consent in research articles that have linked raw data. They suggest that authors choose one of three options, stating either that participants gave informed consent for data sharing, or that consent was not obtained but the presented data are anonymised and risk of identification is low, or that consent was not obtained and the dataset does pose a threat to confidentiality. (This last option is, clearly, controversial.)
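
    As a rough illustration of the minimum standard described above, the sketch below strips direct identifiers and flags a dataset for review when any participant carries three or more indirect identifiers. The field names and the two identifier sets are hypothetical: only names (direct) and age and sex (indirect) are mentioned here, and the full list of 28 descriptors is given in the linked article, not in this editorial. This is a sketch under those assumptions, not a substitute for review by an independent researcher or ethics committee.

```python
# Hypothetical identifier sets: this editorial names only "names" (direct)
# and "age" and "sex" (indirect); the other entries are assumed placeholders.
DIRECT_IDENTIFIERS = {"name"}
INDIRECT_IDENTIFIERS = {"age", "sex", "occupation", "place_of_treatment"}

# "three or more indirect identifiers for any participant"
REVIEW_THRESHOLD = 3


def strip_direct_identifiers(record):
    """Return a copy of the record with direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def count_indirect_identifiers(record):
    """Count indirect identifiers that hold a value for this participant."""
    return sum(
        1
        for key, value in record.items()
        if key in INDIRECT_IDENTIFIERS and value is not None
    )


def needs_review(dataset):
    """True if any participant has REVIEW_THRESHOLD or more indirect identifiers."""
    return any(
        count_indirect_identifiers(record) >= REVIEW_THRESHOLD for record in dataset
    )


# Invented example records, for illustration only.
dataset = [
    {"name": "A. Patient", "age": 54, "sex": "F", "occupation": "teacher", "systolic_bp": 132},
    {"name": "B. Patient", "age": 61, "sex": "M", "occupation": None, "systolic_bp": 141},
]

shareable = [strip_direct_identifiers(record) for record in dataset]
flagged = needs_review(shareable)
```

    In this invented example the first record holds three indirect identifiers (age, sex, and occupation), so the dataset would be flagged for review before sharing. Counting identifiers is only a trigger for that review; it does not itself estimate the risk of re-identification.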

    The BMJ does not intend, at least for now, to post additional large datasets online. But we will continue to encourage authors to link their BMJ articles to such data deposited elsewhere, and we are now adopting some of the recommendations made by Hrynaszkiewicz and colleagues. Firstly, we strongly support the view that researchers should seek informed consent to data sharing from research participants upfront, at the recruitment stage. There are good ethical and practical reasons for doing so. Even if the investigators have no current plans to share raw data, at some future time data sharing may become the norm. If so, sharing will be much easier if no one has to try to seek consent retrospectively. Secondly, we will expand our advice to authors about data sharing to reinforce the need for anonymisation and to warn authors of the 28 patient identifiers they need to consider. And, thirdly, we will extend our data sharing statements to include explicit information about consent.

    Footnotes

    • Research methods and reporting, doi:10.1136/bmj.c181
    • Competing interests: TG and deputy editor Jane Smith took part in discussions with Hrynaszkiewicz and colleagues and contributed to the recommendations made in the linked article.

    • Provenance and peer review: Commissioned; not externally peer reviewed.

    References