- Vinay Rathi, medical student1,
- Kristina Dzara, research associate2,
- Cary P Gross, associate professor of medicine23,
- Iain Hrynaszkiewicz, publisher4,
- Steven Joffe, associate professor of pediatrics5,
- Harlan M Krumholz, professor of medicine, professor of investigative medicine and public health367,
- Kelly M Strait, statistician6,
- Joseph S Ross, assistant professor of medicine236
- 1Yale University School of Medicine, New Haven CT, USA
- 2Section of General Internal Medicine, Department of Medicine, Yale University School of Medicine
- 3Robert Wood Johnson Foundation Clinical Scholars Program, Department of Medicine, Yale University School of Medicine
- 4BioMed Central, London, UK
- 5Department of Pediatric Oncology, Dana-Farber Cancer Institute and Department of Medicine, Boston Children’s Hospital, Boston MA, USA
- 6Center for Outcomes Research and Evaluation, Yale-New Haven Hospital, New Haven CT, USA
- 7Section of Cardiovascular Medicine, Department of Medicine, Yale University School of Medicine; Section of Health Policy and Administration, Yale University School of Epidemiology and Public Health
- Correspondence to: J S Ross, Section of General Internal Medicine, Yale University School of Medicine, PO Box 208093, New Haven CT 0520, USA
- Accepted 3 November 2012
Objective To investigate clinical trialists’ opinions and experiences of sharing of clinical trial data with investigators who are not directly collaborating with the research team.
Design and setting Cross sectional, web based survey.
Participants Clinical trialists who were corresponding authors of clinical trials published in 2010 or 2011 in one of six general medical journals with the highest impact factor in 2011.
Main outcome measures Support for and prevalence of data sharing through data repositories and in response to individual requests, concerns with data sharing through repositories, and reasons for granting or denying requests.
Results Of 683 potential respondents, 317 completed the survey (response rate 46%). In principle, 236 (74%) thought that sharing de-identified data through data repositories should be required, and 229 (72%) thought that investigators should be required to share de-identified data in response to individual requests. In practice, only 56 (18%) indicated that they were required by the trial funder to deposit the trial data in a repository; of these, 32 (57%) had done so. In all, 149 respondents (47%) had received an individual request to share their clinical trial data; of these, 115 (77%) had granted and 56 (38%) had denied at least one request. Respondents’ most common concerns about data sharing were related to appropriate data use, investigator or funder interests, and protection of research subjects.
Conclusions We found strong support for sharing clinical trial data among corresponding authors of recently published trials in high impact general medical journals who responded to our survey, including a willingness to share data, although several practical concerns were identified.
Introduction
The conduct of a clinical trial requires substantial investment from research funders, demands considerable time and effort from investigators, and may expose human volunteers to health risks. For these reasons, many have advocated data sharing to enhance the scientific value of trial data. In addition, full access to trial data reduces the potential for incomplete reporting of study outcomes and improves the medical evidence base, which should ultimately improve clinical decision making.1 In clinical research, data sharing involves a research team making trial data available to individuals with whom they are not collaborating. Data sharing generally takes place via either of two methods. First, investigators may share trial data by depositing it in a repository, an archive of data with terms of access defined by the organisation that maintains it. Second, investigators may share trial data on their own terms in response to individual requests.
In recent years, several major clinical research funders have adopted policies supporting or mandating data sharing. These include the US National Institutes of Health,2 the UK Medical Research Council,3 and the Bill and Melinda Gates Foundation.4 Similarly, major biomedical journals, such as Annals of Internal Medicine5 and BMJ,6 now require all authors of original research to state their willingness to share data in published articles, although the effectiveness and enforcement of these policies have been questioned.7 Several other journals, such as Nature, have even made data sharing a condition of publication.8 In the clinical trials community, the journal Trials encourages authors to publish the de-identified raw data supporting published trials as supplementary material,9 although to date few authors have done so.10 Secondary users of clinical trial data, such as the Cochrane Collaboration, advocate stronger data sharing policies in the hope of increasing access to clinical trial data to test the reliability of medical evidence and improve evidence based practice.1 11 12 13 14 15 16 17
Amid these calls for clinical trial data sharing, major regulators, most notably the European Medicines Agency,18 are beginning to contemplate far reaching policies on open access to data, as are several companies in the pharmaceutical and medical device industries.19 20 21 However, despite compelling arguments for mandating full access to trial data,1 13 22 23 24 authors of clinical studies on human subjects are among the least likely to share their raw data.25 Data suggest that these investigators, particularly clinical trialists, have historically claimed an exclusive right to collected clinical trial data and are largely opposed to data sharing policies due to concerns over research subject confidentiality,26 resources required for additional data management,26 27 inappropriate secondary data use,26 28 and diminished rewards for conducting original research.26 27 28 However, most of these studies are dated, and concerns were not systematically elicited.
The success of efforts to promote data sharing depends on cooperation from clinical trialists, who generate and maintain these data. Therefore, in an effort to inform future policies and initiatives and to better understand the trialist community’s collective thoughts about and experiences of sharing data, we surveyed corresponding authors of clinical trials published in 2010 or 2011 in one of the six general medical journals with the highest impact factor in 2011. The perspectives of clinical trialists publishing studies in these high impact journals have not been studied previously, although their studies are likely to address important clinical questions that can potentially affect clinical decision making—exactly the type of data whose scientific value should be maximised for public benefit.
Methods
Study sample and design
We assembled a sample of clinical trialists through a review of the literature to identify individuals who had published clinical trial findings in 2010 or 2011 in the six highest impact general medical journals. Relevant trials were identified by searching Ovid Medline (1 January 2010 to 31 December 2011) using the terms “clinical trial as topic” and “clinical trial” as free text, limiting our search results to articles published by the six general medical journals with the highest impact factor in 2011 (Journal Citation Reports, Thomson Reuters; New England Journal of Medicine, Lancet, JAMA, Annals of Internal Medicine, PLoS Medicine, and BMJ) (n=903). All non-clinical publications (n=101) and retrospective studies (n=49) were excluded from the resulting list to limit our sample to prospective observational or interventional studies of human subjects (n=753). Primary and secondary analyses of clinical trials published as original articles or research letters were eligible. The first corresponding author named in each article was identified for participation, except for trials published in Annals of Internal Medicine, for which the author named in the “reproducible research statement” was identified for inclusion. After accounting for studies with corresponding authors named in multiple publications (n=49) by selecting one publication randomly for inclusion, we identified 709 unique authors as potential survey respondents.
From the original article, we abstracted the following information in order to compare survey respondents and non-respondents: corresponding author location (US or Canada, Western Europe, other) and affiliation (medical school or hospital, private industry, government, other), trial funder (private industry, government, other, mixed), trial enrolment, and journal in which the article was published.
All potential survey respondents were sent an initial email describing the purpose of the study, requesting their participation, and providing a link to the survey in late July 2012, with three follow-up requests sent by email in early August, late August, and early September. Non-respondents were then contacted by telephone during the last two weeks of September 2012 to solicit their participation. Non-respondents were called up to three times and no more than once per day; those with whom we were unable to establish telephone contact were sent one last follow-up request by email in late September. The invitations to participate contained no information about specific hypotheses of the study. Participation was voluntary and included an opportunity to win one of five $100 gift certificates for Amazon. All responses were rendered anonymous by the web based survey platform (Qualtrics Labs, Provo, UT, USA). Approval from the Yale University School of Medicine Human Research Protection Program was obtained before conducting the study, and consent was considered to be implied when participants completed the online survey.
Survey instrument development
The design of our 38 item survey instrument was informed by previously published surveys of academic geneticists on data withholding,29 30 31 a review of the literature on clinical trial data sharing,23 25 26 27 28 32 33 and discussions with experts (including authors IH and SJ). The instrument was pretested by five clinical investigators unaffiliated with the research team and modified iteratively to improve clarity, face validity, and content validity. Adaptive questioning was used to decrease response burden. Items were presented in multiple response, Likert scale, and open ended formats. The complete instrument is provided as supplementary material on bmj.com.
Support for and prevalence of data sharing
We used Likert-type questions to assess clinical trialists’ support for data sharing in principle, through data repositories, and in response to personal requests. We used yes/no questions to ascertain whether respondents who were required by their research funder to deposit data from their published study in a repository had done so. We used multiple choice questions to ascertain the number of instances in which respondents who had received at least one individual data sharing request related to their published study had shared or withheld data.
Concerns with and reasons for data sharing
We used multiple response questions that allowed for open ended feedback to ask respondents about concerns with sharing data through repositories, about experiences with receiving and making data sharing requests, about reasons for granting or declining individual requests, and about their beliefs on the right of first use of clinical trial data. For those respondents who had experience with depositing or sharing data, concerns and reasons were solicited by asking about actual experience; for those without data depositing or sharing experience, concerns and reasons were solicited by asking about hypothetical situations. For those questions related to concerns with and reasons for data sharing, respondents were first asked to select any or all overarching categories of concerns and reasons provided as multiple choice responses; when respondents selected a category, more detailed concerns and reasons were provided as non-exclusive multiple choice responses. From among the more detailed concerns listed, respondents were asked to indicate the magnitude of their concern (major, minor, or none). We did not differentiate between major or minor reasons for granting or denying individual requests to share data.
We also collected respondent sociodemographic characteristics, including age, sex, and primary employer, as well as professional characteristics, including academic rank, years since completion of highest degree, location of scientific training, academic productivity, and funding status.
Statistical analysis
To compare characteristics of survey respondents and non-respondents, we used χ2 tests for categorical variables (author employer and location, trial funder, and journal) and the Kruskal-Wallis test for continuous variables (trial enrolment), using two-sided tests with a type I error level of 0.05. Comparison data were analysed by KD using SPSS version 19.0 (IBM, Armonk, NY, USA).
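As an illustration of the categorical comparison described above, the following is a minimal sketch of a Pearson χ2 test on a respondent by funder table. It is not the authors' SPSS analysis: the function names are invented for this example, the counts are hypothetical, and the P value helper relies on the closed form that holds only for two degrees of freedom. The Kruskal-Wallis comparison of trial enrolment is not shown.

```python
import math


def pearson_chi_square(table):
    """Return the Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail probability for a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2)


# HYPOTHETICAL counts, not taken from the study.
# Rows: respondents, non-respondents. Columns: three funder categories.
table = [
    [30, 50, 20],
    [50, 30, 20],
]

statistic, dof = pearson_chi_square(table)
p = p_value_two_dof(statistic)   # valid here because dof == 2
significant = p < 0.05           # two-sided type I error level used in the text
```

For tables with other dimensions, a statistics library would be needed to obtain the P value; only the statistic and degrees of freedom generalise in this sketch.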
Next, we conducted descriptive analyses, calculating the proportion of respondents who supported data sharing in principle, had engaged in data sharing, had identified major and minor concerns about data sharing, and had identified reasons for granting and denying data sharing requests. For Likert questions, we collapsed “strongly disagree” and “somewhat disagree” into one category (“disagree”) and collapsed “strongly agree” and “somewhat agree” into another (“agree”). No variables were missing >1% of responses. Survey data were analysed by KMS using SAS Version 9.3 (SAS Institute, Cary, NC, USA).
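The collapsing of Likert responses described above can be sketched as follows. This is only an illustration and not the authors' SAS code: the mapping name, the function name, and the neutral midpoint label are assumptions, and the example responses are made up.

```python
from collections import Counter

# Mapping from raw Likert labels to collapsed categories.
# The "neither agree nor disagree" midpoint is an ASSUMED label.
COLLAPSE = {
    "strongly disagree": "disagree",
    "somewhat disagree": "disagree",
    "neither agree nor disagree": "neutral",
    "somewhat agree": "agree",
    "strongly agree": "agree",
}


def collapse_likert(responses):
    """Return the proportion of responses falling in each collapsed category."""
    counts = Counter(COLLAPSE[r.strip().lower()] for r in responses)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Made-up responses, for illustration only.
example = ["Strongly agree", "Somewhat agree", "Somewhat disagree", "Strongly agree"]
proportions = collapse_likert(example)
```

Here three of the four made-up responses collapse to "agree" and one to "disagree".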
Results
Survey participation requests were sent to 709 unique clinical trialists (fig 1); 26 were subsequently excluded from our study population because contact information was invalid (n=18), technical difficulties prevented the author from accessing the survey (n=4), or the author was a secondary user of clinical trial data generated by another research group (n=4). Of the remaining 683 trialists, 317 completed the survey either online (n=306) or by phone (n=11), yielding a response rate of 46%.
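The participant flow and response rate can be reproduced from the counts given in this paragraph. The short sketch below uses only those reported numbers; the variable names are illustrative.

```python
# Counts as reported in the text; variable names are illustrative.
invited = 709
excluded = {
    "invalid contact information": 18,
    "technical difficulties": 4,
    "secondary user of trial data": 4,
}
completed = {"online": 306, "phone": 11}

eligible = invited - sum(excluded.values())   # denominator for the response rate
n_completed = sum(completed.values())
response_rate = n_completed / eligible

print(eligible, n_completed, round(100 * response_rate))
```

The denominator is the 683 trialists remaining after exclusions, not the 709 originally invited.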
Survey respondents did not differ from non-respondents with respect to corresponding author location or affiliation, trial enrolment, or journal in which the article was published (table 1). However, trial funders differed between respondents and non-respondents (P=0.001): corresponding authors of trials funded solely by government sources responded more often than corresponding authors of trials funded solely by industry or by mixed funding sources.
More than 90% of respondents were corresponding authors of clinical trials (275 (87%) of primary reports and 20 (6%) of secondary reports of clinical trials), and 22 (7%) were corresponding authors of prospective observational studies. Most respondents were aged 50–64 years, male, received their scientific training in the US or Canada, and had completed their training 10–24 years ago (table 2). Most (83%) were employed by an academic institution, and two thirds of these had reached the rank of full professor. Respondents were professionally productive: in the past three years, 41% had published ≥25 articles, 46% had been awarded four or more grants, and 52% had received more than $1 million in direct research support.
Support for and prevalence of data sharing
Overall, 278 (88%) of respondents supported data sharing. Specifically, 236 (74%) thought that, in principle, sharing de-identified data through a data repository should be mandatory. Furthermore, 229 (72%) thought that investigators should be required to share de-identified data upon individual request.
In practice, only 56 (18%) respondents were required by their research funder to deposit their trial data in a repository, and 32 of the 56 (57%) had thus far deposited the data. Similarly, 149 (47%) had received an individual request to share their clinical trial data, and 115 (77%) of these had granted at least one request and 56 (38%) had denied at least one request. The most common reasons for data requests were for systematic review or meta-analysis (n=85 (57%)), for subgroup analysis of the originally published study (n=58 (39%)), and to pursue novel research questions (n=47 (32%)). Among the respondents, 101 (32%) had made a request for clinical trial data of another investigator: 81 of 101 (80%) had at least one request granted, and 49 (49%) had at least one request declined.
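Because the percentages in this paragraph use different denominators (all respondents, those required to deposit, those who received a request, and those who made a request), a short worked check may help. The sketch below uses only counts reported in the text; the helper `pct` and the dictionary labels are names introduced for this example.

```python
def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


respondents = 317
required_to_deposit = 56   # denominator for the deposit rate
received_request = 149     # denominator for granted and denied requests
made_request = 101         # denominator for requests made to other investigators

reported = {
    "required to deposit": pct(required_to_deposit, respondents),
    "deposited": pct(32, required_to_deposit),
    "received request": pct(received_request, respondents),
    "granted at least one": pct(115, received_request),
    "denied at least one": pct(56, received_request),
    "made request": pct(made_request, respondents),
    "had request granted": pct(81, made_request),
    "had request declined": pct(49, made_request),
}
```

The granted and denied percentages sum to more than 100 because a respondent who received several requests could have granted some and denied others.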
Respondents varied in their beliefs about right of first use of clinical trial data. Five (2%) indicated that clinical trial data should be made available to investigators external to the study team immediately on trial completion, 109 (34%) stated that it should be within one to two years of trial completion, 97 (31%) stated that it should be within three to five or more years of trial completion, while 106 (33%) indicated that there should be no time limit and that the right of first use should extend until the main findings are accepted for publication.
Concerns with data sharing through repositories
Respondents identified potential major and minor concerns with data sharing through repositories. The most common concerns related to appropriate data use (n=205 (65%)), but investigator and funder interests (n=129 (41%)) and protection of research subjects (n=91 (29%)) were also common concerns. Specific concerns, and whether they were indicated to be major or minor, are displayed in fig 2.
Reasons for granting and denying individual data sharing requests
Respondents offered several reasons for sharing data in response to an individual request. The most common reasons cited were related to promoting open science (n=248 (78%)), although academic benefits and recognition (n=133 (42%)) were also often identified. More detailed reasons for granting data sharing requests are displayed in fig 3.
Similarly, respondents offered several reasons for denying individual requests to share data. The most common reasons cited were related to ensuring appropriate data use (n=233 (74%)). However, protecting investigator or funder interests (n=121 (38%)) and protecting research subjects (n=107 (34%)) were also often cited. More detailed reasons for denying data sharing requests are displayed in fig 4.
Discussion
In this survey of clinical trialists who had recently published trials in high impact general medical journals, we found strong support among respondents, in principle, for sharing clinical trial data with investigators who are not directly collaborating with their research team. Nearly three quarters of respondents thought that sharing de-identified data through data repositories or in response to individual requests should be required. These findings contradict prior research which suggested that trialists are largely opposed to data sharing.26 27 28 Our results may reflect changes in attitude among clinical researchers over time, since our survey was administered more than a decade after two of these earlier studies.26 28 However, our survey also differed from the third, more recent, study by asking potential respondents to self report their support for data sharing and their data sharing behaviours, as opposed to testing actual willingness to share clinical trial data.27 Nevertheless, a self reported willingness to engage in data sharing is a critical first step.
For the value of data sharing to be realised, support in principle needs to translate to actions, on the part of both potential data sharers and funders. Fewer than a fifth of trialists were required by funders to deposit their trial data in a repository. Additionally, fewer than half of the respondents had received an individual request for their data, and fewer than a third had requested data from others. Partially, this may be a consequence of an underdeveloped data sharing infrastructure, which requires robust repositories, data standards, copyright and licensing agreements, and patient consent.34
However, our findings suggest a genuine willingness among these respondents to engage in data sharing. Although fewer than half had received individual requests for data, more than three quarters of these had thus far shared data in response. Since the trials of surveyed investigators were all published in 2010 or 2011, willingness to share might increase as time passes. Moreover, although only about half of respondents had received a request for data, that proportion itself suggests a clear demand for shared data. Perhaps the low rate of data sharing requests was due to investigators external to the research team only recently becoming aware of the data. In addition, although only slightly over half of respondents who were required by funders to deposit data had thus far done so, open ended comments suggested this was because the deadline for deposit had not been reached. However, the fact that so few funders require deposition of de-identified data in a repository for use by other investigators suggests that a commitment among funders to require data sharing is a clear mechanism by which to promote and increase the practice.
Generalisability of results
Despite general support for and willingness to engage in data sharing efforts, only 46% of potential participants completed our survey. That fewer than half of potential participants responded may suggest that our findings overestimate support for and willingness to engage in data sharing in the clinical trial community, limiting the external validity of our findings—as individuals who chose not to respond may have been less supportive of data sharing efforts than those who responded. Furthermore, even among survey respondents, our findings may have been biased by social desirability, as respondents may have been less likely to self report beliefs and behaviours that may be negatively perceived by others and instead indicated stronger support for data sharing efforts.35 36 However, there were few observed differences between survey respondents and non-respondents, with the exception that the response rate among corresponding authors of trials funded solely by government sources was higher than that among corresponding authors of trials funded solely by industry or by mixed funding sources. Because government funded trialists may be more oriented toward public health interests, such as data sharing initiatives, our findings may be biased toward that viewpoint. Nevertheless, a substantial number of trialists who had received industry funding participated in this survey.
In addition, we used several mechanisms to ensure prospectively the generalisability of our findings, prevent social desirability bias, and improve response rates. First, our solicitation letters consistently reminded potential participants that the purpose of the survey was to fairly represent their views in the growing international debate on whether to require investigators to release their clinical trial data to others. Second, we used a web based survey platform for ease of completion and limited the scope of the survey to reduce response burden. Finally, we offered financial incentives for participation and employed several reminder contacts, including three emails and at least one telephone contact. Although our response rate compares favourably with other surveys of physicians (a difficult group to engage in participation),37 38 39 it was lower than that of other web based surveys of clinical trial investigators.40 41 42
Implications of results
Respondents had several concerns that may need attention to ensure their engagement in data sharing efforts. Most concerns related to the integrity of the process and to the need for data sharing to lead to both public and private benefit. For instance, the most common concerns related to appropriate data use, such as preventing misinterpretation or misleading secondary analyses and ensuring clarity of data elements for other investigators. However, there are mechanisms to allay such concerns, including pre-registration and specification of secondary data analysis plans,43 deposition of clear data dictionaries, and public posting of frequently asked questions about data sources to ensure that individuals learn from one another. Moreover, to ensure competency among both data sharers and requestors, training curriculums can be developed to teach best practices for preparing and using shared clinical trial data.23
Similarly, some trialists indicated that they were concerned about protecting either their own or their colleagues’ interests, including the need to ensure that trialists receive sufficient academic or scientific recognition for sharing their clinical trial data, do not spend undue time or effort preparing data for sharing, and have sufficient opportunity to publish studies using the data they were responsible for collecting. Addressing these concerns will, in many respects, require a cultural shift in the clinical research community, including within academic medicine.44 Academic institutions and promotions committees need to begin crediting investigators not just for publishing articles in high-impact journals, but also for creating data by designing and conducting clinical trials, sharing data with other investigators and enabling them to address important questions. Interestingly, academic recognition was not always a concern, as more than 40% of respondents reasoned that their decision to share trial data would increase their academic recognition by increasing the impact of their work and helping them develop professional relationships.
Limitations of study
In addition to the question of generalisability raised above, our study was limited to corresponding authors of clinical trials published in the highest impact general medical journals. Our findings may not be applicable to the entire clinical trial research community, although these high impact studies are likely to address important clinical questions that can potentially affect clinical decision making, which is exactly the type of data whose scientific value should be maximised for public benefit. Finally, to reduce response burden, we did not ask about some topics of interest, including whether funders have explicitly prohibited data sharing and respondents’ experiences of negotiating data ownership among funders, trialists, and secondary users. Data ownership remains a critical obstacle to data sharing efforts,32 34 particularly for industry funded research, with sponsors often explicitly retaining ownership rights.41 45 46
Conclusions
We found strong support for sharing clinical trial data among corresponding authors of recently published trials in high impact general medical journals who responded to our survey. However, practical concerns must be addressed if the efforts of clinical trial funders, journals, and secondary users of clinical trial data to promote data sharing are to succeed. The clinical trialist community must not only cooperate with these efforts but also trust that data sharing is in the best interests of the public and science.
What is already known on this topic
Data sharing policies are increasingly promoted to improve access to clinical trial data to inform evidence based practice
Little is known about support for these policies among clinical trialists
What this study adds
Among corresponding authors of recently published trials in high impact general medical journals who responded to our survey, about three quarters supported initiatives for sharing clinical trial data
Respondents reported a willingness to share data, along with several practical concerns related to appropriate data use, investigator or funder interests, and protection of research subjects
Cite this as: BMJ 2012;345:e7570
Contributors: VR, CPG, IH, SJ, and JSR were responsible for the conception and design of this work. VR, KD, and JSR were responsible for acquisition of data. VR and JSR drafted the manuscript. KD and KMS conducted the statistical analysis. JSR provided supervision. All authors participated in the analysis and interpretation of the data and critically revised the manuscript for important intellectual content.
Data access and responsibility: All authors had full access to all the data in the study, and JSR takes responsibility for the integrity of the data and the accuracy of the data analysis.
Funding: This study was not supported by any external grants or funds. VR received support from the Yale University School of Medicine Medical Student Research Fellowship. The five $100 Amazon gift certificates were paid for from JSR’s institutional discretionary funds. HMK and JSR receive support from the Centers for Medicare and Medicaid Services (CMS) to develop and maintain hospital performance measures that are used for public reporting. HMK is supported by a National Heart, Lung, and Blood Institute Cardiovascular Outcomes Center Award (1U01HL105270-02). JSR is supported by the National Institute on Aging (K08 AG032886) and by the American Federation for Aging Research through the Paul B Beeson Career Development Award Program.
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organisation for the submitted work; CPG, HMK, and JSR receive research support from Medtronic to develop methods to promote data sharing; CPG and JSR are on a scientific advisory board for FAIR Health, a not-for-profit organisation with the mission to achieve fairness and transparency in healthcare reimbursement; HMK chairs a scientific advisory board for UnitedHealthcare, a health insurance company; SJ is a paid member of a data monitoring committee for Genzyme/Sanofi; no other relationships or activities that could appear to have influenced the submitted work.
Data sharing: Requests for statistical code and dataset can be made to the corresponding author at firstname.lastname@example.org. The dataset will be made available via a publicly accessible repository on publication.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.