  1. Joseph S Ross, professor of medicine and public health
  1. Section of General Medicine Department of Internal Medicine, Yale University School of Medicine, PO box 208093, New Haven, CT 06520-8093, USA
  1. joseph.ross{at}

This pandemic teaches us the value of truly open science

The covid-19 pandemic has caused incalculable harm and suffering worldwide. As of March 2021, more than 130 million people have been infected with SARS-CoV-2 and nearly three million have died. Many who survived continue to have symptoms, ranging from fatigue to shortness of breath to “brain fog.” Economies have been disrupted, and many people are experiencing food and housing insecurity, a burden far more likely to be shouldered by society’s most vulnerable.

Not surprisingly, the covid-19 pandemic has also irrevocably changed the global scientific enterprise. But thankfully, the changes have been positive. Researchers began to work collectively and more collaboratively, embracing open science.1 Preprint platforms grew exponentially,2,3 as scientists sought to rapidly disseminate research findings. Trialists established multisite collaborative platform studies,4 working together to rapidly test new and established treatments. And population and public health researchers, in collaboration with national and regional health system leadership, launched “big data” observational research initiatives to better understand the prognosis and outcomes associated with covid-19.5-7 Some of these initiatives, such as the CVD-COVID-UK consortium, are now even making data available for other investigators to use for their own research.

In a linked special paper, Wood and colleagues (doi:10.1136/bmj.n826) describe key features of and data available through the CVD-COVID-UK initiative.8 This consortium has collated healthcare datasets across the four nations of the UK, deidentified and linked, to enable research focused on covid-19 and the disease’s relation with cardiovascular diseases such as myocardial infarction, heart failure, stroke, and thromboembolic events. Available information includes personal characteristics such as age, sex, and race, along with data on primary care and hospital episodes (covering inpatient, outpatient, emergency department, and critical care episodes), from which cardiovascular disease diagnoses can be determined, as well as registered deaths (including cause of death), covid-19 laboratory test data, and community dispensed medicines.8 Additional information on specialist intensive care, cardiovascular audit, hospital electronic prescribing, and covid-19 vaccination data is expected to be added in the future.

These data remain a work in progress. Not only are there plans to merge additional data sources in the future, but the current linked data differ in the periods for which information is available. For instance, while hospital episode and death data are available from April 1997 onwards, community drug dispensing data are available only after April 2018.8 Currently, data are shared only with approved researchers based in UK research organizations (universities and NHS bodies), with decisions made by the approvals and oversight board intended to ensure that research projects undertaken fall within the scope of the ethical and regulatory approvals for the CVD-COVID-UK initiative.8 However, why should access be restricted to UK based investigators? How can researchers request access and what are the explicit criteria for approval? The initiative’s website currently lists brief descriptions of seven approved projects.9 But links to more detailed research project proposals would be helpful, ideally using project proposal templates that follow established best practices10 for research using health system data such as these. Also unclear are the board’s expectations for reporting of results and dissemination of findings to the scientific community and the public.

Much has been learnt about making data available to others for clinical and health sciences research—from clinical trial data sharing initiatives, such as the multi-sponsor platform ClinicalStudyDataRequest,11 to administrative data collaborations, such as OptumLabs,12 to established public health data resources, such as UK Clinical Practice Research Datalink, UK Biobank, and the numerous survey and administrative data sources available in the US through the National Center for Health Statistics and the Agency for Healthcare Research and Quality. Transparency is particularly critical for efforts such as the CVD-COVID-UK initiative that elect to use more restrictive data access models13 that rely on at least partially independent intermediaries for oversight and review of requests, as well as data use agreements. For example, at the clinical trial data sharing initiative that I co-lead, the Yale Open Data Access (YODA) project, we have processes to promote transparency and the responsible conduct of research while ensuring data stewardship, including requirements for public posting of research proposals and dissemination of results.14

Nevertheless, these data represent a remarkable research resource and illustrate how covid-19 has fostered open science. The CVD-COVID-UK initiative includes data from more than 54 million people, comprising 96% of the English population,8 and spans a period far longer than covid-19. We can all look forward to the research and insights that are generated using these shared data, which are sure to inform clinical care decisions, public health, and policy. Further, we owe a debt of gratitude not only to the investigators and consortium leadership who invested time and resources in making these data available for wider use, but also to the UK citizens whose data are being shared through the CVD-COVID-UK initiative in the spirit of a learning healthcare system and open science.


  • Research, doi: 10.1136/bmj.n826
  • Competing interests: JR is the US Outreach and Research Editor at the BMJ. He currently receives research support through Yale University from Johnson & Johnson to develop methods of clinical trial data sharing (support for the YODA project), from the Medical Device Innovation Consortium as part of the National Evaluation System for Health Technology (NEST), from the Food and Drug Administration for the Yale-Mayo Clinic Center for Excellence in Regulatory Science and Innovation (CERSI) program (U01FD005938), from the Agency for Healthcare Research and Quality (R01HS022882), from the National Heart, Lung, and Blood Institute of the National Institutes of Health (NIH) (R01HS025164, R01HL144644), and from the Laura and John Arnold Foundation to establish the Good Pharma Scorecard at Bioethics International. In addition, he is a cofounder of medRxiv, the preprint platform for health sciences research.

  • Provenance and peer review: Commissioned, not externally peer reviewed.

This article is made freely available for use in accordance with BMJ's website terms and conditions for the duration of the covid-19 pandemic or until otherwise determined by BMJ. You may use, download and print the article for any lawful, non-commercial purpose (including text and data mining) provided that all copyright notices and trade marks are retained.