

The dearth of disaggregated health data: a political rather than a technical challenge

BMJ 2023; 381 doi: (Published 01 June 2023) Cite this as: BMJ 2023;381:p1254
  1. Kent Buse, director (1)
  2. Abhishek Gautam, associate director (2)
  3. Unsia Hussain, research consultant (3)
  4. Victoria Olarewaju, research consultant (3)

  1. Healthier Societies Program, George Institute for Global Health, Imperial College London, UK
  2. Gender and Health, International Center for Research on Women (ICRW), Asia Regional Office, Delhi, India
  3. Global Health 50/50, London, UK

The failure to disaggregate population datasets holds back efforts to achieve health equity and to hold governments accountable for progress

“We commit to leave no one behind and we commit, by 2020, to strengthen capacity to report disaggregated data.” These were among promises made by governments when they agreed to the Sustainable Development Goals in 2015.1

Covid-19 put both of those commitments to the test. It was clear from the outset of the pandemic that health outcomes were not evenly spread across the population: certain groups of people were more exposed to the virus, experienced greater health impacts, or had less access to quality health services. Frontline health workers and people who lacked the opportunity to work from home were more likely to become infected. Among people admitted to hospital, women were less likely than men to be admitted to intensive care. People with pre-existing morbidities, older people, and men were more likely to die than women or younger people.2 As the pandemic wore on, further trends were revealed, including who had greater access to testing and vaccines. Although these patterns were understood in broad terms, many country-specific realities were not systematically reported and published by governments, at least not initially.

An effort involving researchers based across different regions of the world sought to fill the global data gap. They were motivated by a desire to draw attention to the relationship between sex and gender and covid-19 outcomes along a clinical pathway. A partnership of three institutions scoured government websites for sex-disaggregated data found in reports, press releases, and social media to build a picture of the gendered nature of the pandemic on health behaviours, healthcare, and health outcomes. These data, published in a regularly updated online tracker, became the world’s most comprehensive repository of such data, and global and regional analyses were published.34 The data were extensively used by journalists, academics, and policy makers.56 The tracker also responded to longstanding calls for sex and gender disaggregated data, for example in the 1995 Beijing Declaration and Platform for Action.7

Across the 205 countries included in the tracker, patterns in sex differences emerged, and gendered analyses helped interpret these differences. Most countries reported inconsistent and incomplete sex-disaggregated data across the key covid-19 indicators (testing, hospital admissions, intensive care admissions, healthcare workers, and vaccine uptake). Countries and regions diverged on the consistency and comprehensiveness of their reporting. Based on the World Bank income status categories, low-income countries were the least likely to report sex-disaggregated data on cases and deaths. Further, among the small number of countries that reported sex-disaggregated testing data, slightly more women than men were tested.8 Analysis and interpretation of the data found that they generally aligned with previously reported gendered patterns in risk exposures, health behaviours, health services utilisation, and quality of care received.9

Given the intersectional nature of determinants and risk factors for most, if not all, health conditions,1011 the group also looked for additional variables beyond sex and gender. They took a snapshot of the disaggregated data available in November 2021 and October 2022.12 They were particularly interested in variables on which governments had committed to strengthen reporting as part of the SDG process, specifically from SDG target 17.18: age; income; geographical location; comorbidity; nationality; race or ethnicity; socio-economic status; rural/urban; refugee or internally displaced person; and disability.

Data patterns emerged, with similarities and differences across countries. Age disaggregation, though common, was reported by only 41% (85/205) of countries. Data disaggregated by race/ethnicity were available for only five high-income countries: the USA, England, Northern Ireland, Wales, and New Zealand. Data on urban/rural location, pregnant/breastfeeding mothers, and socioeconomic status were reported by the USA, UK, Northern Ireland, and Wales. No low- or middle-income country reported on these variables. None of the countries had data disaggregated for internally displaced people (IDP) or refugees, including data specific to camps or facilities, despite the additional challenges faced by these populations, including in their access to healthcare. Comorbidity was reported by one low-income, three middle-income, and 12 high-income countries (16/205). The findings point to a dearth of disaggregated data despite ample evidence of the influence these social determinants exert, alone or in combination, on covid-19 across populations.13141516

Epidemiologists, health policymakers, and most people who lived through the pandemic will not be surprised to learn of its inequalities. But what should surprise and concern us all is the dearth of reporting of disaggregated data during the world’s worst pandemic in living memory. The UN and health advocates encouraged governments to build back better. Vast sums were (rightly) invested in pandemic responses, yet systems were not widely strengthened to collect, track, and report national data on the way the pandemic unfolded across their populations. In the snapshots, for example, only five countries reported on disability (Australia, England, Lithuania, New Zealand, and Wales). Low-income countries in particular were found to report the fewest disaggregated variables beyond sex. While overall we found a lack of data during the pandemic, we acknowledge England, Wales, Northern Ireland, and the USA, the only countries that reported covid-19 data disaggregated by six or more of the variables recommended by the UN.

Data collection is political: an act of deciding what and who is worth counting, and identifying who is left behind. While effective health surveillance systems will not intrinsically help to create better health, they can enable more equitable targeting of policies and programmes. Such data also provide a means to hold governments accountable for their promises to leave no one behind. The covid-19 pandemic was a catastrophe on an unprecedented scale, and the failure to build robust national surveillance systems is but one missed opportunity. As the international community negotiates a new global pandemic preparedness and response agreement,17 advocates who care about health equity should get behind efforts to do data differently. That means data disaggregated by the categories agreed upon in the SDG framework, data that are used in responsive policies and programmes, and data that truly help identify the inequalities and inequities that drive health and wellbeing across people and populations.


  • Competing interests: The authors declare that they are employed by or associated with the organisations that produce The Sex, Gender and COVID-19 Project. They do not have other interests to declare.

  • Acknowledgment: This Opinion is based on research funded by the Bill and Melinda Gates Foundation (INV-017909; INV-030827). The findings and conclusions contained within are those of the authors and do not necessarily reflect positions or policies of the Bill and Melinda Gates Foundation. We are grateful to Fizza Fatima, Mehrnoosh Samaei, David Zezai, and Fiona Bakelmun; Sally Odunga and Michelle Mbuthia from APHRC; Kakoli Borkotoky and Ritu Acharjee from ICRW; and Aaron Koay and Mireille Evagora-Cambell from GH5050 for undertaking data collection for the platform.