How can we make better use of ethnicity data to improve healthcare services?

BMJ 2023; 380 doi: (Published 30 March 2023) Cite this as: BMJ 2023;380:p744
  1. Graham Martin, director of research, THIS Institute, Cambridge
  2. Rohini Mathur, professor and chair of health data science, Centre for Primary Care, Wolfson Institute of Population Health, Queen Mary University of London
  3. Habib Naqvi, chief executive, NHS Race and Health Observatory, London

Healthcare staff, patients, and the wider public need to know sensitive data are being put to good use, say Graham Martin and colleagues

The importance of high quality, reliable data about the demographic characteristics of patients and populations is universally accepted. Information about people’s access to, experiences of, and outcomes from healthcare, and how these vary by ethnicity and other characteristics, is evidently vital in helping to ensure quality, identify gaps in provision, and meet the needs of diverse populations. To date, however, we have been more successful in using data to identify and understand inequalities than to reduce them.

Inequities in healthcare access, experiences, and outcomes still endure between ethnic groups. Differences by ethnicity in age standardised mortality rates seemed to reduce in the UK between 2012 and 2019,1 but people from ethnic minority groups continue to have poorer outcomes and healthcare experiences than their white counterparts across a range of areas, from mental health to neonatal care.2 The disproportionate impact of covid-19 on groups who already had poorer health highlighted the need to break this cycle.3 Tackling the underlying causes of ethnic inequalities could improve the quality of healthcare for everyone, but especially people who are currently poorly served by healthcare. How can we make better use of data to achieve this?

Firstly, we should acknowledge the assets we already have and recognise the value of efforts to document the relationship between ethnicity and health, which provide a mandate for action. The UK is unusual in having made it mandatory to collect ethnicity data in certain official statistics, such as the national census and various other government and health and social care datasets. Only four other European countries have similar requirements.4

But legal mandates only go so far. In New Zealand, for example, which like the UK requires the routine collection of ethnicity data, the completeness of data remains a problem. Health statistics are estimated to undercount people of Māori ethnicity by 15-20% relative to gold standard census data.5 This has implications for Māori people’s eligibility for and uptake of various healthcare services, as they may not all be offered targeted interventions, such as ethnicity-stratified risk assessment and screening.

In England, the completeness, accuracy, and usability of ethnicity data improved after its collection in general practices was financially incentivised under the Quality and Outcomes Framework.6 Yet the ethnicity information routinely captured in health records is still subject to missing data, inaccuracies, and systems that fail to follow good practice in categorisation and granularity.7 The result is datasets that are far less reliable than they should be, with their potential value further weakened by inconsistencies between systems and difficulties in linking data.

Tackling this implementation gap is likely to require a range of strategies that account for the views and needs of healthcare professionals and patients. Ensuring adherence to common data collection standards is a basic, and vital, step. Better incentivisation may also be a useful tool.6 However, improving understanding of the purpose of data collection is perhaps even more important. Healthcare staff need to know that their time is not being wasted, and patients and the wider public need to be convinced that the potential benefits of disclosing sensitive data outweigh the risks.

A guarantee of public good

Trust in public authorities is low among some ethnic minority groups8 and for good reason. Events in the recent past, such as the Windrush scandal, demonstrate the malintent and racist consequences of some government data collection activities.9 More broadly, experience also shows the importance of respecting the “social licence” for collecting and processing data,10 especially when those data are potentially sensitive. Public trust must be earned rather than taken for granted, and those collecting data must justify their activities by showing that they contribute to the public good, while acknowledging and mitigating the risks.

Including the groups most affected throughout the process of collecting, collating, and using data is central to building trust—as an abundance of case studies show.11 A commitment to partnership helps to ensure that any improvement efforts are enriched and enhanced by people’s lived experiences, as well as the more abstract knowledge about disparities provided by statistical data.

It can also help to secure the validity and relevance of those data. Ethnicity is socially constructed and ethnic identifications and inequalities evolve over time. Patterns of variation between groups may not be captured by the aggregate ethnic categories we use at any one time, such as “south Asian,” “eastern European,” or “black.” Ensuring that the ways we categorise and record ethnicity keep pace with that evolving reality, and are informed by those with direct experience, is vital for their validity.7

Of course, the most important causes of ethnic inequalities in health are not under the control of the healthcare system, but have their roots in structural and societal forces.12 Yet public health and healthcare have tools that can help to mitigate the worst of these forces, and improve access, experience, and outcomes. At the very least, efforts to improve healthcare must avoid “iatrogenic inequity”: exacerbating existing injustices by failing to acknowledge and account for their existence, or failing to engage people who are disadvantaged. High quality data are central to these goals.

The social licence is fragile, and failing to show that data collection produces social value will threaten the quality of the information we obtain. Combining data, inclusion, and rigorous delivery of improvement can, however, create a virtuous circle, rebuilding trust through demonstrable impact and laying the collaborative foundations for further work.


  • Competing interests: RM receives salary contributions for her work on the Genes & Health programme from a Life Sciences Consortium that includes AstraZeneca PLC, Bristol-Myers Squibb Company, GlaxoSmithKline Research and Development Limited, Maze Therapeutics Inc, Merck Sharp & Dohme LLC, Novo Nordisk A/S, Pfizer Inc, and Takeda Development Centre Americas Inc. Nothing further declared.

  • Provenance and peer review: Commissioned; not externally peer reviewed.

  • Acknowledgements: This piece draws on presentations and a panel discussion at THIS Space 2022 on “Why we need to collect good demographic and ethnicity data,” which can be viewed online. GM is based in The Healthcare Improvement Studies Institute (THIS Institute), University of Cambridge. THIS Institute is supported by the Health Foundation, an independent charity committed to bringing about better health and healthcare for people in the UK. RM is supported by Barts Charity (MGU0504).