

The 100 000 Genomes Project: bringing whole genome sequencing to the NHS

BMJ 2018; 361 doi: (Published 24 April 2018) Cite this as: BMJ 2018;361:k1687
  1. Clare Turnbull, professor1 4,
  2. Richard H Scott, consultant clinical geneticist1 5,
  3. Ellen Thomas, consultant clinical geneticist1 2,
  4. Louise Jones, professor1 6,
  5. Nirupa Murugaesu, consultant oncologist1 7,
  6. Freya Boardman Pretty, researcher1,
  7. Dina Halai, researcher1,
  8. Emma Baple, consultant clinical geneticist1 8,
  9. Clare Craig, consultant pathologist1,
  10. Angela Hamblin, consultant haematologist1 9,
  11. Shirley Henderson, researcher1 10,
  12. Christine Patch, consultant genetic counsellor1 2 11,
  13. Amanda O’Neill, researcher1 12,
  14. Andrew Devereau, researcher1,
  15. Katherine Smith, analyst1,
  16. Antonio Rueda Martin, analyst1,
  17. Alona Sosinsky, analyst1,
  18. Ellen M McDonagh, researcher1,
  19. Razvan Sultana, analyst1,
  20. Michael Mueller, analyst1,
  21. Damian Smedley, researcher1 3,
  22. Adam Toms, researcher1,
  23. Lisa Dinh, researcher1,
  24. Tom Fowler, director of public health1,
  25. Mark Bale, deputy director1 13,
  26. Tim Hubbard, professor1 14,
  27. Augusto Rendon, director of bioinformatics1 12,
  28. Sue Hill, chief scientist10,
  29. Mark J Caulfield, chief scientist1 3
  30. on behalf of the 100 000 Genomes Project
  1. Genomics England, London, UK
  2. Guy’s and St Thomas’ NHS Foundation Trust, London, UK
  3. William Harvey Research Institute, Queen Mary University of London, UK
  4. Institute of Cancer Research, London, UK
  5. Great Ormond Street Hospital NHS Trust, London, UK
  6. Barts Cancer Institute, Queen Mary University of London
  7. St George's University Hospitals NHS Foundation Trust, London, UK
  8. University of Exeter, Exeter, UK
  9. Oxford BRC Haematology Theme, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
  10. NHS England, London, UK
  11. Florence Nightingale Faculty of Nursing and Midwifery, King’s College, London, UK
  12. University of Cambridge, Cambridge, UK
  13. Science Research and Evidence Directorate, Department of Health and Social Care, London, UK
  14. Medical and Molecular Genetics, King’s College London
  1. Correspondence to: C Turnbull clare.turnbull{at}

In partnership with NHS England, Genomics England’s ambitious plans to embed genomic medicine into routine patient care are well underway. Clare Turnbull and colleagues discuss its progress

Many disorders we encounter in clinical medicine have a genomic basis, from rare “single gene” disorders such as cystic fibrosis, to complex, polygenic disorders such as ischaemic heart disease, drug toxicity, and tumour evolution driven by serial somatic mutations. Next generation technology has transformed the capacity, speed, and cost of genomic sequencing. This has provided important advances and new opportunities for the clinical application of genomics (fig 1). However, radical expansion of genomic medicine within clinical care requires new infrastructure, extended skills, education of the workforce, and diligent engagement with the public. The Genomics England 100 000 Genomes Project was initiated in 2013 to establish the use of whole genome sequencing in the NHS and drive change within NHS services to adopt this technology.

Fig 1

Potential applications of genomics in medicine

Transforming genomics in UK

The UK has long been at the forefront of discovery in human genomics and is recognised for its world leading genetic research studies, such as UK Biobank and Deciphering Developmental Disorders (fig 2).[1-3] In parallel, the UK has developed a mature network of NHS funded regional genetics laboratories and clinical genetics departments.

Fig 2

Genomics in the UK: timelines of clinical testing and research achievements

Until recently, genomic technologies available in the clinic have enabled us to look for the “causative mutation” just one segment of a gene at a time, limiting both the speed and volume of clinical testing. Over the past decade, next generation sequencing has made it possible to sequence millions of fragments of DNA simultaneously. This step change in scale enables us to offer genetic testing to many more people and test one person for hundreds or thousands of genes at a time.[1] Indeed, while the initial sequencing of the full human genome took over 10 years and cost more than £2bn (€2.4bn; $3bn), an individual’s genome can now be sequenced in around a day at a cost of less than £700.[4-6]

To harness the new possibilities offered by this shift in technology, successive government strategy reports have emphasised the need for new approaches to delivering genomics services.[7-9] These reports have called for centralised provision of whole genome sequencing and related (bio)informatics to improve cost effectiveness and adaptiveness, a national database of genomic information (and associated clinical data) that is accessible throughout the NHS, as well as expansion of the genomics workforce and improved genetic literacy across the full clinical workforce.

In 2012, the then prime minister, David Cameron, announced funding for whole genome sequencing of 100 000 genomes from patients in the English NHS to capitalise on the potential of this technology for patient benefit. Genomics England, owned by the Department of Health and Social Care, was set up to deliver the project (fig 3) working in partnership with NHS England. Rare disease and cancer were selected as the areas that had the most immediate potential for clinical benefit from whole genome analysis.[10]

Fig 3

100 000 Genomes Project: milestones

Rare diseases and the diagnostic odyssey

The project’s rare disease programme was established to initiate and embed use of whole genome sequencing within the NHS to identify genetic causes in people with rare inherited diseases (box 1). Clinicians and researchers nominated more than 200 recruitment categories (spanning over half of the roughly 7000 recognised rare diseases) that were deemed underserved by current clinical diagnostic testing or required further research to elucidate their genetic basis.[11-16]

Box 1

Rare disease and genomics

  • Rare diseases are defined as those affecting <1 in 2000 people. Over 7000 rare disease entities have been described, estimated to affect a total of around three million people (1 in 17) in the UK

  • About 75% of these diseases manifest before the age of 5 years. They are typically life shortening and confer serious disability

  • Most are due to single gene defects (so called monogenic or mendelian diseases)

  • A robust genetic diagnosis in rare disease can be critical to management. The specific genetic diagnosis enables the clinician to apply the therapies and interventions most likely to be effective, give a best estimate of prognosis, predict additional features, and pre-empt complications

  • Genetic diagnosis can also provide information to the family about likelihood of recurrence in subsequent pregnancies as well as options for pre-implantation or prenatal genetic diagnosis


Historically, the “diagnostic odyssey” in a child with a rare disease could span several years. Children would have multiple organ systems investigated by different medical specialists, and even after referral to clinical genetics, serial testing of different genes (often at different laboratories) could take years. Sequencing of the coding regions of all 20 000 genes by whole exome or genome sequencing eliminates reliance on the clinical hypothesis to select which genes to test and has enabled diagnoses of many previously unsolved cases.[17,18]

Whole genome sequencing in cancer

Whole genome sequencing of cancer tissue can provide information on cancer aetiology, prognosis, and potential therapeutic responsiveness (box 2). Difficulty in procuring tumour DNA of sufficient quantity, quality, and purity has often limited clinical and research tumour sequencing to date. The cancer programme of the 100 000 Genomes Project has collected a broad range of early stage and advanced solid tumours from diagnostic biopsy and surgical resection samples, as well as haematological malignancies. In current clinical testing, multiple standalone tests are used to capture the set of genomic biomarkers examined for a given tumour type, but the falling cost makes whole genome sequencing potentially attractive as a single all-encompassing test.

Box 2

Cancer genomics and molecular oncology

  • Cancer is a disease of disordered genomes: acquisition of serial genomic mutations results in progressive escape from the mechanisms that regulate cell division, leading to tumourigenesis, invasiveness, and metastasis[19]

  • The catalogue of such recognised “somatic driver mutations” has been increased through large scale international tumour sequencing projects such as the International Cancer Genome Consortium and the Cancer Genome Atlas[20,21]

  • Knowledge of tumour genomic changes has enabled development of targeted drugs that switch off mutated oncogenes (eg, the BRAF inhibitor vemurafenib used in metastatic melanoma or EGFR inhibitors such as erlotinib used in advanced lung cancer)

  • Clinical testing for genomic changes (biomarkers) can define diagnostic subtypes; predict tumour behaviour, prognosis, and drug response; and enable monitoring for early recurrence of disease[22,23]

  • Paired sequencing, subtracting the normal genome (eg, of the blood) from the tumour genome, enables identification of the acquired mutations in the tumour, from small mutations in genes (base substitutions and deletions/insertions) to larger structural variants (translocations, large deletions, or duplications resulting in amplification)[24,25]

  • Signatures (complex mutational patterns) can also be extracted from analysis across the whole genome.[26,27] Trials are evaluating performance of these signatures to predict drug response and tumour behaviour

  • Technologies are evolving rapidly for genomic analysis to detect minuscule levels of cell-free circulating tumour DNA (ctDNA) in the bloodstream before the tumour becomes clinically or radiologically obvious.[28] Current clinical evaluation is largely focused on early detection of tumour recurrence, but there is substantial interest in using the technology for primary screening for cancer
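
The paired sequencing idea in box 2 can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of variant calls keyed by chromosome, position, reference allele, and alternate allele. The `Variant` type, the function name, and the toy coordinates are hypothetical, and real pipelines rely on dedicated somatic callers that also account for sequencing error and tumour purity rather than a plain set difference.

```python
from __future__ import annotations

from typing import NamedTuple


class Variant(NamedTuple):
    """A variant call; all field names are illustrative assumptions."""
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_candidates(tumour: set[Variant], normal: set[Variant]) -> set[Variant]:
    """Return variants seen in the tumour but not in the matched normal sample."""
    return tumour - normal


# Toy calls with made-up coordinates (not real patient data).
normal_calls = {Variant("chr1", 1000, "A", "G"), Variant("chr2", 500, "C", "T")}
tumour_calls = normal_calls | {Variant("chr7", 2000, "A", "T")}

acquired = somatic_candidates(tumour_calls, normal_calls)
print(sorted(acquired))
```

Inherited variants appear in both samples and cancel out, leaving only the changes acquired by the tumour.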


How is the project being delivered?

Sample acquisition and sequencing

Thirteen centres across England with established expertise in molecular genetics, clinical genetics, molecular pathology, and molecular oncology were designated by NHS England as NHS genomic medicine centres. These hub hospitals link to over 90 local recruiting hospitals, providing substantial national coverage (fig 4). Tissue preparation, DNA extraction, and quantitation are undertaken locally according to standardised protocols, and DNA is then transferred to the central national biorepository and then to the national sequencing centre.

Fig 4

Infrastructure development and expansion of training through the 100 000 Genomes Project

Central automated analysis with local interpretation and clinical reporting

Genomics England has developed platforms and automated pipelines for processing, calling, quality checking, storing, presenting, annotating, and prioritising the variants identified at sequencing. Twenty eight commercial suppliers of genomic analysis, annotation, and interpretation services were evaluated at the start of the programme; some of these have become “clinical interpretation partners” and in collaboration with Genomics England have adapted their decision support software to deliver the results of sequencing to the NHS genomic medicine centres.

In rare disease, the family set of genomes is analysed as a group, with four to five million variants identified in each individual. Algorithms incorporating variant frequency, familial inheritance, variant impact, and gene-phenotype association are applied to sort the identified variants into four groups according to the likelihood that they are causative (tiers 1-3 and untiered). For gene-phenotype association, the tiering algorithm uses a national bank of community curated gene lists established by Genomics England.[29] This prioritised variant list is then sent to the NHS genomic medicine centre within a decision support tool for manual review and validation by the local laboratory and clinical teams.
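
How the four inputs might combine into tiers can be sketched as follows. The article does not state the actual rules or thresholds, so everything here is an assumption made for illustration: the frequency cut-off, the impact labels, the rule order, and the names `Candidate` and `assign_tier`. It is not the Genomics England algorithm.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical cut-off: the source does not state the frequency threshold used.
MAX_POPULATION_FREQUENCY = 0.001


@dataclass
class Candidate:
    gene: str
    population_frequency: float  # how common the variant is in reference populations
    fits_inheritance: bool       # consistent with the inheritance pattern in the family
    impact: str                  # "high", "moderate", or "low" (assumed labels)


def assign_tier(variant: Candidate, panel_genes: set[str]) -> str:
    """Sort one variant into tier 1, 2, 3, or untiered using simplified rules."""
    rare = variant.population_frequency <= MAX_POPULATION_FREQUENCY
    if not (rare and variant.fits_inheritance) or variant.impact == "low":
        return "untiered"
    if variant.gene not in panel_genes:
        return "tier 3"  # plausible variant, but gene not on the curated list
    return "tier 1" if variant.impact == "high" else "tier 2"


panel = {"GENE_A"}  # placeholder symbol standing in for a curated gene list
examples = [
    Candidate("GENE_A", 0.0001, True, "high"),
    Candidate("GENE_A", 0.0001, True, "moderate"),
    Candidate("GENE_B", 0.0001, True, "high"),
    Candidate("GENE_A", 0.05, True, "high"),
]
tiers = [assign_tier(v, panel) for v in examples]
print(tiers)
```

In this sketch the curated gene list only separates tier 3 from the higher tiers, which reflects the role the text gives to gene-phenotype association.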

For each family analysed, the high likelihood tiers (1 and 2) together typically contain fewer than five variants. If the causative variant (or pair) is not identified among these, then more laborious review of the tier 3 and untiered variants may be warranted. Each genomic medicine centre has established multidisciplinary meetings to bring together laboratory, clinical genetics, and medical subspecialty expertise, variably organised at local, regional, and national level.

In the whole genome analysis for cancer, established knowledge bases are used to assess the potential diagnostic, predictive, or prognostic value of the identified somatic variants. Variants are highlighted to indicate suitability for NICE approved targeted drugs as well as eligibility for genomically stratified UK clinical trials. A full analysis of tumour structural and copy number variation is also presented, as well as other findings such as pan-genomic signatures and mutational burden. Tumour sequencing boards have been set up at each genomic medicine centre to bring together laboratory scientists, oncology clinicians, pathologists, and germline cancer geneticists to advance molecularly driven patient management.
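
A minimal sketch of the knowledge base lookup described above. The table structure and the function name are assumptions, and the two entries simply reuse the drug examples from box 2. Real knowledge bases match specific variants and clinical context rather than gene names alone.

```python
from __future__ import annotations

# Toy knowledge base keyed by (gene, tumour type); structure is hypothetical.
KNOWLEDGE_BASE = {
    ("BRAF", "melanoma"): "BRAF inhibitor (eg, vemurafenib)",
    ("EGFR", "lung"): "EGFR inhibitor (eg, erlotinib)",
}


def flag_actionable(mutated_genes: list[str], tumour_type: str) -> dict[str, str]:
    """Map each mutated gene that has a knowledge base entry to its therapy class."""
    return {
        gene: KNOWLEDGE_BASE[(gene, tumour_type)]
        for gene in mutated_genes
        if (gene, tumour_type) in KNOWLEDGE_BASE
    }


report = flag_actionable(["TP53", "BRAF"], "melanoma")
print(report)
```

Genes without an entry for the given tumour type are left out of the report, so the same mutation can be highlighted in one cancer and not in another.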

Accessing genome data for research

Individual identifiable data are available only to registered clinical users working within the NHS genomic medicine centres. De-identified, individual clinical and genomic data for research use are held within the Genomics England research environment. Access to the 100 000 Genomes Project data is carefully controlled through robust authentication systems and an “airlock” mechanism ensuring that only summary level data can be removed. Academic researchers can access the research environment through membership of one of the 42 Genomics England clinical interpretation partnership domains or through specific collaborations with approved partners from industry (fig 4).
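
One way to picture the “airlock” is as a check applied to every export request. The sketch below is an assumption about how such a check could work, not a description of the real system: the field names, the minimum count, and the function `is_summary_level` are all hypothetical.

```python
from __future__ import annotations

MIN_CELL_COUNT = 5  # hypothetical disclosure threshold, not taken from the source


def is_summary_level(rows: list[dict]) -> bool:
    """Accept an export only if every row is an aggregate with a large enough count."""
    for row in rows:
        if "participant_id" in row:  # individual level data never leaves
            return False
        if row.get("count", 0) < MIN_CELL_COUNT:  # small cells could identify people
            return False
    return True


summary_export = [{"gene": "GENE_A", "count": 12}, {"gene": "GENE_B", "count": 40}]
row_level_export = [{"participant_id": "P001", "gene": "GENE_A"}]
small_cell_export = [{"gene": "GENE_C", "count": 2}]

print(is_summary_level(summary_export))
print(is_summary_level(row_level_export))
print(is_summary_level(small_cell_export))
```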

Routinely collected national datasets, including Cancer Registry datasets and Hospital Episode Statistics, are regularly linked to the genomic data (at individual level).[30-34] The linked longitudinal life course datasets are currently immature but will eventually facilitate analyses for associations of genomic factors with longer term outcomes.
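
Individual level linkage of this kind amounts to joining records on a shared participant key. The sketch below assumes a de-identified `participant_id` present in both sources. The field names and toy records are hypothetical and do not reflect the actual schemas of the national datasets.

```python
from __future__ import annotations


def link_records(genomic: dict[str, dict], episodes: list[dict]) -> dict[str, dict]:
    """Attach each participant's routine care episodes to their genomic record."""
    linked = {pid: {**record, "episodes": []} for pid, record in genomic.items()}
    for episode in episodes:
        pid = episode["participant_id"]
        if pid in linked:  # ignore episodes for people outside the cohort
            linked[pid]["episodes"].append(episode)
    return linked


# Toy records (made up for illustration).
genomic_data = {"P001": {"reported_variants": 1}, "P002": {"reported_variants": 0}}
hospital_episodes = [
    {"participant_id": "P001", "year": 2016, "event": "admission"},
    {"participant_id": "P001", "year": 2017, "event": "outpatient"},
    {"participant_id": "P999", "year": 2017, "event": "admission"},
]

linked = link_records(genomic_data, hospital_episodes)
print(linked["P001"]["episodes"])
```

Repeating the join as new episodes arrive is what gradually builds the longitudinal record described in the text.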

Challenges, progress, and evolution

As of April 2018, over 70 000 participants have been recruited to the project, more than 55 000 whole genomes sequenced, and over 10 000 whole genome analysis reports have been returned to the NHS. Various challenges have had to be overcome on the way.

Tissue for cancer genome sequencing

Formalin fixation and paraffin embedding (FFPE) has been used for over 100 years to prepare tumour tissue for microscopy. Formalin causes severe degradation of DNA, affecting the fidelity of the genomic readout (especially for large structural variant calls in whole genome sequencing). Fresh tumour tissue provides much higher quality results but processing, transporting, and storing fresh tumour tissue across diverse settings has been a sizeable challenge, requiring new practices such as vacuum packing, tissue refrigeration, and use of novel coolants and transport media. The Royal College of Pathologists, the Human Tissue Authority, the Health Research Authority, NHS England, and Genomics England have come together in supporting this transformation, issuing a joint statement that collection of fresh tissue should be standard of care in modern cancer diagnostics.[35]

Standardisation of submitted clinical data

The automated genomic analyses for cancer and rare disease require input of standardised clinical data, for which Genomics England has developed data models using internationally established nomenclature systems, such as the Human Phenotype Ontology.[36] Obtaining complete clinical data in a consistent format has been challenging because of diverse local electronic medical record systems, together with competing demands for local informatics resources. Substantial investment in local informatics and collaborative approaches across trusts working with NHS Digital, NHS England, and the Farr Institute have driven novel solutions. One example is the GENIE system developed by the West Midlands genomic medicine centre, which automates capture, collation, and delivery of clinical and sample data from diverse clinical systems.
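
A simple check of the kind a submission pipeline might run on incoming phenotype data is sketched below. Human Phenotype Ontology identifiers take the form `HP:` followed by seven digits. The function name and the sample input are hypothetical, and a real validator would also confirm that each identifier exists in the current ontology release.

```python
from __future__ import annotations

import re

# HPO identifiers are "HP:" followed by exactly seven digits.
HPO_ID = re.compile(r"HP:\d{7}")


def invalid_hpo_terms(terms: list[str]) -> list[str]:
    """Return the submitted entries that are not well-formed HPO identifiers."""
    return [term for term in terms if not HPO_ID.fullmatch(term)]


submitted = ["HP:0001250", "seizures", "HP:12"]
rejected = invalid_hpo_terms(submitted)
print(rejected)
```

Rejecting free text at the point of entry is one way to keep data from different record systems comparable.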

Consistency in interpretation of genomic variants

Computational prediction tools, variant databases (commercial and academically maintained), and functional validation assays are all improving. However, determining whether a genomic variant is benign or pathogenic can be complex and interpretations of a specific variant are often inconsistent. The 100 000 Genomes Project combines high throughput automated central tiering analysis with local detailed review and validation of variants by clinical laboratory teams. This model seeks to ensure consistent, safe clinical practice while optimising the diagnostic rate and use of staff, expertise, and technology. National NHS England validation and reporting groups for both rare disease and cancer, with representation from each genomic medicine centre, are establishing consensus standards and driving national consistency in interpretation of genomic variants.[37]

Diagnostic rate

The initial overall diagnostic rate in the rare disease programme from preliminary central review of tiers 1 and 2 small variants is 22%, with diagnostic rates higher in certain disease categories such as intellectual disability. This diagnostic rate will increase after feedback from detailed local clinical review and analyses of additional variant types such as copy number variants. Nevertheless, the diagnostic rate will be influenced by the extensive pretesting undertaken in UK standard genetics practice and inclusion of sizeable numbers of people with late onset disorders that have reduced penetrance.

New NHS genomic medicine service

The 100 000 Genomes Project has catalysed evolution of informatics infrastructure, development of data pipelines, expansion of workforce capacity, development of skills in whole genome sequencing technology, and new expert national professional networks and ways of working. Health Education England has also developed training in genomics for the wider NHS workforce (fig 4).

The project is nearly finished, but in late 2018 whole genome sequencing will become part of an NHS England commissioned national genomic medicine service for rare inherited disease and cancer. The service will provide centralised accredited whole genome sequencing, with the results returned to a national network of genomic laboratory hubs, where other genomic tests will also be done. NHS England will publish a national genomic test directory[38,39] that will be linked to a national system for ordering genetic tests. This will support more systematic access to genomic testing across the country, with capture of clinical outcomes to enable ongoing evaluation.

Key messages

  • The 100 000 Genomes Project has established delivery of whole genome sequencing in the NHS

  • The project has driven transformation of local systems at participating centres, including tissue handling, collection of data, and processing of results

  • An NHS genomic medicine service is being established by NHS England to deliver systematic access to genomic tests, including whole genome sequencing

  • Whole genome sequencing can enable rapid diagnosis in children with rare disease

  • Whole genome sequencing of tumour tissue can inform selection of treatments for cancer


We thank the participants of the 100 000 Genomes Project, the NHS England Genomics team that has led the NHS implementation and service transformation and the future genomic medicine service developments, the NHS staff undertaking recruitment and sample processing, as well as all additional people who have supported the project in multitudes of other ways.

Contributors and sources: The manuscript was drafted by CT with contribution from MJC, SH, FBP, DH, RHS, ET, LJ, NM, AR, MB, KS, TH, and JM. FBP, DH, LD, and AT generated images for publication. All authors contributed to review of the final manuscript. We also acknowledge the contribution of members of the Genomics England Science and Bioinformatics Groups: John Ambrose, Maria Athanasopoulou, Wasim Bari, Bahareh Beshavardi, Tom Billins, Marta Bleda, Chris Boustred, Helen Brittain, Lisa Carr, Georgia Chan, James Chancellor, Jacobo Coll-Moragon, Clare Craig, Louise Daugherty, Liz Edwards, Rebecca Foulger, Pedro FurioTari, Kristina Garikano, Oleg Gerasimenko, Elena Hoxha, Samuel Hubble, Rob Jackson, Corey Johnson, Dalia Kasperaviciute, Melis Kayikci, Lea Lahnstein, Claudia Langenberg, Kay Lawson, Sarah Leigh, Greg Lever, Javier Lopez, Angela Matchan, Kenan McGrath, Clodagh McGuire, Ignacio Medina, Martina Mijuskovic, George Morrissey, Michael Mueller, Anna Need, Olivia Niblock, Christopher Odhams, Tracy Odogie, Daniel Pérez-Gil, Toby Petty, Dimitris Polychronopoulos, Lana Redmonds, Pablo Riesgo-Ferreiro, Laura Riley, Keith Rogerson, Kevin Savage, Kushmita Sawant, Ian Thompson, Simon Thompson, Carolyn Tregidgo, Arianna Tucci, Michael Walker, Sarah Watters, Matthew Welland, Eleanor Williams, and Suzanne Wood.


  • Feature, doi: 10.1136/bmj.k1267
  • Competing interests: We have read and understood BMJ policy on declaration of interests and have no relevant interests to declare.

  • Provenance and peer review: Commissioned; externally peer reviewed