What’s holding up the big data revolution in healthcare?

BMJ 2018;363 (Published 28 December 2018). Cite this as: BMJ 2018;363:k5357
  1. Kiret Dhindsa, postdoctoral fellow (1)
  2. Mohit Bhandari, professor (2)
  3. Ranil R Sonnadara, associate professor (2)

  (1) Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
  (2) Department of Surgery, McMaster University, Hamilton, Ontario, Canada

  Correspondence to: K Dhindsa dhindsj{at}

Poor data quality, incompatible datasets, inadequate expertise, and hype

Big data refers to datasets that are too large or complex to analyse with traditional methods.[1] Instead we rely on machine learning: self updating algorithms that build predictive models by finding patterns in data.[2] In recent years, a so called “big data revolution” in healthcare has been promised[3-5] so often that researchers are now asking why this supposed inevitability has not happened.[6] Although some technical barriers have been correctly identified,[7] there is a deeper issue: many of the data are of poor quality and in the form of small, incompatible datasets.

Current practices around collection, curation, and sharing of data make it difficult to apply machine learning to healthcare on a large scale. We need to develop, evaluate, and adopt modern health data standards that guarantee data quality, ensure that datasets from different institutions are compatible for pooling, and allow timely access to datasets by researchers and others. These prerequisites for …
