Machine learning shows similar performance to traditional risk prediction models

Further scrutiny needed before they are used to make clinical decisions for individual patients, say researchers

Some claim that machine learning technology has the potential to transform healthcare systems, but a study published by The BMJ finds that machine learning models have similar performance to traditional statistical models and share similar uncertainty in making risk predictions for individual patients.

The NHS has invested £250m ($323m; €275m) to embed machine learning in healthcare, but researchers say the level of consistency (stability) within and between models should be assessed before they are used to make treatment decisions for individual patients.

Risk prediction models are widely used in clinical practice. They use statistical techniques alongside information about people, such as their age and ethnicity, to identify those at high risk of developing an illness and make decisions about their care. 

Previous research has found that a traditional risk prediction model such as QRISK3 performs very well at the population level, but carries considerable uncertainty when predicting risk for individual patients.

Some studies claim that machine learning models can outperform traditional models, while others argue that they cannot provide explainable reasons behind their predictions, potentially leading to inappropriate actions.

What’s more, machine learning models often ignore censoring, which occurs when patients are lost to follow-up (either by error or by being unreachable) during a study. A model that ignores censoring assumes these patients are disease free, leading to biased predictions.
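The effect of ignoring censoring can be illustrated with a minimal sketch. The ten patient records below are hypothetical and are not taken from the study, and the Kaplan-Meier estimator is used here only as one standard way of accounting for censoring, not as the method used by any particular model in the research.

```python
# Each record is (years of follow-up, event observed?).
# Hypothetical data: False means the patient was censored (lost to follow-up
# or still event free when the study ended).
records = [
    (2, True), (3, False), (4, False), (5, True), (6, False),
    (7, False), (8, True), (10, False), (10, False), (10, False),
]


def naive_risk(records):
    """Share of patients with an observed event.

    Censored patients are counted as disease free, which is the
    assumption described in the text above.
    """
    return sum(1 for _, event in records if event) / len(records)


def kaplan_meier_risk(records, horizon):
    """Risk by `horizon` as 1 minus the Kaplan-Meier survival estimate.

    Censored patients contribute to the risk set only while they are
    still under observation.
    """
    survival = 1.0
    at_risk = len(records)
    for t in sorted({time for time, _ in records}):
        if t > horizon:
            break
        events = sum(1 for time, event in records if time == t and event)
        leaving = sum(1 for time, _ in records if time == t)
        if events:
            survival *= 1 - events / at_risk
        at_risk -= leaving  # events and censored patients both leave the risk set
    return 1 - survival


naive = naive_risk(records)
km = kaplan_meier_risk(records, horizon=10)
print(f"Ignoring censoring: {naive:.3f}")
print(f"Kaplan-Meier:       {km:.3f}")
```

In this toy example the estimate that ignores censoring is lower than the Kaplan-Meier estimate, because patients who dropped out early are treated as if they had been followed for the full ten years without an event.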

To explore these issues further, researchers in the UK, China and the Netherlands set out to assess the consistency of machine learning and statistical techniques in predicting individual level and population level risks of cardiovascular disease and the effects of censoring on risk predictions.

They assessed 19 different prediction techniques (12 machine learning models and seven statistical models) using data from 3.6 million patients registered at 391 general practices in England between 1998 and 2018.

Data from general practices, hospital admission and mortality records were used to test each model’s performance against actual events.

All 19 models yielded similar population level performance. However, cardiovascular disease risk predictions for the same patients varied substantially between models, especially in patients with higher risks.

For example, a patient with a cardiovascular disease risk of 9.5-10.5% predicted by the traditional QRISK3 model was given a risk of 2.9-9.2% by one alternative model and 2.4-7.2% by another.

Models that ignored censoring (including commonly used machine learning models) substantially underestimated risk of cardiovascular disease.

Of the 223,815 patients with a cardiovascular disease risk above 7.5% with QRISK3 (a model that does consider censoring), 57.8% would be reclassified below 7.5% when using another type of model, explain the researchers.
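As a rough illustration of what this reclassification figure means, the sketch below converts the reported percentage into an approximate patient count and shows how such a share could be computed from paired predictions. The paired risk values and the function name are hypothetical, and the count is approximate because the published percentage is rounded.

```python
THRESHOLD = 0.075  # the 7.5% risk threshold quoted above

# Approximate number of patients reclassified, from the reported figures.
patients_above_threshold = 223_815
reclassified_percent = 57.8
approx_reclassified = round(patients_above_threshold * reclassified_percent / 100)


def reclassified_share(reference_risks, other_risks, threshold=THRESHOLD):
    """Among patients above `threshold` under the reference model,
    return the share that falls below it under the other model."""
    pairs = [
        (ref, other)
        for ref, other in zip(reference_risks, other_risks)
        if ref > threshold
    ]
    moved_below = sum(1 for _, other in pairs if other < threshold)
    return moved_below / len(pairs)


# Hypothetical predictions for five patients from two models.
reference = [0.08, 0.10, 0.09, 0.12, 0.05]
other = [0.06, 0.11, 0.07, 0.05, 0.04]
share = reclassified_share(reference, other)

print(f"Approximate patients reclassified: {approx_reclassified:,}")
print(f"Share reclassified in toy example: {share:.0%}")
```

On the reported figures, this works out to roughly 129,000 patients who would fall on the other side of the threshold depending on which model was chosen.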

The researchers acknowledge some limitations in comparing the different models, such as the fact that more predictors could have been considered. However, they point out that their results remained similar after more detailed analyses, suggesting that they withstand scrutiny.

“A variety of models predicted risks for the same patients very differently despite similar model performances,” they write. “Consequently, different treatment decisions could be made by arbitrarily selecting another modelling technique.”

As such, they suggest these models “should not be directly applied to the prediction of long term risks without considering censoring” and that the level of consistency within and between models “should be routinely assessed before they are used to inform clinical decision making.”

[Ends]

04/11/2020

Notes for editors
Research: Consistency of variety of machine learning and statistical models in predicting clinical risks of individual patients: longitudinal cohort study using cardiovascular disease as exemplar
Journal: The BMJ

Funding: China Scholarship Council

Link to Academy of Medical Sciences press release labelling system: https://press.psprings.co.uk/AMSlabels.pdf

Peer reviewed? Yes
Evidence type: Observational
Subjects: Risk prediction models
