Views And Reviews No Holds Barred

Margaret McCartney: Innovation without sufficient evidence is a disservice to all

BMJ 2017;358:j3980 (Published 05 September 2017)
Margaret McCartney, general practitioner, Glasgow
margaret{at}

Breeding with the vigour of rabbits, a variety of private GP services are springing up and jostling for your attention. Why spend hours phoning your own GP when you can choose a private GP by looks and price, having immediate access online or by app without needing to leave your home? Recruitment for GPs in such companies is increasingly competitive, with advertisements promising hourly rates and perks unfamiliar to NHS employees.

Some of these companies have bigger ambitions. “Why couldn’t Babylon be a patient’s NHS GP?” asks Ali Parsa, CEO of Babylon—an IT company whose app, he says, is faster and more accurate than doctors in risk assessing cases.1 The NHS in north London has gone further, commissioning his company to offer “symptom advice” through a smartphone app service—“NHS 111 powered by Babylon”2—although the NHS 111 telephone service still exists.

Babylon has several offerings intersecting medicine and IT. It provides private GP services by video link and has corporate contracts with Bupa, Sky, and Boots. Earlier this year the Advertising Standards Authority, after I’d contacted it, told Babylon to stop saying that it had the “world’s best doctors” and the “world’s most advanced AI [artificial intelligence].”

Good IT is something the NHS has struggled for years to provide. It’s clear that NHS 111 has been beset by problems.3 But how do we put better technology into the NHS? If we want to hand over clinical triage to smartphone apps we need to know that they work to a standard we can set and test independently. That should mean robust and independent trials. But we don’t seem to have them.

The initial information sheet that Babylon co-wrote with the NHS contained a set of “frequently asked questions.”4 Under “Is it safe?” the answer was, “An independent study tested the app’s symptom checker against nurses and junior doctors. It found that the app gave safe advice 100% of the time, whereas doctors gave safe advice 98% of the time . . . it also found that the app is more accurate than doctors or nurses.”4

But this study was not “independent”: it had six authors,5 and five are current or past Babylon employees,6-10 while the sixth is its owner (Parsa). The study itself, which is not indexed in PubMed, was not a real life trial of how humans use the app but a simulation using actors and invented scenarios. I would regard it not as a trial but as a description of a development process, omitting essential details about the clinical scenarios tested. I can’t find any other publicly available testing of this system. Although this claim on safety has now been removed, a pilot project such as this is, in my view, not adequate.

Babylon says that its partnership with NHS 111 is itself an independent pilot programme set up to evaluate the technology. “Before the pilot commenced, the local NHS conducted their own validation of the Babylon triage service against all their serious incidents over a recorded period, and found the service to be completely safe,” the company says. Babylon also says that its symptom checking products have been shown to work well for UK patients, relieving pressure on healthcare systems.

It’s not that I don’t think technology has potential. It does. But we need high quality evidence, which should mean high quality trials. Is an automated app better than NHS 111, or are humans needed to override problems in automation? When we lower the threshold for consultation we change the demographic such that false positives easily become more common, potentially leading to unnecessary emergency department or GP attendance, which in turn makes it harder for sick people to get attention. If this happens we will be creating more dilemmas, not solving them. Innovation without sufficient evidence is a disservice to all.