Margaret McCartney: Innovation without sufficient evidence is a disservice to all
BMJ 2017; 358 doi: https://doi.org/10.1136/bmj.j3980 (Published 05 September 2017) Cite this as: BMJ 2017;358:j3980
All rapid responses
It's a shame that Dr Butt does not use his response to put evidence supporting Babylon's technology forward.
It's no secret that I support the founding values of the NHS and my DOI has been publicly available for several years. I've even written a whole book about the State of Medicine - keeping the promise of the NHS. In it, I argue that non-evidence-based policy making, and serial IT innovations, have wasted money and provided poor value for patients.
I have been asking for details of the pilot in North London since March 2017. I have been given no details about the evaluation and how safety and accuracy will be assessed. I would be grateful if Dr Butt would supply these. Dr Butt says they have been publishing their results. Neither NHS London nor Babylon could direct me towards any publication of any results, and the only paper that seems to have been published is https://arxiv.org/pdf/1606.02041.pdf, which is not a real life study of use. I have also been trying to find out what the financial arrangement is between the NHS and Babylon: no information has yet been forthcoming and I would be grateful if Dr Butt could supply this.
Additionally, Babylon are still saying on their website under 'is it safe?' that "An independent study tested the app’s symptom checker against nurses and junior doctors. It found that the app gave safe advice 100% of the time, whereas doctors gave safe advice 98% of the time, and nurses gave safe advice 97% of the time. It also found that the app is more accurate than doctors or nurses, sending patients to the most appropriate place for their more often than either doctors or nurses. This means that the app not only gives safe advice, but also saves patients from spending time in Accident and Emergency or at their GP’s surgery when this isn’t necessary." https://support.babylontech.co.uk/hc/en-us/articles/115000931729-Is-it-s...
This was not an independent study; it was not a real life study; it is not capable of examining real life harms.
New technology should be treated like any other medical intervention capable of benefit and harm: it should be tested in high quality trials capable of finding unintended harm as well as benefit. When I ask for evidence and am rebuffed with a supply of popularity scores, I think I am right to be concerned.
Competing interests: I wrote the article, and I have and will campaign for the founding values of the NHS; DOI http://www.whopaysthisdoctor.org/doctor/6
It is not surprising that the evidence Babylon adduce in their marketing materials is not indexed in PubMed. The site on which it appears, arxiv.org, does not peer review the papers it hosts, but uses a moderation system. Moderation does not, as conventional peer review would, test the quality and validity of a paper, merely whether it contravenes arXiv policies on format, topic, content, copyright and frequency of submission, and whether it has been submitted to the most appropriate archive within arXiv. See https://arxiv.org/help/moderation
PubMed on the other hand,
There are sound reasons for arXiv’s approach; it allows for the speedy dissemination of research results, without paywalls. Papers deposited with arXiv may appear later in conventional journals. But the presence of a paper in the repository is no guarantee of its quality or validity.
Competing interests: No competing interests
Yesterday a tweet was shared showing a set of questions and responses, apparently from the Babylon app, regarding left shoulder and arm pain in a 44-year-old, 30-a-day smoker. After a few questions and answers (including one about recent trauma, to which the answer was no) the app concluded that this was a joint sprain and should be treated with rest and an ice pack. This is truly dangerous territory. My only previous brush with automated medicine was investigating an online doctor service: it was prepared to prescribe me metronidazole for £44 to treat an ear infection after wholly inadequate and mostly irrelevant questions.
It may be that there is a way of writing medical software that is safe without sending everyone to A and E or a GP, but I have yet to meet it. Meanwhile, promises which are just not true are made to patients about the quality and safety of the advice they will receive, and hopelessly optimistic forecasts are made about the money the NHS will save in a brave new digital world where patients are diagnosed by an app (and treated by robots?). The quest for savings by commissioners and the scramble for profits by private providers may be standing in the way of the careful evaluation of the evidence base which we need before these innovations can be safely adopted.
Competing interests: No competing interests
Babylon’s partnership with NHS111 in North London is by definition a time-limited, independent pilot programme in one region, set up precisely so that it can be evaluated rigorously to assess any potential benefits for patients on a wider scale. As you would expect, before the pilot commenced, the local NHS conducted their own validation of the Babylon triage service against all their serious incidents over a recorded period, and found the service to be completely safe.
In this context, Dr. McCartney’s call for additional trials and evidence - before a pilot programme can even commence to gather the evidence she claims to desire - seems somewhat confused. We will continue to publish our results, but the reality is that we’re simply doing what innovative NHS GPs have been doing for years – establishing a pilot, testing our systems to give high confidence they are safe and effective, publishing the results, and working with the NHS on a full evaluation.
Of course, given that Dr. McCartney is a supporter of a political campaign that exists to prevent any new organisations from innovating to support the NHS (1), it is hard to know if any evidence that supports an alternative view could satisfy her, but anyone who claims to value evidence-based judgements should set aside ideological bias and wait for tangible results before attempting to pass verdict.
As a GP who previously spent years feeling frustrated by a system that piles unsustainable pressure on clinicians to meet soaring levels of demand, and prevents us providing the type of patient care we entered medicine to deliver, I have been amazed by how babylon’s technology can transform both patient care and GP workloads.
Since we introduced our artificially intelligent symptom checker to babylon subscribers, we have reduced same day GP consultations by 40% by appropriately providing alternative care, such as reliable healthcare information, self-management advice and pharmacy support. This is quite simply a game-changer for the economics of delivering GP services. As a result, we have been able to keep waiting times - for our list of hundreds of thousands of patients in the UK, and 10% of the adult population in Rwanda - at a matter of hours rather than days or weeks. That’s one of the reasons that our patient ratings give us one of the highest customer satisfaction rankings in the country, with a net promoter score of 93, and 93% of patients giving four or five star ratings.
There will always be politically-minded people who resist innovation to shore up the status quo, but with waiting times hitting three weeks in some areas, and GPs retiring and resigning in droves, anything that makes workloads lighter and primary care more accessible deserves a fair hearing. At babylon we’re currently in discussions with 35 different countries that have approached us about using our technology to transform their own primary care systems. It would be a great disservice to UK patients if the NHS were not able to benefit from this type of technology.
1. http://margaretmccartney.com/welcome/ – I donate a small amount of money monthly to Keep Our NHS Public.
Competing interests: Employee of babylon
"The computer says no"
The scientific validation of clinical apps is a significant, although not insurmountable, concern (1) (2).
The potential for indiscriminate utilisation of such clinical apps would be another major concern. The gate-keeping financial managers might decide to use these apps for triaging of patients in primary and secondary care (3).
Like many frustrated consumers navigating "interactive voice response systems" and consumer helplines, many patients utilising these clinical apps might end up longing for human contact.
1. McCartney M. Innovation without sufficient evidence is a disservice to all. BMJ 2017;358:j3980. doi: https://doi.org/10.1136/bmj.j3980
2. Tang H, Ng J. Googling for a diagnosis—use of Google as a diagnostic aid: internet based study. BMJ 2006; 333 doi: https://doi.org/10.1136/bmj.39003.640567
3. Cronin RM, Fabbri D, Denny JC, Rosenbloom ST, Jackson GP. A comparison of rule-based and machine learning approaches for classifying patient portal messages. Int J Med Inform. 2017 Sep;105:110-120. doi: 10.1016/j.ijmedinf.2017.06.004. Epub 2017 Jun 23.
Competing interests: No competing interests