Opinion

Making the black box more transparent: improving the reporting of artificial intelligence studies in healthcare

BMJ 2024; 385 doi: https://doi.org/10.1136/bmj.q832 (Published 16 April 2024) Cite this as: BMJ 2024;385:q832

Linked Research Methods & Reporting

TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods

Linked Editorial

TRIPOD+AI: an updated reporting guideline for clinical prediction models

Gary S Collins, professor

Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK

Correspondence to: G S Collins gary.collins@csm.ox.ac.uk (or @GSCollins on X)

The updated TRIPOD+AI reporting guidelines can help guide the writing of research on artificial intelligence in healthcare to improve the transparency and usefulness of reporting

As a scientist, when I read a paper I want to know why and how a piece of research was carried out, what the results are, what they mean, and what the implications of the findings are. But incomplete and poor reporting is a blight on the growing volume of published healthcare research and cannot be ignored. If research lacks the details necessary to judge the validity of a study, it is not useful. Incomplete reporting puts patients at risk, undermines public confidence, squanders valuable resources, and stalls medical advancement when studies cannot be replicated and built on.1 Transparent reporting is also the cornerstone of reproducibility.

My main research interests are in clinical prediction models. These models estimate a person's risk of having an as yet undiagnosed disease, or the future risk of developing a disease or of disease progression. Unfortunately, these studies are far from immune to the poor reporting described above and are often flagged as problematic.2 Frustrated with the poor reporting of clinical prediction model studies, in 2010, along with Carl Moons, Hans Reitsma, and the late Doug Altman (founder of the EQUATOR Network), I embarked on an initiative to improve the situation. We developed the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) reporting recommendations, which were published in 2015.3 4 The recommendations focused on models developed using regression based approaches, because these were the prevailing methods at the time.

Fast forward a few years, and the research and healthcare landscape has been disrupted by major advances in artificial intelligence (AI). In a short period of time, the number of studies involving AI has exploded to an extent that few predicted. AI methods (ie, machine learning) are widely lauded as having the potential to take clinical prediction to the next level, through their flexibility, ability to model complex associations, and scalability. But these studies are no better reported than those using more traditional statistical approaches. We carried out a series of studies showing that they were poorly reported,5 6 at high risk of bias,7 8 and prone to spin and hype, overinflating their ability to predict.9

In 2019, we began the process of updating TRIPOD to provide recommendations that are contemporary and fit for purpose in the modern predictive analytics landscape, as well as including items on open science and public and patient involvement. We set out to develop inclusive and harmonised guidance relevant to researchers using either regression based approaches or AI.10 AI methods are complex and often characterised as "black box" because of a lack of understanding about how they work; it is therefore crucial that studies using these methods to develop prediction models are transparently and completely described. We need to know what was done, especially since there are already concerns around algorithmic bias and the propensity of AI to create or exacerbate health inequity. Complete reporting can help identify potential red flags, either during peer review or after publication.

In our published research in The BMJ,10 developing TRIPOD+AI involved revisiting the original TRIPOD recommendations, proposing updated wording, suggesting new items, and taking a consensus based approach with a broad international stakeholder group. Over 200 people from 27 countries participated in the Delphi survey that helped us shape the guidance. The survey was followed by an online consensus meeting, attended by 28 international participants, to agree the final set of reporting recommendations. Noteworthy additions include an emphasis on fairness embedded throughout, a new section on open science practices, and an item on public and patient involvement. The new open science section places greater expectations on researchers to register their study, make a study protocol available, and share code and data. We will periodically revisit the recommendations to ensure they reflect contemporary practice and remain usable.

AI is changing research. To ensure that it has value and, importantly, does not harm people or exacerbate existing inequities, we need transparency. The TRIPOD+AI reporting guidelines have been developed to help guide the writing of research, ensuring that AI research is fit for purpose and that findings are presented in a usable format.

Footnotes

  • Competing interests: GSC is a statistics editor for The BMJ.

  • Provenance and peer review: Commissioned; not externally peer reviewed.

References