Development and validation of outcome prediction models for aneurysmal subarachnoid haemorrhage: the SAHIT multinational cohort study
BMJ 2018;360 doi: https://doi.org/10.1136/bmj.j5745 (Published 18 January 2018)
Cite this as: BMJ 2018;360:j5745
Re: Development and validation of outcome prediction models for aneurysmal subarachnoid haemorrhage: the SAHIT multinational cohort study
The SAHIT score, an outcome prediction model for patients with aneurysmal subarachnoid haemorrhage (SAH), was recently developed and validated by Jaja and colleagues and published in The BMJ.[1] The tool was derived from a multinational cohort dataset of 10,936 patients drawn from seven randomised controlled trials and two prospective hospital registries with an extensive enrolment period (1983-2012). The scale comprises eight variables across three models: core (age, premorbid history of hypertension, and neurological grade on admission stratified by the World Federation of Neurological Surgeons [WFNS] scale); neuroimaging (core plus aneurysm location and size, and Fisher grade); and full (neuroimaging plus aneurysm treatment method). Surprisingly, other factors known to be associated with poor outcome in aneurysmal SAH, such as rebleeding before treatment, were not included. In the model development cohort, the largest group of patients (47%) presented with a good neurological grade (WFNS I), and 21% presented with a high grade (WFNS IV-V). Imputed data ranged between 7.9% and 10.6% across the three models. The investigators' external validation showed that the models performed better when predicting 3-month functional outcome than mortality: discrimination for mortality prediction, measured by the area under the receiver operating characteristic curve (AUC), ranged between 0.76 and 0.78, and mis-calibration was noted, with the models tending to underestimate observed mortality. We therefore sought to externally validate the SAHIT models' prediction of mortality in a heterogeneous cohort of patients from a single neurological centre, differentiating between the patients' neurological grade 1) at admission and 2) after acute neurological resuscitation (at treatment), usually with an external ventriculostomy drain (EVD), since 59% of our patients required an EVD on admission.
Data from 345 patients with SAH secondary to ruptured intracranial aneurysms were collected by investigators in a contemporary 6-year retrospective cohort study (2010-2016); 45% of patients presented with high grade (WFNS IV-V) on admission, and the majority of aneurysms (65%) were repaired with endovascular coiling.[2]
Discrimination of the core, neuroimaging and full mortality models, which refers to the ability of the models to differentiate between patients who did or did not survive, was evaluated with the area under the receiver operating characteristic curve (AUC). The AUC ranges between 0.5 and 1, with larger values representing better discriminating ability. Calibration, which refers to the ability of the models to accurately estimate the probability of mortality, was characterized with the calibration intercept. The calibration intercept reflects the average difference between observed and predicted mortality risks, with a value of 0 indicating perfect calibration. The calibration slope reflects the agreement between the strength of the predictors in the original cohort and our external cohort; a slope of 1 indicates perfect agreement. Multiple imputation using chained equations[3] was used to address the problem of missing data, with 28 imputed datasets and predictive mean matching as the method of imputation. The AUC, the calibration intercept and the calibration slope were pooled using Rubin's rules.[4-5]
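As a concrete illustration of these performance measures, the sketch below computes the AUC (via the Mann-Whitney formulation) and the calibration intercept and slope by logistic recalibration of the linear predictor logit(p). This is an illustrative re-implementation in plain Python of the measures described above, not the actual validation code, and the data in the usage example are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def auc(y, p):
    """AUC as the probability that a randomly chosen non-survivor (y=1)
    receives a higher predicted risk than a randomly chosen survivor
    (y=0); ties count as half."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((pi > pj) + 0.5 * (pi == pj) for pi in pos for pj in neg)
    return wins / (len(pos) * len(neg))

def calibration(y, p, iters=2000, lr=0.5):
    """Logistic recalibration of the linear predictor logit(p).
    Returns (intercept, slope): the intercept is estimated with the
    slope fixed at 1 (calibration-in-the-large, target 0); the slope
    comes from fitting both coefficients (target 1)."""
    lp = [math.log(pi / (1.0 - pi)) for pi in p]
    n = len(y)
    # fit intercept a and slope b by gradient descent on the log-likelihood
    a, b = 0.0, 1.0
    for _ in range(iters):
        res = [sigmoid(a + b * x) - yi for yi, x in zip(y, lp)]
        a -= lr * sum(res) / n
        b -= lr * sum(r * x for r, x in zip(res, lp)) / n
    # calibration-in-the-large: intercept c with the slope fixed at 1
    c = 0.0
    for _ in range(iters):
        c -= lr * sum(sigmoid(c + x) - yi for yi, x in zip(y, lp)) / n
    return c, b

# Hypothetical, perfectly calibrated example: in each risk group the
# observed event rate equals the predicted risk, so the recalibration
# returns an intercept near 0 and a slope near 1.
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
p = [0.2] * 5 + [0.8] * 5
intercept, slope = calibration(y, p)
```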
Summary of missing data
The subset of predictors for the core model had no missing data. For the neuroimaging model, we imputed 25 out of 2070 (1.2%) values. For the full model, we imputed 25 out of 2415 (1.0%) values.
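Performance measures estimated separately in each imputed dataset are then combined with Rubin's rules: the pooled estimate is the mean across imputations, and the total variance adds the within-imputation and between-imputation components. A minimal sketch, with hypothetical numbers rather than our study's estimates:

```python
def rubin_pool(estimates, variances):
    """Pool a performance estimate (e.g. AUC or calibration slope)
    across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    qbar = sum(estimates) / m                        # pooled point estimate
    w = sum(variances) / m                           # within-imputation variance
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)  # between-imputation
    t = w + (1 + 1 / m) * b                          # total variance
    return qbar, t

# Hypothetical per-imputation AUCs and their variances
est = [0.88, 0.90, 0.89]
var = [0.001, 0.001, 0.001]
pooled, total_var = rubin_pool(est, var)
```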
External validation: complete-case and multiple imputation results
In the complete-case analysis, the core model is better calibrated with WFNS at admission (slope: 0.99, intercept: -0.04), with the negative intercept suggesting that the model tends to slightly overestimate the mortality risk. With WFNS at treatment, the calibration intercept moves further from 0 (intercept: -0.07) and the calibration slope moves further from 1 (slope: 1.15), reflecting poorer calibration. Discrimination, however, is better with WFNS at treatment (AUC at treatment: 0.89; AUC at admission: 0.86).
Results from the multiple imputation analysis show that the neuroimaging model is better calibrated with WFNS measured at admission, with both the intercept and the slope close to their respective target values (intercept: -0.02, slope: 1.03). The model with WFNS at treatment underestimates the mortality risk for patients in higher risk categories and overestimates it for patients in lower risk categories (intercept: -0.04, slope: 1.18), reflecting poorer calibration and thus worse estimation of mortality risks. Discrimination, however, is better with WFNS at treatment (AUC at treatment: 0.89; AUC at admission: 0.86).
Results with the full model are similar to those of the neuroimaging model: the model is better calibrated with WFNS at admission (intercept: -0.02, slope: 1.10, AUC: 0.90) but better discriminated with WFNS at treatment (intercept: -0.02, slope: 1.15, AUC: 0.91). However, mis-calibration was noted in the multiple imputation analysis: the full model overestimated observed mortality in high grade patients.
The complete-case core model and the multiple imputation neuroimaging model both have very good calibration with WFNS at admission (core intercept: -0.04, core slope: 0.99; neuroimaging intercept: -0.02, neuroimaging slope: 1.03). All models show good discriminating ability, with AUCs between 0.86 and 0.91, and using WFNS at treatment generally yields better discrimination: the full model with WFNS at treatment has the best discrimination (AUC: 0.91), compared with the core and neuroimaging models with WFNS at treatment (both AUCs: 0.89). Good calibration matters most in prognostic settings, where the primary focus is on predicting mortality risks, whereas a well-discriminating model would be preferable in diagnostic settings, where classifying patients into survivor or non-survivor categories is most important. Keeping in mind that the SAHIT prediction model is a prognostic tool, the better calibrated models, such as the neuroimaging and core models with WFNS at admission, are preferred over a better discriminated one.
External validation of the SAHIT models using our single-centre dataset confirmed their validity in predicting mortality, with similar discrimination and calibration. Surprisingly, the WFNS grade at admission yielded better calibration than the WFNS grade at treatment. Since calibration is the more important property for prognosticating mortality, we found that the SAHIT prediction models are best used with the admission WFNS grade rather than the grade after acute neurological resuscitation.
Yasser B. Abulhasan
Mark R. Angle
1. Jaja BNR, Saposnik G, Lingsma HF, et al. Development and validation of outcome prediction models for aneurysmal subarachnoid haemorrhage: the SAHIT multinational cohort study. BMJ 2018;360:j5745. doi: 10.1136/bmj.j5745 [published Online First: 2018/01/20]
2. Abulhasan YB, Alabdulraheem N, Simoneau G, et al. Mortality after Spontaneous Subarachnoid Hemorrhage: Causality and Validation of a Prediction Model. World Neurosurg 2018;112:e799-e811. doi: 10.1016/j.wneu.2018.01.160 [published Online First: 2018/02/08]
3. White IR, Royston P, Wood AM. Multiple imputation using chained equations: Issues and guidance for practice. Stat Med 2011;30(4):377-99. doi: 10.1002/sim.4067 [published Online First: 2011/01/13]
4. Vergouwe Y, Royston P, Moons KG, et al. Development and validation of a prediction model with missing predictor data: a practical approach. J Clin Epidemiol 2010;63(2):205-14. doi: 10.1016/j.jclinepi.2009.03.017 [published Online First: 2009/07/15]
5. Debray TP, Damen JA, Snell KI, et al. A guide to systematic review and meta-analysis of prediction model performance. BMJ 2017;356:i6460. doi: 10.1136/bmj.i6460 [published Online First: 2017/01/07]
Competing interests: No competing interests