Usefulness of data from magnetic resonance imaging to improve prediction of dementia: population based cohort study
BMJ 2015; 350 doi: https://doi.org/10.1136/bmj.h2863 (Published 22 June 2015) Cite this as: BMJ 2015;350:h2863
- Blossom C M Stephan, lecturer1,
- Christophe Tzourio, professor23,
- Sophie Auriacombe, neurologist4,
- Hélène Amieva, professor5,
- Carole Dufouil, research director23,
- Annick Alpérovitch, senior research director2,
- Tobias Kurth, research director23
- 1Institute of Health and Society, Newcastle University, Newcastle, UK
- 2Inserm Research Centre for Epidemiology and Biostatistics (U897), Team Neuroepidemiology, F-33000 Bordeaux, France
- 3University of Bordeaux, College of Health Sciences, F-33000 Bordeaux, France
- 4University Hospital, Department of Neurology, Memory Consultation, CMRR, F-33000 Bordeaux, France
- 5Inserm Research Centre for Epidemiology and Biostatistics (U897), Team Epidemiology and Neuropsychology of Brain Aging, F-33000 Bordeaux, France
- Correspondence to: T Kurth
- Accepted 8 May 2015
Objective To determine whether the addition of data derived from magnetic resonance imaging (MRI) of the brain to a model incorporating conventional risk variables improves prediction of dementia over 10 years of follow-up.
Design Population based cohort study of individuals aged ≥65.
Setting The Dijon magnetic resonance imaging study cohort from the Three-City Study, France.
Participants 1721 people without dementia who underwent an MRI scan at baseline and with known dementia status over 10 years’ follow-up.
Main outcome measure Incident dementia (all cause and Alzheimer’s disease).
Results During 10 years of follow-up, there were 119 confirmed cases of dementia, 84 of which were Alzheimer’s disease. The conventional risk model incorporated age, sex, education, cognition, physical function, lifestyle (smoking, alcohol use), health (cardiovascular disease, diabetes, systolic blood pressure), and the apolipoprotein genotype (C statistic for discrimination performance was 0.77, 95% confidence interval 0.71 to 0.82). No significant differences were observed in the discrimination performance of the conventional risk model compared with models incorporating data from MRI including white matter lesion volume (C statistic 0.77, 95% confidence interval 0.72 to 0.82; P=0.48 for difference of C statistics), brain volume (0.77, 0.72 to 0.82; P=0.60), hippocampal volume (0.79, 0.74 to 0.84; P=0.07), or all three variables combined (0.79, 0.75 to 0.84; P=0.05). Inclusion of hippocampal volume or all three MRI variables combined in the conventional model did, however, lead to significant improvement in reclassification measured by using the integrated discrimination improvement index (P=0.03 and P=0.04) and showed increased net benefit in decision curve analysis. Similar results were observed when the outcome was restricted to Alzheimer’s disease.
Conclusions Data from MRI do not significantly improve discrimination performance in prediction of all cause dementia beyond a model incorporating demographic, cognitive, health, lifestyle, physical function, and genetic data. There were, however, statistical improvements in reclassification, prognostic separation, and some evidence of clinical utility.
The prevalence of dementia is expected to double every 20 years, with about 35.6 million people worldwide affected in 2010 and 65.7 million predicted in 2030.1 The greatest increase is expected in the developing world. Despite the lack of an effective treatment for Alzheimer’s disease, it is estimated that a two year delay in onset could have a dramatic effect on its prevalence, reducing incidence by about 20%.2 Risk assessment for future disease to better focus intervention to those at highest risk and reduce the cost of unnecessary diagnostics is therefore a major issue, and it has been the aim of many recent studies.3 4 5 6 7 In that regard, the development of a simple accurate method for prediction of risk of dementia is a priority.
Having an accurate model for predicting future dementia in population based settings would be beneficial for several reasons. Firstly, targeting whole populations for modification of behaviour and reduction of risk factors might not always be cost effective, particularly when intervention strategies are costly or adherence rates low. Secondly, broad based targeting strategies are not always recommended— for example, when there are safety concerns or a high risk of side effects of treatment. A complementary approach could be to target high risk individuals by developing a model to accurately identify these individuals as early as possible without being too broad in risk selection. These individuals could then be referred for services, improved care, clinical trials, and, when intervention is available, stratified or individualised risk factor reduction to ultimately improve patient outcomes. In contrast, people at low risk could be excluded from further immediate follow-up thereby reducing costs, for example, of unnecessary diagnostics.
While ageing is the most universally accepted risk factor for dementia, other conventional risk factors have been incorporated into prediction models developed in populations aged ≥65, including poor neuropsychological test performance, subjective memory complaint, low educational attainment, sex, depression, history of cardiovascular (such as coronary heart disease, peripheral vascular disease), cerebrovascular (such as stroke), and metabolic (such as diabetes) diseases and their risk factors (such as hypertension, smoking, alcohol use, physical inactivity, obesity), blood based biomarkers (serum total cholesterol concentration), inability to perform activities of daily living (such as managing money and drugs), and genetic susceptibility (such as apolipoprotein e4 status).8 9 10 11 12 13 14 15 16 17 18 19 Non-traditional risk factors (such as denture fit and eye and ear trouble) have also been used.20 21 Predictive accuracy of current models has generally been low to moderate.7
Improvement in dementia risk prediction is needed for medical and research purposes to enhance diagnostic protocols (such as recruitment into clinical trials) and inform therapeutic decisions (such as personalised medicine). This could be achieved through the use of indicators of dementia derived from magnetic resonance imaging (MRI), including structural changes (such as hippocampal atrophy, medial temporal lobe atrophy, and evidence of white matter disease) and functional changes (such as positron emission tomography imaging of amyloidosis and tauopathy), in addition to assessment of cerebral spinal fluid (such as amyloid-β 42 and tau). Variables derived from both cerebral spinal fluid analysis and MRI have been proposed for stratification of patients for research purposes under the new “lexicon” of Alzheimer’s disease.22 23 24 The immediate implications of using such complex biomarkers are that they require technologically advanced, costly, burdensome (for participants as they can cause discomfort), and not easily available methods, especially in developing countries. This might offset any advantage of the use of such variables in predictive models. To make recommendations on the use of data from MRI in dementia risk prediction in population based settings, we need evidence on what this adds to more conventionally derived risk models.
We evaluated the value of markers from MRI added to a model incorporating previously proposed conventional risk factors for the prediction of all cause dementia and Alzheimer’s disease over 10 years’ follow-up in a large prospective population based cohort study.
The Three City Study is a multi-centre longitudinal population based cohort study, conducted in three French cities (Bordeaux, Dijon, and Montpellier), and designed to estimate the risk of dementia and cognitive impairment attributable to vascular factors. Full details of the methods and baseline characteristics of the participants have been published previously.25
The current study is solely based on the Dijon centre, the only centre in which a cerebral MRI was conducted. In brief, at the 1999 French census, the total population of Dijon was 153 800.26 To be eligible for recruitment a person had to be living in Dijon or its suburbs, registered on the electoral roll, aged ≥65, and not be living in an institution. Baseline interviews were undertaken in 1999-2001, with follow-up interviews conducted about two, four, six, and 10 years after enrolment.
From the original 4931 participants enrolled in Dijon, MRI was offered to those aged 65-80 who had been enrolled between June 1999 and September 2000. Although the consent rate for scanning was 83%, scans were obtained from 1923 participants (39%), as funding restrictions precluded MRI for everyone. From these 1923 participants we excluded from analysis 123 individuals with missing MRI variables (such as poor scan quality and artefacts), eight with prevalent dementia, and individuals with missing dementia status over the 10 years of follow-up (n=71 participants were seen only at baseline). The remaining sample included 1721 individuals. Follow-up time ranged from 0.6 to 10.6 years (mean 7.3 years, SD 2.3 years).
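The participant flow described above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in this paragraph; the variable names are illustrative and not part of the study.

```python
# Participant flow for the analytical sample, using the counts reported above.
enrolled_in_dijon = 4931
scanned = 1923

# Exclusions applied to the scanned participants.
exclusions = {
    "missing MRI variables": 123,
    "prevalent dementia": 8,
    "missing dementia status over follow-up": 71,
}

analytical_sample = scanned - sum(exclusions.values())
scanned_percent = round(100 * scanned / enrolled_in_dijon)

print(f"Scanned: {scanned} of {enrolled_in_dijon} ({scanned_percent}%)")
print(f"Analytical sample: {analytical_sample}")
```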
Comparison of the baseline characteristics of our analytical sample with all remaining age eligible participants without dementia in Dijon is shown in appendix 1. Individuals excluded because of missing dementia status at follow-up did not differ from those included with regard to sex (χ2=0.53, df=1, P=0.47), age (F1,1790=0.97, P=0.97), or educational attainment (χ2=1.92, df=1, P=0.38). Individuals without known dementia status over follow-up, however, performed significantly worse on the mini-mental state examination at baseline: median score 28 (interquartile range 27-29) for included v 27 (26-29) for excluded; Wilcoxon-Mann-Whitney test: z=2.42, P=0.02.
There was no patient involvement in the design, conduct, or interpretation of the study.
Trained psychologists collected data with a standardised questionnaire during a face to face interview at the participants’ home. Information included sociodemographic status, lifestyle, medical history, drug use, and assessment of cognitive and functional status. Clinical examination included measurements of blood pressure with a digital tensiometer (OMRON M4). Anthropometric measures included height and weight. Fasting blood samples were taken and markers (such as cholesterol, glucose) measured at a single laboratory.
Magnetic resonance imaging
Brain MRI scanning was undertaken on average 4.2 months (SD 3.0 months) after the baseline examination. Scanning was completed with a 1.5-Tesla Magnetom (Siemens, Erlangen, Germany). Usual MRI exclusions were applied. The scanning sequence and data extraction methods have been described in detail previously.27 28 In brief, raw data were converted to the ACR-NEMA standard format and then transformed for analysis and storage at the Department of Neurofunctional Imaging, Caen.25 This centre developed fully automatic image processing software for tissue segmentation and to detect and quantify white matter lesions.27 29 Automated image processing was also used to study brain volume (white matter, grey matter, and ventricles). Total intracranial volume was computed with voxel based morphometry techniques by summing grey matter, white matter, and cerebral spinal fluid volumes.
We selected three MRI measures for analysis including white matter lesion volume (calculated by summing the volumes of all white matter lesions detected), hippocampal volume (combining left and right sides), and total brain volume (defined as the sum of grey and white matter) as these are commonly assessed and have been previously associated with cognitive decline and dementia.30 31 32 33 All three MRI variables were normalised to total intracranial volume and converted to a percentage—that is, each volume (white matter lesion, hippocampal, and whole brain) was divided by total intracranial volume and multiplied by 100.
Both the total brain volume and hippocampal volume variables were normally distributed. In contrast, white matter lesion volume had a markedly skewed distribution and therefore scores were log transformed before analysis to decrease the impact of extreme observations.
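As a minimal sketch of the two preprocessing steps just described, the functions below express a volume as a percentage of total intracranial volume and log transform the white matter lesion measure. The function names are hypothetical, and the natural logarithm and a strictly positive lesion volume are assumptions: the text does not state the base of the logarithm or how zero volumes were handled.

```python
import math


def percent_of_intracranial_volume(volume, intracranial_volume):
    """Normalise a regional volume to total intracranial volume, as a percentage."""
    return 100.0 * volume / intracranial_volume


def log_transform(wml_percent):
    """Log transform the skewed white matter lesion measure.

    Assumption: natural logarithm of a strictly positive value.
    """
    if wml_percent <= 0:
        raise ValueError("log transform requires a strictly positive value")
    return math.log(wml_percent)


# Hypothetical volumes in the same unit (for example, cm3); not study data.
hippocampal_percent = percent_of_intracranial_volume(7.0, 1400.0)
wml_log_percent = log_transform(percent_of_intracranial_volume(5.6, 1400.0))
print(hippocampal_percent, wml_log_percent)
```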
Diagnosis of dementia
Diagnosis of dementia was established with a three phase procedure. All participants were first screened with scores from the mini-mental state examination34 (with education adjusted cut-off points) and the Isaac set test.35 In Dijon, in the second phase, a neurologist saw individuals with suspected incident dementia based on their performance on neuropsychology tests. In the third phase, a panel of independent neurologists reviewed all potential prevalent and incident cases to obtain consensus on diagnosis and aetiology according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders, fourth edition.36 With regard to subtypes of dementia, Alzheimer’s disease (possible and probable) was diagnosed according to criteria from the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA), and vascular dementia was based on history of vascular disease, Hachinski score, and MRI (whenever possible).25 37
Information on prognostic determinants of occurrence of dementia was extracted from the published literature on dementia risk prediction and previous findings from the Three City Study. We selected 13 variables. Sociodemographic factors included age, sex, and educational attainment. Lifestyle factors included smoking and alcohol use. Functional assessment was measured with the Lawton and Brody scale for instrumental activities of daily living38 that assesses ability to use the telephone, responsibility for drug treatment, managing money, mobility, shopping, grooming, housework, and laundry (the last three were asked in women only). Cognition was assessed with the mini-mental state examination, Benton visual retention test,39 and the digit span test.40 Appendix 2 shows the cut-off scores for cognitive impairment, adjusted for age and education. Health variables included cardiovascular events (combining self reported history of myocardial infarction, coronary surgery, coronary angioplasty, surgery of the arteries in the legs, or stroke requiring admission to hospital), metabolic disease (diabetes; self reported, high glucose concentration ≥7.0 mmol/L, or receipt of hypoglycaemic treatment including oral diabetic drugs or insulin), and systolic blood pressure (continuous). Genetic risk was assessed by apolipoprotein e4 status (coded as e4 positive v e4 negative).
Missing covariate information
Of the 1721 participants included in the analysis, few had missing information on covariates (<2.5%; range 0-2.4%). In total, 87 participants had missing information on covariates in the conventional prediction model, which was run on 1634 participants.
We tested differences in demographics between the groups with and without dementia using the χ2 test (for categorical variables), analysis of variance (for continuous normally distributed variables), or the Wilcoxon-Mann-Whitney test (for continuous, non-normally distributed variables). Data were censored at first diagnosis of dementia (for dementia cases) or last follow-up interview (for those without dementia).
A multivariable model incorporating all 13 conventional risk variables was calculated with Cox proportional hazards regression analysis. To test whether MRI data improve discrimination performance of this model, we performed Cox regression analyses with inclusion of each of the MRI variables and their combination. The proportional hazards assumption was tested with the estat phtest command in Stata and was not violated in any model (test carried out with the detail option in Stata to examine the proportional hazards assumption for each predictor as well as to carry out the global test). Non-linearity of the three MRI variables was checked visually by plotting the martingale residuals41 and statistically by using a Wald test (using the Stata command nlcheck, with the spline option). There was no evidence of non-linearity.
For each model we obtained estimates of Harrell’s concordance index (with 95% confidence intervals, calculated by bootstrapping) using the method described by Newson.42 Inference regarding improvement of the models incorporating MRI variables compared with the conventional risk model was undertaken by estimating the difference (and 95% confidence interval) in the concordance statistics using the lincom command in Stata. We calculated net reclassification improvement (with three risk groups corresponding to 0%-<10%, 10%-<20%, and ≥20%) and integrated discrimination improvement indices to further assess model performance.43 The net reclassification improvement index assesses correctness of reclassification (for example, up for events and down for non-events) into different prespecified risk categories. In contrast, the integrated discrimination improvement index is a continuous measure that can be interpreted as the improvement in average sensitivity minus the change in average (1−specificity). The integrated discrimination improvement index has the advantage that it is a continuous measure and therefore does not depend on arbitrary user defined risk categories. For the net reclassification improvement index and integrated discrimination improvement index values above zero indicate improved risk classification with the addition of the new variable(s). Each index was calculated with the predstat command in Stata. We also calculated Royston and Sauerbrei’s44 index of discrimination (D) and optimism corrected D (Dadjusted) using the str2d command in Stata to assess prognostic separation. 
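To make the three performance measures concrete, the sketch below computes Harrell's concordance index, the three category net reclassification improvement (cut-offs at 10% and 20%), and the integrated discrimination improvement on toy data. It is not the Stata code used in the study. It assumes a simple handling of tied event times (such pairs are skipped) and, for the two reclassification indices, a binary outcome with no censoring. All function names and numbers are hypothetical.

```python
from itertools import combinations


def harrell_c(times, events, risks):
    """Share of usable pairs in which the earlier event has the higher predicted risk."""
    concordant = usable = 0.0
    for i, j in combinations(range(len(times)), 2):
        first, second = (i, j) if times[i] < times[j] else (j, i)
        # A pair is usable only if the shorter time ended in an event (not censored).
        if times[first] == times[second] or not events[first]:
            continue
        usable += 1
        if risks[first] > risks[second]:
            concordant += 1
        elif risks[first] == risks[second]:
            concordant += 0.5
    return concordant / usable


def risk_category(p, cutoffs=(0.10, 0.20)):
    """0 for <10%, 1 for 10% to <20%, 2 for >=20%."""
    return sum(p >= c for c in cutoffs)


def split(values, outcome):
    cases = [v for v, y in zip(values, outcome) if y]
    controls = [v for v, y in zip(values, outcome) if not y]
    return cases, controls


def nri(old, new, outcome):
    """Net upward moves among cases minus net upward moves among non-cases."""
    moves = [(risk_category(n) > risk_category(o)) - (risk_category(n) < risk_category(o))
             for o, n in zip(old, new)]
    cases, controls = split(moves, outcome)
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def idi(old, new, outcome):
    """Change in mean predicted risk among cases minus the change among non-cases."""
    changes = [n - o for o, n in zip(old, new)]
    cases, controls = split(changes, outcome)
    return sum(cases) / len(cases) - sum(controls) / len(controls)


# Toy data: follow-up times, event indicators, and predicted 10 year risks.
times, events = [2, 4, 6, 8], [1, 1, 0, 0]
old_risk, new_risk = [0.05, 0.15, 0.15, 0.05], [0.15, 0.25, 0.05, 0.05]
print(harrell_c(times, events, new_risk), nri(old_risk, new_risk, events),
      idi(old_risk, new_risk, events))
```

Values of the two reclassification indices above zero correspond to the improved classification described in the text; with censored follow-up, survival based versions of these estimators would be needed.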
To assess possible clinical implications of adding the MRI variables to the conventional risk prediction model, we used the theoretical relation between the threshold probability of disease and the relative value of false positive and false negative results to ascertain the value of the various prediction models (decision curve analyses), accounting for censored observations using the stdca command in Stata.45 Preferred models are those with the highest net benefit calculated as the difference between the proportion of true positives and the proportion of false positives weighted by the relative harm of a false positive and false negative result. All analyses were repeated with Alzheimer’s disease as the outcome (sensitivity analysis).
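The net benefit calculation described above can be sketched as follows. This is a simplified illustration for a binary outcome without censoring, whereas the analysis in this study accounted for censored observations; the function names and the toy data are hypothetical.

```python
def net_benefit(predicted, outcome, threshold):
    """True positive proportion minus false positive proportion weighted by the odds of the threshold."""
    n = len(outcome)
    true_pos = sum(1 for p, y in zip(predicted, outcome) if p >= threshold and y)
    false_pos = sum(1 for p, y in zip(predicted, outcome) if p >= threshold and not y)
    weight = threshold / (1 - threshold)  # relative harm of a false positive
    return true_pos / n - (false_pos / n) * weight


def net_benefit_treat_all(outcome, threshold):
    """Reference strategy: classify everyone as high risk."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy predicted risks and observed outcomes (1 = dementia); not study data.
predicted = [0.05, 0.30, 0.25, 0.02]
outcome = [0, 1, 0, 0]
for threshold in (0.10, 0.20):
    print(threshold,
          net_benefit(predicted, outcome, threshold),
          net_benefit_treat_all(outcome, threshold))
```

Repeating the calculation over a range of threshold probabilities and plotting the results gives the decision curve; at each threshold the preferred model is the one with the highest net benefit.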
Although the sample size is large for a brain imaging study, it is relatively small for risk model testing. Therefore, instead of splitting the sample into derivation and testing datasets we ran the analyses on the entire sample. To correct for optimism bias in the C statistic value (that is, over-fitting to a specific sample), we undertook internal validation using 100 bootstrap samples. To use all the data from the 1721 participants and test whether missingness (assumed to be missing at random) influenced the results, we carried out multivariate imputation by chained equations (using the mi procedure in Stata) that included all 13 conventional predictors and the outcome variable. We created 10 imputed datasets and fitted each model separately on each. Results from the analysis of each imputed dataset were combined with Rubin’s rules (mi estimate command in Stata)46 (see appendix 3). Analyses were completed with Stata version 13 (StataCorp, College Station, TX). All probabilities were two tailed, and significance was set at P<0.05.
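As an illustration of how estimates from imputed datasets are combined, the sketch below applies Rubin's rules to a single coefficient: the pooled estimate is the mean across imputations, and the total variance is the within imputation variance plus the between imputation variance inflated by (1 + 1/m). The function name and the numbers (taken to be log hazard ratios and their variances) are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine one coefficient across m imputed datasets with Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average squared standard error
    between = variance(estimates)   # sample variance across imputations (divisor m - 1)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


# Hypothetical log hazard ratios and variances from three imputed datasets.
estimate, total_var = pool_rubin([0.50, 0.70, 0.60], [0.04, 0.04, 0.04])
print(estimate, total_var ** 0.5)  # pooled estimate and its standard error
```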
Of the 1721 participants with complete MRI data at baseline, 119 (6.9%) progressed to dementia over the 10 years of follow-up. The mean time to onset of dementia was 6.7 years (SD 2.0 years; range 1.7-10.5 years). People without dementia were followed for an average of 7.4 years (SD 2.3 years; range 0.6-10.6 years). Table 1 shows baseline demographic characteristics of the cohort stratified by dementia status at follow-up. The groups with and without dementia differed significantly with respect to age, education, physical function, cognitive function, and apolipoprotein e4 allele status but had similar distributions of sex, smoking, alcohol, history of cardiovascular disease, diabetes, and systolic blood pressure.
Simple model for dementia risk with conventional risk factors
When all 13 conventional risk variables were included in the model (M1), significant predictors of 10 year incident dementia included age, physical function, cognition (mini-mental state examination, Benton visual retention test, and digit span), and apolipoprotein e4 allele status (table 2). The discrimination performance of the model was moderate (C statistic 0.77, 95% confidence interval 0.71 to 0.82; n=1634).
Conventional risk factors and MRI derived variables
Table 3 summarises the performance indices including Harrell’s concordance index (C statistic) (and estimated bias), net reclassification improvement, integrated discrimination improvement, and Royston and Sauerbrei’s44 index of discrimination (unadjusted and adjusted) for the conventional prediction model and the extended models including MRI data: white matter lesion volume, whole brain volume, and hippocampal volume (table 2 shows the hazard ratios and their 95% confidence intervals). Across the four different models, optimism bias (or over-fitting) in the C statistic value was low (optimism ranged from 0.0188 to 0.0285). While the C statistic did slightly increase with inclusion of MRI variables, as shown in figure 1, the change was not significant (P values are shown in table 3). This indicated that discrimination performance of the simple model as measured by the C statistic was not significantly improved with the addition of any of the MRI variables, alone or in combination.
As shown in table 3, however, the integrated discrimination improvement index was positive and significantly different from zero, suggesting improvement in reclassification, when the conventional model was compared with models incorporating all three MRI variables (integrated discrimination improvement 0.043; z=2.11, P=0.04) or hippocampal volume (0.044; z=2.23, P=0.03). Furthermore, compared with the conventional model, Royston and Sauerbrei’s44 index of discrimination (D) was higher (difference >0.2) when all three MRI variables or hippocampal volume were added to the conventional model, indicating an improvement in prognostic separation.
Decision curve analysis
Figure 2 shows the net benefit curves for the conventional prediction model and the extended model that includes all three MRI variables. As shown, for increasing probability thresholds the model incorporating all three MRI variables combined had higher net benefit than the conventional model, suggesting that this model has potentially higher clinical utility. An increase in net benefit was also seen when hippocampal volume was added to the conventional risk model (similar curve to the curve with all three variables) but not when incorporating whole brain volume and white matter lesion volume (figures not shown).
Sensitivity analysis: predicting Alzheimer’s disease
With respect to subtypes of dementia, 84 (71%) people had a diagnosis of possible or probable Alzheimer’s disease, 13 (11%) had mixed dementia, and the rest were classified as “other” (including, for example, people with vascular dementia). Table 4 shows the results from the Cox proportional hazards regression analysis, and table 5 summarises the performance indices for the conventional model, with and without MRI variables, when the outcome was restricted to Alzheimer’s disease. Similar to the results for all cause dementia, as shown in figure 3, there was a slight but non-significant improvement in the discriminative performance of the conventional risk factor model when we included MRI data on hippocampal volume or all three MRI variables combined. Also, similar to the results for all cause dementia, the integrated discrimination improvement index was significantly different from zero, and Royston and Sauerbrei’s44 index of discrimination (D) also increased in the models incorporating all three MRI variables and hippocampal volume. The decision curve analyses for the outcome Alzheimer’s disease show a similar net benefit improvement when all three MRI variables were included in the conventional risk model (fig 4). A similar picture was seen when hippocampal volume was added to the conventional model but not for white matter lesion volume or whole brain volume (figures not shown).
In this large prospective population based cohort study, addition of MRI variables to a conventional risk model including sociodemographic, functional, cognitive, health, lifestyle, and genetic predictors did not significantly improve discrimination performance for all cause dementia over 10 years’ follow-up as measured by the C statistic. In contrast, we found that data from MRI might have some value in reclassification, prognostic separation, and some improvement in clinical utility. Findings were similar when we restricted the outcome to Alzheimer’s disease. The results have implications for avoiding unnecessary increase in cost by using MRI for identifying individuals at high risk of dementia in population based settings. The results also have implications for decisions regarding the statistics used to evaluate risk prediction models.
Comparison with other studies
Currently there are no recommended models for screening for individuals at high risk of dementia. Numerous predictors have been proposed including demographic, neuropsychological, health, physical function, lifestyle, and neuroimaging variables. Use of complex data, however, would be expected to increase discriminatory ability. Previous studies have not compared whether MRI data improve prediction of dementia risk in the general population relative to information gained from simple multivariable models of conventional risk factors. While our results do not contradict previous findings of an association between MRI variables (that is, hippocampal and white matter lesion volumes) and incident dementia, they suggest that adding MRI data to a model incorporating conventional risk variables (such as age, education, cognitive function (memory and global functioning), impairments in instrumental activities of daily living, health (cardiovascular disease, diabetes, and blood pressure), lifestyle (smoking and alcohol use), and apolipoprotein e4 status) does not significantly improve discrimination performance.
In contrast with comparison of the discriminatory performance of the models, however, inspection of other statistical indices that assess model performance, such as the integrated discrimination improvement index, showed improvement for the models incorporating hippocampal volume or all three MRI variables. This result replicates findings of Zahuranec and colleagues, who found that exclusion of computed tomography variables from a model developed for the prediction of 30 day mortality after stroke had only a minimal impact on discriminative performance measured with the area under the curve.47 In contrast, inclusion of computed tomography variables did result in significant improvement in the integrated discrimination improvement index. Such disparate results probably reflect differences in how each measure is calculated. The clinical relevance of this difference, in the presence of a non-significant change in the concordance index, within the framework of prediction of disease risk, however, is still unclear. Indeed, disparate results are often observed, leading to debate about the use of different metrics to assess model performance.48 49
We also found that Royston and Sauerbrei’s index of discrimination (D)44 and results from the decision curve analysis indicated improvement, in terms of prognostic separation and clinical utility, respectively, of the models incorporating hippocampal volume or all three MRI variables compared with the conventional risk model. The increase in net benefit has to be interpreted with caution because the decisional aspect of correctly identifying a person with dementia would require that there is also an effective preventative intervention or strategy available; however, a clear beneficial strategy currently does not exist. Further, the increase in net benefit could also be partly offset by potential inconveniences of MRI. When there might be requirements for increased specificity (such as because of increased risk of adverse effects in people with positive results on screening and the need to reduce false positive results), MRI variables might be appropriate. Before recommendations can be made, however, further work replicating these results in other studies is needed.
Predicting Alzheimer’s disease
Most cases of dementia (71%) were classified as probable/possible Alzheimer’s disease. A sensitivity analysis with Alzheimer’s disease as the outcome replicated the results of the models for prediction of all cause dementia: there was a slight but non-significant increase in discriminatory performance when hippocampal volume or all three MRI variables were added to the conventional risk model. This supports other findings that have questioned whether the differences are large enough to warrant recommendations for MRI in population based samples.50 In contrast, and similar to the findings for all cause dementia, addition of all three MRI variables or hippocampal volume to the conventional risk model significantly improved reclassification, measured with the integrated discrimination improvement index, and the decision curve analysis resulted in higher net benefit; as explained above, however, caution is required in the interpretation. These results suggest that MRI data could have some prognostic and clinical utility for predicting Alzheimer’s dementia. As we had only a limited number of cases with Alzheimer’s disease, however, these results should be interpreted with caution. Extension of these results to other cohorts focused on specific dementia subtypes and with postmortem confirmation of disease (and other Alzheimer’s disease biomarkers) is therefore needed.
Components of model for prediction of dementia risk
There is no consensus in the literature regarding which variables best predict risk of incident dementia. In this paper we tested only conventional variables for risk that have previously been associated with dementia. Our aim was to test whether the inclusion of MRI data improved the discriminatory performance of a model incorporating variables that have been previously linked to dementia; it was not to develop a new model for dementia risk prediction. Increased age and poor cognition are strongly related to dementia. Importantly, none of the cognitive measures included in the model had been used for the final diagnosis of dementia, which was clinician based. Of note is the broad range of cognitive indicators required including measures of general functioning, memory, and non-memory ability. The results support findings suggesting that impaired performance in a single cognitive domain is not as effective at identifying individuals at risk of dementia as multi-domain deficits.51 Other relevant factors included educational attainment, impaired functional ability, and the apolipoprotein gene. Increased education has been previously linked to promoting “cognitive reserve” and is associated with decreased risk for dementia.52 Declining functional ability, particularly in instrumental activities of daily living, is a hallmark of dementia diagnosis and is found in individuals in preclinical states, such as mild cognitive impairment.53 54 One or more of the apolipoprotein e4 alleles has been found to be a risk factor for dementia55 in both clinical and population based studies. Construction of risk prediction models with genetic variables, however, must proceed with caution and raises ethical concerns particularly around disclosure. Indeed, positive apolipoprotein e4 status does not provide certainty of a risk of dementia.
Strengths and limitations
The strengths of the study include the prospective population based design, the large number of participants with MRI data, the 10 years of follow-up, and the high level of detailed reporting of risk factors measured with standardised methods. Several limitations should be considered when interpreting our results. We aimed primarily to test whether MRI derived variables previously associated with dementia could improve prediction of dementia in a population based setting, and, as such, findings might not generalise to other samples (such as clinical). The Three-City Study comprises volunteers, most of whom are white and generally have better global health than the rest of the population; this might also limit generalisability of the results. Indeed, individuals who are able and willing to undergo MRI tend to be relatively healthy (see findings from the Rotterdam Study56). While this might explain the low occurrence of dementia (6.9%) in our study, we have no reason to believe that associations between MRI measures and dementia differ within this subgroup. These results need to be replicated, however, across settings and in different populations (such as those with poor health and greater baseline risk of incident dementia), over different timeframes (such as a shorter follow-up time between scan and diagnosis of dementia), and in different ethnic groups. Bias could also have been introduced by the exclusion of participants with missing data who had poorer cognitive status at baseline, thus reducing the overall power of the study. Bootstrapping was undertaken to correct the concordance statistic for over-fitting, and the results indicated bias to be low. Power could also have been reduced by our use of a complete case analysis. Results from the multiple imputation analysis, however, indicated consistency in findings in the reduced (complete case) and pooled (imputed) datasets.
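The bootstrap correction of the concordance statistic mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure run for this study: it treats the outcome as binary (ignoring censoring), and `fit_mean_difference` is a hypothetical stand-in for whatever model would be refitted in each bootstrap sample. The optimism is estimated as the average difference between a bootstrap model's performance in its own sample and in the original data.

```python
import random
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]
Model = Callable[[Row], float]


def c_statistic(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Share of event/non-event pairs in which the event scores higher (ties count 0.5)."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    nonevents = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    total = 0.0
    for e in events:
        for n in nonevents:
            total += 1.0 if e > n else 0.5 if e == n else 0.0
    return total / (len(events) * len(nonevents))


def fit_mean_difference(X: Sequence[Row], y: Sequence[int]) -> Model:
    """Hypothetical toy model: weight each variable by its mean difference between groups."""
    k = len(X[0])
    ev = [x for x, yi in zip(X, y) if yi == 1]
    ne = [x for x, yi in zip(X, y) if yi == 0]
    w = [sum(r[j] for r in ev) / len(ev) - sum(r[j] for r in ne) / len(ne)
         for j in range(k)]
    return lambda row: sum(wj * v for wj, v in zip(w, row))


def optimism_corrected_c(X: Sequence[Row], y: Sequence[int],
                         fit: Callable[[Sequence[Row], Sequence[int]], Model],
                         n_boot: int = 200, seed: int = 0) -> Tuple[float, float]:
    """Return (apparent C, optimism-corrected C)."""
    rng = random.Random(seed)
    model = fit(X, y)
    apparent = c_statistic([model(x) for x in X], y)
    optimism: List[float] = []
    n = len(y)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:  # skip samples lacking events or non-events
            continue
        Xb = [X[i] for i in idx]
        mb = fit(Xb, yb)
        c_boot = c_statistic([mb(x) for x in Xb], yb)  # in the bootstrap sample
        c_orig = c_statistic([mb(x) for x in X], y)    # in the original data
        optimism.append(c_boot - c_orig)
    mean_optimism = sum(optimism) / len(optimism) if optimism else 0.0
    return apparent, apparent - mean_optimism


# Made-up, perfectly separated data: no optimism is expected here.
X_toy = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_toy = [0, 0, 0, 1, 1, 1]
apparent_c, corrected_c = optimism_corrected_c(X_toy, y_toy, fit_mean_difference)
```

When the corrected value is close to the apparent value, as the text reports, over-fitting has contributed little to the estimated discrimination.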
Additional studies with larger samples are needed to determine whether small positive changes in discrimination performance and reclassification of risk would show significant results. Small significant differences in prediction, however, might not translate to improvement for the clinical setting. In addition, there is currently no consensus model for predicting risk of future dementia, and whether MRI enhances prediction in other dementia risk models requires testing. Lastly, there are discussions about the set up and utilisation of predictive models57 58; further evaluation of this, however, is outside the scope of this study.
Having a simple and accurate tool that could predict future risk of dementia would be beneficial not only to researchers (for example, to stratify samples for clinical trial recruitment) but, from a clinical perspective, might also help to provide stratified or personalised care. Indeed, while some of the variables included in the model are non-modifiable (such as age and apolipoprotein e4 status), others are modifiable (such as educational attainment, health, cognition, and functional ability). Here we show that in a population based setting, relatively simple measures can be used to predict risk with similar discrimination performance to models that incorporate MRI findings. This has important implications for the development of disease modifying or preventative strategies, as risk prediction with conventional factors, at least to some extent, seems sufficient in this setting.
In contrast, using other metrics to evaluate model performance, we found that for both all cause dementia and Alzheimer’s disease, MRI variables could have some utility in improving clinical decision making, prognostic separation, and classification. Appropriate choice of which model to use depends not only on the metric selected for evaluation but also on the ease of attaining the score and the consequences of undertaking risk prediction.
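The net benefit quantity that underlies decision curve analysis can be illustrated with a short sketch. It assumes a binary outcome and the standard definition, in which true positives are credited and false positives are penalised by the odds of the chosen threshold probability. The function names, the use of "at or above the threshold" to define a positive classification, and the toy values are assumptions made for illustration; this is not the analysis code from the study.

```python
from typing import Sequence


def net_benefit(probs: Sequence[float], outcomes: Sequence[int],
                threshold: float) -> float:
    """Net benefit of acting on everyone with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    if n == 0 or n != len(probs):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    true_pos = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes: Sequence[int], threshold: float) -> float:
    """Reference strategy: classify everyone as high risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Made-up predicted risks and outcomes, evaluated at a 20% threshold.
toy_probs = [0.9, 0.8, 0.3, 0.1]
toy_outcomes = [1, 0, 1, 0]
nb_model = net_benefit(toy_probs, toy_outcomes, 0.2)
nb_all = net_benefit_treat_all(toy_outcomes, 0.2)
```

A decision curve plots this quantity over a range of thresholds for each model and for the reference strategies of classifying everyone or no one as high risk (the latter has a net benefit of zero); a model is preferred at thresholds where its curve is highest.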
Incorporation of complex and costly variables related to MRI does not seem to significantly improve performance of a simple model that includes age, education, cognition, health, lifestyle, functional ability, and the apolipoprotein gene at the population level for all cause dementia (or Alzheimer’s disease). The results were not robust across the various metrics used for model evaluation, however, including integrated discrimination improvement, the D statistic, or decision curve analysis. The results suggest that routine MRI is not needed to predict risk of dementia in a population based setting, particularly in the first stages of screening such as differentiating between those who are and are not at risk. Whether there is a subgroup of patients for whom brain MRI improves risk prediction will have to be determined in future studies. The results have important implications for avoiding unnecessary increases in cost for use of MRI in prediction of risk for dementia, especially in settings with limited technical and financial resources. Importantly, the results have implications for how we think about defining and operationalising models to predict risk of dementia, particularly in population based settings.
What is already known on this topic
Accurate identification of individuals at high risk of dementia is important to improve diagnostic and therapeutic protocols
Prediction of risk has conventionally been based on sociodemographic, neuropsychological, health, lifestyle, physical function, and genetic variables. Novel variables also include information from MRI of the brain
The incremental contribution to prediction models of MRI variables compared with more simple prediction variables on the population level remains unclear
What this study adds
Addition of MRI variables, including white matter lesion, brain, and hippocampal volumes (or all three variables combined), to a risk model incorporating conventional risk variables did not result in significant improvement in discrimination for incident dementia (all cause or Alzheimer’s disease) over 10 years’ follow-up
More accurate risk classification (measured with the integrated discrimination improvement index) and prognostic separation (measured with the D statistic) were observed when hippocampal volume or all three MRI variables combined were added to the conventional risk model
Cite this as: BMJ 2015;350:h2863
We thank Stephen Kaptoge and Ian White for their assistance with Stata.
Contributors: BCMS had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. BCMS, CT, CD, AA, and TK were responsible for study concept and design. CT, SA, HA, CD, and AA acquired the data, which was analysed and interpreted by all authors. BCMS and TK drafted the manuscript, which was revised for important intellectual content by all authors. BCMS and TK are guarantors.
Funding: The Three-City (3C) Study is conducted under a partnership agreement between the Institut National de la Santé et de la Recherche Médicale (INSERM), the University of Bordeaux, and Sanofi-Aventis. The Fondation pour la Recherche Médicale funded the preparation and initiation of the study. The 3C-Study is also supported by the Caisse Nationale Maladie des Travailleurs Salariés, Direction Générale de la Santé, MGEN, Institut de la Longévité, Conseils Régionaux of Aquitaine and Bourgogne, Fondation de France, and Ministry of Research-INSERM Programme “Cohortes et collections de données biologiques.” The funding organisations played no role in the design and conduct of the study, in the collection, management, analysis, and interpretation of the data, or in preparation, review, or approval of the manuscript.
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: SA serves on scientific advisory boards for Eisai and Pfizer and has received funding for travel and honorariums for educational activities from Eisai, Pfizer, Janssen, Novartis, and Ipsen. HA has received payments for lectures from Novartis Pharma and GSK. CD is a consultant for Eisai.
Ethical approval: This study was approved by the ethical committee of the University Hospital of Kremlin-Bicêtre. All participants signed a legal consent form and were free to refuse any specific part of the examination (such as blood sampling or MRI scan); such partial refusals did not constitute an exclusion criterion.
Data sharing: No additional data available.
Transparency: BS and TK affirm that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned have been explained.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.