CCBY Open access

Development and external validation of a risk prediction model for falls in patients with an indication for antihypertensive treatment: retrospective cohort study

BMJ 2022; 379 doi: (Published 08 November 2022) Cite this as: BMJ 2022;379:e070918
  1. Lucinda Archer, lecturer1,
  2. Constantinos Koshiaris, statistician2,
  3. Sarah Lay-Flurrie, senior statistician2,
  4. Kym I E Snell, senior lecturer1,
  5. Richard D Riley, professor of biostatistics1,
  6. Richard Stevens, associate professor2,
  7. Amitava Banerjee, professor of clinical data science3,
  8. Juliet A Usher-Smith, assistant professor4,
  9. Andrew Clegg, professor of geriatric medicine5,
  10. Rupert A Payne, senior lecturer6,
  11. F D Richard Hobbs, Nuffield professor of primary care2,
  12. Richard J McManus, professor of primary care research2,
  13. James P Sheppard, associate professor2
  14. on behalf of the STRAtifying Treatments In the multi-morbid Frail elderlY (STRATIFY) investigators
    1. 1Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
    2. 2Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, OX2 6GG, UK
    3. 3Institute of Health Informatics, University College London, London, UK
    4. 4Primary Care Unit, Department of Public Health and Primary Care, University of Cambridge, UK
    5. 5Academic Unit for Ageing and Stroke Research, Bradford Institute for Health Research, University of Leeds, UK
    6. 6Centre for Academic Primary Care, Population Health Sciences, University of Bristol, Bristol, UK
    1. Correspondence to: J P Sheppard james.sheppard{at}
    • Accepted 21 September 2022

    Abstract

    Objective To develop and externally validate the STRAtifying Treatments In the multi-morbid Frail elderlY (STRATIFY)-Falls clinical prediction model to identify the risk of hospital admission or death from a fall in patients with an indication for antihypertensive treatment.

    Design Retrospective cohort study.

    Setting Primary care data from electronic health records contained within the UK Clinical Practice Research Datalink (CPRD).

    Participants Patients aged 40 years or older with at least one blood pressure measurement between 130 mm Hg and 179 mm Hg.

    Main outcome measure First serious fall, defined as hospital admission or death with a primary diagnosis of a fall within 10 years of the index date (12 months after cohort entry). Model development was conducted using a Fine-Gray approach in data from CPRD GOLD, accounting for the competing risk of death from other causes, with subsequent recalibration at five and 10 years using pseudo values. External validation was conducted using data from CPRD Aurum, with performance assessed through calibration curves and the observed to expected ratio, C statistic, and D statistic, pooled across general practices, and clinical utility using decision curve analysis at thresholds around 10%.

    Results Analysis included 1 772 600 patients (experiencing 62 691 serious falls) from CPRD GOLD used in model development, and 3 805 366 (experiencing 206 956 serious falls) from CPRD Aurum in the external validation. The final model consisted of 24 predictors, including age, sex, ethnicity, alcohol consumption, living in an area of high social deprivation, a history of falls, multiple sclerosis, and prescriptions of antihypertensives, antidepressants, hypnotics, and anxiolytics. Upon external validation, the recalibrated model showed good discrimination, with pooled C statistics of 0.843 (95% confidence interval 0.841 to 0.844) and 0.833 (0.831 to 0.835) at five and 10 years, respectively. Original model calibration was poor on visual inspection and although this was improved with recalibration, under-prediction of risk remained (observed to expected ratio at 10 years 1.839, 95% confidence interval 1.811 to 1.865). Nevertheless, decision curve analysis suggests potential clinical utility, with net benefit larger than other strategies.

    Conclusions This prediction model uses commonly recorded clinical characteristics and distinguishes well between patients at high and low risk of falls in the next 1-10 years. Although miscalibration was evident on external validation, the model still had potential clinical utility around risk thresholds of 10% and so could be useful in routine clinical practice to help identify those at high risk of falls who might benefit from closer monitoring or early intervention to prevent future falls. Further studies are needed to explore the appropriate thresholds that maximise the model’s clinical utility and cost effectiveness.

    Introduction

    The proportion of older adults in the population is rising,1 and with age the risk of falls increases,23 which can result in serious injury and long term disability.4 In England, falls are associated with about 235 000 emergency hospital admissions in the over 65s and cost the National Health Service more than £2.3bn ($2.6bn; €2.6bn) every year.567

    Many risk factors for falls exist, primarily related to comorbidities and frailty.238910 A key modifiable risk factor is prescribed drugs, including those that lower blood pressure.111213 Although antihypertensives are effective at reducing the risk of cardiovascular disease, typically many patients require treatment over several years to prevent a small number of events.14 Data from randomised controlled trials show that antihypertensives are associated with an increased risk of hypotension and syncope, which may lead to falls.15 Observational studies examining patients with frailty and multimorbidity suggest a direct association between antihypertensive treatment and falls.111617

    In patients who are prescribed antihypertensives or other drugs that substantially increase their risk of falls, doctors might want to consider altering or withdrawing treatment (ie, deprescribing),18 along with other interventions to reduce the risk of falls (eg, advice on lower alcohol consumption, falls prevention clinics, exercises).7 Identifying people at high risk of falls is, however, challenging. A 2021 systematic review of falls prediction models for use in the community identified a total of 72 models.10 Most of these studies were deemed at high risk of bias, and only three of the models were externally validated. These three validated models showed moderate discriminative ability, with an area under the curve of between 0.62 and 0.69. Calibration based on internal validation was only reported in seven of the studies, and it was typically moderate to poor.10 A further primary analysis aiming to predict falls in a general practice population showed good apparent discrimination for the model used (with an area under the curve of 0.87), but calibration performance was not assessed and no external validation was performed.19

    To inform clinical decision making in primary care, both patients and doctors require better prediction models to accurately identify those at high risk of serious falls (defined as any fall resulting in hospital admission or death), from the population of older adults who might be considered for antihypertensive treatment. This population includes patients with a recent high blood pressure reading, including those with a new diagnosis of hypertension, as well as those in whom intensification of treatment is being considered. We used routinely collected data from electronic health records to develop and externally validate a clinical prediction model to estimate such individuals’ risk of experiencing a fall resulting in hospital admission or death within 10 years. This study is part of a broader research programme investigating the association between blood pressure lowering drugs and side effects: STRAtifying Treatments In the multi-morbid Frail elderlY (STRATIFY): Antihypertensives.

    Methods

    A retrospective observational cohort study was used to develop a prediction model for serious falls (the STRATIFY-Falls model), using data from Clinical Practice Research Datalink (CPRD) GOLD, which contains information from general practices using Vision electronic health record software (Cegedim Healthcare Solutions, London, UK). The model was externally validated using a second retrospective observational cohort comprising data from CPRD Aurum, containing data from general practices using recording software from Egton Medical Information Systems (EMIS, Leeds, UK). These data were linked to Office for National Statistics mortality data, Hospital Episode Statistics, and index of multiple deprivation data. The CPRD independent scientific advisory committee approved the protocol for this study (protocol No 19_042, see Appendix 6 in the supplementary material).


    Patients were eligible if they were registered at a linked general practice in England, contributing to CPRD between 1 January 1998 and 31 December 2018. At the time of analysis, CPRD GOLD (development cohort) contained 4.4 million active patients from 674 general practices, whereas CPRD Aurum (validation cohort) contained seven million active patients from 738 practices. Both datasets have previously been shown to be representative of the patient population in England for age, ethnicity, and deprivation status.2021 To avoid duplication of patients, when practices had switched from one recording system to the other during the study timeframe, we excluded practices from CPRD Aurum (validation cohort) that were also present in the CPRD GOLD (development) dataset.

    Patients were considered eligible if they were aged 40 years or older (no upper age limit applied), registered to a CPRD “up-to-standard” practice (CPRD GOLD only), and had records available during the study period. Patients entered the cohorts at the time at which they became potentially eligible for antihypertensive treatment (ie, at the time of their first systolic blood pressure reading ≥130 mm Hg) after the study start date, and they were followed for up to 10 years. This blood pressure threshold was chosen to account for varying treatment initiation thresholds specified in different international hypertension guidelines.6 Patients with any systolic blood pressure reading >180 mm Hg were excluded from the cohort, as antihypertensive treatment would be indicated for these patients regardless of the risk of adverse events, unless clearly contraindicated for other reasons. All patient characteristics and model predictors were determined at the index date, defined as 12 months after cohort entry. The same eligibility criteria and characteristic determination methods were applied to both the development cohort and the validation cohort.


    The primary outcome was any hospital admission or death associated with a primary diagnosis of a fall within 10 years of the index date, the same time horizon as used for cardiovascular prediction models.22 Falls were based on ICD-10 (international classification of diseases, 10th revision) codes documented in Hospital Episodes Statistics and ONS mortality data (applicable ICD-10 codes shown in supplementary table S4.1). Prespecified secondary outcomes were falls (defined in the same way) within one and five years of the index date. This outcome definition was consistent across both the development cohort and the validation cohort.

    Model predictors

    We identified clinically relevant predictors of falls from the literature and through expert clinical opinion.278923 These included 30 predictors (44 predictor variables), covering patient demographics (age, sex, ethnicity, area based socioeconomic deprivation (index of multiple deprivation), body mass index (BMI), systolic and diastolic blood pressure), clinical characteristics (total cholesterol level, smoking status, alcohol intake), comorbidities (previous falls, memory problems, mobility issues, history of stroke, multiple sclerosis, activity limitation, syncope, cataract), and prescribed drugs (antihypertensives, opioids, hypnotics or benzodiazepines, antidepressants, anticholinergics) (see table S4.2 in the supplementary material). A recent literature review of falls clinical prediction tools by the National Institute for Health and Care Excellence identified the need for frailty to be considered as a predictor in models for use in the community.24 We therefore also calculated a validated electronic frailty index using the 36 comorbidities and conditions specified, including this index as a single covariate.25 Covariates were defined by any occurrence of relevant Read or SNOMED codes at any time point before the index date, with the exception of antihypertensives, which were defined as any prescription in the 12 months before the index date.

    To ensure consistency with commonly used risk calculators,2627 our prediction models do not account for changes in prescriptions of drug type or amount over time, and as such give an estimation of falls risk assuming treatment assignment policy in any application setting is similar to that in the development data.28

    Sample size

    The prespecified sample size calculation for model development was 2194 participants (15 358 person years), assuming a maximum of 40 predictors would be included in the final model (see extended methods in the supplementary material).29 For the external validation, the estimated sample size required was 12 000 patients (with at least 708 experiencing falls), sufficient to target a 95% confidence interval of width 0.2 around the estimate of the calibration slope (see extended methods in the supplementary material).30 The actual sample sizes in both the development cohort and the validation cohort far exceeded these estimates.

    Statistical analysis

    We calculated descriptive statistics for baseline characteristics in the model development and external validation cohorts separately.

    Missing data

    Multiple imputation with chained equations was used to impute missing data in both the development cohort and the validation cohort, with 10 imputations generated for the development and validation datasets. Two separate and independent imputation procedures were used, one for model development and one for model validation. The imputation models included all model covariates within each dataset, along with the Nelson-Aalen estimator for the cumulative baseline cause specific hazards for falls and for the competing event of death, and binary event indicators for each of these possible event types.3132 When information was missing on the diagnosis of comorbidities or prescribed drugs, it was assumed that no diagnosis or prescription was present. Predictor variables requiring imputation were cholesterol, ethnicity, deprivation score (validation cohort only), smoking status, and alcohol consumption.

    Imputations were assessed for consistency by comparing density plots, histograms, and summary statistics across imputations and back to the complete values. The model coefficients and predictive performance measures were then estimated in each imputed dataset separately, before being combined across imputations using Rubin’s rules.33
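
    As a minimal sketch of the pooling step described above, Rubin's rules for a single parameter can be written as follows. The function name `pool_rubin` and the example numbers are illustrative assumptions and are not taken from the study code.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    pooled = mean(estimates)               # average of the point estimates
    within = mean(variances)               # average within-imputation variance
    between = variance(estimates)          # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical log subdistribution hazard ratios from three imputations
estimate, se = pool_rubin([0.20, 0.22, 0.24], [0.0004, 0.0004, 0.0004])
print(round(estimate, 3), round(se, 4))
```

    The pooled standard error is larger than any single within-imputation standard error whenever the estimates differ between imputations, which is how the uncertainty due to missing data is carried forward.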

    Model development

    Researchers at the University of Oxford (CK, JPS) conducted the model development and apparent validation. Multivariable prediction models were fitted in each imputed dataset using a Fine-Gray subdistribution hazard model, taking into account the competing risk of death by other causes.34 The aim of accounting for the competing risk in this way was to avoid overestimation of the predicted probabilities of falls as defined in the Fine-Gray paper.3435 Predictor effects in the model are reported as subdistribution hazard ratios with 95% confidence intervals, and the post-estimation baseline cumulative incidence for falls was estimated using a Breslow type estimator.34 Analyses were undertaken using the fastcmprsk package in RStudio.36 Automated variable selection methods were not used because the variables were all predetermined based on the literature and expert opinion, and because the large sample size would result in nearly all predictors having a statistically significant association with the outcome, regardless of effect size. To ensure a parsimonious model, we excluded variables with little or no association in multivariable analysis before fitting the final model.
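
    For orientation, the generic form of a Fine-Gray model can be written as below. This is the textbook formulation rather than the study's fitted equations (those are given in figure 1), and the symbols x, β, λ0, Λ0, and F0 are notation assumed here for illustration.

```latex
% Subdistribution hazard for a fall, given predictor values x
\lambda(t \mid x) = \lambda_0(t)\,\exp(x^\top \beta)

% Implied cumulative incidence of a fall by time t,
% with \Lambda_0(t) = \int_0^t \lambda_0(u)\,du and F_0(t) = 1 - \exp\{-\Lambda_0(t)\}
F(t \mid x) = 1 - \exp\{-\Lambda_0(t)\,\exp(x^\top \beta)\}
            = 1 - \{1 - F_0(t)\}^{\exp(x^\top \beta)}
```

    In this notation each subdistribution hazard ratio is exp(β) for the corresponding predictor, and F0(t) is the baseline cumulative incidence.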

    Fractional polynomial terms were examined to identify the best fitting functional form of all continuous variables.37 Fractional polynomials were identified separately within each imputed dataset, and we selected the most consistent transformation across the imputations, choosing lower order fractional polynomial terms whenever possible for the sake of parsimony. We then forced the selected fractional polynomial format for each continuous variable into the model for all imputations to ensure consistency in coefficient estimation.
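
    The sketch below shows what fractional polynomial terms look like in practice, assuming the conventional power set of Royston and Altman, in which power 0 denotes the natural logarithm. The function names are hypothetical, and the powers actually selected for each predictor are those reported in figure 1, not the ones used in the example call.

```python
from math import log

# Conventional candidate powers for fractional polynomials; 0 means ln(x)
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp1_transform(x, power):
    """First order fractional polynomial term for a strictly positive value x."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    return log(x) if power == 0 else x ** power


def fp2_terms(x, p1, p2):
    """Second order fractional polynomial terms for powers (p1, p2).

    When the two powers are equal, the second term is the first
    term multiplied by ln(x).
    """
    first = fp1_transform(x, p1)
    if p1 == p2:
        return first, first * log(x)
    return first, fp1_transform(x, p2)


# Hypothetical example: age of 70 years with candidate powers (0, 0.5)
print(fp2_terms(70.0, 0, 0.5))
```

    Lower order terms (a single power) are the more parsimonious choice, which matches the preference stated above.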

    Interactions between age, sex, and antihypertensive treatments were considered but excluded from the model development owing to problems with stability or convergence, or for the sake of parsimony.

    We examined the Schoenfeld residuals to check the proportional hazards assumption for each predictor.38

    Apparent validation using development data

    Observed outcome probabilities were defined using pseudo values: jack-knife estimators representing an individual’s contribution to the cumulative incidence function for falls, accounting for competing risk, calculated by the Aalen–Johansen method. Pseudo values were generated separately in 50 groups by linear predictor value, for stability, and to account for the competing risk of death and non-informative right censoring.3940
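
    The jack-knife pseudo value for one individual can be sketched with the standard leave-one-out formula below. The notation is assumed for illustration: n is the number of individuals in a group, F̂(t) is the Aalen–Johansen estimate of the cumulative incidence of falls at time t in that group, and F̂^(−i)(t) is the same estimate with individual i removed.

```latex
\hat{\theta}_i(t) = n\,\hat{F}(t) - (n - 1)\,\hat{F}^{(-i)}(t)
```

    Each pseudo value stands in for the individual's unobserved event status at time t, so it can be compared with predicted risks even when follow-up is censored.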

    The model’s apparent calibration performance was assessed using calibration plots comparing the observed to predicted risks at one, five, and 10 years. The calibration plots were produced using observed pseudo values and included a smooth (non-linear) calibration curve to show apparent calibration across the spectrum of predicted risks,41 with 95% confidence intervals. Plots were generated in each imputed dataset separately and were checked for consistency across imputations. A single, representative example is reported.

    When plots showed miscalibration, we recalibrated the original Fine-Gray model separately at each time point by fitting a generalised linear equation with a logit link function directly to the observed pseudo values in the development dataset. The linear predictor from the original model was the only variable included in the recalibration model, which allowed for a non-linear recalibration effect using fractional polynomials.
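
    In generic terms, the recalibration model described above has the form sketched below. The symbols are assumptions made for illustration and are not the notation of figure 1: π_i(t) is the expected pseudo value for individual i at time t, LP_i is the linear predictor from the original model, and f_j are the selected fractional polynomial transformations.

```latex
\operatorname{logit}\{\pi_i(t)\} = a_t + \sum_{j} b_{t,j}\, f_j(\mathrm{LP}_i),
\qquad t \in \{5, 10\}\ \text{years}
```

    Because LP_i is the only covariate, the transformation is monotone in many cases and largely preserves the ranking of individuals while changing the predicted probabilities.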

    External validation

    Researchers at Keele University (LA, KIES, RDR) conducted the external validation of the prediction model, independent of the model development team. The prediction model algorithms presented in figure 1 (both the original and the final) were applied to each individual in the external validation cohort to give the predicted probabilities of experiencing a fall within one, five, and 10 years, taking account of the competing risk of death by other causes.42 Model calibration was assessed through comparison of predicted probabilities to observed pseudo values, estimated using jack-knife estimators representing an individual’s contribution to the cumulative incidence function for falls, accounting for competing risks, calculated by the Aalen–Johansen method in the external validation cohort.

    Fig 1

    Final model equations for predicting risk of falls at one, five, and 10 years in patients with an indication for antihypertensive treatment. Age is measured in years. Ln=natural logarithm; IMD2-IMD5=indices of multiple deprivation; TC=total cholesterol; FI=electronic frailty index. The full algorithm code (including the α, β, γ, and CIF values) is freely available for research use and can be downloaded at

    Predictive performance was quantified by calculating the observed to expected ratio, Harrell’s C statistic, Royston’s D statistic with its associated R2 statistic,43 each applied to the same pseudo values as above, and by using calibration plots and curves. Calibration plots were generated separately in each imputed dataset and checked for consistency (one illustrative example is shown for each model). All measures were calculated in each imputed dataset separately and, when appropriate, combined across imputations using Rubin’s rules. When Rubin’s rules did not apply (eg, when the posterior distribution was not expected to be normal), performance was summarised across imputations using the median and interquartile range.44

    Heterogeneity in model performance across different general practices was assessed using a random effects meta-analysis, using restricted maximum likelihood estimation, given that the case mix and incidence of falls were expected to vary between practices (see extended methods in the supplementary material).45 The observed to expected ratio was pooled across practices on the natural log scale, the C statistic on the logit scale (with the standard errors of logit C calculated using the delta method), and the D statistic on its original scale.4647 Pooled estimates are reported with prediction intervals to give an indication of expected model performance in a new general practice.
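
    The transformations used before pooling can be sketched as follows. This is a simplified illustration that assumes a standard error is already available on the original scale and applies the delta method to both measures; the function names and example values are hypothetical, and the random effects meta-analysis itself is not shown.

```python
from math import exp, log


def logit(p):
    """Log odds of a proportion p, with 0 < p < 1."""
    return log(p / (1 - p))


def inv_logit(x):
    """Inverse of the logit transformation."""
    return 1 / (1 + exp(-x))


def c_to_logit_scale(c, se_c):
    """C statistic and its standard error on the logit scale.

    Delta method: the derivative of logit(c) is 1 / (c * (1 - c)).
    """
    return logit(c), se_c / (c * (1 - c))


def oe_to_log_scale(oe, se_oe):
    """Observed to expected ratio and its standard error on the natural log scale.

    Delta method: the derivative of ln(oe) is 1 / oe.
    """
    return log(oe), se_oe / oe


def back_transform_ci(estimate, se, inverse, z=1.96):
    """Point estimate and 95% interval, back-transformed to the original scale."""
    return inverse(estimate), inverse(estimate - z * se), inverse(estimate + z * se)


# Hypothetical practice-level C statistic of 0.80 with standard error 0.016
logit_c, se_logit_c = c_to_logit_scale(0.80, 0.016)
print(back_transform_ci(logit_c, se_logit_c, inv_logit))
```

    Working on these scales keeps back-transformed intervals inside the valid range: between 0 and 1 for the C statistic, and above 0 for the observed to expected ratio.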

    Clinical utility was assessed by plotting the one year, five year, and 10 year risk of falls against the 10 year risk of cardiovascular disease, calculated using the Qrisk2 algorithm.22 Clinical utility was also examined using net benefit analysis, where the harms and benefits of using a model to guide treatment decisions were offset to assess the overall consequences of using the STRATIFY-Falls prediction models for clinical decision making.48 The original and final models were compared with one another at five and 10 years and with model blind methods of introducing falls prevention measures (which may include deprescribing) for all patients, or not introducing falls prevention measures (starting or continuing treatment) for all patients, regardless of falls risk. We assessed net benefit across the full range of possible threshold probabilities, with a falls risk above 10% at 10 years specified a priori as being a threshold of clinical interest, to align with current thresholds for an individual’s risk of cardiovascular disease.49
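
    The net benefit calculation behind a decision curve can be sketched as below. This simplified version assumes a fully observed binary outcome, whereas the study had to account for censoring and competing risks; the function names and example data are hypothetical. Here "treat" means introducing falls prevention measures.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold.

    risks: predicted probabilities of a fall
    outcomes: 1 if a fall occurred, 0 otherwise
    threshold: risk threshold, strictly between 0 and 1
    """
    n = len(risks)
    true_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of the model blind strategy of treating every patient."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# The model blind strategy of treating no one has a net benefit of 0 by definition.

# Hypothetical predictions and outcomes for four patients, at a 10% threshold
risks = [0.05, 0.20, 0.15, 0.02]
outcomes = [0, 1, 0, 0]
print(net_benefit(risks, outcomes, 0.10), net_benefit_treat_all(outcomes, 0.10))
```

    A model shows potential clinical utility at a given threshold when its net benefit exceeds both the treat all value and zero; repeating the calculation across thresholds traces out the decision curve.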

    The same external validation methods as described earlier were employed in subgroups by age (<65 years, ≥65 years), sex (women, men), and ethnicity (white, black, South Asian, other), to assess the models’ predictive performance in these clinically relevant groups.

    Patient and public involvement

    This study was developed and conducted with the help of our patient and public advisor Margaret Ogden. As a member of our study advisory group, they commented on the study protocol and have been present in all team meetings discussing results and reporting. We also held a focus group with several older adults during the study to discuss broader themes related to drugs for cardiovascular disease prevention and adverse events, which informed the interpretation of this work.

    Results

    Study population characteristics

    Figure 2 shows the flow of study participants for both the development cohort and the validation cohort. A total of 1 772 600 patients were included in the model development cohort (CPRD GOLD), with a mean age of 59 years (standard deviation (SD) 13 years) and a mean systolic blood pressure of 144 mm Hg (SD 12 mm Hg) at study inclusion (table 1). The 10 year prevalence of falls was 3.5% (n=62 691), with 10.3% of patients (n=181 731) experiencing death by other causes before any fall occurred, and a median follow-up of 6.2 years (interquartile range (IQR) 2.6-10 years) across the cohort.

    Fig 2

    Flow of participants through study. CPRD=Clinical Practice Research Datalink

    Table 1

    Descriptive statistics for model development and validation cohorts, in full cohorts and stratified by outcome type at 10 years. Values are numbers (percentages) unless stated otherwise

    In total, 3 805 366 patients were included in the validation cohort, with 206 956 (5.4%) experiencing fall events during 10 year follow-up. A further 334 552 (8.8%) patients died during follow-up from unrelated causes, before any fall occurred. Median follow-up time in the validation cohort was 6.7 years (IQR 2.7-10 years). Total cholesterol level was missing in 48% of participants, and ethnicity data were more complete in the validation cohort than development cohort (81% v 44% complete data).

    Model development

    The original model consisted of 24 predictors, after the exclusion of variables with little or no association in multivariable analysis (table 2). Compared with men, women were more likely to experience a fall during follow-up (subdistribution hazard ratio 1.25, 95% confidence interval 1.23 to 1.27). Increasing age, white ethnicity, and being a smoker, a heavy drinker, or more deprived were predictors associated with an increased risk of falls (table 2). Increasing frailty was one of the strongest predictors of falls, with an increased falls risk of 22% for about every four deficits accrued (1.22, 1.20 to 1.23). Of the previous medical conditions examined, the strongest predictors of falls were having a history of falls (1.32, 1.29 to 1.35) and multiple sclerosis (1.71, 1.51 to 1.94). Drugs most strongly associated with falls were angiotensin 2 receptor blockers (1.19, 1.15 to 1.23), antidepressants (1.16, 1.13 to 1.18), hypnotics and anxiolytics (1.15, 1.13 to 1.18), angiotensin converting enzyme inhibitors (1.12, 1.10 to 1.14), and opioids (1.11, 1.08 to 1.13). To ensure a parsimonious final model, systolic and diastolic blood pressure, BMI, activity limitation, syncope, and cataract were excluded from the model owing to a lack of association with falls risk. No violations of the proportional hazards assumption were detected.

    Table 2

    Prediction model for falls. Values are subdistribution hazard ratios and 95% confidence intervals

    Internal validation and recalibration using pseudo values

    At five and 10 years, apparent calibration plots in the model development data showed significant miscalibration, with under-prediction for patients with a low predicted risk and substantial over-prediction for those with a high predicted risk (see supplementary figure S3.1). We therefore recalibrated the original model to the observed pseudo values, which considerably improved apparent calibration in the model development data (fig 4 and fig 5). Apparent calibration of the original model at one year was good, so recalibration was not required (see fig 3).

    External validation

    Predictive performance

    Upon external validation, the original model showed excellent discrimination (table 3) but poor calibration (see supplementary figure S3.1), with considerable heterogeneity across general practices (see supplementary figure S3.2). Recalibration of the model corrected miscalibration in the model development cohort, but under-prediction of risk was still present in the validation cohort (fig 3, fig 4, and fig 5). This miscalibration was less extreme than that of the original model, in the narrower range of predicted probabilities between 0 and 0.2. On average, the recalibrated model showed a pooled observed to expected ratio at 10 years of 1.839 (95% confidence interval 1.811 to 1.865, 95% prediction interval 1.284 to 2.638), suggesting that the observed incidence of falls would be around 84% (relatively) higher than expected when using the model to generate predictions. Under-prediction of 10 year falls risk was consistent across all subgroups, with the exception of the “other ethnicity” group, where both the falls incidence and the observed to expected ratio were considerably lower than in the full population (see extended results in supplementary material section 2.2).

    Table 3

    Predictive performance statistics of the falls prediction models on external validation in Clinical Practice Research Datalink Aurum

    Fig 3

    Calibration curves for apparent performance of the final STRATIFY-Falls model in CPRD GOLD at one year, and calibration on external validation in CPRD Aurum at one year. Groups represent 10ths of linear predictor, as created between deciles. Histogram shows distribution of predicted probabilities. The model is not recalibrated to pseudo values in the development data. CPRD=Clinical Practice Research Datalink; STRATIFY=STRAtifying Treatments In the multi-morbid Frail elderlY

    Fig 4

    Calibration curves for apparent performance of the final STRATIFY-Falls model in CPRD GOLD at five years, and calibration on external validation in CPRD Aurum at five years. Groups represent 10ths of linear predictor, as created between deciles. Histogram shows distribution of predicted probabilities. CPRD=Clinical Practice Research Datalink; STRATIFY=STRAtifying Treatments In the multi-morbid Frail elderlY

    Fig 5

    Calibration curves for apparent performance of the final STRATIFY-Falls model in CPRD GOLD at 10 years, and calibration on external validation in CPRD Aurum at 10 years. Groups represent 10ths of linear predictor, as created between deciles. Histogram shows distribution of predicted probabilities. CPRD=Clinical Practice Research Datalink; STRATIFY=STRAtifying Treatments In the multi-morbid Frail elderlY

    The ordering of participants’ predicted probabilities altered only slightly on recalibration; thus discriminative ability of the recalibrated models remained excellent at each of the analysis time points, with C statistics of 0.843 (95% confidence interval 0.841 to 0.844, 95% prediction interval 0.789 to 0.881) at five years, and 0.833 (0.831 to 0.835, 95% prediction interval 0.789 to 0.870) at 10 years, and D statistic values of 1.894 (1.746 to 2.042, 95% prediction interval 1.75 to 2.04) at five years, and 1.597 (1.472 to 1.721, 95% prediction interval 1.47 to 1.72) at 10 years (table 3). Model performance varied more among smaller practices, with more consistent performance seen as practice size increased (fig 6).

    Fig 6

    Performance variability of the final STRATIFY-Falls model on external validation across general practices, with observed to expected ratio, R2 statistic, D statistic, and C statistic. STRATIFY=STRAtifying Treatments In the multi-morbid Frail elderlY

    The model’s discriminative ability at 10 years was consistent across age and sex subgroups (see supplementary tables S2.1 and S2.2). The pooled C statistic was lowest in those of white ethnicity (0.796, 95% confidence interval 0.793 to 0.798) and highest among those of other ethnicity (0.834, 0.830 to 0.839) (see supplementary table S2.3).

    Clinical utility analysis

    Net benefit and decision curve analysis of the original and recalibrated models indicated potential clinical utility at five and 10 years around the predefined threshold of 10% (fig 7). At 10 years, basing clinical management decisions on predicted probabilities of falls yielded a benefit over the two strategies of introducing falls prevention measures (which may include deprescribing) for all and not introducing falls prevention measures (starting or continuing treatment) for all patients, when using a treatment decision threshold of 7% or higher from the original model, or a treatment decision threshold of 6% or higher from the final recalibrated model. Thus, for either model, when using our prespecified treatment decision cut-off of 10% risk of falls at 10 years, we would expect a benefit to patients over and above model blind treatment strategies (usual care). This treatment decision threshold of 10% showed a net benefit in all subgroups except other ethnicity, where a cut-off of at most 3% was required for the model to be superior to usual care for all (see supplementary figure S2.6). In the analysis at five years, using a treatment decision threshold of 3% risk or higher gave a net benefit above starting or continuing treatment for all, for both models.

    Fig 7

    Decision curve analysis showing net benefit of using prediction models across different threshold probabilities for assigning treatment
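
    The net benefit plotted in a decision curve can be sketched as follows for a binary outcome at a single threshold. This is a minimal illustration under stated assumptions, not the analysis code used in the study (which handled time to event data): the function names and toy data are hypothetical, "treat" here stands for introducing falls prevention measures, and the strategy of introducing no such measures for anyone has a net benefit of zero by definition.

```python
def net_benefit_model(outcomes, predicted, threshold):
    """Net benefit of acting only on patients with predicted risk >= threshold."""
    n = len(outcomes)
    true_pos = sum(1 for y, p in zip(outcomes, predicted) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(outcomes, predicted) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return true_pos / n - (false_pos / n) * weight


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of acting on every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy cohort of 10 patients, evaluated at a 10% threshold
outcomes = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [0.30, 0.15, 0.12, 0.08, 0.05, 0.04, 0.03, 0.02, 0.02, 0.01]

nb_model = net_benefit_model(outcomes, predicted, threshold=0.10)
nb_all = net_benefit_treat_all(outcomes, threshold=0.10)
nb_none = 0.0
```

    A model is said to have potential clinical utility at a given threshold when its net benefit exceeds that of both default strategies, as it does in this toy example.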

    In analyses comparing the risk of falls with the risk of cardiovascular disease in CPRD GOLD, 1725 (0.1%) patients had a high risk of falls (>10%) but low risk of cardiovascular disease (<10%) at 10 years (fig 8). A further 324 884 (18.3%) patients were classified as high risk of both, and 607 228 (34.2%) had a low falls risk but high risk of cardiovascular disease.

    Fig 8

    Comparison of 10 year cardiovascular disease risk (Qrisk2) and fall risk in Clinical Practice Research Datalink GOLD dataset. High risk for both conditions was defined as a risk >10%. CVD=cardiovascular disease
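
    The cross classification behind fig 8 can be sketched as below. This is an illustrative example only: the patient values are hypothetical, and treating a risk of exactly 10% as low is an assumption, since high risk is defined in the text as >10%.

```python
from collections import Counter


def classify(falls_risk, cvd_risk, threshold=0.10):
    """Label a patient by whether each 10 year risk exceeds the threshold."""
    falls = "high falls" if falls_risk > threshold else "low falls"
    cvd = "high CVD" if cvd_risk > threshold else "low CVD"
    return f"{falls} / {cvd}"


# Hypothetical (falls risk, cardiovascular risk) pairs, as proportions
patients = [(0.12, 0.05), (0.15, 0.25), (0.03, 0.22), (0.02, 0.04), (0.04, 0.18)]

counts = Counter(classify(f, c) for f, c in patients)
percentages = {group: 100 * n / len(patients) for group, n in counts.items()}
```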


    Principal findings

    We developed and externally validated a clinical prediction model to determine an individual’s risk of experiencing a fall resulting in hospital admission or death within 10 years of being indicated for antihypertensive treatment (owing to raised blood pressure readings). The model incorporates routinely recorded information, including a history of previous falls, multiple sclerosis, heavy alcohol consumption, high deprivation score, and prescribed drugs, which were all strong predictors of subsequent falls, conditional on the other model variables.

    The final recalibrated model showed good discrimination upon external validation, suggesting that it can help distinguish those at a higher risk of falling, which may improve how doctors identify patients who might benefit from targeted fall prevention strategies, including multifactorial or exercise based interventions,50 and drug reviews including deprescribing. Calibration performance of the prediction model was inconsistent across the development and validation datasets, with miscalibration leading to under-prediction of fall risk across the full range of predicted probabilities. Nevertheless, such under-prediction of risk may be deemed acceptable if the model is intended to inform whether treatment should be stopped to avoid adverse effects—particularly if the treatment in question also carries benefits. Indeed, the clinical utility analysis showed that at risk thresholds around 10%, the net benefit of the model is higher than for other strategies currently employed in usual care.

    Strengths and limitations of this study

    Strengths of this work include the large, population based cohorts used, incorporating routinely collected patient data that have been shown to be representative of patients across England, suggesting that the findings could be generalised across this (or a similar) population.20 21 Analyses accounted for the competing risk of death in both model development and external validation, ensuring that falls risk was not over-estimated. This is particularly important in individuals with frailty and multiple long term conditions, where an over-estimation of falls risk might preclude prescription of antihypertensive drugs in those who could still derive benefit from continued treatment. This approach improves on most prediction models in widespread use, which do not take competing risks into account.22 In those models, the stated risk of an event (cardiovascular disease, for example) is by design too high, because the actual risk of the event would be diminished by death from other (eg, non-cardiovascular) causes, particularly in older people.35
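
    The effect of ignoring competing risks can be shown with a small non-parametric sketch. It compares a cumulative incidence estimate that accounts for death as a competing event with the naive approach of treating deaths as censored. This is an illustration of the general principle only, not the regression modelling used in the study; the function names, cause codes, and toy data are hypothetical.

```python
from itertools import groupby


def _grouped(times, causes):
    """Yield (time, list of cause codes) for each distinct event or censoring time."""
    for t, grp in groupby(sorted(zip(times, causes)), key=lambda r: r[0]):
        yield t, [c for _, c in grp]


def cumulative_incidence(times, causes, cause=1):
    """Cumulative incidence of `cause` at end of follow-up, allowing for competing events.
    Cause codes: 0 = censored, 1 = fall, 2 = death without a prior fall."""
    at_risk = len(times)
    event_free = 1.0  # probability of no event of any type just before the current time
    cif = 0.0
    for _, cs in _grouped(times, causes):
        d_cause = sum(1 for c in cs if c == cause)
        d_any = sum(1 for c in cs if c != 0)
        cif += event_free * d_cause / at_risk
        event_free *= 1 - d_any / at_risk
        at_risk -= len(cs)
    return cif


def naive_one_minus_km(times, causes, cause=1):
    """One minus Kaplan-Meier, wrongly treating competing events as censoring."""
    at_risk = len(times)
    surv = 1.0
    for _, cs in _grouped(times, causes):
        d_cause = sum(1 for c in cs if c == cause)
        surv *= 1 - d_cause / at_risk
        at_risk -= len(cs)
    return 1 - surv


# Hypothetical toy data: four patients, two fall and two die without falling
times = [1, 2, 3, 4]
causes = [1, 2, 1, 2]

risk_with_competing = cumulative_incidence(times, causes)
risk_naive = naive_one_minus_km(times, causes)
```

    In this toy cohort two of four patients fall, so the cumulative incidence is 0.5, whereas the naive estimate is 0.625, illustrating the over-estimation described above.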

    All data were derived from routine electronic health records, including the outcome definition of falls. Such a definition might not capture all events that could be included in the ProFaNE (Prevention of Falls Network Europe) consensus definition of a fall (ie, an unexpected event in which the participants come to rest on the ground, floor, or lower level),51 and therefore the model results should be interpreted in this context. Some fall events may not have been reported or captured correctly within the electronic health record, which would lead to underestimation of the incidence of falls and could have affected the performance of the model.

    Assessments of the models’ predictive performance were conducted across a range of general practices, with different case mix and outcome prevalence, giving an indication of the expected spread of performance across a range of subpopulations. Model performance varied more among smaller practices, with more consistent performance seen as practice size increased. This reflects the increased uncertainty in the estimation of the predictive performance measures in practices of low sample sizes, many of which individually would have failed to meet the required sample size for this external validation. Prediction intervals from meta-analyses across general practices give an indication of how well our falls models would be expected to perform in new practices, helping to inform decisions on implementation in practice. In the present study, the prediction intervals were relatively narrow across a range of performance statistics, suggesting that the models would perform similarly in a new practice from a similar population.
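
    One commonly used approximation for a 95% prediction interval from a random effects meta-analysis of k practices is given below. It is shown as a sketch of the general idea and is not confirmed as the exact method used in the study. The notation is introduced here for illustration: \hat{\mu} is the pooled estimate on the scale used for pooling, \hat{\tau}^2 is the estimated between practice variance, \widehat{SE}(\hat{\mu}) is the standard error of the pooled estimate, and t_{k-2, 0.975} is the 97.5th centile of a t distribution with k-2 degrees of freedom.

```latex
\hat{\mu} \;\pm\; t_{k-2,\,0.975}\,\sqrt{\hat{\tau}^{2} + \widehat{\mathrm{SE}}(\hat{\mu})^{2}}
```

    Because the between practice variance is added to the uncertainty in the pooled estimate, a prediction interval of this form is at least as wide as the corresponding confidence interval.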

    All variables included in our model were predetermined based on the literature, although we did choose to exclude some variables at the model development stage that had exhibited a negligible effect on the outcome. These variables were excluded because they did not contribute substantially to model predictions and served to unnecessarily increase the complexity of the equation. We did not use statistical selection methods such as backwards or forwards elimination, as these can lead to overfitting. Although our approach may have meant that some statistically significant (but clinically insignificant) predictors were excluded from the final model, these exclusions are unlikely to have led to overfitting, given the large sample size, or to have caused the miscalibration seen in the external validation.

    For these models, we defined binary variables for antihypertensive drugs as any prescription within the year before (and including) the index date, without accounting for any changes to drugs during follow-up. Not allowing for the time varying nature of treatment could potentially affect the observed associations with falls risk, and so too the predicted risks obtained from the model. However, our model is intended to give a prediction for risk of falls over the next 1-10 years, from a particular moment in time, in the context of current care. The latter is important, because, for example, if a patient has low risk, then it means that current care (ie, treatments and monitoring strategies over the next 1-10 years) is likely to be adequate for this individual. In contrast, if an individual’s risk is high, it means that current care is likely insufficient and that additional or alternative approaches are potentially needed.
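
    The binary exposure definition described above can be sketched as follows. This is a minimal illustration, not the study's data preparation code: the function name is hypothetical, and treating "the year before" as a 365 day window is an assumption.

```python
from datetime import date, timedelta


def prescribed_in_prior_year(prescription_dates, index_date, window_days=365):
    """Return True if any prescription falls within the window ending on
    (and including) the index date. Prescriptions after the index date are
    ignored, so changes to drugs during follow-up do not alter the variable."""
    start = index_date - timedelta(days=window_days)
    return any(start <= d <= index_date for d in prescription_dates)


# Hypothetical example patient
index_date = date(2010, 6, 1)
on_antihypertensive = prescribed_in_prior_year(
    [date(2008, 1, 10), date(2009, 12, 15)], index_date
)
```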

    Calibration performance of the prediction model was inconsistent across the model development and validation datasets. Such miscalibration was surprising, as populations were similar across both datasets for predictor distributions and the incidence of falls and of death (with the exception of self-reported characteristics such as smoking status, alcohol consumption, and ethnicity, which may reflect differences in how these data are captured within the electronic health record systems that underlie these databases). Distributions of the linear predictor were also consistent across the development and validation datasets, suggesting miscalibration could be due to differences in the outcomes or the outcome recording or coding. This is representative of real life, where outcome definitions vary, and both models still exhibited useful discrimination and potential clinical utility across the full population for a range of treatment decision threshold probabilities, although the predicted risk for individuals may be different (miscalibrated) from their actual risk. Indeed, miscalibration was most evident in the 5-10% of patients with the highest predicted risk (those above a threshold of 10%), and in these patients, doctors may interpret the exact predicted risks with caution, even though these patients can still be considered at higher risk.

    Comparison with previous literature

    Several prediction models can now estimate an individual’s risk of falls, including those for use in the community. A recent systematic review of development and validation studies identified a total of 72 existing models.10 These were typically poorly reported, with only 40 studies (56%) reporting discrimination statistics and seven studies (10%) reporting calibration. Only three models were externally validated. Discrimination was reported with area under the curves of 0.49 to 0.87 for internally validated models and 0.62 to 0.69 for externally validated models. Calibration was moderately good but presented in 10ths of risk across a small range of risk thresholds (eg, 0-10% 52) making it difficult to determine how calibration varied across the full range of predicted probabilities. All studies were deemed at high risk of bias owing to methods of analysis and outcome assessment along with restrictive eligibility criteria.

    In contrast, our final model, reported in line with the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) guidelines for reporting of clinical prediction models53 (see supplementary table S4.3), showed excellent discrimination upon external validation, with an area under the curve of 0.84. It demonstrated reasonable calibration across the low range of predicted risks typically examined by previous risk models (eg, 0-10%) and although miscalibration was present at higher predicted probabilities, there was still clinical utility based on the decision curve analysis. This suggests that the present model is the most promising clinical prediction model for falls available to date, and that it may be effective in identifying individuals at high risk of falls among those in primary care with raised blood pressure.

    Implications for policy and practice

    As patients age, their risk of a fall resulting in serious injury and long term disability increases.4 Identifying those most at risk is therefore important to enable targeting of fall prevention strategies.7 The present model provides primary care doctors with a method of estimating the risk of falls using data routinely available in electronic health records and could have uses beyond predicting falls in patients being considered for antihypertensive treatment.54

    Among patients aged 40 years and older, with an indication for antihypertensive drugs owing to raised blood pressure, the model was shown to distinguish well those at high risk of falls in the next 1-10 years. Miscalibration was noted, with an under-prediction of risk seen particularly at higher predicted probabilities. Depending on how the model might be used, such under-prediction might be less of a concern—for example, if the model was being used to inform treatment changes only above a certain threshold of predicted risk. In this context, doctors could be confident that the true risk is at least at this threshold, if not higher. Further studies are, however, needed to explore the appropriate thresholds that maximise the model’s clinical utility and cost effectiveness, and to examine whether recalibration is possible in local settings.

    The model may also be used to target falls prevention strategies to patients with the highest risk. These strategies might include multifactorial or exercise based interventions,50 or review of prescribed drugs, with those drugs likely to increase the risk of falls being considered for deprescribing.4 18 Such drug reviews are increasingly being encouraged in routine clinical practice, and the STRATIFY-Falls model may be useful for informing these reviews.55 For example, in patients prescribed antihypertensive treatment, the model might be used alongside a cardiovascular risk prediction algorithm to compare the potential for benefit and harm from continued treatment prescription.26 27 56 For individuals with a high risk of falls but low risk of cardiovascular disease, a doctor might consider whether new or continued antihypertensive treatment is still appropriate. We examined the prevalence of this scenario in our model development population (fig 8) and identified only a small number of individuals (0.1%) who would be classified in this way, when comparing risks at 10 years. More common, however, were individuals with a low risk of falls but high risk of cardiovascular disease (affecting one in three patients). For these patients, doctors could use the model to illustrate the minimal risk of harm for individuals, potentially improving uptake of, adherence to, and persistence with antihypertensive treatment, which is known to be poor currently.57


    The STRATIFY-Falls prediction model helps to identify those at high risk of falls and could be used by doctors wanting to identify patients who might benefit from targeted fall prevention strategies, including multifactorial or exercise based interventions,50 and drug reviews. Alongside other prediction tools, such as those for cardiovascular risk, the model could be valuable as part of a wider risk assessment for falls prevention.

    What is already known on this topic

    • Serious falls are a possible side effect of antihypertensive treatment, which can adversely affect patients’ quality of life and increase the risk of hospital admission, especially in older people with frailty

    • Existing tools that estimate an individual’s risk of falls have been shown to be at high risk of bias, with only moderate discriminative ability

    What this study adds

    • In the present study, a clinical prediction model for the risk of falls for up to 10 years was developed and externally validated, incorporating commonly recorded patient characteristics, comorbidities, and drugs, in patients with an indication for antihypertensive treatment

    • Upon external validation, the model discriminated well between patients who went on to have a serious fall and those who did not, but calibration indicated under-prediction of risk

    • Nevertheless, a decision curve analysis suggests the model has clinical utility and so may be useful to identify patients with a high fall risk, who may require closer monitoring or early intervention to prevent future falls

    Ethics statements

    Ethical approval

    The study protocol was approved by the Clinical Practice Research Datalink (CPRD) independent scientific advisory committee in February 2019 before obtaining the data relevant to the project (protocol given in the eAppendix in the supplementary material). As all data are fully anonymised, no consent was required. A project summary is published on the CPRD website (

    Data availability statement

    Data were obtained via a Clinical Practice Research Datalink (CPRD) institutional licence. Requests for data sharing should be made directly to the CPRD. The algorithm is freely available for research use and can be downloaded from Code lists used to define variables included in the dataset are available at


    The STRAtifying Treatments In the multi-morbid Frail elderlY (STRATIFY) investigators include the authors already listed and: John Gladman, professor of medicine of older people, School of Medicine, University of Nottingham; Simon Griffin, professor of primary care, Department of Public Health and Primary Care, Primary Care Unit, University of Cambridge; and Margaret Ogden, patient and public involvement advisor.

    We thank Lucy Curtin for administrative support throughout the project and Margaret Ogden, Simon Griffin, and John Gladman for their contributions as STRATIFY Investigators to the project. The Hospital Episode Statistics data used in this analysis are reused with permission of NHS Digital, which retains the copyright for those data. We thank the Office for National Statistics for providing data on mortality. The ONS and NHS Digital bear no responsibility for the analysis or interpretation of the data. Finally, we are grateful to all those patients who permitted their anonymised routine NHS data to be used for this research.


    • Contributors: JPS conceived the project and wrote the protocol with FDRH, RJM, RS, and RDR. CK and SLF extracted data for analysis. CK developed the model under supervision of JS and RS. LA validated the model under supervision of RDR and KIES. LA, KIES, and RDR wrote the first draft of the manuscript. All authors revised the manuscript and approved the final version. JPS is the guarantor for this work and accepts full responsibility for the conduct of the study, had access to the data, and controlled the decision to publish. The corresponding author (JPS) attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

    • Funding: JPS and CK were funded in whole, or in part, by the Wellcome Trust and Royal Society via a Sir Henry Dale fellowship held by JPS (ref: 211182/Z/18/Z) and the National Institute for Health and Care Research (NIHR) School for Primary Care (project 430) awarded to JPS. JPS also receives funding via an NIHR Oxford Biomedical Research Centre (BRC) senior fellowship. RJMcM is supported by an NIHR senior investigator award. FDRH acknowledges part support from the NIHR ARC Oxford Thames Valley and the NIHR Oxford University Hospitals BRC. KIES is funded by an NIHR School for Primary Care Research (SPCR) launching fellowship. SLF was part funded by the NIHR BRC and NIHR Applied Research Collaboration (ARC) Oxford and Thames Valley. AB has received research funding from AstraZeneca, NIHR, BMA Medical Research Foundation, and UK Research and Innovation. RAP receives funding from the NIHR. AC is part funded by NIHR ARC Yorkshire and Humber and Health Data Research UK, an initiative funded by UK Research and Innovation Councils, NIHR and the UK devolved administrations, and leading medical research charities. The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care. The sponsor and funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

    • Competing interests: All authors have completed the ICMJE uniform disclosure form at and declare: authors had financial support from the Wellcome Trust, Royal Society, and National Institute for Health and Care Research for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

    • The manuscript’s guarantor (JPS) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as originally planned (and, if relevant, registered) have been explained.

    • Dissemination to participants and related patient and public communities: Findings from this study will be press released alongside publication of this manuscript. Social media (eg, Twitter) will be used to draw attention to the work and stimulate debate about its findings. We will also publish a lay summary of our findings on our study website and make the underlying developed algorithms freely available for academic use here:

    • Provenance and peer review: Not commissioned; externally peer reviewed.

    This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: