

Case finding for patients at risk of readmission to hospital: development of algorithm to identify high risk patients

BMJ 2006; 333 doi: (Published 10 August 2006) Cite this as: BMJ 2006;333:327
  1. John Billings, professor (john.billings{at},
  2. Jennifer Dixon, director of policy2,
  3. Tod Mijanovich, research scientist1,
  4. David Wennberg, president and chief operating officer3
  1. 1 Centre for Health and Public Service Research, New York University, 295 Lafayette St, New York, NY 10012, USA
  2. 2 King's Fund, London W1G 0AN
  3. 3 Health Dialog Analytic Solutions, Health Dialog Corporate Headquarters, Sixty State Street, Suite 1100, Boston, MA 02109, USA
  1. Correspondence to: John Billings
  • Accepted 12 April 2006


Objective To develop a method of identifying patients at high risk of readmission to hospital in the next 12 months for practical use by primary care trusts and general practices in the NHS in England.

Data sources Data from hospital episode statistics showing all admissions in NHS trusts in England over five years, 1999-2000 to 2003-4; data from the 2001 census for England.

Population All residents in England admitted to hospital in the previous four years with a subset of “reference” conditions for which improved management may help to prevent future admissions.

Design Multivariate statistical analysis of routinely collected data to develop an algorithm to predict patients at highest risk of readmission in the next 12 months. The algorithm was developed by using a 10% sample of hospital episode statistics data for all of England for the period indicated. The coefficients for the 21 most powerful (and statistically significant) variables were then applied against a second 10% test sample to validate the findings of the algorithm from the first sample.

Results The key factors predicting subsequent admission included age, sex, ethnicity, number of previous admissions, and clinical condition. The algorithm produces a risk score (from 0 to 100) for each patient admitted with a reference condition. At a risk score threshold of 50, the algorithm identified 54.3% of patients admitted with a reference condition who would have an admission in the next 12 months; 34.7% of patients were “flagged” incorrectly (they would not have a subsequent admission). At risk score threshold levels of 70 and 80, the rate of incorrectly “flagged” patients dropped to 22.6% and 15.7%, but the algorithm found a lower percentage of patients who would be readmitted. The algorithm is made freely available to primary care trusts via a website.

Conclusions A method of predicting individual patients at highest risk of readmission to hospital in the next 12 months has been developed, which has a reasonable level of sensitivity and specificity. Using various assumptions a “business case” has been modelled to demonstrate to primary care trusts and practices the potential costs and impact of an intervention using the algorithm to reduce hospital admissions.


Improving the management of high cost patients, especially those with long term conditions, is increasingly seen as an important strategy for improving health outcomes and controlling healthcare expenditure and is a key element of current NHS policy.1 An essential component of any strategy to improve care and services for these patients is the development of a case finding mechanism to identify high risk patients accurately so as to enable interventions to be targeted before substantial preventable or avoidable costs have been incurred and health status has deteriorated further. An effective case finding tool is one that identifies as many patients as possible who will have future high costs or hospital resource use without intervention but is not so broad that it includes large numbers of patients who will not incur such costs. The ultimate goal is to target and calibrate resources for interventions to those who will benefit most, allowing savings from reduced subsequent resource use to help in supporting the cost of the intervention.

The importance of an effective approach to case finding has become starkly evident in discussions about the intensive case management programme for older people being piloted in England by Evercare (a business unit of United Health Group, a US healthcare services conglomerate). The programme used a “threshold” approach to case finding, primarily enrolling patients over the age of 65 with a history of two or more emergency admissions in the previous year. However, researchers evaluating the initiative have shown that high rates of previous admissions alone do not necessarily mean continued high risk of future admission. In analyses of historical admission data that used hospital episode statistics from 1997-8 to 2002-3, researchers found that a drop of 75% (from 2.6 admissions/year to 0.6/year) can be expected in the subsequent year for patients with two or more admissions in the base year, even with no intervention.2

Although the evidence base for case management in improving patient satisfaction and health status is not strong,3 a few studies have shown important benefits (including preliminary findings for the Evercare programme4). In an environment of limited resources, understanding the costs and benefits of new interventions and services is essential. In the case of the Medicare pilot programmes in the United States to improve management of elderly patients with chronic conditions, the authorising legislation itself requires that the initiative is budget neutral, with costs of the programmes offset by savings from reductions in hospital admissions or other resource use.5 6 In England, the NHS has stated that primary care trusts and strategic health authorities (the local NHS entities charged with commissioning health care for residents in the area) must reduce the number of emergency bed days by 5% by 2008, expecting that programmes for managing patients with long term conditions can help them to achieve these targets. These programmes are not explicitly required to pay for themselves, but the primary care trusts and strategic health authorities have an obvious interest in assessing the “business case” for intensive case management or other interventions targeted at high cost or high risk patients.

We developed a case finding algorithm as part of a project commissioned by Essex Strategic Health Authority on behalf of the 28 strategic health authorities, the Department of Health, and the NHS Modernisation Agency. In this paper we describe the development of the tool (the patients at risk for re-hospitalisation (PARR) algorithm), assess the “business case” for the algorithm under different scenarios and assumptions, and discuss the implications for policy makers and practitioners interested in implementing effective programmes to manage high risk patients.


We developed the PARR case finding algorithm by using five years of hospital episode statistics data (1999-2000 to 2003-4). We examined admissions in 2002-3 to identify a “triggering” admission for each patient and considered data on previous hospital resource use for each patient for the three previous years (1999-2000 to 2001-2) to predict whether an admission would occur in the 12 months after the triggering admission (looking at data for the remainder of 2002-3 and for 2003-4). We excluded from the analysis patients known to have died in hospital during the triggering admission.
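The observation windows described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `split_history` is hypothetical, a year is approximated as 365 days, and the follow-up window is counted from the date of the triggering admission for simplicity.

```python
from datetime import date, timedelta

def split_history(admissions, trigger_date, lookback_years=3, followup_days=365):
    """Partition a patient's admission dates around a triggering admission.

    Returns (prior, later): admissions in the lookback window before the
    trigger, and admissions in the follow-up window after it.
    Assumption: a year is approximated as 365 days.
    """
    lookback_start = trigger_date - timedelta(days=365 * lookback_years)
    followup_end = trigger_date + timedelta(days=followup_days)
    prior = [d for d in admissions if lookback_start <= d < trigger_date]
    later = [d for d in admissions if trigger_date < d <= followup_end]
    return prior, later

# Invented admission dates for a single patient.
admissions = [date(2000, 5, 1), date(2002, 1, 15), date(2002, 9, 10), date(2003, 3, 2)]
prior, later = split_history(admissions, trigger_date=date(2002, 9, 10))

# Outcome variable: any admission in the 12 months after the trigger.
readmitted_within_12_months = len(later) > 0
```

The `prior` list supplies the history from which predictors are built, and `readmitted_within_12_months` is the outcome the model tries to predict.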

Defining characteristics of PARR algorithm

Focus on reference conditions for which improved management can help prevent future admissions

Clearly, a large proportion of hospital admissions cannot be prevented or avoided even with the most effective care and case management. For example, most major trauma is generally not preventable or avoidable. A broad range of surgical procedures and medical conditions (heart attacks, treatment of neoplasms, congenital defects) exists for which the need for care is largely driven by factors beyond the control of a care management intervention, at least in the medium term and short term. Accordingly, the PARR case finding algorithm focuses on a range of “reference” conditions (such as congestive heart disease, chronic obstructive pulmonary disease, diabetes, sickle cell disease) for which timely and effective ambulatory care, case management, or social services have the potential to help to reduce the risks of readmission. These conditions, listed in the appendix, represent almost a third of all emergency medical admissions.

Use of hospital admission as “triggering” event

The PARR case finding algorithm uses an emergency hospital admission for a reference condition as a “triggering” event. The algorithm incorporates diagnostic information from that admission and then examines data on previous resource use, characteristics of the patient, contextual information on the patient's electoral ward of residence, and the hospital of admission to create a “risk score” for the probability of another admission in the next 12 months. Use of this triggering event helps to improve the discriminatory power of the algorithm, as patients with reference conditions are prone to readmission, often within a short period after the first admission.

Designed to identify patients at risk of future admissions

Risk of hospital admission is dynamic: patients admitted multiple times in one year may, or may not, be admitted again in subsequent years. Using logistic regression techniques, the algorithm by design attempts to identify patients most at risk of a subsequent admission in the next 12 months, creating a risk score for each patient with a triggering admission. The risk score ranges from 0 to 100; higher scores indicate a greater risk of admission in the next 12 months.
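A logistic regression model turns a weighted sum of predictors into a probability, which can be rescaled to a score out of 100. The sketch below shows that step only. The intercept, coefficient values, and feature names are hypothetical and chosen for illustration; they are not the published PARR coefficients.

```python
import math

# Hypothetical values for illustration only; not the published coefficients.
INTERCEPT = -1.2
COEFFICIENTS = {
    "age_75_plus": 0.30,                      # 1 if aged 75 or over, else 0
    "copd": 0.45,                             # 1 if COPD recorded, else 0
    "emergency_admissions_prev_365d": 0.25,   # count of admissions
}

def risk_score(features):
    """Return a risk score on a 0 to 100 scale from a logistic model."""
    linear = INTERCEPT + sum(
        coef * features.get(name, 0) for name, coef in COEFFICIENTS.items()
    )
    probability = 1.0 / (1.0 + math.exp(-linear))  # logistic function
    return 100.0 * probability

low = risk_score({"emergency_admissions_prev_365d": 0})
high = risk_score({"age_75_plus": 1, "copd": 1, "emergency_admissions_prev_365d": 3})
```

Because the logistic function is monotonic, any predictor with a positive coefficient raises the score, and the score can approach but never reach 0 or 100.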

Use of a broad range of variables to help predict risk

The PARR case finding algorithm incorporates a broad range of variables relating to the patient, community, and hospital to help predict risk of readmission.

Data on patients' previous use of hospital—Diagnostic fields in computerised hospital admission records for the current admission and any admission in the previous three years provide data on whether the patient has a chronic condition or comorbidities. Also available is previous frequency of admission, as well as day case attendance, consultant treatment specialty, and demographic characteristics (age, sex, ethnicity).

Community characteristics—The algorithm incorporates characteristics of the community in which the patient resides, including demographic data and underlying age and sex adjusted rates of admission for conditions that are sensitive to physicians' practice styles. The last variable is important because admission rates are not only a function of effective care, patients' characteristics, and social circumstances or resources but can also be affected by a physician's threshold to refer a patient to hospital and by the admitting physician's threshold for admission.7 When developing the algorithm, we saw a greater than 20-fold variation in admission rates among electoral wards in England for these conditions.

Hospital of current admission—Practice style of physicians at the hospital of current admission is also relevant for similar reasons.8-10 When developing the algorithm, we saw a more than threefold variation among hospitals in the rate at which patients were readmitted for practice style sensitive conditions during a 12 month follow-up period.
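The community and hospital variables enter the model as observed:expected ratios. One common way to compute such a ratio is indirect standardisation, in which reference rates for each age and sex stratum are applied to the local population to obtain the expected count. The text does not specify the standardisation method, so the sketch below is an assumption, and the function name, strata, and figures are invented for illustration.

```python
def observed_expected_ratio(observed, population_by_stratum, reference_rate_by_stratum):
    """Observed:expected ratio by indirect standardisation (assumed method).

    expected = sum over strata of (local population x reference rate).
    """
    expected = sum(
        population * reference_rate_by_stratum[stratum]
        for stratum, population in population_by_stratum.items()
    )
    return observed / expected

# Invented figures for one electoral ward.
ward_population = {("65-74", "F"): 1000, ("75+", "F"): 500}
national_rates = {("65-74", "F"): 0.02, ("75+", "F"): 0.05}

ratio = observed_expected_ratio(54, ward_population, national_rates)
```

A ratio above 1 indicates more admissions than the ward's age and sex mix would predict, and a ratio below 1 indicates fewer.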

Designed to be used in real time or with archival analysis only

Because effective discharge planning is likely to be an essential component of many intervention strategies,11 12 the algorithm is designed primarily for application in real time while the patient is still in the hospital. Patients are most vulnerable immediately after discharge, and planning and organising an intervention during the hospital stay can be critical to effective care management. Two “archival” approaches that do not entail real time application have also been developed, involving analysis of archived admission data on a monthly or annual basis to identify patients who could be targeted for an intervention in the next 12 months, and are intended for use where local information technology capacity is limited or where obtaining real time data on admissions is difficult or not feasible.

Variables selected

We created a set of variables on previous hospital resource use and diagnostic history from hospital episode statistics data for the triggering admission and the previous three years. We also created a variable to identify emergency admissions that occurred in the 12 months after discharge for the triggering admission. We dropped admissions with missing admission and discharge dates or with missing admission classification (emergency or elective) from the analysis (less than 1% of cases). We based disease presence and diagnostic history on the presence of ICD-10 (international classification of diseases, 10th revision) codes in any diagnostic field (primary or secondary) in hospital episode statistics data. The diagnostic cost groups/hierarchical condition category variable includes diagnostic categories from the diagnostic grouping programme for public use developed by DxCG to risk adjust payments to managed care plans for the Medicare programme in the United States.13 The programme examines all diagnostic fields and assigns patients to one of the 172 hierarchical categories on the basis of the seriousness of the conditions recorded as primary or secondary diagnoses.

We combined these data with data on demographics and hospital resource use characteristics of the patient's ward of residence (as discussed above). We did a series of stepwise logistic regressions to identify which variables were helpful in predicting a subsequent admission in the next 12 months. Initially, we tested a broad set of 69 variables; in the final equation, we found that 21 variables were significant predictors and included them in the model to produce the algorithm (box).

We developed the algorithm by using a 10% reference sample of hospital episode statistics data for all of England for the period indicated. We then applied the coefficients for the 21 variables against a second 10% test sample to validate the findings of the algorithm from the first sample. Rates of case finding, specificity, and sensitivity differed by only 1-2% in the two samples, and data reported here are for the test sample. A full report detailing the development and performance of the algorithm and a specification document with regression coefficients for each variable used in the algorithm are available on the project website. A Microsoft Access program implementing the algorithm for use with admitted patient care or hospital episode statistics data sets is also available at the site at no charge.

Variables included in PARR case finding algorithm

  • Alcohol related diagnoses

  • Cerebrovascular disease

  • Chronic obstructive pulmonary disease

  • Connective tissue disease/rheumatoid arthritis

  • Developmental disability

  • Diabetes

  • Ischaemic heart disease

  • Peripheral vascular disease

  • Renal failure

  • Sickle cell disease

  • Previous admission for respiratory infection

  • Number of different treatment specialists seen

  • Age 65-74, age 75+

  • Sex

  • Ethnicity

  • Previous admission for a reference condition

  • Number of emergency admissions in previous 90, 180, and 365 days

  • Number of non-emergency admissions in previous 365 days

  • Total number of previous emergency admissions in previous three years

  • Average number of episodes per spell for emergency admissions

  • Observed:expected ratio for practice style sensitive admissions in ward of residence

  • Observed:expected ratio for rate of readmissions for hospital of current admission

  • Diagnostic cost groups/hierarchical condition category
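Several of the variables in the box are counts over a lookback window. As an illustration, the numbers of emergency admissions in the previous 90, 180, and 365 days could be derived as sketched below. The function name `emergency_counts` and the exact boundary handling are assumptions, and the dates are invented.

```python
from datetime import date

def emergency_counts(prior_emergency_dates, trigger_date, windows=(90, 180, 365)):
    """Count previous emergency admissions within each lookback window.

    Assumption: an admission counts if it falls 1 to `days` days before
    the triggering admission (the trigger itself is excluded).
    """
    counts = {}
    for days in windows:
        counts[days] = sum(
            1 for d in prior_emergency_dates
            if 0 < (trigger_date - d).days <= days
        )
    return counts

history = [date(2002, 8, 1), date(2002, 4, 1), date(2001, 10, 1), date(2000, 1, 1)]
counts = emergency_counts(history, trigger_date=date(2002, 9, 10))
```

The windows are nested, so an admission 40 days before the trigger contributes to all three counts.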


The two most important indicators in assessing the performance of a case finding algorithm are the percentage of patients who will be admitted in the next 12 months correctly identified by the algorithm (sensitivity) and the percentage of patients flagged by the algorithm who will not be admitted in the next 12 months (1 - positive predictive value). The first indicator is important because it provides a measure of how well the algorithm performs in finding cases that are potentially in need of intervention. If the level is too low, a large number of patients will be “missed” by the algorithm, whose subsequent readmission might have been prevented. The second measure is critical in assessing the potential cost effectiveness of the algorithm and any accompanying health and social care management intervention programme. If the algorithm incorrectly identifies too many patients who would not be readmitted even without any intervention, the net total cost of the intervention initiative will be higher as potential savings from reductions in subsequent admissions are not possible for these patients to help offset the costs of the health and social care management programme.
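Both indicators can be computed directly from risk scores and observed outcomes. The sketch below is illustrative only: the function name is hypothetical, the data are invented, and it assumes that a patient is flagged when the score is at or above the threshold.

```python
def case_finding_metrics(scores, readmitted, threshold):
    """Return (sensitivity, proportion of flagged patients flagged incorrectly).

    Assumption: a patient is flagged when score >= threshold.
    """
    flagged = [s >= threshold for s in scores]
    true_positives = sum(1 for f, r in zip(flagged, readmitted) if f and r)
    n_flagged = sum(flagged)
    n_readmitted = sum(readmitted)
    sensitivity = true_positives / n_readmitted if n_readmitted else float("nan")
    incorrectly_flagged = (
        (n_flagged - true_positives) / n_flagged if n_flagged else float("nan")
    )
    return sensitivity, incorrectly_flagged

# Invented scores and outcomes for six patients.
scores = [80, 60, 55, 40, 30, 20]
readmitted = [True, True, False, True, False, False]
sensitivity, incorrectly_flagged = case_finding_metrics(scores, readmitted, threshold=50)
```

In this toy example, three patients are flagged, two of whom are readmitted, so sensitivity is 2/3 and the proportion flagged incorrectly is 1/3. Raising the threshold typically lowers both figures.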

At a risk score threshold of 50, the PARR algorithm identified 54.3% of patients admitted with a reference condition who would have an admission in the next 12 months; 34.7% of flagged patients were flagged incorrectly (they would not have a subsequent admission). At risk score threshold levels of 70 and 80, the rate of incorrectly flagged patients dropped to 22.6% and 15.7%, but the algorithm found a lower percentage of patients who would be readmitted (table 1). The receiver operating characteristic curve in the figure illustrates the trade-offs for users between sensitivity (true positives) and 1 - specificity (false positives) for the algorithm. The area under the curve is shown as 0.685, indicating a 68.5% probability that a randomly selected patient with a future admission will receive a higher risk score than a randomly selected patient who will not have a future admission.
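The pairwise interpretation of the area under the curve can be made concrete by comparing every readmitted patient with every patient who was not readmitted. The sketch below uses invented data and a hypothetical function name; it counts ties as half, which is the usual convention, and is practical only for small samples.

```python
def auc_by_pairs(scores, readmitted):
    """Area under the ROC curve as a pairwise concordance probability."""
    positives = [s for s, r in zip(scores, readmitted) if r]
    negatives = [s for s, r in zip(scores, readmitted) if not r]
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # readmitted patient scored higher
            elif p == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(positives) * len(negatives))

# Invented scores and outcomes for six patients.
scores = [80, 60, 55, 40, 30, 20]
readmitted = [True, True, False, True, False, False]
auc = auc_by_pairs(scores, readmitted)
```

Here 8 of the 9 possible pairs are ranked correctly, giving a value of about 0.89. A value of 0.5 would correspond to scores that carry no information.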

Table 1

Ability of algorithm to identify patients with reference condition at risk of readmission in next 12 months, at different risk score thresholds


Receiver operating characteristic curve for the algorithm

Accordingly, application of the algorithm presents choices to users, with trade-offs between finding as many patients as possible who will have subsequent admissions in the next 12 months and increasing the net cost of the intervention by including patients who will not be readmitted. In developing the algorithm, we aimed to help potential users to assess the “business case” for various risk score thresholds and for different assumptions about the impact of the intervention. This modelling is sensitive to the assumptions included in the analysis, particularly the cost of the intervention and the rate of anticipated reductions in hospital admissions. In table 2, we have used the “real time” approach to model various assumptions about intervention costs (£500, £750, and £1000 per patient) and reductions in hospital admissions (10%, 15%, and 20%) for patients identified by the algorithm for risk score threshold cut-offs of 50, 70, and 80 for a primary care trust with 1500 admissions for reference conditions (the average number in England). Cost per admission is based on mean hospital specific health resource group tariffs for 2003-4 for reference conditions as applied to sample patients.

Table 2

Business case modelling using algorithm and assuming 1500 “reference” admissions per year


This analysis shows the potential business case feasibility of an intervention if it can achieve moderate levels of success in reducing hospital admissions. Critical to the breakeven analysis is the ability to target the intervention to patients most likely to have future admissions. Focusing on patients with risk scores above 70 (where only 22.6% of flagged patients do not have subsequent admissions) results in net savings for almost all assumptions about admission rates where intervention costs are £750 or less per patient. For patients above the risk score cut-off level of 80, a business case can be made for almost all assumption levels. For all of England, a risk score cut-off level of 70 would flag 50 000 patients annually (about 130 per primary care trust), and at a cut-off level of 80, 25 000 patients would be flagged for inclusion in an intervention (about 60 per primary care trust).
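The arithmetic behind a break-even analysis of this kind can be sketched as follows. This is a simplified illustration, not the model used to produce table 2. The figure of 130 flagged patients per primary care trust and the £500 intervention cost come from the text; the mean number of future admissions per flagged patient and the cost per admission are hypothetical placeholders, and the function names are invented.

```python
def net_saving(n_flagged, mean_future_admissions, reduction,
               cost_per_admission, intervention_cost):
    """Savings from avoided admissions minus the cost of the intervention."""
    avoided_admissions = n_flagged * mean_future_admissions * reduction
    return avoided_admissions * cost_per_admission - n_flagged * intervention_cost

def break_even_reduction(mean_future_admissions, cost_per_admission, intervention_cost):
    """Proportional reduction in admissions at which net saving is zero."""
    return intervention_cost / (mean_future_admissions * cost_per_admission)

# 130 flagged patients and a £500 intervention cost are taken from the text;
# 2.0 admissions per patient and £2000 per admission are hypothetical.
saving = net_saving(n_flagged=130, mean_future_admissions=2.0, reduction=0.15,
                    cost_per_admission=2000, intervention_cost=500)
threshold = break_even_reduction(mean_future_admissions=2.0,
                                 cost_per_admission=2000, intervention_cost=500)
```

Under these placeholder values the intervention breaks even at a 12.5% reduction in admissions. Because `mean_future_admissions` averages over all flagged patients, including those who would not be readmitted, a higher proportion of incorrectly flagged patients pushes the break-even reduction upwards.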


Potential limitations

The limitations of the approach used for the PARR case finding algorithm must be recognised. Firstly, the approach depends on computerised hospital admission data, and the deficiencies of these data are well known. Missing data and inaccurate coding (especially in diagnostic fields) can be a problem, as is the dependence on the “method of discharge” field to identify patients who die (the reliability of the field is not well established and many patients die outside the hospital). These data limitations generally tend to err in the direction of under-prediction rather than over-prediction, and the improved coding that may accompany full implementation of payment by results (the diagnosis based per admission payment scheme for hospital reimbursement with a tariff for each of 550 health resource group diagnostic groups) might help to increase the power of case finding algorithms based on admission data.

In addition, using only previous hospital data (and characteristics of the community and local hospital), we cannot predict the future admissions of patients with no previous admissions. Accordingly, the PARR algorithm is not useful in identifying patients with emerging risks of high cost and high resource use, as opposed to those who are likely to have continuing high risks. Other characteristics of patients' health status are likely to be needed to improve the predictive power sufficiently to identify emerging risks of admission, and these factors are being explored in the next phase of the project when data from general practice electronic medical records (such as test results, lipid concentrations, blood pressure, glycated haemoglobin levels, body mass index, health habits, visit rates), accident and emergency data, hospital outpatient data, and social services data will be incorporated. The PARR model also helps to account for the dynamic nature of risk. Although patients with a high risk today may have lower risk tomorrow, this approach does allow the user some ability to compensate for dynamic risk as risk score thresholds for intervention can be set at higher levels when patients have a history of frequent admission and are at risk of a substantial number of future admissions (see previous and subsequent admission history in table 1).

Finally, we must recognise that the PARR algorithm identifies particular types of high risk patients who have substantial history of hospital resource use and high diagnostic severity. Although the focus on “reference conditions” is meant to target patients for whom some expectation of preventing or avoiding future admissions exists, the ability of intensive case management or other intervention strategies to have an impact on these patients has not been fully established.

Designing interventions

In the short term and medium term, a complete understanding of the most effective design of interventions for high risk patients identified by the PARR algorithm is difficult to achieve. Although a considerable amount is known about the characteristics of these patients, what remains elusive are the specific factors that lead to a preventable or avoidable admission. Could it be inadequate medical care? Lack of knowledge about identifying symptoms or warning signs of an acute episode of a chronic illness? Lack of knowledge about how to respond to such signs? Lack of confidence or motivation in self management? Social or personal factors that interfere with effective self management or optimal care seeking behaviour? Answers to these questions will be important in crafting an effective intervention strategy.

A rational approach would be to interview a sample of patients flagged by the algorithm and their providers from representative primary care trusts or strategic health authorities to learn more about the factors that contributed to any avoidable admission and obtain a better understanding of the range of their needs. This information could then be incorporated into efforts to design interventions, whether the services are ultimately “made” or “bought” by the primary care trust or strategic health authority; in the second case, the information would be used in developing the specifications to tender proposals for delivery of services from potential providers. Once the intervention has begun, primary care trusts and strategic health authorities could also consider randomising patients into intervention and non-intervention arms to learn as much as possible about the effectiveness and costs of the intervention.

What is already known on this topic

Several published studies, principally in the United States, have used statistical modelling to predict the future risk of hospital admission in individual patients

What this study adds

The factors that were most influential in predicting future admissions for reference conditions in the NHS included age, sex, previous admission, and clinical condition

A risk score from 0 to 100 can be assigned to individual patients; at a risk score threshold of 50 the sensitivity of the algorithm is 54.3% and 34.7% of flagged patients are flagged incorrectly

A “business case” can be modelled to show the potential costs and impact of an intervention using the algorithm to reduce hospital admissions


  • An appendix is available online

  • We acknowledge the help of Michael Damiani in preparing some of the hospital episode statistics data.

  • Contributors JB conceived the study, developed the model, did regression analysis, and wrote the article. JD conceived the study, contributed ideas to the development of the model, and helped to draft the article. TM helped to produce some of the indicators in the model and contributed substantially to analysis of the model. DW contributed significant ideas to the development of the model. JB is the guarantor.

  • Funding Essex Strategic Health Authority on behalf of all strategic health authorities in England; the Department of Health; the NHS Modernisation Agency. The researchers are independent of the funders.

  • Competing interests DW is the president and chief operating officer of a private company, Health Dialog Analytic Solutions, which has developed similar algorithms in the United States.

  • Ethical approval Not needed.

