Mr Milburn's good hospital guide
BMJ 2002; 325 doi: https://doi.org/10.1136/bmj.325.7358.230 (Published 03 August 2002) Cite this as: BMJ 2002;325:230
One star for trying
News (p 236)
The self proclaimed aim of Health Secretary Alan Milburn's star awarding exercise, rating the performance of trusts, is simple. It is “to provide patients and the general public with comprehensive, easily understandable information on the performance of their local health services.”1 No one could argue with that. But execution is another matter. Two dilemmas arise when constructing a summary measure of performance in an organisation as complex and heterogeneous as the NHS. The first is how “comprehensive” the measure can be while remaining “easily understandable”: sophistication can all too easily turn into mystification. The second is whether the same sort of exercise can meet the requirements of the multiple audiences involved. The general public apart, these include ministers and their officials, the boards of trusts as well as the doctors, nurses, and others working in them, and the commissioners of services. Attempting to meet everybody's expectations may mean frustration all round.
The Department of Health's performance ratings are the product of a complex game of statistical snakes and ladders.2 For acute trusts, they are based on three very different sets of information (the methodology is different for ambulance and mental health trusts). Firstly, there are the department's own political targets: nine in all, dominated by financial performance and various waiting targets. Secondly, there are the judgments of the inspectors of the Commission for Health Improvement, marking the reviewed trusts on seven dimensions. Thirdly, there are 29 performance indicators split into three groups—with a clinical, patient, and staff focus, respectively—that together make up a so called balanced scorecard.
The various inputs have their own individual technical problems. So, for example, there are statistical problems with the presentation of clinical indicators.3 The Commission for Health Improvement's ratings are based on a fragile, still evolving methodology.4 But the biggest problem lies in the process of converting 38 indicators into a summary measure. If a trust “significantly underachieves” on three of the Department of Health's key targets, it falls automatically into the category of the damned: it receives zero stars and faces the prospect of visits from the missionaries of the NHS Modernisation Agency at best and the threat of being taken into the pupilage of a successful trust at worst. A highly critical report from the Commission for Health Improvement has the same effect. But thereafter all simplicity vanishes. If a trust achieves all of the Department of Health's key targets, it is not automatically guaranteed three star status and what goes with it: the promise of £1 000 000, earned autonomy, and keys to the heaven of foundation status. It may lose one of its stars if it lacks a satisfactory review from the Commission for Health Improvement or adequate balanced scorecard performance (defined as being outside the lowest 20% of the distribution for all three areas and within the top 50% in one area). Conversely, a moderate level of underachievement on the key targets may be compensated for by a satisfactory balanced scorecard performance or review from the Commission for Health Improvement, turning one star into two.
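The rules described above can be set out as a short decision procedure, which shows how quickly the branching accumulates. What follows is only a rough sketch in Python of one reading of this paragraph, not the Department of Health's published method. The names (`Trust`, `scorecard_adequate`, `star_rating`) and the percentile representation are invented for illustration. Several points are assumptions: "three" key targets is read as "three or more", any shortfall below that threshold is treated as "moderate", the boundary values of 20 and 50 are counted as passing, and a trust with no Commission for Health Improvement review is treated as lacking a satisfactory one.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# The three balanced scorecard areas named in the text.
AREAS = ("clinical", "patient", "staff")


@dataclass
class Trust:
    significant_failures: int  # key targets "significantly underachieved"
    moderate_failures: int     # key targets moderately underachieved
    chi_review: Optional[str]  # "critical", "satisfactory", or None (assumed labels)
    percentiles: Dict[str, float]  # area -> percentile rank, 0-100, higher is better


def scorecard_adequate(percentiles: Dict[str, float]) -> bool:
    """Outside the lowest 20% in all three areas and in the top 50% in at least one."""
    outside_bottom = all(percentiles[a] >= 20 for a in AREAS)
    in_top_half = any(percentiles[a] >= 50 for a in AREAS)
    return outside_bottom and in_top_half


def star_rating(trust: Trust) -> int:
    # Automatic zero: three (assumed: or more) significant failures,
    # or a highly critical review.
    if trust.significant_failures >= 3 or trust.chi_review == "critical":
        return 0

    adequate = scorecard_adequate(trust.percentiles)
    satisfactory = trust.chi_review == "satisfactory"

    # All key targets met: three stars, but one is lost if either the
    # review or the scorecard falls short (one reading of the text).
    if trust.significant_failures == 0 and trust.moderate_failures == 0:
        return 3 if (satisfactory and adequate) else 2

    # Some underachievement (assumed to count as "moderate"): one star,
    # raised to two by a satisfactory review or an adequate scorecard.
    return 2 if (satisfactory or adequate) else 1


# A trust that meets every key target but sits in the bottom 20% for
# clinical indicators loses its third star.
example = Trust(0, 0, "satisfactory", {"clinical": 10, "patient": 80, "staff": 80})
print(star_rating(example))  # 2
```

Even in this simplified form, the same number of stars can be reached by quite different routes, which is the difficulty for anyone trying to read a rating backwards into a judgment about a trust.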
The methodology is open, inasmuch as it is available on the web, but hardly transparent. What is the public to make of it all? Even assuming that there is scope for choice, a prospective patient is more likely to be interested in the performance of a specific department or doctor than in an ambiguous star award. The system may well raise unnecessary anxiety among the public as well as anger among clinicians. So, for example, two of the “starless” trusts, Bath and Bristol, score considerably better on their clinical indicators than some with three stars. Also, it is not self evident that either ministers or trust boards need a star system to stir them into action: they know all about the Department of Health's key targets, which, for better or worse, are the main driving force. Further, it is not clear how much stability there is in the ratings: fewer than half of the acute trusts retained their 2001 ratings, with 47 moving up and 37 moving down. Although differences in methodology may be partly responsible, this hardly suggests that the rating system provides a solid base for policy making.
Next year the Commission for Health Improvement takes over responsibility for the assessment system and faces the challenge of making it less opaque and more comprehensible. In doing so, it might usefully consult the original exponent of the star system: the Michelin guide. In classifying hotels Michelin does not just award stars for the cooking. Nor does it try to collapse all aspects of an institution into one metric. Instead, it has an elaborate battery of symbols for different aspects of the performance of the hotel. Something similar for trusts might be richer in information, provoke less anxiety or anger, and above all be more accurate because it is multidimensional.