Rapid Responses published:
R Justin Davies, Surgical Specialist Registrar South Western Deanery
EDITOR - The data provided by Harley et al (1) add further to the debate on analysis of the quality of consultant surgical practice. Results are also available from cardiothoracic surgery (2,3), and efforts such as these to improve competence and protect patients are to be applauded. However, owing to the nature of surgery and its potential complications, surgeons in general are very much in the firing line. How are our physician and general practice colleagues to be assessed? By the number of incorrect prescriptions? By the number of ward rounds carried out per week? These important questions remain to be answered.

1. Harley M, Mohammed MA, Hussain S, Yates J, Almasri A. Was Rodney Ledward a statistical outlier? Retrospective analysis using routine hospital data to identify gynaecologists' performance. BMJ 2005;330:929-932
2. Bridgewater B, on behalf of the adult cardiac surgeons of north west England. Mortality data in adult cardiac surgery for named surgeons: retrospective examination of prospectively collected data on coronary artery surgery and aortic valve replacement. BMJ 2005;330:506-510
3. Aylin P, Alves B, Best N, Cook A, Elliot P, Evans SJ, et al. Comparison of UK paediatric cardiac surgical performance by analysis of routinely collected data 1984-96: was Bristol an outlier? Lancet 2001;358:181-187

Competing interests: None declared
Pantula SRK Sastry, Asst Prof & Medical Oncologist & BMT physician Tata Memorial Hospital, Parel, Mumbai, India
The present article provides an interesting insight. However, scientific data already exist on the relation between surgical volume and surgical outcomes. High-volume surgeons' results are probably superior and low-volume surgeons' results are probably inferior, but part or most of this depends on a centre effect, with high-volume centres giving better results than low-volume centres. Thus to isolate the individual consultant as an outlier may be incorrect; rather, surgical volume and outcomes need to be addressed for both the centre and the consultant.

Reference: Birkmeyer JD, Siewers AE, Finlayson EVA, Stukel TA, Lucas FL, Batista I, Welch HG, Wennberg DE. Hospital volume and surgical mortality in the United States. N Engl J Med 2002;346:1128-1137

Competing interests: None declared
Jan Chalmers, Teacher, Keiskamma Art Project OX2 6HX
I welcome the paper by Harley and his colleagues. I was the nurse member of the Ritchie inquiry team, which the Secretary of State established in 1999 to consider why the serious failures in Rodney Ledward’s practice were not identified and acted on earlier. The gynaecologist member of the team and I ploughed through reams of medical and nursing notes, doing our best to tease out relevant information and set it in context. During the inquiry I watched and listened as colleagues of Mr Ledward talked about how difficult they had found it to raise their concerns and call him to account. I was also privy to the accounts of patients and their relatives, who talked about how they believed their personal and family lives had been disrupted, and in some cases destroyed, by Mr Ledward’s lack of care and professionalism. As Harley et al note, our inquiry, like others, made little use of comparative data regarding the performance of individual consultants. In retrospect, if the kind of statistical analysis of hospital data that they now report had been presented in an easily understood format during the early 1990s, it seems likely that fewer women would have suffered, and a time-consuming and costly inquiry might have been avoided.

Competing interests: I was a member of the Ritchie Inquiry team
C Kevin Connolly, Retired Physician Aldbrough House, Aldbrough St John, Richmond, North Yorkshire, DL11 7TP
In their article Dr Harley and colleagues do not define what they mean by statistical outlier. If statistical outlier is interpreted as an individual whose outlying position in the distribution is almost certainly not due to chance, then they have failed. The numbers outside the 95% confidence intervals are as anticipated from the size of the population, and therefore they are simple outliers. They have probably refined the analysis so that those at the extremes are more likely to be aberrant performers, but they have not demonstrated statistically that this is so. To do so, it would be necessary to show that a disproportionate number were outside the expected confidence limits. This might be done by demonstrating two populations, or possibly by recalculating the confidence intervals using only those between, say, the 20th and 80th centiles and looking for wide outliers (using 0.1% intervals), clusters and asymmetry outside the new intervals. This criticism makes the cautions expressed in the article all the more valid. Attempting to uncover true bad performance by global audit is necessarily a very non-specific process. Whilst most bad performers will be outliers, the converse is not true. This means that the initial investigations of outliers must be carefully chosen, and those investigated treated with delicacy in an absolutely confidential process.

Competing interests: None declared
Benjamin J. Cowling, Senior Research Assistant Department of Community Medicine, University of Hong Kong, Anthony J. Hedley
EDITOR - The article by Harley et al. (BMJ 2005;330:929-32) makes an interesting proposal for retrospectively detecting outliers. However, we think that, as presented, the methodology may be quite difficult for clinical audit teams to understand and apply in routine practice. We also detected a basic error in the methodology.

It might be helpful if we relate the Mahalanobis distance (MD) to a test statistic. For each consultant in each year, the authors calculate a kind of test statistic, which is related to the MD, and then they compare this test statistic against a particular percentile of a reference distribution. From Table 2 of their paper, it seems that approximately 16% of the consultants were flagged as outliers each year. The MD is in fact a straightforward method for calculating the distance (as a single value) between two points in multi-dimensional space. However, for each consultant in each year, the test statistic was not the MD itself, but rather the lower boundary of an approximate 95% confidence interval for that MD, calculated by a computer simulation technique. Thus the test statistic incorporates not only information on how 'unusual' the consultant's performance was, but also information on the degree of uncertainty about this performance.

Since by random variation we would expect some consultants to be flagged as outliers each year, it is a good idea to look at a series of years. If we flag the worst 16% of consultants in each year, then we might expect 3% of consultants to be flagged in 3 or more of 5 years simply by chance (using a simple binomial model). In practice, the authors found that 11 of approximately 100 consultants were flagged as outliers in 3 or more years, including Rodney Ledward.

The basic error that we found in the methodology is in the second paragraph of the Methods subsection entitled "Stage 2". The authors took the mean of the sqrt(chi-sq) distribution to be sqrt(k), where k is the number of degrees of freedom, which in their study is stated as sqrt(7) = 2.66. In general the expectation (mean) of a function of a random variable x is not equal to that function of the expectation of x, E(x). For example, the mean of sqrt(x) is not generally the same as the square root of E(x). A sqrt(chi-sq) distribution with 7 degrees of freedom actually has mean 2.55, not 2.66 as quoted in the paper. This would have made a practical difference to the results if the authors were intending to flag test statistics beyond the 52nd percentile of the reference distribution (corresponding to a cutoff at 2.55, the mean of the distribution) rather than the 58th percentile corresponding to a cutoff at 2.66.

Competing interests: None declared
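The two figures quoted in the letter above can be checked numerically. The following Python sketch is illustrative only and is not the correspondents' own calculation. It assumes a simple binomial model with independent years for the chance of being flagged, and it uses the standard closed form for the mean of a chi distribution with k degrees of freedom, sqrt(2) * Gamma((k + 1) / 2) / Gamma(k / 2). The function names are hypothetical.

```python
import math


def prob_flagged_at_least(p, years, min_flags):
    """P(flagged in at least `min_flags` of `years` years), assuming each
    year is an independent trial with flagging probability `p`."""
    return sum(
        math.comb(years, k) * p**k * (1 - p) ** (years - k)
        for k in range(min_flags, years + 1)
    )


def mean_sqrt_chi_square(k):
    """Mean of sqrt(X) where X is chi-squared with k degrees of freedom
    (i.e. the mean of a chi distribution with k degrees of freedom)."""
    return math.sqrt(2.0) * math.gamma((k + 1) / 2) / math.gamma(k / 2)


if __name__ == "__main__":
    # 16% flagged per year, flagged in 3 or more of 5 years by chance alone
    print(f"P(>=3 of 5 years) = {prob_flagged_at_least(0.16, 5, 3):.4f}")
    # Mean of sqrt(chi-sq) with 7 degrees of freedom versus sqrt(7)
    print(f"mean of sqrt(chi-sq), k=7 = {mean_sqrt_chi_square(7):.3f}")
    print(f"sqrt(7) = {math.sqrt(7):.3f}")
```

Under these assumptions the binomial probability comes to about 3% and the mean of sqrt(chi-sq) with 7 degrees of freedom to about 2.55, in line with the figures given in the letter.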
Simon J Caswell, GP Beehive Surgery, 106/8 Crescent Road, Bolton BL3 2JR
I accept your concern about detecting poor performers early. You quote and analyse a case where poor performance has been established to see if this could have been found earlier. Your summary in fig 1 does, as you say, show the index case as an outlier, but by an infinitesimally small amount, being the least outlying of all the outliers shown. Therefore, to have used this information prospectively, all the outliers would have had to be fully investigated to discover the problem with one of them. I am not convinced that you have demonstrated that this method is viable in a real-life situation. By the way, nowhere does your paper explain what a "fitted variable" is, nor how one may acquire between 0.0 and 2.5 of such creatures, nor what the relevance of the number of fitted variables is to the test. Presumably all those with fitted variables beyond 2 SDs of the mean are outliers in the "fitted variable" analysis. Although I was top in statistics both in my University scholarship exam and at Medical School, I feel that an explanation of these matters would allow those not currently working in statistics to understand the fine details.

Competing interests: None declared
Oliver R Dearlove, Consultant Anaesthetist Royal Manchester Children's Hospital
This is a very interesting paper and the authors are to be congratulated for a thought-provoking article which so far has not elicited much response from the section that it concerns. The big question is not whether Rodney Ledward was an outlier, but what to do with those that stuck out with him. I note that a similar question has been raised on the statistical outliers in exam marking. [1]

It is only a matter of time before a CMO or minister of health of any political party points out its significance in order to gain public plaudits and a vote, and tells the GMC that this sort of audit is the way forward for revalidation in order to gain more plaudits and more votes. People will notice that ministers have lately been very motivated on the subject of staying in office. This motivation may soon leave them, but their appetite for calling good doctors bad doctors because of their associations may well remain.

We have so many measures that are non-starters in the field of revalidation that it is worth taking some time to examine the wider implications of this article. Dame Janet, for example, tipped patient care questionnaires despite having heard months of evidence of how wonderful Dr Shipman was once thought to be. This article on the face of it could be much better than that.

Let us first decide to whom the article applies or could be applied. It is those with measurable output. Surgeons have been slow to realise that their output can be measured in terms of operations done, morbidity and mortality. When the going was good (heart transplants), surgeons were ready to appear on television and take the praise. Now that they are under scrutiny, we hear much more about team work, systems errors and dangerous wards. I have previously given references on the contribution of anaesthetists to morbidity following cardiac surgery. [2]

It is a question of whether the statistical methods that select Rodney Ledward work to select other surgeons who are not so good or should become administrators. First of all, of course, the authors knew Ledward was the outlier and so they knew where to look. So let us suppose there were ten indicators, on four of which Ledward performed badly. In those four, how were his nearest neighbours scoring and what were their fates? The indicators were weighted or skewed to show that Ledward stuck out in a multi-dimensional space, but that of course would be easy: middle-aged white gynaecologist working in the south east, with a large private practice, who was suspended in 1996 would do the trick. What one really wants to show is that those stuck out on a limb with him are also labelled, and that investigation showed that their practice was not normative. Otherwise the system does not work, although many votes might be garnered by saying it obviously does, or must do, or could do. Isn't it lucky that Mr Ledward was not a foreign medical graduate, because if that were the case all hell would be let loose.

Now let us look at a few indicators where Ledward performed at the mean, even though they were pre-selected as likely to be abnormal. The fact that he is stuck in with the herd does not mean that they might not be significant in another set of circumstances. In short, we have no idea what performance on those selected indicators “means”. In a stock market analogy we could call any authors who feel these methods are relevant Chartists. And everyone knows that in the stock market there are people who put their money on chartists and there are other people who don’t.

The surgeons under examination will naturally ask whether other practitioners are being subjected to this investigation, especially as it has not been shown to be relevant, meaningful or applicable. The registrar above has pointed out that physicians who specialise in diabetes or cystic fibrosis will not be subject to this kind of analysis of their long term patients. I thought his suggestion of counting the number of mistakes a consultant physician makes in his prescribing was a good one, as both the registrar and I will never have seen one do it.

I note there is a literal error in the script. In the paper version Ledward was suspended in 1996 and in the electronic version 1966. And this, I think, highlights the crux of the discussion. Doctors have to perform to very high standards with very little tolerance of error and mistake. However, the regulation and investigation of doctors’ performance can be based on opinion, political affiliation and trial and error, even when it comes to revocation of licences. It is also worth pointing out that, as the data collected on practitioners are so poor and unreliable in many cases, they are not indicative of the poor environment in which we are expected to work, yet we are vilified if our performance is 2 standard deviations away from the mean.

Doctors have a duty to be careful in their practice but they also have a right to be assessed in a fair and equitable fashion. Patient safety must come first. That must not mean that everyone is stopped from seeing a doctor. Clearly there are still some who wish to go to a doctor because they trust him to know more about their diagnosis than they do. Regulators who judge standards must have an evidence base on which to base important decisions such as staying in office or revoking a licence. It remains to be shown that the authors’ method is significant, applicable and useful.

Oliver R Dearlove FRCA

Refs
1. Cheating in written examinations needs addressing too. http://bmj.bmjjournals.com/cgi/eletters/330/7499/1064
2. Dearlove OR. Slogoff: Who was anaesthetist 7? http://bmj.bmjjournals.com/cgi/eletters/330/7499/1064

Competing interests: These views are personal and are not shared by his employer or any other organisation