G H Hall, Retired physician
The authors say that "The likelihood ratio (LR) of a positive test is the odds that the test will be positive in a patient with the condition compared with a patient without the condition." This is not the case. "Odds" is the ratio of the probability of an event occurring to the probability of it not occurring. The LR is the ratio of the probabilities of two quite separate events: a positive test in those with the disease and a positive test in those without it. This error is probably the result of an uncritical acceptance of the definition provided in their reference 20 (Sackett et al.'s textbook). This shows how important it is, even or especially for the experts, to get their advice right.
Alan Hassey, GP Fisher Medical Centre
We did use Sackett's definition (ref 20). The statistics were all calculated in StatsDirect(TM) using the method shown below. The EPR-Val calculator program calculates the same statistics (but no confidence intervals for LR+ or LR-) plus TPFN ratio & DBFind(10,000). The calculator does provide the correct results for the LR+ & LR-.
                DISEASE/OUTCOME
                Present          Absent
TEST    +       a (true +ve)     b (false +ve)
        -       c (false -ve)    d (true -ve)
Sensitivity = a/(a+c)
Specificity = d/(b+d)
Likelihood ratio of a positive test = [a/(a+c)]/[b/(b+d)]
Likelihood ratio of a negative test = [c/(a+c)]/[d/(b+d)]
Likelihood ratios allow you to quantify the effect that a test result has on the probability of an outcome (e.g. diagnosis of a disease). Using a simplified form of Bayes' theorem: posterior odds = prior odds * likelihood ratio
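The formulas above can be sketched in a few lines of Python. This is a minimal illustration only, not the StatsDirect or EPR-Val implementation; the function names and the example counts are hypothetical, chosen to make the arithmetic easy to follow. The sketch does not guard against empty cells (for example, b = 0 leaves LR+ undefined).

```python
def likelihood_ratios(a, b, c, d):
    """Return sensitivity, specificity, LR+ and LR- for a 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    lr_positive = sensitivity / (b / (b + d))  # [a/(a+c)] / [b/(b+d)]
    lr_negative = (c / (a + c)) / specificity  # [c/(a+c)] / [d/(b+d)]
    return sensitivity, specificity, lr_positive, lr_negative


def posterior_probability(prior_probability, likelihood_ratio):
    """Apply posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior_probability / (1 - prior_probability)
    posterior_odds = prior_odds * likelihood_ratio
    # Convert the posterior odds back to a probability.
    return posterior_odds / (1 + posterior_odds)


# Hypothetical counts, for illustration only (not data from the survey).
sens, spec, lr_pos, lr_neg = likelihood_ratios(a=90, b=10, c=10, d=90)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
print(f"LR+={lr_pos:.2f} LR-={lr_neg:.2f}")
print(f"posterior probability={posterior_probability(0.5, lr_pos):.2f}")
```

Note that both likelihood ratios are ratios of two probabilities taken from different columns of the table, whereas the odds used in the Bayes step compare an event with its own complement.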
Bernard Fernando, General Practitioner Thames Avenue Surgery, Rainham, Kent. ME8 9BW
Validity of electronic patient records depends on the quality of the electronic system as well as the user.
EDITOR - The electronic patient record (EPR) systems in general practice were, until recently, used largely as data repositories. The situation is now changing rapidly with the introduction of clinical governance. As there is a need to derive information from data in the electronic records, the validity of the data becomes important. The survey by Hassey et al. (1) is therefore timely, and much more work is needed in this area.
It is of interest to note that two men in their survey were recorded as having had cervical smears. This may appear trivial, but it reflects a design fault of the EPR. Recording information on paper is a simple process with few steps, all of which are visible. The same process on a computer involves many more steps, some of which are invisible. There is a potential for error at each step, which increases the chance of errors in the EPR. The data recorded can be invalid simply because of a wrong diagnosis or the accidental selection of an incorrect Read code from a pick list. Both of these are possible in the paper system; I have seen "anxiety t.d.s" written in notes as a prescription. More worryingly, we have seen a cytotoxic drug incorrectly recorded for a child due to a mapping error of the system and an invisible record that could only be seen at the back end of the database.
The EPR cannot be compared with the paper record, as some features are unique to it. The automated search facility and the ability to include an audio or video clip as part of the record are not available in the paper record. Different standards are also expected of an EPR, such as error trapping at user interface level and design of the interface to minimise error (2). Which actions are allocated to the user and which to the computer is a system design issue. Building systems from tested and 'certified' components and using object oriented software technology for development may address these issues.
Bernard Fernando
(1) Hassey A, Gerrett D, Wilson A. A survey of validity and utility of electronic patient records in general practice. BMJ 2001; 322:1401-5. (9 June.)
(2) Wyatt J. Same information, different decisions: format counts. BMJ 1999; 318:1501-2.
Philip J Bayliss Brown, Honorary Senior Lecturer in Medical Informatics Department of Diabetes & Endocrinology, St Thomas’ Hospital, Kings College, London
EDITOR - Hassey has rightly highlighted the importance of ensuring that electronic records are accurate. (1) The study explored a method of measuring the validity and utility of electronic records in general practice, including whether the coding of 15 marker diagnoses was a true reflection of the actual prevalence. However, the authors are wrong in their assertion that no published accounts of measuring the validity of electronic record contents exist. Hogan performed a literature review and compared 20 articles that met certain quality criteria. (2) He recommended (as has been used in Hassey's paper) that measures of completeness (sensitivity or detection rate) and correctness (positive predictive value) were valuable. These measures have also been shown to be valuable in measuring the quality of data retrieval. (3)
Other measures derived from 2×2 contingency tables are less likely to be helpful because of the combination of a large total number of records and true negatives. To compensate for this, Hassey proposes two new descriptive statistics. Previous reports have, however, used Cohen's kappa (4), a measure of the strength of agreement between the observed retrieval and the gold standard compared with the agreement that might be expected by chance. Cohen's kappa has the advantage of being a well-validated single index and has been shown to be a useful index of data retrieval from electronic records, where performances of >0.9 can be achieved. (3) When Cohen's kappa is applied to Hassey's data it highlights similar priority areas of data concern where the value is <0.9 (obesity = 0.04, hypothyroidism = 0.89, iron deficiency anaemia = 0.86, asthma = 0.86).
Prescriptions generated were also compared with those dispensed by a local pharmacy. As they were computer generated, unsurprisingly 99.7% were reported to be valid; however, of the 10 handwritten prescriptions only 80% were accurately recorded. Perhaps a more suitable design would have been to check in a sample how many of the prescriptions reflected the correct dose and frequency?
Hassey claims that the principal innovation of the study was the use of Read codes as the test for the true presence of a diagnosis, despite Gray's earlier account of identifying patients with ischaemic heart disease using a similar technique and reporting exactly the same sensitivity rate (96%). (5) The approach used by Hassey in triangulating disease codes with treatments and other findings has merit, but due consideration should have been given to existing literature.
1 Hassey A, Gerrett D, Wilson A. A survey of validity and utility of electronic patient records in a general practice. BMJ 2001; 322: 1401-1405. (9 June.)
2 Hogan WR, Wagner MM. Accuracy of data in computer-based patient records. JAMIA 1997; 4: 342-55.
3 Brown PJB, Sönksen P. Evaluation of the quality of information retrieval of clinical findings from a computerised patient database using a semantic terminological model. JAMIA 2000; 7: 401-412.
4 Brown PJB, Sönksen P, Price C, Young P. A Standard for Evaluating the Retrieval Performance of Clinical Terminologies. In: Lorenzi N (ed). Proceedings of the 1999 AMIA Fall Symposium. Philadelphia: Hanley & Belfus, 1999: 1031.
5 Gray J, Majeed A, Kerry S, Rowlands G. Identifying patients with ischaemic heart disease in general practice: cross sectional study of paper and computerised medical records. BMJ 2000; 321: 548-550. (2 September.)
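The description of Cohen's kappa in this response (observed agreement set against the agreement expected by chance) can be sketched for a 2×2 table as follows. This is a minimal illustration under stated assumptions: the function name and the example counts are hypothetical and are not taken from Hassey's data, and the formula is the standard unweighted kappa for two raters and two categories.

```python
def cohens_kappa(a, b, c, d):
    """Unweighted Cohen's kappa for a 2x2 agreement table.

    a = both positive, b = test positive / standard negative,
    c = test negative / standard positive, d = both negative.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only.
print(f"kappa={cohens_kappa(40, 10, 10, 40):.2f}")
```

Kappa is 1 when the two sources agree on every record and 0 when they agree no more often than the marginal totals alone would predict.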
Robert G Newcombe, Senior Lecturer in Medical Statistics University of Wales College of Medicine
Hassey, Gerrett & Wilson (1) indicate the need to validate electronic patient records in primary care. While findings are appropriately expressed in percentages as in this article, their EPR-Val toolkit yields incorrect confidence intervals. For the diabetes data, the calculated 95% confidence intervals are incorrect on two counts. Incorrect use of the table total as the denominator in calculating standard errors results in intervals which are too narrow, indeed grossly so for sensitivity and positive predictive value. Furthermore, the traditional method is inferior, especially for proportions near 100%. Our table shows their results, recalculated using the traditional, and the preferred Wilson method (2,3):
Statistic                   Estimate   95% confidence interval (%)
                            (%)        EPR-Val          Correctly calculated   Wilson method
                                                        traditional method
Sensitivity                 98.3       98.1 to 98.5     96.8 to 99.8           96.0 to 99.3
Specificity                 100        100.0 to 100.0   100.0 to 100.0         99.9 to 100
Positive Predictive Value   99.3       99.3 to 99.4     98.4 to 100.3          97.5 to 99.8
Negative Predictive Value   100        99.9 to 100      99.9 to 100            99.9 to 100
Even with large samples the traditional method can give impossible values exceeding 100%, as for the positive predictive value here. The preferable Wilson method is available in Confidence Interval Analysis software (4) and for Excel (5). We are disturbed by the dissemination of the inadequately tested EPR-Val software, which should be withdrawn immediately from the BMJ website. Potential users should check new software using data with known answers, as errors are quite common. (6)
Furthermore, some of the measures displayed are redundant while others, especially accuracy, are potentially misleading. The quoted accuracy of 99.9% conceals the fact that about 1 in 60 diagnosed diabetics is not coded as such on the database.
There is a danger in using terms such as sensitivity, specificity and predictive value, familiar from the clinical or screening context, in data validation. In the former situation implicitly the "gold standard" is whether the individual really has the disease. In the data validation context, these quantities measure how two parts of the record agree. Clearly some of the 13302 patients whose records do not indicate "diabetes" would have diagnosable disease, if sought using systematic diagnostic criteria. We are concerned lest clinicians and managers naively believe such figures indicate the practice has successfully identified all prevalent diabetics and is managing them proactively.
The study usefully showed that many diagnosed cases of asthma, iron deficiency anaemia, hypothyroidism and IHD are not adequately identifiable within present standards of record keeping. It is helpful to demonstrate such deficiencies, complete the audit cycle and correct them. But the converse is false: high sensitivity and specificity do not imply all is well. Certainly high "accuracy" does not. Even with improved consistency of record keeping for asthma etc., there could still be many practice patients with unidentified disease, just as for diabetes.
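The contrast between the traditional and Wilson intervals for a proportion can be sketched as follows. This is a minimal illustration, not the EPR-Val, CIA or Excel implementation; the function names are hypothetical, z = 1.96 is assumed for a 95% interval, and the counts are invented to give a proportion close to 100%, so the output is not meant to reproduce the table.

```python
import math


def traditional_interval(successes, n, z=1.96):
    """Traditional (Wald) interval: p +/- z * sqrt(p(1-p)/n)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a single proportion."""
    p = successes / n
    denominator = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denominator
    half_width = (
        z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denominator
    )
    return centre - half_width, centre + half_width


# Hypothetical counts: 145 agreements out of 146 records.
# n is the denominator of the proportion itself, not the table total.
for name, interval in (
    ("traditional", traditional_interval(145, 146)),
    ("Wilson", wilson_interval(145, 146)),
):
    lower, upper = interval
    print(f"{name}: {100 * lower:.1f}% to {100 * upper:.1f}%")
```

With these counts the traditional upper limit exceeds 100%, while the Wilson limits stay between 0% and 100%, which is the behaviour described above for proportions near 100%.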
1 Hassey A, Gerrett D, Wilson A. A survey of validity and utility of electronic patient records in a general practice. BMJ 2001; 322: 1401-5.
2 Wilson EB. Probable inference, the law of succession, and statistical inference. J Am Stat Assoc 1927; 22: 209-12.
3 Newcombe RG, Altman DG. Proportions and their differences. In: Altman DG, Machin D, Bryant TN, Gardner MJ. Statistics with Confidence. 2nd edition, 2000. BMJ Books, London: 45-56.
4 Bryant TN. Computer software for calculating confidence intervals (CIA). In: Altman DG, Machin D, Bryant TN, Gardner MJ. Statistics with Confidence. 2nd edition, 2000. BMJ Books, London: 208-13.
5 http://www.uwcm.ac.uk/epidemiology_statistics/research/statistics/newcombe.htm
6 Bland JM, Altman DG. Misleading statistics: errors in textbooks, software and manuals. Int J Epidemiol 1988; 17: 245-7.
Dr. Robert G. Newcombe
Prof. Douglas G. Altman
Dr. Trevor N. Bryant
Alan Hassey, GP Fisher Medical Centre, Skipton
We are grateful to Newcombe, Altman and Bryant for showing an error in our calculation of confidence intervals in the EPR-Val toolkit. Initial analysis of all our data was performed using the SPSS(TM) & StatsDirect(TM) commercial software packages. The EPR-Val toolkit was developed and tested against StatsDirect(TM) output for the practice data. We then added the confidence interval calculations and the TP:FN ratio & DBFind(10,000) derived statistics. These were not externally tested against a commercial package. We believe that all other statistics are correctly calculated in EPR-Val and we see no need for the package to be withdrawn, though we will correct the CI calculations and undertake to do this as soon as possible.
The authors (1) believe that there is no single best measure of validity for electronic patient records (EPRs). The EPR-Val toolkit provides a range of statistics calculated from the 2x2 contingency tables so that users may describe exactly what statistics they have used to establish EPR validity. We agree that some of the measures displayed may seem misleading, particularly "accuracy" (the proportion of all tests that are correct), because EPR validity has previously been calculated using sensitivity and PPV as measures of "completeness" and "accuracy" respectively. (2)(3)(4) We recommend that in future studies, those measuring EPR validity should say exactly what they mean by validity and state what measures they have calculated from their data. We have provided the EPR-Val toolkit to facilitate this process.
Finally, we make no claims that measures of EPR validity reflect the true prevalence of any diagnostic condition in the community. Nor do these results reflect the effectiveness of our clinical management for these conditions. Our survey was designed to measure ONLY the validity of the data we hold in the clinical records. The derived statistics TPFN ratio and DBFind10,000 are included to help health workers understand how many true cases of the test condition (e.g. diabetes) remain undiagnosed in the database and to help quantify the benefits of validating a clinical database for those conditions. Time will tell whether future workers will find these measures useful.
References
1. Hassey A, Gerrett D, Wilson A. A survey of validity and utility of electronic patient records in general practice. BMJ 2001;322:1401-5.
2. Whitelaw FG, Nevin SL, Milne RM, Taylor RJ, Taylor MW, Watt AH. Completeness and accuracy of morbidity and repeat prescribing records held on general practice computers in Scotland. Br J Gen Pract 1996;46(404):181-186.
3. Whitelaw F, Nevin S, Taylor R, Watt A. Morbidity and prescribing patterns for the middle-aged population of Scotland. Br J Gen Pract 1996;46:707-714.
4. Mant J, Mant F, Winner S. How good is routine information? Validation of coding for acute stroke in Oxford hospitals. Health Trends 1997/98;29(4):96-99.
Alan Hassey, GP Fisher Medical Centre, Skipton, David Gerrett
We agree with Bayliss-Brown that sensitivity and positive predictive value are useful measures of the validity of electronic patient record systems. Indeed we give several references in our paper (1) to other studies that have used these methods, and we believe that we have given due consideration to the existing literature.
We did consider the use of Cohen's (2) Kappa (K), originally proposed as a measure of agreement between two assessors allocating to nominal level categories. However, we were concerned with its reliance on symmetric marginal distributions and the major difficulty of interpreting the statistic. This was recognised by Cohen in the qualifying statistic Kmax, calculated by multiplying the marginal values of each column and row, and dividing by the total number of observations. The value is the maximum value that K could achieve in the given circumstances. Thus, 1 - Kmax is the proportion of possibilities excluding chance, which cannot be achieved as a consequence of differing marginals. Indeed, Collis interprets 1 - Kmax as indicating the extent to which judges are using different criteria to make their judgements. (3)
Our problem was that there are no acceptable standards for balancing and interpreting K, Kmax and 1 - Kmax. This is to say nothing of weighted Kappa (4), partial Kappa (2) or the proportion agreement. Further difficulties have been noted in the literature, which increased our reluctance to use the statistic. (5)(6) It might have been possible to use a multiple category, multiple assessor extension of Kappa as described by Light. (7) Unfortunately the authors know of no probability density distribution calculation for this statistic. Overall, we felt that use of Kappa would have led us into interpretation difficulties at a time when our goal was to provide a simple, easily usable tool.
1. Hassey A, Gerrett D, Wilson A. A survey of validity and utility of electronic patient records in general practice. BMJ 2001; 322:1401-5.
2. Cohen JA. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 1960;20(1):37-45.
3. Collis GM. Kappa, measures of marginal symmetry and intraclass correlations. Educational and Psychological Measurement 1985;45:55-61.
4. Fleiss J, Cohen J, Everitt BS. Large sample standard errors of Kappa and weighted Kappa. Psychological Bulletin 1969;72(5):323-7.
5. Maclure M, Willett WC. Misinterpretation and misuse of the Kappa statistic. American Journal of Epidemiology 1987;126(2):161-9.
6. Brennan RL, Prediger DJ. Coefficient Kappa and some uses, misuses and alternatives. Educational and Psychological Measurement 1981;41:687-99.
7. Light RJ. Measures of response agreement for qualitative data. Psychological Bulletin 1971;76(5):365-77.
Alan Hassey, GP Fisher Medical Centre, Skipton
Following my earlier response above to Newcombe et al, I am pleased to report that a new version of the EPR-Val calculator (EPR-Val2) has been tested and submitted to the BMJ for publication on their website in due course. EPR-Val2 now uses Wilson's method for the calculation of 95% confidence intervals for the sensitivity, specificity, PPV & NPV. Those users who prefer a professional/commercial statistical package should note that the latest version of StatsDirect(TM) now includes routines for the calculation of confidence intervals for proportions (e.g. sensitivity). Its method of calculation is different from Wilson's method (advocated by Newcombe et al) & returns slightly different confidence interval values from those of EPR-Val2. Both methods have a sound statistical basis.
Alan Hassey, GP Fisher Medical Centre, Skipton
We are grateful to Newcombe, Altman and Bryant (1) for showing an error in our calculation of confidence intervals in the EPR-Val toolkit (2). These errors were corrected on 19 July 2001, and we would recommend that users upgrade from EPR-Val to EPR-Val2. The upgrade is available from the BMJ website (3).
References
(1) http://bmj.com/cgi/eletters/322/7299/1401#EL5
(2) Alan Hassey, David Gerrett, and Ali Wilson. BMJ 2001;322:1401-1405.
(3) http://bmj.com/cgi/content/full/322/7299/1401/DC1
Alan Hassey, David Gerrett, Ali Wilson
Phil Hughes, GP Eastfield Surgery
Hassey et al. have produced a thought-provoking paper. I have attempted to repeat their work in a practice that has changed to EPR in the last few years. A number of questions have arisen.
I note that they have used Read coded entries over the previous five years (or one year for asthma and ischaemic heart disease). I wonder about the rationale behind this: was it a pragmatic decision based on when they had gone paperless, or is there another reason? I presume they are searching for a consultation or coded entry with the Read code rather than the date of onset or diagnosis, as these may have been many years before. This may apply particularly to breast cancer and hyperthyroidism, where a patient may have had the disorder diagnosed, for example, 10 years ago and now be on no treatment.
I am also interested in the date range they applied to the drug searches. Are these current, recent (i.e. in the last 6 months or so), or ever prescribed? Again this will affect the yields of the searches and subsequent data.
In addition I suspect that the 2x2 contingency tables are incorrect for breast cancer and prostate cancer. As these conditions are sex specific, should the population base not be female for breast cancer and male for prostate cancer? Otherwise the number of true negatives will be double that which it should be. (false true negative?!)
One final thought is that this paper is of great relevance to training practices. Normally EPRs are compared with paper records to assess the quality and accuracy of the summaries; however, increasingly the paper records are inaccurate. Will there come a time when practices are expected to present this data, along with their protocols for note summarisation and data capture, as part of the validation procedure?
Phil Hughes
Alan Hassey, GP researcher Fisher Medical Centre, Skipton
In response to Dr Phil Hughes: We chose 5 years of retrospective entries for Read codes because we had been paperless for 6 years at the time of the study. Thus the practice EPR system should have been valid for the whole of this time. We searched for Read coded entries for the test conditions during the study period and compared these with specific (drug) treatments, diagnostic tests and procedure codes to validate the entry for each test condition. We included conditions with a matching entry made during the study period. Conditions more than five years old would therefore be included only where there was an entry about that condition during the 5 year study period. The same criteria applied to the drug searches.
We think the contingency tables for breast and prostate cancer are correct. This is because entries about these 2 conditions could still be made in the wrong records (male or female), so validation must include all active records within the EPR system. We were also aware that 1% of breast cancer occurs in men.
As a training practice ourselves, we are also interested in the application of our study to training practice reapproval. It may well be that some form of validation or audit of "paperless" training practice EPR systems will become a necessary part of that process. However, at this stage, a review of processes rather than outcomes might seem a reasonable basis for training practice reapproval.
Susanne McCabe, retired home
The latest guidance on how electronic records should be used in GP surgeries has been published by the DoH and the General Practice Committee. This is not simply a voluntary activity but one about which all GPs are obliged to inform themselves and their service users. Those who have access to the net can find it on the NHSIA Stakeholder Bulletin site. Point 3, 'Information Governance', is particularly useful, but all of it is accessible. Even this minimal information in the surgery would be useful and would take a few minutes to download for those who have no access to computers. As for the tagging of children with their NHS number from birth, parents need to be aware that mistakes and slip-ups are already being identified and advice is being given to NHS staff to double check; parents may wish to treble check.
Competing interests: None declared