Rapid Responses to:
Rapid Responses published:
Graham B Byrnes, Senior research fellow Centre for genetic epidemiology, University of Melbourne, 3081 Australia
Just how confusing is illustrated by two errors, one of them in the secondary title and quoted in the editorial. The author opens with the rhetorical question: "Can you explain why a test with 95% sensitivity might identify only 1% of affected people in the general population?" Actually I can't, but neither can he. What he actually shows is that for a rare condition like lupus, only 1% of those predicted by the test to have the condition will in truth have it. In other words, the test severely overestimates the number with lupus. Suppose 1 million people were tested, among whom 330 had lupus. The test would be expected to correctly identify 94% of the 330, or 310. It would also, unfortunately, incorrectly suggest that a further 30,000 or so have lupus, because its specificity is only 97%. In other words its predictive value is poor (low positive predictive value) although it doesn't miss many true cases (high sensitivity). In the conclusion the author has another go: "our left brain has difficulty grasping how a test can be 94% sensitive and yet be correct only 1% of the time." That's not right either. In the example above, it will give about 30,000 false positives and 20 false negatives. Let's say a total of 30,000 mistakes. It gets the other 970,000 right. To me that sounds like it's right 97% of the time. Yours sincerely,
Competing interests: None declared
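The arithmetic in the letter above can be checked with a short sketch. The inputs (1,000,000 people tested, 330 affected, 94% sensitivity, 97% specificity) are the letter's own figures; the variable names are illustrative only.

```python
# Figures taken from the letter above; variable names are illustrative.
population = 1_000_000
affected = 330
sensitivity = 0.94
specificity = 0.97

unaffected = population - affected
true_pos = sensitivity * affected      # affected and test-positive
false_neg = affected - true_pos        # affected but missed by the test
true_neg = specificity * unaffected    # unaffected and test-negative
false_pos = unaffected - true_neg      # unaffected but test-positive

# Share of positive results that are real cases.
ppv = true_pos / (true_pos + false_pos)
# Share of all results (positive or negative) that are correct.
accuracy = (true_pos + true_neg) / population

print(f"true positives  ~ {true_pos:.0f}")
print(f"false positives ~ {false_pos:.0f}")
print(f"false negatives ~ {false_neg:.0f}")
print(f"PPV      ~ {ppv:.1%}")
print(f"accuracy ~ {accuracy:.1%}")
```

The two percentages answer different questions: the PPV describes only the positive results, while the accuracy counts every result, most of which are true negatives.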
Nita G Forouhi, SpR Public Health Brent PCT, Wembley Centre for Health & Care, 116 Chaplin Road, Wembley, Middlesex, HA0 4UZ
Dear Sir, Thanks to Dr Tze-Wey Loong for sharing his clear and concise explanation of sensitivity, specificity and predictive value of diagnostic tests.(1) This is the best "tutorial" on this subject I have ever come across! I just wanted to point out, though, that in the summary table (last table of the clinical review), PPV should be summarised as no. of true positives divided by no. of positive results, not no. of people with the disease. If that is correct, I truly have started using the right side of my brain today, as suggested in this excellent review! Sincerely, Nita G. Forouhi Reference: 1. Tze-Wey Loong. Clinical Review: Understanding sensitivity and specificity with the right side of the brain. BMJ 2003;327:716-9 Competing interests: None declared
Matthew McKenna, Medical Officer 30333
This article confuses more than it clarifies. If the test has a sensitivity of 95%, that means it identifies 95% of the people in the population who are affected. The question opening the article proposes an impossibility. What it should say, based on what is in the rest of the article, is: "If a test has a sensitivity of 95%, how can only 1% of the people who are identified as positive actually be affected?" Competing interests: None declared
GH Hall, Retired physician EX1 2HW
Colour does not help the colour blind. The strain of detecting which was what distracted any of my brain resources left from understanding. More generally, are the brain mechanisms for geometry and algebra (+ arithmetic) on opposite sides? And can one choose which one to use? Any experienced Bayes watcher can rely on almost every exposition of the mysteries to contain a mistake, as here too, alas. I think the best model for comprehending by play is to construct three coloured(!) transparent circles signifying disease, symptom 1, and symptom 2, and "Do a Venn." Then one can see directly how changing the overlaps affects the parameters. Competing interests: None declared
Antony R Goldstone, Honorary Clinical Fellow & MRC Research Clinician MRC Cancer Cell Unit, Hutchison/MRC Research Centre, Hills Road, Cambridge CB2 2XZ, Susan Charman [Senior Scientist, MRC Biostatistics Unit, Cambridge]
Dear Editor, I was intrigued by your challenge – “my guess is that not one BMJ reader in a thousand could answer that question, but the numbers are in many ways the easy bit”. I eagerly scrambled to read the article, but I am not sure the question has the correct answer. In Loong’s example of screening for SLE, he correctly concludes that the positive predictive value is only 1% in this low incidence population (31 true positives / (31 true positives + 3067 false positives) = 1%). However, I suspect the author may have confused sensitivity and positive predictive value in the wording of his opening statement: "Can you explain why a test with 95% sensitivity might identify only 1% of affected people in the general population?" The latter, identification of affected individuals, is a matter of sensitivity and not of positive predictive value. In Loong’s illustrated example of 100,000 screened, 33 had the disease, and of these 31 were picked up by the test. Thus the test detected 31 of the 33 affected (94%) in the general population. The 1% figure relates to the fact that only 1% of those who tested positive will turn out to have the disease. Thus the opening conundrum should properly read: "Can you explain why, in a test with 95% sensitivity, only 1% of those that test positive may turn out to have the disease?" Yours sincerely, Dr. A. R Goldstone, Honorary Clinical Fellow, MRC Cancer Cell Unit & Department of Surgery, dr@tonygoldstone.com; Dr. L Sharples, Senior Scientist, MRC Biostatistics Unit, Linda.Sharples@mrc-bsu.cam.ac.uk Competing interests: None declared
Angus Dobbie, Consultant Clinical Geneticist St James's Hospital Leeds LS87TF UK
"Can you explain why a test with 95% sensitivity might identify only 1% of affected people in the general population?" asks the introduction to Loong's article. No I cannot, because a test with 95% sensitivity will pick up 95% of those affected. Before we email Dr Loong, I wonder if any editor on the BMJ would like to come clean, since this claim is made in the intoduction under the title. Does someone on the staff need to read the article they are introducing? Competing interests: None declared |
Susanne Haga, Project Director, Human Genetics The Center for the Advancement of Genomics Rockville, Maryland 20850 USA
While Loong presents a wonderful illustration to understand test sensitivity, specificity, and predictive value, there is an error in the calculations of the 'real example' (Figure 12). For a condition with a given prevalence of 33/100,000 (.033%), test sensitivity of 94% and test specificity of 97%, the number of true negatives should be 96,968 (not 96,900) and the number of false positives should be 2,999 (not 3,067). The calculations for negative predictive value and positive predictive value are still correct given the extremely low prevalence. Competing interests: None declared |
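The corrected counts in the letter above can be reproduced with a minimal sketch. The inputs (100,000 screened, 33 with the disease, 94% sensitivity, 97% specificity) and the original figures (31 and 3,067) are quoted in this correspondence; the function name `expected_counts` and the rounding to whole people are assumptions made for illustration.

```python
def expected_counts(n, n_diseased, sensitivity, specificity):
    """Return rounded expected (TP, FN, TN, FP) when n people are tested."""
    n_healthy = n - n_diseased
    tp = round(n_diseased * sensitivity)  # diseased, test-positive
    fn = n_diseased - tp                  # diseased, test-negative
    tn = round(n_healthy * specificity)   # healthy, test-negative
    fp = n_healthy - tn                   # healthy, test-positive
    return tp, fn, tn, fp

tp, fn, tn, fp = expected_counts(100_000, 33, 0.94, 0.97)
ppv_corrected = tp / (tp + fp)
npv_corrected = tn / (tn + fn)

# PPV using the false-positive count printed in the original figure.
ppv_original = 31 / (31 + 3067)

print(tp, fn, tn, fp)
print(f"PPV corrected: {ppv_corrected:.2%}, PPV original: {ppv_original:.2%}")
print(f"NPV corrected: {npv_corrected:.3%}")
```

Both versions of the PPV round to 1%, which is consistent with the letter's remark that the predictive values survive the correction.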
Salvo Fedele, MD Paediatrician viale Galileo Galilei 99, 90145 Palermo (Italy)
Dear Sir, In your very useful article there is a little mistake (only of the left brain). In the last figure (summary points) you write: “PPV is N° of true positives divided by N° of people with disease”, but your right brain helps the readers to correct the mistake: PPV is N° of true positives divided by N° of positive results. Using both sides of the brain is really a good idea… I live in Italy and I know Mr Berlusconi and his contempt of the “left” brains! Thank you very much for your article. Yours sincerely, Salvo Fedele, MD Paediatrician – viale Galileo Galilei 99 90145 Palermo Italy – email: sfedele@tin.it Competing interests: Normally I hope to use both sides of the brain (right too).
Andrea Glassberg, clinical fellow, pulmonary and critical care medicine University of California at San Francisco
Bravo to Tze-Wey Loong for his (?her) use of graphics in explaining the concepts of specificity and sensitivity. But shame on him (?her) for using such imprecise language as a teaser to bring the reader in. The question asked by the author: "Can you explain why a test with 95% sensitivity might identify only 1% of affected people in the general population?" is misleading and inaccurately phrased. The true answer to this question should be "no" for everyone because it is statistically impossible for a test with 95% sensitivity to identify anything other than 95% of those affected if the test is properly administered. The question I believe the author meant to ask is "Can you explain why a test with 95% sensitivity might properly characterize as being affected only 1% of those who test positive in the general population?" This is a completely different question, the answer to which is provided in the paper. Competing interests: None declared |
Zeno Bisoffi, Head, Centre for Tropical Diseases S. Cuore Hospital, 37024 Negrar, Verona, Italy, Jef Van den Ende, Prince Leopold Institute of Tropical Medicine
Despite the obvious mistake in the sub-title, the visual representation of the main characteristics of a test is a noble purpose, though its didactic efficacy needs to be assessed. Nevertheless, the model lacks an important component, namely the visual representation of the likelihood ratio (LR). Yet positive and negative LRs are a very important concept for clinicians, giving an immediate idea of the strength of a test to confirm the suspicion of a disease if it is positive, and to refute it if it is negative. We have used for years a simple visual model based on four squares of different colours to represent the true positives (left upper square), false positives (right upper square), false negatives (left lower square) and true negatives (right lower square). Each square's area is proportional to the size of each group. When we want to explain the test properties (sensitivity, specificity and LR), regardless of the prevalence, the four square areas are proportional to the actual rates over equal populations of people with and without the disease. When we introduce the concept of predictive values, the areas are proportional to the actual size of each group in the real population, according to its prevalence. Students immediately grasp the main concepts this way.
Sincerely.
Competing interests: None declared
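As a rough numerical companion to the likelihood ratios mentioned in the letter above, the following sketch uses the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The letter gives no numbers; the values used here (94% sensitivity, 97% specificity, prevalence 33/100,000) are borrowed from the example discussed elsewhere in this correspondence, and the function names are illustrative.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from the standard definitions."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg

def post_test_probability(pre_test_probability, lr):
    """Convert probability to odds, apply the LR, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

# Values borrowed from the screening example in this correspondence.
lr_pos, lr_neg = likelihood_ratios(0.94, 0.97)
prevalence = 33 / 100_000
post_positive = post_test_probability(prevalence, lr_pos)
post_negative = post_test_probability(prevalence, lr_neg)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
print(f"probability of disease after a positive test: {post_positive:.2%}")
print(f"probability of disease after a negative test: {post_negative:.4%}")
```

With the prevalence used as the pre-test probability, the post-test probability after a positive result is the same quantity as the positive predictive value.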
James H MacCabe, Clinical research worker Institute of Psychiatry, London SE5 8AF
I would like to congratulate Dr Loong on his article, which I have no doubt will become a valued resource for teachers of epidemiology. The very clear text and diagrams alone would have made it a strong contender for the reading lists of most research methodology courses, but its status is guaranteed by something that will delight students for years to come: the chance to play "spot the howler". As noted by your previous correspondents, the sub-title invites the reader to explain the impossible, and the "summary points" box gives the formula for sensitivity instead of that for PPV. Competing interests: None declared
David JR Hutchon, Consultant Obstetrician Memorial Hospital, Darlington. DL3 6HX
Oh dear! The question should have been “Can you explain why a leading international journal, with access to expert referees, fails to identify serious statistical misinformation, which is then repeated by the editor?” Last week's BMJ also had statistical misinformation. (1,2) As the author points out in the text, “The test has correctly identified 24 out of the 30 people who have the disease. Therefore the sensitivity of this test is 24/30 = 80%”. Turning this statement round, a test with 80% sensitivity will correctly identify 80% of the people who have the disease. Perhaps the author was confused with positive predictive value. A test with 95% sensitivity can easily have a positive predictive value (PPV) of 1%, but the positive predictive value does not equal the percentage of people correctly identified. The PPV is the number of true positives divided by the total number of positives. Therefore the specificity of the test determines whether the PPV is high or low. The PPV of any test will always be lower in a population of low prevalence than in a population of high prevalence (unless the test is 100% specific). The visual concept is very good and perhaps this was a deliberate error to make people think more about what is really a fundamental concept of medicine. Let us hope so! 1. Recent developments in obstetrics. Andrew H Shennan. BMJ 2003; 327: 604-608. 2. Positive predictive value wrong. David JR Hutchon. bmj.com, 17 Sep 2003 Competing interests: None declared
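The dependence of PPV on prevalence described in the letter above can be sketched with Bayes' rule. The 95% sensitivity is the figure quoted in the letter; the 97% specificity and the list of prevalences are assumptions chosen for illustration, as is the function name `ppv`.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive predictive value via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Assumed prevalences, from a rare condition up to an even split.
prevalences = (0.00033, 0.01, 0.10, 0.50)
values = [ppv(p, sensitivity=0.95, specificity=0.97) for p in prevalences]

for p, v in zip(prevalences, values):
    print(f"prevalence {p:8.3%} -> PPV {v:6.1%}")

# A perfectly specific test produces no false positives at all,
# so its PPV does not fall with prevalence.
print(ppv(0.00033, sensitivity=0.95, specificity=1.0))
```

The same test moves from a PPV of about 1% to a PPV above 90% purely because the population changes, which is the letter's point about low and high prevalence.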
Robert W Baker, GP Principal and Trainer Swanage Health Centre, Dorset, BH19 3HE
Editor - Loong is to be congratulated on the excellent pictorial representation of specificity, sensitivity and predictive value. I too have struggled with these concepts since medical school, but when I showed the tables to my 13 year old son he, like me, found them perfectly clear. My attention was drawn to the article by the rather outlandish claim by the editor that only one in a thousand readers would be able to explain the question at the beginning of the article. Thanks to the clarity of the article itself it was obvious to my son and me that a test with 95% sensitivity will pick up 95% of those affected (not 1%!). The summary points box at the end is excellent, but "No of people with the disease" (bottom left) should read "No of positive results". Competing interests: None declared
Christopher J Martin, GP & Lead researcher Laindon Health Centre Primary Care Research Team, High Road, Laindon, Essex, SS15 5TR
EDITOR - This is an excellent article by Loong that describes the differences between sensitivity, specificity and positive and negative predictive values very well. However, I must confess that I am unable to explain why a test with a 95% sensitivity might only identify 1% of the affected people in a population as this is the definition of sensitivity. I can however explain why a positive test with a 95% sensitivity might only be correct 1% of the time. Competing interests: None declared |
Richard G Fiddan-Green, None None
If a test has a sensitivity of 100% and its specificity "clearly could not be worse", the test is not necessarily useless as these authors claim. On the contrary, having a 100% sensitivity could be the make or break of a test if, as in the case of monitoring the critically ill, the intention of the test is to distinguish those who might have cause for concern from those who do not. Thus if the gastric intramucosal pH, which has been reported to have a sensitivity approaching and in some papers reaching 100% in predicting the development of organ dysfunctions and death, is normal and more importantly remains normal, there is no cause for concern. Indeed a persistently normal gastric intramucosal pH may be good grounds for not admitting a patient to hospital or to an ICU, or for discharging him/her. The specificity and positive predictive value of the gastric intramucosal pH improve as the degree and duration of a gastric intramucosal acidosis increase and if the arterial lactate rises. These values aid in establishing the urgency of admission and in deciding what to do to avert organ dysfunctions and death. Some have claimed that gastric tonometry is too sensitive to be of any clinical value because it promotes unnecessary and potentially harmful interventions. On the contrary, the key to improving outcome in the critically ill is to increase the sensitivity of monitoring so that clinicians can intervene earlier and thereby improve the cost-effectiveness of their utilisation of resources. Competing interests: None declared
Tze-Wey Loong, Clinical teacher (part time) COFM National University of Singapore
Thank you for all the feedback so far. Three clear errors have so far been uncovered in the article – it was kind of some of you to suggest that they were intentional but alas they are not.
1. The statement “Can you explain why a test with 95% sensitivity might identify only 1% of affected people in the general population?” is clearly a boo-boo. It was first spotted by Dr. Byrnes (“It can really be confusing…”), but I especially want to thank Dr. Dobbie (“Don’t confuse us further!”), for not assuming that this teaser paragraph was written by me (it wasn’t).
2. The summary (last) figure is also wrong and this is my fault. As Dr. Forouhi was the first to point out, PPV should be “no. of true positives divided by no. of positive results”, and not “no. of true positives divided by no. of people with the disease”.
3. The correct figures for the actual numbers of true negatives and false positives were given by Dr. Haga (“Error in calculations in real example”). This error probably occurred through an over reliance on the right side of my brain. I must have obtained the erroneous numbers by counting the squares in figure 12 instead of letting my left-brain crank out the figures with a calculator, as I should have. Luckily the errors were not large enough to affect the calculation of NPV and PPV.
As Dr. Hall points out, the colour scheme used in the illustrations is unfortunate for those with red-green colour blindness. From the rapid responses and direct e-mail to me, I gather that those who are having trouble understanding sensitivity and specificity generally find the article helpful; on the other hand, those who already have an established paradigm (whether right brained or left brained) did not find it useful to have an alternative visual explanation pressed upon them. Perhaps the error in the teaser should be left in and this statement added: “If you can spot the error in the teaser, read no further, but if you can't, you might find the following helpful”.
Thanks again and I look forward to further comments. Competing interests: Umm - I'm the author?
Christopher J Martin, GP and lead researcher, Laindon Health Centre Primary Care Research Team High Road, Laindon, Essex, SS15 5TR
EDITOR - Loong's article admirably demonstrates how positive and negative predictive value can vary with different prevalence of disease (prior probability). It is often assumed that sensitivity and specificity are fixed; however, this is not the case. When I teach about the variations of sensitivity (SEN), specificity (SPEC) and positive (PPV) and negative predictive value (NPV), I use microcytosis (a low mean cell volume, MCV) and iron deficiency anaemia (IDA) as the example instead of systemic lupus erythematosus. It is easy to demonstrate the variation in the PPV and NPV between primary care surgeries and secondary care gynaecology clinics, where the prevalence of IDA is much higher. However, you can also demonstrate how the sensitivity of the test is lower in the antenatal clinic, where folate deficiency reduces the sensitivity of the low MCV for IDA because it tends to increase the MCV and thus mask the effect of IDA. Similarly, in a surgery in an area with a large Bangladeshi community, a low MCV will have a lower specificity as a result of the increased prevalence of beta-thalassaemia. Thalassaemia may cause a low MCV in the absence of iron deficiency, and will thus cause false positives when the low MCV is being used to diagnose iron deficiency. Competing interests: None declared
Pisut Katavetin, Internist Bangkok, Thailand
Loong (1) and most of us believe that the sensitivity and specificity of a test will not change whatever population we test. But I think they probably CHANGE. In general, any test has trouble detecting mildly affected people (false negatives, FN) and the most extreme normal people (false positives, FP). Sensitivity is roughly the ratio of the number of severely affected people to the number of all affected people, and specificity is roughly the ratio of the number of less extreme normal people to the number of all normal people. Our statement that the sensitivity of a test will not change in any population is based on the ASSUMPTION that the distribution of mildly and severely AFFECTED people is the same in every population. Likewise, the specificity of a test will not change IF the distribution of less extreme and most extreme NORMAL people is the same. If we use a test in a population that tends to have more severely affected people, which generally also has more extreme normal people, the test may have higher sensitivity and lower specificity. For example, suppose we measure the sensitivity and specificity of an abdominal circumference (AC) of more than 90 centimeters (cm) for detecting obesity, defined as body mass index (BMI) > 30. On LEAN ISLAND, where the mean and standard deviation (SD) of BMI are 22 and 4 respectively, most non-obese people have a BMI far below 30 and are unlikely to have AC > 90 (high specificity), while most obese people have a BMI only slightly above 30 and are likely to have AC < 90 (low sensitivity). On FAT ISLAND, where the mean and SD of BMI are 38 and 4 respectively, most non-obese people have a BMI only slightly below 30 and are likely to have AC > 90 (low specificity), while most obese people have a BMI far above 30 and are unlikely to have AC < 90 (high sensitivity). In short, I think that the sensitivity and specificity of a test may CHANGE between populations that differ in the severity of a trait. 1. Loong T-W. Understanding sensitivity and specificity with the right side of the brain. BMJ 2003;327:716-9 Competing interests: None declared
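The two-island argument above can be explored with a small simulation. The BMI means (22 and 38), the SD of 4, and the cut-offs (BMI > 30, AC > 90 cm) come from the letter. The relationship between BMI and abdominal circumference is not given anywhere in the source, so the linear link and noise level below are invented purely for illustration, as are the function name and the sample size.

```python
import random

def simulate_island(mean_bmi, sd_bmi=4.0, n=100_000, seed=0):
    """Return (sensitivity, specificity) of AC > 90 cm for BMI > 30."""
    rng = random.Random(seed)
    tp = fn = tn = fp = 0
    for _ in range(n):
        bmi = rng.gauss(mean_bmi, sd_bmi)
        # HYPOTHETICAL link: AC is 90 cm at BMI 30, rises 2.5 cm per
        # BMI unit, with Gaussian noise (SD 5 cm). Not from the source.
        ac = 90 + 2.5 * (bmi - 30) + rng.gauss(0, 5)
        obese = bmi > 30
        positive = ac > 90
        if obese and positive:
            tp += 1
        elif obese:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

sens_lean, spec_lean = simulate_island(mean_bmi=22)
sens_fat, spec_fat = simulate_island(mean_bmi=38)

print(f"Lean island: sensitivity {sens_lean:.2f}, specificity {spec_lean:.2f}")
print(f"Fat island:  sensitivity {sens_fat:.2f}, specificity {spec_fat:.2f}")
```

Under these assumptions the same cut-off is less sensitive and more specific on the lean island than on the fat island, which is the direction the letter predicts; the exact values depend entirely on the invented link between BMI and AC.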
Jennifer S Mindell, Honorary senior clinical lecturer , Department of Epidemiology and Public Health, Imperial College London, London W2 1PG
Others have already highlighted the BMJ Editor's repetition of Loong's main error in asking the wrong question. However, I believe there is an answer to the question as asked, to "explain why a test with 95% sensitivity might identify only 1% of affected people in the general population?" Surely the answer to this conundrum is that about 99% of those affected were not tested! Tze-Wey Loong's final point is disingenuous. To demonstrate a sensitivity of 99.999%, one would need to test 100,000 individuals with the disease to yield a single false negative result. With a disease prevalence of 0.033%, this would require a population of 303,030,303 to find 100,000 affected individuals in whom to test the test. Competing interests: None declared |
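The population figure in the letter above follows from two divisions, sketched here with exact fractions to avoid floating-point rounding. The sensitivity (99.999%) and prevalence (0.033%, i.e. 33 per 100,000) are the letter's figures; the variable names are illustrative.

```python
from fractions import Fraction

miss_rate = Fraction(1, 100_000)    # 1 - 99.999% sensitivity
prevalence = Fraction(33, 100_000)  # 0.033%

# Affected people needed to expect a single false negative.
affected_needed = 1 / miss_rate
# General population needed to contain that many affected people.
population_needed = affected_needed / prevalence

print(f"affected individuals needed: {int(affected_needed):,}")
print(f"population needed: about {round(population_needed):,}")
```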
Colin C Geddes, Consultant Nephrologist Western Infirmary, Dumbarton Road, Glasgow, UK G11 6NT
I enjoyed Tze-Wey Loong's article and illustrations using the right side of the brain to understand the concepts of sensitivity, specificity, positive predictive value (PPV) and negative predictive value. However, I believe the question at the top of the article used to grab the reader's attention ("Can you explain why a test with 95% sensitivity might identify only 1% of affected people in the general population?") was not answered in the article. The correct answer is that a test with 95% sensitivity will only identify 1% of affected people in the general population if its sensitivity is based on a selected population. The answer that Loong seems to give in the article demonstrates the left side of the brain struggling to understand what sensitivity and positive predictive value actually tell us about the test. As Loong points out, sensitivity refers to "how good a test is at correctly identifying people who have the disease". Therefore by definition a test with 95% sensitivity will identify 95% (not 1%) of affected people in the general population (assuming the test's sensitivity is based on the general population). Loong illustrates this (in figure 12) using antinuclear antibody (sensitivity 94%, specificity 97%) to screen for systemic lupus erythematosus (prevalence 33/100,000) and also shows that for the same test it is the PPV ("the chance that a positive result will be correct") that is only 1%. In the accompanying editorial Richard Smith suggests that "not one reader in a thousand" could answer the original question. From Loong's discussion of sensitivity and PPV I think the question should have been "Can you explain why a screening test with 95% sensitivity may identify a group of patients in whom only 1% have the target disease?" I would suggest that many more than 1 in 1000 readers would know that the answer is that the disease in question has a low prevalence. Competing interests: None declared
Thomas F. Heston, clinical practice Sandpoint, Idaho USA 83864
What an interesting article by Loong; all of the rapid responses so far have been excellent and have helped me better understand sensitivity and specificity. I especially enjoyed the comments about sensitivity and specificity changing with different patient populations. A major challenge physicians face is explaining sensitivity and specificity to patients. Drawing a mental picture by simplifying the statistics into simple fractions is especially useful. The pictures used by Loong in an attempt to activate right brain thinking are confusing to me. By having a diagram with 100 dots, with four different categories (dark circle in red, open circle in red, dark circle in green, open circle in green), the left brain is tempted to dominate the thinking process. I prefer to use the "big picture" as a method to activate the right brain. For example, a treadmill test has a sensitivity of about 65% and a specificity of about 75% (these are rough numbers used to demonstrate my point). When explaining the test to my patient, I tell them that 1 out of 3 people with heart disease will have a normal treadmill test. That is, the test will miss 1 out of every 3 people with heart disease. On the other hand, I also tell them that 1 out of 4 people without heart disease will have a false-positive treadmill test . By using simple fractions to describe sensitivity and specificity, the right brain is able to create a mental picture (e.g. it is easy to picture 3 out of 4, but difficult to picture 75 out of 100). Also, by using simple fractions, the left brain is activated (numbers are used) but does not dominate the thinking process (since a mental picture is easily generated). Ideally, this leads to whole brain thinking and a better grasp of the concept. When analyzing data for research, using percentages to describe sensitivity and specificity is required. But when explaining the concepts in person to patients or other doctors, using simple fractions is much easier to grasp. 
Competing interests: None declared |
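The "simple fractions" idea in the letter above can be mimicked with Python's `fractions` module, which can find the nearest fraction with a small denominator. The 65% sensitivity and 75% specificity are the rough figures quoted in the letter; capping the denominator at 4 is an assumption chosen so that the output matches the kind of fraction the letter describes.

```python
from fractions import Fraction

sensitivity = Fraction("0.65")
specificity = Fraction("0.75")

# Proportion of people with disease whom the test misses, and proportion
# of people without disease who test positive, each approximated by the
# closest fraction whose denominator is at most 4 (an assumed cap).
missed = (1 - sensitivity).limit_denominator(4)
false_alarm = (1 - specificity).limit_denominator(4)

print(f"misses about {missed.numerator} in {missed.denominator} people with heart disease")
print(f"flags about {false_alarm.numerator} in {false_alarm.denominator} people without heart disease")
```

The exact miss rate is 35%, so "1 in 3" is an approximation, in keeping with the letter's own caveat that the numbers are rough.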
Sam Lewis, General Practitioner Newport, Pembrokeshire, SA42 0TJ
Richard, you need to do some work on your right brain. You missed the point entirely: a 100% sensitive test with no specificity will result in ALL the population being admitted to intensive care... and you think that's cost-effective?? Try jogging, sailing, hang-gliding... but do get out more. Competing interests: None declared
Louis B. Jacques, Associate Professor CMS, Baltimore MD 21244
Students often confuse themselves with tortured mathematical manipulations. I have found that 2 simple rules eliminate the majority of their errors.
Given a standard 2x2 table, with disease state in columns and test result in rows, two patterns emerge. All four measures (sensitivity, specificity, positive predictive value and negative predictive value) are computed with 1 number (1 box of the 2x2 table) in the numerator and the sum of 2 numbers in the denominator. Since I teach in the US, I remind the students that we start reading from the top left of the page. (Clearly this analogy would not be appropriate in some countries.) So we begin our calculations from the top left box. Thus the possible options for calculating sensitivity are very limited. Take the top left box (TP) and divide it by the sum of the top left (TP) and bottom left (FN) boxes. For specificity, we do the opposite. The opposite of top left is bottom right. So take the bottom right box (TN) and divide it by the sum of the bottom right (TN) and top right (FP) boxes. Similarly, positive and negative predictive values follow the same "rules" but go across the table. Student feedback suggests that this has been helpful. Competing interests: None declared
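The table-reading rules above translate directly into four one-line formulas. This is a minimal sketch; the function name is illustrative, and the example counts (31, 2,999, 2 and 96,968) are the corrected figures quoted earlier in this correspondence rather than anything from this letter.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Compute the four measures from a 2x2 table laid out as

                    disease    no disease
        test +        TP          FP
        test -        FN          TN
    """
    return {
        # Down the columns: one box over the sum of its column.
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Across the rows: one box over the sum of its row.
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

measures = diagnostic_measures(tp=31, fp=2999, fn=2, tn=96968)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```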
Sidney B Rosalki, Consultant 27 Harley Street, Lond.W1G 9QP, Nil
I was dismayed at the question posed in the introduction to the article by Dr Loong (BMJ 327, p716, 2003), repeated by the Editor (Editor's Choice), and the "answer" thereto. By definition, a test with 95% sensitivity (sensitivity = positivity in disease) will identify 95% of "affected people" in the general (or other) population. The intended question (I assume) might reasonably have been "Why might only 1% of the general population showing test positivity with a test of 95% sensitivity have the disorder being tested?". It is only then that specificity (negativity in non-disease) and disease prevalence need be taken into account. Competing interests: None declared
Chris D Evans, Consultant Psychiatrist in Psychotherapy Rampton Hospital, Nottinghamshire Healthcare NHS Trust, DN22 0PD
Loong asks "Can you explain why a test with 95% sensitivity might identify only 1% of affected people in the general population? The visual approach in this article should make the reason clearer" (p.716) and Smith suggests (editorial) that "not one BMJ reader in a thousand could answer that question". Well, well, problems in the communication of risk eh?
I think a test with 95% sensitivity will identify 95% of those with the disorder if all the population with the disorder are screened with the test. That's the definition of sensitivity. It may well be the case that the specificity of the test, and the true prevalence are such that the PPV of the test will be only 1% (i.e. 99% of those testing positive are actually healthy) but that's not what you asked and I think 99.9% of readers of the BMJ may be too puzzled to quibble. Of course, if most of the general population who have the disorder (TB, scabies, deeply ingrained dirt everywhere, many psychological disorders) are in the homeless population and the test is only applied to those who are easy to screen, it is easy for a sensitive test to miss most of those with the problem ... but that's a different problem isn't it? Chris Evans Competing interests: None declared |
Nicola Petrucci, Consultant in Anaesthesia Azienda Ospedaliera Desenzano (Italy)
Dear Sir, The article by Tze-Wey Loong (1), trying to explain what sensitivity and specificity mean, failed to address the actual usefulness of these tests in clinical practice. Many clinicians think that tests with high sensitivity are useful for 'ruling in' diagnoses and tests with high specificity are useful for 'ruling out' diagnoses. In fact, when a test has a high sensitivity, a negative result effectively 'rules out' the diagnosis (it identifies the true negatives). Similarly, a very high specificity test effectively 'rules in' the diagnosis (it identifies the true positive)(2). 1. Tze-Wey Loong Understanding sensitivity and specificity with the right side of the brain BMJ 2003; 327: 716-719. 2. Sackett DL et al. Evidence-Based medicine. Churchill Livingstone, 2000 Competing interests: None declared |
Adrienne J J Garner, GP locum HP4 2PN
Thanks to the excellent article by Loong I thought I had finally grasped the intricacies of sensitivity/specificity as well as positive/negative predictive values. Unfortunately the final diagram (Summary points) confused me totally. Surely the bottom left box should be titled "no. of positive results", not "no. of people with the disease": there are lots of pink open circles. In the same issue, on page 723, I presume that what the author meant to say is that "Fibroids....halve the live birth rate in IVF cycles" and not half. Finally, whilst I'm in writing mode, in issue 7415 of Sept. 13th I read a POEM which told me that anti-inflammatories don't slow cognitive decline in Alzheimer's, followed two pages later by an editorial saying that a reduced risk of Alzheimer's in long term users of non-steroidals is supported by "more recent data". Which should I believe? Adrienne Garner
Competing interests: None declared
Giulio Nati, General Practitioner Rome 00197 - Italy
Understanding the concepts of sensitivity and specificity is a difficult task for everyone. So, in many cases, clinical tests are used without considering their statistical meaning. I often try to explain the topic using this example (slightly adapting numbers):
This is true when I apply a test to unselected people. Clinical practice is different from screening: in any case, after a negative test, the infection probability will rise by 5% to 50%. This may be relevant in many situations, e.g. in older people or for medico-legal purposes. As correctly pointed out by the author, I shall rarely perform an antinuclear antibody test in patients without suggestive symptoms. Pre-test probability in such patients is hard to evaluate, and the correct interpretation of the results of a test is only a part of the clinical proceeding. Competing interests: None declared
Pisut Katavetin, Internist Bangkok, Thailand
I think that when we select a test to 'rule in' or 'rule out' a condition in any population, we should be concerned with the positive predictive value (PPV) and negative predictive value (NPV) rather than with specificity and sensitivity, respectively. As pointed out by Loong (1), a test with high specificity (97%) may have a low PPV (1%) in a population with a very low prevalence of disease (33 in 100,000). Clearly, we cannot use this test to 'rule in' the disease in this population. Conversely, a test with high sensitivity may have a low NPV if it is used in a population with a very high prevalence. In the SAME population, however, tests with higher specificity are generally better at 'ruling in' the disease (fewer false positives [FP], higher PPV) than tests with lower specificity, and tests with higher sensitivity are generally better at 'ruling out' the disease (fewer false negatives [FN], higher NPV). What about the other side? Does sensitivity have any effect on PPV, and specificity on NPV? Yes, they REALLY have some effect. In the SAME population, tests with higher sensitivity will have a higher PPV if they have EQUAL specificity (more true positives [TP], same FP), and tests with higher specificity will have a higher NPV if they have EQUAL sensitivity (more true negatives [TN], same FN). So PPV and NPV depend on prevalence, sensitivity AND specificity, but the predominant role of specificity in PPV and of sensitivity in NPV should be emphasised. Tests with 100% specificity will have 100% PPV no matter how low the sensitivity of the test or the prevalence of the condition, and tests with 100% sensitivity will have 100% NPV no matter how low the specificity or how high the prevalence. This is the reason we were taught that a good test to 'rule in' is one with high specificity and a good test to 'rule out' is one with high sensitivity.
1. Loong T-W. Understanding sensitivity and specificity with the right side of the brain. BMJ 2003;327:716-9.
Competing interests: None declared
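The dependence of PPV and NPV on prevalence, sensitivity and specificity described in this letter can be checked numerically. The following is a minimal sketch and not part of the original letter: the function name `predictive_values` is illustrative, and the inputs (94% sensitivity, 97% specificity, prevalence of 33 per 100,000) are the ANA figures quoted elsewhere in this discussion. The alternative sensitivity and specificity values are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population.

    All arguments are proportions between 0 and 1.
    """
    tp = sensitivity * prevalence                # true positives
    fn = (1 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1 - prevalence)          # true negatives
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


prevalence = 33 / 100_000

# ANA figures quoted in the discussion.
ppv, npv = predictive_values(0.94, 0.97, prevalence)
print(f"PPV = {ppv:.1%}, NPV = {npv:.4%}")

# Same population and specificity, hypothetical lower sensitivity:
# fewer true positives with the same false positives, so PPV falls.
ppv_low_sens, _ = predictive_values(0.50, 0.97, prevalence)

# Same population and sensitivity, hypothetical higher specificity:
# fewer false positives, so PPV rises.
ppv_high_spec, _ = predictive_values(0.94, 0.999, prevalence)

# 100% specificity leaves no false positives, so PPV is 100%
# even with modest sensitivity and very low prevalence.
ppv_perfect_spec, _ = predictive_values(0.50, 1.0, prevalence)

print(f"{ppv_low_sens:.2%} < {ppv:.2%} < {ppv_high_spec:.2%} < {ppv_perfect_spec:.0%}")
```

Working in proportions rather than counts keeps the sketch independent of population size; the comparison line shows specificity moving PPV far more than sensitivity does at this prevalence.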
Sanjeev Sharma, Consultant Gynaecologist Southport PR8 6PN
The author of this article should be given a medal or, failing that, a big pat on the back for making this concept so simple. I have read about this subject every few months to make sure that I do not confuse sensitivity with specificity. I have also hoped that if I read about it often enough I will eventually understand it completely. However, the visual description has done for me what years of reading could not do. Competing interests: None declared
Edward M Sivills, GP VTS SHO Medicine Milton Keynes General Hospital
I believe a mistake may have been made when printing the summary points of this helpful clinical review. Positive predictive value = number of true positives divided by the number of positive results. The summary on p 719 shows the correct diagram with the wrong label, "no of people with the disease". Competing interests: None declared
Paul G McIntyre, consultant virologist Ninewells Hospital, Dundee DD1 9SY Scotland
The teaser paragraph, presumably added by a sub-editor with a poorer grasp of the concepts than the author, has done the article a grave disservice by confusing sensitivity and PPV. I did not find the diagrams helpful. For my money the best simple explanation of these concepts is given in the book Clinical Chemistry edited by Marshall. Read and understand. Note that: PPV correlates with specificity and with disease prevalence. NPV correlates with sensitivity and inversely with disease prevalence. Good try by the author. I have tried to teach these concepts to medical and science undergrads yearly for about 7 years using a problem solving approach, and it is TOUGH. Competing interests: None declared
Adam Jacobs, Director Dianthus Medical Limited, London SW19 3TZ
The schoolboy error in the subtitle of Loong's article (and the accompanying editor's choice) has been well described above. What I would like to know is how such an obvious error came to be printed in the first place. I think we are owed an explanation by the BMJ about the quality control procedures that are in place, and why they failed so spectacularly on this occasion. How many similar errors slip through that are less obvious, but still important? Another small point, but the use of colour in this article made it totally impossible to distinguish between positive and negative test results when printed in black and white. I don't believe I am the only person who downloads pdf files from the BMJ website and prints them on a monochrome printer. Or is this a subtle ploy to persuade us to subscribe to the paper BMJ? Apart from those two points, Loong deserves to be heartily congratulated for what was otherwise an excellent article, giving the best explanation of sensitivity and specificity I have seen for a long time. Competing interests: None declared
Romolo M Dorizzi, Laboratory physician Azienda Ospedaliera di Verona (Italy)
Why not test the right-side-of-the-brain approach advocated by Dr. Tze-Wey Loong in preparing a presentation about Evidence Based Laboratory Medicine, delivered today to a very heterogeneous group of 150 people (laboratory physicians, laboratory scientists, medical technologists, medical laboratory officers)? I added some of the images contained in Dr. Loong’s article to the slides prepared following the classical approach (1), and the attendees’ response was definitely favourable to them. I also agree with Bisoffi’s and Van den Ende’s appeal made on September 27 (Don’t forget likelihood ratio). It is really important that clinicians correctly visualize sensitivity and specificity; moreover, being able to calculate the likelihood ratio (LR), they could request diagnostic tests more appropriately. A clinician has to understand that, whatever the pre-test probability, he cannot be helped by a test with a positive LR (LR+) around one; on the contrary, a test with an LR+ higher than ten is helpful, as suggested by Jaeschke et al (2). Fagan’s nomogram, easily kept in the pocket of physicians and laboratorians, allows a rough but substantially accurate calculation of the post-test probability of the disease. I feel that laboratory work should be tightly linked to clinical work; the laboratory should provide the clinician not with “numbers” but with information. A better and shared understanding of sensitivity and specificity, LRs and Fagan’s nomogram may help to establish a “new covenant” between clinicians and laboratorians in the management of the patient. The laboratory could help in answering the question: “Is this patient ill?” (or, better: “How high is the probability of this patient being ill?”), and not limit itself to providing a number (3).
1) Sackett DL, Straus S, Richardson WS, Rosenberg W, Haynes RB, editors. Evidence-based Medicine. How to practice and teach EBM. London: Churchill Livingstone, 2000.
2) Jaeschke R, Guyatt GH, Sackett DL, for the Evidence-Based Medicine Working Group. Users' guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? JAMA 1994; 271: 703-7.
3) Dorizzi RM, Senna G, Caputo M. Likelihood ratios and Fagan’s nomogram: valuable but underrated tools for in vitro latex sensitisation assessment. Clin Chim Acta 1999; 282: 175–83.
Competing interests: None declared
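The arithmetic that Fagan's nomogram performs graphically can be sketched in a few lines. This is an illustrative example rather than anything from the letter: the function names are made up, the 94% sensitivity and 97% specificity are the ANA figures used elsewhere in this discussion, and the 30% "clinic" pre-test probability is a purely hypothetical value chosen for contrast.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity (proportions)."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


lr_pos, lr_neg = likelihood_ratios(0.94, 0.97)

# An LR+ of 1 leaves any pre-test probability unchanged.
unchanged = post_test_probability(0.30, 1.0)

# A positive result at a screening-level pre-test probability (33 per 100,000)
# and at a hypothetical clinic-level pre-test probability of 30%.
screening = post_test_probability(33 / 100_000, lr_pos)
clinic = post_test_probability(0.30, lr_pos)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
print(f"post-test probability: screening {screening:.1%}, clinic {clinic:.1%}")
```

The same LR+ produces very different post-test probabilities depending on where the patient starts, which is why the pre-test probability has to be estimated before the result is interpreted.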
Paul A Reeve, Specialist physician Waikato DHB
Editor, I could not resist the challenge posed by your assertion in "editor's choice" in the 27 September 2003 BMJ that not one BMJ reader in a thousand could answer the question posed by Tze-Wey Loong: "can you explain why a test with 95% sensitivity might identify only 1% of affected people in the general population?" You suggested the numbers were the easy bit and communication is the harder bit. You were right but also wrong. I could not work out the answer, but that is because the question is incorrectly worded. A test with 95% sensitivity will always identify 95% of the affected individuals. For the reasons Tze-Wey Loong outlined, in a very low prevalence population, if the test has less than 100% specificity, only a small proportion of individuals with positive tests will be correctly identified as having disease. The question should have been "can you explain why, when a test with 95% sensitivity is used in the general population, only 1% of individuals with a positive result will actually be affected by disease?" My congratulations to Tze-Wey Loong on an excellent article, which I will use in teaching the concepts of sensitivity, specificity and predictive values. The diagrams clearly illustrate what these terms mean. Unfortunately the language continues to confuse. Competing interests: None declared
EAA Addou, Graduate Student 22043
As a student studying Public Health, I read Dr. Loong's article "Understanding Specificity and Sensitivity.." with great interest. I had an assignment due on the very topic of sensitivity and specificity. I was disappointed to find Dr. Loong's explanation to be utterly confusing and incoherent. Instead of relying on diagrams that resemble a puzzle, a 2x2 table (as published by Leon Gordis in the textbook "Epidemiology", Chapter 4, page 70, Second Edition, W.B. Saunders Company, c. 2000) would have been sufficient to describe the concepts of sensitivity and specificity. Sometimes a short cut does not work as well as the old standby. Competing interests: None declared
Taqi F Hashmi, GP Locum London
Like all lazy readers I quickly skim read the article drawn by the tantalising bait of the question (95% and 1%) and then moved on to the quick responses, they usually quickly pick up on the main points and serve as a better read many a time. Working towards MRCGP I need a quick fire, works under pressure technique to remember what sensitivity and specificity are, the keys to understanding are 2: 1. It's all TRUE (believe me!)
Everyone usually succeeds in remembering "TRUE positives" and "TRUE negatives" have got something to do with sensitivity and specificity BUT WHICH WAY ROUND IS IT!!! Back to the "weird P" - P should stand for positive, but as it is "weird" it's the other way round! sPecificity = no of TRUE negatives
But the real question that should concern us: who was the weird one who decided to name these inter-related concepts with words which are so similar:
Amazing isn't it, they even have the same length! Competing interests: None declared
Arturo Knol, general practitioner Paterswoldse weg 9728 BB Groningen Netherlands
What is a bad test?
He writes that with a sensitivity of 100% the specificity of 0% could not be worse, and illustrates this with the example in figure 7. In fact 0%Sp100% can also be associated with a bad test.
Tze-Wey Loong1 gives an attractive graphic presentation of sensitivity (Se) and specificity (Sp). However, he restricts himself to just one limiting case of a bad test. An extension is possible with a more general representation of bad tests using the author's method.
Figure 1 Results of an investigation of a bad test. The disease prevalence is 20%.
In figure 1 a more general case is visualised. We follow the author's method of using a horizontal border separating the black and white dots. The borders between red and green are vertically aligned. Parallel displacement of the aligned borders gives all possible combinations of bad tests. The border between red and green can be shifted to the right, giving the situation pictured by Tze-Wey Loong in figure 7. We can show that for every Sp there exists a Se with the property of a bad test for the combination . We analyze the results of an investigation as shown in figure 1.
Literature
Tze-Wey Loong, clinical tutor National University of Singapore
I have enjoyed reading, and learnt a lot from, the responses to my article “Understanding sensitivity and specificity with the right side of the brain”. I will try to summarise the responses to date. But first, enough is enough: I DID NOT WRITE THE TEASER! For details on this error and two other errors (which are my fault), see the rapid response “Summary of responses so far” dated 29 Sept 2003. Having got that off my chest… Mr. Martin and Dr. Katavetin pointed out that the sensitivity and specificity of a test may not be fixed. Thanks to both for informing me of this and giving good examples. Several readers (Dr. Bisoffi, Prof. Jacques and Dr. Addou) expressed a preference for the good old fashioned 2x2 table with disease state in columns and test result in rows. If the area represented in such a diagram is proportional to the size of the subgroup concerned, this should work well in helping the student remember the formulas for calculating sensitivity, specificity, PPV and NPV. However, I suspect that many students who are taught this table will still grossly overestimate the PPV of a test in “real world” situations. My article was really an attempt to answer the question “How can a test be 94% ‘sensitive’ and yet be correct only 1% of the time?” This is, of course, a psychological question, not a mathematical one. Mathematically, a test with 94% sensitivity and 97% specificity, applied to a population with a disease prevalence of 0.033%, will have a positive predictive value of 1%. Psychologically, however, we have been conditioned to believe that “90%” is pretty good, as in 90% fat free, 90% off the regular price, 90% of consumers prefer brand X. The problem is, with a disease prevalence of 0.033%, 90% is not good enough. To overcome this psychological barrier, some readers suggested alternative measures of test “accuracy” such as likelihood ratio.
Others offered aides-mémoire such as “PPV correlates with specificity and disease prevalence while NPV correlates with sensitivity and inversely with disease prevalence”, or “A test with high specificity can be used to ‘rule in’ a disease while a test with high sensitivity is useful for ‘ruling out’ a disease”. Overall, I think the best solution is to:
1. Use natural frequencies instead of percentages. (See Dr. Heston’s rapid response “A different approach to right brain thinking” and Gigerenzer and Edwards’ article “Simple tools for understanding risks: from innumeracy to insight” in the same issue of the BMJ.) For instance, instead of “94% sensitivity” we could say “about 1 in 17 people with the disease will test negative”.
2. Ask ourselves: “How likely is this patient to have the disease?” before performing the test on him or her.
Using the example of the ANA test again: sensitivity of 94% means “about 1 in 17 people with SLE will have a negative ANA test”. Specificity of 97% means “about 1 in 33 well people will have a positive ANA test”. Now answer the question: “How likely is my patient to have SLE?” before doing the test. If the answer is “likely”, use the test to confirm your clinical diagnosis, as there is only a 1 in 17 chance that you will get a false negative. If the answer is “unlikely”, you may not want to use the test, because there is a 1 in 33 chance that the patient will test positive anyway. If the answer is “somewhere in between”, well, I suppose that’s where “clinical judgement” comes in! Looking back on the lively discussion so far, I think that what we have here is a classic example of the blindfolded men/elephant interface. We are really all talking about the same things, but each has brought his or her own experience/point of view/left or right brain preference to the conference. It is clear that some prefer a picture, some prefer a different kind of picture, and some prefer to follow chains of Bayesian logic.
I am glad that several people in different parts of the world (the UK, the US, Italy, Germany and Cuba) have found the article useful. I do know that others have found it confusing; my apologies to them. Thanks for all the feedback. Competing interests: None declared
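The natural-frequency restatement suggested above can be reproduced with whole-number counts. The sketch below is an illustration under stated assumptions, not the author's own calculation: the helper names `natural_frequencies` and `one_in` and the reference population of 100,000 are hypothetical choices, while the 94%, 97% and 33-per-100,000 figures are those given in the reply.

```python
def natural_frequencies(sensitivity, specificity, prevalence, population=100_000):
    """Express test performance as whole-number counts in a population."""
    diseased = round(prevalence * population)
    healthy = population - diseased
    true_pos = round(sensitivity * diseased)
    false_pos = round((1 - specificity) * healthy)
    return {
        "diseased": diseased,
        "healthy": healthy,
        "true_pos": true_pos,
        "false_neg": diseased - true_pos,
        "false_pos": false_pos,
        "true_neg": healthy - false_pos,
    }


def one_in(rate):
    """Phrase a proportion as the N in 'about 1 in N'."""
    return round(1 / rate)


counts = natural_frequencies(0.94, 0.97, 33 / 100_000)
positives = counts["true_pos"] + counts["false_pos"]

missed_one_in = one_in(1 - 0.94)        # people with SLE who test negative
false_alarm_one_in = one_in(1 - 0.97)   # well people who test positive

print(f"Of 100,000 people, {counts['diseased']} have SLE; "
      f"{counts['true_pos']} of them test positive.")
print(f"{counts['false_pos']} well people also test positive, so only "
      f"{counts['true_pos']} of {positives} positive results are true.")
print(f"About 1 in {missed_one_in} people with SLE test negative; "
      f"about 1 in {false_alarm_one_in} well people test positive.")
```

Rounding to whole people is deliberate: the point of natural frequencies is that "31 of 3,030" is easier to picture than a predictive value of 1%.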
Neil Watson, MA, MD, FRCS, Instructor Academy of Art College. San Francisco.
I was interested to read Loong's paper 'Understanding sensitivity and specificity with the right side of the brain' (BMJ 2003;327:716-9, 27 Sep). It is now many years since Roger Sperry was awarded the Nobel prize for his work on the lateralisation of brain function. The notion of 'right brainness' was popularised by Betty Edwards in her 1986 book 'Drawing on the Right side of the Brain.' (1) However, I believe that this term, a lay term, is used widely without a proper scientific basis, and is a simplistic hypothesis for an extraordinarily complex pattern of brain activity, combining eye movements, visual perception and imagination. Besides which, how does it apply to people who are left handed? I was therefore surprised to read in Loong's article that the right side of the brain is the visual side. It is not! There can be no doubt that much, perhaps even most, of our thinking is visual, and Howard Gardner's work on various forms of intelligence has helped with our understanding of these matters. (2) More recently, work by Zeki (3), Hugh Watson, my youngest son, (4) and others has furthered our understanding of brain function and eye movements as they apply to visual artists: people who have specific training in matters visual and, usually, visual memories which are better developed than those of non-artists. Loong is right to emphasise the benefits of displaying statistical information graphically, making it much easier to comprehend for most people. But I do not think that a scientific journal, such as the BMJ, should, by implication, condone the term 'right brainness', or derivations from it; it is not a term that can be used in the same way as, for example, left handedness. In the course of researching a new book on drawing, published this year (5), I had cause to read more than 200 books on drawing, and matters relating to its practice.
I believe that it is time to abandon the right brain 'notion' and accept the fact that a large body of scientific work now exists which clarifies many matters of vision and visual understanding. Indeed it is the 'front to back' connections between frontal and occipital cortices and, perhaps even more importantly, the connections to visual memory and the subconscious, as well as to imagination, which are far more important to visual understanding than the supposed left brain/right brain hypothesis. In the end many truly artistic decisions are based on intuition and personality. But then that's another story!
1. Edwards, Betty. Drawing on the Right side of the Brain. Simon and Schuster. New York. 1986.
2. Gardner, Howard. Intelligence Reframed. Multiple Intelligences. Basic Books. New York. 1999.
3. Zeki, Semir. Inner Vision. An exploration of Art and the Brain. Oxford University Press. Oxford. 1999.
4. Watson, Hugh. An investigation into artists' and non-artists' eye movements and memory performance using a portrait drawing task from computer presented stimuli. BSc. Thesis. University of Nottingham. 2000.
5. Watson, Neil. The Drawing Spirit; Developing the Art of your Drawing Hand. CD-ROM. Point Richmond, California. 2003.
Competing interests: None declared
Fiona J Day, SHO Community Child Health Royal Hospital for Sick Children,Edinburgh
Sir, I greatly enjoyed the article which I have just read and was positively (predictably?!) delighted to notice for myself the typo in the summary box. It helped me to realise how much I had learned from the 'connect-4 model' approach, despite the article's flaws. For all our negativity about the internet, I also really enjoyed later reading the rapid responses. They made for a fascinating extended learning opportunity and unique forum for discussion with colleagues around the world. Sincerely Dr Fiona Day Competing interests: None declared |
Kit Byatt, consultant physician Hereford County Hospital, Hereford, UK, HR4 7QN
For the "Rule in" "Rule out" question (as raised by Katavetin and others), I have found Sackett's "SpPin/SnNout" mnemonic invaluable [1].
If a sensitive (Sn) test is negative (N), rule the diagnosis OUT.
If a specific (Sp) test is positive (P), rule the diagnosis IN.
One can then easily SpPIN or SnNOUT the right answer...
Competing interests: I have invested money in Sackett's book
Ian P Rodd, SpR Paediatrics RHCH, Winchester, SO22 5DJ
I came to this discussion some months after giving up on the original article when it became clear to me that the initial question was impossible. The reason for revisiting it was reading the SpPIN or SnNOUT line somewhere else, and thinking that it was equally wrong. I decided to do some reading around, and here I am. Surely, as demonstrated by a variety of means, a specific test (97%) applied to a rare disease (33/100,000) does not rule it in (99% false positives if sensitivity is 95%)? Isn't that the whole point, or did I miss something? Competing interests: None declared
Ben D Ewald, PhD student and GP 2300
Sir, The problem of communicating mathematical concepts to an audience not used to dealing with them faces doctors, researchers and educators. In September 2003 your journal published Loong's diagrammatic method of teaching the concepts of sensitivity and specificity (1), which I have further developed as a research tool for use in interviews with clinicians. I have been conducting qualitative research into the diagnostic utility of a new blood test for B Natriuretic Peptide, to determine general practitioners’ perceptions of the level of test validity required before the new test would be useful in their practice. An A4 size diagram is laminated in plastic and can be drawn on during the interview to facilitate discussion of the concepts of sensitivity, specificity, and positive and negative predictive values. During interviews to date clinicians have rapidly accepted the diagram and used it to clearly demonstrate what they are saying, even when confusing the technical terms.
Fig 1 Diagram representing 100 patients with signs or symptoms of CHF, of whom 30 have left ventricular systolic dysfunction, marked to show sensitivity of 80% and specificity of 90%, positive predictive value of 77% and negative predictive value of 91%.
As can be seen in figure 1 the positives are placed along the left side rather than across the bottom, which gives the diagram the same orientation as standard epidemiologic 2x2 tables. The diagram should be prepared so as to present the local prevalence of the condition of interest. I believe this may be of interest to your readers for use in teaching or research. Ben Ewald
References
1. Loong T-W. Understanding sensitivity and specificity with the right side of the brain. BMJ 2003;327:716-9.
Competing interests: None declared
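The figures in the Fig 1 caption can be rebuilt as a 2x2 table of counts. This is a sketch for illustration only: the variable names are hypothetical and the counts are derived from the caption (100 patients, 30 with left ventricular systolic dysfunction, sensitivity 80%, specificity 90%), not taken from the diagram itself.

```python
# Counts implied by the Fig 1 caption.
with_disease = 30
without_disease = 100 - with_disease

true_pos = round(0.80 * with_disease)       # diseased, test positive
false_neg = with_disease - true_pos         # diseased, test negative
true_neg = round(0.90 * without_disease)    # not diseased, test negative
false_pos = without_disease - true_neg      # not diseased, test positive

# Rows are test results and columns are disease status, the usual
# orientation of an epidemiological 2x2 table.
table = [
    [true_pos, false_pos],   # test positive
    [false_neg, true_neg],   # test negative
]

ppv = true_pos / (true_pos + false_pos)
npv = true_neg / (true_neg + false_neg)

print("              disease  no disease")
print(f"test positive {table[0][0]:7d} {table[0][1]:11d}")
print(f"test negative {table[1][0]:7d} {table[1][1]:11d}")
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

The counts come out as 24, 7, 6 and 63, which give the 77% and 91% predictive values stated in the caption.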
Anthony M. Benis, Sc.D., M.D., ret. Mt. Sinai Medical Center, New York 10029
Sensitivity and specificity are ratios. SeNsitivity conveniently has an "N" and is a measure of false negatives. SPecificity has a "P" and is a measure of false positives. Nevertheless, if the prevalence of the condition is very low, even if the specificity ratio is very high, then most positives will be false positives. Repeat: If the prevalence of a condition is very low, then most positives will be false positives. Tell the students to memorize this statement and "never forget it for as long as you live". They won't. And neither will you. Competing interests: None declared |
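The statement to memorise can be made concrete by asking how low the prevalence must be before most positives are false. The sketch below is an illustration, not part of the letter: the letter gives no figures, so the 94% sensitivity and 97% specificity are borrowed from the ANA example discussed in other responses, and the function names are hypothetical. The break-even formula follows from setting true positives equal to false positives, sens × p = (1 − spec) × (1 − p).

```python
def ppv(sensitivity, specificity, prevalence):
    """Proportion of positive results that are true positives."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp)


def break_even_prevalence(sensitivity, specificity):
    """Prevalence at which exactly half of all positives are true positives."""
    false_pos_rate = 1 - specificity
    return false_pos_rate / (sensitivity + false_pos_rate)


sens, spec = 0.94, 0.97  # assumed values, borrowed from the ANA example
threshold = break_even_prevalence(sens, spec)

# Below the threshold most positives are false; above it most are true.
for prevalence in (0.00033, 0.001, 0.01, threshold, 0.10, 0.30):
    print(f"prevalence {prevalence:8.3%} -> PPV {ppv(sens, spec, prevalence):6.1%}")
```

For these assumed values the threshold is a prevalence of roughly 3%, so at 33 per 100,000 the great majority of positives are false.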
Roxsie A Degen, biomedical science student Tasmania, Australia 7310
Thank you for this concise explanation of sensitivity and specificity. Although your explanation did not help me directly, it gave me a different way to look at what sensitivity and specificity mean, allowing me to come to my own logical conclusion. I could not understand why something so simple was so hard to grasp, but it was because I was looking at it the wrong way. The graphical representation of sensitivity being only interested in those people who have the disease (I think this was figure 4) was the most helpful. Again, thank you: you have relieved a lot of my frustration! Competing interests: None declared