Opinion

When I use a word . . . . Too much healthcare—self-assessment

BMJ 2022; 378 doi: https://doi.org/10.1136/bmj.o2372 (Published 30 September 2022) Cite this as: BMJ 2022;378:o2372
Jeffrey K Aronson
Centre for Evidence Based Medicine, Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
Twitter @JKAronson

Underconfidence and overconfidence are the Scylla and Charybdis of medical practice. Too much of the former and you will hesitate and overinvestigate; too much of the latter and you will misdiagnose and overtreat. Either way the patient suffers. The Dunning–Kruger effect supposedly predicts how good people are at self-assessment—those with little knowledge or competence overestimate their ability, while those who are highly competent underestimate it. However, although the reality of the effect has been widely accepted, several criticisms have been raised, including the small numbers studied, the problem of statistical regression to the mean, and the problem of the random noise that accompanies the measurement of self-assessment. In reality, although qualified experts are better at self-assessment than novices are, there is no strong tendency for individuals of all grades of competence to be overconfident in self-assessment and very few people (about 5%) are actually “unskilled and unaware of it,” as Kruger and Dunning originally put it.

The confidence spectrum

Underconfidence and overconfidence are the Scylla and Charybdis of medical practice. Too much of the former and you will hesitate and overinvestigate; too much of the latter and you will misdiagnose and overtreat. Either way the patient suffers. But how can you navigate the narrow channel between the two extremes? How good are you at assessing your own competence?

Kruger and Dunning (1999)

In 1999 two psychologists, Justin Kruger and David Dunning, published the results of an experiment.1 What they found has become known, in the perverse way that these things sometimes happen, as the Dunning–Kruger effect.

Kruger and Dunning measured how students performed in assessments of their sense of humour (n=65), logical reasoning ability (two studies, n=65 and n=140), and English grammar knowledge (n=84), and then asked them how well they thought they had performed. They compared the subjective and objective assessments, dividing the students into quartiles according to their test scores: low, low-medium, high-medium, and high. When they compared the test scores in each quartile with the students’ estimates of their own performance, the perceived values were higher than the measured ones in the bottom three quartiles; the difference was greatest in the bottom quartile and became progressively smaller through the second and third quartiles. In the top quartile the perceived performance was lower than the measured one.
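
To make that comparison concrete, here is a minimal sketch in Python of the quartile calculation, using invented numbers rather than Kruger and Dunning's data; the class size and the assumed relation between actual and self-estimated percentiles are mine, chosen only to mimic the published pattern.

    # Minimal illustration (invented numbers, not Kruger and Dunning's data):
    # bin students into quartiles by actual percentile score and compare the
    # mean actual score with the mean self-estimated score in each quartile.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 80                                       # hypothetical class size
    actual = rng.uniform(0, 100, n)              # actual percentile on the test
    # Assumed self-estimates: weakly related to the actual score and anchored
    # high, so that they mimic the pattern described in the paper.
    perceived = np.clip(0.3 * actual + 45 + rng.normal(0, 15, n), 0, 100)

    order = np.argsort(actual)                   # sort students by actual score
    for i, idx in enumerate(np.array_split(order, 4), start=1):
        print(f"Quartile {i}: actual {actual[idx].mean():5.1f}, "
              f"perceived {perceived[idx].mean():5.1f}")

With these assumptions the bottom quartile's mean self-estimate sits far above its mean score and the top quartile's sits below it, which is the shape of the result they reported.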

Kruger and Dunning concluded that those who performed least well overestimated their competence and that the better they performed the less they overestimated up to the very best performers, who underestimated their actual competence. Here is the abstract that precedes their paper:

“People tend to hold overly favorable views of their abilities in many social and intellectual domains. The authors suggest that this overestimation occurs, in part, because people who are unskilled in these domains suffer a dual burden: Not only do these people reach erroneous conclusions and make unfortunate choices, but their incompetence robs them of the metacognitive ability to realize it. Across 4 studies, the authors found that participants scoring in the bottom quartile on tests of humour, grammar, and logic grossly overestimated their test performance and ability. Although their test scores put them in the 12th percentile, they estimated themselves to be in the 62nd. Several analyses linked this miscalibration to deficits in metacognitive skill, or the capacity to distinguish accuracy from error. Paradoxically, improving the skills of participants, and thus increasing their metacognitive competence, helped them recognize the limitations of their abilities.”

In other words, people who were poor at something lacked the mental ability to understand how poor they were and overestimated their competence, while those who, with time, had become more competent became better at assessing themselves.

In support of their conclusion, Kruger and Dunning quoted Charles Darwin, from The Descent of Man and Selection in Relation to Sex (1871, page 3): “... ignorance more frequently begets confidence than does knowledge.” However, they omitted both Darwin's preamble and follow-up to this. His full sentence reads “It has often and confidently been asserted, that man's origin can never be known: but ignorance more frequently begets confidence than does knowledge: it is those who know little, and not those who know much, who so positively assert that this or that problem will never be solved by science.” So Darwin was referring to beliefs about the acquisition of knowledge, not competence. Nevertheless, Darwin's statement has often been taken to be a statement of the Dunning–Kruger effect, avant la lettre, perhaps by people who are overconfident in their opinion about Darwin, not to mention their opinion about Kruger and Dunning.

Is the Dunning–Kruger effect real?

To date, Kruger and Dunning’s paper has been cited over 8300 times according to Google Scholar and over 3000 times according to Web of Science. In most cases their results have been accepted as demonstrating a real effect, and in many cases they have apparently been replicated using similar methods. In very few cases has any doubt been cast.

When I first became aware of the Dunning–Kruger effect I thought that it was probably real. After all, most, if not all, of us have encountered incompetent people who appear confident in their abilities, and highly competent people who underestimate theirs. However, when I plotted Kruger and Dunning’s data I started to have doubts, and when I looked into the literature I found that others had doubts too.

There are several problems. First, the sample studied was relatively small. Second, it is very hard to measure competence in the way that Kruger and Dunning did, and an individual may perform in widely different ways at different times. Individuals may also be primed with a prior expectation that they will do well at something. In some cases they will have been led to believe in their competence, perhaps by doting parents or by a few recent lucky successes; in that case, without feedback on their performance by comparison with others, they will tend to overestimate their competence. A modelling study supports this interpretation.2 Furthermore, when individuals assess their own performance they have no yardstick against which to measure themselves, and most of us tend to think that we are better than average at many different things. After all, although most of us have no idea what the qualities of a good driver are, don’t we all think we’re better than average drivers?3

The last of these factors is related to the Peter principle,4 named after Laurence J Peter, who first enunciated it: “In a hierarchy every employee tends to rise to his level of incompetence.” This seems obvious—if you are competent at some level of responsibility, you are likely to be promoted to the next level, then again, until you reach a level at which you are no longer competent, which is when the promotions stop. I do not know how exceptional it is for anyone to be able to estimate how competent they will be at the next level and to deny themselves promotion. In any case it’s hard to resist the attraction of higher status, increased income, and possibly greater power. Consider the members of recent, and perhaps not so recent, government cabinets.

Another important problem is regression to the mean, which explained the results of a study in which 60 students completed an easy and a difficult task and estimated their performances.5 In reply, Kruger and Dunning reanalysed their data and found no evidence of regression to the mean.6
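
Regression to the mean is easy to demonstrate by simulation. The sketch below is my own illustration of the general phenomenon, not the analysis in reference 5, and all of the numbers in it are assumed: students selected for poor scores on one noisy test score closer to the average on a second, independent test of the same ability, because part of their poor first score was simply bad luck.

    # Regression to the mean: an illustrative simulation with assumed values.
    import numpy as np

    rng = np.random.default_rng(1)
    true_ability = rng.normal(50, 10, 10_000)          # latent ability
    test1 = true_ability + rng.normal(0, 10, 10_000)   # first noisy measurement
    test2 = true_ability + rng.normal(0, 10, 10_000)   # second, independent measurement

    bottom = test1 < np.percentile(test1, 25)          # worst quartile on test 1
    # The same students score closer to the overall mean of 50 on the second test.
    print(f"Bottom quartile, test 1 mean: {test1[bottom].mean():.1f}")
    print(f"Same students, test 2 mean:   {test2[bottom].mean():.1f}")

If self-assessment behaves like the second, imperfect measurement, the worst performers will appear to overrate themselves and the best to underrate themselves even when no one is systematically miscalibrated.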

However, perhaps the best evidence against the view that the Dunning–Kruger effect is real is the fact that the pattern observed can be reproduced using small datasets contaminated by random noise and that modelling pure noise with random numbers can mimic the signal that arises from real data.7 8 If the ability to assess one’s own competence, which is hard to measure, is not itself random noise but is contaminated by random noise because of difficulty in measurement, that could explain the results of Kruger and Dunning.
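
Here is a sketch of that argument; it is my own illustration of the idea rather than the code used in references 7 and 8, and the sample size and distributions are assumptions. Actual and self-estimated percentiles are drawn completely at random and independently of each other, yet binning by actual score reproduces the familiar pattern.

    # Pure noise reproducing the Dunning–Kruger pattern: actual and perceived
    # percentiles are independent random numbers, yet binning by actual score
    # makes the worst quartile look overconfident and the best underconfident.
    import numpy as np

    rng = np.random.default_rng(2)
    n = 100
    actual = rng.uniform(0, 100, n)       # random "actual" percentile
    perceived = rng.uniform(0, 100, n)    # random self-estimate, unrelated to actual

    order = np.argsort(actual)
    for i, idx in enumerate(np.array_split(order, 4), start=1):
        gap = perceived[idx].mean() - actual[idx].mean()
        print(f"Quartile {i}: perceived minus actual = {gap:+5.1f}")

The gap between perceived and actual performance comes out large and positive in the bottom quartile and negative in the top quartile, purely as an artefact of sorting noisy data and averaging within bins.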

A final thought

The Dunning–Kruger effect has been widely accepted as real, but that view is controversial and the matter has not been settled. There is comforting evidence for clinical practice that although qualified experts are better at self-assessment than novices are, there is no strong tendency for individuals of all grades of competence to be overconfident in self-assessment and very few people (about 5%) are actually “unskilled and unaware of it,” as Kruger and Dunning originally put it.7

The problem with that observation, in a much broader context, is that it doesn’t take many such people to cause trouble. Bertrand Russell is often quoted as having written, in “The Triumph of Stupidity,” a 1933 essay about the rise of the Nazis in Germany, that “The fundamental cause of the trouble is that in the modern world the stupid are cocksure while the intelligent are full of doubt. Even those of the intelligent who believe that they have a nostrum are too individualistic to combine with other intelligent men from whom they differ on minor points.”9 The Dunning–Kruger effect doesn’t relate to stupidity, but substitute “ignorance” and the point is made.

Dunning and Kruger themselves anticipated some of the more recent analyses of their hypothesis in their concluding remarks: “Although we feel we have done a competent job in making a strong case for this analysis, studying it empirically, and drawing out relevant implications, our thesis leaves us with one haunting worry that we cannot vanquish. That worry is that this article may contain faulty logic, methodological errors, or poor communication. Let us assure our readers that to the extent this article is imperfect, it is not a sin we have committed knowingly.”

Footnotes

  • Competing interests: none declared.

  • Provenance and peer review: not commissioned; not externally peer reviewed.

References