Statistics Notes: Measurement error
BMJ 1996; 313 doi: http://dx.doi.org/10.1136/bmj.313.7059.744 (Published 21 September 1996) Cite this as: BMJ 1996;313:744
- a Department of Public Health Sciences, St George's Hospital Medical School, London SW17 0RE,
- b ICRF Medical Statistics Group, Centre for Statistics in Medicine, Institute of Health Sciences, PO Box 777, Oxford OX3 7LF
- Correspondence to: Professor Bland.
Several measurements of the same quantity on the same subject will not in general be the same. This may be because of natural variation in the subject, variation in the measurement process, or both. For example, table 1 shows four measurements of lung function in each of 20 schoolchildren (taken from a larger study¹). The first child shows typical variation, having peak expiratory flow rates of 190, 220, 200, and 200 l/min.
Let us suppose that the child has a “true” average value over all possible measurements, which is what we really want to know when we make a measurement. Repeated measurements on the same subject will vary around the true value because of measurement error. The standard deviation of repeated measurements on the same subject enables us to measure the size of the measurement error. We shall assume that this standard deviation is the same for all subjects, as otherwise there would be no point in estimating it. The main exception is when the measurement error depends on the size of the measurement, usually with measurements becoming more variable as the magnitude of the measurement increases. We deal with this case in a subsequent statistics note. The common standard deviation of repeated measurements is known as the within-subject standard deviation, which we shall denote by $s_w$.
To estimate the within-subject standard deviation, we need several subjects with at least two measurements for each. In addition to the data, table 1 also shows the mean and standard deviation of the four readings for each child. To get the common within-subject standard deviation we actually average the variances, the squares of the standard deviations. The mean within-subject variance is 460.52, so the estimated within-subject standard deviation is $s_w = \sqrt{460.52} = 21.5$ l/min. The calculation is easier using a program that performs one way analysis of variance² (table 2). The value called the residual mean square is the within-subject variance. The analysis of variance method is the better approach in practice, as it deals automatically with the case of subjects having different numbers of observations. We should check the assumption that the standard deviation is unrelated to the magnitude of the measurement. This can be done graphically, by plotting the individual subjects' standard deviations against their means (see fig 1). Any important relation should be fairly obvious, but we can check analytically by calculating a rank correlation coefficient. For the figure there does not appear to be a relation (Kendall's $\tau = 0.16$, $P = 0.3$).
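The two routes to the within-subject standard deviation described above can be sketched in Python. This is a minimal illustration, not the authors' program: the function names are invented for the example, and table 1 is not reproduced here, so only the first row of the data (190, 220, 200, 200 l/min, quoted in the text) is real. The other rows are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def within_subject_sd(subjects):
    """Within-subject SD as the square root of the one way ANOVA residual mean square.

    `subjects` is a list of lists, one inner list of repeated readings per
    subject. Each subject needs at least two readings; the numbers of
    readings may differ between subjects.
    """
    # Residual sum of squares: squared deviations from each subject's own mean.
    ss_within = sum(sum((x - mean(s)) ** 2 for x in s) for s in subjects)
    # Residual degrees of freedom: total readings minus number of subjects.
    df_within = sum(len(s) for s in subjects) - len(subjects)
    return sqrt(ss_within / df_within)


def mean_variance_sd(subjects):
    """Square root of the mean of the per-subject variances.

    Agrees with `within_subject_sd` only when every subject has the same
    number of readings.
    """
    return sqrt(mean([variance(s) for s in subjects]))


# Readings in l/min. Only the first row is quoted in the text; the other
# rows are hypothetical values made up for this illustration.
readings = [
    [190, 220, 200, 200],
    [220, 200, 240, 230],
    [260, 260, 240, 280],
]

print(within_subject_sd(readings))
print(mean_variance_sd(readings))
```

With equal numbers of readings per subject the two functions return the same value, because the residual mean square is then exactly the mean of the individual variances. With unequal numbers, the analysis of variance version weights each subject by its degrees of freedom, which is why it is the more general choice.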
A common design is to take only two measurements per subject. In this case the method can be simplified because the variance of two observations is half the square of their difference. So, if the difference between the two observations for subject $i$ is $d_i$, the within-subject standard deviation $s_w$ is given by $s_w^2 = \frac{1}{2n}\sum d_i^2$, where $n$ is the number of subjects. We can check for a relation between standard deviation and mean by plotting for each subject the absolute value of the difference (that is, ignoring any sign) against the mean.
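The simplification follows because two readings each lie half their difference from their mean, so their variance is 2 × (d/2)² / (2 − 1) = d²/2. Averaging this over n subjects gives the formula above. The sketch below applies it in Python; the function names and the paired readings are hypothetical and serve only to illustrate the arithmetic.

```python
from math import sqrt


def within_subject_sd_pairs(pairs):
    """Within-subject SD from two readings per subject: s_w^2 = sum(d_i^2) / (2n)."""
    n = len(pairs)
    sum_sq_diff = sum((a - b) ** 2 for a, b in pairs)
    return sqrt(sum_sq_diff / (2 * n))


def abs_diff_and_mean(pairs):
    """(mean, absolute difference) per subject, for plotting the check described above."""
    return [((a + b) / 2, abs(a - b)) for a, b in pairs]


# Hypothetical pairs of readings in l/min, for illustration only.
pairs = [(190, 220), (220, 200), (260, 260), (210, 200)]

print(within_subject_sd_pairs(pairs))
print(abs_diff_and_mean(pairs))
```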
The measurement error can be quoted as $s_w$. The difference between a subject's measurement and the true value would be expected to be less than $1.96\,s_w$ for 95% of observations. Another useful way of presenting measurement error is sometimes called the repeatability, which is $\sqrt{2} \times 1.96\,s_w$ or $2.77\,s_w$. The difference between two measurements for the same subject is expected to be less than $2.77\,s_w$ for 95% of pairs of observations. For the data in table 1 the repeatability is $2.77 \times 21.5 = 60$ l/min. The large variability in peak expiratory flow rate is well known, so individual readings of peak expiratory flow are seldom used. The variable used for analysis in the study from which table 1 was taken was the mean of the last three readings.¹
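The factor √2 arises because the difference between two independent measurements on the same subject has variance 2 × s_w², and hence standard deviation √2 × s_w. A short Python check of the arithmetic, using the value of 21.5 l/min estimated in the text (the function name is an assumption made for this example):

```python
from math import sqrt


def repeatability(s_w, z=1.96):
    """Repeatability: sqrt(2) * z * s_w, where z = 1.96 gives the 95% limit."""
    return sqrt(2) * z * s_w


s_w = 21.5  # within-subject SD in l/min, as estimated in the text
print(round(sqrt(2) * 1.96, 2))    # multiplier, 2.77
print(round(repeatability(s_w)))   # repeatability in l/min, 60
```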
Other ways of describing the repeatability of measurements will be considered in subsequent statistics notes.