BMJ 1996;312:1654 (29 June)

Statistics notes

Measurement error

J Martin Bland, professor of medical statistics (a); Douglas G Altman, head (b)

(a) Department of Public Health Sciences, St George's Hospital Medical School, London SW17 0RE; (b) ICRF Medical Statistics Group, Centre for Statistics in Medicine, Institute of Health Sciences, PO Box 777, Oxford OX3 7LF

Correspondence to: Professor Bland.

Several measurements of the same quantity on the same subject will not in general be the same. This may be because of natural variation in the subject, variation in the measurement process, or both. For example, table 1 shows four measurements of lung function in each of 20 schoolchildren (taken from a larger study1). The first child shows typical variation, having peak expiratory flow rates of 190, 220, 200, and 200 l/min.


Table 1--Repeated peak expiratory flow rate (PEFR)
measurements for 20 schoolchildren
---------------------------------------------------
Child   PEFR (l/min)
No      1st   2nd   3rd  4th      Mean     SD
---------------------------------------------------
 1      190   220   200  200     202.50   12.58
 2      220   200   240  230     222.50   17.08
 3      260   260   240  280     260.00   16.33
 4      210   300   280  265     263.75   38.60
 5      270   265   280  270     271.25    6.29
 6      280   280   270  275     276.25    4.79
 7      260   280   280  300     280.00   16.33
 8      275   275   275  305     282.50   15.00
 9      280   290   300  290     290.00    8.16
10      320   290   300  290     300.00   14.14
11      300   300   310  300     302.50    5.00
12      270   250   330  370     305.00   55.08
13      320   330   330  330     327.50    5.00
14      335   320   335  375     341.25   23.58
15      350   320   340  365     343.75   18.87
16      360   320   350  345     343.75   17.02
17      330   340   380  390     360.00   29.44
18      335   385   360  370     362.50   21.02
19      400   420   425  420     416.25   11.09
20      430   460   480  470     460.00   21.60

Let us suppose that the child has a "true" average value over all possible measurements, which is what we really want to know when we make a measurement. Repeated measurements on the same subject will vary around the true value because of measurement error. The standard deviation of repeated measurements on the same subject enables us to measure the size of the measurement error. We shall assume that this standard deviation is the same for all subjects, as otherwise there would be no point in estimating it. The main exception is when the measurement error depends on the size of the measurement, usually with measurements becoming more variable as the magnitude of the measurement increases. We deal with this case in a subsequent statistics note. The common standard deviation of repeated measurements is known as the within-subject standard deviation, which we shall denote by σw.

To estimate the within-subject standard deviation, we need several subjects with at least two measurements for each. In addition to the data, table 1 also shows the mean and standard deviation of the four readings for each child. To get the common within-subject standard deviation we actually average the variances, the squares of the standard deviations. The mean within-subject variance is 460.52, so the estimated within-subject standard deviation is σw = √460.52 = 21.5 l/min. The calculation is easier using a program that performs one way analysis of variance2 (table 2). The value called the residual mean square is the within-subject variance. The analysis of variance method is the better approach in practice, as it deals automatically with the case of subjects having different numbers of observations.

We should check the assumption that the standard deviation is unrelated to the magnitude of the measurement. This can be done graphically, by plotting the individual subjects' standard deviations against their means (see fig 1). Any important relation should be fairly obvious, but we can check analytically by calculating a rank correlation coefficient. For the figure there does not appear to be a relation (Kendall's τ = 0.16, P = 0.3).


Table 2--One way analysis of variance for the data of table 1
----------------------------------------------------------------------------------------------------
                        Degrees of                                     Variance ratio   Probability
Source of variation      freedom      Sum of squares      Mean square        (F)            (P)
----------------------------------------------------------------------------------------------------
Children                    19         285318.44          15016.78           32.6          <0.0001
Residual                    60          27631.25            460.52
----------------------------------------------------------------------------------------------------
Total                       79         312949.69
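
The two routes to the within-subject variance described above can be sketched in Python. This is an illustrative calculation on the table 1 data, not the authors' own program; the names `pefr`, `within_var`, and `residual_mean_square` are chosen here for the example.

```python
from math import sqrt
from statistics import mean, variance

# PEFR readings (l/min) for the 20 children, copied from table 1
pefr = [
    [190, 220, 200, 200], [220, 200, 240, 230], [260, 260, 240, 280],
    [210, 300, 280, 265], [270, 265, 280, 270], [280, 280, 270, 275],
    [260, 280, 280, 300], [275, 275, 275, 305], [280, 290, 300, 290],
    [320, 290, 300, 290], [300, 300, 310, 300], [270, 250, 330, 370],
    [320, 330, 330, 330], [335, 320, 335, 375], [350, 320, 340, 365],
    [360, 320, 350, 345], [330, 340, 380, 390], [335, 385, 360, 370],
    [400, 420, 425, 420], [430, 460, 480, 470],
]

# Route 1: average the per-child sample variances. A plain average is
# only appropriate because every child has the same number of readings.
within_var = mean([float(variance(readings)) for readings in pefr])
s_w = sqrt(within_var)


def residual_mean_square(groups):
    """Residual (within-group) mean square of a one way ANOVA.

    Pools the squared deviations from each subject's own mean and divides
    by the pooled degrees of freedom, so unequal group sizes are handled.
    """
    ss = 0.0
    df = 0
    for g in groups:
        m = mean(g)
        ss += sum((x - m) ** 2 for x in g)
        df += len(g) - 1
    return ss / df, df


# Route 2: residual mean square, as in table 2
resid_ms, resid_df = residual_mean_square(pefr)

print(f"mean within-subject variance: {within_var:.2f}")   # 460.52
print(f"residual mean square: {resid_ms:.2f} on {resid_df} df")
print(f"within-subject SD: {s_w:.1f} l/min")                # 21.5
```

With four readings for every child the two routes agree exactly; with unequal numbers of readings only the pooled (analysis of variance) version weights the subjects appropriately.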



Fig 1--Individual subjects' standard deviations plotted against their means
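
The rank correlation check can be sketched as follows. This is a plain-Python version of Kendall's τ with the usual tie correction (tau-b), applied to the Mean and SD columns of table 1; whether the original analysis used exactly this tie handling is an assumption, and the P value is not computed here. The function name `kendall_tau_b` is chosen for the example.

```python
from itertools import combinations
from math import sqrt

# Per-child means and standard deviations (l/min), copied from table 1
means = [202.50, 222.50, 260.00, 263.75, 271.25, 276.25, 280.00, 282.50,
         290.00, 300.00, 302.50, 305.00, 327.50, 341.25, 343.75, 343.75,
         360.00, 362.50, 416.25, 460.00]
sds = [12.58, 17.08, 16.33, 38.60, 6.29, 4.79, 16.33, 15.00,
       8.16, 14.14, 5.00, 55.08, 5.00, 23.58, 18.87, 17.02,
       29.44, 21.02, 11.09, 21.60]


def kendall_tau_b(x, y):
    """Kendall's rank correlation with correction for ties (tau-b)."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1          # pair tied on x
        if dy == 0:
            ties_y += 1          # pair tied on y
        if dx * dy > 0:
            concordant += 1      # both variables order the pair the same way
        elif dx * dy < 0:
            discordant += 1      # the variables order the pair oppositely
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / sqrt(
        (n_pairs - ties_x) * (n_pairs - ties_y)
    )


tau = kendall_tau_b(means, sds)
print(f"Kendall's tau between SD and mean: {tau:.2f}")
```

A value close to zero, as here, is consistent with the standard deviation being unrelated to the size of the measurement.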

A common design is to take only two measurements per subject. In this case the method can be simplified because the variance of two observations is half the square of their difference. So, if the difference between the two observations for subject i is d_i, the within-subject standard deviation σw is given by $\sigma_w = \sqrt{\frac{1}{2n}\sum_{i=1}^{n} d_i^2}$, where n is the number of subjects. We can check for a relation between standard deviation and mean by plotting for each subject the absolute value of the difference--that is, ignoring any sign--against the mean.
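
A minimal Python sketch of the two-measurement shortcut follows. The pairs are the first two readings of children 1 to 5 in table 1, used purely for illustration and not as an analysis from the original study; the function name `within_subject_sd_from_pairs` is made up for the example.

```python
from math import sqrt
from statistics import mean, variance


def within_subject_sd_from_pairs(pairs):
    """Within-subject SD from two measurements per subject.

    Uses the identity that the sample variance of two observations is
    half the square of their difference, then averages over subjects.
    """
    n = len(pairs)
    return sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * n))


# Illustrative data only: 1st and 2nd PEFR readings (l/min), children 1-5
pairs = [(190, 220), (220, 200), (260, 260), (210, 300), (270, 265)]

shortcut = within_subject_sd_from_pairs(pairs)

# Cross-check against the general method: average the per-subject variances
general = sqrt(mean([float(variance(p)) for p in pairs]))

print(f"shortcut: {shortcut:.2f} l/min, general method: {general:.2f} l/min")
```

The two figures agree, which is the point of the shortcut: with pairs of readings no separate variance calculation is needed.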

The measurement error can be quoted as σw. The difference between a subject's measurement and the true value would be expected to be less than 1.96 σw for 95% of observations. Another useful way of presenting measurement error is sometimes called the repeatability, which is √2 x 1.96 σw or 2.77 σw. The difference between two measurements for the same subject is expected to be less than 2.77 σw for 95% of pairs of observations. For the data in table 1 the repeatability is 2.77 x 21.5 = 60 l/min. The large variability in peak expiratory flow rate is well known, so individual readings of peak expiratory flow are seldom used. The variable used for analysis in the study from which table 1 was taken was the mean of the last three readings.1
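
The repeatability arithmetic can be checked with a short Python sketch. The factor √2 arises because the difference between two independent measurements on the same subject has variance 2σw², assuming the errors are independent and roughly Normal. The function name `repeatability` is chosen for the example, and 21.5 l/min is the within-subject standard deviation quoted in the text.

```python
from math import sqrt


def repeatability(s_w, z=1.96):
    """Limit below which 95% of differences between two repeat readings fall.

    The difference of two independent readings has SD sqrt(2) * s_w,
    so the 95% limit is z * sqrt(2) * s_w.
    """
    return sqrt(2) * z * s_w


coefficient = sqrt(2) * 1.96      # approximately 2.77
s_w = 21.5                        # l/min, estimate from table 1
limit = repeatability(s_w)

print(f"coefficient: {coefficient:.2f}")
print(f"repeatability: {limit:.0f} l/min")   # 60
```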

Other ways of describing the repeatability of measurements will be considered in subsequent statistics notes.

  1. Bland JM, Holland WW, Elliott A. The development of respiratory symptoms in a cohort of Kent schoolchildren. Bull Physio-Path Resp 1974;10:699-716.
  2. Altman DG, Bland JM. Comparing several groups using analysis of variance. BMJ 1996;312:1472.


