a Department of Public Health Sciences, St George's Hospital Medical School, London SW17 0RE, b ICRF Medical Statistics Group, Centre for Statistics in Medicine, Institute of Health Sciences, PO Box 777, Oxford OX3 7LF
Correspondence to: Professor Bland.
Measurement error is the variation between measurements of the same quantity on the same individual.1 To quantify measurement error we need repeated measurements on several subjects. We have discussed the within-subject standard deviation as an index of measurement error,1 which we like as it has a simple clinical interpretation. Here we consider the use of correlation coefficients to quantify measurement error.
A common design for the investigation of measurement error is to take pairs of measurements on a group of subjects, as in table 1. When we have pairs of observations it is natural to plot one measurement against the other. The resulting scatter diagram (see figure 1) may tempt us to calculate a correlation coefficient between the first and second measurement. There are difficulties in interpreting this correlation coefficient. In general, the correlation between repeated measurements will depend on the variability between subjects. Samples containing subjects who differ greatly will produce larger correlation coefficients than will samples containing similar subjects. For example, suppose we split this group in whom we have measured forced expiratory volume in one second (FEV
Table 1--Pairs of measurements of FEV
The correlation coefficient between repeated measurements is often called the reliability of the measurement method. It is widely used in the validation of psychological measures such as scales of anxiety and depression, where it is known as the test-retest reliability. In such studies it is quoted for different populations (university students, psychiatric outpatients, etc) because the correlation coefficient differs between them as a result of differing ranges of the quantity being measured. The user has to select the correlation from the study population most like the user's own.
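The dependence of the correlation on the spread between subjects can be illustrated with a small simulation. This is only a sketch under stated assumptions: each measurement is taken to be a subject's true value plus independent normal error, and the standard deviations, sample size, and function names are hypothetical choices, not values from the study. Under this model the expected correlation is roughly the between-subject variance divided by the sum of the between-subject and error variances.

```python
import random


def pearson_r(x, y):
    """Ordinary product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_test_retest(between_sd, error_sd, n_subjects, seed):
    """Simulate two measurements per subject (true value + independent error)
    and return the correlation between first and second measurements."""
    rng = random.Random(seed)
    true_values = [rng.gauss(0.0, between_sd) for _ in range(n_subjects)]
    first = [t + rng.gauss(0.0, error_sd) for t in true_values]
    second = [t + rng.gauss(0.0, error_sd) for t in true_values]
    return pearson_r(first, second)


# Identical measurement error (hypothetical SD 0.1) in both samples;
# only the variability between subjects differs.
r_wide = simulate_test_retest(between_sd=0.5, error_sd=0.1, n_subjects=2000, seed=1)
r_narrow = simulate_test_retest(between_sd=0.1, error_sd=0.1, n_subjects=2000, seed=1)
print(round(r_wide, 2), round(r_narrow, 2))
```

The measurement method is equally precise in both simulated samples, yet the sample with the wider range of true values gives the larger correlation.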
Another problem with the use of the correlation coefficient between the first and second measurements is that there is no reason to suppose that their order is important. If the order were important the measurements would not be repeated observations of the same thing. We could reverse the order of any of the pairs and get a slightly different value of the correlation coefficient between repeated measurements. For example, reversing the order of the even numbered subjects in table 1 gives r = 0.80 instead of r = 0.77. The intra-class correlation coefficient avoids this problem. It estimates the average correlation among all possible orderings of pairs. It also extends easily to the case of more than two observations per subject, where it estimates the average correlation between all possible pairs of observations.
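The effect of reversing pairs can be sketched as follows. The data are hypothetical (they are not the values in table 1), and the "double entry" construction, in which every pair is entered in both orders before correlating, is used here as one simple way to obtain a coefficient that does not depend on the order within pairs; it is an illustration, not necessarily the computation the authors had in mind.

```python
def pearson_r(x, y):
    """Ordinary product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def double_entry_r(pairs):
    """Correlation after entering each pair twice, as (a, b) and as (b, a)."""
    x = [a for a, b in pairs] + [b for a, b in pairs]
    y = [b for a, b in pairs] + [a for a, b in pairs]
    return pearson_r(x, y)


# Hypothetical repeated measurements on eight subjects.
pairs = [(1.20, 1.24), (1.31, 1.22), (1.45, 1.51), (1.50, 1.38),
         (1.62, 1.70), (1.75, 1.66), (1.80, 1.93), (1.96, 1.85)]

# Reverse the order within every second pair.
swapped = [(b, a) if i % 2 else (a, b) for i, (a, b) in enumerate(pairs)]

r_original = pearson_r(*zip(*pairs))
r_swapped = pearson_r(*zip(*swapped))
print("ordinary r:", round(r_original, 3), "after swapping:", round(r_swapped, 3))
print("double entry:", round(double_entry_r(pairs), 3),
      "after swapping:", round(double_entry_r(swapped), 3))
```

The ordinary coefficient changes when pairs are reversed, whereas the double entry coefficient is the same for both arrangements because it sees every pair in both orders.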
Few computer programs will calculate the intra-class correlation coefficient directly, but when the number of observations is the same for each subject it can be found from a one way analysis of variance table2 such as table 2. We need the total sum of squares, $SS_T$, and the sum of squares between subjects, $SS_B$.
Then
$$r_I = \frac{m\,SS_B - SS_T}{(m - 1)\,SS_T}$$
where m is the number of observations per subject. For table 2, m = 2 and
$$r_I = \frac{2 \times 1.52981 - 1.74651}{(2 - 1) \times 1.74651} = 0.75$$
Table 2--One way analysis of variance for the data in table 1
---------------------------------------------------------------------------------
Source of      Degrees of    Sum of      Mean       Variance     Probability
variation      freedom       squares     square     ratio (F)    (P)
---------------------------------------------------------------------------------
Children       19            1.52981     0.08052    7.4          <0.0001
Residual       20            0.21670     0.01086
---------------------------------------------------------------------------------
Total          39            1.74651
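As a check on the arithmetic, the intra-class correlation can be computed from the sums of squares in table 2. The sketch below assumes the sums-of-squares form r_I = (m SS_B - SS_T) / ((m - 1) SS_T), with SS_B taken to be the between-subjects ("Children") sum of squares and SS_T the total; the function names are hypothetical. A second helper shows how the two sums of squares could be obtained from raw data when every subject has the same number of observations.

```python
def intraclass_r_from_ss(ss_between, ss_total, m):
    """Intra-class correlation from between-subjects and total sums of squares,
    for m observations per subject (assumed formula, see the lead-in)."""
    return (m * ss_between - ss_total) / ((m - 1) * ss_total)


def one_way_sums_of_squares(groups):
    """Return (ss_between, ss_total) for a list of per-subject lists,
    each holding the same number of observations."""
    m = len(groups[0])
    if any(len(g) != m for g in groups):
        raise ValueError("every subject needs the same number of observations")
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(m * (sum(g) / m - grand_mean) ** 2 for g in groups)
    return ss_between, ss_total


# Values taken from table 2: Children (between subjects) and Total rows.
r_i = intraclass_r_from_ss(ss_between=1.52981, ss_total=1.74651, m=2)
print(round(r_i, 2))  # 0.75
```

With raw data, `intraclass_r_from_ss(*one_way_sums_of_squares(groups), m)` gives the same quantity without building the full analysis of variance table.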
In practice, there will usually be little difference between r and the intra-class correlation coefficient.
The correlation coefficient can be used to compare measurements of different quantities, such as different scales for measuring anxiety. We could make repeated measurements of all the quantities on the same subjects and calculate intra-class correlations. The measures with the highest correlation between repeated measurements would discriminate best between individuals; in other words, they would carry the most information. For most applications, however, we prefer the within-subject standard deviation as an index of measurement error, as it has a more direct interpretation which can be applied to individual measurements.1