a Department of Public Health Sciences, St George's Hospital Medical School, London SW17 0RE, b Medical Statistics Laboratory, Imperial Cancer Research Fund, PO Box 123, London WC2A 3PX
Correspondence to: Dr Bland.
This is the thirteenth in a series of occasional notes on medical statistics.
In earlier Statistics Notes1 2 we commented on the analysis of paired data where there is more than one observation per subject. It can be highly misleading to analyse such data by combining repeated observations from several subjects and then calculating the correlation coefficient as if the data were a simple sample.1 The appropriate analysis depends on the question we wish to answer. If we want to know whether an increase in one variable within the individual is associated with an increase in the other we can calculate the correlation coefficient within subjects.2 If we want to know whether subjects with high values of one variable also tend to have high values of the other we can use the correlation between the subject means, which we shall describe here.
Means of repeated measurements of intramural pH and Paco2
The table shows the mean pH and Paco
We can calculate the usual correlation coefficient for the mean pH and mean Paco
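As a rough illustration of this step, the sketch below reduces repeated observations to one pair of means per subject and then computes the ordinary (unweighted) correlation coefficient on those means. The records, the subject labels, and the helper names `subject_means` and `pearson_r` are hypothetical and are not taken from the table.

```python
from collections import defaultdict
from math import sqrt


def subject_means(records):
    """Return {subject: (mean_x, mean_y, n_observations)} from (subject, x, y) records."""
    groups = defaultdict(list)
    for subject, x, y in records:
        groups[subject].append((x, y))
    summary = {}
    for subject, pairs in groups.items():
        n = len(pairs)
        mean_x = sum(p[0] for p in pairs) / n
        mean_y = sum(p[1] for p in pairs) / n
        summary[subject] = (mean_x, mean_y, n)
    return summary


def pearson_r(xs, ys):
    """Ordinary product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical repeated measurements (subject, pH, Paco2); NOT the article's data.
records = [
    ("A", 7.30, 5.1), ("A", 7.34, 4.9), ("A", 7.32, 5.0),
    ("B", 7.20, 4.2), ("B", 7.24, 4.0),
    ("C", 7.40, 5.6), ("C", 7.42, 5.4), ("C", 7.38, 5.8), ("C", 7.40, 5.6),
    ("D", 7.10, 5.0), ("D", 7.14, 4.8),
]

means = subject_means(records)
mean_ph = [m[0] for m in means.values()]
mean_paco2 = [m[1] for m in means.values()]

# One pair of means per subject, so the sample size here is the number of subjects.
r_between = pearson_r(mean_ph, mean_paco2)
print(f"subjects = {len(means)}, r = {r_between:.2f}")
```

Each subject contributes exactly one point to this calculation, however many times that subject was measured.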
This analysis does not take into account the different numbers of measurements on each subject. Whether this matters depends on how different the numbers of observations are and whether the measurements within subjects vary much compared with the means between subjects. We can calculate a weighted correlation coefficient, using the number of observations as weights. Many computer programs will calculate this, but it is not difficult to do by hand.
We denote the mean pH and Paco
An easy way to calculate the weighted correlation coefficient is to replace each individual observation by its subject mean. Thus the table would yield 47 pairs of observations, the first four of which would each be pH=6.49 and Paco
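The replacement trick described above can be sketched as follows: each subject's pair of means is repeated once per observation on that subject, and the ordinary correlation coefficient is then computed on the expanded lists. The means, the counts, and the helper names `expand_by_counts` and `pearson_r` are hypothetical and do not reproduce the table.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Ordinary product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def expand_by_counts(values, counts):
    """Repeat each subject's mean once for every observation made on that subject."""
    expanded = []
    for value, count in zip(values, counts):
        expanded.extend([value] * count)
    return expanded


# Hypothetical subject means and numbers of observations; NOT the article's table.
mean_ph = [7.32, 7.22, 7.40, 7.12]
mean_paco2 = [5.0, 4.1, 5.6, 4.9]
counts = [3, 2, 4, 2]

ph_long = expand_by_counts(mean_ph, counts)
paco2_long = expand_by_counts(mean_paco2, counts)

# The ordinary r on the expanded data weights each subject by its count.
r_weighted = pearson_r(ph_long, paco2_long)
print(f"pairs = {len(ph_long)}, subjects = {len(counts)}, r = {r_weighted:.2f}")
```

One caveat, offered as general statistical reasoning: a routine P value computed from the expanded lists would treat every repeated pair as an independent observation, so any significance test should be based on the number of subjects rather than the number of pairs.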
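The actual formula for a weighted correlation coefficient is:

$$
r = \frac{\sum m_i \bar{x}_i \bar{y}_i - \dfrac{\sum m_i \bar{x}_i \, \sum m_i \bar{y}_i}{\sum m_i}}
{\sqrt{\left(\sum m_i \bar{x}_i^{2} - \dfrac{\left(\sum m_i \bar{x}_i\right)^{2}}{\sum m_i}\right)
\left(\sum m_i \bar{y}_i^{2} - \dfrac{\left(\sum m_i \bar{y}_i\right)^{2}}{\sum m_i}\right)}}
$$

where $m_i$ is the number of observations on subject $i$, $\bar{x}_i$ and $\bar{y}_i$ are that subject's mean pH and mean Paco2, and every sum runs over the subjects.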
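A minimal sketch of the weighted calculation done directly from sums, using the number of observations on each subject as the weight. The function name `weighted_correlation`, its argument names, and the example values are assumptions made for illustration and are not the article's data.

```python
from math import sqrt


def weighted_correlation(x_means, y_means, weights):
    """Weighted correlation of subject means, weighting each subject by its count."""
    sum_m = sum(weights)
    sum_mx = sum(m * x for m, x in zip(weights, x_means))
    sum_my = sum(m * y for m, y in zip(weights, y_means))
    sum_mxy = sum(m * x * y for m, x, y in zip(weights, x_means, y_means))
    sum_mxx = sum(m * x * x for m, x in zip(weights, x_means))
    sum_myy = sum(m * y * y for m, y in zip(weights, y_means))

    # Weighted sum of products about the weighted means.
    numerator = sum_mxy - sum_mx * sum_my / sum_m
    # Weighted sums of squares about the weighted means.
    ss_x = sum_mxx - sum_mx ** 2 / sum_m
    ss_y = sum_myy - sum_my ** 2 / sum_m
    return numerator / sqrt(ss_x * ss_y)


# Hypothetical subject means and numbers of observations; NOT the article's table.
mean_ph = [7.32, 7.22, 7.40, 7.12]
mean_paco2 = [5.0, 4.1, 5.6, 4.9]
counts = [3, 2, 4, 2]

r_w = weighted_correlation(mean_ph, mean_paco2, counts)
print(f"weighted r = {r_w:.2f}")
```

With all weights equal the result reduces to the ordinary correlation coefficient of the subject means, and multiplying every weight by the same constant leaves it unchanged.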
For the data in the table the weighted correlation coefficient is r=0.08, P=0.9. There is no evidence that subjects with a high pH also have a high Paco2. However, as we have already shown,2 within the subject a rise in pH was associated with a fall in Paco2.