Recent rapid responses
18 August 2011
In a recent statistical note in this journal, Dr. Bland and Dr. Altman cautioned against interpreting the correlation of two random variables, X and Y, within a restricted range of data (ref. 1). They concluded that when we restrict the range of one of the random variables, say X, the correlation coefficient between X and Y will naturally be reduced; a smaller correlation observed in a restricted range of one variable therefore does not necessarily imply a different relationship between the two variables in that range. They further explained the reduction through the coefficient of determination, which measures the proportion of the variability in Y explained by the variability in X: if we restrict the range of X, we reduce the variation in X. It will therefore explain less of the variability in Y, and hence the correlation between X and Y in the restricted range of X will naturally be reduced.
We congratulate Drs. Bland and Altman for this important observation on the natural reduction of the correlation coefficient when one of the variables is restricted to a particular range. However, the explanation of the reduction could be made clearer and more general. In addition, a discussion of the magnitude of this reduction would be of great interest. For example, we may want to know how the reduction is related to the probability of X falling in the restricted range. Below, we explain explicitly why the reduction occurs and how it depends on the range of the restriction.
Suppose that we are interested in estimating the correlation between X and Y based on a random sample of n paired observations from a bivariate normal distribution. As a bivariate normal distribution can be transformed to the standard bivariate normal distribution with the same correlation coefficient, without loss of generality we assume the bivariate normal distribution has means (0, 0), variances (1, 1), and correlation r. Now let us consider the correlation in the restricted interval in which X is between a and b. Let f(.) and F(.) denote the standard normal probability density function and cumulative distribution function. Similar to what we have done before (ref. 2), it can be shown that, when n is large, the correlation in this restricted interval of X converges to
[equation missing in source]
where the variance term in the equation is the variance of the truncated standard normal variable X within the range (a, b).
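For readers who want an explicit form, a standard conditioning argument under the assumptions stated above (standard bivariate normal, correlation r, X restricted to a < X <= b) gives one expression for this limit. It is offered as a sketch, not as a reproduction of the authors' displayed formula; the symbols r_r, \sigma_{(a,b)}^2, and Z are notation assumed here.

```latex
% Assumed notation (not from the source): r_r is the limiting restricted
% correlation, \sigma_{(a,b)}^2 is the variance of X given a < X \le b,
% and Z is a standard normal variable independent of X.
\begin{align*}
Y &= rX + \sqrt{1-r^2}\,Z, \\
\operatorname{Var}(X \mid a < X \le b) &= \sigma_{(a,b)}^2, \\
\operatorname{Cov}(X, Y \mid a < X \le b) &= r\,\sigma_{(a,b)}^2, \\
\operatorname{Var}(Y \mid a < X \le b) &= r^2\sigma_{(a,b)}^2 + 1 - r^2, \\
r_r &= \frac{r\,\sigma_{(a,b)}}{\sqrt{1 - r^2 + r^2\sigma_{(a,b)}^2}}, \\
\sigma_{(a,b)}^2 &= 1 + \frac{a f(a) - b f(b)}{F(b) - F(a)}
  - \left(\frac{f(a) - f(b)}{F(b) - F(a)}\right)^2 .
\end{align*}
```

Under this form, r_r equals r when the truncated variance is 1 (no restriction) and shrinks toward 0 as the truncated variance shrinks.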
Because the variance of the truncated standard normal variable within the range (a, b) is smaller than or equal to 1, the variance of the unrestricted X, we derive that the restricted correlation is no larger in magnitude than r. Restricting X to this interval therefore attenuates the correlation between X and Y. Figure 1 illustrates graphically how the level of attenuation depends on the range (a, b). Specifically, it shows how the attenuation depends on the probability of X <= a and the probability of a < X <= b.
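The attenuation can be explored numerically. The sketch below is not the authors' code: it assumes the limiting restricted correlation takes the standard form r*s / sqrt(1 - r^2 + r^2*s^2), where s^2 is the variance of a standard normal variable truncated to (a, b), and it checks that assumed form against a simple simulation. All function names are hypothetical.

```python
import math
import random


def phi(x):
    """Standard normal density f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution function F(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def truncated_variance(a, b):
    """Variance of a standard normal variable truncated to (a, b)."""
    p = Phi(b) - Phi(a)
    # x * f(x) tends to 0 at +/- infinity; guard against inf * 0 = nan.
    af = 0.0 if math.isinf(a) else a * phi(a)
    bf = 0.0 if math.isinf(b) else b * phi(b)
    mean = (phi(a) - phi(b)) / p
    return 1.0 + (af - bf) / p - mean * mean


def restricted_correlation(r, a, b):
    """Assumed large-sample correlation when X is restricted to (a, b)."""
    s2 = truncated_variance(a, b)
    return r * math.sqrt(s2) / math.sqrt(1.0 - r * r + r * r * s2)


def simulated_restricted_correlation(r, a, b, n=200_000, seed=0):
    """Pearson correlation of simulated pairs with a < X <= b."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        if a < x <= b:
            # Y = r*X + sqrt(1 - r^2)*Z gives corr(X, Y) = r unrestricted.
            y = r * x + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)
            xs.append(x)
            ys.append(y)
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    r, a, b = 0.8, -0.5, 0.5
    print("P(a < X <= b):", Phi(b) - Phi(a))
    print("assumed limit:", restricted_correlation(r, a, b))
    print("simulated    :", simulated_restricted_correlation(r, a, b))
```

With r = 0.8 and X restricted to (-0.5, 0.5), an interval holding a little under 40% of the distribution, the assumed limit is about 0.35, well below the unrestricted 0.8.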
Contributors: LN and HC equally initiated, designed, and drafted the paper. All authors approved this version of the paper.
Views expressed in this paper are the authors' professional opinions and do not necessarily represent the official positions of the U.S. Food and Drug Administration.
1. Bland JM, Altman DG. Correlation in restricted ranges of data. BMJ 2011;342:d556.
2. Chu H, Nie L, Cole SR. Sample size and statistical power assessing the effect of interventions in the context of mixture distributions with detection limits. Stat Med 2006;25(15):2647-57.
Figure 1: The relationship between the correlation coefficient r_r for the restricted interval of X and the probability of X <= a and the probability of a < X <= b. The 19 lines presented in each plot correspond to the (unrestricted) coefficient being -0.9 to 0.9 by 0.1 from top to bottom.
Competing interests: None declared
OB/OTS/CDER, the US FDA