Recent rapid responses
Rapid responses are electronic letters to the editor. They enable our users to debate issues raised in articles published on bmj.com. Although a selection of rapid responses will be included as edited readers' letters in the weekly print issue of the BMJ, their first appearance online means that they are published articles. If you need the url (web address) of an individual response, perhaps for citation purposes, simply click on the response headline and copy the url from the browser window.
18 August 2011
In a recent statistical note in this journal, Dr. Bland and Dr. Altman cautioned against the use of the correlation of two random variables, X and Y, in a restricted range of data (ref. 1). They concluded that when we restrict the range of one of the random variables, say X, the correlation coefficient between X and Y will naturally be reduced; a smaller correlation observed in a restricted range of one variable therefore does not necessarily imply a different relationship between the two variables in that range. They further explained the reduction through the meaning of the coefficient of determination, which measures the proportion of the variability in Y explained by the variability in X: if we restrict the range of X, we reduce the variation in X. It will therefore explain less of the variability in Y, and hence the correlation between X and Y in the restricted range of X will naturally be reduced.
We congratulate Dr. Bland and Dr. Altman for this important observation on the natural reduction of the correlation coefficient when one of the variables is restricted to a particular range. However, the explanation of the reduction could be made clearer and more general. In addition, a discussion of the magnitude of this reduction would be of great interest. For example, we may want to know how this reduction is related to the probability of X falling in the restricted range. Below, we explain explicitly why the reduction occurs and how it depends on the range of the restriction.
Suppose that we are interested in estimating the correlation between X and Y from a random sample of n pairs drawn from a bivariate normal distribution. Because any bivariate normal distribution can be transformed to the standard bivariate normal distribution with the same correlation coefficient, we assume without loss of generality that the distribution has mean (0, 0), variance (1, 1), and correlation r. Now consider the correlation when X is restricted to the interval between a and b. Let f(.) and F(.) denote the standard normal probability density function and cumulative distribution function. Similar to what we have done before (ref. 2), it can be shown that, when n is large, the correlation in this restricted interval of X converges to
$$r_r = \frac{r\,\sigma_{a,b}}{\sqrt{1 - r^2 + r^2 \sigma_{a,b}^2}},$$
where
$$\sigma_{a,b}^2 = 1 + \frac{a f(a) - b f(b)}{F(b) - F(a)} - \left(\frac{f(a) - f(b)}{F(b) - F(a)}\right)^2$$
is the variance of the truncated standard normal variable X within the range (a, b).
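The limit can be reached by a short calculation, sketched here under the standard bivariate normal model described above. The auxiliary variable Z and the symbols r_r and \sigma_{a,b}^2 are notation adopted for this sketch.

```latex
% Assumes (X, Y) is standard bivariate normal with correlation r.
% Z is an auxiliary standard normal variable independent of X.
\begin{align*}
Y &= rX + \sqrt{1 - r^2}\, Z, \\
\operatorname{Var}(X \mid a < X < b) &= \sigma_{a,b}^2
  = 1 + \frac{a f(a) - b f(b)}{F(b) - F(a)}
      - \left( \frac{f(a) - f(b)}{F(b) - F(a)} \right)^2, \\
\operatorname{Cov}(X, Y \mid a < X < b) &= r\, \sigma_{a,b}^2, \\
\operatorname{Var}(Y \mid a < X < b) &= r^2 \sigma_{a,b}^2 + 1 - r^2, \\
r_r &= \frac{r\, \sigma_{a,b}^2}{\sigma_{a,b} \sqrt{r^2 \sigma_{a,b}^2 + 1 - r^2}}
     = \frac{r\, \sigma_{a,b}}{\sqrt{1 - r^2 + r^2 \sigma_{a,b}^2}}.
\end{align*}
```

Squaring the last line gives r_r^2 <= r^2 exactly when \sigma_{a,b}^2 <= 1, so the magnitude of the restricted correlation cannot exceed that of r.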
Because the variance of the truncated standard normal variable within the range (a, b) is smaller than or equal to 1, the variance of the unrestricted X, the restricted correlation is no larger than r in absolute value. Restricting X to this interval therefore attenuates the correlation between X and Y. Figure 1 shows graphically how the level of attenuation depends on the range (a, b); specifically, it shows how the attenuation depends on the probability that X <= a and the probability that a < X <= b.
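The attenuation can also be explored numerically. The following is a minimal Python sketch, assuming the standard bivariate normal model above and the usual closed form for the variance of a truncated standard normal variable. The function names (pdf, cdf, truncated_variance, restricted_correlation) are hypothetical helpers introduced for illustration and are not part of any published code.

```python
import math


def pdf(x):
    """Standard normal density f(x); taken as zero at +/- infinity."""
    if math.isinf(x):
        return 0.0
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal distribution function F(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def truncated_variance(a, b):
    """Variance of a standard normal X conditional on a < X < b."""
    p = cdf(b) - cdf(a)  # probability that X falls in (a, b)
    # x * f(x) tends to 0 as x -> +/- infinity, so treat it as 0 there.
    af = 0.0 if math.isinf(a) else a * pdf(a)
    bf = 0.0 if math.isinf(b) else b * pdf(b)
    mean = (pdf(a) - pdf(b)) / p  # mean of the truncated variable
    return 1.0 + (af - bf) / p - mean ** 2


def restricted_correlation(r, a, b):
    """Large-sample correlation of X and Y when X is restricted to (a, b)."""
    v = truncated_variance(a, b)
    return r * math.sqrt(v) / math.sqrt(1.0 - r * r + r * r * v)


# Illustration: unrestricted correlation 0.5 under progressively narrower ranges.
for a, b in [(-math.inf, math.inf), (-2.0, 2.0), (-1.0, 1.0), (0.0, 1.0)]:
    print(a, b, cdf(b) - cdf(a), restricted_correlation(0.5, a, b))
```

Each printed row gives the interval, the probability that X falls inside it, and the resulting restricted correlation. The sketch is not intended for intervals far in the tails, where F(b) - F(a) becomes too small for this simple floating-point arithmetic.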
Contributors: LN and HC equally initiated, designed, and drafted the paper. All authors approved this version of the paper.
Views expressed in this paper are the authors' professional opinions and do not necessarily represent the official positions of the U.S. Food and Drug Administration.
References
1. Bland JM, Altman DG. Correlation in restricted ranges of data. BMJ 2011;342:d556.
2. Chu H, Nie L, Cole SR. Sample size and statistical power assessing the effect of interventions in the context of mixture distributions with detection limits. Stat Med 2006;25(15):2647-57.
Figure 1: The relationship between the correlation coefficient r_r for the restricted interval of X and the probability of X <= a and the probability of a < X <= b. The 19 lines presented in each plot correspond to the (unrestricted) coefficient being -0.9 to 0.9 by 0.1 from top to bottom.
Competing interests: None declared
OB/OTS/CDER, the US FDA







