Education And Debate

Inflation in epidemiology: “The proof and measurement of association between two things” revisited

BMJ 1996; 312 doi: (Published 29 June 1996) Cite this as: BMJ 1996;312:1659
  1. George Davey Smith, professor of clinical epidemiology (a)
  2. Andrew N Phillips, reader in epidemiology and biostatistics (b)
  (a) University of Bristol, Department of Social Medicine, Bristol BS8 2PR
  (b) University Department of Primary Care and Population Sciences, Royal Free Hospital School of Medicine, London NW3 2PF

    The details of methods of statistical analysis used in studies reported in the BMJ will often be skimmed rapidly by readers who want to quickly assimilate the main message. The exact nature of the statistical methods may become a focus of attention, but this can be seen as an arcane area, of interest perhaps to the specialist and pedant, but not to the general reader. Increasingly, however, the particular details of analytical methods can greatly influence the apparent nature and importance of the findings. This can be illustrated by reference to the recent paper and commentaries in the BMJ regarding new analyses of Intersalt data.

    One potentially contentious area is the manner in which the association between sodium excretion and blood pressure has been “corrected for regression dilution bias.” For many readers the basic principle of dealing with the underestimation of associations caused by poor measurement may seem reasonable, but the validity of applying the particular corrections which are used has to be taken on trust. Confusion may be increased by the presentation of a set of “updated” corrected estimates, to replace the already corrected estimates given in the initial Intersalt report.1

    Comparison of the results reported in different ways reveals the degree to which such “corrections” can alter the picture. Each estimate is expressed as the difference in systolic blood pressure associated with a 100 mmol higher 24 hour urinary sodium excretion. This is a considerable difference in sodium excretion: roughly two standard deviations in the British Intersalt centres, or the difference between the means in the Kenyan and the British centres. In the original report the estimated blood pressure difference across this range was 1.6 mm Hg, which was reduced to 1.0 mm Hg on adjustment for body mass index, alcohol intake, and urinary potassium excretion. Applying an adjustment for regression dilution bias increased the differences to 3.5 mm Hg and 2.2 mm Hg respectively. In the new analyses these have been further increased to 4.3 mm Hg and 3.2 mm Hg.
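
    As a rough check on the size of these changes, the short Python sketch below divides each corrected estimate by its uncorrected counterpart. The figures are those quoted in the paragraph above; the dictionary keys and the function name are labels invented here for illustration and do not come from the Intersalt papers.

```python
# Systolic blood pressure differences (mm Hg) per 100 mmol higher 24 hour
# urinary sodium excretion, as quoted in the text. "adjusted" means adjusted
# for body mass index, alcohol intake, and urinary potassium excretion.
estimates = {
    "unadjusted": {"observed": 1.6, "first_correction": 3.5, "updated_correction": 4.3},
    "adjusted": {"observed": 1.0, "first_correction": 2.2, "updated_correction": 3.2},
}


def inflation_factors(row):
    """Return the ratio of each corrected estimate to the uncorrected one."""
    base = row["observed"]
    return {name: value / base for name, value in row.items() if name != "observed"}


for label, row in estimates.items():
    for name, factor in inflation_factors(row).items():
        print(f"{label:10s} {name:18s} x{factor:.2f}")
```

    On these figures the first correction multiplies both estimates by about 2.2, and the updated correction multiplies them by about 2.7 (unadjusted) and 3.2 (adjusted).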

    These estimates of systolic blood pressure reduction consequent on lowering salt consumption sufficient to reduce urinary sodium excretion by 100 mmol would translate into reductions in long term cardiovascular disease mortality ranging from 4% using the original uncorrected estimate to 21% using the latest corrected estimate.2 These different projected mortality reductions would certainly lead to differences in the level of enthusiasm with which public health interventions aimed at reducing salt consumption were applied. In this contribution we suggest that the assumptions which are made by investigators who correct effect estimates for regression dilution bias may often not hold and that the corrections which are performed can therefore be spurious.

    Is it correct to “correct”?

    What is now referred to as regression dilution bias in some sections of the medical literature was introduced as “attenuation by errors” when discussed in detail by Spearman in his seminal paper of 1904.3 Biological variability or technical measurement error will often lead to measures of association being biased towards the null value. Thus any measurement error in urinary sodium concentrations—due to incomplete collections, for example—and, in so far as it is not correlated with changes in blood pressure, any biological variability in urinary sodium concentration will lead to any true association between measurements of urinary sodium concentration and blood pressure being attenuated. In these circumstances, repeat measurements of sodium excretion allow the degree of such variation to be estimated, and various correction factors based on these estimates have been proposed.3 4 A correction factor of this type was applied in the initial Intersalt report.1
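
    The attenuation described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a reconstruction of any Intersalt analysis: exposure and outcome are normally distributed, the within person error in a single exposure measurement is independent of the outcome, and every parameter value and function name is hypothetical.

```python
import random
import statistics


def slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate_attenuation(n=20000, true_slope=1.0, sd_between=1.0, sd_within=1.0, seed=1):
    """Compare the slope on a single noisy measurement with its expected value."""
    rng = random.Random(seed)
    # Each person's long term ("usual") exposure level.
    usual = [rng.gauss(0, sd_between) for _ in range(n)]
    # The outcome depends on the usual level only (assumption).
    outcome = [true_slope * u + rng.gauss(0, 1.0) for u in usual]
    # One measurement per person: usual level plus independent day to day error.
    single = [u + rng.gauss(0, sd_within) for u in usual]
    observed = slope(single, outcome)
    # Expected attenuation: true slope times between / (between + within) variance.
    expected = true_slope * sd_between**2 / (sd_between**2 + sd_within**2)
    return observed, expected


observed, expected = simulate_attenuation()
print(f"observed slope {observed:.3f}, expected attenuated slope {expected:.3f}")
```

    With equal between person and within person variances the expected slope is half the true slope, which is the situation in which a correction factor of 2 would restore it, provided the independence assumption holds.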

    Applying such corrections is useful for illustrating the size of effects that may underlie observed associations, but they are not without potentially serious problems.3 5 6 Firstly, the correction methods implicitly assume that if on a given day an individual has a urinary sodium concentration which is above his or her average then this does not imply that the blood pressure is likely to be above (or below) average at the same time. If this assumption were wrong, and if fluctuations in urinary sodium and blood pressure within individuals tended to coincide, then the association would not have been underestimated to the degree that the correction methods assume. An inappropriately inflated estimate would thus result from the correction procedure.
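
    The consequence of violating this assumption can be sketched with a simple linear model. Suppose blood pressure on the day of measurement equals `beta` times the usual sodium level plus `gamma` times that day's deviation from the usual level, plus independent noise. The parameter `gamma`, the function names, and the numbers are hypothetical and chosen only to show the direction of the error; the correction used is the simple factor of 1 plus the ratio of within person to between person variance.

```python
def observed_slope(beta, gamma, var_between, var_within):
    """Slope of blood pressure on a single sodium measurement.

    beta  : slope on the usual (long term) sodium level
    gamma : slope on the day's deviation from the usual level (assumption)
    """
    return (beta * var_between + gamma * var_within) / (var_between + var_within)


def simple_correction(slope, var_between, var_within):
    """Multiply by 1 + within/between variance, assuming gamma is zero."""
    return slope * (1 + var_within / var_between)


def corrected_slope(beta, gamma, var_between, var_within):
    """Apply the simple correction to the slope that would be observed."""
    raw = observed_slope(beta, gamma, var_between, var_within)
    return simple_correction(raw, var_between, var_within)


# Hypothetical values: true slope 1.0, equal between and within person variance.
for gamma in (0.0, 0.5, 1.0):
    print(f"gamma={gamma:.1f} corrected slope={corrected_slope(1.0, gamma, 1.0, 1.0):.2f}")
```

    When `gamma` is zero the correction recovers `beta` exactly. When within person fluctuations in sodium and blood pressure move together (`gamma` greater than zero), the corrected value exceeds `beta` by `gamma` times the variance ratio; in this sketch, if `gamma` equals `beta`, no attenuation has occurred at all and the "corrected" slope is double the true one.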

    Preliminary data are available which suggest that the assumption may not hold. Although based on casual urine samples rather than the 24 hour collections which were used in Intersalt, a study in the Gambia found a significant positive within person correlation between systolic blood pressure and urinary sodium concentration.7 This was based on 65 subjects with up to nine measures made over a 15 month period. Similarly a study from India has shown that month to month variation in blood pressure is associated in the expected direction with month to month variation in 24 hour urinary sodium excretion.8 Indeed, the very hypothesis under test—that sodium intake (and thus excretion) is related to blood pressure—would predict the associations which are found. The important point is that few investigators who magnify the strength of their associations through correction for regression dilution bias actually check whether the assumptions they are making apply.

    A second issue is that corrections could as well be applied to spurious associations as to causal ones. For example, yellow fingers indexed by simple inspection would be related to the risk of lung cancer in a prospective study, and reliability studies could be performed on the ascertainment of yellow fingers. These reliability studies would reveal substantial measurement imprecision, and correction factors could be applied using exactly the same logic as that used by the Intersalt investigators, which would magnify the strength of association between yellow fingers and risk of lung cancer. Judgment has to be applied to decide if an association is causal, and the fact that it can become very strong after correction for measurement imprecision should not contribute here.

    The associations between poorly measured exposures and outcomes are often those which should not be “corrected” upwards. Consider the positive relation between reported intake of trans fatty acids and risk of coronary heart disease which has been found in some studies. It may well be that the positive association exists because health conscious respondents give reports of their dietary intake which are aspirational rather than actual and thus report lower intake of trans fatty acids than less health conscious people.9 Health conscious individuals are likely to be healthier and have lower risk of coronary heart disease than less health conscious folk. In this situation a poorly measured exposure could be associated with risk of developing coronary heart disease because of reporting bias, yet application of “correction factors” would greatly increase the magnitude of the association. The measurement is certainly poor: the correlation between questionnaire estimates and adipose tissue biopsy estimates of trans fatty acid intake is only 0.34 (reference 9). Indeed, in one study of this issue the observed association between reported trans fatty acid intake and risk of coronary heart disease would, once corrected in this way, imply that the underlying association is infinitely strong.9

    Corrections in the presence of confounders

    Applying correction factors to associations that are confounded or biased can be seriously misleading, but in observational epidemiology it is generally difficult to know whether this is what is being done. In the Intersalt study there are many potential confounders of the association between sodium excretion and blood pressure. Degree of obesity is one of these: various indicators of obesity are related to both higher blood pressure and greater sodium excretion. In the initial Intersalt report,1 adjustment for body mass index markedly reduced the magnitude of the association between sodium excretion and blood pressure. It is widely recognised that measurement error in confounding factors can lead to incomplete adjustment;10 11 thus imprecision in indexing smoking behaviour would leave a residual association between yellow fingers and lung cancer risk, even after adjustment for smoking. If correction for measurement imprecision is to be performed, then such corrections should certainly be applied to confounding factors as well as to the exposure of interest. The asymmetry in this regard in the original Intersalt report was noted by Mertens, who commented that “investigators are tempted to apply corrections when the methods inflate the primary associations under study, and are less keen to apply corrections to confounders when this may shrink the risk estimates of positive results.”12
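
    The residual confounding in the yellow fingers example can be made concrete with a small calculation. The sketch below is purely illustrative and rests on assumptions made here: true smoking has unit variance, the outcome depends linearly on true smoking only, yellow fingers are true smoking plus independent noise, and recorded smoking is true smoking plus independent measurement error. The function name and parameter values are hypothetical.

```python
def adjusted_marker_coefficient(beta=1.0, var_marker_noise=1.0, var_smoking_error=0.0):
    """Coefficient on yellow fingers after adjusting for recorded smoking.

    Uses the two predictor least squares formula written in terms of
    variances and covariances, with true smoking scaled to unit variance.
    """
    var_marker = 1.0 + var_marker_noise        # variance of yellow fingers
    var_recorded = 1.0 + var_smoking_error     # variance of recorded smoking
    cov_marker_recorded = 1.0                  # both share true smoking only
    cov_outcome_marker = beta                  # outcome depends on true smoking
    cov_outcome_recorded = beta
    det = var_marker * var_recorded - cov_marker_recorded**2
    return (cov_outcome_marker * var_recorded
            - cov_outcome_recorded * cov_marker_recorded) / det


for error in (0.0, 0.5, 1.0):
    coef = adjusted_marker_coefficient(var_smoking_error=error)
    print(f"smoking error variance {error:.1f}: yellow finger coefficient {coef:.3f}")
```

    With smoking recorded perfectly the adjusted coefficient for yellow fingers is zero, as it should be for a non-causal marker. As the error variance in recorded smoking grows, a positive coefficient reappears, and this residual association is what would then be magnified by a correction applied to the marker alone.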

    Since the original report the Intersalt investigators have developed their correction method beyond the simple method applied initially.13 14 This takes into account the fact that repeat measurements of sodium excretion will overestimate reliability, owing to the association of sodium excretion with age and sex. The simple correction used in the initial report consisted of multiplying the regression coefficient of blood pressure on urinary sodium excretion by 1 plus the ratio of within individual to between individual variance. As Intersalt participants were of both sexes and had a wide age range, the between individual variance was large and the correction factor was thus reduced in size in comparison with one calculated for individuals of the same age and sex. The method now applied adjusts the correction factor to take this into account.
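
    The simple correction described above, and the effect of pooling people of different age and sex, can be sketched as follows. The formula is the one stated in the paragraph (1 plus the ratio of within individual to between individual variance); the variance values and the split of between individual variance into a within stratum part and an age and sex part are made up for illustration.

```python
def correction_factor(var_within, var_between):
    """Simple correction: 1 plus within individual / between individual variance."""
    return 1 + var_within / var_between


# Hypothetical variance components for sodium excretion (arbitrary units).
var_within_person = 1.0            # day to day variation within a person
var_between_same_age_sex = 0.8     # between people of the same age and sex
var_due_to_age_and_sex = 0.4       # extra spread from pooling ages and sexes

pooled = correction_factor(var_within_person,
                           var_between_same_age_sex + var_due_to_age_and_sex)
within_stratum = correction_factor(var_within_person, var_between_same_age_sex)

print(f"pooled factor {pooled:.2f}, same age and sex factor {within_stratum:.2f}")
```

    With these invented variances the pooled factor is about 1.83 and the factor for people of the same age and sex is 2.25, which shows why removing age and sex variation from the denominator enlarges the correction.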

    The new method further incorporates the idea that some confounders, in particular body mass index, are measured with little or no error and that adjustment for them therefore overadjusts the sodium excretion-blood pressure association. The principle here may appear reasonable. If a factor which is itself not causally related to a disease is associated with a causal factor and is measured more precisely than this causal factor then in multivariate analyses the spurious exposure may seem to be more strongly associated with the outcome of interest than is the actual causal factor.15 16 In the multivariate situation, however, corrections for regression dilution bias are extremely sensitive to the value of the reliability estimates for the exposures.16 Thus in one example making different—but plausible—assumptions about the reliability of measurement of two correlated exposures led to a complete reversal of the findings.16
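
    The sensitivity to reliability estimates can be shown with a two exposure calculation. This is a sketch of a generic moment based correction, not the Intersalt method or the example in reference 16: both exposures are standardised to unit observed variance, their measurement errors are assumed independent of each other and of the outcome, and all correlations, covariances, and reliabilities are hypothetical.

```python
def corrected_coefficients(r, c1, c2, rel1, rel2):
    """Coefficients for two exposures after correcting for measurement error.

    r          : observed correlation between the two exposures
    c1, c2     : observed covariances of each exposure with the outcome
    rel1, rel2 : assumed reliabilities (true variance / observed variance)
    """
    det = rel1 * rel2 - r * r
    if det <= 0:
        raise ValueError("assumed reliabilities are incompatible with r")
    b1 = (rel2 * c1 - r * c2) / det
    b2 = (rel1 * c2 - r * c1) / det
    return b1, b2


r, c1, c2 = 0.7, 0.30, 0.28   # hypothetical observed values

scenarios = {
    "no correction": (1.0, 1.0),
    "exposure 1 more reliable": (0.9, 0.6),
    "exposure 2 more reliable": (0.6, 0.9),
}
for label, (rel1, rel2) in scenarios.items():
    b1, b2 = corrected_coefficients(r, c1, c2, rel1, rel2)
    print(f"{label:26s} b1={b1:+.2f} b2={b2:+.2f}")
```

    Without correction both coefficients are modestly positive. Swapping which exposure is assumed to be the more reliable flips the sign of each corrected coefficient and makes both several times larger in absolute size, so in this sketch the conclusion about which exposure matters depends entirely on the reliabilities assumed.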

    In one set of analyses in the recent paper the authors make the (impossible) assumption that body mass index is measured without error. This leads to marked further magnification of the size of the sodium excretion-blood pressure association. In a previous paper the lowest level of reliability considered for body mass index was the very high value of 0.98;14 this assumption still seems unlikely and is not supported with data from the Intersalt study. Equally importantly, body mass index will be serving as a proxy for other aspects of body composition which will themselves be independently related to blood pressure.17 18 19 Various cross sectional and prospective studies have shown that body fat distribution, the percentage of body mass which is adipose tissue, and the amount of intra-abdominal fat contribute to the prediction of blood pressure even when simple measures of obesity, such as body mass index, have been taken into account. These additional indices of body composition will themselves be related to sodium excretion as well as to blood pressure.20 21 22 In this situation the use of body mass index (even if this were perfectly measured) will represent underadjustment for confounding by body composition, and the reliability coefficient in this case is the (low) value relating body mass index to the underlying aspect of body composition for which it is serving as a proxy. In such circumstances the multivariate “corrections” for regression dilution bias which utilise the high reliability coefficients for body mass index could produce very misleading results.

    This point is illustrated by a well designed earlier study by one of the Intersalt principal investigators.23 Sodium excretion was accurately measured, with seven daily collections per participant. The association between sodium excretion and blood pressure was attenuated considerably more by adjustment for height and weight separately than by adjustment for body mass index. In these conditions it is simply inappropriate to assume that body mass index itself is a perfect marker of the aspects of body composition which confound the sodium excretion-blood pressure association and then to use the reliability coefficient for body mass index to “correct” the association.

    Finally, the multivariate correction methods which have been applied to the Intersalt data assume that there are no interactions between the exposures under consideration. In Intersalt itself there is a suggestion, albeit not statistically significant, that the association between sodium excretion and blood pressure is greater among individuals with higher body mass indices.24 Experimental data suggest that such interactions could be important. Obese adolescents whose blood pressure is salt sensitive become less salt sensitive after weight loss.25 This may reflect the influence of insulin resistance on blood pressure responses to dietary sodium.26 27 In the presence of such interactions multivariate correction methods could produce very unreliable results.

    Conclusions: why bigger isn't always better

    In this commentary we have outlined our misgivings regarding the application of methods to correct associations for measurement imprecision. We have illustrated this with regard to a recent presentation of Intersalt data. We are not intending to imply that the basic Intersalt findings are erroneous, but we feel that the mode of analysis and presentation of the data illustrate why considerable caution needs to be retained when estimates of effect size are corrected for regression dilution bias. Several conclusions relating to corrections for measurement imprecision can be drawn. Firstly, the fact that such corrections depend on several stringent criteria being met should be recognised, and serious attempts should be made to test if these assumptions actually hold. Secondly, while it is important to consider the possible size of the association which could, if certain conditions hold, underlie the observed association, corrected estimates should not be reported as the main findings in abstracts and summaries of papers. Thirdly, equal attention should be given to applying such corrections to confounding factors as well as to main effects, and the fact that there may be missing confounders, factors measured only by proxies, or interactions between exposures should always be considered. Fourthly, sensitivity analyses should be performed, varying the assumptions that are made, and imprecision in the reliability measures should be incorporated in these. The added uncertainty introduced by correction should be reflected in the estimates of precision that are given. Finally, it should be remembered that improving study design by incorporating multiple measurements on all participants and relating outcomes to better measures of exposure will give more reliable estimates in most situations, even when this may lead to a reduction in sample size.5

