Education And Debate

How to read a paper: Statistics for the non-statistician. II: “Significant” relations and their pitfalls

BMJ 1997; 315 doi: http://dx.doi.org/10.1136/bmj.315.7105.422 (Published 16 August 1997) Cite this as: BMJ 1997;315:422
  1. Trisha Greenhalgh, senior lecturer (p.greenhalgh@ucl.ac.uk)
  Unit for Evidence-Based Practice and Policy, Department of Primary Care and Population Sciences, University College London Medical School/Royal Free Hospital School of Medicine, Whittington Hospital, London N19 5NF

    Introduction

    This article continues the checklist of questions that will help you to appraise the statistical validity of a paper. The first of this pair of articles was published last week.1

    Correlation, regression, and causation

    Has correlation been distinguished from regression, and has the correlation coefficient (r value) been calculated and interpreted correctly?

    For many non-statisticians, the terms “correlation” and “regression” are synonymous, and refer vaguely to a mental image of a scatter graph with dots sprinkled messily along a diagonal line sprouting from the intercept of the axes. You would be right in assuming that if two things are not correlated, it will be meaningless to attempt a regression. But regression and correlation are both precise statistical terms which serve quite different functions.1

    The r value (Pearson's product-moment correlation coefficient) is among the most overused statistical instruments. Strictly speaking, the r value is not valid unless the following criteria are fulfilled:

    Summary points

    An association between two variables is likely to be causal if it is strong, consistent, specific, plausible, follows a logical time sequence, and shows a dose-response gradient

    A P value of <0.05 means that, if there were no real effect, a result at least this extreme would have arisen by chance on fewer than one occasion in 20

    The confidence interval around a result in a clinical trial indicates the limits within which the “real” difference between the treatments is likely to lie, and hence the strength of the inference that can be drawn from the result

    A statistically significant result may not be clinically significant. The results of intervention trials should be expressed in terms of the likely benefit an individual could expect (for example, the absolute risk reduction)
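    As a rough illustration of the last two summary points, the sketch below computes an absolute risk reduction and an approximate 95% confidence interval around it. The event counts are made up for illustration, the function name is hypothetical, and the interval uses a simple normal approximation, which is an assumption made here rather than a method taken from this article.

```python
import math


def absolute_risk_reduction(events_control, n_control,
                            events_treated, n_treated, z=1.96):
    """Return (ARR, lower, upper) using a normal-approximation interval.

    Assumption: a simple normal (Wald) approximation with z = 1.96 for a
    95% interval. It is only reasonable when groups are fairly large and
    event rates are not close to 0 or 1.
    """
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    arr = risk_control - risk_treated
    # Standard error of the difference between two independent proportions
    se = math.sqrt(risk_control * (1 - risk_control) / n_control
                   + risk_treated * (1 - risk_treated) / n_treated)
    return arr, arr - z * se, arr + z * se


# Hypothetical trial: 20 of 100 controls and 10 of 100 treated patients
# have the outcome of interest.
arr, lower, upper = absolute_risk_reduction(20, 100, 10, 100)
print(f"ARR = {arr:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

    With these invented figures the interval only just excludes zero and is wide, so a result can be statistically significant while leaving the likely benefit to an individual patient quite uncertain.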

    • The data (or, more accurately, the population from which the data are drawn) should be normally distributed. If they are not, non-parametric tests of correlation should be used instead.1

    • The …
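    The first criterion above can be made concrete with a small sketch that computes Pearson's r and a rank-based alternative on the same data. Spearman's rank correlation is used here as one example of a non-parametric test of correlation; that choice, the helper names, and the data are assumptions for illustration and are not taken from the article.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Made-up data: y always rises with x, but one extreme value makes the
# distribution of y far from normal.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 3, 5, 6, 8, 9, 11, 60]

print(f"Pearson r    = {pearson_r(x, y):.2f}")
print(f"Spearman rho = {spearman_rho(x, y):.2f}")
```

    In this invented example the single extreme value pulls Pearson's r well below 1, even though y increases with every increase in x, whereas the rank-based coefficient is unaffected by how extreme the value is.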
