
Research Methods & Reporting Statistics Notes

Population attributable fraction

BMJ 2018; 360 doi: (Published 22 February 2018) Cite this as: BMJ 2018;360:k757
  1. Mohammad Ali Mansournia, assistant professor of epidemiology (1)
  2. Douglas G Altman, professor of statistics in medicine (2)

  (1) Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
  (2) Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK

  Correspondence to: M A Mansournia mansournia_ma{at}

Much statistical analysis seeks to identify associations between exposures and outcomes. The population attributable fraction (PAF) is an epidemiologic measure widely used to assess the public health impact of exposures in populations. PAF is defined as the fraction of all cases of a particular disease or other adverse condition in a population that is attributable to a specific exposure; PAF equals (O − E)/O, where O and E refer to the observed number of cases and the expected number of cases under no exposure, respectively. The term “attributable” has a causal interpretation: PAF is the estimated fraction of all cases that would not have occurred if there had been no exposure.1 As an example, in early 1950,2 Doll derived O = 11189 and E = 1875 using the Doll and Hill case-control study of smoking and lung cancer deaths throughout England and Wales,3 so the smoking PAF for lung cancer deaths was (11189 − 1875)/11189 = 83%.
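
The arithmetic in Doll's example can be checked with a short Python sketch. The function name `paf_from_counts` is ours, chosen for illustration; it only evaluates (O − E)/O and says nothing about how O and E were derived.

```python
def paf_from_counts(observed, expected):
    """Population attributable fraction as (O - E) / O.

    observed: number of cases observed in the population (O)
    expected: number of cases expected under no exposure (E)
    """
    if observed <= 0:
        raise ValueError("observed number of cases must be positive")
    return (observed - expected) / observed


# Figures quoted in the text for smoking and lung cancer deaths
doll_paf = paf_from_counts(observed=11189, expected=1875)
print(f"PAF = {doll_paf:.0%}")  # prints "PAF = 83%"
```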

Using a cohort study, following Miettinen, we can estimate the PAF from the estimated relative risk (RR) for the exposure and the prevalence of exposure among cases (pc), as PAF = pc(1 − 1/RR).4 Suppose that a particular exposure doubles the risk of a certain outcome (that is, RR = 2). If the prevalence of exposure among cases is 0.6, then PAF = 0.6(1 − 0.5) = 0.3 (that is, 30%). PAF thus depends not only on the increased risk associated with the exposure but also on the prevalence of exposure, to which it is directly related. PAF is usually expressed as a percentage.
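
As a minimal sketch, the Miettinen formula can be written as a small Python function. The name `paf_miettinen` and the extra prevalence values in the loop are illustrative assumptions; only pc = 0.6 and RR = 2 come from the worked example above.

```python
def paf_miettinen(pc, rr):
    """PAF = pc * (1 - 1/RR).

    pc: prevalence of exposure among cases (between 0 and 1)
    rr: relative risk for the exposure (ideally adjusted for confounders)
    """
    if not 0 <= pc <= 1:
        raise ValueError("pc must lie between 0 and 1")
    if rr <= 0:
        raise ValueError("rr must be positive")
    return pc * (1 - 1 / rr)


# Worked example from the text: RR = 2, prevalence among cases = 0.6
example_paf = paf_miettinen(pc=0.6, rr=2)
print(f"PAF = {example_paf:.0%}")  # prints "PAF = 30%"

# Same RR, different (hypothetical) prevalences: PAF scales directly with pc
for pc in (0.2, 0.6, 0.9):
    print(f"pc = {pc}: PAF = {paf_miettinen(pc, 2):.0%}")
```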

Cohort studies are observational and thus liable to confounding,5 6 so a crude (unadjusted) RR should not be used. An adjusted RR can be used in the Miettinen formula to estimate a valid PAF. Alternatively, one can directly estimate the PAF from the original formula, (O − E)/O, using results from a multivariable logistic regression model.7
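
The sketch below illustrates why the crude RR can mislead. It does not fit a logistic regression model; as a simpler stand-in, it standardises over the strata of a single confounder to obtain E and then applies (O − E)/O. All counts are hypothetical, and the function names are ours.

```python
# Hypothetical cohort, stratified by one confounder (for example, age group).
# Each tuple: (exposed persons, exposed cases, unexposed persons, unexposed cases)
strata = [
    (200, 8, 800, 16),   # younger: risks 0.04 v 0.02, so stratum RR = 2
    (600, 72, 400, 24),  # older:   risks 0.12 v 0.06, so stratum RR = 2
]


def standardised_paf(strata):
    """(O - E)/O, with E obtained by giving exposed persons the
    unexposed risk of their own stratum."""
    observed = 0.0
    expected = 0.0
    for n_exp, cases_exp, n_unexp, cases_unexp in strata:
        risk_unexposed = cases_unexp / n_unexp
        observed += cases_exp + cases_unexp
        expected += cases_unexp + n_exp * risk_unexposed
    return (observed - expected) / observed


def crude_miettinen_paf(strata):
    """Miettinen formula with the crude RR, ignoring the strata."""
    n_exp = sum(s[0] for s in strata)
    cases_exp = sum(s[1] for s in strata)
    n_unexp = sum(s[2] for s in strata)
    cases_unexp = sum(s[3] for s in strata)
    crude_rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    pc = cases_exp / (cases_exp + cases_unexp)
    return pc * (1 - 1 / crude_rr)


adjusted = standardised_paf(strata)
crude = crude_miettinen_paf(strata)
print(f"standardised PAF = {adjusted:.3f}")  # 0.333
print(f"crude PAF        = {crude:.3f}")     # 0.444
```

In these made-up data the exposure is more common in the older, higher risk stratum, so the crude RR is 3 although the RR is 2 within each stratum, and the crude PAF is too large. Using the stratum RR of 2 in the Miettinen formula (pc = 80/120) gives the same 0.333 as the standardised calculation.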

As an example of the latter approach, the authors of a recent BMJ paper8 calculated the PAF of concurrent benzodiazepine/opioid use for the risk of opioid overdose in a retrospective analysis of claims data. This fraction represents the proportional reduction in opioid overdose cases in the population that would occur if concurrent benzodiazepine/opioid use could be eliminated entirely. The PAF estimate was 15% (95% confidence interval 14% to 16%).8 Valid 95% confidence intervals for PAF should take into account the uncertainty in both the observed and expected number of cases.7

The PAF formula with adjusted RR is easily generalised to exposures with more than two levels.9 In a cohort study the PAF for the effect of maternal overweight and obesity on infant mortality in relation to normal weight was estimated as 11%.10 Similarly, we can calculate PAF for the joint effects of two or more exposures. Such a PAF is expected to be less than the sum of the PAFs for the individual exposures because people with both exposures should not be counted twice. Finally, for preventive exposures one can reverse the coding: RR is now the adjusted risk ratio for no exposure and pc is the prevalence of no exposure among cases. The result is known as the preventable fraction: the fraction of all cases that would be prevented if the whole population were exposed.
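
A minimal sketch of the multilevel case follows, assuming the usual generalisation in which pc(1 − 1/RR) is summed over the exposed levels, each compared with the same reference level. The data are hypothetical: two independent binary exposures A and B, 250 people in each of the four combinations, a baseline risk of 0.01, and RRs of 2 for A, 3 for B, and 6 for both. The joint exposure is treated as one exposure with three exposed levels.

```python
def paf_multilevel(levels):
    """Sum of pc_i * (1 - 1/RR_i) over exposed levels.

    levels: iterable of (pc_i, rr_i), where pc_i is the proportion of all
    cases at exposure level i and rr_i is the RR for level i relative to
    the unexposed reference level.
    """
    return sum(pc * (1 - 1 / rr) for pc, rr in levels)


# Hypothetical expected cases per cell (250 persons x cell-specific risk)
cases = {"neither": 2.5, "A only": 5.0, "B only": 7.5, "A and B": 15.0}
total = sum(cases.values())  # 30 cases in all

# Each exposure on its own (a single exposed level)
paf_a = paf_multilevel([((cases["A only"] + cases["A and B"]) / total, 2)])
paf_b = paf_multilevel([((cases["B only"] + cases["A and B"]) / total, 3)])

# Joint exposure: three exposed levels against the "neither" reference
paf_joint = paf_multilevel([
    (cases["A only"] / total, 2),
    (cases["B only"] / total, 3),
    (cases["A and B"] / total, 6),
])

print(f"PAF for A alone = {paf_a:.3f}")          # 0.333
print(f"PAF for B alone = {paf_b:.3f}")          # 0.500
print(f"sum of the two  = {paf_a + paf_b:.3f}")  # 0.833
print(f"joint PAF       = {paf_joint:.3f}")      # 0.667
```

In this example the joint PAF (67%) is smaller than the sum of the separate PAFs (83%), because cases with both exposures contribute to each separate PAF.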

We can use valid estimates of the hazard ratio (or rate ratio) from cohort studies, or of the odds ratio from case-control studies, instead of RR in the Miettinen PAF formula if the outcome is uncommon. Here we assume that removing an exposure does not affect the person-time at risk, which may not be true. For example, eliminating smoking expands the person-years at risk of coronary death by removing other competing risks of death such as lung cancer.11
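
The role of the "uncommon outcome" condition can be seen in a small numerical sketch. The risks below are hypothetical, and pc is held at 0.6 in both scenarios purely to isolate the effect of replacing RR with the odds ratio.

```python
def odds_ratio(risk_exposed, risk_unexposed):
    """Odds ratio implied by two risks (each strictly between 0 and 1)."""
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed


def paf_miettinen(pc, ratio):
    """PAF = pc * (1 - 1/ratio), where ratio is an RR or an approximation to it."""
    return pc * (1 - 1 / ratio)


pc = 0.6  # hypothetical prevalence of exposure among cases
scenarios = {
    "uncommon outcome": (0.002, 0.001),  # risks in exposed, unexposed; RR = 2
    "common outcome": (0.4, 0.2),        # RR = 2 as well
}

results = {}
for label, (risk_exp, risk_unexp) in scenarios.items():
    rr = risk_exp / risk_unexp
    or_ = odds_ratio(risk_exp, risk_unexp)
    results[label] = (paf_miettinen(pc, rr), paf_miettinen(pc, or_))
    print(f"{label}: RR = {rr:.2f}, OR = {or_:.2f}, "
          f"PAF with RR = {results[label][0]:.3f}, "
          f"PAF with OR = {results[label][1]:.3f}")
```

With the uncommon outcome the two PAFs agree to about three decimal places; with the common outcome the odds ratio (about 2.67) overstates the RR of 2 and inflates the PAF from 0.300 to 0.375.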

Other important assumptions underlie the PAF. As usual, we make the strong assumptions that there is no bias in the study design and data analysis; in particular, that the estimated effect is adjusted for all confounders. In addition, we assume that removing the exposure does not affect other risk factors. This assumption may not be true in practice; for example, removing smoking may decrease alcohol consumption, making interpretation of smoking PAF for coronary deaths difficult.

Also, PAF assumes that there is a perfect intervention that eradicates the exposure. However, complete removal of an exposure is often unrealistic; even with legal restrictions and cessation programmes, many people will continue to smoke. A measure that allows for these realities is the generalised impact fraction, which is the fractional reduction of cases that would result from changing the current level of exposure in the population to some modified (partially removed) level.12 More technical issues about PAF, including its difference from the aetiologic fraction, can be found elsewhere.13
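
As a hedged sketch, the generalised impact fraction can be computed from the exposure distribution in the population before and after a hypothetical intervention, using one commonly used form of the formula. Note the assumptions: prevalences here are population prevalences (not prevalences among cases), the RRs are taken as valid and unchanged by the intervention, and all numbers and names are illustrative.

```python
def impact_fraction(current, modified, rr):
    """Fractional reduction in cases when the exposure distribution
    changes from `current` to `modified`.

    current, modified: population prevalences for each exposure level,
        reference level included, each list summing to 1
    rr: relative risks for the same levels (reference level = 1)
    """
    if not (len(current) == len(modified) == len(rr)):
        raise ValueError("all inputs must have one entry per exposure level")

    def relative_case_load(prevalences):
        # Proportional to the number of cases under this exposure distribution
        return sum(p * r for p, r in zip(prevalences, rr))

    baseline = relative_case_load(current)
    return (baseline - relative_case_load(modified)) / baseline


rr = [1.0, 2.0]          # unexposed (reference), exposed
current = [0.7, 0.3]     # 30% of the population exposed
halved = [0.85, 0.15]    # intervention halves the prevalence of exposure
removed = [1.0, 0.0]     # perfect intervention: nobody exposed

partial = impact_fraction(current, halved, rr)
full = impact_fraction(current, removed, rr)
print(f"impact fraction, exposure halved  = {partial:.3f}")  # 0.115
print(f"impact fraction, exposure removed = {full:.3f}")     # 0.231
```

When the modified distribution has no one exposed, the impact fraction reduces to the PAF, so the PAF is the limiting case of a perfect intervention.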


  • Contributors: MAM and DGA jointly wrote and agreed the text.

  • Competing interests: We have read and understood the BMJ Group policy on declaration of interests and have no relevant interests to declare.

  • Provenance and peer review: Not commissioned; not externally peer reviewed.