Relation between income inequality and mortality: empirical demonstration

Michael Wolfson, George Kaplan, John Lynch, Nancy Ross, Eric Backlund

Institutions and Social Statistics Branch, Statistics Canada, Ottawa, Canada K1A 0T6

Michael C Wolfson,
director general
Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, 48109-2029, United States

George Kaplan,
professor and chair of epidemiology

John Lynch,
assistant professor
Social and Economic Studies Division, Statistics Canada

Nancy Ross
analyst
Federal Building #3, US Bureau of the Census, Washington DC 20233-8700, United States

Eric Backlund,
mathematical statistician
Correspondence to: M Wolfson wolfson{at}statcan.ca

Abstract

Objective  To assess the extent to which observed associations at population level between income inequality and mortality are statistical artefacts.

Design  Indirect "what if" simulation by using observed risks of mortality at individual level as a function of income to construct hypothetical state level mortality specific for age and sex as if the statistical artefact argument were 100% correct.

Setting  Data from the 1990 census for the 50 US states plus Washington, DC, were used for population distributions by age, sex, state, and income range; data disaggregated by age, sex, and state from the Centers for Disease Control and Prevention were used for mortality; and regressions from the national longitudinal mortality study were used for the individual level relation between income and risk of mortality.

Results  Hypothetical mortality, while correlated with inequality (as implied by the logic of the statistical artefact argument), showed a weaker association with states’ levels of income inequality than the observed mortality.

Conclusions  The observed associations in the United States at the state level between income inequality and mortality cannot be entirely or substantially explained as statistical artefacts of an underlying individual level relation between income and mortality. There remains an important association between income inequality and mortality at state level over and above anything that could be accounted for by any statistical artefact. This result reinforces the need to consider a broad range of factors, including the social milieu, as fundamental determinants of health.

Introduction

Considerable debate surrounds the impact of socioeconomic circumstances on individuals’ health. Recent results suggest that there is a link not only between individual socioeconomic circumstances and health but also between the socioeconomic milieu in which individuals live and their health. Research has shown that higher levels of inequality in income among nations, states, or cities in the United States, or other geographically defined populations, are associated with higher mortality. (1) (2) (3) (4)

Concerns have been raised by Gravelle, however, that these results may be no more than a statistical artefact. (5) Gravelle points out, as others have noted previously, (6) (7) that a "diminishing returns" protective effect of higher individual income on individual risk of death is sufficient to account for differences in mortality between populations if there are differences in the extent of wealth and poverty, hence in the degree of income inequality.

The logic of this argument is correct. At the individual level, higher income (or some closely related but unmeasured factor, such as social status, for which income is a proxy) is causally associated with greater longevity. (8) Moreover, while an extra dollar or pound of income is protective, the amount of protective effect tails off as total income rises. (8) (9)

At the level of a population there is always some mixture of people with low, middle, and high incomes. If one population has a more equal distribution of income than another this is equivalent to there being fewer individuals with either very high or very low incomes and more with incomes closer to the middle. But if a poorer individual is £1000 better off in a second population the beneficial effect on his or her risk of mortality is larger than the adverse impact on the risk of some richer person being £1000 worse off because of the diminishing protective returns of additional income. Thus, a population with a more equal distribution of income can have a lower mortality, other things being equal, solely as a result of a generic curvilinear individual level causal relation between income and risk of mortality.
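The argument in the preceding paragraph is an instance of Jensen's inequality. As an illustrative sketch only (the function r and the quantities y_1, y_2, and t are notation introduced here, not taken from the analysis), suppose that individual mortality risk is a decreasing, strictly convex function of income. A transfer from a richer to a poorer person that leaves total income unchanged then lowers the sum of the two risks:

```latex
% r   : individual mortality risk as a function of income (assumed decreasing, strictly convex)
% y_1 : income of the poorer person;  y_2 : income of the richer person, y_1 < y_2
% t   : amount transferred from richer to poorer, with 0 < t < y_2 - y_1
\[
  r(y_1 + t) + r(y_2 - t) \;<\; r(y_1) + r(y_2)
\]
% Equivalently, the gain to the poorer person exceeds the loss to the richer person:
\[
  r(y_1) - r(y_1 + t) \;>\; r(y_2 - t) - r(y_2)
\]
```

Summed over a whole population, the same inequality implies that, for a given mean income, a more equal distribution yields a lower average risk.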

This logical possibility, however, is not a sufficient reason to dismiss the potential importance of inequality in income as an independent determinant of population level mortality. This remains an empirical question.

We approached this question indirectly by first estimating a generic individual level relation between income and risk of mortality. We then simulated the extent to which variations in the distribution of income across populations can account for the observed population level relation between income inequality and mortality. In other words, we asked "what if" our well specified individual level income-mortality relation were fully causal, the key step in Gravelle’s argument. We therefore applied this relation to all individuals in a population group based on its actual income distribution and then calculated expected mortality. The extent to which we reproduce the observed population level association between income inequality and mortality is then an empirical test of the statistical artefact hypothesis. (Alternative direct tests of the importance of income inequality have also been undertaken, with mixed results. (10) (11) (12))

Our conclusion, based on analysis of data for US states, is that the observed association between income inequality and mortality rates cannot be substantially explained as a statistical artefact.

Methods

On the relations among income inequality, individual income, and health

At the individual level there is a convincing association between socioeconomic status and health status. In the United Kingdom these associations are most often observed in terms of the registrar general’s social classes and mortality, (13) whereas in data from the United States, socioeconomic status is typically measured by annual income or educational attainment. (14) Furthermore, there is good evidence, based on longitudinal analyses of representative populations, that the individual level association reflects, for at least a substantial proportion of the population, a causal pathway from higher (lower) income to a lower (higher) expectation of mortality. (8)

At the level of groups of individuals or populations there is also clear evidence showing that polities with higher levels of income inequality have higher mortalities. (1) (2) (3) (4) These two measures—income inequality and mortality—are both inherent attributes of populations hence intrinsically ecological. Neither has any meaning at the individual level. Any analysis that seeks to disentangle relations among these factors, some of which pertain to individuals while others apply at the ecological or population level, requires data at both levels of observation.

In the case of individual income, however, at least two meanings are often attached. One is "absolute", where the focus is on what material goods and services, such as nutritious food and adequate housing, a given level of income can buy. Used in this absolute sense, income can be treated as an individual characteristic like blood pressure. On the other hand, income can also be interpreted as a marker of social position (for example, rank in a hierarchy of social status) or a perceived sense of relative deprivation. In these latter "relative" cases it is impossible to interpret a given level of income for an individual without knowledge of the incomes of at least some of the other people in that individual’s "social context".

Gravelle frames the artefact hypothesis as a simple debate between the proponents of an "absolute" and a "relative" income hypothesis, where the "relative" hypothesis is given a different meaning, referring to the extent of inequality in the distribution of income. The essence of his argument is that the "relative income" hypothesis is (at least in part) nothing more than a direct logical consequence of the "absolute income" hypothesis, where the critical feature of the latter is that the relation between individual level income and mortality is non-linear and, more precisely, convex.

This logical possibility of a relation between income inequality and mortality was established well before Gravelle’s analysis. The irony is that the earlier authors used this possibility in a positive sense. Unlike Gravelle, who uses it as a criticism of the meaningfulness of observed associations between mortality and income inequality, they used the curvilinear (convex) individual level relation between income and mortality risk to explain why we should not be surprised that there is a population level relation between inequality and mortality. For example, Duleep states, "income may be an important determinant of a country’s aggregate mortality if there is a causal association between income and mortality for individuals and the underlying individual-level relationship is non-linear." (6) Similarly, Lynch and Kaplan, referring in turn to a series of studies dating back to Preston in 1975, (15) note that, "the [individual-level] non-linear relationship between income and mortality is a sufficient condition for income distribution to be a determinant of mortality, but it is probably not the most interesting or, indeed, most important part of understanding how income distribution impacts health." (7)

Empirical background

Even though Gravelle’s artefact argument is presented as a question of logic, the empirical question remains: Is the observed ecological association solely or predominately attributable to the individual level relation between income and mortality or is something else also going on? We used two main kinds of data to answer this question, one at the individual level, the other at the population level.

The first individual level data analysis was a specially generated estimate of the relation between individual income and risk of mortality, based on about 7.6 million person years of data for a representative sample of the population of the United States: the 10 year follow up for the national longitudinal mortality study, which in turn was based on matching files with household income and other demographic information from the current population survey conducted by the US Census Bureau to the National Death Index. We used a Cox proportional hazards regression to fit mortality as a function of the logarithm of baseline individual income (defined as the total pretax income received by all members of the individual’s household). Income was measured as a categorical variable, with the top open ended interval ($50 000 and over in 1980 dollars, which, with inflation, corresponds to about $80 000 in 1990 dollars (£120 000)) containing 5.2% of the sample. The hazard regression also included a series of dummy variables for age starting at 25 in 5 year intervals up to 80-84 and the open ended interval age 85 and over, for household sizes 1 to 7 and over, and for sex.

Results

The downward sloping curves (close together) in figure 1 show the results: the estimated relation between household income and the relative risk of mortality, plus a 95% confidence interval, after controlling for age and sex (significant) and household size (non-significant). The resulting pattern is highly significant both statistically and substantively and is clearly consistent with Gravelle’s and others’ assumption of a convex individual level relation between income and mortality risk (though recall that the curve for incomes above $100 000 (£150 000) should be treated as an extrapolation). The other "humped" curve shows the frequency distribution of all individuals in the US population in 1990 by the same definition of household income. This income distribution curve is discussed further below.
 
Fig 1 Relative risk of dying and population distribution for US individuals by household income ($)

The choice of a logarithmic specification for income in the hazard regression, hence a power relation for the mortality relative risk as a function of income, is clearly arbitrary. Backlund et al, however, tested this specification using an earlier version of the same NLMS data and concluded that it was a reasonable characterisation of the shape of the relation. (9) Moreover, the analysis in Wolfson et al for a large cohort of older Canadian men, where there was no such a priori assumption, supports this choice of functional form for the individual level income-mortality relation. (8) It turns out that a re-examination of these results generates a coefficient almost identical to that from the regression shown in figure 1. The relative risk of dying is about equal to \((\text{individual income}/\text{mean income})^{-0.2}\), where \(-0.2\) is the estimated exponent.
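The power relation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function name `relative_risk`, the rounded exponent of −0.2, and the mean income of 40 000 are assumptions chosen for the example.

```python
def relative_risk(income, mean_income, exponent=-0.2):
    """Relative risk of dying as a power function of income relative to the mean.

    The exponent of -0.2 is the rounded value quoted in the text; the function
    name and argument names are hypothetical.
    """
    if income <= 0 or mean_income <= 0:
        raise ValueError("incomes must be positive")
    return (income / mean_income) ** exponent


# Hypothetical mean income, used only to illustrate the shape of the curve.
mean_income = 40_000
for income in (10_000, 20_000, 40_000, 80_000, 160_000):
    print(income, round(relative_risk(income, mean_income), 3))
```

Under this form each doubling of income multiplies the relative risk by 2^(−0.2), or about 0.87, so the absolute reduction in risk from a given extra amount of income shrinks as income rises, which is the convexity on which the artefact argument depends.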

To complement this individual level relation for the US population, we generated consistent population level or ecological data on income inequality and mortality for each of the 50 US states, plus Washington DC. For income inequality, special tabulations were purchased from the 1990 US population census. These tables provided detailed counts of the numbers of individuals living in households by state, sex, age range (<1, 1-4, 5 year intervals to 90–94, and ≥95), and the income of the household in which they lived (the same definition of income as in the proportional hazards regression in figure 1) broken down into 32 intervals ranging up to $250 000 (£375 000) and above. The income distribution curve shown in figure 1 is derived from these data. Counts of the non-household (institutionalised) population were also tabulated by state, sex, and the same age intervals. Meaningful income data for these populations were unavailable.

For the household population, an additional table of mean incomes within each age/sex/state/income interval cell was generated. These cell mean incomes allow more accurate estimates of hypothetical mortality as detailed in the appendix as well as better estimates of income inequality.

Mortality data were downloaded from the Centers for Disease Control and Prevention’s CDC WONDER site (http://wonder.cdc.gov/). In this case, data by state, sex, and age for 1989, 1990, and 1991 were used. The death counts were then averaged over the 3 years to improve the stability of the mortality estimates. Where the age intervals in the CDC data were coarser (for example, 10 year intervals) mortality was assumed to be identical for the 5 year subintervals in the census data. (Admittedly, this is a crude assumption but we thought that using more sophisticated interpolation would not materially affect the results. There were some differences between the denominators of the CDC mortality and the 1990 census population counts. As a result, the CDC (3 year average) mortality was used and then applied to the 1990 census population counts to derive estimated counts of deaths.) Finally, for each state (plus Washington, DC) a series of inequality measures was computed, as well as estimates of mean and median income.
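The data preparation described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function names and all figures are hypothetical, and pooling deaths and populations across the three years is one plausible reading of "averaged over the 3 years".

```python
def pooled_rate(deaths_by_year, population_by_year):
    """Mortality rate (deaths per person per year) pooled over several years.

    Assumption: the rate is total deaths divided by total population at risk
    over the years supplied.
    """
    return sum(deaths_by_year) / sum(population_by_year)


def expand_to_five_year(coarse_rates):
    """Give every 5 year subinterval the rate of the coarser interval containing it."""
    fine = {}
    for (start, end), rate in coarse_rates.items():
        for lower in range(start, end + 1, 5):
            fine[(lower, lower + 4)] = rate
    return fine


# Hypothetical figures for one state and sex, ages 25-34, for 1989 to 1991.
coarse = {(25, 34): pooled_rate([150, 160, 155], [100_000, 101_000, 102_000])}
fine = expand_to_five_year(coarse)

# Apply the rates to (hypothetical) 1990 census counts to estimate deaths.
census_population = {(25, 29): 52_000, (30, 34): 49_000}
estimated_deaths = {age: fine[age] * n for age, n in census_population.items()}
print(estimated_deaths)
```

Applying the rate to the census counts, rather than using the death counts directly, keeps the numerators consistent with the census denominators used for the income distributions.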

Constructing an empirical test

Given these data, an indirect empirical assessment of the artefact hypothesis can be constructed by supposing that the artefact hypothesis is completely true and then asking what kind of ecological association between income inequality and mortality would result. In other words, the hypothesis can be tested by constructing a hypothetical series of state-specific mortalities. For each state, the overall relation between individual level income and mortality risk shown in figure 1 as estimated for the entire US population is assumed not only to be 100% causal (that is, absolute income level causes mortality risk and not the reverse) but also to be the only causal factor that matters. This "national truth" about the relation between individual level income and mortality risk is then applied to the actual income distribution of the population within the state as observed in the 1990 census. This generates a set of age-sex expected relative mortality risks for each state, averaged over that state’s income groups, which is then multiplied by the corresponding US national age-sex specific mortality to generate the desired hypothetical state specific mortality.

More precisely, for each state, we have the household population arrayed by sex, 21 age groups, and 32 income ranges as well as the institutionalised population arrayed by sex and 21 age groups. In each of these categories we know how many individuals there were, and for the household population we also know how many were in each income range and their average income within that range. We then simply "plug these data" into the relation of figure 1. The result is each age-sex-state-income subgroup’s relative risk of dying as if the only reason for differences between states in mortality risks for a given age-sex group were differences in the inequality of state income distributions.

Next, these relative risks are multiplied by the corresponding national age-sex specific mortalities. The result is a set of age-sex-state-income specific expected mortalities for the household population—under the assumption that the only reason a state’s mortality experience should differ from the national average is that its population is distributed by income group in a different way than the national pattern, exactly as posited by Gravelle in his explanation of the artefact hypothesis.

For the institutionalised population there were no meaningful income data so their incomes were arbitrarily set equal to the US national average income. As a result their mortality risks end up equal to exactly 1× general (household plus institutionalised population for all states combined) age-sex specific mortality. This is probably an underestimate of mortality among institutionalised people but there were no readily available data for determining a separate mortality. The assumption of a relative risk of 1 for this small subpopulation (about 2.7% of the total, and 4.4% of those aged ≥60) does not affect the thrust of our empirical test. (For example, if the incomes of institutionalised people are all assumed to be 0, the overall results are generally unaffected.)
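The construction described in the preceding paragraphs can be illustrated for a single age-sex group in a single state. The sketch below is not the authors' software: the function name, the rounded exponent of −0.2, the choice of `reference_income` as the denominator of the relative risk, and all the figures are assumptions made for the example.

```python
def hypothetical_cell_mortality(national_rate, household_cells, institutional_pop,
                                reference_income, exponent=-0.2):
    """Expected mortality for one age-sex group in one state.

    household_cells: list of (population, mean_income) pairs, one per income range.
    institutional_pop: people without income data, each given a relative risk of 1.
    reference_income: mean income in the denominator of the relative risk (assumed).
    """
    weighted_risk = float(institutional_pop)  # relative risk of exactly 1 each
    total_pop = institutional_pop
    for population, mean_income in household_cells:
        weighted_risk += population * (mean_income / reference_income) ** exponent
        total_pop += population
    # National age-sex rate scaled by the population-weighted average relative risk.
    return national_rate * weighted_risk / total_pop


# Two hypothetical states with the same mean income (40 000) but different spread.
national_rate = 400.0  # deaths per 100 000, hypothetical
equal_state = [(1000, 40_000)]
unequal_state = [(500, 20_000), (500, 60_000)]

equal_rate = hypothetical_cell_mortality(national_rate, equal_state, 30, 40_000)
unequal_rate = hypothetical_cell_mortality(national_rate, unequal_state, 30, 40_000)
print(equal_rate, unequal_rate)
```

In this toy case the more unequal state receives the higher hypothetical mortality even though mean incomes are identical, which is the artefact mechanism in miniature.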

It is important to note that the artefact hypothesis as stated by Gravelle is only a sketch. It lacks important detail such as how different age-sex groups should be treated. For example, the average income of (non-institutionalised) elderly people within each state in all cases is well below the state mean income so their relative risks computed as above will generally be substantially greater than 1. They therefore end up having considerably higher hypothetical than actual average mortality given our (necessarily) much more detailed elaboration of the artefact hypothesis. It should be emphasised, however, that if the average incomes of elderly people were close to the mean income, in other words if the aspect of income inequality associated with age were greatly reduced, differences between actual and hypothetical mortality for elderly people would essentially disappear.

A formal algebraic description of the process for constructing these hypothetical "artefact as the only cause of interstate variations" mortalities is given in the appendix.

Statistical analysis

Given a set of hypothetical mortalities explicitly constructed as if the artefact hypothesis were 100% correct, the empirical test of the hypothesis consists simply of examining whether hypothetical and actual state mortalities are similarly distributed.

As a first step we examined the range of variation in actual and hypothetical mortality across the 50 US states plus Washington DC. For no other reason than the variability in state mean incomes we expected that the convex relation between individual level income and mortality in figure 1 would generate some variability in the hypothetical mortality. In other words, two different states with different mean incomes but the same relative inequality would have different hypothetical mortality as their populations cluster at different points along the x axis in terms of figure 1, hence at different levels of relative risk. The results are shown in figure 2 for one demographic subgroup: men of working age (25 to 59 years). This is a scatter plot of both actual mortality (solid circles) and hypothetical mortality (open circles) by state mean income. The latter rates have been constructed as if the only cause of differences in interstate mortality was the artefact hypothesis. The area of each circle is proportional to the state’s population. (Note that in this and later cases of age groups that are broader than the age intervals in the underlying data, the mortalities are weighted averages based on the overall US population distribution by age and sex.)
 
Fig 2 Actual and hypothetical mortality by state mean income for working age (25 to 59 years) men

The open circles in figure 2, representing the hypothetical mortality, show a clear though modest association with mean income. This is entirely expected because of the way the hypothetical rates have been constructed. Figure 2, however, shows far less variability for the hypothetical than for the actual mortality for this demographic group. Moreover, actual mortalities show virtually no association with states’ mean incomes.

The table gives a summary statistical indication of this comparative variability of hypothetical and actual state specific mortality for each of six demographic subgroups and for the entire US population, measured by the interquartile range (75th less 25th centile). The last column shows the first column as a percentage of the second. Nowhere does the variability in the hypothetical mortality exceed about half of the actual variability. These results, and figure 2 above, are the most straightforward indication that while logically correct, Gravelle’s artefact hypothesis is incomplete as an explanation.

Table 1 Interquartile ranges of actual and hypothetical mortality by demographic group

| Population group (age) | Hypothetical interquartile range of mortality (per 100 000) | Actual interquartile range of mortality (per 100 000) | Hypothetical interquartile range as percentage of actual range |
|---|---|---|---|
| Infants (<1) | 67 | 172 | 39 |
| Young (1-24) | 4 | 16 | 23 |
| Men (25-59) | 21 | 120 | 18 |
| Women (25-59) | 28 | 50 | 56 |
| Men (≥60) | 178 | 514 | 35 |
| Women (≥60) | 131 | 362 | 36 |
| All demographic groups | 38 | 116 | 33 |
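The summary statistic in the table can be sketched as below. The quantile convention and whether states were weighted are not stated in the text, so the unweighted "inclusive" method of Python's `statistics.quantiles` is an assumption, as are the function names and the toy data.

```python
import statistics


def interquartile_range(values):
    """75th centile minus 25th centile, using the 'inclusive' quantile method."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def hypothetical_as_percentage_of_actual(hypothetical, actual):
    """Hypothetical interquartile range as a percentage of the actual range."""
    return 100 * interquartile_range(hypothetical) / interquartile_range(actual)


# Toy state-level mortalities (per 100 000) for five imaginary states.
hypothetical_rates = [401, 402, 403, 404, 405]
actual_rates = [380, 390, 400, 410, 420]
print(hypothetical_as_percentage_of_actual(hypothetical_rates, actual_rates))
```

Applied to the published ranges for infants, 67 and 172 per 100 000, the same ratio is 67/172, or about 39%.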

The table is limited, however, because it does not show the relations of either the hypothetical or the actual mortality to inequality. Figure 3 therefore shows scatter plots for each of the demographic subgroups shown in the table. The x axis in all cases ranks states according to one straightforward summary inequality indicator—the "median share"—that is, the proportion of total household income accruing to the bottom half of the population. A carefully selected range of other inequality indicators has also been examined, including measures sensitive to both "poverty" and "affluence" as well as measures of polarisation as distinct from inequality. (16) The specific choice of inequality indicator does not materially affect the results, though this is contrary to the findings of Daly et al. (11) In addition, a weighted ordinary least squares regression line was fitted to each scatter to show the slopes of the relations.
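The two calculations named in this paragraph, the median share and a weighted least squares line, can be sketched as follows. The sketch makes assumptions not stated in the text: everyone in an income cell is treated as having the cell mean income, the regression weights are taken to be state populations, and the function names and figures are hypothetical.

```python
def median_share(cells):
    """Share of total income received by the poorer half of the population.

    cells: iterable of (population, mean_income) pairs. Everyone in a cell is
    assumed to have the cell mean income, so a cell that straddles the median
    is split in proportion to population.
    """
    ordered = sorted(cells, key=lambda cell: cell[1])
    total_pop = sum(pop for pop, _ in ordered)
    total_income = sum(pop * income for pop, income in ordered)
    remaining = total_pop / 2  # people still needed to fill the bottom half
    bottom_income = 0.0
    for pop, income in ordered:
        take = min(pop, remaining)
        bottom_income += take * income
        remaining -= take
        if remaining <= 0:
            break
    return bottom_income / total_income


def weighted_least_squares(x, y, weights):
    """Slope and intercept of the weighted ordinary least squares line of y on x."""
    total_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_w
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


# Hypothetical state: half the people at 10 000 and half at 30 000.
print(median_share([(50, 10_000), (50, 30_000)]))
```

A perfectly equal distribution gives a median share of 0.5, and the value falls as inequality rises, so a negative regression slope of mortality on the median share corresponds to higher mortality in more unequal states.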
 
Fig 3 Hypothetical and actual mortality for six demographic groups by income inequality

If the observed ecological association between average mortality and income inequality were completely artefactual, then the two scatters of points (actual and hypothetical, solid and open circles) would essentially be on top of one another and the two regression lines would be superimposed. This, however, is clearly not the case. The mortality based on the artefact hypothesis shows some slope in the expected direction—a higher share of income accruing to the bottom half of the population, indicating lower inequality, is associated with lower mortality. But these slopes are less than the slopes of the actual mortality in relation to the median share indicator of income inequality. (Recall that the level of hypothetical mortality, particularly for the older age groups, is higher than the actual because elderly people have incomes that are much lower than average in the US.)

Discussion

The pattern of mortality generated from a literal application of Gravelle’s artefact hypothesis provides a poor fit with the observed data in the US. As a consequence, the observed associations in the United States at the state level between income inequality and mortality cannot be entirely or substantially explained as statistical artefacts of an underlying individual level relation between income and mortality risk. There remains a significant and important association between state level income inequality and mortality over and above anything that could be accounted for by any statistical artefact. This result reinforces the need to consider a broad range of factors, including the social milieu, as fundamental determinants of health.

We acknowledge helpful discussions with Richard Wilkinson, George Davey-Smith, Eric Brunner, Bruce Kennedy, Ichiro Kawachi, Geoff Rowe, and Jean-Marie Berthelot; comments by two anonymous referees; participants in the conference on economic equity in Ann Arbor, 4-6 June; and members of the population health programme of the Canadian Institute for Advanced Research on earlier versions of this paper. We also thank Susan Leroux for helpful analytical assistance. We remain responsible for any errors or infelicities.

Contributors: MCW conceived the methods used for assessing empirically the artefact hypothesis, specified, acquired, and analysed the US census data, and developed and wrote the software for constructing the hypothetical counter factual. GK and JL inspired the analysis and participated in the framing and writing of the final papers. NR undertook the statistical analysis of the state level data, prepared the graphical results, and participated in the writing of the papers. EB undertook the special regression analyses for the individual level relation between mortality and income and participated in the writing of the papers. MCW is guarantor.

Funding: MCW—Statistics Canada and Canadian Population Health Initiative; GK—University of Michigan Initiative on Inequalities in Health NR.

Competing interests: None declared.

Appendix

This appendix describes formally the construction of the hypothetical set of mortalities for a collection of geographic areas (in this case, US states), based on the assumption that 100% of the ecological relation is in fact artefactual. The empirical test of the artefact hypothesis is then based on comparison of this set of hypothetical mortalities with the observed mortality. If the two sets of mortalities are similar then the claim of an artefactual relation is reasonable. On the other hand, if there remain large differences then the claim is clearly wrong.

For convenience of notation, we have left out a subscript for sex, though sex was explicitly taken into account in all the empirical analysis. The following formulas set out the construction of the hypothetical mortality:

\(x\) = index for age, where 5 year age groups to 95 have generally been used

\(j\) = index for geographic area, in this case states in the US

\(i\) = index for income group

\(m_{xj}\) = mortality at age \(x\) for geographic area \(j\) (from CDC)

\(P^H_{ixj}\) = observed household population age \(x\) in area \(j\) and income group \(i\) (from 1990 census cross-tab)

\(P^H_{xj} = \sum_i P^H_{ixj}\) = observed household population age \(x\) in area \(j\)

\(P^I_{xj}\) = observed institutionalised population age \(x\) in area \(j\) (from 1990 census cross-tab)

\(P_{xj} = P^H_{xj} + P^I_{xj}\) = observed total population age \(x\) in area \(j\)

\(p^A_x = \sum_j P_{xj} / \sum_j \sum_x P_{xj}\) = proportion of total population age \(x\), all areas combined, so that \(\sum_x p^A_x = 1\)

\(amr_j = \sum_x m_{xj}\, p^A_x\) = average mortality for area \(j\), based on the national (reference) age structure

\(m^A_x = \sum_j P_{xj}\, m_{xj} / \sum_j P_{xj}\) = (weighted) average national mortality at age \(x\) for all areas combined

\(y_{ixj}\) = mean household income in income interval \(i\) for area \(j\) and age \(x\) (from 1990 census cross-tab)

\(\mu_{xj} = \sum_i P^H_{ixj}\, y_{ixj} / P^H_{xj}\) = mean household income for all household individuals in area \(j\) and age \(x\)

\(\beta\) = estimated coefficient (0.194 to be precise) of the relation in figure 1 of main text

\(RR_{ixj} = (y_{ixj}/\mu_j)^{-\beta}\) = "relative risk" of mortality in income group \(i\), area \(j\), and age \(x\)

\(m^*_{xj}\) = "expected" mortality at age \(x\) in area \(j\) if it were "simply" the combination of an overall (national) baseline mortality hazard \(m^A_x\), the area \(j\)-specific distribution of individual level incomes, and the relation between the national individual level income and mortality reflected by \(\beta\) and the resulting \(RR\) curve

\(m^*_{xj} = m^A_x \left[ P^I_{xj} + \sum_i P^H_{ixj}\, RR_{ixj} \right] / P_{xj}\)

\(amr^*_j = \sum_x m^*_{xj}\, p^A_x\)
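The age standardisation steps in these definitions (the national age structure pA_x, the national rates mA_x, and the standardised area rates amr_j) can be sketched in Python. The nested dictionary layout, the function names, and the two-state, two-age data are assumptions made purely for illustration; the same `standardised_rate` function applies unchanged to the hypothetical rates m*_xj.

```python
def national_age_structure(population):
    """pA_x: share of the combined population of all areas that is aged x.

    population[j][x] is the total population aged x in area j (hypothetical layout).
    """
    ages = list(next(iter(population.values())))
    totals = {x: sum(population[j][x] for j in population) for x in ages}
    grand_total = sum(totals.values())
    return {x: totals[x] / grand_total for x in ages}


def national_rates(population, mortality):
    """mA_x: population-weighted average mortality at age x over all areas."""
    ages = list(next(iter(population.values())))
    return {
        x: sum(population[j][x] * mortality[j][x] for j in population)
           / sum(population[j][x] for j in population)
        for x in ages
    }


def standardised_rate(rates_by_age, age_structure):
    """amr_j: age-specific rates averaged over the national age structure."""
    return sum(rates_by_age[x] * age_structure[x] for x in age_structure)


# Hypothetical populations and mortality rates for two areas and two age groups.
population = {"A": {"young": 600, "old": 400}, "B": {"young": 300, "old": 700}}
mortality = {"A": {"young": 0.001, "old": 0.010}, "B": {"young": 0.002, "old": 0.020}}

age_structure = national_age_structure(population)
reference_rates = national_rates(population, mortality)
amr = {j: standardised_rate(mortality[j], age_structure) for j in population}
print(age_structure, reference_rates, amr)
```

Weighting every area by the same national age structure removes differences in age composition, so that remaining differences between areas reflect age-specific mortality alone.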

  1. Wilkinson RG. Unhealthy societies: the afflictions of inequality. London: Routledge, 1996.
  2. Kaplan GA, Pamuk ER, Lynch JW, Cohen RD, Balfour JL. Inequality in income and mortality in the United States: analysis of mortality and potential pathways. BMJ 1996;312:999-1003 (see correction in BMJ 312:1253).
  3. Kennedy BP, Kawachi I, Prothrow-Stith D. Income distribution and mortality: cross sectional ecological study of the Robin Hood index in the United States. BMJ 1996;312:1004-7 (see correction in BMJ 312:1194).
  4. Lynch JW, Kaplan GA, Pamuk ER, Cohen RD, Heck K, Balfour JL, Yen IH. Income inequality and mortality in metropolitan areas of the United States. Am J Public Health 1998;88:1074-80.
  5. Gravelle H. How much of the relation between population mortality and unequal distribution of income is a statistical artefact? BMJ 1998;316:382-5.
  6. Duleep HO. Mortality and income inequality among economically developed countries. Social Security Bull 1995;58:34-50.
  7. Lynch JW, Kaplan GA. Understanding how inequality in the distribution of income affects health. J Health Psychol 1997;2:297-314.
  8. Wolfson M, Rowe G, Gentleman JF, Tomiak M. Career earnings and death: a longitudinal analysis of older Canadian men. J Gerontol Soc Sci 1993;48:S167-79.
  9. Backlund E, Sorlie PD, Johnson NJ. The shape of the relationship between income and mortality in the United States: evidence from the national longitudinal mortality study. Ann Epidemiol 1996;6:1-9.
  10. Fiscella K, Franks P. Poverty or income inequality as predictor of mortality: longitudinal cohort study. BMJ 1997;314:1724-7.
  11. Daly MC, Duncan GJ, Kaplan GA, Lynch JW. Macro-to-micro links in the relation between income inequality and mortality. Milbank Q 1998;76:315-39.
  12. Kennedy BP, Kawachi I, Glass R, Prothrow-Stith D. Income distribution, socioeconomic status, and self rated health in the United States: multilevel analysis. BMJ 1998;317:917-21.
  13. Townsend P, Davidson N, Whitehead M, eds. Inequalities in health: the Black report and the health divide. London: Penguin, 1988.
  14. Rogot E, Sorlie PD, Johnson NJ. Life expectancy by employment status, income, and education in the national longitudinal mortality study. Public Health Rep 1992;107:457-61.
  15. Preston SH. The changing relation between mortality and level of economic development. Popul Stud 1975;29:231-48.
  16. Wolfson MC. Divergent inequalities: theory and empirical results. Review of Income and Wealth 1997;43:401-21.
(Accepted 2 June 1999)


