
Primary Care

Patterns and distribution of tobacco consumption in India: cross sectional multilevel evidence from the 1998-9 national family health survey

BMJ 2004; 328 doi: (Published 01 April 2004) Cite this as: BMJ 2004;328:801
  1. S V Subramanian (svsubram{at}, assistant professor1,
  2. Shailen Nandy, PhD student2,
  3. Michelle Kelly, PhD student3,
  4. Dave Gordon, professor of social justice2,
  5. George Davey Smith, professor of clinical epidemiology4
  1. 1Department of Society, Human Development and Health, Harvard School of Public Health, 677 Huntington Avenue, KRESGE 7th floor, Boston MA 02115-6096, USA
  2. 2School of Policy Studies, University of Bristol, Bristol
  3. 3Social Science Research Unit, Institute of Education, University of London, London
  4. 4Department of Social Medicine, University of Bristol, Bristol
  1. Correspondence to: S V Subramanian
  • Accepted 10 February 2004


Abstract

Objective To investigate the demographic, socioeconomic, and geographical distribution of tobacco consumption in India.

Design Multilevel cross sectional analysis of the 1998-9 Indian national family health survey of 301 984 individuals in 92 447 households in 3215 villages in 440 districts in 26 states.

Setting Indian states.

Participants 301 984 adults (≥ 18 years).

Main outcome measures Dichotomous variable for smoking and chewing tobacco for each respondent (1 if yes, 0 if no) as well as a combined measure of whether an individual smokes, chews tobacco, or both.

Results Smoking and chewing tobacco are systematically associated with socioeconomic markers at the individual and household level. Individuals with no education are 2.69 times more likely to smoke and chew tobacco than those with postgraduate education. Households belonging to the lowest fifth of a standard of living index were 2.54 times more likely to consume tobacco than those in the highest fifth. Scheduled tribes (odds ratio 1.23, 95% confidence interval 1.18 to 1.29) and scheduled castes (1.19, 1.16 to 1.23) were more likely to consume tobacco than other caste groups. The socioeconomic differences are more marked for smoking than for chewing tobacco. Socioeconomic markers and demographic characteristics of individuals and households do not account fully for the differences at the level of state, district, and village in smoking and chewing tobacco, with state accounting for the bulk of the variation in tobacco consumption.

Conclusion The distribution of tobacco consumption is likely to maintain, and perhaps increase, the current considerable socioeconomic differentials in health in India. Interventions aimed at influencing change in tobacco consumption should consider the socioeconomic and geographical determinants of people's susceptibility to consume tobacco.


Introduction

Consumption of tobacco is a major risk factor for mortality.1 Recent shifts in global tobacco consumption indicate that an estimated 930 million of the world's 1.1 billion smokers live in developing countries,2 with 182 million in India alone.3 By 2020 tobacco consumption has been projected to account for 13% of all deaths in India.1 4

Smoking is not only associated with lung cancer5 but is also linked to cardiovascular diseases, tuberculosis, and chronic respiratory diseases.6 Although 20% of total tobacco consumption in India is through cigarette smoking,1 bidis (handrolled cigarettes that contain unprocessed tobacco) and hookahs are alternatives, with bidi smoking accounting for 40% of total tobacco consumption.1 3 Tobacco is also consumed, especially in India and South East Asian countries, through chewing (for example, paan masala, gutka, and mishri).7 8 Chewing tobacco is a risk factor for oral cancers.7 The annual incidence of oral cancer in men in India is estimated to be 10 per 100 000.9 Regardless of how tobacco is consumed, its adverse influence on disease and mortality among individuals and populations is clear.

Importantly, however, the distribution of tobacco consumption is not uniform. Tobacco consumption is often found to be disproportionately higher among lower socioeconomic groups.10 However, barring a few local studies,11 little systematic investigation has been done into how tobacco consumption is socioeconomically and geographically distributed in India. The gaps in tobacco consumption need to be examined to see which people are most likely to consume tobacco and which areas are more likely to have higher tobacco consumption. Such analyses are critical for designing policies and interventions aimed at achieving overall reductions in tobacco consumption at the population level and at reducing the inequalities in susceptibility to consume tobacco.

We investigate how tobacco consumption (in its smoking and smokeless form) is distributed across a range of demographic and socioeconomic markers at individual and household level in India. Conditional on this distribution we also estimate the extent to which the prevalence of tobacco consumption varies between localities, districts, and states.


Methods

Sources of data

The analysis was based on the representative, cross sectional 1998-9 national family health survey of 301 984 adults aged 18 and older, from 92 447 households in 26 Indian states.12 The household data, obtained through an interview based structured questionnaire answered by an available adult household member, provide a range of demographic and socioeconomic markers on all the members of the household, including information on smoking and chewing tobacco.12 All households are geographically referenced to the primary sampling unit, district, and state to which they belong. The primary sampling units (hereafter termed local areas) are villages or groups of villages for rural areas, and wards or municipal localities for urban areas. The response rate to the survey ranged from 89% to 100%; in 24 of the 26 states it exceeded 94%.

Outcome measures

Our analysis used two dichotomous outcomes, based on the responses to the questions: “Does ‘household member’ chew paan masala or tobacco?” and “Does ‘household member’ smoke?” In addition, we constructed a combined measure of participants who smoke, chew tobacco, or both in order to assess the distribution of consuming any tobacco. In our sample the overall prevalence for smoking was 18.4% and for chewing 21.0%; the combined prevalence was 32.9%. Table 1 provides the descriptive characteristics of the outcome and exposure measures in the sample population considered for analysis.
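As a minimal sketch of how the combined outcome relates to the two dichotomous outcomes, the following Python fragment codes it as 1 when a person smokes, chews, or both. The function name and the four example records are hypothetical and are not survey data. The last lines apply inclusion-exclusion to the three reported prevalences; the resulting overlap is only an arithmetic implication of rounded figures, assuming all three share the same denominator.

```python
def any_tobacco(smokes: int, chews: int) -> int:
    """Combined outcome: 1 if the person smokes, chews tobacco, or both; else 0."""
    return 1 if (smokes == 1 or chews == 1) else 0


# Hypothetical household members, each coded 1 (yes) or 0 (no) on the two questions.
records = [
    {"smokes": 1, "chews": 0},
    {"smokes": 0, "chews": 1},
    {"smokes": 1, "chews": 1},
    {"smokes": 0, "chews": 0},
]
combined = [any_tobacco(r["smokes"], r["chews"]) for r in records]

# Inclusion-exclusion on the reported prevalences (percentages):
# P(both) = P(smoke) + P(chew) - P(smoke or chew)
smoking, chewing, either = 18.4, 21.0, 32.9
implied_both = round(smoking + chewing - either, 1)
print(combined, implied_both)
```

Under those assumptions, roughly 6.5% of the sample would both smoke and chew.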

Table 1

Descriptive information on the individual sample considered for the analytical multilevel models from the 1998-9 Indian national family health survey, showing the frequency of different exposure variables along with the counts and prevalence of tobacco smoking, chewing, and smoking and chewing across different exposure variables. Values are numbers (%) of study participants for each variable


Exposure measures

At the individual level we considered age (treated as a continuous variable centred at its mean), sex, marital status, and educational attainment. At the household level we considered caste, religion, and a standard of living index based on material possessions. Caste status was based on the following mutually exclusive classification: scheduled caste, scheduled tribe, other backward caste, other caste, and no caste. “Scheduled caste” and “scheduled tribe” represent population groups identified by India's constitution as being marginal to the mainstream socioeconomic and political processes and, since 1951, eligible for affirmative action. “Other backward caste” is another grouping of populations that are identified as socially and educationally backward. Since 1990 the other backward castes, while not sharing the constitutional affirmative action rights of scheduled castes and tribes, have been legally defined and covered by other legislative measures. “Other caste” is a residual category of people who are not scheduled caste or tribe, or other backward caste; “no caste” represents population groups for whom caste is not applicable (Muslims, Christians, or Buddhists, for example) and participants who did not report any caste affiliation in the survey. In general, the “other caste” category is considered to have higher social status, with the government of India designating the scheduled caste or tribe and other backward caste as socially and economically disadvantaged.12

We divided religious affiliation into four categories: Hindu, Muslim, Christian, and other. We created the standard of living index for households from consumption and asset based material possessions, weighted for the proportion of each possession at the all India level.13 Since the standard of living index is a constructed measure it does not have an absolute interpretation. Consequently, it is more appropriate to use this measure in a categorical, hierarchically ordered fashion. We followed the convention of dividing the population into fifths of the standard of living index for our analysis. Households were also characterised by whether they were located in a large city (a population of 1 million or more), a small city (population of 100 000 or more but less than 1 million), a town (population of less than 100 000), or villages and rural areas.
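The division of a constructed index into ordered fifths can be sketched as follows. This is an illustration only: the function name, the tie-breaking rule, and the example scores are assumptions, and the survey's own index construction and weighting are not reproduced here.

```python
def assign_fifths(scores):
    """Return a fifth (1 = lowest, 5 = highest) for each household score.

    Households are ranked by score and split into five groups of near-equal
    size. Ties are broken by original order, an arbitrary choice made here.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    fifths = [0] * n
    for rank, idx in enumerate(order):
        fifths[idx] = rank * 5 // n + 1
    return fifths


# Hypothetical standard of living index values for ten households.
scores = [3, 41, 17, 8, 25, 30, 12, 50, 22, 5]
fifths = assign_fifths(scores)
print(fifths)
```

Because the index has no absolute interpretation, only the rank order of households matters for this grouping.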

Statistical approach

We applied multilevel statistical procedures14 to model the variation in tobacco consumption according to the different analytical levels.15 Specifically we estimated the effect of the demographic and socioeconomic markers on tobacco consumption (“fixed parameters”) and the variations in tobacco consumption in local areas, districts, and states that are not accounted for by individual and household demographic and socioeconomic markers. We calibrated a five level binary logistic model with a nested structure: 301 984 individuals (level 1) in 92 447 households (level 2) in 3215 local areas (level 3) in 440 districts (level 4) in 26 states (level 5). To calibrate models, we used the marginal quasi-likelihood (MQL) approximation with first order Taylor linearisation procedure. Model estimates are maximum likelihood based, derived by using the iterative generalised least squares algorithm, as implemented within the MLwiN program, version 1.10.0006.16
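One conventional way to write a five level random intercepts logit model with this nesting is shown below. The notation is an assumption made for illustration and is not taken from the paper: i indexes individuals, j households, k local areas, l districts, and m states; x denotes the demographic and socioeconomic markers; h, u, v, and f are the random effects at household, local area, district, and state level.

```latex
\[
\begin{aligned}
y_{ijklm} &\sim \operatorname{Bernoulli}(\pi_{ijklm}),\\
\operatorname{logit}(\pi_{ijklm}) &= \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ijklm}
  + h_{jklm} + u_{klm} + v_{lm} + f_{m},\\
h_{jklm} &\sim N(0,\sigma^2_{h}),\quad
u_{klm} \sim N(0,\sigma^2_{u}),\quad
v_{lm} \sim N(0,\sigma^2_{v}),\quad
f_{m} \sim N(0,\sigma^2_{f}).
\end{aligned}
\]
```

In this form the coefficients in β are the “fixed parameters”, and the four variances describe the variation between households, local areas, districts, and states that the markers do not account for.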


Results

Table 2 presents the odds ratios along with the 95% confidence interval derived from the fixed part of a multiple multilevel regression model calibrated for tobacco smoking, chewing, and smoking and chewing.

Table 2

Odds ratios with 95% confidence intervals from the fixed part of a multivariable five level binomial logit model that is calibrated for tobacco smoking, chewing, and smoking and chewing, conditional on random effects at the level of state, district, local area, and household


Smoking and demographic and socioeconomic markers

Age was positively associated with the probability of smoking; for a 10 year increase in age the odds of smoking increased by a factor of 1.16. Men were considerably more likely to smoke than women (odds ratio 19.69). Marital status was also predictive of smoking: single, widowed, and divorced or separated people were less likely to smoke (odds ratios 0.32, 0.88, and 0.93, respectively), although the association was weak and imprecisely estimated for divorced and separated people. We also observed religion based differences: Christians and the residual category of “other religion” were less likely to smoke than Muslims or Hindus. Caste status was also associated with smoking: in comparison to the other caste (the reference category), the scheduled tribe and scheduled caste were more likely to smoke. We observed a strong gradient between education and smoking, with the odds of being a smoker approximately three times higher in the educationally worst off group (illiterate people) than in the educationally best off group (people with postgraduate education). We observed a similar gradient between household standard of living and smoking, with individuals in the lowest fifth having an odds ratio of 2.5 for being smokers compared with the highest fifth. The prevalence of smoking was greater in rural areas (odds ratio 1.19) and towns (odds ratio 1.14) than in large cities.
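Because age enters the model as a continuous variable, an odds ratio quoted per 10 years can be rescaled to other age differences on the log odds scale. The sketch below uses only the reported value of 1.16 and assumes the effect is linear in the log odds across the whole age range, which the text does not confirm; the variable names are illustrative.

```python
import math

# Reported odds ratio for smoking per 10 year increase in age.
or_per_10_years = 1.16

# On the log odds scale the effect is additive, so it can be rescaled linearly.
log_or_per_year = math.log(or_per_10_years) / 10
or_per_year = math.exp(log_or_per_year)

# A 20 year difference corresponds to applying the 10 year odds ratio twice.
or_per_20_years = or_per_10_years ** 2

print(round(or_per_year, 4), round(or_per_20_years, 4))
```

Under this assumption, a 20 year age difference corresponds to an odds ratio of about 1.35.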

Chewing and demographic and socioeconomic markers

Age was positively associated with chewing (odds ratio 1.14 for a 10 year change), and men were more likely to chew than women (odds ratio 3.27). Single people were less likely to chew, but both widowed and separated or divorced people were more likely to chew than married people. Muslims were more likely to chew (odds ratio 1.15) and Christians and other religions less likely to chew (odds ratios 0.76 and 0.85, respectively) compared with Hindus. Scheduled caste and scheduled tribe groups were more likely to chew (odds ratios 1.15 and 1.11, respectively) than other caste. We observed a strong gradient between education and chewing; the odds of chewing in the educationally worst off group were 1.84 times those of people with postgraduate education. A similar household standard of living gradient was apparent for chewing: the odds of chewing in the lowest fifth were nearly twice those of the highest fifth. Chewing prevalence did not differ substantially between different types of urban and rural areas.

The pattern of demographic and socioeconomic inequalities for smoking and chewing combined was similar to that observed in the separate analyses of smoking and chewing. In general the socioeconomic gap in combined tobacco consumption was smaller than for smoking alone and larger than for chewing alone.

Distribution of tobacco consumption across local areas, districts, and states

Table 3 provides the variance estimates for each of the levels for two models—the first does not account for age, sex, marital status, education, religion, caste, household standard of living and urban or rural status; the second does account for these demographic and socioeconomic markers. The variance estimates in table 3 show that socioeconomic markers at the individual and household level do not entirely explain the differences in the prevalence of smoking, chewing, and their combined prevalence by local areas, districts and states. The differences between those geographical areas (especially at state level) increased once account was taken of the composition of the population residing in them. After accounting for individual or household demographic and socioeconomic markers, the bulk of the remaining variation lies at the household and state level. The other spatial levels of local areas and districts seem to matter less. Although the patterning of the multilevel variation was not substantially different across smoking, chewing, and smoking and chewing, the exact magnitude varied: the variation between states was largest for chewing, followed by smoking and chewing, and smoking.
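One common way to compare variance estimates across levels in a logit model is the latent variable formulation, in which the individual level variance is fixed at π²/3 (the variance of the standard logistic distribution). The sketch below illustrates that approach; it is not presented as the authors' calculation, and the variance values are hypothetical placeholders rather than the estimates in table 3.

```python
import math

# Variance of the standard logistic distribution, used as the individual
# level variance under the latent variable formulation (an assumption here).
LEVEL1_VARIANCE = math.pi ** 2 / 3


def variance_shares(variances):
    """Share of total latent scale variance at each level, individuals included."""
    total = LEVEL1_VARIANCE + sum(variances.values())
    shares = {level: v / total for level, v in variances.items()}
    shares["individual"] = LEVEL1_VARIANCE / total
    return shares


# Hypothetical variance estimates on the logit scale, NOT the values in table 3.
example = {"state": 0.40, "district": 0.05, "local area": 0.10, "household": 0.45}
shares = variance_shares(example)
for level, share in shares.items():
    print(f"{level}: {share:.1%}")
```

With values patterned this way, the household and state levels would carry most of the variation above the individual level, with local areas and districts contributing less.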

Table 3

Variance estimates from a five level binomial logit model, before and after adjusting for demographic and socioeconomic markers of individuals or households, at the level of the state, district, local areas, and household, for tobacco smoking, chewing, and smoking and chewing


Table 4 presents the predicted odds ratio for the different states after taking demographic and socioeconomic markers into account, using the all India prevalence as a reference. For smoking, Maharashtra has the lowest odds ratio (0.35); for chewing, Jammu (0.31); and for smoking and chewing, Punjab (0.35). Mizoram has the highest odds ratio for smoking, chewing, and smoking and chewing. The burden of tobacco consumption in India falls disproportionately on Mizoram17 and the other northeastern states.
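State level odds ratios of this kind are typically obtained by exponentiating each state's residual on the logit scale, so that a residual of 0 corresponds to the all India reference (odds ratio 1). The sketch below assumes that relation and uses the one value quoted in the text; the function names are illustrative.

```python
import math


def state_odds_ratio(residual):
    """Odds ratio relative to the all India average, from a logit scale residual."""
    return math.exp(residual)


def state_residual(odds_ratio):
    """Inverse mapping: the logit scale residual implied by an odds ratio."""
    return math.log(odds_ratio)


# Reported odds ratio for smoking in Maharashtra.
maharashtra_smoking_or = 0.35
residual = state_residual(maharashtra_smoking_or)
print(round(residual, 2))
```

An odds ratio below 1 therefore corresponds to a negative residual, that is, a state whose adjusted log odds of smoking lie below the all India average.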

Table 4

Odds ratios for smoking, chewing, and smoking and chewing for the different Indian states, based on the state level residuals estimated from a five level multiple binomial logit model


Figures 1 and 2 map the unconditional (crude) and conditional (model based) prevalence for combined tobacco consumption. In both maps a “natural break algorithm” was used to divide the states of India into four groups.18 A strong geographical pattern is evident in the prevalence for tobacco consumption both before (fig 1) and after (fig 2) controlling for demographic and socioeconomic markers at the individual level. In the northeastern states a greater proportion of the adult population smoke and chew tobacco than in the southern and western states of India. As is evident from figure 2, much but not all of the state variation in tobacco consumption observed in figure 1 is accounted for by the differences in the socioeconomic circumstances of the people who live in them and by other variations attributable to households, local areas, and districts.

Fig 1

Crude prevalence of adults aged 18 and above who smoke or chew tobacco in 1998-9, by Indian state. The term “crude” means unadjusted prevalence and is computed as number of individuals who smoke and chew tobacco divided by the total number of individuals, in each state, and expressed as percentages
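The crude prevalence described in this caption is a simple ratio per state. A minimal sketch follows; the state labels and counts are hypothetical and are not survey data.

```python
def crude_prevalence(consumers, total):
    """Unadjusted prevalence as a percentage of all individuals in a state."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * consumers / total


# Hypothetical (consumers, total adults) counts for two unnamed states.
counts = {"state A": (330, 1000), "state B": (120, 800)}
prevalence = {state: crude_prevalence(c, n) for state, (c, n) in counts.items()}
print(prevalence)
```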

Fig 2

Model based predicted proportion of adults aged 18 and above who smoke or chew tobacco in 1998-9 by Indian state after controlling for demographic and socioeconomic markers at the individual level and for variation in tobacco consumption between households, local areas, and districts. The term “model based” means conditional prevalence and is based on model based, residual, state level differences in smoking and chewing after accounting for between-individual differences in tobacco consumption that are due to age, sex, marital status, caste, religion, education, standard of living, and urban and rural differences, and after taking account of within-state variation attributable to the level of households, local areas, and districts, and expressed as percentages


Discussion

Tobacco consumption in India has a distinct socioeconomic and spatial distribution; the worse off population groups are at greater risk of consuming tobacco. Our analyses show the extent to which tobacco consumption is distributed across population subgroups and across states in India.

Differential socioeconomic and geographical susceptibilities to tobacco consumption

Men are more likely to consume tobacco than women. A strong gradient with regard to education and standard of living is apparent. Education and standard of living are inversely related to the probability of smoking and chewing; the gradient is stronger for smoking. The relation between these socioeconomic markers and tobacco consumption is similar to relations observed in developed countries.10 Further, the caste based differences in tobacco consumption show the persistent effect of caste as a key axis along which health and other wellbeing outcomes are stratified, over and above the adverse effects of low education and a low material standard of living. In addition, the distribution of tobacco consumption by marital status is contrary to what has been observed in developed countries, where marriage is seen to have a protective effect.10 Importantly, the large differences observed between states in tobacco consumption, even after controlling for the demographic and socioeconomic markers at individual or household level, highlight the potential importance of the state context in influencing this behaviour.19

Limitations of the study

In addition to the general limits to drawing causal inferences based on cross sectional, observational data, one caveat that is pertinent to our analysis relates to the extent to which the observed magnitude of socioeconomic gap reflects “actual” gaps compared with “reporting” gaps, especially since formal and informal social conventions related to tobacco consumption can influence reporting patterns.

Moreover, data were available only on overall tobacco smoking. Differences in the socioeconomic gradient in the use of manufactured cigarettes (higher among people with more education) and bidis (higher among people with less education) have been shown to exist.11 Smoking bidis is more strongly related to lung and oral cancer than smoking manufactured cigarettes,20 21 and the preponderance of bidi smoking among less advantaged socioeconomic groups will tend to exacerbate health inequalities.

What is already known on this topic

Tobacco consumption is a key adverse health influence in South Asia in general and India in particular

Little is known about how the consumption of tobacco is distributed in India

What this study adds

Tobacco consumption in India is heterogeneous, with higher prevalence rates observed for population groups with low social caste, education, and standard of living

The prevalence of smoking and tobacco chewing shows marked geographical differences (at the level of villages, districts, and states), even after controlling for the individual and household demographic and socioeconomic markers

States account for the bulk of spatial variation in tobacco consumption


The presence of socioeconomic and demographic gradients in tobacco consumption in India counters the perspective of “poor versus non-poor” that is often adopted when examining health differentials in societies that are labelled as poor or low income. Furthermore, the variations in the prevalence of these risk behaviours by states, even after accounting for the type of individuals and households that reside in these states, show the potential importance of the state context in influencing tobacco consumption. The state of Maharashtra took the first legislative steps to discourage tobacco consumption through prohibitions on use and sale,22 and this state has the lowest current overall smoking (table 4).

The current large socioeconomic and geographical gap in tobacco consumption in India is likely to feed into substantial, and perhaps increasing, socioeconomic differentials in the health of adults over future decades. A need exists to document and monitor such inequalities in tobacco consumption in India systematically, to understand their determinants better, and to provide an evidence base for public health interventions that takes account of differences in tobacco consumption at state level as well as between population groups.

See Editorial by Samarasinghe and Goonaratna

The research from which this paper borrows part of its data was commissioned and funded by the UK Department for International Development (DFID). The views expressed in this paper do not represent the official position of UK DFID. We acknowledge the support of Macro International ( for providing us access to use the data of the 1998-9 Indian national family health survey. We are also grateful for the extremely helpful comments by the BMJ editorial committee and an independent reviewer.


  • Contributors SVS conceived the study, analysed and interpreted the data, wrote and edited the manuscript, and is guarantor. SN, MK, and DG contributed to data preparation, interpretation of the results and to the editing of the manuscript. GDS contributed to the interpretation of the results and to the editing of the manuscript.

  • Funding None.

  • Competing interests None declared.

  • Ethical approval This research is based on a secondary analysis of a public dataset, with the survey being conducted by international agencies under a cooperative agreement with the national governments. Ethical approval has been granted by the ethics committee at the School for Policy Studies at the University of Bristol, which conforms to guidance of the Social Research Association (SRA) for secondary research of this nature.