CCBYNC Open access
Research Special paper

Heterogeneous effects of Medicaid coverage on cardiovascular risk factors: secondary analysis of randomized controlled trial

BMJ 2024; 386 doi: https://doi.org/10.1136/bmj-2024-079377 (Published 23 September 2024) Cite this as: BMJ 2024;386:e079377

  1. Kosuke Inoue, associate professor1 2,
  2. Susan Athey, professor3,
  3. Katherine Baicker, provost4,
  4. Yusuke Tsugawa, associate professor5 6
  1. Department of Social Epidemiology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
  2. Hakubi Center, Kyoto University, Kyoto, Japan
  3. Graduate School of Business, Stanford University, Stanford, CA, USA
  4. University of Chicago, Chicago, IL, USA
  5. Division of General Internal Medicine and Health Services Research, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
  6. Department of Health Policy and Management, UCLA Fielding School of Public Health, Los Angeles, CA, USA
  1. Correspondence to: K Inoue inoue.kosuke.2j{at}kyoto-u.ac.jp (on X: @ki_endoepi (Kosuke Inoue), @Susan_Athey (Susan Athey), @Yusuke_Tsugawa (Yusuke Tsugawa))
  • Accepted 1 August 2024

Abstract

Objectives To investigate whether health insurance generated improvements in cardiovascular risk factors (blood pressure and hemoglobin A1c (HbA1c) levels) for identifiable subpopulations, and to use machine learning to identify characteristics of people predicted to benefit highly.

Design Secondary analysis of randomized controlled trial.

Setting Medicaid insurance coverage in 2008 for adults on low incomes (defined as lower than the federally defined poverty line) in Oregon who were uninsured.

Participants 12 134 participants from the Oregon Health Insurance Experiment with in-person data for health outcomes for both treatment and control groups.

Interventions Health insurance (Medicaid) coverage.

Main outcome measures The conditional local average treatment effects of Medicaid coverage on systolic blood pressure and HbA1c using a machine learning causal forest algorithm (with instrumental variables). Characteristics of individuals with positive predicted benefits of Medicaid coverage based on the algorithm were compared with the characteristics of others. The effect of Medicaid coverage on blood pressure and HbA1c was calculated among individuals with high predicted benefits.

Results In the in-person interview survey, mean systolic blood pressure was 119 (standard deviation 17) mm Hg and mean HbA1c concentration was 5.3% (standard deviation 0.6%). Our causal forest model showed heterogeneity in the effect of Medicaid coverage on systolic blood pressure and HbA1c. Individuals with lower baseline healthcare charges, for example, had higher predicted benefits from gaining Medicaid coverage. Medicaid coverage significantly lowered systolic blood pressure (−4.96 mm Hg (95% confidence interval −7.80 to −2.48)) for people predicted to benefit highly. HbA1c was also significantly reduced by Medicaid coverage for people with high predicted benefits, but the size of the reduction was not clinically meaningful (−0.12% (−0.25% to −0.01%)).

Conclusions Although Medicaid coverage did not improve cardiovascular risk factors on average, substantial heterogeneity was noted in the effects within that population. Individuals with high predicted benefits were more likely to have no or low prior healthcare charges, for example. Our findings suggest that Medicaid coverage leads to improved cardiovascular risk factors for some, particularly for blood pressure, although those benefits may be diluted by individuals who did not experience benefits.

Introduction

Many countries aim to have financially sustainable universal health coverage by expanding public insurance to cover their population.1 Ample evidence shows that health insurance coverage improves financial risk protection and mental health, but its effect on physical health is less well understood. Several randomized controlled trials have investigated insurance coverage in the United States of America. The RAND health insurance experiment was conducted in the 1970s-80s with the primary aim of studying the price elasticity of demand for healthcare services and implications for health outcomes, but the effect of having health insurance itself was not studied.2 More recently, another study evaluated the effect of health insurance (including Medicaid) on mortality using randomized outreach by the Internal Revenue Service encouraging individuals to take up insurance coverage.3 Over the two years of follow-up, they observed a reduction in mortality for people enrolling in health insurance, but used administrative data that did not have a range of individual level health outcomes.

The Oregon health insurance experiment was launched in 2008 and examined the effects of Medicaid (a public health insurance programme for low income individuals) coverage on a wide range of outcomes, including healthcare use, mental and physical health outcomes, and financial strain.45 The research design allocated a limited number of Medicaid slots to low income adults using a lottery system. The results showed improvements in access to care and outcomes, including depression, but showed, on average, no evidence that Medicaid coverage improved physical health, including cardiovascular risk factors such as blood pressure and hemoglobin A1c (HbA1c) concentrations.5 Some studies using observational or quasi-experimental designs have found that Medicaid coverage is associated with an improved health status, including lower risk of mortality, but such studies are subject to confounding factors and omitted variable bias.678 The randomized controlled trial design used in the Oregon health insurance experiment eliminated such biases. However, some subgroups in the Oregon health insurance experiment might have had an improvement in cardiovascular risk factors, while the average treatment effect was diluted by other subgroups who did not benefit from Medicaid coverage. The absence of detectable effects from the average of the results might include clinically meaningful effects that are present only in subgroups of the studied population.

Recent rapid advancements in machine learning techniques have enabled nuanced estimation of how treatment effects vary based on individuals’ observable characteristics, so-called heterogeneous treatment effects.91011 Conventional stratified analyses split the sample on the basis of a small set of stratifying variables and test whether the interaction term between the exposure variable and stratifying variable is statistically significant. However, these novel techniques can identify complex heterogeneous effects across many potentially intertwined variables that are not shown by the conventional stratified analysis.12 Although prior Oregon health insurance experiment studies did not find heterogeneous treatment effects in the average effects of Medicaid coverage on cardiovascular risk factors based on a limited number of variables,513 heterogeneous effects might be identified when the complex interplay of numerous covariates is accounted for. By examining such heterogeneity across subgroups, this study seeks to provide a more nuanced understanding of how Medicaid coverage might influence cardiovascular risk factors.

In this context, this study applies recently developed machine learning based methods to assess whether a subpopulation can be identified for whom Medicaid coverage substantially improves health outcomes. Using data from the Oregon health insurance experiment, we assessed the heterogeneity in the effect of Medicaid coverage on health outcomes, such as systolic blood pressure and HbA1c. By applying the machine learning causal forest algorithm, we delineated the characteristics of individuals with high and low predicted health benefits from Medicaid coverage. We then evaluated the effect of Medicaid coverage on blood pressure and HbA1c for people predicted to benefit highly, compared with the effect of coverage for the population overall.

Materials and methods

The Medicaid programme in the United States

Health insurance coverage in the United States is available in multiple forms. More than half of Americans are covered by private health insurance, with public programmes such as Medicare and Medicaid covering much of the remaining population, but almost 10 percent of the population remaining uninsured. Medicare is a federal programme that provides health coverage for individuals aged 65 years or older and younger people with disabilities. Medicaid is a joint programme between the federal and state governments that offers health coverage primarily to low income individuals. The programme covers a wide array of healthcare services, including inpatient care, outpatient visits, and prescription medications, although some variations exist across states. As of 2023, nearly 80 million people were enrolled in the Medicaid programme in the United States.

Study sample

We examined data from the Oregon health insurance experiment, which is a randomized controlled study investigating the effect of Medicaid coverage in the state of Oregon, USA.45 This study leveraged the random assignment of access to Medicaid insurance coverage in 2008 for low income adults (defined as less than the federal poverty line5) in Oregon who were uninsured. Additional details on the study design are documented elsewhere.45 Individuals randomly selected from a waitlist for the programme were permitted to apply for Medicaid. In-person interviews and biometric data were collected from the treatment (selected in the lottery) and control (not selected) groups between 31 August 2009 and 13 October 2010. This survey included questions on medical service usage, health insurance status, and medication details. In addition, blood pressure measurements and blood samples were collected from participants. Of a total of 12 229 participants who responded to the survey (effective response rate, 73%), this study included 12 134 individuals for whom outcome data were available. The protocol for this study was approved by the institutional review board at the University of California, Los Angeles, USA (institutional review board number 24-000623). The Oregon health insurance experiment has received approvals from several institutional review boards, and all participants provided written consent during the in-person survey. The Oregon health insurance experiment was registered at the American Economic Association’s registry for randomized controlled trials (registration number AEARCTR-0000028).

Study variables

We used whether an individual was randomly selected to apply for Medicaid coverage as an instrumental variable to estimate the local average treatment effect of Medicaid coverage on blood pressure and HbA1c measured at the in-person survey. This local effect corresponded to the average treatment effect for individuals who were able to enrol in Medicaid through the lottery assignment. After a 5 minute sitting period, blood pressures were measured three times, 30 seconds apart, and the average was calculated. HbA1c was measured from blood samples collected in the in-person survey.14 More details are described in the Oregon health insurance experiment protocol.45
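
As an illustration of the instrumental variable logic described above, the following is a minimal sketch of the unadjusted Wald (ratio) estimator of a local average treatment effect. It is not the authors' code: the function name `wald_late`, the variable names, and the toy numbers are all hypothetical, and the sketch omits the adjustment for household size used in the study.

```python
from statistics import mean


def wald_late(y, w, z):
    """Unadjusted Wald (ratio) estimator of the local average treatment effect.

    y: outcomes (eg, systolic blood pressure in mm Hg)
    w: 1 if the person was actually covered by Medicaid, else 0
    z: 1 if the person was selected in the lottery (the instrument), else 0
    """
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    w1 = [wi for wi, zi in zip(w, z) if zi == 1]
    w0 = [wi for wi, zi in zip(w, z) if zi == 0]
    itt_outcome = mean(y1) - mean(y0)   # intention-to-treat effect on the outcome
    first_stage = mean(w1) - mean(w0)   # effect of lottery selection on coverage
    if first_stage == 0:
        raise ValueError("the instrument does not shift coverage; LATE is undefined")
    return itt_outcome / first_stage


# Made-up toy data (not study data): four lottery winners, four controls.
z = [1, 1, 1, 1, 0, 0, 0, 0]
w = [1, 1, 0, 0, 0, 0, 0, 0]            # half of the winners enrol; no controls do
y = [116, 118, 122, 124, 120, 122, 122, 124]
late = wald_late(y, w, z)               # ITT of -2 mm Hg divided by a take-up of 0.5
```

In this toy example the intention-to-treat difference is scaled up by the share of lottery winners who enrolled, which is why the local effect is larger in magnitude than the raw difference between the lottery and control groups.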

The pretreatment variables collected by self-report in the Oregon health insurance experiment included age (years), sex (female or male), race and ethnic group (Hispanic, non-Hispanic black, non-Hispanic white, or other (Asians, Native Hawaiian or Pacific Islander, and other)), education status (less than high school, high school or general educational development, or college or above), and diagnoses before the lottery (hypertension, diabetes, high cholesterol, asthma, heart attack, congestive heart failure, emphysema/chronic obstructive pulmonary disease, kidney failure, cancer, and depression). Data regarding prior healthcare charges (ie, total charges and emergency department charges) were sourced from individual visit records during the prerandomization period from 1 January 2007 to 9 March 2008. More details have previously been published.15 Missing data for these covariates at baseline were imputed using a random forest approach.16

Statistical analyses

We built the causal forest algorithm with an instrumental variable regression (ie, instrumental variable forests; instrumental_forest function in grf package in R) to evaluate the heterogeneity in the treatment effect of Medicaid coverage on systolic blood pressure and HbA1c.17 In the instrumental variable forests, we estimated the conditional local average treatment effect for each individual i by taking a ratio of two weighted averages. To calculate the weights, we used a data driven method designed to give more weight to observations with similar treatment effects while avoiding overfitting. We drew 2000 subsamples of the data, and we further divided each subsample into two parts. In the first part of each subsample, we constructed a partition of the data based on observable baseline characteristics. The partition was selected to maximize heterogeneity in the conditional local average treatment effect across elements of the partition. In the second part of each subsample, we recorded which observations were assigned to the same element of the partition that would be assigned to observation i’s baseline characteristics. The weight a given observation received in the final estimation of the conditional local average treatment effect for individual i was equal to the number of times that the observation fell in the second part of a subsample and was assigned to the same element of the partition as individual i. This sample splitting approach to constructing weights ensured that the outcome and treatment assignments of one unit were not used to determine how much that unit was weighted in estimating the conditional local average treatment effect at a particular i. The conditional local average treatment effect for an individual i can be interpreted as the local average treatment effect of Medicaid coverage on systolic blood pressure or HbA1c conditional on baseline characteristics for each individual. In the context of this study, it represented the expected change in an individual's blood pressure and HbA1c after a year if they enrolled in the Medicaid programme, where the expectation was taken over individuals with i’s observable baseline characteristics and with (potentially unobservable) characteristics that would lead them to choose to enrol in Medicaid under the expanded eligibility.
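
The "ratio of two weighted averages" can be sketched as follows, assuming the forest weights for one target individual are already available. This is a hedged illustration rather than the grf implementation: `weighted_clate` and the hand-supplied `alpha` values are hypothetical, and in practice the weights come from how often each observation shares a partition element with the target individual across subsamples.

```python
def weighted_clate(y, w, z, alpha):
    """Weighted instrumental variable ratio at one target point.

    y, w, z: outcome, coverage indicator, and lottery indicator per observation
    alpha:   non-negative weight of each observation for the target individual
    """
    total = sum(alpha)
    if total == 0:
        raise ValueError("all weights are zero")
    a = [aj / total for aj in alpha]                  # normalise weights to sum to 1
    y_bar = sum(aj * yj for aj, yj in zip(a, y))      # weighted means
    w_bar = sum(aj * wj for aj, wj in zip(a, w))
    z_bar = sum(aj * zj for aj, zj in zip(a, z))
    # Numerator: weighted covariance between instrument and outcome.
    num = sum(aj * (zj - z_bar) * (yj - y_bar) for aj, zj, yj in zip(a, z, y))
    # Denominator: weighted covariance between instrument and coverage.
    den = sum(aj * (zj - z_bar) * (wj - w_bar) for aj, zj, wj in zip(a, z, w))
    if den == 0:
        raise ValueError("weighted first stage is zero; the ratio is undefined")
    return num / den


# Made-up toy data (not study data).
z = [1, 1, 1, 1, 0, 0, 0, 0]
w = [1, 1, 0, 0, 0, 0, 0, 0]
y = [116, 118, 122, 124, 120, 122, 122, 124]
equal_weights = [1] * 8                 # every observation counts equally
local_weights = [3, 2, 1, 0, 2, 2, 1, 1]  # hypothetical neighbourhood of one person
overall = weighted_clate(y, w, z, equal_weights)
local = weighted_clate(y, w, z, local_weights)
```

With equal weights and a binary instrument, the ratio reduces to the ordinary Wald estimate for the whole sample; unequal weights localise the same ratio around the target individual's baseline characteristics.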

To assess the calibration of our instrumental variable forest model, individuals were categorized on the basis of fifths of the predicted conditional local average treatment effect, and then the local average treatment effect was estimated for each quintile group. When ranking individuals on the basis of estimated conditional local average treatment effect, we applied a cross fitting approach.18 In this approach, we assigned individual i in fold k to a quintile (or decile) on the basis of the conditional local average treatment effect estimated from all folds other than k (ie, τ^{–k(i)}(Xi), where {–k(i)} represents the instrumental variable forest model that was fit with the other folds), and then estimated the local average treatment effect for the assigned ranking group using data from fold k. By using this approach, we did not use the same data to determine how individuals were ranked and to evaluate the difference across the ranking groups. Ideally, a well calibrated instrumental variable forest model should yield a plot where the group specific local average treatment effects consistently increase in alignment with the conditional local average treatment effect quintiles. We assessed the importance of each variable by calculating a weighted total of its occurrences at every depth in the instrumental variable forest. More details on the instrumental variable forest are shown in the supplementary methods and elsewhere.17
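
The cross fitting step can be sketched as below. This is an assumption-laden illustration, not the study code: `cross_fit_groups` and `toy_fit` are hypothetical names, `toy_fit` is a stand-in for fitting the instrumental variable forest on the other folds, and ranking within each fold is an assumed design choice.

```python
import random


def cross_fit_groups(x, fit, n_folds=5, n_groups=5, seed=0):
    """Assign each unit to a ranking group using out-of-fold predictions.

    x:   list of covariate records
    fit: callable taking training indices and returning a predict(record)
         function (stand-in for a model fit without the held-out fold)
    Returns (fold, group); group 0 holds the lowest predicted effects in a fold.
    """
    n = len(x)
    order = list(range(n))
    random.Random(seed).shuffle(order)          # random fold assignment
    fold = [0] * n
    for pos, i in enumerate(order):
        fold[i] = pos % n_folds
    group = [0] * n
    for k in range(n_folds):
        train = [i for i in range(n) if fold[i] != k]
        held_out = [i for i in range(n) if fold[i] == k]
        predict = fit(train)                    # the model never sees fold k
        scores = {i: predict(x[i]) for i in held_out}
        ranked = sorted(held_out, key=lambda i: scores[i])
        for rank, i in enumerate(ranked):
            group[i] = rank * n_groups // len(ranked)
    return fold, group


def toy_fit(train_indices):
    """Hypothetical stand-in model: predicted effect rises with prior charges."""
    centre = sum(charges[i] for i in train_indices) / len(train_indices)
    return lambda record: record - centre


charges = list(range(50))                       # made-up prior charges, one per person
fold, group = cross_fit_groups(charges, toy_fit)
```

The group specific local average treatment effects would then be estimated within each ranking group, so that the data used to rank a person are never the data used to evaluate that person's group.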

Then, we estimated the individualized treatment effect (ie, conditional local average treatment effect) of Medicaid coverage on systolic blood pressure and HbA1c using the instrumental variable forest model. We used the estimated conditional local average treatment effect to categorize individuals into two groups for each of the two outcomes, systolic blood pressure and HbA1c: people who were estimated to have a high benefit (conditional local average treatment effect <0) and people estimated to have a low benefit (conditional local average treatment effect ≥0). After comparing the demographic characteristics of the high and low benefit groups,19 we calculated the local average treatment effect among individuals in the high benefit group and compared it with the local average treatment effect among the overall population. In addition, we plotted the local average treatment effect against the cumulative fraction of participants ranked by the predicted conditional local average treatment effect.
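
The grouping rule and the cumulative ranking can be sketched as follows, assuming a list of predicted effects is already available. The function names and the numbers are hypothetical; a negative predicted effect means a predicted reduction in blood pressure or HbA1c.

```python
def split_by_predicted_benefit(clate):
    """Label each person 'high' (predicted effect < 0) or 'low' (>= 0)."""
    return ["high" if t < 0 else "low" for t in clate]


def top_fraction(clate, fraction):
    """Indices of the given fraction of people with the most negative predictions."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    order = sorted(range(len(clate)), key=lambda i: clate[i])
    n_keep = max(1, round(fraction * len(clate)))
    return order[:n_keep]


# Made-up predicted effects (not study estimates).
predicted = [-5.0, 1.2, -0.3, 0.0, -2.1, 3.4, -7.8, 0.9, -1.0, 2.2]
labels = split_by_predicted_benefit(predicted)
top_30 = top_fraction(predicted, 0.3)
```

The local average treatment effect would then be re-estimated by instrumental variable regression within each selected subset, for example within the high benefit group or within each cumulative fraction used for the plot.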

Additional analyses

We conducted six additional analyses. (1) To test whether our findings could be affected by individuals who were already treated for hypertension and diabetes before the study began, we reanalysed the data restricting to people with no diagnosis of hypertension or diabetes at baseline. (2) We compared average treatment effect in the high benefit group with average treatment effect in the overall population using intention-to-treat analysis. (3) We calculated the effect of Medicaid coverage on healthcare use (the number of prescription drugs and office visits in the past year) among individuals in the high benefit group and the overall population. (4) We repeated the analysis to assess the heterogeneity in the effect of Medicaid coverage on diastolic blood pressure and total cholesterol concentrations. (5) To understand the characteristics of individuals who enrolled in the Medicaid programme, we fitted a logistic regression model to predict Medicaid enrolment (the “take-up”) among individuals who won the lottery. We then compared the baseline characteristics according to the quintiles of the predicted take-up rates derived from this regression model. (6) We evaluated whether individuals who benefited from Medicaid coverage had limited access to healthcare at baseline, by calculating counterfactual charges (ie, counterfactual charges if individuals were covered by the Medicaid programme and had access to healthcare) for each individual. To do so, we first fitted the zero inflated negative binomial model to predict total charges during the study period using the data restricted to individuals with Medicaid coverage (ie, individuals who won the lottery and enrolled in the Medicaid programme). Using this prediction model, we calculated the counterfactual charges for all individuals at baseline. We then compared the differences in predicted (counterfactual) versus observed total charges between the high benefit group versus others.

Local average treatment effects were estimated with instrumental variable regression, and their robust 95% confidence intervals (CIs) were obtained by repeating the analysis on 1000 bootstrapped samples. We adjusted all analyses for the number of household members on the lottery list as originally conducted because selection for the programme was random and conditional on household size. For each categorical variable (ie, race and ethnic group and education status), we created dummy variables and included them in the model. P values were adjusted for multiple comparisons using the Benjamini-Hochberg procedure.20 All statistical analyses were conducted using R, version 4.1.1 (R Project for Statistical Computing).
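
Two of the steps above, the bootstrap confidence interval and the Benjamini-Hochberg adjustment, can be sketched in a few lines. This is an illustration under stated assumptions: the percentile method and simple resampling of individual rows are assumed here (the text does not specify the bootstrap variant), and the function names are hypothetical.

```python
import random


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # ascending P values
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):                          # largest P value first
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def bootstrap_percentile_ci(rows, statistic, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for statistic(rows) (assumed variant)."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = []
    for _ in range(n_boot):
        sample = [rows[rng.randrange(n)] for _ in range(n)]  # resample with replacement
        estimates.append(statistic(sample))
    estimates.sort()
    lo_idx = int((1 - level) / 2 * (n_boot - 1))
    hi_idx = int((1 + level) / 2 * (n_boot - 1))
    return estimates[lo_idx], estimates[hi_idx]


# Made-up inputs (not study results).
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
lo, hi = bootstrap_percentile_ci([1.0, 2.0, 3.0, 4.0, 5.0],
                                 lambda s: sum(s) / len(s), n_boot=200)
```

In an analysis like the one described, `statistic` would be the instrumental variable estimate computed on each resampled dataset rather than a simple mean.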

Patient and public involvement

Our study was a post hoc analysis and did not include patients as study participants. The original Oregon health insurance experiment was done when patient and public involvement in research design was uncommon. Participants did not take part in shaping the research question, defining the outcome measures, or designing the study itself. No direct patient and public involvement was sought because the random assignment of access to Medicaid insurance coverage was not conducted for research purposes and the analysis required specialised training.

Results

A total of 12 134 individuals on low incomes met the inclusion criteria. Of these individuals, 6338 were assigned to the lottery group and 5796 to the control group. Baseline characteristics were balanced between the two groups (table 1). Results of the in-person interview survey showed mean systolic blood pressure was 119 mm Hg (standard deviation 17) and mean HbA1c was 5.3% (0.6%). Medicaid coverage was not associated with significant changes in systolic blood pressure (−0.62 (95% CI −3.16 to 1.73), P=0.62) and HbA1c (0.00% (−0.10% to 0.10%), P=0.96).

Table 1

Demographic characteristics of the control and lottery groups

Heterogeneity in the effect of Medicaid coverage on blood pressure and HbA1c

The instrumental variable forest models for systolic blood pressure and HbA1c were well calibrated and showed significant heterogeneity in the effect of Medicaid coverage (supplementary figure A). In both models, age, sum of total charges, and sum of baseline charges relating to the emergency department were identified as important variables (defined by a weighted sum of the number of splits in the forest models) for heterogeneous treatment effects (supplementary figure B). Individuals predicted to benefit highly (ie, conditional local average treatment effect <0) for systolic blood pressure (n=9158) or HbA1c (n=7212) were less likely to have a history of hypertension diagnosis and had lower total and emergency department charges at baseline than those with lower predicted benefit (table 2, supplementary tables A and B). We also observed a higher prevalence of Hispanic individuals in the high benefit group than in the low benefit group. This distribution of racial and ethnic groups was at least partially attributable to the correlation between race and ethnicity and total charges at baseline; Hispanic individuals showed lower total charges at baseline than did other racial and ethnic groups (supplementary table C).

Table 2

Demographic characteristics of high benefit groups for systolic blood pressure and glycated hemoglobin (HbA1c)

The effect of Medicaid coverage among individuals predicted to benefit highly

Individuals with high predicted benefit for systolic blood pressure showed a greater reduction in systolic blood pressure because of Medicaid coverage compared with the overall population (−4.96 mm Hg v −0.62 mm Hg, adjusted difference −4.34 mm Hg (95% CI −6.04 to −2.74), P<0.001 (table 3)). For this group, Medicaid coverage would also result in a greater average reduction in diastolic blood pressure (−3.91 mm Hg v −1.00 mm Hg, −2.91 mm Hg (−4.10 to −1.79), P<0.001).

Table 3

Overall population versus high benefit group for the effect of Medicaid coverage on blood pressure and glycated hemoglobin (HbA1c)

Similarly, individuals with high predicted benefit for HbA1c showed a slight reduction in HbA1c by Medicaid coverage compared with the overall population; however, the effect was not clinically meaningful (−0.12% v 0.00%, adjusted difference −0.12% (95% CI −0.22% to −0.04%), P=0.009 (table 3)).

When we computed the group specific effects for scenarios among individuals based on the estimated individualized effect (ie, conditional local average treatment effect) from the instrumental variable forest model, we found that Medicaid coverage led to greater reductions in systolic blood pressure and HbA1c for individuals with a larger conditional local average treatment effect than it did for the overall population (fig, supplementary table D, supplementary figure C).

Figure

Change in systolic blood pressure and HbA1c by Medicaid coverage according to predicted benefits. The x axis shows the coverage population of Medicaid based on the ranking of the predicted benefits (ie, conditional local average treatment effect), and the y axis shows the estimated effect among those populations. For example, among people with the top 30th centile of estimated benefits, the estimated reduction by Medicaid in systolic blood pressure was 6.76 (95% confidence interval 2.60 to 11.55) and in HbA1c was 0.28% (95% confidence interval 0.07% to 0.50%). We did not calculate change in outcomes for the scenario among individuals in the top 10th centile owing to small sample size and insufficient statistical power. CI=confidence interval; HbA1c=haemoglobin A1c

Six additional analyses

(1) Our findings remained qualitatively unaffected when we restricted our analysis to individuals without hypertension or diabetes diagnosis at baseline (supplementary table E). (2) The results were also consistent when we compared average treatment effect between the high benefit group for conditional local average treatment effect and the overall population using an intention-to-treat analysis (supplementary table F). (3) We found an increase in the number of prescription drugs and office visits in the past year by Medicaid coverage in both the high benefit group for systolic blood pressure and the overall population, but not in the high benefit group for HbA1c (supplementary table G). We noted no evidence that changes in prescription drugs and office visits differ between the overall population versus individuals in the high benefit group. (4) Consistent with our main results, we observed the heterogeneity in the effect of Medicaid coverage on diastolic blood pressure and total cholesterol concentrations (supplementary figure D), and individuals predicted to highly benefit for these outcomes were likely to have lower prior healthcare charges at baseline compared with others (supplementary tables H and I). (5) People with higher take-up rates were more likely to be female, non-Hispanic white, and have greater total and emergency department charges at baseline than those with lower take-up rates (supplementary table J). (6) The difference between the predicted (counterfactual) and observed total charges was larger for the high benefit group than for the low benefit group (US$3837 (£2951, €3464) v −$91; P<0.001), suggesting that individuals who benefit the most from Medicaid coverage were those who did not have access to healthcare before Medicaid coverage.

Discussion

Principal findings

In our post hoc analysis of Oregon health insurance experiment data using machine learning causal forest models, we found heterogeneity in the effects of Medicaid coverage on systolic blood pressure. While effects of Medicaid coverage were not detected on average, some individuals showed improvements in systolic blood pressure from Medicaid coverage, and these individuals were likely to have no or low prior healthcare charges at baseline. A similar pattern was observed for HbA1c, but the estimated effects were smaller and not of clinical significance. These findings suggest that Medicaid coverage leads to improvement of blood pressure for some people, but that in assessments of the Oregon health insurance experiment study population overall, the benefit for these people was diluted by individuals who did not benefit from Medicaid coverage.

Policy implications

Our findings suggest that the null average effect of Medicaid coverage on cardiovascular risk factors, as observed in the original Oregon health insurance experiment study, may obscure significant benefits for some subgroups. In particular, we observed a clinically meaningful reduction in systolic blood pressure of approximately 5 mm Hg, which is large enough to lower the risk of health outcomes such as cardiovascular diseases and mortality, and equivalent to that achieved through lifestyle interventions.21 This effect size is 10 times larger than the point estimate of the average treatment effect (−0.52 mm Hg) observed in the original Oregon health insurance experiment. The experiment’s result was not only statistically insignificant but also too small to be a clinically meaningful change, even if statistical significance had been achieved with a larger sample size. A smaller change observed in HbA1c than in blood pressure may be explained in part by the fact that only a small number of the study participants had increased HbA1c concentrations (only 5.1% had HbA1c concentrations of ≥6.5% and 3.3% had levels of ≥7.0%), compared with 16.3% of participants exhibiting elevated blood pressure. It is also important to note that our estimates had a wide confidence interval that might reflect the varied responses of individuals to improved healthcare access through Medicaid coverage, such as differences in medication adherence and follow-up visits.

Possible explanations

Medicaid coverage could improve cardiovascular risk factors through several potential mechanisms. Firstly, insurance coverage facilitates access to healthcare, enabling beneficiaries to consult healthcare professionals, get beneficial care, and more easily adhere to prescribed treatments. This hypothesis is supported by the original Oregon health insurance experiment study findings that Medicaid coverage increased outpatient care use, rates of people admitted to hospital, and prescription medication usage by 15-35%.45 Secondly, insurance coverage reduces out-of-pocket healthcare expenses,22 which could allow beneficiaries to redirect their financial resources towards other health promoting activities, such as purchasing nutritious foods and engaging in physical exercise. Lastly, a greater sense of security provided by health coverage might reduce stress,5 which may in turn improve physical health.2324

Individuals with larger predicted reductions in systolic blood pressure tended to have lower healthcare charges at baseline than did those with lower predicted health benefits. Although the exact underlying mechanisms are unclear, our findings suggest that individuals with low healthcare charges at baseline had limited access to healthcare before receiving Medicaid coverage, and therefore had a large health benefit with the increased access to care that came with Medicaid coverage. By contrast, individuals who had access to healthcare services before Medicaid coverage might not have changed their care patterns, and thereby the outcomes, as much.

Methodological implications

Harnessing recently developed methodological tools, we were able to detect heterogeneity in the effects of Medicaid coverage that had not been previously shown. Traditionally, randomized controlled trials, including the Oregon health insurance experiment, have assessed heterogeneity through stratified analyses based on a priori hypotheses. However, traditional stratified analysis does not consider complex functional forms or interaction effects among baseline characteristics when analysing how the effect of Medicaid coverage varies across individuals. Therefore, the original Oregon health insurance experiment did not identify clinically meaningful heterogeneous treatment effects.5 By using the causal forest method in this post hoc analysis of the Oregon health insurance experiment, our study is the first to identify subgroups (based on multiple characteristics) who had lower blood pressure associated with Medicaid coverage. We showed that these subgroups were likely to have lower healthcare charges before Medicaid coverage. Unfortunately, the sample size in this study is insufficient to answer questions about treatment effect heterogeneity for each baseline characteristic individually; rather, further prospective studies designed to assess treatment effect heterogeneity could elucidate such questions.

Social perspectives

We found that although Medicaid coverage improved blood pressure across all racial and ethnic groups, the likelihood of enrolling in the Medicaid programme on eligibility (ie, the take-up rate) was lower among Hispanic individuals compared with non-Hispanic white individuals. This discrepancy could be due, for example, to language barriers or information barriers to applying, but further research is warranted to better understand the underlying mechanisms of this difference and to identify interventions that could mitigate potential barriers. It is also important to note that algorithm based approaches have the potential to exacerbate disparities if the training data are biased or used inappropriately.25

Strengths and limitations of this study

Our study has limitations. Firstly, the causal forest model evaluated heterogeneity based on measured covariates, and other unmeasured characteristics may also be important. Since we did not have baseline information on lifestyle factors such as smoking and alcohol intake, obesity status, mental health status, pregnancy, and family history of diseases, we did not assess heterogeneity based on these variables. Secondly, because baseline characteristics were self-reported, our findings might be affected by measurement error and misclassification bias, although these should not have differed between the treatment and control groups. Thirdly, our study participants had an average coverage duration of approximately 17 months, and patterns over a longer follow-up might differ.5 Fourthly, we examined only limited health outcomes. Additional examination of heterogeneity in the effect of Medicaid coverage on other clinical outcomes (eg, cardiovascular disease, cancer, infectious disease, Alzheimer’s disease, and mortality) would be informative. Fifthly, although the Oregon health insurance experiment collected data for whether each participant lived in a metropolitan statistical area at baseline, we were not able to assess heterogeneity by such geographical locations because almost all participants in this study had a zip code of residence within a metropolitan statistical area. Sixthly, although we calculated the conditional local average treatment effects for each fold using an algorithm that excluded observations from that specific fold, future research should focus on exploring the uncertainties associated with these estimated conditional local average treatment effects. Our findings need to be validated in external databases, which would provide a more comprehensive understanding and ensure the robustness of our results. 
Moreover, because instrumental variable methods estimate the effect among compliers only, our findings may not be generalizable to populations with different patterns of compliance with the intervention (ie, people who do not comply). Lastly, the insurance examined in this study was from the Medicaid programme, which is a public health insurance programme for low income individuals in the US. Our findings may not be generalizable to other types of insurance such as private insurance plans. Because we conducted this study using data from the state of Oregon in the United States, our findings may also not be generalizable to other states or countries.
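To make the complier caveat concrete, the following is a minimal sketch of the standard Wald (ratio) instrumental variable estimator on synthetic data. It is not the study's analysis code. The complier share of 25%, the effect of -5, the outcome scale, and the absence of always-takers are hypothetical assumptions. The randomized offer `z` acts as the instrument and `d` indicates actual coverage.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Synthetic data only. z: randomized offer of coverage (the instrument).
z = rng.integers(0, 2, n)

# Assumed complier share of 25% (hypothetical). In this toy setup there are
# no always-takers: coverage occurs only for compliers who receive the offer.
complier = rng.random(n) < 0.25
d = ((z == 1) & complier).astype(float)

# Assumed effect of coverage on the outcome among those covered.
effect_among_compliers = -5.0
y = 125.0 + effect_among_compliers * d + rng.normal(0.0, 15.0, n)

# Intention-to-treat effect: effect of the offer on the outcome.
itt = y[z == 1].mean() - y[z == 0].mean()

# First stage: effect of the offer on coverage (the take-up difference).
first_stage = d[z == 1].mean() - d[z == 0].mean()

# Wald estimator: local average treatment effect among compliers.
late = itt / first_stage

print(f"ITT: {itt:.2f}, first stage: {first_stage:.3f}, LATE: {late:.2f}")
```

The ratio recovers the effect among compliers only. Nothing in the data identifies what the effect would have been for people who would not take up coverage when offered, which is why generalization to populations with different compliance patterns is uncertain.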

Conclusions

Although Medicaid coverage did not improve cardiovascular risk factors on average, we found substantial heterogeneity in the effects within the study population. Individuals with high predicted benefits were more likely to have no or low prior healthcare charges at baseline, for example. Our findings suggest that expanding Medicaid coverage may lead to important health benefits for some identifiable subpopulations even when there is limited average benefit across the population overall.

Summary box

What is already known on this topic

  • Although expanding health insurance coverage is a policy priority worldwide, research into the effect of insurance on physical health outcomes has yielded mixed results

  • A randomized controlled trial of the effect of health insurance (Medicaid) coverage among low income individuals found no evidence that Medicaid coverage improves cardiovascular risk factors, eg, blood pressure and glycated hemoglobin concentrations

  • However, some subgroups might benefit from Medicaid coverage

What this study adds

  • The machine learning causal forest model showed that Medicaid coverage has heterogeneous effects on cardiovascular risk factors, and Medicaid coverage improved these risk factors, particularly blood pressure, for some subgroups

  • Individuals with a high probability of improving blood pressure as a result of Medicaid coverage tended to have no or low prior healthcare charges at baseline

  • These findings show the importance of investigating heterogeneous treatment effects when assessing the impact of interventions such as health insurance coverage

Ethics statements

Ethical approval

The University of California, Los Angeles’s institutional review board determined that ethical approval was not needed.

Data availability statement

All data used in this study are available online from the National Bureau of Economic Research’s Public Use Data Archive and can be accessed at https://www.nber.org/research/data/oregon-health-insurance-experiment-data. Statistical code available from the corresponding author on reasonable request.

Footnotes

  • Contributors: All authors contributed to the design and conduct of the study, data collection and management, analysis and interpretation of the data, and preparation, review, or approval of the manuscript. YT supervised the study and KI is the guarantor. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted. The views expressed here are those of the authors and do not necessarily represent the views of the National Institutes of Health, or other affiliated institutions.

  • Competing interests: All authors have completed the ICMJE uniform disclosure and declare: support from the Japan Society for the Promotion of Science, the Japan Science and Technology Agency, National Institutes of Health, and Gregory Annenberg Weingarten, GRoW @ Annenberg for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

  • Funding: This study was supported by the Japan Society for the Promotion of Science (22K17392 and 23KK0240; PI, Inoue), the Japan Science and Technology Agency (JST, JPMJPR23R2; PI, Inoue), National Institutes of Health (NIH) (P01AG005842 and R01AG034151; PI, Baicker), and Gregory Annenberg Weingarten, GRoW @ Annenberg (PI, Tsugawa). KI receives funding from the Japan Agency for Medical Research and Development (AMED; JP22rea522107), and the Program for the Development of Next generation Leading Scientists with Global Insight (L-INSIGHT) sponsored by the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan, for other work not related to this study. SA receives funding from the Golub Capital Social Impact Laboratory, Schmidt Futures, and Office of Naval Research Grant N00014-22-12668 for other work not related to this study. KB serves on the board of directors of Eli Lilly and the Mayo Clinic and on advisory panels to the Congressional Budget Office and the National Institute for Health Care Management. YT receives funding from NIH/National Institute on Ageing (R01AG068633 and R01AG082991) and NIH/National Institute on Minority Health and Health Disparities (R01MD013913) for other work not related to this study, and serves on the board of directors of M3, Inc. The funders had no role in considering the study design or in the collection, analysis, interpretation of data, writing of the report, or decision to submit the article for publication.

  • Transparency: The corresponding author affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.

  • Dissemination to participants and related patient and public communities: Our research findings will be disseminated through press releases, resulting interviews from local and national media, social media posts on Twitter, and academic conferences.

  • Provenance and peer review: Not commissioned; externally peer reviewed.

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

References