Regression discontinuity designs in healthcare research
BMJ 2016; 352 doi: https://doi.org/10.1136/bmj.i1216 (Published 14 March 2016) Cite this as: BMJ 2016;352:i1216
- Atheendar S Venkataramani, instructor1,
- Jacob Bor, assistant professor2,
- Anupam B Jena, associate professor of health care policy and medicine3
- 1Division of General Internal Medicine, Massachusetts General Hospital, Boston, MA, USA; and Harvard Center for Population and Development Studies, Cambridge, MA, USA
- 2Department of Global Health, Boston University School of Public Health, Boston, MA, USA
- 3Department of Health Care Policy, Harvard Medical School, Boston, MA, USA; Department of Medicine, Massachusetts General Hospital; and National Bureau of Economic Research, Cambridge, MA, USA
- Correspondence to: A B Jena, Department of Health Care Policy, Harvard Medical School, Boston, MA 02115, USA
- Accepted 10 February 2016
Treatment decisions and interventions are often based on specific measurement, laboratory, timing, or eligibility cut-offs
The regression discontinuity design is a statistical approach that utilizes threshold based decision making to derive causal estimates of the effects of different interventions
Regression discontinuity is relatively simple to implement, transparent, and provides “real world” effects of treatments and policies
Despite the ubiquity of threshold based decision making in healthcare, regression discontinuity is underutilized
Randomized controlled trials have long been considered the ideal means of assessing causal relations in medicine and health policy. However, not all clinical or policy questions can effectively be approached with experimental methods, owing to financial or ethical constraints or concerns about real world generalizability. As a result, investigators have increasingly relied on quasi-experimental study designs.1 2
The regression discontinuity design is one such quasi-experimental method. Regression discontinuity takes advantage of clinical or policy decision rules in which people are differentially assigned to a treatment or intervention if they fall above or below an arbitrary cut-off for a continuous variable.3 4 Such situations abound in healthcare. For example, decisions to initiate antihypertensive treatment, transfuse blood products, or screen for colorectal cancer are often driven by decision rules premised on specific thresholds (some of which are controversial and in need of evaluation).5 6 Similarly, eligibility for public sector health insurance in the United States is contingent on specific age and/or income based cut-offs.
Though both the ubiquity of clinical and policy decision rules and the high degree of internal validity of regression discontinuity make the method “tailor-made” for healthcare research, it has been underutilized.4 In addition to a lack of familiarity with regression discontinuity, underutilization may be driven by an unclear sense of the kinds of questions to which regression discontinuity may be applied as well as the types of cut-offs that can be exploited for causal inference.
In this article, we describe regression discontinuity and its strengths and weaknesses and characterize the diversity of research questions to which this design applies (in the domains of clinical medicine, public health and prevention, and health policy) and the numerous types of thresholds and decision rules that can be employed. Our goal is to inspire investigators to assess more habitually whether a regression discontinuity approach can strengthen the causal likelihood of an observed empirical association, by illustrating the versatility of the method in answering a wide array of research questions in clinical and health policy research.
Treatment assignment and analysis
Assignment to treatment by an arbitrary cut-off in regression discontinuity can be either “sharp” or “fuzzy.”3 4 7 8 9 In sharp regression discontinuity, the probability of receiving treatment jumps deterministically from 0 to 1 at the cut-off. All people on one side of the cut-off receive treatment and those on the other side do not. The treatment effect in the sharp regression discontinuity is estimated by comparing outcomes for those “just above” the cut-off to those “just below” it. Statistically, this involves either fitting models where the outcome is regressed on a binary indicator denoting being above or below the assignment threshold and a linear or polynomial function of the underlying assignment variable, or using non-parametric methods to compare outcomes on either side of the threshold within an optimally selected bandwidth.3 10
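The regression approach described above can be sketched in a few lines of Python. This is a minimal illustration on simulated data, not an analysis from any of the cited studies: the function name `sharp_rd_estimate`, the simulated effect of 2.0, and the hand-picked bandwidth of 0.5 are all assumptions made for the example. In practice the bandwidth would be chosen by a data driven procedure rather than fixed by hand.

```python
import numpy as np

def sharp_rd_estimate(x, y, cutoff, bandwidth):
    """Local linear estimate of the jump in y at the cutoff (sharp design)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.abs(x - cutoff) <= bandwidth   # restrict to observations near the cutoff
    xc = x[keep] - cutoff                    # center the assignment variable
    above = (xc >= 0).astype(float)          # binary indicator for the treated side
    # Columns: intercept, threshold indicator, slope, change in slope above the cutoff
    design = np.column_stack([np.ones_like(xc), above, xc, above * xc])
    coef, *_ = np.linalg.lstsq(design, y[keep], rcond=None)
    return coef[1]                           # coefficient on the indicator = estimated jump

# Simulated data (hypothetical): everyone at or above 0 is treated, true effect is 2.0
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 5000)
treated = (x >= 0).astype(float)
y = 1.0 + 0.5 * x + 2.0 * treated + rng.normal(0.0, 0.5, 5000)

effect = sharp_rd_estimate(x, y, cutoff=0.0, bandwidth=0.5)
print(round(effect, 2))
```

Allowing a separate slope on each side of the cut-off keeps a trend in the assignment variable from being mistaken for a treatment effect.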
In contrast, in fuzzy regression discontinuity, the probability of treatment increases at the cut-off but not deterministically; that is, some people on both sides of the cut-off receive treatment and some do not. The fuzzy design thus allows for crossover into or out of treatment (relative to the treatment decision implied by assignment based on threshold) and/or non-adherence. This design is thus more germane to the bulk of clinical and policy questions in the sphere of healthcare, where additional considerations (beyond one’s position relative to a specific cut-off) often contribute to decision making about treatment, uptake of treatment or programs, and adherence. Methods similar to those used to analyze sharp regression discontinuity can be applied to fuzzy regression discontinuity. However, because treatment is not deterministic, the comparison captures the effect of assignment to treatment by the threshold rule, not the effect of receiving the treatment itself. Instrumental variable methods can then be used to assess the effect of treatment among compliers.4 9 Several recent reviews have described the statistical and practical issues around conducting regression discontinuity studies in the area of health.4 7 8
Figure 1 illustrates treatment assignment in fuzzy regression discontinuity using real clinical data on the initiation of HIV treatment in South Africa.7 At the time the data were collected, guidelines recommended treatment for all people with initial CD4 counts of less than 200 cells/mm3. As the illustration shows, the probability of receiving treatment was noticeably higher for patients presenting for care with CD4 counts below this threshold. However, several people below the threshold did not receive treatment and several above the threshold did. The presence of such “crossover” highlights the importance of clinical attrition before initiation of treatment, other indications for treatment regardless of CD4 count, and patient and provider preferences in motivating treatment choice in many clinical settings.
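One common instrumental variable estimator in the fuzzy setting is the ratio of the jump in the outcome at the cut-off to the jump in the probability of treatment. The Python sketch below illustrates that ratio on simulated data. The function names, the treatment probabilities of 0.2 and 0.8, the true effect of 2.0, and the fixed bandwidth are all assumptions for illustration and are not drawn from the studies discussed here.

```python
import numpy as np

def jump_at_cutoff(x, v, cutoff, bandwidth):
    """Local linear estimate of the discontinuity in v at the cutoff."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    keep = np.abs(x - cutoff) <= bandwidth
    xc = x[keep] - cutoff
    above = (xc >= 0).astype(float)
    design = np.column_stack([np.ones_like(xc), above, xc, above * xc])
    coef, *_ = np.linalg.lstsq(design, v[keep], rcond=None)
    return coef[1]

def fuzzy_rd_estimate(x, d, y, cutoff, bandwidth):
    """Ratio of the outcome jump to the treatment jump (effect among compliers)."""
    assignment_effect = jump_at_cutoff(x, y, cutoff, bandwidth)  # effect of the threshold rule
    first_stage = jump_at_cutoff(x, d, cutoff, bandwidth)        # jump in treatment probability
    return assignment_effect / first_stage, assignment_effect, first_stage

# Simulated data (hypothetical): treatment probability rises from 0.2 to 0.8 at 0
rng = np.random.default_rng(1)
n = 20000
x = rng.uniform(-1.0, 1.0, n)
p_treat = np.where(x >= 0, 0.8, 0.2)
d = (rng.uniform(size=n) < p_treat).astype(float)   # treatment actually received
y = 1.0 + 0.5 * x + 2.0 * d + rng.normal(0.0, 0.5, n)

complier_effect, assignment_effect, first_stage = fuzzy_rd_estimate(x, d, y, 0.0, 0.5)
print(round(assignment_effect, 2), round(first_stage, 2), round(complier_effect, 2))
```

Because only a fraction of people change treatment status at the cut-off, the effect of assignment is smaller in magnitude than the effect of treatment among compliers; dividing by the jump in treatment probability rescales it.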
Strengths and limitations
The main strength of regression discontinuity is that it provides a transparent method to estimate causal effects of treatments or policies, particularly when randomized controlled trials are not possible. Causal inference comes from the assumption that, aside from differential use of treatment, those on either side—yet close to the cut-off—are otherwise similar. Put differently, whether someone lies immediately above or immediately below a cut-off is considered effectively random, leading to quasi-random treatment assignment for those close to the threshold. The validity of this randomization assumption can be evaluated as in a clinical trial by showing that patients on either side of the threshold have comparable characteristics. Other strengths of regression discontinuity are that non-random attrition and crossover, which create difficulties in randomized trials, can either be avoided or be easily incorporated into the design to support causal inference. Moreover, regression discontinuity studies may better reflect “real world” effects of treatments and policies, as they incorporate important behavioral phenomena such as poor adherence and loss to follow up, which randomized controlled trials seek to minimize.
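The comparison of baseline characteristics on either side of the threshold can be illustrated with a simple difference in means within a narrow window. The Python sketch below uses simulated data with hypothetical covariates (`age`, which is balanced by construction, and `severity`, which is made to jump at the cut-off). The function name, window width, and covariates are assumptions for the example. A plain difference in means is appropriate here only because the simulated covariates do not trend with the assignment variable; otherwise a local regression would be a better check.

```python
import numpy as np

def balance_z(x, covariate, cutoff, bandwidth):
    """z statistic for the difference in covariate means just above vs just below."""
    x = np.asarray(x, dtype=float)
    c = np.asarray(covariate, dtype=float)
    near = np.abs(x - cutoff) <= bandwidth
    above = c[near & (x >= cutoff)]
    below = c[near & (x < cutoff)]
    diff = above.mean() - below.mean()
    se = np.sqrt(above.var(ddof=1) / above.size + below.var(ddof=1) / below.size)
    return diff / se

# Simulated data (hypothetical covariates)
rng = np.random.default_rng(2)
n = 10000
x = rng.uniform(-1.0, 1.0, n)
age = rng.normal(40.0, 10.0, n)                        # unrelated to the cutoff
severity = rng.normal(0.0, 1.0, n) + 1.0 * (x >= 0)    # shifts at the cutoff

z_age = balance_z(x, age, cutoff=0.0, bandwidth=0.1)
z_severity = balance_z(x, severity, cutoff=0.0, bandwidth=0.1)
print(round(z_age, 2), round(z_severity, 2))
```

A large statistic for a baseline covariate, as for `severity` in this simulation, would cast doubt on the assumption that people on either side of the threshold are otherwise similar.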
Limitations of regression discontinuity include concerns around external validity, particularly as the estimates are only interpretable as causal effects for those near the specific cut-off of interest. However, cut-offs, such as those delineated by lipid lowering guidelines, are often of specific policy interest themselves. (With certain assumptions, regression discontinuity estimates can provide information about causal effects further from the threshold, although local randomization is not preserved.11) Additionally, causal inference in regression discontinuity may be threatened by non-random manipulation around the treatment cut-off. However, in many situations manipulation is either not possible or unlikely to generate bias. For example, random measurement error in laboratory testing may place some people above and some below the treatment threshold in a manner that cannot easily be manipulated. Moreover, manipulation is visually and statistically testable.12 Lastly, regression discontinuity studies may require large datasets to generate precise estimates. However, the growing availability of “big data,” such as electronic medical information systems, administrative databases, national registries, insurance claims, population health databases, social media, internet searches, and other routinely collected data may loosen this constraint.13
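As a rough illustration of how manipulation might be checked statistically, the Python sketch below compares the number of observations falling just above and just below the cut-off. It is a deliberately crude count based check under the assumption that, absent manipulation, an observation in a narrow window is equally likely to fall on either side. It is not the formal density test from the literature cited above, and the function name, window width, and simulated manipulation are all hypothetical.

```python
import numpy as np

def density_z(x, cutoff, window):
    """z statistic comparing counts just above vs just below the cutoff."""
    x = np.asarray(x, dtype=float)
    n_above = np.sum((x >= cutoff) & (x < cutoff + window))
    n_below = np.sum((x < cutoff) & (x >= cutoff - window))
    total = n_above + n_below
    # Under an even split, n_above - n_below has standard deviation sqrt(total)
    return (n_above - n_below) / np.sqrt(total)

# Simulated assignment variable with no manipulation (hypothetical)
rng = np.random.default_rng(3)
n = 20000
x_clean = rng.uniform(-1.0, 1.0, n)

# Simulated manipulation: half of those just below the cutoff are moved just above it
x_manip = x_clean.copy()
movers = (x_clean >= -0.05) & (x_clean < 0) & (rng.uniform(size=n) < 0.5)
x_manip[movers] = -x_manip[movers]

z_clean = density_z(x_clean, cutoff=0.0, window=0.1)
z_manip = density_z(x_manip, cutoff=0.0, window=0.1)
print(round(z_clean, 2), round(z_manip, 2))
```

A histogram of the assignment variable around the cut-off offers the corresponding visual check: bunching on one side suggests that values are being sorted across the threshold.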
Examples of regression discontinuity studies
Table 1 provides examples of regression discontinuity studies in healthcare. These studies, which employ fuzzy designs, are drawn from the clinical, public health, and health policy literatures to illustrate the breadth of research questions investigated using regression discontinuity. Collectively, the studies also show the diversity of usable cut-offs.
Treatment decisions based on therapeutic goals or cut-offs are ubiquitous in clinical medicine. Despite this, few published studies have used the regression discontinuity design to evaluate clinical questions (table 1).7 14 15 16 In the example in figure 1, Bor et al show the power of the regression discontinuity approach by examining the role of early antiretroviral treatment on mortality, retention, and immune recovery in South Africa.7 17 Exploiting a national policy of initiating treatment at CD4 counts less than 200 cells/mm3 to estimate a fuzzy regression discontinuity, they found that people with CD4 counts just below this threshold had a 35% lower hazard of death than those with CD4 counts just above it. Clinical threshold rules based on continuously measured biomarkers (such as CD4 counts) are a particularly strong application of regression discontinuity because assignment to treatment around the threshold is effectively driven by random measurement error.18 19 Consistent with this, Bor et al showed that patients just above and just below the relevant threshold were statistically identical across baseline characteristics. Similarly, to estimate the impact of statin use on low density lipoprotein cholesterol levels, Geneletti et al exploited guideline driven differences in statin prescribing for those with just less or just greater than 20% 10 year risk of a cardiovascular event (as determined by a standard risk calculator).16
Though the benefits of early antiretroviral treatment and statins are both well established in the medical literature, regression discontinuity studies provide a better sense of what could happen in the real world, which is free of biases to external validity introduced by a trial environment. For instance, the large survival benefit of early antiretroviral treatment in the study by Bor et al may be driven in part by late (or loss to) follow-up among those presenting for care with CD4 counts just above 200 cells/mm3. This is a relevant and important phenomenon that may be missed in randomized controlled trials, where attempts are made to explicitly minimize attrition in both treatment and control groups.17
Clinical thresholds can also be used to study the effectiveness of bundles of medical services. For example, to examine the effects of neonatal intensive care on infant mortality, Almond et al used the arbitrary designation of infants weighing less than 1500 g at birth as very low birth weight along with guideline based recommendations on clinical management for these infants.14 Like CD4 counts in the previous example, treatment assignment was driven by random variation in the measurement of birth weight. Almond et al found that infants just below this cut-off were more likely to receive neonatal intensive care and were 18% less likely to die by age 1 year than those just above the cut-off. Bharadwaj et al used the same cut-off to evaluate the effects of neonatal intensive care on cognitive development and academic achievement later in childhood.15
Non-clinical cut-offs can be used to evaluate clinical interventions. Jensen and Wust focused on health outcomes for infants born just before and just after the arrival of information from a major clinical trial in Denmark, which found that caesarean sections conferred better health outcomes than vaginal delivery for term breech infants.20 Jensen and Wust showed that caesarean rates increased markedly just after the announcement of the trial results, and that infants born just after the announcement had higher Apgar scores and lower rates of hospital admissions in the first year of life than those born just before the announcement (who were more likely to be delivered vaginally).
Prevention and public health research
Creative regression discontinuity approaches have been applied in preventive medicine and public health as well (table 1). For example, in Canada Smith et al explored the effects of the HPV vaccine on cervical dysplasia and anogenital warts.21 Because of the timing of vaccine introduction as well as age eligibility guidelines, the authors showed an increase in exposure to vaccine across a specific set of birth quarters, which served as the cut-off variable. They found that cervical dysplasia and anogenital warts declined noticeably just after this cut-off as well. Like the clinical studies discussed previously, the study provides a population level insight into the real world effects of HPV vaccination, thereby tackling concerns about reduced efficacy owing to behavioral disinhibition. Age eligibility cut-offs have also been employed in the public health literature. In a prominent example, Callaghan examined the effects of alcohol on mortality among young adults, examining death rates just above and just below the legal drinking age.22
Regression discontinuity approaches have also yielded substantive contributions to the literature on the social determinants of health. Ludwig and Miller showed large reductions in child mortality after the initiation of the Head Start program in the 1960s, which included health and nutritional services in addition to educational programs.23 They exploited sharp differences in federal support for Head Start provided to counties falling below a poverty threshold, and compared health outcomes in counties on either side of that threshold. (Poverty had been measured using census data before the advent of the program and was thus free of manipulation.)
Geographic boundaries have also been exploited in regression discontinuity studies. Causal inference requires that socioeconomic, demographic, and administrative factors aside from the intervention of interest are similar on either side of the boundary. In a landmark study, Chen et al estimated the impact of air pollution on health using data from China.24 The authors exploited variation in pollution generated by a national policy, which provided free coal based winter heating for those living north but not south of the Huai River. They found statistically significantly higher levels of ambient pollution just north of the river versus just south, and estimated that as a result of pollution life expectancies were 5.5 years lower north of the river.
Health policy research
Several studies have applied regression discontinuity to assess the impacts of health insurance on health outcomes, a key question in health policy (table 1). De La Mata examined utilization and health outcomes for children in families just above and just below income eligibility thresholds for US Medicaid, the country’s largest public insurance program for those on low incomes.25 To examine the impacts of childhood insurance coverage on adolescent health outcomes, Wherry et al utilized a Medicaid eligibility rule change that increased length of exposure for birth cohorts born just after 1 October 1983.26 Sood et al used a geographical cut-off, using the fact that insurance was made available in some districts of the Indian state of Karnataka but not in others.27 They found that mortality rates were lower and utilization rates were higher for those families just over the border where insurance was available compared with those just on the other side. Almond and Doyle examined a specific component of insurance coverage in the United States, namely rules around minimum length of stay after childbirth.28 They utilized a specific set of rules around the crude counting of calendar days in the hospital: under these insurance rules, children born just after midnight are permitted a full day longer in the hospital than those born just before midnight (whose day of birth is counted as a single day in the hospital). They found that longer stays did not change health outcomes for the mother or the child, but resulted in substantially greater costs. Collectively, these studies, as well as others using regression discontinuity to illustrate wide ranging impacts of insurance,29 30 exemplify how different types of regression discontinuity can be used to evaluate similar questions.
The approaches discussed in this article highlight the diversity of questions that can be addressed using regression discontinuity designs. They also illustrate how the vast range of decision thresholds in healthcare can be utilized to bolster the causal likelihood of observed empirical associations. Because of the potential breadth of options available, investigators focusing on clinical questions for which experiments are not ethically or financially feasible or for which the external validity is uncertain should assess whether regression discontinuity approaches are possible and credible before turning to other observational approaches. They should do so keeping in mind that factors as diverse as decision rules based on clinical variables; temporal changes in guidelines or reimbursement policies; income, age, and other eligibility thresholds; or geographic location may all be valid cut-offs for analysis.
Several future opportunities remain for the application of regression discontinuity approaches. Table 2 highlights several clinical areas where there are active controversies around a specific clinical guideline or little evidence of the real world effects of a particular treatment.31 For example, blood pressure targets for some patient populations (eg, adults with diabetes) continue to be based on expert opinion. Regression discontinuity studies using specific blood pressure cut-offs can be used to strengthen evidence based guidelines in these areas.6 Another example is cardiac resynchronization treatment, which has been shown to be effective in clinical trials of patients with left ventricular systolic dysfunction (ejection fraction <35%) and QRS prolongation (>120 ms) on resting electrocardiograms. However, real world evidence in specific populations may be bolstered by comparing rates of implantation for cardiac resynchronization treatment, hospital admissions for heart failure, and all cause morbidity and mortality for patients with duration of the QRS complex just above versus just below a cut-off of 120 ms.
The preponderance of clinical decision thresholds and changes in policy guidelines provide a rich basis for the application of regression discontinuity. The regression discontinuity design has a number of distinct advantages, even over randomized trials. It can be deployed where clinical trials are not feasible and, unlike randomized controlled trials, allows for an assessment of real world impacts. Though regression discontinuity requires large sample sizes, several existing clinical and administrative databases are large enough to power such studies adequately. The time is opportune for growth of regression discontinuity based studies in healthcare.
Contributors: All authors contributed to the design and conduct of the study, data collection and management, analysis and interpretation of the data, and preparation, review, or approval of the manuscript. The research conducted was independent of any involvement from the sponsors of the study. Study sponsors were not involved in the study design, data interpretation, writing, or decision to submit the article for publication. ABJ is the guarantor.
Competing interests: All authors have read and understood the BMJ policy on declaration of interests and declare the following: ABJ had support from the Office of the Director, National Institutes of Health (NIH early independence award, grant 1DP5OD017897-01) for the submitted work.
Provenance and peer review: Not commissioned; externally peer reviewed.