Emulating the GRADE trial using real world data: retrospective comparative effectiveness study

Abstract Objective To emulate the GRADE (Glycemia Reduction Approaches in Diabetes: A Comparative Effectiveness Study) trial using real world data before its publication. GRADE directly compared second line glucose lowering drugs for their ability to lower glycated hemoglobin A1c (HbA1c). Design Observational study. Setting OptumLabs® Data Warehouse (OLDW), a nationwide claims database in the US, 25 January 2010 to 30 June 2019. Participants Adults with type 2 diabetes and HbA1c 6.8-8.5% while using metformin monotherapy, identified according to the GRADE trial specifications, who also used glimepiride, liraglutide, sitagliptin, or insulin glargine. Main outcome measures The primary outcome was time to HbA1c ≥7.0%. Secondary outcomes were time to HbA1c >7.5%, incident microvascular complications, incident macrovascular complications, adverse events, all cause hospital admissions, and all cause mortality. Propensity scores were estimated using the gradient boosting machine method, and inverse propensity score weighting was used to emulate randomization of the treatment groups, which were then compared using Cox proportional hazards regression. Results 8252 people were identified (19.7% of adults starting the study drugs in OLDW) who met eligibility criteria for the GRADE trial (glimepiride arm=4318, liraglutide arm=690, sitagliptin arm=2993, glargine arm=251). The glargine arm was excluded from analyses owing to small sample size. Median times to HbA1c ≥7.0% were 442 days (95% confidence interval 394 to 480 days) for glimepiride, 764 (741 to not calculable) days for liraglutide, and 427 (380 to 483) days for sitagliptin. Liraglutide was associated with lower risk of reaching HbA1c ≥7.0% compared with glimepiride (hazard ratio 0.57, 95% confidence interval 0.43 to 0.75) and sitagliptin (0.55, 0.41 to 0.73). Results were consistent for the secondary outcome of time to HbA1c >7.5%. 
No significant differences were observed among treatment groups for the remaining secondary outcomes. Conclusions In this emulation of the GRADE trial, liraglutide was statistically significantly more effective at maintaining glycemic control than glimepiride or sitagliptin when added to metformin monotherapy. Generating timely evidence on medical treatments using real world data as a complement to prospective trials is of value.


Introduction
Type 2 diabetes is a common serious chronic health condition, impacting 11.3% (37.3 million) of the US population 1 and 9.3% (463 million) of people worldwide. 2 Moderate glycemic control, defined by achieving glycated hemoglobin (HbA 1c ) between 7% and 8%, improves microvascular and macrovascular outcomes. 3 4 Current clinical practice guidelines recommend targeting HbA 1c <7% for most nonpregnant adults. 5 Timely and appropriate treatment intensification is fundamental to maintaining glycemic control 6 and preventing complications. [7][8][9][10] Metformin is the preferred glucose lowering drug owing to its efficacy, tolerability, and low cost. [11][12][13][14] Type 2 diabetes is, however, a progressive disease, and most patients ultimately require intensification of treatment. Recent US population level estimates suggest that nearly one third of people with HbA 1c ≥7% are treated with only one glucose lowering drug 15 and as such would benefit from treatment intensification. Clinical practice guidelines advise that choice of second line treatment should be informed by clinical and situational considerations specific to each individual, recognizing the knowledge gaps stemming from the lack of direct comparisons of currently available second line drugs. [11][12][13][14] The GRADE (Glycemia Reduction Approaches in Diabetes: A Comparative Effectiveness Study) trial is a recently completed, but still unpublished, pragmatic, randomized, parallel arm clinical trial that seeks to address this knowledge gap by comparing four second line glucose lowering drugs among adults with moderately uncontrolled type 2 diabetes who are in receipt of metformin monotherapy. 16 17 The drugs represent four classes: glimepiride (sulfonylurea), sitagliptin (dipeptidyl-peptidase 4 inhibitor), liraglutide (glucagon-like peptide-1 receptor agonist), and insulin glargine (basal analog insulin). 
The GRADE trial was designed (2008) and launched (July 2013) before US Food and Drug Administration approval of sodium-glucose cotransporter-2 inhibitors and several cardiovascular outcomes trials that showed reduction in atherosclerotic cardiovascular and kidney disease outcomes with use of glucagon-like peptide-1 receptor agonists, and in heart failure and kidney disease outcomes with use of sodium-glucose cotransporter-2 inhibitors. This highlights a key limitation of large prospective randomized controlled trials: such trials are time consuming to conduct, potentially hindering the ability to answer questions in a clinically meaningful time frame. Thus, it is of value to efficiently generate timely evidence on medical treatments using observational research methods applied to real world data as a complement to prospective trials.
Advances in the quantity, quality, and granularity of real world data, combined with improvements in statistical methods used to account for confounding, treatment allocation bias, and time related bias, have provided opportunities to use large scale real world data to inform the understanding of drug effectiveness and safety. Ideally, studies using real world data would be conducted before the publication of the results from randomized controlled trials, thereby minimizing potential biases that could be introduced by trying to replicate known results from such trials. As an illustrative test case of the opportunities and limitations of using observational research methods to emulate randomized controlled trials, and building on parallel analyses emulating the PRONOUNCE (A Trial Comparing Cardiovascular Safety of Degarelix Versus Leuprolide in Patients With Advanced Prostate Cancer and Cardiovascular Disease) trial, 18 we used claims and laboratory results data from OptumLabs® Data Warehouse (OLDW), a deidentified national dataset of privately insured and Medicare Advantage beneficiaries, to emulate the GRADE trial. We used published 16 19 and publicly available 17 information on the GRADE trial's study design to emulate the methods and anticipated results as closely as possible, with the goal of directly comparing the effectiveness of glimepiride, sitagliptin, liraglutide, and insulin glargine in achieving and maintaining HbA 1c <7.0% among adults with type 2 diabetes and HbA 1c 6.8-8.5% while in receipt of metformin monotherapy. We also examined the secondary metabolic, microvascular, macrovascular, and safety endpoints planned in the GRADE trial as feasible using the data available within OLDW. Our study therefore had two complementary objectives. 
First, a clinical objective, to examine four second line glucose lowering drugs for lowering or maintaining, or both, HbA 1c <7.0%, filling an important clinical knowledge gap in the comparative effectiveness of these commonly used and guideline recommended drug classes. Second, a methodologic objective, to ascertain whether routinely available claims data can be used to emulate a prospective randomized clinical trial ahead of its publication, filling important methodologic and regulatory policy needs in the use of real world data to predict clinical trial results.

Study design
We retrospectively analyzed medical and pharmacy claims data from OLDW, a deidentified claims dataset that includes healthcare utilization information for beneficiaries of private health plans (adults of working age and their dependents) and Medicare Advantage plans. The latter are Medicare approved plans offered by private companies to beneficiaries who are eligible for Medicare (eg, adults aged ≥65 years, individuals with disability, people with end stage kidney disease) as a private alternative to Original Medicare. Just as with private insurance, Medicare Advantage plans typically bundle medical and pharmacy coverage. OLDW contains longitudinal health information on enrollees in these health plans, representing a diverse mixture of ages, ethnicities, and geographic regions across the US. 20 21 The study is reported according to the Reporting of studies Conducted using Observational Routinely-collected Data (RECORD) reporting guideline. 22

Study population
We first assembled a cohort of adults (≥18 years) who initially started glimepiride, sitagliptin, liraglutide, or insulin glargine between 25 January 2010 (date of liraglutide approval by the FDA; remaining study drugs were approved earlier) and 30 June 2019 (see supplemental figure S1). The index date was set to the date of the first claim for the study drug. People who started ≥2 study drugs on the index date were excluded. Individuals were required to be adherent to metformin for ≥8 weeks before that first study drug fill date. This was established by identifying all metformin fills before the index date, establishing continuous treatment episodes based on prescription fill dates and the days' supply for each fill (allowing up to 30 day gap between fills), and requiring that the last metformin treatment episode before the index date be at least eight weeks. To ensure consistent and adequate capture of baseline comorbidities and treatment data, people were required to have six months of continuous enrollment with medical and pharmacy coverage before the index date. We excluded those with fills for any glucose lowering drugs other than metformin during the baseline period and those with type 1 diabetes, defined using ICD-9 and ICD-10 (international classification of diseases, ninth and 10th revisions, respectively) codes. Individuals were further required to have valid personal (age, sex, region) data and HbA 1c results both within three months before the index date (baseline HbA 1c ) and during follow-up. Next, we adapted the eligibility criteria for the GRADE trial 16 17 19 and applied these to beneficiaries included in OLDW, as detailed in supplemental table S1. Supplemental tables S2 and S3 summarize the relevant diagnosis codes and drugs. All eligible individuals in OLDW were included in the cohort.
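The metformin adherence rule described above (continuous treatment episodes built from fill dates and days' supply, with a gap of up to 30 days allowed, and a final episode of at least eight weeks) can be sketched as follows. This is a minimal illustration, not the study's code: the function names and the input layout (a list of fill date and days' supply pairs) are assumptions, and early refills are not carried forward as stockpiled supply.

```python
from datetime import date, timedelta

def build_episodes(fills, max_gap_days=30):
    """Merge (fill_date, days_supply) pairs into continuous treatment episodes.

    A new episode starts when a fill occurs more than `max_gap_days`
    after the supply from the current episode has run out.
    """
    episodes = []
    for fill_date, days_supply in sorted(fills):
        supply_end = fill_date + timedelta(days=days_supply)
        if episodes and (fill_date - episodes[-1][1]).days <= max_gap_days:
            # Fill falls within the allowed gap: extend the current episode.
            episodes[-1][1] = max(episodes[-1][1], supply_end)
        else:
            episodes.append([fill_date, supply_end])
    return [(start, end) for start, end in episodes]

def meets_metformin_requirement(fills, index_date, min_days=56, max_gap_days=30):
    """True if the last metformin episode before the index date lasted >= min_days."""
    prior_fills = [(d, s) for d, s in fills if d < index_date]
    episodes = build_episodes(prior_fills, max_gap_days)
    if not episodes:
        return False
    start, end = episodes[-1]
    # Only count time on treatment accrued before the index date.
    return (min(end, index_date) - start).days >= min_days
```

For example, three 30 day fills on 1 January, 5 February, and 10 March 2015 form a single episode, so a person with an index date of 1 April 2015 would satisfy the eight week requirement under these assumptions.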

Outcomes
The primary outcome was time to primary metabolic failure, calculated as days to HbA1c ≥7.0% while treated with the assigned drug, with the period of eligibility starting at month 3 after the index date (analogous to the first quarterly HbA1c assessment in the GRADE trial). Unlike the GRADE trial protocol, we did not require a confirmatory HbA1c owing to variation in real world HbA1c testing intervals. To assess for potential bias in outcome ascertainment as the result of different frequencies of HbA1c testing and varying intervals between tests among the treatment groups, we compared the number, frequency, and timing of available HbA1c test results and found no difference between the groups (see supplemental table S4). Because testing frequency is guided by baseline HbA1c, we also examined intervals between sequential HbA1c tests stratified by baseline HbA1c and found no differences between the treatment groups (see supplemental table S5). Secondary metabolic, cardiovascular, and microvascular outcomes were analyzed as specified in the GRADE trial's statistical analysis plan 17 if they were feasible to ascertain using claims data (see supplemental table S6). Individuals were followed until they experienced the outcome of interest or reached the anticipated follow-up duration of the trial (seven years), the end of the study period (31 July 2019), the end of insurance coverage, or death. For outcomes ascertained while individuals were being treated with the assigned regimen, follow-up continued until discontinuation of the assigned drug (defined as not refilling the drug within 30 days after the end of the last treatment episode), with the goal of emulating the definitions of these outcomes in the GRADE trial (ie, while being treated with the originally assigned drugs). 16

Independent variables
Patient level age, sex, race or ethnicity, and annual household income were identified from OLDW enrollment files at the time of the index date.
Detailed description of the source data for these variables is available in the supplemental methods. Comorbidities (ascertained from all claims during six months preceding the index date) included retinopathy, nephropathy, neuropathy, coronary artery disease, cerebrovascular disease, peripheral vascular disease, heart failure, and previous severe hypoglycemia and hyperglycemia, as detailed in supplemental table S2. Specialties of treating physicians were categorized as primary care, endocrinology, cardiology, nephrology, other, and unknown. Baseline drugs, included as surrogates for burden of complications, were identified from fills in the six months preceding the index date (see supplemental table S3).

Statistical analysis
Inverse probability of treatment weighting was used to balance the differences in baseline characteristics among the treatment groups. Propensity scores were used as probability of treatment; these propensity score weights were estimated using generalized boosted models including the baseline variables presented in table 1. Using generalized boosted models involves an iterative process with multiple regression trees to capture complex and non-linear relations between treatment assignments and the pretreatment covariates, with the propensity score model leading to the best balance among the treatment groups. 23 The supplemental methods provide additional detail on the models. We calculated stabilized weights with multiple treatments by dividing the marginal probability of treatment by the propensity score of treatment received. 24 Supplemental figure S2 shows the distribution of weights. Standardized mean differences were used to assess the balance of covariates after weighting; a standardized mean difference ≤0.1 was considered a good balance and ≤0.2 was considered acceptable. 25 Before evaluation of the outcomes, we examined the weighted sample sizes and ability to account for baseline confounding to determine the feasibility of including each treatment group.
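The stabilized weights and balance check described above can be illustrated with a short sketch. It assumes the propensity scores have already been estimated (in the study, by generalized boosted models) and are supplied as a matrix with one column per treatment arm. The function names are hypothetical, and the standardized mean difference uses the average of the two weighted variances as its denominator, which is one common convention rather than necessarily the one used in the study.

```python
import numpy as np

def stabilized_weights(treatment, propensity):
    """Stabilized weight = marginal P(T = t) / estimated P(T = t | X).

    treatment:  integer array of shape (n,), arms coded 0..K-1
    propensity: array of shape (n, K), each row summing to 1
    """
    treatment = np.asarray(treatment)
    propensity = np.asarray(propensity, dtype=float)
    n, k = propensity.shape
    marginal = np.bincount(treatment, minlength=k) / n
    ps_received = propensity[np.arange(n), treatment]
    return marginal[treatment] / ps_received

def weighted_smd(x, treatment, weights, arm_a, arm_b):
    """Absolute standardized mean difference of covariate x between two arms."""
    x = np.asarray(x, dtype=float)
    treatment = np.asarray(treatment)
    weights = np.asarray(weights, dtype=float)

    def mean_and_variance(arm):
        mask = treatment == arm
        mean = np.average(x[mask], weights=weights[mask])
        variance = np.average((x[mask] - mean) ** 2, weights=weights[mask])
        return mean, variance

    mean_a, var_a = mean_and_variance(arm_a)
    mean_b, var_b = mean_and_variance(arm_b)
    return abs(mean_a - mean_b) / np.sqrt((var_a + var_b) / 2)
```

Under the thresholds stated above, a covariate with a weighted standardized mean difference of 0.1 or less between every pair of arms would be considered well balanced.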
The cumulative incidence of the primary (time to first HbA1c ≥7.0%) and secondary (time to first HbA1c >7.5%) metabolic failures within each treatment arm was estimated with the inverse probability of treatment weighting Kaplan-Meier method. We used propensity score weighted Cox proportional hazards regression models adjusted for baseline HbA1c values to compare the outcomes between treatment groups. Because the primary outcome can only be observed from the third month, we set the start of the at risk time for the proportional hazards model at three months after the index date. Results are presented as median times to metabolic failure and expected proportions of people experiencing metabolic failure at one and two years. All pairwise comparisons between the treatment groups were estimated, and we applied the Holm method to adjust the P values for multiple testing. We tested the proportional hazards assumption using Schoenfeld residuals. Similar analyses were performed for other time-to-event outcomes. The at risk start time for modeling secondary metabolic, cardiovascular, and microvascular disease outcomes was set at the study index date. Repeated measures HbA1c trends by treatment group were estimated as the inverse probability of treatment weighted mean HbA1c results by treatment group in three month intervals. The follow-up time by treatment arm was estimated using the same propensity score weights as the primary analysis and the inverse probability of treatment weighting Kaplan-Meier method for the censoring distribution. 26 All primary analyses were conducted using the per protocol censoring approach for the primary outcome and for the secondary outcomes of secondary metabolic failure and insulin initiation, censoring at the time of treatment drug discontinuation, disenrollment from the health plan, end of study period, or death, whichever came first (see supplemental figure S3).
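To make the weighted Kaplan-Meier idea concrete, the sketch below computes a survival curve in which each person contributes their weight, rather than a count of one, to the number at risk and the number of events. It is a simplified illustration with hypothetical function names: it returns point estimates only, without the confidence intervals reported in the study. Cumulative incidence at a given time is one minus the survival estimate.

```python
def weighted_kaplan_meier(times, events, weights):
    """Return [(time, survival)] at each distinct event time.

    times:   follow-up time for each person
    events:  1 if the outcome was observed, 0 if censored
    weights: inverse probability of treatment weight for each person
    """
    data = sorted(zip(times, events, weights))
    at_risk = sum(w for _, _, w in data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        weighted_events = 0.0
        leaving = 0.0
        # Process everyone whose follow-up ends at time t together.
        while i < len(data) and data[i][0] == t:
            _, event, weight = data[i]
            if event:
                weighted_events += weight
            leaving += weight
            i += 1
        if weighted_events > 0:
            survival *= 1.0 - weighted_events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

def median_time(curve):
    """First time at which survival is <= 0.5, or None if never reached."""
    for t, survival in curve:
        if survival <= 0.5:
            return t
    return None
```

Returning `None` when the curve never falls to 0.5 corresponds to a median (or confidence limit) that is not calculable within the available follow-up.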
Time receiving treatment for each drug was determined by calculating continuous coverage episodes based on available fills, in the same way as for baseline metformin treatment. Remaining secondary outcomes were analyzed using the intention-to-treat censoring approach, censoring the participant at the time of health plan disenrollment, end of study, or death, whichever came first. P<0.05 was considered statistically significant for all two sided tests. All analyses were performed using SAS 9.4 (SAS Institute, Cary, NC) and R version 4.0.2 (R Foundation).
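The Holm adjustment applied to the pairwise comparisons can be sketched in a few lines. The function name is hypothetical and the P values in the usage note are illustrative only; they are not results from this study.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted P values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest P value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

With three pairwise comparisons and illustrative raw P values of 0.01, 0.04, and 0.03, the adjusted values are 0.03, 0.06, and 0.06, so only the first comparison would remain below the 0.05 threshold.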

Sensitivity analyses
First, to examine the comparative effectiveness of study drugs while treated only with them and not with any other drug, accounting for real world treatment practices, we repeated all analyses using the as treated censoring approach, censoring at the addition of a new drug class, discontinuation of the assigned drug, health plan disenrollment, end of study, or death, whichever came first (see supplemental figure S3). Second, we assessed residual confounding by testing a falsification endpoint that was unlikely to be associated with the studied drugs: diagnosis of pneumonia (see supplemental table S2) during the follow-up period.
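The three censoring approaches used in this study (intention-to-treat, per protocol, and as treated) differ only in which events end follow-up. The sketch below encodes the rules as stated in the text; the function name, argument names, and approach labels are assumptions for illustration, and the outcome date and the seven year cap on follow-up are left out for brevity.

```python
from datetime import date

def censoring_date(approach, *, disenrollment, study_end, death=None,
                   discontinuation=None, new_class_added=None):
    """Earliest censoring date under the chosen approach.

    intention_to_treat: disenrollment, end of study, or death
    per_protocol:       the above, plus discontinuation of the assigned drug
    as_treated:         the above, plus addition of a new drug class
    """
    if approach not in ("intention_to_treat", "per_protocol", "as_treated"):
        raise ValueError(f"unknown censoring approach: {approach}")
    candidates = [disenrollment, study_end, death]
    if approach in ("per_protocol", "as_treated"):
        candidates.append(discontinuation)
    if approach == "as_treated":
        candidates.append(new_class_added)
    # Dates that never occurred are passed as None and ignored.
    return min(d for d in candidates if d is not None)
```

For a person who added a second drug class before stopping the assigned drug, the as treated approach censors earlier than the per protocol approach, which in turn censors earlier than the intention-to-treat approach.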

Patient and public involvement
Patients were not involved in the design, conduct, or dissemination of this study. However, this study was informed by the need to identify preferred glucose lowering treatment strategies in the absence of direct comparisons across the examined drugs; and to examine whether and how data collected in the process of routine patient care can be used to emulate prospective clinical trials. Because this study seeks to inform drug regulatory policy and procedures, investigators from the FDA contributed to the design of the study and interpretation of study findings; they are included as coauthors on this publication.

Study population
We identified 18 365 adults with type 2 diabetes who started glimepiride, 12 818 who started sitagliptin, (see supplemental table S8 for all included drugs). Supplemental table S9 shows baseline characteristics of the included individuals before weighting. Across the four treatment groups, there were significant differences (largest standardized mean difference >0.2) in age, race or ethnicity, annual household income, and prescribing physician specialty. Individuals in the liraglutide arm were more likely to be younger, white, on a higher income, and treated by an endocrinologist than those in the other treatment arms. Individuals in the glargine arm were most likely to be on a low income and they had the highest prevalence of all examined comorbidities.
The glargine arm was excluded from all analyses because of small sample size (n=251, weighted n=179) and inability to achieve good control of confounders after weighting. The propensity score model was estimated on the remaining three groups.

Primary metabolic failure (HbA1c ≥7.0%)
Median follow-up until per protocol censoring was 238 days (95% confidence interval 226 to 255 days) in the glimepiride arm, 124 (100 to 150) days in the liraglutide arm, and 186 (179 to 201) days in the sitagliptin arm (see supplemental figure S4). Mean HbA1c decreased most in the liraglutide arm and least in the sitagliptin arm, with differences most pronounced between months 3 and 6 of treatment (fig 1). The median times to primary metabolic failure were 442 days (95% confidence interval 394 to 480 days) in the glimepiride arm, 764 (741 to not calculable) days in the liraglutide arm, and 427 (380 to 483) days in the sitagliptin arm (fig 2). Liraglutide was associated with lower risk of primary metabolic failure compared with glimepiride (hazard ratio 0.57, 95% confidence interval 0.43 to 0.75) and sitagliptin (0.55, 0.41 to 0.73); table 2. No significant difference was observed between sitagliptin and glimepiride (1.03, 0.94 to 1.13). By one year, the estimated cumulative incidence rates of primary metabolic failure were 0.28 (95% confidence interval 0.19 to 0.36) in the liraglutide arm, 0.44 (0.42 to 0.46) in the glimepiride arm, and 0.46 (0.43 to 0.48) in the sitagliptin arm (table 3).

Subgroup analyses
Liraglutide was associated with lower risk of primary metabolic failure compared with glimepiride (hazard ratio 0.59, 95% confidence interval 0.44 to 0.78) and sitagliptin (0.58, 0.43 to 0.79) among patients with baseline HbA1c ≥7.0%. No significant differences were observed among the treatment groups in individuals with baseline HbA1c <7.0% (see supplemental table S13). Liraglutide was associated with lower risk of primary metabolic failure compared with glimepiride (0.54, 0.42 to 0.71) and sitagliptin (0.58, 0.44 to 0.77) among those aged <65 years. No significant differences were observed among groups in people aged ≥65 years. Liraglutide was also associated with lower risks of primary metabolic failure than glimepiride and sitagliptin in women, but not in men, and in white and Hispanic individuals, but not in black or Asian individuals. Findings were similar for secondary metabolic failure (see supplemental table S14).

Sensitivity analyses
Another glucose lowering drug was added before discontinuation of the assigned treatment in 423 of 4168 (10%) people in the glimepiride arm, 237 of 572 (41%) in the liraglutide arm, and 419 of 2800 (15%) in the sitagliptin arm. Sensitivity analyses using the as treated censoring approach were consistent with the primary analyses (see supplemental figure S6 and table S15). No significant differences were observed among the treatment groups for the pneumonia falsification endpoint (see supplemental table S16).

Principal findings
In our emulation of the GRADE trial using real world data from an administrative claims database, we found that liraglutide was statistically significantly more effective at maintaining glycemic control, defined by time to HbA1c ≥7.0% (primary metabolic failure) and HbA1c >7.5% (secondary metabolic failure), than either glimepiride or sitagliptin. These differences are clinically meaningful, with over 40% more patients in control of their HbA1c when treated with liraglutide than when treated with glimepiride or sitagliptin. We were unable to include insulin glargine in the comparisons because of the small number of individuals treated with this drug who met the GRADE trial eligibility criteria. This was not surprising as treatment with basal insulin in the clinical context examined by the GRADE trial is outside the standard of care and mainstream practice. Additionally, the analytic framework implemented in this work shows that real world data may be an important complement to prospective trials, allowing for efficient and timely examination of pressing clinical questions and inquiries of comparative effectiveness and safety. Our efforts to emulate all specifications of the GRADE trial were hindered because study conditions are not adequately represented in real world practice as they are not supported by clinical practice guidelines. Although all four study drugs were frequently used by the OLDW population, 80% of adults starting these drugs had to be excluded because they did not meet the prespecified eligibility criteria for the GRADE trial. Nevertheless, this proportion of included participants is still higher than the 9.1% generalizability estimated by the GRADE trial team compared with the overall US population with diabetes. 19
Most of the people (58.6% overall) were excluded because they did not meet the baseline HbA1c level requirements, including 81.1% of people who started glargine, 71.8% who started liraglutide, 52.8% who started glimepiride, and 51.7% who started sitagliptin. According to current guidelines, the target HbA1c for most non-pregnant adults is 7.0%, such that treatment intensification would not be warranted for some people. Initiation of insulin, in particular, is advised when HbA1c is >9-10%, 14 27 so starting glargine as a second line drug at HbA1c levels <8.5% would not be consistent with the standard of care 14 27 or contemporary practice. [28][29][30] The fact that most people treated with the studied drugs in clinical practice are not represented in the study population raises concerns about the utility and generalizability of the GRADE trial's findings and its impact on diabetes management, underscoring the important complementary insights that can be gleaned from analyses of real world data (which can be designed to use more pragmatic and generalizable eligibility criteria) as adjuncts to randomized controlled trials.

Comparison with other studies
We met our objective to conduct all analyses before publication of the GRADE trial findings, and it will be important to ultimately compare our findings with those of the GRADE trial. The greater effectiveness of liraglutide compared with both glimepiride and sitagliptin is consistent with previous studies. 29 [31][32][33] Additionally, subgroup analyses showing greater effectiveness of liraglutide among people with raised baseline HbA1c and in younger patients generate important hypotheses about the optimal use of liraglutide (and potentially other glucagon-like peptide-1 receptor agonists) in clinical practice to be explored in future research. When the GRADE trial was conceived, drugs' ability to lower HbA1c was at the forefront of clinical decision making when choosing glucose lowering treatment. Similarly, the sodium-glucose cotransporter-2 inhibitor class of glucose lowering drugs had not yet been incorporated into practice and therefore was excluded as a comparator treatment when the GRADE trial was conceived and designed.

Strengths and limitations of this study
Our study is strengthened by application of advanced analytic methods that account for measured differences between treatment arms that otherwise confound analyses and preclude causal inference. Generalized boosted models for the propensity score are more flexible and less sensitive to model misspecification than logistic regression. The large and diverse population within OLDW made emulation efforts uniquely possible despite the narrow eligibility criteria specified by the GRADE trial.
Despite rigorous causal inference analytic methods, observational studies are inevitably subject to residual confounding. For the metabolic endpoints, there was evidence of non-proportional hazards, which makes the single summary hazard ratio calculated from the Cox proportional hazards an imperfect estimate for the time varying risk. However, with the goal of emulating the GRADE trial, where the statistical analysis plan was to estimate single summary hazard ratios, we report the same estimate in the emulation. We were also unable to operationalize every component of the GRADE trial's eligibility criteria and endpoints. For example, we did not require confirmatory HbA 1c results to meet the metabolic endpoints and were not able to maintain the same standard timeframe for HbA 1c ascertainment as specified in the GRADE trial. Additionally, while the GRADE trial analyses were conducted using the intention-to-treat principle, we a priori chose to use per protocol analysis for the metabolic endpoints because in the absence of randomization, reasons for changing a treatment typically depend on post-initiation factors that could confound the association between the treatment group and the outcome. While advanced statistical methods can account for post-baseline differences between groups in key characteristics, these methods require accurate estimation of the reasons to stop or change treatment, and such estimation is not feasible in this setting using claims data. Duration of follow-up was also different among the treatment arms, which is unavoidable when studying real world practice patterns. In particular, a higher proportion of individuals initiating liraglutide filled only one cycle of treatment before either switching to a different treatment or not refilling their prescription, potentially because of poor tolerability, the need to be administered subcutaneously, or high cost.
Not all people with claims data in OLDW have available laboratory data, as laboratory results are available for a subset of patients based on data sharing agreements between OptumLabs and commercial laboratories. The availability of laboratory results, however, is independent of treatment regimen, and we do not expect it to bias our analyses. The schedule of HbA1c testing in real world practice is contingent on an individual's current HbA1c level and ability to access care, and on the clinician's anticipation of changing HbA1c levels. This may have confounded study results by delaying the time to HbA1c reassessment and reaching the study endpoint in people with low baseline HbA1c or with barriers to care. Our evaluation could not account for inclusion and exclusion criteria that could not be operationalized using claims data, including drugs obtained without insurance coverage (eg, obtained through a low cost generic programme, 34 a patient assistance programme, or a sample), comorbidities that were not coded and billed in a clinical encounter, and information on family history. However, previous studies found the likely number of glucose lowering drugs missing from claims to be low. 35 Finally, the study cohort comprised people with private and Medicare Advantage health plans, such that results may not fully generalize to people with public health plans or those without insurance coverage.
doi: 10.1136/bmj-2022-070717 | BMJ 2022;379:e070717 | the bmj

Policy implications
Contemporary clinical practice guidelines increasingly focus on the impacts of glucose lowering treatments on hard outcomes that are important to patients beyond HbA1c, such as macrovascular and microvascular complications and death. 36 Indeed, most recent clinical practice guidelines recommend consideration of glucagon-like peptide-1 receptor agonists and sodium-glucose cotransporter-2 inhibitors even as preferred treatments and independent of the HbA1c level among people at high risk for atherosclerotic cardiovascular disease, kidney disease, and heart failure. 14 For these outcomes, robust evidence favors liraglutide (of the drug classes examined) in individuals at high risk for atherosclerotic cardiovascular disease, 37 38 further underscoring the advantage of this drug. It will be important, in future research, to compare the effectiveness of glycemic control achieved by glucagon-like peptide-1 receptor agonists with that of sodium-glucose cotransporter-2 inhibitors, as sodium-glucose cotransporter-2 inhibitors are similarly recommended for people at high risk for cardiovascular disease, kidney disease, and heart failure. 14 Analytic methods such as those implemented in this study, and in the parallel emulation of PRONOUNCE, 18 can be leveraged for more timely evaluations of drug effectiveness and safety as long as the treatments being considered are already used in clinical practice. Indeed, work is currently underway to examine the comparative effectiveness of sulfonylurea, glucagon-like peptide-1 receptor agonist, dipeptidyl-peptidase 4 inhibitor, and sodium-glucose cotransporter-2 inhibitor drugs for atherosclerotic cardiovascular disease and other hard outcomes among people at moderate risk for atherosclerotic cardiovascular disease using observational data from real world practice. 39

Conclusions
Better understanding of the comparative effectiveness and safety of second line glucose lowering drugs is urgently needed to inform shared decision making in diabetes. Ultimately, the population included in this study and our findings should be compared with those of the GRADE trial, once published in peer reviewed literature, to assess the fidelity and generalizability of results and to improve our understanding of the use of real world data to emulate clinical trials.
JSR has served as an expert witness. JDW serves as a consultant for Hagens Berman Sobol Shapiro LLP and Dugan Law Firm APLC. NDS is currently employed by Delta Air Lines; he was an employee of Mayo Clinic when this research was conducted. JSR is a co-founder of medRxiv and an associate research editor for The BMJ. Other declarations are: no financial relationships with any organizations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.
Ethical approval: All study data are deidentified consistent with Health Insurance Portability and Accountability Act of 1996 (HIPAA) expert deidentification determination. The study was therefore exempt from review by the Mayo Clinic institutional review board.
Data sharing: This study was conducted using deidentified claims data from OptumLabs Data Warehouse. Raw data are not publicly available. The study protocol, code sets, and statistical analysis plan are available online. 40
The lead author (RGM) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.
Dissemination to participants and related patient and public communities: Study results will be disseminated to patient and public communities through peer reviewed publication, reporting of study results to the Food and Drug Administration, and sharing of the results on social media.
Provenance and peer review: Not commissioned; externally peer reviewed. This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0/.