Safety, effectiveness, and cost effectiveness of long acting versus intermediate acting insulin for patients with type 1 diabetes: systematic review and network meta-analysis
BMJ 2014; 349 doi: http://dx.doi.org/10.1136/bmj.g5459 (Published 01 October 2014) Cite this as: BMJ 2014;349:g5459
- Andrea C Tricco, research scientist1,
- Huda M Ashoor, research coordinator1,
- Jesmin Antony, research coordinator1,
- Joseph Beyene, biostatistician2,
- Areti Angeliki Veroniki, post-doctoral fellow1,
- Wanrudee Isaranuwatchai, research associate1,
- Alana Harrington, research assistant1,
- Charlotte Wilson, research coordinator1,
- Sophia Tsouros, research assistant1,
- Charlene Soobiah, graduate student1,
- Catherine H Yu, endocrinologist1,
- Brian Hutton, research scientist3,
- Jeffrey S Hoch, associate professor of health policy1,
- Brenda R Hemmelgarn, professor of medicine4,
- David Moher, research scientist3,
- Sumit R Majumdar, professor of medicine5,
- Sharon E Straus, professor of medicine1 6
- 1Li Ka Shing Knowledge Institute, St Michael’s Hospital, Toronto, ON, M5B 1T8, Canada
- 2Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON, L8S 4K1, Canada
- 3Clinical Epidemiology Program, Ottawa Hospital Research Institute and Faculty of Medicine, University of Ottawa, Ottawa, ON, K1H 8L6, Canada
- 4Departments of Medicine and Community Health Sciences, University of Calgary, Calgary, AB, T2N 4Z6, Canada
- 5Department of Medicine, University of Alberta, Edmonton, AB, T6G 2R3, Canada
- 6Department of Geriatric Medicine, University of Toronto, Toronto, ON, M5S 1A1, Canada
- Correspondence to: S E Straus
- Accepted 22 August 2014
Objective To examine the safety, effectiveness, and cost effectiveness of long acting insulin for type 1 diabetes.
Design Systematic review and network meta-analysis.
Data sources Medline, Cochrane Central Register of Controlled Trials, Embase, and grey literature were searched through January 2013.
Study selection Randomized controlled trials or non-randomized studies of long acting (glargine, detemir) and intermediate acting (neutral protamine Hagedorn (NPH), lente) insulin for adults with type 1 diabetes were included.
Results 39 studies (27 randomized controlled trials including 7496 patients) were included after screening of 6501 titles/abstracts and 190 full text articles. Glargine once daily, detemir once daily, and detemir once/twice daily significantly reduced hemoglobin A1c compared with NPH once daily in network meta-analysis (26 randomized controlled trials, mean difference −0.39%, 95% confidence interval −0.59% to −0.19%; −0.26%, −0.48% to −0.03%; and −0.36%, −0.65% to −0.08%; respectively). Differences in network meta-analysis were observed between long acting and intermediate acting insulin for severe hypoglycemia (16 randomized controlled trials; detemir once/twice daily versus NPH once/twice daily: odds ratio 0.62, 95% confidence interval 0.42 to 0.91) and weight gain (13 randomized controlled trials; detemir once daily versus NPH once/twice daily: mean difference 4.04 kg, 3.06 to 5.02 kg; detemir once/twice daily versus NPH once daily: −5.51 kg, −6.56 to −4.46 kg; glargine once daily versus NPH once daily: −5.14 kg, −6.07 to −4.21). Compared with NPH, detemir was less costly and more effective in 3/14 cost effectiveness analyses and glargine was less costly and more effective in 2/8 cost effectiveness analyses. The remaining cost effectiveness analyses found that detemir and glargine were more costly but more effective than NPH. Glargine was not cost effective compared with detemir in 2/2 cost effectiveness analyses.
Conclusions Long acting insulin analogs are probably superior to intermediate acting insulin analogs, although the difference is small for hemoglobin A1c. Patients and their physicians should tailor their choice of insulin according to preference, cost, and accessibility.
Systematic review registration PROSPERO CRD42013003610.
To treat hyperglycemia associated with type 1 diabetes, insulin is administered; isophane insulin (neutral protamine Hagedorn (NPH)) and zinc insulin (lente) have been used commonly since the 1950s. Newer insulin analogs (for example, glargine, detemir) are reported to have a longer duration of action and less between patient variability1; they have been available since the early 2000s.
Some evidence suggests that the newer long acting insulin analogs such as glargine and detemir might be more effective and safer than intermediate acting insulin (NPH and lente).2 3 4 5 Although systematic reviews exist on this topic,2 3 4 only efficacy data from randomized trials were analyzed and comparative effectiveness data from observational studies were not included. Furthermore, cost effectiveness data were not considered in these reviews.2 3 4 We did a systematic review and network meta-analysis to examine the comparative effectiveness, safety, and cost effectiveness of long acting insulin versus intermediate acting insulin for patients with type 1 diabetes.
We developed a systematic review protocol using the Preferred Reporting Items for Systematic reviews and Meta-Analysis for Protocols.6 We revised the protocol following feedback from decision makers at the British Columbia Ministry of Health who had posed the query. The final protocol was published and registered with PROSPERO (CRD42013003610).7 8 Our methods are briefly described here.
We included studies of long acting insulin analogs (glargine and detemir) compared with each other or with intermediate acting insulin (NPH and lente) administered to adults with type 1 diabetes. We excluded studies of pre-mixed insulin preparations. Experimental (randomized clinical trials, quasi-randomized trials, non-randomized trials), quasi-experimental (interrupted time series, controlled before and after study), and observational (cohort) study designs were eligible for inclusion.
We worked with decision makers from the British Columbia Ministry of Health to select the outcomes that were most relevant to them.7 The primary outcome was glycated hemoglobin (A1c); secondary outcomes included severe hypoglycemia (as defined by the authors), serious hyperglycemia (as defined by the authors), weight gain, quality of life, microvascular complications, macrovascular complications, all cause mortality, incident cancers, and cost effectiveness. We imposed no restrictions related to publication status or date, and we attempted to translate non-English articles.
The literature search included Medline, Embase, and the Cochrane Central Register of Controlled Trials, supplemented by searching trial registry websites, conference abstracts, and the reference lists of included studies and relevant reviews.2 3 4 We did forward citation searching in Web of Science and searched the 10 most relevant citations in PubMed for all studies fulfilling our eligibility criteria. The Medline search was peer reviewed and published previously.7 9 The other search strategies can be obtained from the authors on request. All searches were executed on January 8, 2013.
After a calibration exercise, two reviewers (ACT, HA, CS, AH, CW, ST, GS) independently screened each citation and subsequent full text article. Conflicts were resolved by discussion for all levels of screening.
We abstracted data for characteristics of studies and patients and for outcome results. For cost effectiveness studies, data items included study characteristics (for example, intervention, comparator, perspective, currency) and results (for example, incremental cost effectiveness ratios, cost per quality adjusted life year, cost per life year). We derived incremental cost effectiveness ratios for studies reporting the difference in both effectiveness and cost between the intervention and control groups by using the following formula: (cost of the intervention−cost of the comparator)/(effectiveness of the intervention−effectiveness of the comparator).10
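The formula above can be sketched in a few lines of Python. This is an illustration only: the `Arm` class, the function names, and the cost and effectiveness figures are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    cost: float           # total cost per patient, in a single currency
    effectiveness: float  # e.g. quality adjusted life years per patient


def icer(intervention: Arm, comparator: Arm) -> float:
    """Incremental cost effectiveness ratio: difference in cost / difference in effect."""
    delta_cost = intervention.cost - comparator.cost
    delta_effect = intervention.effectiveness - comparator.effectiveness
    if delta_effect == 0:
        raise ValueError("ICER is undefined when effectiveness is equal")
    return delta_cost / delta_effect


def classify(intervention: Arm, comparator: Arm) -> str:
    """Label a comparison using the two categories reported in this review."""
    delta_cost = intervention.cost - comparator.cost
    delta_effect = intervention.effectiveness - comparator.effectiveness
    if delta_cost < 0 and delta_effect > 0:
        return "less costly and more effective"
    if delta_cost > 0 and delta_effect > 0:
        return "more costly and more effective"
    return "other"


# Hypothetical figures, for illustration only.
analog = Arm(cost=52_000.0, effectiveness=10.5)
nph = Arm(cost=50_000.0, effectiveness=10.0)

ratio = icer(analog, nph)      # (52000 - 50000) / (10.5 - 10.0)
label = classify(analog, nph)  # "more costly and more effective"
```

When the intervention is less costly and more effective, the ratio itself is negative and is usually not interpreted; the label is what matters in that case.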
Data abstraction and quality/risk of bias appraisal
We appraised quality and risk of bias by using the Cochrane risk of bias tool for randomized controlled trials,11 the Newcastle-Ottawa scale for cohort studies,12 and a 10 item tool developed by Drummond and colleagues for cost effectiveness studies.10 We used the McHarm tool to examine the reporting of adverse drug reactions in the studies that reported harms.13 After calibration of the data abstraction process, two team members (HA, JA, CS, AH, CW) independently abstracted and appraised each of the included studies. Conflicts were resolved by discussion. We contacted authors for missing data or clarifications. Finally, two team members (ACT, HA) reviewed all data to ensure accuracy before analysis.
We did random effects pairwise meta-analysis using the odds ratio effect measure for dichotomous data and the mean difference for continuous data, when at least two studies examined the same intervention and comparator for a particular outcome. For dichotomous outcomes in which studies reported 0 events in one treatment arm, we added 0.5 to the numerator and 1 to the denominator. Studies reporting 0 events in all treatment arms for a particular outcome were excluded from the analysis.
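A minimal sketch of the zero event handling described above, for a single two arm study. It assumes the correction is applied to both arms of a study whenever either arm has 0 events, which is a common convention but is not stated explicitly in the text; the function names and the example counts are hypothetical.

```python
import math
from typing import Optional, Tuple


def log_odds_ratio(
    events_t: int, total_t: int, events_c: int, total_c: int
) -> Optional[Tuple[float, float]]:
    """Return (log odds ratio, standard error), or None if the study is excluded."""
    if events_t == 0 and events_c == 0:
        return None  # 0 events in all arms: excluded from the analysis
    a, n1 = float(events_t), float(total_t)
    c, n2 = float(events_c), float(total_c)
    if events_t == 0 or events_c == 0:
        # Continuity correction: add 0.5 to the numerator and 1 to the denominator.
        # Assumption: applied to both arms of the study.
        a, n1, c, n2 = a + 0.5, n1 + 1.0, c + 0.5, n2 + 1.0
    b, d = n1 - a, n2 - c  # patients without the event in each arm
    if b <= 0 or d <= 0:
        raise ValueError("arms in which every patient had the event are not handled")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def odds_ratio_ci(log_or: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (odds ratio, lower limit, upper limit) of an approximate 95% interval."""
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical study: 0/50 events with treatment, 3/50 with control.
result = log_odds_ratio(0, 50, 3, 50)
```

The log odds ratios and standard errors from each study would then be passed to a random effects model; that pooling step is not shown here.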
When sufficient data were available (that is, at least five studies reported the same outcome with most relevant treatment comparisons examined), we did random effects network meta-analysis. We summarized the evidence by using a network diagram for each outcome. We assumed a common within network estimate for heterogeneity and estimated it with the restricted maximum likelihood method.14 We assessed the transitivity assumption by examining the comparability of the distribution of the treatment effect modifiers across comparisons,15 including hemoglobin A1c levels (<8% v ≥8%), crossover trials, pregnancy, and quality of studies according to the Cochrane risk of bias tool.11 We evaluated whether consistency existed between direct and indirect evidence, and between studies involving different sets of treatments for the same comparison, by using the design by treatment interaction model for the network.16 We assessed inconsistency locally in the network by using the loop specific method.17 We used sub-group and sensitivity analysis to explore important network inconsistency.
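The idea behind a loop specific check can be sketched as follows for a single closed loop of three treatments A, B, and C. This is a simplified illustration, not the cited implementation: it takes the direct estimates and their standard errors as given rather than re-estimating a common heterogeneity, and all names and numbers are hypothetical.

```python
import math
from typing import NamedTuple, Tuple


class Estimate(NamedTuple):
    effect: float  # e.g. mean difference or log odds ratio
    se: float      # standard error of the effect


def indirect(ab: Estimate, bc: Estimate) -> Estimate:
    """Indirect estimate of C versus A obtained through the common comparator B."""
    return Estimate(ab.effect + bc.effect, math.sqrt(ab.se**2 + bc.se**2))


def inconsistency_factor(
    direct_ac: Estimate, ab: Estimate, bc: Estimate, z: float = 1.96
) -> Tuple[float, Tuple[float, float]]:
    """Absolute difference between direct and indirect evidence, with a 95% interval."""
    ind = indirect(ab, bc)
    diff = abs(direct_ac.effect - ind.effect)
    se = math.sqrt(direct_ac.se**2 + ind.se**2)
    return diff, (diff - z * se, diff + z * se)


# Hypothetical mean differences in hemoglobin A1c (%), for illustration only.
ab = Estimate(-0.20, 0.10)         # B versus A, direct
bc = Estimate(-0.10, 0.10)         # C versus B, direct
direct_ac = Estimate(-0.35, 0.12)  # C versus A, direct

factor, (lower, upper) = inconsistency_factor(direct_ac, ab, bc)
# An interval that includes 0 gives no evidence of inconsistency in this loop.
```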
We calculated the network meta-analysis summary treatment effects with their 95% confidence intervals and 95% predictive intervals. We calculated the predictive intervals for the summary treatment effects to capture their uncertainty and the magnitude of the heterogeneity in the network.18 The predictive intervals provide intervals within which the estimated treatment effect of a future study is expected to lie.19 To visually assess the presence of heterogeneity, small study effects, and bias (including publication bias) in the network, we applied the comparison adjusted funnel plot.18 To account for the fact that each study estimates the relative effect of different treatments, we ordered the treatments from oldest to newest, and we used the fixed effects model, as the random effects model is more affected by small study effects. We used the surface under the cumulative ranking curve to rank the treatments.20 We did pairwise meta-analysis with the metafor package in R software and network meta-analysis by using the method of multivariate meta-analysis in Stata with the mvmeta command.21 22
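To show why a predictive interval can cross the null when the confidence interval does not, the sketch below applies the usual approximate formula (summary effect ± t × √(τ² + SE²)) to one summary estimate from this review. The between study variance `tau2` and the critical value `t_crit` are assumptions chosen for illustration (2.228 is the 0.975 quantile of a t distribution with 10 degrees of freedom); the output is not the predictive interval reported in the review.

```python
import math
from typing import Tuple


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Recover a standard error from the limits of a 95% confidence interval."""
    return (upper - lower) / (2 * z)


def prediction_interval(
    effect: float, se: float, tau2: float, t_crit: float
) -> Tuple[float, float]:
    """Approximate interval for the effect expected in a future study."""
    half_width = t_crit * math.sqrt(tau2 + se**2)
    return effect - half_width, effect + half_width


# Glargine once daily versus NPH once daily: mean difference -0.39% (-0.59% to -0.19%).
se = se_from_ci(-0.59, -0.19)

# tau2 and t_crit are hypothetical values, not estimates from the network.
lo, hi = prediction_interval(-0.39, se, tau2=0.04, t_crit=2.228)
```

With these assumed inputs the interval is wider than the confidence interval and includes 0, because the heterogeneity term is added to the uncertainty of the summary effect.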
We considered NPH administered twice daily to be usual care (that is, the reference standard). We grouped treatments into nodes with input from clinicians and examined the robustness of the selected treatment nodes by sensitivity analysis. Specifically, we entered the insulin analogs separately in the analysis without considering daily frequency (glargine versus detemir versus NPH) and then with daily frequency (glargine once daily, glargine twice daily, and so on).
To compare the economic results across the cost effectiveness studies, we converted costing data to 2012 international United States dollars (USD). All costs were converted to US dollars and adjusted for inflation by using the consumer price index for medical care in the United States.23 24
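The two step adjustment can be sketched as below: convert to US dollars first, then inflate with the US medical care index. The conversion rate and index values are placeholders invented for the example; they are not the figures used in the review.

```python
def to_2012_usd(
    local_cost: float,
    local_units_per_usd: float,
    cpi_cost_year: float,
    cpi_2012: float,
) -> float:
    """Convert a cost to US dollars, then inflate it to 2012 price levels."""
    if local_units_per_usd <= 0 or cpi_cost_year <= 0:
        raise ValueError("conversion rate and index values must be positive")
    usd_in_cost_year = local_cost / local_units_per_usd
    return usd_in_cost_year * (cpi_2012 / cpi_cost_year)


# Placeholder inputs, for illustration only.
adjusted = to_2012_usd(
    local_cost=1000.0,        # cost reported in the study's own currency
    local_units_per_usd=0.5,  # hypothetical conversion rate
    cpi_cost_year=320.0,      # hypothetical medical care index in the costing year
    cpi_2012=400.0,           # hypothetical medical care index in 2012
)
```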
The literature search yielded a total of 6501 titles and abstracts (fig 1⇓). Thirty nine studies fulfilled the eligibility criteria, including 38 primary publications5 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 and one companion report.62
Characteristics of studies and patients
One study was written in Serbian,33 one was a trial protocol that reported unpublished results,34 and another was a conference abstract that reported cost effectiveness results.56 Twenty seven studies were randomized controlled trials with study durations ranging from 4 to 104 weeks and including 7496 patients (table 1⇓).25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 Most of the randomized controlled trials were multicenter, were conducted in Europe or North America, and examined glargine versus NPH (10 trials), detemir versus NPH (11 trials), or detemir versus glargine (3 trials). At baseline, the proportion of female participants ranged from 21% to 100% (in a study of pregnant women) (appendix 7). The mean age ranged from 28 to 47 years, mean body mass index from 23.1 to 28.0, hemoglobin A1c from 6.9% to 9.5%, and duration of type 1 diabetes from 11 to 27 years. One study was a cohort with more than 14 years of follow-up. More than 3100 patients with type 1 diabetes were included in this study from the United Kingdom, and the effects of glargine versus detemir were examined (table 1⇓).52 Ten of the included publications were cost effectiveness studies conducted in Europe or North America (table 1⇓).5 53 54 55 56 57 58 59 60 61
Risk of bias and methodological quality results
Most of the trials had an unclear risk of bias on three to five of the seven items; 17 randomized controlled trials had a high risk of bias on one to three of the criteria (appendices 1 and 2). Only 41% of the randomized controlled trials had a low risk of bias for random sequence generation, and approximately 19% had a low risk of bias on allocation concealment. Twenty randomized controlled trials were funded by drug companies, and some of the authors of these trials were employed by the company funding the study.
Twenty one randomized controlled trials reported on harms associated with treatment and were assessed using the McHarm tool (appendices 3 and 4).13 Most randomized controlled trials reported between 7% and 53% of the 15 items; two adequately reported more than 60% of the items.37 38
The cohort study fulfilled most of the Newcastle-Ottawa scale criteria.12 52 However, the authors did not report the number of withdrawals or control for confounders in their analysis (appendix 5). This study was funded by a drug company.
Three cost effectiveness analyses fulfilled all of the methodological quality criteria on the Drummond tool,5 10 57 59 and four were scored as unclear on one or two of the criteria (appendix 6).54 55 58 60 One study obtained effectiveness estimates from an observational study,53 and another based effectiveness on overall hypoglycemia instead of mild hypoglycemia.54 Both were scored as not adequately fulfilling the effectiveness item. One publication was a conference abstract, and 60% of the items were assessed as being unclear.56 Eight studies were funded by a drug company,53 54 55 57 58 59 60 61 whereas the source of funding was not reported for two cost effectiveness studies.5 56
Network meta-analysis results
Primary outcome—hemoglobin A1c
We did a network meta-analysis on hemoglobin A1c that included 26 randomized controlled trials and 6776 patients (appendix 8).25 26 27 28 29 30 31 32 33 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 After a median of 20 weeks of follow-up, only NPH once daily resulted in a significantly greater hemoglobin A1c compared with NPH once or twice daily (mean difference 0.31%, 95% confidence interval 0.02% to 0.60%) (table 2⇓; fig 2⇓). Glargine once daily (mean difference −0.39%, −0.59% to −0.19%), detemir once daily (−0.26%, −0.48% to −0.03%), and detemir once or twice daily (−0.36%, −0.65% to −0.08%) resulted in significantly reduced hemoglobin A1c compared with NPH once daily. However, none of these was statistically significant in our network meta-analysis including predictive intervals. We observed no other statistically significant differences.
On the basis of the surface under the cumulative ranking curve, glargine once daily had the greatest likelihood of being the most effective insulin in reducing hemoglobin A1c, followed by glargine twice daily, detemir four times a day, and detemir once or twice daily (appendix 16). Similar results were observed in our sensitivity analysis when the treatment nodes were glargine, detemir, and NPH (that is, not incorporating daily dosing). Results for direct and indirect meta-analyses were statistically consistent (table 2⇑).
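For readers unfamiliar with the ranking statistic, the sketch below shows how a surface under the cumulative ranking curve value is obtained from a treatment's rank probabilities: it is the mean of the cumulative probabilities over the first a−1 ranks, where a is the number of treatments. The treatment labels and probabilities are hypothetical and are not taken from appendix 16.

```python
from itertools import accumulate
from typing import Dict, List


def sucra(rank_probs: List[float]) -> float:
    """rank_probs[i] is the probability of rank i+1 (rank 1 = best)."""
    n_treatments = len(rank_probs)
    if n_treatments < 2:
        raise ValueError("at least two treatments are needed")
    if abs(sum(rank_probs) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = list(accumulate(rank_probs))
    # The last cumulative value is always 1 and is left out of the mean.
    return sum(cumulative[:-1]) / (n_treatments - 1)


# Hypothetical rank probabilities for three treatments, for illustration only.
rank_probabilities: Dict[str, List[float]] = {
    "treatment A": [0.7, 0.2, 0.1],
    "treatment B": [0.2, 0.5, 0.3],
    "treatment C": [0.1, 0.3, 0.6],
}

scores = {name: sucra(probs) for name, probs in rank_probabilities.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

A value of 1 means a treatment is certain to be the best and 0 that it is certain to be the worst; the values say nothing about the size of the difference between treatments.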
We did a sub-group analysis by baseline hemoglobin A1c. For patients with poorly controlled hemoglobin A1c (values ≥8%), the following types of insulin were examined in 12 randomized controlled trials including 4002 patients: NPH once or twice daily, NPH once daily, glargine once daily, detemir once daily, and detemir once or twice daily. Glargine once daily resulted in significantly improved hemoglobin A1c compared with NPH once daily (mean difference −0.65%, −0.96% to −0.35%) and detemir once daily (−0.41%, −0.74% to −0.08%). Both detemir once or twice daily (mean difference −0.49%, −0.82% to −0.16%) and detemir administered four times daily (−0.65%, −1.12% to −0.17%) significantly reduced hemoglobin A1c compared with NPH once daily. On the basis of the surface under the cumulative ranking curve, glargine once daily had the greatest likelihood of being the most effective insulin in reducing hemoglobin A1c.
For patients with hemoglobin A1c values below 8%, the following types of insulin were examined in 14 randomized controlled trials including 2774 patients: NPH once or twice daily, NPH once daily, NPH four times daily, glargine once daily, glargine twice daily, detemir once daily, and detemir once or twice daily. Glargine once daily significantly improved hemoglobin A1c compared with NPH once daily (mean difference −0.29%, −0.58% to −0.01%). On the basis of the surface under the cumulative ranking curve, detemir once daily had the greatest likelihood of being the most effective insulin in reducing hemoglobin A1c, followed by NPH once or twice daily and glargine once daily.
We did another sub-group analysis in which the randomized controlled trial including pregnant women was removed25; the same results were observed. We did a sub-group analysis excluding crossover trials and studies with high risk of allocation concealment bias from the network30 44 49; the results did not change. The observational study reported hemoglobin A1c results after 14 years of follow-up, but we were unable to include this study in our network meta-analysis as the authors did not report a variance measure associated with the mean.52
Secondary outcome—body weight
We did a network meta-analysis on body weight, including 13 randomized controlled trials and 3396 patients (appendix 23).26 27 28 30 32 35 39 41 42 44 47 49 50 After a median of 26 weeks of follow-up, NPH once daily (mean difference 4.64 kg, 3.53 to 5.75 kg) and detemir once daily (4.04 kg, 3.06 to 5.02 kg) caused significantly more weight gain compared with NPH once or twice daily; however, detemir once or twice daily caused significantly less weight gain than NPH once or twice daily (−0.87 kg, −1.44 to −0.30 kg) (table 2⇑). Patients receiving detemir once or twice daily (mean difference −5.51 kg, −6.56 to −4.46 kg), detemir once daily (−0.60 kg, −1.13 to −0.08 kg), and glargine once daily (−5.14 kg, −6.07 to −4.21 kg) had significantly less weight gain than those receiving NPH once daily. Furthermore, detemir once daily caused significantly more weight gain than detemir once or twice daily (mean difference 4.91 kg, 4.00 to 5.81 kg) and glargine once daily caused significantly less weight gain than detemir once daily (−4.54 kg, −5.30 to −3.77 kg). Our network meta-analysis including the predictive intervals showed similar results, except that results for detemir once or twice daily versus NPH once or twice daily and detemir once daily versus NPH once daily were no longer significant. According to the surface under the cumulative ranking curve, detemir once or twice daily had the greatest probability of causing the least amount of weight gain, followed by glargine once daily. The design by treatment interaction model suggested no inconsistency in the network as a whole. Our sub-group analysis that removed crossover trials produced the same results.26 32 We were unable to include the observational study in the network meta-analysis because the authors did not report a variance measure associated with the mean body weight.52
Secondary outcome—severe hypoglycemia
We did a network meta-analysis on the number of patients experiencing severe hypoglycemia, which included 16 randomized controlled trials and 5697 patients (appendix 33).25 26 28 30 32 35 37 38 39 42 44 45 47 48 49 50 The definition of severe hypoglycemia varied across the studies (appendix 36). After a median of 24 weeks of follow-up, patients receiving detemir once or twice daily experienced significantly less severe hypoglycemia than those receiving NPH once or twice daily (odds ratio 0.62, 95% confidence interval 0.42 to 0.91) (table 2⇑). However, this was no longer statistically significant in our network meta-analysis including the predictive intervals. Glargine twice daily and detemir once daily had the greatest likelihood of causing the least severe hypoglycemia according to the surface under the cumulative ranking curve. Statistical inconsistency was observed in the network as a whole despite several sub-group and sensitivity analyses, including removing the randomized controlled trial including pregnant women25 or crossover trials.26 32 37 38 Inconsistency was no longer statistically significant when three randomized controlled trials with high risk of bias were excluded.30 44 49 After this exercise, we found the same results for detemir once or twice daily versus NPH once or twice daily. However, detemir administered four times daily caused significantly more patients to experience severe hypoglycemia compared with detemir once or twice daily (odds ratio 3.09, 1.31 to 7.27) or glargine once daily (1.69, 1.12 to 2.50). The same results were observed with the surface under the cumulative ranking curve, and similar results were observed in our sensitivity analysis of treatment nodes.
Direct comparisons meta-analysis and single point estimate results
Secondary outcome—number of patients experiencing serious hyperglycemia
One randomized controlled trial reported serious hyperglycemic events, which were not defined. After 26 weeks of follow-up, no statistically significant difference existed between detemir twice daily and NPH twice daily (relative risk 5.74, 0.23 to 140.48).44
Secondary outcome—microvascular complications
Three randomized controlled trials reported retinopathy, but all examined different insulin treatments. No statistically significant differences were observed for glargine twice daily versus NPH twice daily after 16 weeks of follow-up (relative risk 1.28, 0.48 to 3.40),49 detemir twice daily versus NPH twice daily after 24 weeks of follow-up (0.89, 0.43 to 1.86),25 and detemir twice daily versus NPH twice daily after 26 weeks of follow-up (1.62, 0.66 to 3.93).45
Secondary outcome—macrovascular complications
One randomized controlled trial reported transient ischemic attack after 16 weeks of follow-up.41 No statistically significant difference was observed between detemir twice daily and NPH twice daily (relative risk 2.93, 0.12 to 71.33). One randomized controlled trial reported death due to myocardial infarction and found no statistically significant differences between detemir once daily and NPH once daily after 32 weeks of follow-up (relative risk 4.47, 0.24 to 82.58),38 and another found similar results for death due to cardiopulmonary arrest for glargine once daily versus NPH twice daily after 28 weeks of follow-up (0.34, 0.01 to 8.33).50
Secondary outcome—all cause mortality
After a median of 24 weeks of follow-up, no statistically significant difference was observed between detemir twice daily and NPH twice daily for all cause mortality (two randomized controlled trials; odds ratio 0.97, 0.10 to 9.44; I²=0%).38 41 Five randomized controlled trials reported 0 deaths in both arms for detemir once daily versus glargine once daily,26 glargine twice daily versus glargine once daily,37 detemir once daily versus NPH twice daily,25 34 and glargine once daily versus NPH twice daily,49 so they were not included in the meta-analysis.
Secondary outcome—incident cancer
After 16 weeks of follow-up, no statistically significant difference was observed between glargine once daily and NPH twice daily for pancreatic cancer (relative risk 0.33, 0.01 to 8.12).49 Similarly, after 26 weeks of follow-up, no statistically significant difference was observed between detemir twice daily and NPH twice daily for uterine cancer (relative risk 1.46, 0.06 to 35.63).47
Secondary outcome—quality of life
After 24 weeks of follow-up, quality of life did not differ between glargine once daily (median 32, interquartile range 27-34) and NPH twice daily (median 31, interquartile range 25-34) on the Well-Being Enquiry for Diabetics questionnaire in one randomized controlled trial.31
Secondary outcome—cost effectiveness
Fourteen cost effectiveness analyses reported in five studies compared detemir with NPH (table 3⇓).5 54 55 57 58 Three of these analyses found that detemir was less costly and more effective,54 55 57 and the others found that detemir was more costly but also more effective than NPH. Eight cost effectiveness analyses reported in five studies compared glargine with NPH.53 56 59 60 Glargine was less costly and more effective in two of these analyses,53 and six found that glargine was more costly and more effective than NPH. Compared with detemir, glargine was not cost effective in either analysis examining this comparison.61
We found that glargine once daily, detemir once daily, and detemir once or twice daily significantly reduced hemoglobin A1c compared with NPH once daily in our network meta-analysis. Given the small reduction in hemoglobin A1c observed, the differences are unlikely to be clinically relevant because they did not approach the commonly accepted 0.5% minimal clinically important difference in hemoglobin A1c. Furthermore, none of the treatment comparisons was statistically significant in our network meta-analysis including the predictive intervals, suggesting that the estimates are probably unstable and that any treatment may be effective in a future study. Of note, glargine once daily was significantly superior to NPH once daily overall and in our sub-group analysis on baseline values of hemoglobin A1c. Glargine once daily was also more effective for patients with poorly controlled diabetes (those with hemoglobin A1c ≥8%) than those with hemoglobin A1c below 8%. Many of our results comparing weight gain for long acting and intermediate acting insulin were clinically relevant. NPH once daily and detemir once daily caused more weight gain than NPH once or twice daily. Detemir once or twice daily and glargine once daily caused significantly less weight gain than NPH once daily. Also, patients receiving detemir once or twice daily experienced significantly less severe hypoglycemia compared with those receiving NPH once or twice daily. The cost effectiveness analysis results were inconsistent across studies. In aggregate, our findings suggest that in terms of glycemic control, long acting insulin analogs are slightly superior to intermediate acting insulin analogs; in terms of safety, long acting insulin analogs are associated with slightly less weight gain and fewer episodes of severe hypoglycemia.
Strengths and weaknesses in relation to other studies, discussing important differences in results
In a previous systematic review and network meta-analysis, Sanches and colleagues concluded that detemir and glargine were not clinically different from NPH for glycemic control or safety, whereas we found that long acting insulin analogs are slightly superior to intermediate acting insulin analogs for glycemic control.3 We also found that harms (weight gain and severe hypoglycemia) occurred less often for patients receiving long acting versus intermediate acting insulins. In addition, we included five randomized controlled trials including 1028 patients (probably owing to our later search date), one cohort study, and 10 cost effectiveness studies that have never been included in any of the previous systematic reviews in this area (appendices 37 and 38).2 4 63
The reported methods of the randomized controlled trials included in our review could be strengthened by using an adequate method of randomization (for example, computer generated), conducting appropriate allocation concealment (for example, centralized randomization), and adequately reporting harms outcomes. Furthermore, the definitions of severe hypoglycemia varied across studies, and studies were limited further by lack of reporting of the dosages of insulin administered to the patients. The randomized controlled trials could be strengthened by a longer duration of follow-up, as the average duration of follow-up ranged from 16 to 26 weeks. This is of particular importance for the all cause mortality and incident cancer outcomes, which require longer durations of follow-up for events to occur. This major shortcoming means that our results should be interpreted with caution. The cohort study that was included could be strengthened by reporting estimates of variance and number of withdrawals, controlling for confounders in the analysis, and examining harms outcomes. Another limitation is that variation exists in the bioavailability of the different types of insulin. For example, larger doses of insulin detemir are required compared with glargine or NPH, increasing the cost of detemir. None of the included cost effectiveness analyses adequately addressed the question of dosages required. Finally, the cost effectiveness analyses included in our review could be improved by using appropriate effectiveness estimates (for example, from randomized trials) and reporting the source of funding.
Our systematic review process also has limitations. We excluded a randomized controlled trial written in Japanese,64 as we were unable to translate the study, and some of our results were only based on one included study (for example, retinopathy, quality of life) and should be interpreted with caution. Finally, we focused on cost effectiveness results that were dominant (less costly, more effective). However, many of the included cost effectiveness analyses found that detemir was more costly and more effective than NPH and that glargine was more costly and more effective than NPH. In these cases, the cost effectiveness of the insulin analog will depend on the decision makers’ willingness to pay. Moreover, the cost variable in each analysis included the cost of treatment and additional costs, depending on the perspective of the analysis (see table 3⇑).
In addition, network meta-analysis is complex and can be difficult for decision makers to interpret. For example, in our surface under the cumulative ranking curve analysis for weight, detemir once or twice daily was the intervention with the greatest likelihood of ranking. This is because of the large and precise effect size that favors the particular intervention but also the network structure. Each direct estimate contributes differently to the network meta-analysis estimation depending on the information it provides. The NPH once or twice daily versus detemir once or twice daily comparison includes the greatest number and size of studies, and a small variance, which can highly influence the neighboring comparisons via the indirect comparison. The contribution of each comparison to the network summary effects can affect the significance or the validity of the results. For instance, excluding the poor quality studies in the same network did not affect the results, because these studies had a low contribution to the network. This might not be an intuitive finding, as detemir once or twice daily was not statistically significantly superior to all agents. Also, our network meta-analysis results are limited by the minimal evidence available resulting in sparse network configurations, which was especially apparent for the body weight outcome. Furthermore, we were unable to include non-randomized evidence in our network meta-analysis of glycemic control and body weight, owing to missing measures of variance.
Meaning of study: possible explanations and implications for clinicians and policy makers
Long acting insulin analogs are probably superior to intermediate acting insulin for some outcomes, although the difference is small for hemoglobin A1c. Patients and their physicians should tailor their choice of long acting insulin according to their preference, cost, and accessibility.
Unanswered questions and future research
Future trials with a sufficient number of patients and longer duration of follow-up should be developed to specifically examine long acting versus intermediate acting insulin for patients with type 1 diabetes. A cost effectiveness analysis comparing all of these interventions would allow policy makers to select the type of insulin that is most cost effective.
What is already known on this topic
Newer long acting insulin analogs might be more effective and safer than intermediate acting insulin in patients with type 1 diabetes but are more expensive
Previous systematic reviews have not included data from observational studies or economic analyses
What this study adds
This systematic review and network meta-analysis includes five randomized controlled trials (with 1028 patients), one cohort study, and 10 cost effectiveness studies that have never been included in a previous systematic review or network meta-analysis
In patients with type 1 diabetes, long acting insulin is statistically significantly superior to intermediate acting insulin for glycemic control and harms (weight gain and severe hypoglycemia)
The cost effectiveness of intermediate acting and long acting insulins varied across the included cost effectiveness analyses, but glargine and detemir are more costly than neutral protamine Hagedorn (NPH) in most cases
We thank the British Columbia Ministry of Health for requesting this systematic review and for their useful feedback on the review conception. We thank Laure Perrier for doing the literature searches, Becky Skidmore for peer reviewing the Medline search strategy, and Alissa Epworth for doing the forward citation scanning and the PubMed related article searches. Finally, we thank Maggie Chen for providing feedback on our original proposal; Kednapa Thavorn for appraising the quality of cost effectiveness studies; Judy Tran for locating full text articles; Geetha Sanmugalingham for screening some titles and abstracts; Wing Hui for scanning the reference lists of some included studies, helping to generate the data tables, and formatting the manuscript; and Wasifa Zarin for helping to reformat the paper.
Contributors: ACT conceived and designed the study, helped to obtain funding for the study, screened literature for inclusion, abstracted data from included studies, and wrote the manuscript. HMA coordinated the study, screened the literature search results, and abstracted data. JB, AAV, and WI analyzed the results and helped to draft sections of the paper. AH, CS, CW, JA, and ST helped to screen the literature and/or abstracted data. CHY, BH, BRH, DM, and SRM helped to obtain funding for the study and to conceive the study. SES conceived and designed the study, obtained the funding, and helped to write the draft paper. All authors interpreted the results and read, edited, and approved the final paper. SES is the guarantor.
Funding: This systematic review was funded by the Canadian Institutes of Health Research/Drug Safety and Effectiveness Network (CIHR/DSEN). The study funder had no role in the design, conduct, or analysis of the study or in the decision to submit for publication. ACT and BH are funded by CIHR/DSEN new investigator awards in knowledge synthesis. JB holds the John D Cameron endowed chair in the genetic determinants of chronic diseases, Department of Clinical Epidemiology and Biostatistics, McMaster University. DM is funded by a University of Ottawa research chair. SRM is the endowed chair in patient health management (supported by the Faculties of Medicine and Dentistry and Pharmacy and Pharmaceutical Sciences) and holds a health scholar salary award (supported by Alberta Heritage Foundation for Medical Research and Alberta Innovates - Health Solutions). SES is funded by a tier 1 Canada research chair in knowledge translation.
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: this work was funded by the Canadian Institutes of Health Research; no financial relationships with any organization that might have an interest in the submitted work in the previous three years; no relationships or activities that could appear to have influenced the submitted work.
Ethics approval: Not needed.
Declaration of transparency: The lead author (study guarantor) affirms that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.
Data sharing: The data set and literature search are available from the corresponding author.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/.