Outcomes in observational datasets: confounders and the “good doctor effect”
We read with great interest the study by Vinogradova and colleagues, which investigates clinical outcomes in patients treated with direct-acting oral anticoagulants (DOACs) or warfarin in a nationwide cohort [1]. The study concluded that apixaban was associated with reduced risks of major, intracranial and gastrointestinal bleeding compared with warfarin, in concordance with previous studies. However, it also concluded that rivaroxaban and low-dose apixaban were associated with higher mortality than warfarin, and these findings are in stark contrast with the literature. In 2017, a meta-analysis of previous observational studies showed that apixaban was associated with a significant reduction in all-cause mortality compared with warfarin (HR 0.65; 95% CI 0.56-0.75) [2]. Rivaroxaban was associated with a numerically similar, but non-significant, reduction in all-cause mortality compared with warfarin (HR 0.67; 95% CI 0.35-1.30). Furthermore, in randomised studies, DOACs (including rivaroxaban and apixaban) reduced all-cause mortality compared with warfarin (HR 0.90; 95% CI 0.85-0.95), with no evidence of heterogeneity between different DOACs in a formal meta-analysis [3].
Treatment choices are influenced by a variety of factors in clinical practice, including perceived or real safety, “adequacy” of a specific drug for a given patient, familiarity with a specific drug, price, local availability (including availability in a region or in a specialty), and others. Because a treatment that is perceived as safer tends to be chosen for frailer, higher-risk patients, its selection can result in an association with worse outcomes. We suggest calling this natural consequence of physicians wanting the best for their patients the “good doctor effect”. Some factors that influence outcomes can be measured, and large data sets allow correction for measurable factors using different statistical models. Others cannot be measured (sometimes referred to as “residual confounders”). Only randomisation can eliminate such confounders. Vinogradova et al. adjusted for a selection of known confounders, which attenuated the differences in mortality, but hazard ratios for mortality vs. warfarin in AF patients were still numerically raised in patients treated with rivaroxaban (HR 1.19; 95% CI 1.09-1.29) and in patients treated with apixaban (HR 1.13; 95% CI 1.01-1.25). The 95% confidence intervals did not include 1 for either drug (although statistical significance was only declared for rivaroxaban due to the choice of a threshold for statistical significance of P<0.01).
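The relationship between the quoted confidence intervals and the P<0.01 threshold can be checked with a rough back-calculation. The sketch below is not the authors' analysis: it assumes a Wald-type interval that is symmetric on the log scale and works only from the rounded figures quoted above, so the resulting P values are approximate. The function name `p_value_from_hr_ci` is ours.

```python
import math

def p_value_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate two-sided P value from a hazard ratio and its 95% CI.

    Assumption: the interval is a Wald interval, symmetric on the log scale.
    """
    log_hr = math.log(hr)
    # Back out the standard error of log(HR) from the width of the interval.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided tail probability of a standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hazard ratios for mortality vs. warfarin as quoted in the text.
for drug, hr, lower, upper in [("rivaroxaban", 1.19, 1.09, 1.29),
                               ("apixaban", 1.13, 1.01, 1.25)]:
    p = p_value_from_hr_ci(hr, lower, upper)
    print(f"{drug}: P ~ {p:.4f}; below 0.05: {p < 0.05}; below 0.01: {p < 0.01}")
```

Under these assumptions the approximate P value for apixaban lies between 0.01 and 0.05, whereas that for rivaroxaban lies below 0.01, which is consistent with significance having been declared for rivaroxaban only.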
The authors did not include statistical comparisons of baseline characteristics between the different treatment groups, but simple Chi-squared tests show that the proportion of patients aged over 90 years was significantly higher in both rivaroxaban-treated and apixaban-treated patients than in warfarin-treated patients (both P<0.001). The effects of age and of related frailty and comorbidities are unlikely to be fully corrected for, even with the use of a polynomial term in the statistical model. In addition, important confounders that affect treatment choice were not included in the model, such as patient compliance, level of creatinine clearance, or concomitant treatment with dual antiplatelet therapy in the context of acute coronary syndrome.
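For readers who wish to repeat this kind of comparison, a Chi-squared test on a 2x2 table of baseline counts can be sketched as follows. The counts in the example are HYPOTHETICAL placeholders chosen only to make the code run; they are not taken from the study by Vinogradova et al. and should be replaced with the published baseline figures. The sketch uses the Pearson statistic without continuity correction, which may differ from the exact variant used for the P values quoted above.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson Chi-squared test (no continuity correction) for a 2x2 table.

    Rows are treatment groups; columns are aged over 90 (yes, no):
        group 1: a, b
        group 2: c, d
    Returns the test statistic and the P value for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# HYPOTHETICAL counts, for illustration only (not data from the study).
doac_over_90, doac_90_or_under = 600, 9400
warfarin_over_90, warfarin_90_or_under = 400, 9600

stat, p = chi_squared_2x2(doac_over_90, doac_90_or_under,
                          warfarin_over_90, warfarin_90_or_under)
print(f"Chi-squared = {stat:.1f}, P = {p:.2g}")
```

With cohorts of this size, even a difference of a few percentage points in the proportion of very elderly patients produces a very small P value, so the test mainly confirms that the groups differ rather than quantifying how much the difference matters.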
Further analysis demonstrated that a substantial proportion of the increased risk associated with rivaroxaban and apixaban was related to the lower dosages of both of these drugs. Similar findings have indeed been demonstrated previously [4]. However, the effects of indication bias are amplified in patients treated with the lower dosages of DOACs, because these patients have an additional indication for the lower dose rather than the standard dose, especially if this is not specifically adjusted for. Dose reduction for rivaroxaban is based on creatinine clearance, whilst age, body weight and level of creatinine are used to determine the dose of apixaban. Of these factors, the study by Vinogradova et al. only directly adjusted for age. Inclusion of BMI and a history of chronic kidney disease in the model may have partially adjusted for weight and level of creatinine clearance respectively, both important factors for mortality, but residual confounding will have remained.
Taking all of this into account, a combination of unrecognised confounders and imperfect adjustment for recognised confounders provides a more plausible explanation for the increased risk of mortality seen in patients treated with rivaroxaban and apixaban than a causal effect of the drugs. In keeping with this, most of the difference in mortality between treatment groups was not related to stroke, venous thromboembolism or bleeding, which are the outcomes that the drugs would be expected to affect causally. We would therefore argue that the authors cannot be sufficiently confident of causation to use the terms “number needed to harm” and “number needed to treat”, which should be reserved for causative effects.
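The mechanism described above can be illustrated with a small simulation. This is a toy sketch, not a model of the study: all probabilities are hypothetical, frailty is reduced to a single unmeasured yes/no factor, and simple risk ratios are used in place of hazard ratios. The simulated drug has no effect on death at all; it is merely prescribed more often to frail patients.

```python
import random

def simulate(n=200_000, seed=0):
    """Simulate a drug with NO causal effect on death that is prescribed
    preferentially to frail patients (all probabilities are hypothetical)."""
    rng = random.Random(seed)
    # counts[(frail, on_new_drug)] = [deaths, patients]
    counts = {(f, t): [0, 0] for f in (0, 1) for t in (0, 1)}
    for _ in range(n):
        frail = rng.random() < 0.30                          # unmeasured frailty
        treated = rng.random() < (0.70 if frail else 0.30)   # "safer" drug favoured in frail patients
        died = rng.random() < (0.20 if frail else 0.05)      # death depends on frailty only
        cell = counts[(int(frail), int(treated))]
        cell[0] += died
        cell[1] += 1
    return counts

def risk_ratio(counts, strata):
    """Risk of death on the new drug divided by the risk on the comparator,
    pooled over the given frailty strata."""
    def risk(t):
        deaths = sum(counts[(f, t)][0] for f in strata)
        total = sum(counts[(f, t)][1] for f in strata)
        return deaths / total
    return risk(1) / risk(0)

counts = simulate()
print("crude risk ratio:", round(risk_ratio(counts, (0, 1)), 2))
print("frail only:      ", round(risk_ratio(counts, (1,)), 2))
print("non-frail only:  ", round(risk_ratio(counts, (0,)), 2))
```

In this sketch the crude risk ratio is well above 1 although the drug is inert, while the ratios within each frailty stratum are close to 1. In real observational data the stratifying factor is by definition unavailable when the confounder is unmeasured, which is why adjustment cannot be relied upon to remove the effect.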
Due to the limitations of studies based on routinely collected data (including indication bias and concerns regarding accuracy and reliability of data) [5,6], we advocate that effectiveness comparisons should be based on the results of randomised controlled trials wherever possible. In contrast, information on appropriate drug usage (e.g. choosing a particular dose) and on safety can be taken from observational data sets, which often enable insights that are not possible in controlled clinical trials. In a previous study in the BMJ by Hemkens et al., approximately one third of studies based on routine health data produced treatment effects that were in fact in the opposite direction to those of subsequent randomised controlled trials that investigated the same question [6]. Another report from our institute demonstrated that the difference in mortality between different treatments (digoxin and β blockers in that example) was largely dependent on the risk of unmeasured confounders [7]. Based on these general considerations, comparing the effectiveness of different treatments in observational data sets will remain difficult, and caution is needed when interpreting the results of Vinogradova et al. Only properly powered randomised trials comparing different DOACs can provide a definitive answer to the question of whether one DOAC is safer or more effective than another. The meta-analyses of controlled clinical trials suggest a group effect compared with warfarin.
1 Vinogradova Y, Coupland C, Hill T, et al. Risks and benefits of direct oral anticoagulants versus warfarin in a real world setting: cohort study in primary care. BMJ 2018;362:k2505.
2 Ntaios G, Papavasileiou V, Makaritsis K, et al. Real-World Setting Comparison of Nonvitamin-K Antagonist Oral Anticoagulants Versus Vitamin-K Antagonists for Stroke Prevention in Atrial Fibrillation: A Systematic Review and Meta-Analysis. Stroke 2017;48:2494–503.
3 Ruff CT, Giugliano RP, Braunwald E, et al. Comparison of the efficacy and safety of new oral anticoagulants with warfarin in patients with atrial fibrillation: a meta-analysis of randomised trials. Lancet 2014;383:955–62.
4 Nielsen PB, Skjøth F, Søgaard M, et al. Effectiveness and safety of reduced dose non-vitamin K antagonist oral anticoagulants and warfarin in patients with atrial fibrillation: propensity weighted nationwide cohort study. BMJ 2017;356:j510.
5 McMurray JJV. Only Trials Tell the Truth About Treatment Effects. J Am Coll Cardiol 2018;71:2640–2.
6 Hemkens LG, Contopoulos-Ioannidis DG, Ioannidis JPA. Agreement of treatment effects for mortality from routinely collected data and subsequent randomized trials: meta-epidemiological survey. BMJ 2016;352:i493.
7 Ziff OJ, Lane DA, Samra M, et al. Safety and efficacy of digoxin: systematic review and meta-analysis of observational and controlled trial data. BMJ 2015;351:h4451.
Competing interests: M Thomas: no competing interests. D Connolly: has received research funding from Bayer and honoraria from Bayer, Pfizer, BMS and Daiichi-Sankyo. P Kirchhof: PK receives research support for basic, translational, and clinical research projects from European Union, British Heart Foundation, Leducq Foundation, Medical Research Council (UK), and German Centre for Cardiovascular Research, from several drug and device companies active in atrial fibrillation, and has received honoraria from several such companies in the past. PK is listed as inventor on two patents held by University of Birmingham (Atrial Fibrillation Therapy WO 2015140571, Markers for Atrial Fibrillation WO 2016012783).