The effect of omitted covariates on confidence interval and study power in binary outcome analysis: A simulation study
Introduction
The consequence of omitting balanced covariates under various non-linear models was first demonstrated by Gail et al. [1]. By a balanced covariate we mean one whose distribution is comparable between the exposure/intervention groups. Asymptotically, i.e., as sample size increases, randomisation ensures well-balanced, comparable intervention groups and hence minimises the potential for confounding. However, in small-to-moderate studies, chance imbalance in the distribution of important covariates remains possible after randomisation unless stratified block randomisation is employed. The main finding of Gail et al. [1] was a downward bias, i.e., underestimation of the effect of the exposure or intervention of interest, when important covariates are omitted. Hauck et al. [2] discussed the same issue in terms of the different definitions of confounding and illustrated how these definitions at times disagree. These authors also pointed out an important distinction: the issue of omitted covariates is different from that of classical confounding [3]. The direction of the bias due to omitted covariates is predictable, i.e., always towards the null, whereas the bias due to classical confounding can go in either direction. In particular, if the variable of main interest, such as exposure or intervention, has no effect on the outcome, then omitted covariates do not introduce bias, while confounders do. Chao et al. [4] extended the general attenuation result to the case of correlated binary outcomes.
What is not known in the literature is the impact of omitted covariates on the confidence interval and on study power. Hauck et al. [2] argue, indirectly, that omitted covariates lead to a loss of efficiency, since omitting covariates is a form of model misspecification [5]. The goal of this paper is to investigate, by means of a simulation study, the effect of omitted but balanced covariates on confidence interval estimation and study power in an uncorrelated binary outcome setting.
Section snippets
Example of what is already known
Table 1 illustrates the effect of an omitted covariate on point estimation of odds ratio using a hypothetical study.
P(D|Ē) is the proportion with the outcome/disease among subjects without exposure/intervention and P(D|E) is the proportion with the outcome/disease among subjects with exposure/intervention. In addition, it is assumed that the probability of assignment to each stratum of the covariate is 0.5 and exposure within each stratum is balanced. Under this configuration, the stratum
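The attenuation of the crude odds ratio under this kind of configuration can be illustrated with a short calculation. The sketch below does not reproduce Table 1: the stratum-specific baseline risks (0.2 and 0.8), the common within-stratum odds ratio of 2.0, and the function name `marginal_odds_ratio` are hypothetical values chosen for illustration. It assumes, as in the text, that each stratum has probability 0.5 and that exposure is balanced within strata, so the crude risks are simple weighted averages of the stratum risks.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def marginal_odds_ratio(baseline_risks, conditional_or, weights):
    """Crude odds ratio obtained when the stratifying covariate is omitted.

    baseline_risks: P(D | unexposed) in each covariate stratum (hypothetical).
    conditional_or: the same exposure odds ratio assumed in every stratum.
    weights: probability of each stratum; exposure is assumed independent
             of the covariate, so the weights apply to both exposure groups.
    """
    risk_unexposed = 0.0
    risk_exposed = 0.0
    for p0, w in zip(baseline_risks, weights):
        exposed_odds = odds(p0) * conditional_or
        p1 = exposed_odds / (1.0 + exposed_odds)  # P(D | exposed) in stratum
        risk_unexposed += w * p0
        risk_exposed += w * p1
    return odds(risk_exposed) / odds(risk_unexposed)


# Non-null exposure effect: the crude odds ratio is pulled towards 1.
crude = marginal_odds_ratio([0.2, 0.8], 2.0, [0.5, 0.5])
print(round(crude, 3))  # 1.571

# Null exposure effect: omitting the covariate introduces no bias.
null_crude = marginal_odds_ratio([0.2, 0.8], 1.0, [0.5, 0.5])
print(round(null_crude, 3))  # 1.0
```

With these illustrative inputs the odds ratio is 2.0 in each stratum, yet the crude odds ratio is 11/7, about 1.57, even though the covariate is perfectly balanced between exposure groups.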
Simulation study
We simulated data consisting of disease status (D), a binary exposure (E) and a covariate (C), satisfying independence between exposure and covariate. The exposure and the covariate are associated with disease status through the following model:
The coefficients of E and C in the above model were chosen so as to provide a wide range of combinations between exposure and omitted covariate effects. For exposure, an odds ratio of 2.0 was considered throughout while
Results
In the absence of an exposure effect, omitting the covariate introduced no bias in the estimated odds ratio, a result that has also been shown analytically [2]. Moreover, under this scenario, the 95% confidence interval had the correct coverage probability and the size of the test was also correct: the empirical coverage rate ranged from 0.948 to 0.953 and the empirical Type I error from 0.045 to 0.052 (data not shown).
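A rough sketch of how such a null-scenario check might be set up is given below. It is not the authors' simulation code: the sample size (500), number of replicates (2000), covariate odds ratio (5.0), intercept (-1.0) and the function name `simulate_null_coverage` are all assumptions made for illustration, and the disease model is assumed to be logistic in E and C. Because the exposure is binary, the logistic fit that omits C reduces to the crude 2x2 odds ratio with the usual Woolf (Wald) standard error, so no regression library is needed.

```python
import numpy as np


def simulate_null_coverage(n=500, reps=2000, or_c=5.0,
                           intercept=-1.0, seed=0):
    """Empirical coverage and Type I error of the crude odds ratio
    when a balanced binary covariate is omitted and exposure has no effect.
    All default settings are illustrative assumptions."""
    rng = np.random.default_rng(seed)

    # Exposure and covariate are drawn independently (balanced covariate).
    e = rng.integers(0, 2, size=(reps, n))
    c = rng.integers(0, 2, size=(reps, n))

    # Assumed logistic model with a null exposure effect (log OR = 0).
    beta_e = 0.0
    linear = intercept + beta_e * e + np.log(or_c) * c
    p = 1.0 / (1.0 + np.exp(-linear))
    d = rng.random((reps, n)) < p

    # 2x2 cell counts per replicate, ignoring the covariate.
    n11 = np.sum(d & (e == 1), axis=1)    # diseased, exposed
    n10 = np.sum(~d & (e == 1), axis=1)   # not diseased, exposed
    n01 = np.sum(d & (e == 0), axis=1)    # diseased, unexposed
    n00 = np.sum(~d & (e == 0), axis=1)   # not diseased, unexposed

    # Crude log odds ratio and its Woolf standard error.
    log_or = np.log(n11 * n00 / (n10 * n01))
    se = np.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)

    # The 95% Wald interval covers the true log OR (0) exactly when
    # the two-sided 5% Wald test does not reject.
    covered = np.abs(log_or) <= 1.96 * se
    coverage = covered.mean()
    return coverage, 1.0 - coverage


coverage, type1_error = simulate_null_coverage()
print(f"coverage = {coverage:.3f}, type I error = {type1_error:.3f}")
```

Setting `beta_e` to a non-zero value and checking coverage of that value instead of 0 would turn the same sketch into a check of the non-null scenario.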
When the effect of the binary omitted
Example: sexual activity and longevity of male fruit flies
We give a real-life example using an interesting experimental data set that appeared previously in the literature [10], [11]. The design of the experiment has been described in detail elsewhere [10]. In short, 125 fruit flies were randomly divided into five groups of 25 to determine whether increased reproduction reduces the longevity of male flies. This effect is known to occur in female flies. Sexual activity of individual males was manipulated by providing each male in the first group with
Discussion
When the variable of interest, i.e., exposure/intervention, has a non-null effect on disease risk, the impact of an omitted but balanced covariate is to bias the odds ratio towards the null. This impact also extends to the corresponding 95% confidence interval, which shifts towards the null and has coverage probability below the nominal level. In addition, study power is reduced compared with the model that includes the covariate. The impact is more dramatic when the effect of the omitted
References (14)
- et al. A consequence of omitted covariates when estimating odds ratios. J Clin Epidemiol (1991)
- et al. Should we adjust for covariates in nonlinear regression analyses of randomized trials? Control Clin Trials (1998)
- et al. Biased estimates of treatment effect in randomized experiments with non-linear regression and omitted covariates. Biometrika (1984)
- Absence of confounding does not correspond to collapsibility of the rate ratio or rate difference. Epidemiology (1996)
- et al. Effect of omitted confounders on the analysis of correlated binary data. Biometrics (1997)
- Effects of mismodelling and mismeasuring explanatory variables on tests of their association with a response variable. Stat Med (1988)
- et al. Statistical analysis using Generalized Estimating Equations (GEE): an orientation. Am J Epidemiol (2003)
Cited by (19)
A clustered randomized controlled trial to assess whether Living Peace Intervention (LPint) reduces domestic violence and its consequences among families of targeted men in Eastern Democratic Republic of the Congo (DRC): Design and methods
2022, Evaluation and Program Planning
Citation Excerpt: These will be covariates with a strong association with outcome and those with a solid imbalance between treatment groups. A negligible impact will be defined as a less than 5% change in the regression coefficient for the LPint effect after stepwise removal of the covariates variable from the model (Negassa & Hanley, 2007). A difference of more than 5% in the adjusted effect estimate from the crude effect estimate will be considered confounding (Negassa & Hanley, 2007).
Overcoming underpowering: Trial simulations and a global rank end point to optimize clinical trials in children with heart disease
2020, American Heart Journal
Citation Excerpt: Trials in congenital heart disease are especially susceptible to dilution of the treatment effect size-to-noise ratio because there are wide heterogeneity in patient diagnoses and significant variability in outcomes depending on case complexity, preoperative risk factors, and center-level expertise. Prior simulation studies have retrospectively analyzed various RCTs and demonstrated the potential power gains associated with covariate adjustment [9,29-32]. Expert and regulatory consensus is that optimal covariate adjustment should be prespecified but ideally based upon the known prognostic value of various covariates [12,33].
Bayesian adaptive clinical trials of combination treatments
2017, Contemporary Clinical Trials Communications
Estimating adjusted NNTs in randomised controlled trials with binary outcomes: A simulation study
2010, Contemporary Clinical Trials
Citation Excerpt: In the RCT setting with balanced covariates it is useful in any case to average over the whole sample rather than over only the treated or the untreated patients because all patients are eligible for treatment. It is known that the consequence of adjusting for a balanced covariate in logistic regression is on one hand a loss of precision but on the other hand an increased efficiency in testing for a treatment effect, i.e., a higher study power [8–12]. The reason for the latter is that the downward bias induced by omitting the covariate is avoided.
On Sample-Size Calculations for Precise Contrast Analysis in ANCOVA
2019, Journal of Experimental Education