Use of genetic variation to separate the effects of early and later life adiposity on disease risk: mendelian randomisation study

Abstract
Objective To evaluate whether body size in early life has an independent effect on risk of disease in later life or whether its influence is mediated by body size in adulthood.
Design Two sample univariable and multivariable mendelian randomisation.
Setting The UK Biobank prospective cohort study and four large scale genome-wide association studies (GWAS) consortiums.
Participants 453 169 participants enrolled in UK Biobank and a combined total of more than 700 000 people from different GWAS consortiums.
Exposures Measured body mass index during adulthood (mean age 56.5) and self-reported perceived body size at age 10.
Main outcome measures Coronary artery disease, type 2 diabetes, breast cancer, and prostate cancer.
Results Having a larger genetically predicted body size in early life was associated with an increased odds of coronary artery disease (odds ratio 1.49 for each change in body size category unless stated otherwise, 95% confidence interval 1.33 to 1.68) and type 2 diabetes (2.32, 1.76 to 3.05) based on univariable mendelian randomisation analyses. However, little evidence was found of a direct effect (ie, not through adult body size) based on multivariable mendelian randomisation estimates (coronary artery disease: 1.02, 0.86 to 1.22; type 2 diabetes: 1.16, 0.74 to 1.82). In the multivariable mendelian randomisation analysis of breast cancer risk, strong evidence was found of a protective direct effect for larger body size in early life (0.59, 0.50 to 0.71), with less evidence of a direct effect of adult body size on this outcome (1.08, 0.93 to 1.27). Including age at menarche as an additional exposure provided weak evidence of a total causal effect (univariable mendelian randomisation odds ratio 0.98, 95% confidence interval 0.91 to 1.06) but strong evidence of a direct causal effect, independent of early life and adult body size (multivariable mendelian randomisation odds ratio 0.90, 0.85 to 0.95). No strong evidence was found of a causal effect of either early or later life measures on prostate cancer (early life body size odds ratio 1.06, 95% confidence interval 0.81 to 1.40; adult body size 0.87, 0.70 to 1.08).
Conclusions The findings suggest that the positive association between body size in childhood and risk of coronary artery disease and type 2 diabetes in adulthood can be attributed to individuals remaining large into later life. However, having a smaller body size during childhood might increase the risk of breast cancer regardless of body size in adulthood, with timing of puberty also putatively playing a role.

: An applied negative control example using age at menarche as an outcome
A directed acyclic graph illustrating a multivariable Mendelian randomization (MVMR) analysis to investigate the direct and indirect effects of early life body size on age at menarche. This analysis was undertaken as a negative control: early life body size can only influence timing of puberty directly, because the indirect path via adult body size is not biologically plausible when the outcome occurs at an earlier stage in the life course. As such, the direct effect of early life body size calculated in the MVMR analysis should be the same as the total effect derived from univariable estimates.

Supplementary Figure 3: Genetic correlation analysis
Using linkage disequilibrium (LD) score regression, we compared the genetic correlation between our two exposures (early life and adult body size) with measured adult body mass index (BMI) and childhood obesity from two consortia (GIANT and EGG).

Supplementary Figure 4: Boxplots for early life and adult body size in UK Biobank
Early life body size is categorised into three groups in the UK Biobank study (based on whether individuals considered themselves to be 'thinner', 'about average' or 'plumper'). Adult body mass index was measured as a continuous variable and normalised to have a mean of 0 and a standard deviation of 1. The categorical measurement of early life body size made it challenging to investigate the assumption of linearity between these exposures in our model. However, these boxplots do not provide evidence against a linear relationship, as there was an overall incremental trend across categories.
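As a minimal sketch of the check described in this caption, the snippet below standardises a continuous adult BMI variable and tests whether its median rises across the three early life categories. It assumes the conventional z-score standardisation (mean 0, standard deviation 1); the toy data and the function names `standardise` and `category_medians` are hypothetical and are not taken from the UK Biobank analysis.

```python
import numpy as np

def standardise(x):
    """Rescale a variable to mean 0 and standard deviation 1 (z-score)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def category_medians(values, categories):
    """Median of `values` within each category, in ascending category order."""
    values = np.asarray(values, dtype=float)
    categories = np.asarray(categories)
    return [float(np.median(values[categories == c])) for c in np.unique(categories)]

# Made-up illustration data: 0 = thinner, 1 = about average, 2 = plumper
early_life_category = np.array([0, 0, 1, 1, 1, 2, 2])
adult_bmi = np.array([22.0, 24.0, 25.0, 27.0, 26.0, 30.0, 33.0])

adult_bmi_z = standardise(adult_bmi)
medians = category_medians(adult_bmi_z, early_life_category)

# An incremental trend means each category median exceeds the previous one
is_incremental = all(a < b for a, b in zip(medians, medians[1:]))
```

A monotone rise in the medians is only a coarse check; with three categories it cannot distinguish a linear relationship from other increasing shapes.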

Supplementary Note 1: The Avon Longitudinal Study of Parents and Children (ALSPAC) cohort description
All children were genotyped using the Illumina HumanHap550 quad genome-wide SNP genotyping platform. ALSPAC mothers were genotyped using the Illumina human660Wquad array at Centre National de Génotypage (CNG). Genotypes were called with Illumina GenomeStudio. Samples were removed if individuals were related or of non-European genetic ancestry. Genetic variants were removed if they had >5% missingness or a Hardy-Weinberg equilibrium (HWE) P-value <1.0 × 10⁻⁶. Imputation was performed using Impute V2.2.2 against a reference panel from the Haplotype Reference Consortium (HRCr1.1, 2016) based on approximately 31,000 phased whole genomes. The HRC panel was phased using ShapeIt v2, and the imputation was performed using the Michigan imputation server. After imputation, we kept only variants with an imputation quality score ≥ 0.8 and minor allele frequency (MAF) > 0.01.
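The variant-level thresholds in this paragraph can be written as two simple filters. The following is an illustrative sketch only: the `Variant` record, the function names, and the example values are hypothetical, and applying both filters to a single record is a simplification (in practice the first applies to genotyped variants before imputation and the second to imputed variants).

```python
from dataclasses import dataclass

@dataclass
class Variant:
    name: str           # hypothetical identifier
    missingness: float  # fraction of samples without a genotype call
    hwe_p: float        # Hardy-Weinberg equilibrium P-value
    info: float         # imputation quality score
    maf: float          # minor allele frequency

def passes_genotype_qc(v: Variant) -> bool:
    """Remove variants with >5% missingness or HWE P-value < 1.0e-6."""
    return v.missingness <= 0.05 and v.hwe_p >= 1.0e-6

def passes_imputation_qc(v: Variant) -> bool:
    """Keep variants with imputation quality >= 0.8 and MAF > 0.01."""
    return v.info >= 0.8 and v.maf > 0.01

# Made-up variants, each failing at most one threshold
variants = [
    Variant("snp_a", 0.01, 0.40, 0.95, 0.20),
    Variant("snp_b", 0.08, 0.40, 0.95, 0.20),   # fails missingness
    Variant("snp_c", 0.01, 1e-9, 0.95, 0.20),   # fails HWE
    Variant("snp_d", 0.01, 0.40, 0.60, 0.20),   # fails imputation quality
    Variant("snp_e", 0.01, 0.40, 0.95, 0.005),  # fails MAF
]

kept = [v.name for v in variants
        if passes_genotype_qc(v) and passes_imputation_qc(v)]
```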
All BMI measurements were obtained at ALSPAC clinics. Height was measured to the nearest 0.1 cm with a Harpenden Stadiometer (Holtain Crosswell), and weight was measured to the nearest 0.1 kg on Tanita electronic scales. Body mass index (BMI) was calculated as weight (kg)/height (m)².
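For illustration, the BMI calculation with the unit conversion made explicit (height is recorded in centimetres but the formula uses metres). The function name and the example values are hypothetical.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100.0  # clinic height is recorded in cm
    return weight_kg / height_m ** 2

example_bmi = bmi(70.0, 175.0)  # made-up measurements
```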

Supplementary Note 2: Simulation for child and adult BMI estimates
Our measure of early life body size is based on recalled relative body size at age 10, reported by individuals much later in life. This measure is therefore likely to be subject to misclassification due to individuals misremembering their relative body size. Such recall bias will not affect our measure of later life adiposity, which is constructed from measures of height and weight taken at the UK Biobank clinic. We therefore conducted a simple simulation study to identify the likely effect of such misclassification, affecting only one exposure in a two-exposure model, on the estimated effects in our multivariable MR estimation.
We set up our model with two positively correlated exposures which each have a causal effect on an outcome. Each exposure is modelled to have a true continuous effect on the outcome but is only observed to take one of three categories (0/1/2) to reflect the setup of our body size data. For each exposure a set of 150 SNPs are available which predict the exposures. The model is estimated using a two-sample multivariable MR estimation where the SNP exposure associations are estimated in the same sample for both exposures and the SNP outcome association is estimated in a separate sample.
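The setup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code used for Supplementary Table 1: it assumes 150 non-overlapping SNPs per exposure (300 in total), an allele frequency of 0.3, uniform SNP effect sizes, tertile cut points for the three categories, a shared confounder to induce the positive correlation, and 10 000 individuals per sample. All function names are hypothetical. The scenario shown is the one where only exposure 2 affects the outcome.

```python
import numpy as np

def categorise(x):
    """Collapse a continuous exposure into tertile categories 0/1/2 (assumed cut points)."""
    cuts = np.quantile(x, [1 / 3, 2 / 3])
    return np.digitize(x, cuts)

def simulate_sample(n, gamma1, gamma2, beta1, beta2, rng):
    """Simulate genotypes, two observed categorical exposures, and a continuous outcome."""
    G = rng.binomial(2, 0.3, size=(n, gamma1.size)).astype(float)
    U = rng.normal(size=n)  # shared confounder, makes the exposures positively correlated
    X1 = G @ gamma1 + U + rng.normal(size=n)
    X2 = G @ gamma2 + U + rng.normal(size=n)
    Y = beta1 * X1 + beta2 * X2 + U + rng.normal(size=n)
    return G, categorise(X1), categorise(X2), Y

def snp_associations(G, trait):
    """Per-SNP simple regression slopes and approximate standard errors."""
    Gc = G - G.mean(axis=0)
    t = trait - trait.mean()
    ssg = (Gc ** 2).sum(axis=0)
    slope = Gc.T @ t / ssg
    # Each SNP explains little variance, so the residual variance is
    # approximated by the trait variance.
    se = np.sqrt(t.var() / ssg)
    return slope, se

def mvmr_ivw(bx1, bx2, by, se_by):
    """Inverse-variance weighted multivariable MR: regress SNP-outcome effects
    on both sets of SNP-exposure effects, with no intercept."""
    w = 1.0 / se_by
    A = np.column_stack([bx1, bx2]) * w[:, None]
    est, *_ = np.linalg.lstsq(A, by * w, rcond=None)
    return est

rng = np.random.default_rng(2020)
n_per_exposure = 150
zeros = np.zeros(n_per_exposure)
# Exposure 2 is more strongly predicted by its SNPs than exposure 1
gamma1 = np.concatenate([rng.uniform(0.05, 0.15, n_per_exposure), zeros])
gamma2 = np.concatenate([zeros, rng.uniform(0.10, 0.20, n_per_exposure)])

# Sample A gives the SNP-exposure associations, sample B the SNP-outcome associations
G_a, C1_a, C2_a, _ = simulate_sample(10_000, gamma1, gamma2, 0.0, 0.5, rng)
G_b, _, _, Y_b = simulate_sample(10_000, gamma1, gamma2, 0.0, 0.5, rng)

bx1, _ = snp_associations(G_a, C1_a)
bx2, _ = snp_associations(G_a, C2_a)
by, se_by = snp_associations(G_b, Y_b)
direct1, direct2 = mvmr_ivw(bx1, bx2, by, se_by)
```

Because the exposures enter the regression as 0/1/2 categories, `direct1` and `direct2` are on a per-category scale and are not directly comparable with the continuous-scale values `beta1` and `beta2`.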
Within this model we investigated five misclassification scenarios: (1) no misclassification in either exposure, (2) random misclassification of 15% of the data in exposure 1 (the equivalent of early life body size), (3) random misclassification of 30% of the data in exposure 1, (4) misclassification of 30% of the data in exposure 1 to the level observed for exposure 2 (the equivalent of adult body size) and (5) misclassification of 30% of the data in exposure 1 to one category lower than the true value. Scenario 4 represents a situation where individuals misremember their childhood body size as being the same (relative to others) as their current body size, and scenario 5 represents a misclassification where respondents remember their childhood body size as being lower than it was. For each misclassification scenario we considered three causal settings: where only exposure 2 has an effect on the outcome, where exposures 1 and 2 both have positive effects on the outcome, and where exposure 1 has a negative effect on the outcome and exposure 2 has a positive effect on the outcome. In all cases exposure 2 is also more strongly predicted by the set of SNPs than exposure 1.
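The five misclassification scenarios can be expressed as a single function acting on the observed categories. This is a hedged sketch: the function name `misclassify` is hypothetical, and two details are assumptions not stated in the text, namely that "random misclassification" draws a new category uniformly from 0/1/2 (so a drawn value may coincide with the true one) and that in scenario 5 individuals already in the lowest category stay there.

```python
import numpy as np

def misclassify(c1, c2, scenario, rng):
    """Return a copy of categorical exposure 1 (values 0/1/2) with
    scenario-specific misclassification applied; c2 is exposure 2."""
    out = c1.copy()
    if scenario == 1:            # (1) no misclassification
        return out
    if scenario not in (2, 3, 4, 5):
        raise ValueError("scenario must be an integer from 1 to 5")
    frac = 0.15 if scenario == 2 else 0.30
    n_affected = int(round(frac * c1.size))
    idx = rng.choice(c1.size, size=n_affected, replace=False)
    if scenario in (2, 3):       # (2), (3) random misclassification
        out[idx] = rng.integers(0, 3, size=idx.size)
    elif scenario == 4:          # (4) recalled as the level of exposure 2
        out[idx] = c2[idx]
    else:                        # (5) recalled one category lower, floor at 0
        out[idx] = np.maximum(c1[idx] - 1, 0)
    return out

# Made-up categorical exposures for illustration
rng = np.random.default_rng(1)
c1 = rng.integers(0, 3, size=1000)
c2 = rng.integers(0, 3, size=1000)

observed = {s: misclassify(c1, c2, s, rng) for s in (1, 2, 3, 4, 5)}
```

Each misclassified version of exposure 1 would then replace the true categories when estimating the SNP-exposure associations, leaving exposure 2 and the outcome unchanged.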
As the true effect of each exposure on the outcome is the effect of a change in the continuous variable underlying the observed categorical exposure, we have calculated the effect on the outcome we expect to observe in the data as the effect of moving from the mean of one category to the mean of the next.¹
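A small sketch of this calculation: the expected per-category effect is the true effect per unit of the underlying continuous exposure multiplied by the gap between the means of adjacent categories. The function name and the toy values are hypothetical.

```python
import numpy as np

def expected_category_effects(x, categories, beta):
    """Expected change in the outcome when moving from the mean of one
    category to the mean of the next, given a true per-unit effect `beta`
    of the underlying continuous exposure `x`."""
    means = np.array([x[categories == c].mean() for c in (0, 1, 2)])
    return beta * np.diff(means)

# Made-up continuous exposure and its observed categories
x = np.array([-1.0, -0.5, 0.0, 0.2, 1.0, 1.4])
categories = np.array([0, 0, 1, 1, 2, 2])

effects = expected_category_effects(x, categories, beta=0.5)
```

The function returns two values (category 0 to 1, and 1 to 2), which need not be equal when the categories are not evenly spaced on the continuous scale.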
The results from the simulations are given in Supplementary Table 1. These results show that there is some bias in the estimated effects, due to the classification of the exposure into categories. This bias moves the estimated effect of both exposures away from the null. Random misclassification in X1 or reclassification of X1 to a lower category group weakens the strength of the instruments for X1 and increases the bias in the estimated effect of X1, but does not affect the estimate for X2. However, the estimated effect of X2 is decreased when observations for X1 are reclassified to the level observed for X2 and there is a true causal effect of X1 on the outcome that is in the same direction as for