The Impact of Education on Myopia: A bidirectional Mendelian randomisation analysis in UK Biobank

Myopia, or short-sightedness, is one of the leading causes of visual disability in the world. The prevalence of myopia has risen steadily over recent decades, reaching epidemic levels in Southeast Asia. Observational studies have reported associations between educational attainment and myopia. Whether education causes myopia, myopic children are more intelligent, or another factor, like higher socioeconomic status, causes both is unclear since observational studies are prone to confounding and randomised trials of education are unethical. Using bidirectional Mendelian Randomisation, a form of instrumental variable (IV) analysis free from confounding, we show that every additional year in education leads to an increase in myopic refractive error, but that myopia does not lead to higher educational attainment. Our results suggest that current educational methods contribute to the global burden of myopia, and argue that educational policies and practices should take account of this to reduce future visual disability in the population.

variants were selected to use as our IVs because of their robust association with education and myopia, allowing us to construct strong instruments for making MR inferences. Thus, using the allele score for educational attainment as the IV, MR showed that every additional year spent in education resulted in a more myopic refractive error of -0.27 D/year (95% CI, -0.37 to -0.17, p=4.2e-8) (Table 2; Figure 3). The MR effect estimate was even greater in magnitude than the observational estimate (-0.27 vs. -0.18 D) suggesting that unmeasured confounders may have attenuated the latter relationship. Conversely, using the myopia allele score as the IV in MR provided no evidence that refractive error affected educational attainment (b IV = -0.008 years/D, 95% CI -0.041 to 0.025, p=0.625) (Table 2; Figure 3). With our sample size of N=69,798, we had 80% power to detect an effect of education on refractive error >0.26 D/yr. In the reciprocal direction, we had 80% power to detect an effect >0.048 yr/D (Extended Data Figure 2), suggesting that our study had sufficient power to detect an effect of myopia on education, if present.
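The logic of the IV estimate can be illustrated with a minimal Python sketch on simulated data. It is not the analysis code used in this study: the function names, the simulated effect sizes and the choice of the single-instrument ratio (Wald) estimator are assumptions made for illustration. With one instrument and no covariates, the ratio estimator coincides with two-stage least squares. The simulated confounder pulls the observational slope towards zero, while the IV estimate stays close to the simulated effect.

```python
import numpy as np


def ols_slope(x, y):
    """Observational estimate: slope from regressing y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def iv_ratio_estimate(score, exposure, outcome):
    """Ratio (Wald) IV estimate: cov(score, outcome) / cov(score, exposure)."""
    score = np.asarray(score, dtype=float)
    exposure = np.asarray(exposure, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    g_o = np.cov(score, outcome)[0, 1]   # instrument-outcome association
    g_e = np.cov(score, exposure)[0, 1]  # instrument-exposure association
    return g_o / g_e


# Simulated data only (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n = 100_000
true_effect = -0.27                      # simulated causal effect of exposure on outcome
score = rng.normal(size=n)               # stands in for an allele score
confounder = rng.normal(size=n)          # unmeasured confounder
exposure = 0.5 * score + confounder + rng.normal(size=n)
outcome = true_effect * exposure + 0.5 * confounder + rng.normal(size=n)

ols_estimate = ols_slope(exposure, outcome)
iv_estimate = iv_ratio_estimate(score, exposure, outcome)
print(f"OLS estimate: {ols_estimate:.3f}")
print(f"IV estimate:  {iv_estimate:.3f}")
```

Because the simulated score is independent of the confounder, only the IV estimate is protected from the confounding built into the simulation; in real data that independence is an assumption to be checked rather than a given.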
MR analyses are based on two pertinent assumptions: (i) the genetic instruments are only associated with the outcome via the exposure, and (ii) the genetic instruments are not associated with any confounders of the exposure-outcome relationship. The Durbin-Wu-Hausman (DWH) test for endogeneity is a method to check for endogenous variables in a regression model that would create bias, e.g. from omitted variables, measurement error or reverse causation 25,26. There was weak evidence that the IV estimate using the education allele score differed from the observational point estimate (DWH-p=0.059), with the IV estimate suggesting a larger negative association (Table 2). There was strong evidence that the IV estimate using the myopia allele score departed from the observational point estimate (DWH-p=4.05e-19) (Table 2).
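One common regression-based form of the DWH test adds the first-stage residuals to the outcome regression and tests whether their coefficient is zero. The Python sketch below follows that form on simulated data. The function names, the simulated values and the normal approximation used for the p-value are assumptions for illustration; the implementation behind the DWH p-values in Table 2 is not described here and may differ.

```python
import math

import numpy as np


def ols(design, y):
    """Return OLS coefficients and their standard errors."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef, np.sqrt(np.diag(cov))


def dwh_test(score, exposure, outcome):
    """Augmented-regression endogeneity test; returns (z statistic, two-sided p)."""
    score = np.asarray(score, dtype=float)
    exposure = np.asarray(exposure, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    ones = np.ones(score.shape[0])

    # First stage: regress the exposure on the instrument, keep the residuals.
    first_stage = np.column_stack([ones, score])
    gamma, _ = ols(first_stage, exposure)
    v_hat = exposure - first_stage @ gamma

    # Augmented regression: outcome on exposure plus first-stage residuals.
    augmented = np.column_stack([ones, exposure, v_hat])
    coef, se = ols(augmented, outcome)

    z = coef[2] / se[2]                          # test of residual coefficient = 0
    p = math.erfc(abs(z) / math.sqrt(2.0))       # two-sided normal approximation
    return z, p


# Simulated data with an unmeasured confounder (illustrative values only).
rng = np.random.default_rng(1)
n = 100_000
score = rng.normal(size=n)
confounder = rng.normal(size=n)
exposure = 0.5 * score + confounder + rng.normal(size=n)
outcome = -0.27 * exposure + 0.5 * confounder + rng.normal(size=n)

z_stat, p_value = dwh_test(score, exposure, outcome)
print(f"z = {z_stat:.2f}, p = {p_value:.3g}")
```

A small p-value indicates that the observational and IV estimates differ by more than sampling error would explain, which is the comparison reported as DWH-p above.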
The second core assumption of MR is that the IV is not associated with a confounder of the exposure-outcome association 20. There was evidence that the geographical co-ordinate northing (measured northward distance in UK) is a confounder of the education-refractive error relationship; northing was negatively associated with education (b = -1.61e-6, 95% CI, -1.78e-6 to -1.45e-6) and positively with refractive error (b = 1.16e-6, 95% CI 9.84e-7 to 1.33e-6). Northing was also associated with the education (p=6.8e-5) and myopia (p=6.1e-3) allele scores (Extended Data Table 2). Compared to standard regression, confounding bias plots suggest that inclusion of the northing variable in the IV analysis may result in a greater degree of bias for the education allele score (Figure 4A) but not for the myopia allele score (Figure 4B).
In contrast, the geographical easting coordinate was positively associated with education (b = 8.90e-7, 95% CI, 6.82e-7 to 1.10e-6) and negatively associated with refractive error (b = -1.03e-6, 95% CI, -1.25e-6 to -8.06e-7). It was weakly associated with the myopia allele score (p=0.01). However, there was no evidence to suggest a greater degree of bias in the IV analysis compared to a standard regression with the inclusion of the easting variable (Figure 4B). We identified one further confounding variable, population stratification principal component 9, which incurred a greater degree of bias in the IV regression compared to OLS regression. While these confounders (northing and population stratification principal component 9) may bias our MR effect estimates, we restricted our analyses to unrelated individuals of European ancestry to limit the likelihood of major bias from population stratification.
MR causal estimates were also calculated using the MR-Egger and Weighted Median methods (which partially relax the assumptions required for MR causal estimates to be valid) as well as alternative methods of integrating IV estimates across individual SNPs. All of these approaches yielded causal estimates indicating that increasing levels of education led to a more myopic refractive error (by -0.22 to -0.44 D/year; p<0.05 for all methods), while there was no evidence that a more myopic refractive error led to greater educational attainment (p>0.05 for all methods) (Extended Data Table 4). An advantage of MR-Egger is that it gives a valid test for causality even if invalid instruments are used, e.g. due to horizontal pleiotropy 27. The ubiquity of exposure to education in populations with available genotype data means that it is not possible to assess individuals who are completely free of our outcome, specifically education. With MR-Egger, a deviation of the intercept estimate from zero would suggest the existence of horizontal pleiotropy, i.e. where certain genetic variants affect the outcome via a different biological pathway from the exposure under investigation.
In practice, there was no evidence that the Egger intercept deviated from zero either for education causing refractive error (intercept=0.007, SE=0.006, p=0.22) or refractive error causing education (intercept=-0.002, SE=0.007, p=0.81), indicating that there was no evidence for horizontal pleiotropy. However, such bias cannot be ruled out definitively until we gain more knowledge of the mechanisms by which these genetic variants affect the traits described here.
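The MR-Egger intercept and slope can be illustrated as a weighted linear regression of per-variant gene-outcome (G-O) associations on gene-exposure (G-E) associations. The Python sketch below is an assumption-laden illustration, not the MRBase implementation used for Extended Data Table 4: the function name and the toy per-variant values are hypothetical, standard errors and p-values are omitted, and the weights are taken to be the inverse variances of the G-O associations.

```python
import numpy as np


def mr_egger(beta_ge, beta_go, se_go):
    """Return (intercept, slope) from an inverse-variance weighted Egger regression."""
    beta_ge = np.asarray(beta_ge, dtype=float)
    beta_go = np.asarray(beta_go, dtype=float)
    se_go = np.asarray(se_go, dtype=float)

    # Orient every variant so that its association with the exposure is positive.
    sign = np.where(beta_ge < 0, -1.0, 1.0)
    beta_ge = beta_ge * sign
    beta_go = beta_go * sign

    # Weighted least squares via row scaling: minimises sum((resid / se_go) ** 2).
    w = 1.0 / se_go
    design = np.column_stack([np.ones_like(beta_ge), beta_ge])
    coef, *_ = np.linalg.lstsq(design * w[:, None], beta_go * w, rcond=None)
    intercept, slope = coef
    return float(intercept), float(slope)


# Toy per-variant associations (hypothetical values, constructed without noise).
toy_beta_ge = np.array([0.01, 0.02, 0.03, 0.05])
toy_beta_go = 0.005 + (-0.3) * toy_beta_ge   # built-in intercept 0.005, slope -0.3
toy_se_go = np.array([0.010, 0.020, 0.015, 0.010])

egger_intercept, egger_slope = mr_egger(toy_beta_ge, toy_beta_go, toy_se_go)
print(f"intercept = {egger_intercept:.4f}, slope = {egger_slope:.4f}")
```

The slope is read as the causal estimate and the intercept as the average pleiotropic effect across variants, so an intercept near zero corresponds to the absence of detectable directional pleiotropy reported above.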
To ensure that the association between educational attainment and myopia was not an artefact of the non-normal distribution of the variable, age completed schooling, educational attainment was recoded using two alternative methods: (1) dichotomising the cohort into participants who finished their schooling when >16 years versus ≤16 years of age; and (2) excluding individuals who had attended college or university. Encoding education as a dichotomous trait (>16 years vs ≤16 years of age when completed schooling) produced the same pattern of causality as the continuous variable, age completed schooling; i.e. education had an effect on refractive error (b IV = -0.347 D/LOD(education), 95% CI -0.482 to -0.220) while refractive error did not have an effect on education (b IV = -0.0004 LOD(education)/D, 95% CI -0.028 to 0.028) (Extended Data Table 5).
When individuals who had attended university or college were excluded from the analyses, there was a similar point estimate of the effect of education on refractive error (b IV = -0.228 D/yr, 95% CI -0.479 to 0.018, p=0.066) with larger standard errors. This is attributable, in part, to the reduced sample size (N=45,535). Again, there was no evidence that refractive error had an effect on educational attainment (b IV = -0.004 yr/D, 95% CI -0.035 to 0.027, p=0.80) (Extended Data Table 5).
MR is a powerful approach for testing causal hypotheses in epidemiology 28,29. The large sample size and robustly-associated genetic instruments used here meant that causal effects could be estimated with high precision. Our findings are in agreement with the single previous study to address the causal relationship between education and myopia, a meta-analysis of 3 European-ancestry cohorts (N=5,649) using MR 30. Cuellar-Partida et al. 30 included 17,749 SNPs to construct their polygenic risk score as an IV for educational attainment. The authors reported that each year of educational attainment led to a more myopic refractive error of -0.46 D/year (p=1.0e-3). However, the study by Cuellar-Partida et al. 30 was underpowered, which may explain why results from 2 of the 3 cohorts they studied were non-significant. Furthermore, their methodology risked violating one of the key assumptions of MR; firstly because the thousands of SNPs used to create their IV may well have included pleiotropic variants with direct effects on both educational attainment and refractive error; and secondly, because there were likely to be SNPs associated with educational attainment that were in LD with refractive error variants. The much larger sample size in our study permitted the use of fewer than 100 variants as IVs for educational attainment and refractive error. Thus, we were able to mitigate the risk that LD between the major risk variants for the two traits explained the underlying associations between education and myopia. Crucially, our analyses provided strong evidence that this relationship arose from a causal effect of education on refractive error, and not via reverse causation or confounding by influences such as socioeconomic status.
Myopes, by definition, have better near vision than distance vision and require less accommodative effort for near work and study, and so myopia has been proposed as an educational advantage 31 . Despite the general perception that myopes are more studious than non-myopes, we found no evidence that refractive error affected educational attainment.

The study cohort in UK Biobank
We analysed cross-sectional data from the baseline assessment of the UK Biobank project 40. During the period 2006 to 2010, UK Biobank recruited 502,649 participants aged 37 to 73 years. Participants attended 1 of 22 assessment centres across the UK, at which they completed a touch-key questionnaire, had a face-to-face interview with a trained nurse, and underwent physical assessments. All participants completed sociodemographic questionnaires, which included questions on past educational or professional qualifications.
Towards the end of the UK Biobank recruitment exercise, a detailed ophthalmic assessment was introduced. Approximately 23% of participants underwent the ophthalmic assessment.
Participants who had withdrawn consent were excluded. A total of 69,798 participants had valid education, refractive error and genetic data available (Extended Data Figure 1).

(iii) The genotype data in UK Biobank
Participants were genotyped using one of two platforms: the Affymetrix UK BiLEVE Axiom array or the Affymetrix UK Biobank Axiom array. The genetic data underwent rigorous quality control procedures and were phased and imputed against a reference panel of Haplotype Reference Consortium (HRC), UK10K and 1000 Genomes Phase 3 haplotypes 41.
Due to an issue with the imputation of UK10K and 1000 Genomes variants, analysis was restricted to HRC variants only. Samples were excluded based on the following genotype-based criteria: non-European ancestry, relatedness, mismatch between genetic sex and self-reported gender, putative aneuploidy (variable 22019), outlying heterozygosity, and excessive missingness (variable 22027) 41.

(i) Ordinary Least Squares (OLS) observational analyses
Observational associations between refractive error and educational attainment were assessed using linear regression adjusted for sex and age. The regression was then repeated with adjustment for additional potentially confounding variables: Townsend deprivation index (TDI), birth weight, whether breastfed, and geographic coordinates of place of birth rounded to the nearest kilometre (northing and easting coordinates). These genetic variants are described in Extended Data Table 3.

(b) The generation of allele scores for Mendelian Randomisation
Multiple genetic variants were combined into a single weighted allele score for each trait. An allele score, compared to individual variants, has been shown to improve the coverage properties and reduce the bias of instrumental variable (IV) estimates 42.
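A weighted allele score of this kind can be sketched in Python as a weighted sum of effect-allele dosages. This is a minimal illustration under stated assumptions rather than the code used in this study: the function name, the toy dosages and the toy weights are hypothetical, and the source of the weights (for example, effect sizes from the original discovery studies) is assumed rather than taken from the text above.

```python
import numpy as np


def weighted_allele_score(dosages, weights):
    """Combine per-variant dosages into one score per individual.

    dosages: array of shape (n_individuals, n_variants), effect-allele
             counts or imputed dosages in the range 0 to 2.
    weights: array of shape (n_variants,), one weight per variant
             (assumed here to be externally estimated effect sizes).
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.ndim != 2 or weights.ndim != 1:
        raise ValueError("dosages must be 2-D and weights must be 1-D")
    if dosages.shape[1] != weights.shape[0]:
        raise ValueError("one weight is required per variant")
    return dosages @ weights


# Toy example: 3 individuals, 3 variants (hypothetical values).
toy_dosages = np.array([
    [0, 1, 2],
    [2, 2, 0],
    [1, 0, 1],
])
toy_weights = np.array([0.02, 0.05, 0.01])

toy_scores = weighted_allele_score(toy_dosages, toy_weights)
print(toy_scores)
```

Collapsing many variants into one score gives a single, stronger instrument, which is the property the coverage and bias results cited above refer to.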

(d) Sensitivity analyses for confounding, pleiotropy and artefact
MR analyses assume that covariates are randomly distributed with respect to genotype 20. A number of variables were found to confound the education-refractive error relationship and were associated with one or both instrumental variables (allele scores) (Extended Data Table 2). Confounding bias plots 46,47 were used to assess relative bias in the IV estimate compared to standard multivariate regression. Additionally, suspected confounding factors were included as covariates in supplementary analyses (Extended Data Table 5).
associations will be correlated. It was, therefore, necessary to run MR-Egger as a split sample analysis, whereby the sample was randomly split in half, then G-E and G-O associations were calculated separately in the independent groups (Extended Data
Finally, to ensure the association between educational attainment and myopia was not an artefact of the non-normal distribution of the variable age completed schooling, educational attainment was recoded using two alternative methods: (1) dichotomisation into age >16 years when completed schooling and age ≤16 years when completed schooling; and (2) excluding individuals who attended college or university. The results were compared with the original analyses using the continuous variable age completed schooling.

Code availability
Code used to run the analysis is available on GitHub:

Tables
Table 1. Observational association between educational attainment and refractive error.
Variables that confound the education-refractive error relationship which are also associated with an allele score are highlighted in green. (Extended_data_table_2.xlsx)

Extended Data Table 3. Genetic variants from Okbay et al and Pickrell et al used to construct the education and myopia allele scores in this study.
Association scores in the original study are shown in green; association scores in UK Biobank are shown in blue. (Extended_data_table_3.xlsx)

Extended Data Table 4. Causal estimates of education on refractive error and refractive error on education using methods implemented in MRBase and a split sample in UK Biobank.
Sheet 1 shows estimates using various methods from MRBase. Sheets 2 & 3 show per variant associations with education and refractive error in split samples A (blue) and B (green). (Extended_data_table_4.xlsx)