We read with great interest Rahi et al's editorial (1) accompanying our article on refractive errors and education. (2) We fully agree with these authors that visual disability in childhood is a major global challenge, disproportionately affecting children living in low-income areas. We do, however, have some disagreements with Rahi et al about the interpretation of our findings. The first concerns their reference to the "negative findings" of our trial, which is factually incorrect. The trial found (2) a statistically significant difference between groups for our pre-specified main outcome (endline mathematics score adjusted for baseline score, 0.11 SD [95% confidence interval 0.01 to 0.21, P = 0.03]) at the pre-specified p = 0.05 level. Conditions for interpreting this result were set before the trial began and uploaded to Current Controlled Trials as Trial# ISRCTN03252665, and are available for public inspection at http://www.isrctn.com.
Elsewhere in their editorial, Rahi et al assert that "Free glasses failed to improve educational performance by a meaningful amount." Unlike "positive" or "statistically significant" outcomes, which are well defined according to pre-specified criteria, the term "meaningful" is open to greater latitude of interpretation. In fact, our article offers several arguments as to why these results are meaningful, none of which is addressed by Rahi et al:
• The observed, statistically significant effect size is equivalent to approximately half a semester of additional learning, based on previous reports from similarly aged children indicating that the gain in learning between the 4th and 5th grades is approximately 0.4 SD. (3)
• The effect size associated with membership in the Free Glasses Group was greater than the effect on our main outcome of either parental education or family income, and was similar to that of residence in middle-income Shaanxi rather than much lower-income Gansu.
• The observed effect size in our trial was considerably larger than the mean recently reported for 60 health-related trials with educational outcomes conducted in primary schools in the developing world, including 22 on de-worming (mean 0.013 SD) and 38 on nutritional supplementation (mean 0.035 SD). (Improving learning in primary schools of developing countries: a meta-analysis of randomized experiments. http://academics.wellesley.edu/Economics/mcewan PDF/meta.pdf, accessed 17 January 2014)
• Most importantly, the effect size observed in our trial showed a strong dose-response relationship with increasing blackboard use in the children's classrooms, which adds biological plausibility given the greater demands the blackboard places on the distance vision of myopic children. The effect sizes among children whose classes used the blackboard for all or most teaching were 0.45 SD and 0.23 SD respectively, whereas there was no significant effect among children receiving little or no blackboard teaching.
Rahi et al's assertion that the effect size observed in our trial was not "meaningful" may relate to the fact that it was smaller than the pre-specified difference in our sample size calculation. We do not agree that this is a generally accepted criterion for defining the clinical significance of a result. Factors such as the practicality of recruiting a particular number of patients are taken into account alongside clinical meaningfulness when calculating sample sizes. In our protocol and pre-analysis plan, which are available online as above, we did not state that 0.2 SD was intended to be interpreted as a pre-specified "clinically important" difference; rather, it was what we estimated to be the minimum detectable difference under our a priori assumptions. Owing to a variety of factors, including larger than expected recruitment at each school, we were able to detect a smaller effect size than we originally estimated was possible with the number of schools we could practically afford to enroll. This approach to pre-specified effect size is consistent with the published literature, for example Wittes: (4)
"Designers of controlled trials facing the problem of how to specify that difference commonly use one of two approaches. Some people select the treatment effect deemed important to detect... The other frequently used method is to calculate the sample size according to the best guess concerning the true effect of treatment..."
Rahi et al offer some other important observations on our study methodology, which we address below:
• Including children with poor vision in only one eye, and those already wearing spectacles, reduced the chances of finding a significant educational effect from providing glasses: As noted in the original manuscript, (2) these criteria affected only 12.7% and 15% of children respectively, and excluding these groups did not change our results. We included children who already wore glasses, re-refracting them, because of published evidence of the poor quality of available glasses in rural China. (5)
• The analytic methods used were inappropriate for a cluster-randomized trial: There are different ways to take account of randomization by school rather than by child. These include subject-specific methods, such as mixed effects models, and population-average methods, such as generalized estimating equations, in which robust standard errors are calculated. (6) Both methods account for clustering within a school. We chose the latter approach, (2) as our focus was on population-averaged differences.
• The use of only a single mathematics test as the outcome measure capturing educational performance is problematic: In fact, testing of mathematics is often preferred in educational studies designed to assess school-based outcomes, as learning in this area is more dependent on classroom teaching, and less influenced by the home environment, than language, for example.
• Compliance with spectacles was low among children, and could not accurately be assessed by self-report: We observed a statistically significant impact of providing glasses on educational outcomes despite less than optimal compliance with spectacle wear, and expect that larger effect sizes may be possible with successful interventions to enhance wear. Moreover, our principal outcome for spectacle use was observed wear at an unannounced examination, not self-report, as described in our methods. (2)
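The logic of respecting school-level randomization can be shown with a toy example (this is not our published GEE analysis; the simulated data and all parameters below are invented). Summarizing each school by its mean before comparing arms gives a population-average effect whose standard error cannot be deflated by within-school correlation:

```python
import random
from statistics import mean, stdev

def school_level_effect(scores_by_school, treated):
    """Population-average treatment effect and a clustering-robust
    standard error: each school contributes a single mean, so the
    effective sample size is the number of schools, not children."""
    t = [mean(s) for s, g in zip(scores_by_school, treated) if g]
    c = [mean(s) for s, g in zip(scores_by_school, treated) if not g]
    effect = mean(t) - mean(c)
    se = (stdev(t) ** 2 / len(t) + stdev(c) ** 2 / len(c)) ** 0.5
    return effect, se

# Simulate 40 schools of 30 children, with a shared school-level
# random effect and a true treatment effect of 0.3 SD (all invented):
random.seed(1)
treated = [i < 20 for i in range(40)]
schools = []
for g in treated:
    u = random.gauss(0, 0.2)  # school effect, constant within a school
    schools.append([(0.3 if g else 0.0) + u + random.gauss(0, 1)
                    for _ in range(30)])

effect, se = school_level_effect(schools, treated)
```

A naive child-level standard error would ignore the shared school effect `u` and understate uncertainty; both mixed effects models and GEE with robust standard errors avoid this, as noted above.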
We appreciate the interest of Rahi et al in our article, and fully agree that further research is needed in the important area of the impact of children's visual acuity on education.
References
1. Rahi JS, Solebo AL, Cumberland PM. Uncorrected refractive error and education. BMJ 2014;349:g5991.
2. Ma X, Zhou Z, Yi H, Pang X, Shi Y, Chen Q et al. Effect of providing free glasses on children's educational outcomes in China: cluster randomized controlled trial. BMJ 2014;349:g5740.
3. Hill CJ, Bloom HS, Black R, Lipsey MW. Empirical benchmarks for interpreting effect sizes in research. MDRC working papers on research methodology. 2014. www.mdrc.org/publications/459/full.pdf.
4. Wittes J. Sample size calculations for randomized controlled trials. Epidemiol Rev 2002;24:39-53.
5. Zhang MZ, Lv H, Gao Y, Griffiths S, Sharma A, Lam DSC, et al. Visual morbidity due to inaccurate spectacles among school-children in rural China: the See Well to Learn Well Project, Report #1. Invest Ophthalmol Vis Sci 2009;50:2011-7.
6. Hu FB, Goldberg J, Hedeker D, Flay BR, Pentz MA. Comparison of population-averaged and subject-specific approaches for analyzing repeated binary outcomes. Am J Epidemiol 1998;147:694-703.
Competing interests:
No competing interests
27 January 2015
Nathan G Congdon
Physician
Xiaochen Ma, Zhou Zhongqiang, Hongmei Yi, Xiaopeng Pang, Yaojiang Shi, Mirjam E Meltzer, Saskia le Cessie, Mingguang He, Scott Rozelle, Yizhi Liu