Relation between biochemical severity and intelligence in early treated congenital hypothyroidism: a threshold effect

BMJ 1994; 309 doi: (Published 13 August 1994) Cite this as: BMJ 1994;309:440
  1. S L Tillotson,
  2. P W Fuggle,
  3. I Smith,
  4. A E Ades,
  5. D B Grant
  1. MRC Register for Children with Congenital Hypothyroidism, Medical Unit, Institute of Child Health, London WC1N 1EH
  2. Department of Epidemiology and Biostatistics, Institute of Child Health, London WC1H 1EH.
  1. Correspondence to: Dr Smith.
  • Accepted 8 June 1994


Objectives : To assess whether early treatment of congenital hypothyroidism fully prevents intellectual impairment.

Design : A national register of children with congenital hypothyroidism who were compared with unaffected children from the same school classes and matched for age, sex, social class, and first language.

Setting : First three years (1982-4) of a neonatal screening programme in England, Wales, and Northern Ireland.

Subjects : 361 children with congenital hypothyroidism given early treatment and 315 control children.

Main outcome measures : Intelligence quotient (IQ) measured at school entry at 5 years of age with the Wechsler preschool and primary scale of intelligence.

Results : There was a discontinuous relation between IQ and plasma thyroxine concentration at diagnosis, with a threshold at 42.8 nmol/l (95% confidence interval 35.2 to 47.1 nmol/l). Hypothyroid children with thyroxine values below 42.8 nmol/l had a mean IQ 10.3 points (6.9 to 13.7 points) lower than those with higher values and than controls. None of the measures of quality of treatment (age at start of treatment (range 1-173 days), average thyroxine dose (12-76 µg in the first year), average thyroxine concentration during treatment (79-234 nmol/l in the first year), and thyroxine concentration less than 103 nmol/l at least once during the first year) influenced IQ at age 5.

Conclusions : Despite early treatment in congenital hypothyroidism the disease severity has a threshold effect on brain development, probably determined prenatally. The 55% of infants with more severe disease continue to show clinically significant intellectual impairment; infants with milder disease show no such impairment. The findings predict that 10% of early treated infants with severe hypothyroidism, compared with around 40% of those who presented with symptoms in the period before screening began, are likely to require special education.

Clinical implications

  • In the United Kingdom neonatal screening detects around 170 infants with hypothyroidism annually

  • There is disagreement on whether early treatment with thyroxine fully protects infants against neurological damage

  • In this study a threshold effect of disease severity rather than quality of treatment was the main determinant of outcome, suggesting prenatal effects on development

  • 55% of infants who had more severe hypothyroidism had appreciable intellectual impairment at school entry, whereas infants with milder disease were unaffected

  • It is predicted that 10% of infants with severe disease given early treatment (compared with 40% of those treated later, before the introduction of screening) will show impairment of a degree likely to require special education - that is, four times the expected rate


Before the introduction of routine neonatal screening for congenital hypothyroidism in the mid-1970s1 it had been reported that treatment during the first few months of life was associated with better psychological outcome.2 This led to considerable optimism that early treatment would eradicate the intellectual impairment associated with the disorder. Though psychological progress in children detected by screening has generally been good,3-8 some studies have produced evidence of deficits in psychological performance.4-8 Opinions differ on whether these deficits relate most closely to the biochemical severity of hypothyroidism,4,6,8 bone age at diagnosis,4,5 age when treatment is started,9 or the quality of thyroxine replacement therapy.3,7,10

We examined psychological outcome in a cohort of 361 children with congenital hypothyroidism born between 1982 and 1984 and treated in the United Kingdom after the introduction of the national screening programme in 1982. We also examined the relation between outcome and different factors in diagnosis and treatment and reassessed the psychological benefits which have been achieved in the United Kingdom by early diagnosis and treatment.

Subjects and methods

Of the 1 972 590 infants born in England, Wales, and Northern Ireland (Scotland was excluded) between 1 January 1982 and 31 December 1984, 489 were identified as congenitally hypothyroid in neonatal screening tests.11 Of these, 472 had persistent hypothyroidism (thyroid stimulating hormone concentration increased above 10 mU/l), a prevalence of one in 4179. The Medical Research Council Register of Children with Congenital Hypothyroidism organised a prospective study of psychological outcome at 5 years of age.

Methods of data collection, demographic characteristics, diagnostic details, and early medical problems of infants on the register have been described.11,12

Of the 472 children on the register, 14 died before school age and 18 moved abroad or were otherwise lost to follow up before school entry. Twenty one families refused consent for psychological assessment, and in 13 children assessments were incomplete. Five children were excluded because they had additional handicaps (two Down's syndrome, one familial spastic quadriplegia, one mental retardation and myoclonic epilepsy, and one mental retardation and congenital cataracts). Of the 71 children excluded, 28 (39%) were reported to be using English as a second language; the remaining 29 similar children were therefore also excluded. A further 11 subjects with very mild hypothyroidism (normal plasma thyroxine concentration but persistently raised plasma thyroid stimulating hormone concentration) who did not start treatment until after the age of six months were also excluded. This left 361 subjects for study.
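The counts quoted so far can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative and not part of the study.

```python
# Cohort accounting, using only counts quoted in the text.
births = 1_972_590          # infants born 1982-4 in the screened regions
persistent = 472            # children with persistent hypothyroidism

births_per_case = births / persistent  # prevalence expressed as "one in N"

excluded_initial = {
    "died before school age": 14,
    "moved abroad or lost to follow up": 18,
    "consent refused": 21,
    "assessment incomplete": 13,
    "additional handicaps": 5,
}
n_excluded_initial = sum(excluded_initial.values())

remaining_second_language = 29  # other children using English as a second language
late_treated_mild = 11          # very mild disease, treated after six months

study_cohort = (
    persistent
    - n_excluded_initial
    - remaining_second_language
    - late_treated_mild
)

print(round(births_per_case))  # one case per this many births
print(n_excluded_initial)      # children excluded in the first step
print(study_cohort)            # children left for study
```

The three printed values correspond to the prevalence of one in 4179, the 71 initial exclusions, and the 361 study subjects reported above.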

Controls were chosen for the 361 subjects by asking the head teachers at schools attended by the hypothyroid child to identify two children of the same sex, age (within three months), language spoken at home (English or non-English), and social class defined by occupation of the family breadwinner (manual or non-manual).13,14 Permission was sought from the families of the first choice control for participation of their child. The families of the second control were approached if participation of the first control was refused. This process yielded a control for 315 children with hypothyroidism.

Thyroid hormone status at diagnosis and during treatment

Paediatricians did not follow a uniform treatment protocol, though recommendations on thyroxine dose were given in a newsletter distributed at the start of the project (25-50 µg/day or 8-10 µg/kg up to 6 months of age; 50-75 µg/day or 6-8 µg/kg from 6 to 12 months; 75-100 µg/day or 5-6 µg/kg from 1 to 5 years). In most subjects paired measurements of plasma thyroid stimulating hormone and thyroxine concentrations (or free thyroxine, which accounted for 27% of thyroxine measurements) were used to confirm the diagnosis and monitor treatment. A total of 1047 quantitative free thyroxine measurements were converted to equivalent thyroxine values.15 As thyroid stimulating hormone measurements were frequently expressed as laboratory chosen ranges with limits of “less than” and “greater than” rather than as quantitative values,16 plasma thyroid hormone concentrations were chosen as indices of biochemical severity at diagnosis and of biochemical control during treatment.

Biochemical severity of congenital hypothyroidism was assessed from the first quantitative thyroxine measurement after the positive screening test and before treatment (median age 17 days; range 0-114). Of the 361 children studied, 296 had a quantitative thyroxine value at diagnosis. A further 29 children had thyroxine values expressed as a range suitable for grouped analysis.

Quality of treatment was assessed by using (a) age at start of treatment, (b) estimates of average daily thyroxine sodium dose (µg), and (c) estimates of average plasma thyroxine concentration (nmol/l) during the first two years of treatment. With exclusion of the first 14 days of treatment (to avoid the expected low thyroxine concentrations at that time), averages for thyroxine dose and thyroxine concentration were calculated from the area under the curve from age at start of treatment plus 14 days to 6 months, 6-12 months, 12-24 months, and combinations of these averages. Averages for thyroxine concentration were calculated for subjects who had at least one quantitative thyroxine value measured within each period and for the 12-24 month period if they also had at least one value measured between the 3rd and 5th years of age. Children who had either one thyroxine value below 103 nmol/l or two thyroxine values below 103 nmol/l between 14 days after the start of treatment and the end of the first year were identified as undertreated, as defined previously (except that in that study thyroid stimulating hormone values were included).3,17
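An average derived from the area under the curve can be sketched as follows. The text does not say how values were joined between measurements, so linear interpolation between measurements and flat extrapolation beyond the first and last measurement are assumptions made here purely for illustration; the function name and the example values are hypothetical.

```python
def time_weighted_average(ages_days, values, start, end):
    """Average of a piecewise-linear curve over [start, end] (area / width).

    Assumption: values are linearly interpolated between measurements and
    held constant before the first and after the last measurement.
    """
    if len(ages_days) != len(values) or not ages_days:
        raise ValueError("need matching, non-empty ages and values")
    if end <= start:
        raise ValueError("end must be after start")

    points = sorted(zip(ages_days, values))

    def interpolate(t):
        if t <= points[0][0]:
            return points[0][1]
        if t >= points[-1][0]:
            return points[-1][1]
        for (t0, v0), (t1, v1) in zip(points, points[1:]):
            if t0 <= t <= t1:
                return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

    # Knots: window edges plus every measurement strictly inside the window.
    knots = [start] + [t for t, _ in points if start < t < end] + [end]
    area = 0.0
    for a, b in zip(knots, knots[1:]):
        # Trapezoid rule on each sub-interval.
        area += (interpolate(a) + interpolate(b)) / 2 * (b - a)
    return area / (end - start)


# Hypothetical child: treatment started on day 17, so the window opens on
# day 31 (start plus 14 days) and closes at about 6 months (day 183).
example_average = time_weighted_average(
    ages_days=[30, 60, 120, 180],
    values=[90.0, 150.0, 170.0, 160.0],  # made-up thyroxine values, nmol/l
    start=31,
    end=183,
)
print(example_average)
```

Dividing the area by the width of the window gives a time-weighted mean, so closely spaced measurements do not count for more than widely spaced ones.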

Analyses of treatment quality - A total of 224 children had thyroxine values measured both before and during treatment which fulfilled the above criteria up to 1 year of age, and 218 of these also had suitable data up to 2 years. The medians (range) for numbers of thyroxine values per child were three (one to nine) for start of treatment to 6 months of age, two (one to five) for 6-12 months of age, and two (one to eight) for 12-24 months. One hundred and eighty one of these children had both thyroxine values measured at diagnosis and thyroxine averages up to one year of age. Each sample had similar medians and ranges for age at start of treatment, thyroxine concentration at diagnosis, average dose and thyroxine concentration over the various time periods, and had similar proportions in non-manual and manual or unemployed social classes (see below).

Psychological assessment and social class definition

Each hypothyroid child and his or her control was assessed individually at school by the school psychologist using the Wechsler preschool and primary scale of intelligence 1966. Details of parents’ occupation were recorded by the psychologist at the same time. For analysis, subjects whose parents were unemployed were merged with the manual group (to which most would have belonged had they been employed). Subsequent review of breadwinners’ occupations as reported to the psychologist by parents showed that 91 pairs of hypothyroid and control subjects were mismatched for social class. Overall, however, the social class distribution of controls and hypothyroid children was very similar (non-manual 40.3% and 41.8% respectively) and was also similar to that of the general population of 5 year olds in the United Kingdom in 1989 (43.2%).14

Statistical analysis

Full scale intelligence quotient (IQ) scores were plotted and tabulated by plasma thyroxine concentration at diagnosis, social class, age at start of treatment, average thyroxine dose, and average thyroxine concentrations. Undertreated children (thyroxine concentration less than 103 nmol/l either once or twice in the first year) were compared with the rest of the cohort.

Data analysis focused on the form of the relation between severity of hypothyroidism and IQ. Polynomial models (linear, quadratic, and cubic) were fitted to the raw data, taking account of social class. In addition, two non-linear models which captured realistic alternative hypotheses concerning the relation of thyroxine concentration and IQ were considered. Firstly, IQ could increase with thyroxine concentration but not beyond a certain level (negative exponential). Alternatively, there could be a threshold in diagnostic thyroxine concentration above which IQ is normal and below which it is impaired (logistic). Specialised methods are required to compare these models18,19 (see appendix). The logistic model emerged as the most likely description of the data. Residuals from this model were used as the dependent variable in further regression analyses in order to assess the relation between IQ and quality of treatment after accounting for thyroxine concentration at diagnosis and social class. Average dose, average thyroxine concentration, and undertreatment were examined separately, allowing for age at start of treatment in each analysis.

Results


IQ in relation to disease severity and social class

Table I shows mean IQ in relation to plasma thyroxine concentration at diagnosis and social class. When the secular rise in IQ is taken into account the expected mean IQ in United Kingdom children assessed between 1987 and 1989 was around 112.20,21 In this series, however, the overall mean for the hypothyroid children was 106.4. Higher initial thyroxine values and non-manual social class were closely associated with higher IQ, and these associations were highly significant (table I). The rise in mean IQ with increasing thyroxine values at diagnosis was noticeably non-linear, children with initial thyroxine concentrations of roughly 40 nmol/l or below having, on average, lower IQs than those with values above 40 nmol/l. In the latter group the mean IQ was close to the expected population mean for the test being used (Wechsler preschool and primary scale of intelligence). Below and above 40 nmol/l there was little evidence of a downward or upward trend in IQ with lower or higher thyroxine concentrations.

Table I

IQ (Wechsler preschool and primary scale of intelligence) at 5 years of age in relation to plasma thyroxine concentration at diagnosis and social class

Table II

Mean IQ by social class and plasma thyroxine concentration at diagnosis

Form of relation between IQ and disease severity, and comparison with controls

The best fitting logistic model emerging from the statistical analysis using the raw thyroxine values (see appendix) is illustrated in the figure. The figure also shows the mean IQs grouped by thyroxine concentration at diagnosis as in table I. The inflexion point of the curve was at thyroxine concentration 42.8 nmol/l (95% confidence interval 35.2 to 47.1 nmol/l), with a gradient at the inflexion point of 0.76 (that is, a rise of 7.6 IQ points per 10 nmol/l increase in thyroxine). The 95% confidence intervals for the gradient suggested that the data were consistent with a range of gradients between 0.14 at one extreme and a perfect dichotomy of the data, a step function, at the other. The vertical distance between the lower and upper asymptotes was 10.3 IQ points (6.9 to 13.7 points).


Fitted logistic (L) model for IQ and thyroxine concentration at diagnosis allowing for social class (SC=1 for non-manual): $\mathrm{IQ}_L = 98.1 + 8.6\,\mathrm{SC} + 10.3\,\{1/(1+\exp(-(-0.76\,(42.79-\mathrm{thyroxine}))))\}$. 95% confidence intervals are given for the inflexion point (A) and for the points 5% (B) and 95% (C) of the distance between the lower and upper asymptotes. Mean IQs for grouped thyroxine values are also shown (same categories as in table II but split by social class).

The curve fitting results were consistent with an intuitive interpretation of the dichotomy in the diagnostic thyroxine values presented in table I and gave strong support to the logistic model against the negative exponential alternative. Further checks supported this view. Firstly, logistic models were fitted for each social class separately, giving essentially the same results with the inflexion points at 41.4 nmol/l (non-manual) and 45.0 nmol/l (manual) compared with 42.8 nmol/l in the combined data set. Secondly, the lower “corner” of the logistic curve, taken to be the point 5% of the distance between the lower and upper asymptotes, was at thyroxine concentration 39 nmol/l (17 to 46 nmol/l). This shows that the data would be inconsistent with logistic curves having lower asymptotes outside the range of the data and offers powerful evidence that there is a limit beyond which further decreases in thyroxine concentration at diagnosis produce no additional decrement in IQ. The model also predicts with 95% confidence that children with a thyroxine concentration above 47 nmol/l at diagnosis are unlikely to suffer impairment of more than five IQ points.
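The fitted curve and its lower "corner" can be explored numerically. This is a minimal sketch that assumes the coefficient 0.76 acts inside the exponent on the distance from the inflexion point, with the sign chosen so that predicted IQ rises with thyroxine; under that assumption the 5% point comes out close to the 39 nmol/l quoted above. The function and constant names are illustrative only.

```python
import math

# Parameter values as quoted in the text and figure caption.
LOWER_ASYMPTOTE = 98.1     # manual social class, lowest thyroxine values
SOCIAL_CLASS_EFFECT = 8.6  # added when SC = 1 (non-manual)
HEIGHT = 10.3              # distance between lower and upper asymptotes
INFLEXION = 42.79          # nmol/l
RATE = 0.76                # assumed coefficient inside the exponent, per nmol/l


def predicted_iq(thyroxine, non_manual):
    """Predicted IQ from the logistic curve under the stated assumptions."""
    sc = 1 if non_manual else 0
    rise = 1.0 / (1.0 + math.exp(RATE * (INFLEXION - thyroxine)))
    return LOWER_ASYMPTOTE + SOCIAL_CLASS_EFFECT * sc + HEIGHT * rise


def thyroxine_at_fraction(fraction):
    """Thyroxine value at which the curve has covered `fraction` of its rise."""
    return INFLEXION - math.log((1 - fraction) / fraction) / RATE


for t in (20, 40, 42.79, 45, 60):
    print(t, round(predicted_iq(t, non_manual=False), 1))

print(round(thyroxine_at_fraction(0.05), 1))  # lower "corner" of the curve
print(round(thyroxine_at_fraction(0.95), 1))  # upper "corner" of the curve
```

Because the logistic curve is symmetric, the 5% and 95% points lie at equal distances on either side of the inflexion point.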

The mean IQ of children with thyroxine values above the threshold of 42.8 nmol/l was close to that in controls of similar social class (table II). In both social classes children with thyroxine values below the threshold showed an average deficit of 11-12 points compared with an IQ difference based on the logistic curves of 10.3 points. Overall the control children had a mean IQ of 113.2, very close to the expected population mean of 112.21 The deficits in mean IQ in the children with more severe hypothyroidism were accounted for by consistent deficits of between 0.8 and 1.5 points in the means of the scaled scores of every subtest of the Wechsler preschool and primary scale of intelligence, both verbal and performance.

Effect of treatment on IQ

By contrast with thyroxine values at diagnosis (table I) there was no obvious trend in IQ with age at start of treatment (table III). Nor were there any associations with average dose of thyroxine or average thyroxine concentration in the first year of life. Similarly, negative results were obtained with average dose and average thyroxine concentration up to 6 months of age, from 12 to 24 months of age, and for the first 24 months. The lack of a relation between IQ and the treatment variables persisted in the regression analyses using the residuals from the logistic model as the dependent variable both before and after the group was split at the inflexion point defined.

Table III

IQ (Wechsler preschool and primary scale of intelligence) at 5 years of age in relation to age at start of treatment and average thyroxine dose and average plasma thyroxine concentration from start of treatment to 1 year of age

Outcome was also examined in children who had low thyroxine values (below 103 nmol/l) from 14 days after the start of treatment to the end of the first year. Of 334 children in whom thyroxine concentration was measured at least once, 88 (26.3%) had at least one thyroxine value below 103 nmol/l. Of 282 children with two or more thyroxine measurements, 27 (9.6%) had at least two values below 103 nmol/l. In both groups the mean IQ (103.6 and 99.6 respectively) was lower than the mean (107.0) in children who did not have low thyroxine values. Using the residuals from the logistic regression as the dependent variable in a regression analysis and allowing for age at treatment, we found, however, no significant independent association between IQ and having either one or two low thyroxine values.

Discussion


Our study shows that in children with congenital hypothyroidism born in the United Kingdom and given early treatment there was a sharp threshold in intellectual outcome that divided them into two distinct groups - those with plasma thyroxine concentrations of less than 42.8 nmol/l at diagnosis, who showed a global deficit in mean IQ of 10 points, and those with less severe congenital hypothyroidism, who showed no deficit. A reduction in mean IQ of 10 points increases from 2% to 10% the proportion of children with IQs more than two standard deviations below the population norm and results in a fourfold increase in those requiring remedial help and special education. Several other studies have observed a strong association between intellectual outcome after early treatment and the biochemical severity of hypothyroidism at diagnosis.4,6,8 None, however, has examined the shape of this relation, and most have assumed a linear relation between severity and IQ. Our analyses suggest that only about 30% of the variance due to diagnostic thyroxine concentration can be captured by a linear relation.
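The effect of a 10 point shift in the mean on the lower tail can be illustrated with a normal distribution. The sketch assumes normally distributed IQ scores with a standard deviation of 15 points, neither of which is stated in the text, and treats the cutoff as two standard deviations below the unshifted mean.

```python
from statistics import NormalDist

SD = 15.0        # assumed standard deviation of the IQ scale
CUTOFF_Z = -2.0  # two standard deviations below the population mean
SHIFT = 10.0     # reduction in mean IQ reported for the severe group

# Work on a scale centred on the population mean, so only the shift matters.
reference = NormalDist(mu=0.0, sigma=SD)
shifted = NormalDist(mu=-SHIFT, sigma=SD)
cutoff = CUTOFF_Z * SD

baseline_fraction = reference.cdf(cutoff)  # share below the cutoff, no deficit
shifted_fraction = shifted.cdf(cutoff)     # share below the cutoff, mean 10 lower
ratio = shifted_fraction / baseline_fraction

print(f"{baseline_fraction:.1%} -> {shifted_fraction:.1%} (ratio {ratio:.1f})")
```

Under these assumptions the tail grows from roughly 2% to roughly 9%, in line with the rounded 2% and 10% and the fourfold increase quoted above.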

Together with the association between IQ and bone age at diagnosis recorded by several groups,4,5,8,22,23 our results indicate that even after early diagnosis and treatment, severity has a prenatal influence on brain development in congenital hypothyroidism. Triiodothyronine seems to be the active hormone with respect to neurological development in the fetus and is synthesised in the brain from thyroxine transported from fetal plasma across the blood-brain barrier.24 Despite good evidence that maternal thyroxine contributes substantially to fetal thyroxine in the later weeks of pregnancy,25 our data suggest that maternal transfer is insufficient for fetal requirements if hypothyroidism is severe.26 Thyroxine concentrations of less than 42.8 nmol/l at diagnosis at around 17 days of age, or low concentrations during a break in treatment at around 1 year of age as reported from Finland,8 could be markers for a critical level of thyroxine deprivation in the fetus leading to permanent neurological deficit.

The New England Congenital Hypothyroidism Collaborative's study and some other studies have found either normal mean IQs,3,17,27,28 an apparent absence of a correlation with disease severity,3,17 or lower IQs in subjects with low thyroxine and high thyroid stimulating hormone concentrations during treatment3,7,17 and have attributed intellectual impairment to either delayed treatment or undertreatment.17,26 Though we too found that children with low thyroxine values (<103 nmol/l, equivalent to 8 µg/dl as defined in a previous study3,17) during treatment tended to have a lower mean IQ, we could find no independent association between low thyroxine concentration and outcome after allowing for the effect on IQ of social class and severity. Nor could we show any effect of our other measures of quality of treatment. In the New England study3 average plasma thyroxine values (mean 141 (SD 26) nmol/l) during the first year of treatment were lower than in the United Kingdom (mean 154 (29) nmol/l), so the different intellectual outcomes in the two studies were unlikely to be due to a general effect of undertreatment in the United Kingdom. However, the New England study included raised thyroid stimulating hormone concentrations as well as low thyroxine values in its definition of undertreatment,3 and we cannot exclude the possibility that undertreatment has some effect on outcome.

What benefit screening and early treatment?

Our findings question the benefit of screening and early treatment. In a population survey in south east England before the introduction of screening, Hulse reported results with the Wechsler intelligence scale for children in 99 subjects who had presented with symptoms of thyroid deficiency beginning in early life, including developmental delay or mental handicap.29 Their mean IQ was 79.5 compared with an expected population mean at the time of 103-104. Introduction of screening increased the frequency of reported congenital hypothyroidism in the United Kingdom by 45%, from one in 7600 to one in 4179, probably owing to identifying children with less severe disease who might later have presented with juvenile hypothyroidism. It is noteworthy that 45% of children in our study also had milder disease, with thyroxine values of 42.8 nmol/l or higher. An appropriate comparison for assessing the benefits of screening might therefore be between those in our study with more severe disease (55% of our sample in whom the reduction in mean IQ was 10 points) and the patients in Hulse's survey. These data suggest an improvement in mean IQ in severely hypothyroid children of about 14 points, or a reduction in the proportion of children with IQs more than two standard deviations below the population mean from 40% to 10%. However, deliberate selection of handicapped subjects in Hulse's survey means that the benefit could be smaller.

We conclude that in the United Kingdom as in several other countries early treatment of congenital hypothyroidism has probably reduced but failed to eliminate neurological impairment. Though the mechanism and timing of the impairment remain uncertain, the close association between biochemical severity and outcome favours an important prenatal influence of hypothyroidism, and one with a sharp threshold.

We thank the parents and children who participated in the project, the school staff for their contribution, and Ms Shelley Tokar for help in organising the school assessments and managing the administration of the project. We also thank the paediatricians, laboratory staff, and the local educational psychologists for so generously supplying information to the register. The project was funded by the MRC and the Department of Health.

Members of the MRC steering committee who guided the project from its inception in 1981 to completion in 1992 were Professor A Davis (chairman), Dr G M Addison, Professor J A Aynsley-Green, Dr N Barnes, Dr D J Carson, Professor Dame Barbara Clayton, Ms Caroline Dore, Professor M A Ferguson-Smith, Dr P W Fuggle, Professor P J Graham, Dr D B Grant, Dr J A Hulse, Dr I A F Lister Cheese, Professor J G Ratcliffe, Dr D McI Scott, Dr Isabel Smith, and Professor O H Wolff.

Appendix


Two alternative hypotheses concerning the relation between IQ and diagnostic thyroxine concentration were captured by (a) the negative exponential (NE) model, which rises to an asymptote, and (b) the logistic (L) model, which is a symmetric sigmoid shape (see figure): (a) $\mathrm{IQ}_{NE} = \beta_1 + \beta_2\,\mathrm{SC} + \beta_3\,(1 - \exp(-\beta_4\,\mathrm{thyroxine}))$, and (b) $\mathrm{IQ}_L = \beta_1 + \beta_2\,\mathrm{SC} + \beta_3\,\{1/(1+\exp(-\beta_4\,(\beta_5-\mathrm{thyroxine})))\}$, where SC = social class. These models are not hierarchically nested, and specialised methods are used to compare them.18,19 Briefly, this entails estimating the value of $\alpha$ in an equation that combines the predicted IQs from the negative exponential and logistic models: (c) $\mathrm{IQ} = (1-\alpha)\,\mathrm{IQ}_{NE} + \alpha\,\mathrm{IQ}_L$. If $\alpha$ is near unity we would conclude that IQ is best predicted by the logistic model, whereas values near zero would favour the negative exponential model. The C* statistic19 was used to test the hypotheses $\alpha=0$ and $\alpha=1$. Confidence intervals for model parameters and for functions of parameters based on likelihood profiles were calculated numerically.
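The idea behind equation (c) can be sketched in a few lines. This is a simplified illustration rather than the C* procedure used in the analysis: it takes the predictions of the two models as given and estimates the mixing weight by ordinary least squares. The data are synthetic, the negative exponential parameters are invented for the example, and the logistic rate parameter is entered with a negative sign (an assumption) so that the curve rises with thyroxine.

```python
import math


def iq_negative_exponential(thyroxine, sc, b1, b2, b3, b4):
    """Model (a): rises towards an asymptote."""
    return b1 + b2 * sc + b3 * (1.0 - math.exp(-b4 * thyroxine))


def iq_logistic(thyroxine, sc, b1, b2, b3, b4, b5):
    """Model (b): symmetric sigmoid with inflexion at b5."""
    return b1 + b2 * sc + b3 / (1.0 + math.exp(-b4 * (b5 - thyroxine)))


def estimate_alpha(observed, pred_ne, pred_l):
    """Least-squares alpha in IQ = (1 - alpha) * IQ_NE + alpha * IQ_L."""
    num = sum((y - ne) * (l - ne) for y, ne, l in zip(observed, pred_ne, pred_l))
    den = sum((l - ne) ** 2 for ne, l in zip(pred_ne, pred_l))
    if den == 0:
        raise ValueError("the two models give identical predictions")
    return num / den


# Synthetic illustration: noiseless "observations" generated by the logistic model.
thyroxine_values = [5, 20, 35, 50, 80, 120]
logistic_params = (98.1, 8.6, 10.3, -0.76, 42.79)  # b4 sign is an assumption
ne_params = (98.0, 8.6, 11.0, 0.03)                # hypothetical values

pred_l = [iq_logistic(t, 0, *logistic_params) for t in thyroxine_values]
pred_ne = [iq_negative_exponential(t, 0, *ne_params) for t in thyroxine_values]

alpha = estimate_alpha(pred_l, pred_ne, pred_l)
print(alpha)
```

With data generated by the logistic model the estimate sits at unity; data generated by the negative exponential model would drive it to zero, which is the contrast the alpha parameter is designed to expose.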

Table IV

Regression model fitting for pretreatment plasma thyroxine concentration and IQ, accounting for social class. Residual sum of squares for polynomial and non-linear models

Goodness of fit for polynomial and non-linear models (fitted to raw not grouped data) relating IQ and diagnostic thyroxine concentration is shown in table AI. For polynomials to show a significant improvement (P<0.05) due to each additional parameter a reduction of about 800 in residual sum of squares is required. The quadratic model was significantly better than the linear model and the cubic model significantly better than the quadratic model, suggesting an S shaped relation between thyroxine concentration and IQ.

Of the non-linear models, the logistic model fitted better than the negative exponential model and better than the cubic model. Note that the linear model reduces the residual sum of squares by 2100, only 29% of the reduction achieved by the logistic curve. By using methods for comparing non-nested regression models (equation (c)), the estimate of alpha was 1.086, suggesting that the logistic model was a far better fit than the negative exponential model. The hypothesis that the data were perfectly fitted by a logistic (sigmoid) curve and not at all by the negative exponential model could not be rejected (C*=0.75; P=0.45), whereas the hypothesis that the data were entirely negative exponential and not at all logistic could be rejected with confidence (C*=3.12; P=0.0014).

