Perceived age as clinically useful biomarker of ageing: cohort study
BMJ 2009; 339 doi: http://dx.doi.org/10.1136/bmj.b5262 (Published 14 December 2009) Cite this as: BMJ 2009;339:b5262
- Kaare Christensen, professor1,
- Mikael Thinggaard, mathematician1,
- Matt McGue, professor12,
- Helle Rexbye, research fellow1,
- Jacob v B Hjelmborg, associate professor1,
- Abraham Aviv, professor3,
- David Gunn, postdoctoral scientist4,
- Frans van der Ouderaa, vice president corporate research4, director of business development6,
- James W Vaupel, professor5
- 1Danish Twin Registry and Danish Aging Research Center, Institute of Public Health, University of Southern Denmark, DK-5000 Odense C, Denmark
- 2Department of Psychology, University of Minnesota, Minneapolis, MN, USA
- 3Center of Human Development and Aging, University of Medicine and Dentistry of New Jersey, New Jersey Medical School, Newark, NJ, USA
- 4Unilever Discover, Colworth House, Sharnbrook, Bedfordshire
- 5Max Planck Institute for Demographic Research, Rostock, Germany
- 6Netherlands Consortium for Healthy Ageing, Leiden University Medical Centre Leiden, LUMC, 2300RC Leiden, Netherlands
- Correspondence to: K Christensen
- Accepted 15 November 2009
Objective To determine whether perceived age correlates with survival and important age related phenotypes.
Design Follow-up study, with survival of twins determined up to January 2008, by which time 675 (37%) had died.
Setting Population based twin cohort in Denmark.
Participants 20 nurses, 10 young men, and 11 older women (assessors); 1826 twins aged ≥70.
Main outcome measures Assessors: perceived age of twins from photographs. Twins: physical and cognitive tests and molecular biomarker of ageing (leucocyte telomere length).
Results For all three groups of assessors, perceived age was significantly associated with survival, even after adjustment for chronological age, sex, and rearing environment. Perceived age was still significantly associated with survival after further adjustment for physical and cognitive functioning. The likelihood that the older looking twin of the pair died first increased with increasing discordance in perceived age within the twin pair—that is, the bigger the difference in perceived age within the pair, the more likely that the older looking twin died first. Twin analyses suggested that common genetic factors influence both perceived age and survival. Perceived age, controlled for chronological age and sex, also correlated significantly with physical and cognitive functioning as well as with leucocyte telomere length.
Conclusion Perceived age—which is widely used by clinicians as a general indication of a patient’s health—is a robust biomarker of ageing that predicts survival among those aged ≥70 and correlates with important functional and molecular ageing phenotypes.
Perceived age—usually the estimated age of a person—is an integral part of assessment of patients. Some 30 years ago, in the Baltimore longitudinal study, Borkan and Norris validated the use of several objectively measured physical parameters as markers of biological age against perceived age and mortality.1
The Longitudinal Study of Aging Danish Twins (LSADT) found that perceived age, rated from facial photographs, predicted short term mortality of participants.2 Perceived age was influenced negatively by exposure to sun, smoking, and low body mass index (BMI) and positively by high social status, low depression score, and being married, though the strength of the associations varied by sex.3 That initial study used nurses to assess age and consisted of a two year follow-up of mortality. As nurses are experienced in recognising severe disease or imminent death, the utility of perceived age ratings might be limited to medical staff.
We looked at age as perceived by geriatric nurses (who, because of their profession, should be “experts” in evaluating the appearance of older people), older women (who are “peers” of similar age to the subjects and hence from their own experience could also be “experts”), and young male student teachers (who were expected to be the worst assessors). We also examined whether perceived age was correlated with physical and cognitive function and leucocyte telomere length in the older twins and whether the age, sex, and background of the assessors affected the results. We systematically investigated whether perceived age is a robust biomarker of ageing as indicated by its widespread though informal use in clinical practice.
The Longitudinal Study of Aging Danish Twins (LSADT) follows a population based cohort of same sex twins aged ≥70.4 The study began in 1995, with assessments every two years up to 2005. In 2001 the study had a participation rate of 85%, and 91% of the participants with normal cognition, aged 70-99, consented to have their face photographed. A total of 1826 twins (840 men and 986 women) had a high quality colour photograph taken with a digital camera at a distance of 0.6 m (passport type photographs) (fig 1). Not all the participants were photographed with a neutral facial expression, but this has been shown not to have an impact on the assessment of perceived age.5
We used three groups of assessors: 20 female geriatric nurses aged 25-46, 10 male student teachers aged 22-37, and 11 older women aged 70-87. They assessed the 387 same sexed twin pairs who comprised the study population of the previously reported two year follow-up.2 The assessors did not know the age range of the twins, and each twin of a pair had their age assessed on different days. The assessments were done via presentation of the photographs on a computer screen, and the pictures were shown in a predetermined but random sequence to avoid bias in the age assessments induced by preceding images.
In addition, 10 of the 20 nurses rated all 1826 pictures in the survey,3 that is, the 387 twin pairs and all the “single twins” (twins whose co-twin was either dead or did not want to participate). Data from one male assessor were excluded because his responses were extreme outliers. Seventy twins had one missing rating, five had two missing ratings, and one had seven missing ratings, giving a total of 87 (0.2%) out of more than 41 000. The analyses were performed with and without pictures with missing ratings with virtually identical results. The mean of the age estimates for each twin was used as the twin’s perceived age in each of the three assessor groups.
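The averaging step can be illustrated with a minimal Python sketch. The ratings are hypothetical, and taking the mean of whatever ratings are available for a photograph is an assumption made here for illustration; the function and variable names are not from the study.

```python
from statistics import mean
from typing import Optional

def perceived_age(ratings: list[Optional[float]]) -> float:
    """Mean of the available age estimates for one twin.

    None marks a missing rating (assumption: missing ratings are skipped).
    """
    available = [r for r in ratings if r is not None]
    if not available:
        raise ValueError("no ratings available for this twin")
    return mean(available)

# Hypothetical age estimates (years) for one photograph, by assessor group.
ratings_by_group = {
    "nurses": [78.0, 80.0, None, 79.0],
    "students": [75.0, 77.0, 76.0],
    "older_women": [82.0, 80.0, 81.0],
}

# One perceived age per assessor group, as in the three-group design.
perceived = {group: perceived_age(r) for group, r in ratings_by_group.items()}
print(perceived)
```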
We used the Danish Civil Registration System, which registers the date of death or emigration of all Danish people,6 to follow each participant from the date in spring 2001 when their photograph was taken through to 31 January 2008.
The assessment of physical functioning was based on an adaptation of an instrument developed and previously validated in Denmark7 that has been administered essentially unchanged throughout the LSADT. We used the strength scale, which consists of 11 items (such as walking up two flights of stairs) with responses on a scale of 1 to 4 (1=cannot do, 2=can do with aid or major difficulties, 3=can do with fatigue or minor difficulties, 4=can do without aid or difficulty). The strength scale values used were the average of the 11 individual item responses. The scale has high internal consistency reliability (>0.90) and is relatively stable (two year stability coefficient ≥0.60).7
Grip strength was assessed in a standardised way, as previously described,8 with a Smedley dynamometer (TTM, Tokyo, Japan). To measure maximal strength, the width of the handle was adjusted to fit the hand size with the second phalanx resting against the inner stirrup. Grip strength is influenced by the position of the elbow; strength is greater with a fully extended elbow. We required the elbow to be held at 90° and the upper arm to be tight against the trunk in a series of three measurements, with brief pauses between each, and subsequently used the maximum value as the estimate. We identified the maximum value of three measurements with each hand. Grip strength discriminates functioning in all adult age groups, predicts incident disability, and is highly correlated with muscular power in other muscular groups. It is easily and reliably measured and it correlates with function in activities of daily living and survival among the oldest old.8
The cognitive battery included the mini-mental state examination (MMSE)9 and five brief cognitive tests selected to be sensitive to age related changes.10 The MMSE is a standard neurological screening test that is especially effective for screening at the lower end of cognitive functioning. In our sample, the MMSE was both internally consistent (0.75) and temporally stable (two year stability coefficient 0.64). The brief cognitive tests included a verbal fluency task (the number of animals named in one minute), forward and backward digit recall, and immediate and delayed recall of a 12 item list. The five individual cognitive measures were temporally stable (two year stability coefficients range from 0.40 to 0.52) and positively correlated, the latter justifying the formation of a cognitive composite of the five tests (internal consistency reliability estimate 0.75; two year stability coefficient 0.60).10
Leucocyte telomere length
For 282 of the participants, we had access to a full blood sample drawn in 1997-8. We used two enzyme digests (Hinf I/Rsa I and Hph I/Mnl I) to generate the terminal restriction fragments (TRF), the mean length of which was used as an indicator of leucocyte telomere length.11 12 Telomere length is an indicator of the number of historic cell replications and the replicative potential of cells. Shorter length is associated with a host of diseases related to ageing and lifestyle factors and has been shown to be associated with mortality.12
The reliability analysis of perceived age was conducted with analysis of variance. Survival analyses with Cox’s proportional hazards models were used to study the association between perceived age and survival since the date of the photograph. The hazard ratio was initially estimated with sex and perceived age as independent covariates, then with sex and chronological age as independent covariates, and finally with all three covariates to assess the effect of perceived age given a person’s chronological age. We also performed a series of similar analyses including the other biomarkers of ageing (that is, the physical and cognitive functioning measures). The proportional hazards assumption underlying the Cox model was tested with the Schoenfeld residual test. Intrapair analyses were performed by calculating the proportion of times the co-twin who looked oldest died first. The proportion was compared with the null hypothesis of equality (50%/50%), by using the standard binomial test.
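As an illustration of the intrapair comparison against the 50%/50% null hypothesis, the following Python sketch computes an exact two sided binomial P value. It uses one common definition of the two sided test (summing the probabilities of all outcomes no more likely than the observed one); whether the original analysis used exactly this definition is an assumption, and the counts are hypothetical.

```python
from math import comb

def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two sided binomial P value.

    Sums the probabilities of every outcome that is no more likely
    than the observed count k under success probability p.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # A small relative tolerance guards against floating point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-7))
    return min(1.0, total)

# Hypothetical example: in 100 pairs, the older looking twin died first in 60.
p_value = binomial_two_sided_p(60, 100)
print(f"P = {p_value:.4f}")
```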
We analysed trends with a variance weighted least square regression and used correlation analyses, adjusting for sex and chronological age, to study the association between perceived age and the ageing phenotypes (physical and cognitive functioning as well as telomere length).
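One way to adjust a correlation for two covariates is a second order partial correlation. The Python sketch below is an assumption about how such an adjustment could be computed, not a description of the exact procedure used in the study; all data values and names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_first_order(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

def partial_corr_two_controls(x, y, z1, z2):
    """Correlation of x and y adjusted for z1 and z2 (recursive formula)."""
    r_z = pearson(z1, z2)
    # Step 1: remove z1 from each pairwise correlation.
    r_xy_1 = partial_first_order(pearson(x, y), pearson(x, z1), pearson(y, z1))
    r_x2_1 = partial_first_order(pearson(x, z2), pearson(x, z1), r_z)
    r_y2_1 = partial_first_order(pearson(y, z2), pearson(y, z1), r_z)
    # Step 2: remove z2 from the z1-adjusted correlations.
    return partial_first_order(r_xy_1, r_x2_1, r_y2_1)

# Hypothetical data for six participants.
age = [70, 72, 75, 78, 80, 85]          # chronological age (years)
sex = [0, 1, 0, 1, 0, 1]                # 0 = man, 1 = woman
perceived = [68, 75, 74, 80, 78, 90]    # perceived age (years)
grip = [38, 22, 35, 20, 30, 15]         # grip strength (kg)

r_adjusted = partial_corr_two_controls(perceived, grip, age, sex)
print(f"adjusted correlation = {r_adjusted:.2f}")
```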
Statistical analysis was done with Stata 9.2. As the data partly pertain to twin pairs, a correlation between twins in a pair would lead to an underestimation of standard errors if we used the traditional procedure for obtaining these. Therefore, our regression analyses used the “cluster” option in Stata with a unique pair number as the cluster variable.
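The effect of clustering on standard errors can be illustrated for the simplest estimator, a sample mean. The Python sketch below is not the Stata routine; it is a minimal cluster robust (sandwich) calculation with a G/(G-1) small sample factor, which is an assumption made here, applied to hypothetical perceived ages in four twin pairs.

```python
from collections import defaultdict
from math import sqrt

def naive_se_of_mean(values):
    """Standard error of the mean assuming independent observations."""
    n = len(values)
    m = sum(values) / n
    variance = sum((v - m) ** 2 for v in values) / (n - 1)
    return sqrt(variance / n)

def cluster_robust_se_of_mean(values, cluster_ids):
    """Standard error of the mean allowing correlation within clusters."""
    n = len(values)
    m = sum(values) / n
    # Sum the residuals within each cluster (here, each twin pair).
    residual_sums = defaultdict(float)
    for v, c in zip(values, cluster_ids):
        residual_sums[c] += v - m
    g = len(residual_sums)
    meat = sum(s**2 for s in residual_sums.values())
    return sqrt(g / (g - 1) * meat) / n

# Hypothetical perceived ages; twins in a pair resemble each other.
values = [70, 71, 80, 81, 75, 76, 85, 84]
pair_id = [1, 1, 2, 2, 3, 3, 4, 4]

naive = naive_se_of_mean(values)
robust = cluster_robust_se_of_mean(values, pair_id)
print(f"naive SE = {naive:.2f}, cluster robust SE = {robust:.2f}")
```

With positively correlated twins the cluster robust standard error exceeds the naive one, which is the underestimation described above. If every observation is treated as its own cluster, the two estimates coincide.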
All three assessor groups rated 387 twin pairs (774 twins, 352 men and 422 women), corresponding to 175 monozygotic and 212 dizygotic twin pairs. Analysis of variance showed that the perceived age data from all three assessor groups had high reliability (0.82-0.94).
Tables 1 and 2 provide the characteristics for age and sex specific thirds of perceived age for the full sample of 1826 pictures rated by the 10 nurses. The pattern is consistent across all six age and sex strata, with higher mortality and poorer functioning in the higher perceived age thirds.
Table 3 shows comparisons of data, including the correlation between perceived age and the other biomarkers of ageing, generated by all three assessor groups. Given the consistency of ratings across the three groups of assessors, only ratings generated by 10 female nurses were used in the total sample of the 1826 participants.
The mean of the perceived ages was close to the mean of the chronological ages; within one year in all rater groups except the older assessors, who overestimated the ages by an average of 1.7 years. The mean chronological age was about two years higher in the total sample of 1826 twins than among the 387 twin pairs (774 individuals) as the total sample also included twins whose co-twins had died, and these tended to be older than the twins with living co-twins. The correlation between perceived age and chronological age was highest in the total sample (0.52), while it was lowest (0.22), but still highly significant, using the data from the older female assessors.
Table 3 shows the results of the correlation between perceived age and the physical and cognitive functioning ageing phenotypes and leucocyte telomere length, adjusted for chronological age and sex. Perceived age was significantly correlated with all functional phenotypes across all assessor groups and in both the twin pair sample and the total sample. In addition, all tested indices of ageing were associated with increased perceived age in the expected direction; for example, decreased physical and cognitive functioning was associated with increased perceived age. Leucocyte telomere length, measured with two types of enzyme digests, showed the same pattern of association with perceived age. The association of perceived age with leucocyte telomere length generated by the Hph I/Mnl I digest was significant only for the assessments made by the nurses of the total sample. However, the correlation was significant for all assessor groups when Hinf I/Rsa I was used to generate the data on leucocyte telomere length.
At the end of the seven year follow-up period, among the 387 twin pairs (352 men and 422 women), 225 (29%) had died, 116 (33%) men and 109 (26%) women. In the total sample of 1826 twins, 348 (41%) men and 327 (33%) women had died. Table 4 shows the hazard ratios adjusted for sex, perceived age, and chronological age. As expected from Danish demographic data, the hazard ratio for chronological age was 1.11-1.13 in the bivariate analyses, corresponding to an 11-13% increase in mortality risk per year, and mortality was 30-40% lower in women than in men. For all the assessor groups, perceived age was highly and significantly associated with mortality in the bivariate analyses and remained so after adjustment for chronological age.
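The arithmetic linking a hazard ratio to a percentage change in mortality risk can be sketched in Python. The compounding over several years assumes a log-linear effect of age under proportional hazards; the hazard ratio of 0.65 for women is an illustrative value within the 30-40% range stated above, not a reported estimate.

```python
def percent_change(hazard_ratio: float) -> float:
    """Percentage change in hazard implied by a hazard ratio."""
    return (hazard_ratio - 1.0) * 100.0

def compounded_hazard_ratio(hr_per_year: float, years: float) -> float:
    """Hazard ratio for an age difference of several years,
    assuming a log-linear age effect (hazard ratios multiply)."""
    return hr_per_year**years

# Hazard ratios of 1.11 and 1.13 per year of chronological age.
print(f"{percent_change(1.11):.0f}% and {percent_change(1.13):.0f}% per year")

# Illustrative hazard ratio for women compared with men.
print(f"{percent_change(0.65):.0f}% for women compared with men")

# Five years of age difference at 1.11 per year.
print(f"{compounded_hazard_ratio(1.11, 5):.2f}")
```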
The effect size for perceived age was the same as or larger than chronological age, both in the univariate and the bivariate analyses. This large effect of perceived age might be caused, in part, by regression to the mean (perceived age was underestimated for the oldest people and overestimated for the youngest).2 When we scaled chronological and perceived age to the same mean and standard deviation, however, the effect of chronological age and perceived age was of nearly identical size for the nurses and the young male assessors whereas the perceived age effect was smaller for the older assessors but still highly significant (data not shown).
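Putting two age measures on the same mean and standard deviation can be sketched as a linear rescaling. The Python example below uses hypothetical ages in which perceived age has a narrower spread than chronological age, as would be expected with regression to the mean; the function name and the use of the sample standard deviation are assumptions.

```python
from statistics import mean, stdev

def rescale(values, target_mean, target_sd):
    """Linearly rescale values to a chosen mean and standard deviation."""
    m, s = mean(values), stdev(values)
    return [target_mean + target_sd * (v - m) / s for v in values]

# Hypothetical ages (years) for five participants.
chronological = [70.0, 74.0, 79.0, 85.0, 92.0]
perceived = [72.0, 75.0, 78.0, 81.0, 84.0]   # compressed towards the mean

# Give perceived age the same mean and SD as chronological age, so that
# hazard ratios per year are comparable between the two measures.
perceived_scaled = rescale(perceived, mean(chronological), stdev(chronological))
print([round(v, 1) for v in perceived_scaled])
```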
Perceived age was still significantly associated with survival after adjustment for other biomarkers of ageing. As expected from the correlations in table 3, however, the effect size in the full sample was attenuated from a hazard ratio of 1.08 (1.05 to 1.10) (table 4) when we adjusted for chronological age and sex to 1.05 (1.03 to 1.07) when we added MMSE and grip strength and finally to 1.03 (1.01 to 1.06) when we also included strength score and cognitive score in the model together with all the previous covariates.
The proportional hazards assumption was generally not violated in the applied models with the 387 same sex twin pairs, although there was a tendency for the association between perceived age and mortality to decline after more than five years of follow-up. To investigate the time dependency further, we carried out a subanalysis excluding participants who died within two (three) years of their photograph being taken. We found the same pattern as in the overall twin pair sample, and the age adjusted association between perceived age and mortality was still significant for all the assessor groups (data not shown). The proportional hazards assumption was not violated in the total sample with the 1826 twins, and the association between perceived age and mortality did not decline after five years of follow-up. This indicates that perceived age is not just predictive of short term mortality.
As of 31 January 2008, there were 179 pairs (78 monozygotic and 101 dizygotic) in which at least one twin had died. Figures 2 and 3 show the analyses within pairs stratified by zygosity, which show significant differences. Figure 2 shows that for the dizygotic twin pairs the likelihood that the older looking twin of the pair died first increased markedly with increasing difference in perceived age within the pair; that is, the bigger the difference in perceived age within the pair, the more likely it was that the older looking twin died first. There was a significant increasing linear trend for all assessor groups (P=0.001 for nurse assessors, P=0.021 for student assessors, P=0.03 for older assessors) and all assessors combined (P=0.02). Figure 3 shows that there was no such association for monozygotic twins.
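The within pair tabulation can be sketched in Python: for each pair in which at least one twin has died, identify the older looking twin, then compute the proportion of pairs in which that twin died first, grouped by the size of the difference in perceived age. The pair records and the cut points for the groups are hypothetical; the cut points used in the study are not stated here.

```python
def older_looking_died_first(pairs, bin_edges):
    """Proportion of pairs in which the older looking twin died first,
    by bins of the absolute within pair difference in perceived age.

    pairs: tuples (perceived_a, perceived_b, first_dead), first_dead 'a' or 'b'.
    bin_edges: increasing cut points; bin i covers [edge_i, edge_i+1).
    Returns one proportion per bin (None for an empty bin).
    """
    counts = [[0, 0] for _ in range(len(bin_edges) - 1)]  # [hits, total]
    for perceived_a, perceived_b, first_dead in pairs:
        diff = abs(perceived_a - perceived_b)
        if diff == 0:
            continue  # neither twin looks older
        older = "a" if perceived_a > perceived_b else "b"
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= diff < bin_edges[i + 1]:
                counts[i][1] += 1
                counts[i][0] += int(older == first_dead)
                break
    return [hits / total if total else None for hits, total in counts]

# Hypothetical pairs: perceived ages of twins a and b, and who died first.
pairs = [
    (75, 76, "a"),
    (75, 76, "b"),
    (70, 74, "b"),
    (80, 75, "a"),
    (72, 78, "a"),
]
# Hypothetical bins: difference under 2 years, and 2 years or more.
proportions = older_looking_died_first(pairs, [0, 2, float("inf")])
print(proportions)
```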
Perceived age predicts survival among people aged ≥70, even after adjustment for chronological age, sex, and other readily measurable biomarkers of ageing. Perceived age also correlates with age related phenotypes such as physical and cognitive functioning and leucocyte telomere length. Clinicians use perceived age as part of their assessment of patients, but research on the validity of the approach has been sparse.1 13 14 We have shown that perceived age based on facial photographs is a robust biomarker of ageing that does not depend on the sex, age, and professional background of the assessors.
In our analysis, the comparison within pairs of dizygotic twins controlled for rearing environment and, on average, half the genetic factor variants present in a population, while the comparison within pairs of monozygotic twins controlled for all genetic factors and rearing environment. We found indication of common genetic factors influencing both perceived age and survival because controlling for genetic factors (the comparison within monozygotic pairs) removed the association between perceived age and survival (fig 3). This was in contrast with the results for the overall twin sample and for the dizygotic twins, where comparison within pairs showed a clear “dose response” association between perceived age and survival (fig 2). Hence, the comparison within pairs suggests that there are genetic factors influencing both survival and perceived age (for example, genetic factors that influence the condition of cardiovascular tissue could affect the risk of myocardial infarction as well as the appearance of skin). Full details of this study design can be found elsewhere.15
Many candidates have been proposed as biomarkers of ageing—that is, measurements that correlate well with a wide range of age sensitive traits in multiple domains, after statistical adjustment for the effects of chronological age, and predict survival. So far, however, no biomarker of ageing has been able to challenge chronological age as the best predictor of future survival. Our study shows that in a group of people aged ≥70, perceived age is a strong predictor of mortality after adjustment for chronological age. Whether this is true in other settings and in younger age groups is unknown, but we anticipate that the effect might be even more pronounced in middle age because there will be no immediate ceiling effect in the assessors’ age estimates.2 We could not examine whether the assessment of perceived age is sensitive to the ethnic and cultural background of the assessors. Nearly all the cohort members were white and we used white raters. Cross cultural or ethnic rating might be more difficult than rating within a culture.
Strengths and weaknesses of study
This was a population based study with a high participation rate. The photographs were taken as part of a survey in which the interviewers visited the participants in their homes, which ensured the inclusion of frailer participants. The photographs, however, were not as standardised as they would have been in a clinical setting because of the various conditions in the participants’ residences. The high reliability values in the analyses of variance and the consistent results across assessor groups indicate that perceived age was reliable. Also, leucocyte telomere length, which is currently among the most promising molecular biomarkers of ageing,12 16 17 was significantly correlated with perceived age, despite a three to four year time gap between the DNA sample collection for this assay and the photography, suggesting that the correlation would be even stronger had the blood samples and photographs been obtained at the same time.
Specific health hazards, such as smoking and sun exposure, and low socioeconomic status and high depression score are associated with perceived age.3 This further strengthens the notion that perceived age is a biomarker of ageing because general biomarkers of ageing should be able to reflect and summarise various major common health hazards.
When assessing health, physicians traditionally compare perceived and chronological age, and for adult patients the expression “looking old for your age” is an indicator of poor health. Our study indicates that this practice, which has existed for decades if not centuries,14 is actually a useful clinical approach especially given that in a clinical setting perceived age is based on an array of indicators in addition to facial appearance. It is unclear why there has been so little research on the reliability and validity of perceived age given that it is so widely used and formally registered in patients’ notes in many settings. One reason could be that perceived age is considered too diffuse and subjective a measure to be of scientific interest. A parallel could be self rated health (measured as very good, good, average, poor, very poor), which was also formerly considered an unreliable variable until papers in the 1980s showed that self rated health obtained in surveys was a strong predictor of short and long term survival—even among seemingly healthy individuals.18 Since then self rated health has been standard in most surveys and in clinical studies. If perceived age can be confirmed as a robust biomarker of ageing in other settings, it could be an important addition to surveys and clinical studies, albeit logistically not as simple as self rated health.
Clinicians are constantly confronted with new technological tools and tests of uncertain clinical importance; for example, genome-wide association studies of common diseases have identified many polymorphisms associated with small increases in risk.19 20 21 Apolipoprotein E genotype and FOXO3A, however, are the only common genetic variants that have consistently been shown to be associated with longevity, and their effect size is modest.22 23 A basic clinical tool such as perceived age is a useful biomarker of ageing, and facial photographs are currently likely to be more informative with regard to survival of older people than a DNA sample.
What is already known on this topic
Clinicians use perceived age as a general indicator of a patient’s health, though its clinical utility has not been systematically investigated
What this paper adds
In a group of 1826 Danish twins aged ≥70, perceived age predicted survival, even after adjustment for chronological age, sex, and rearing environment
Perceived age adjusted for chronological age and sex correlated with physical and cognitive functioning as well as leucocyte telomere length
The results were not sensitive to the age, sex, and professional background of the assessors
We thank Unilever for supplying the composite images, which were generated by Sharon Catt with software from the Perception Laboratory, University of St Andrews.
Contributors: KC and JWV initiated the study, obtained funding, supervised the analyses, and KC was mainly responsible for writing the report and is guarantor. HR, AA, DG, and FvdO helped to develop the protocol, assisted in the analysis of the results, and helped to write the report. MT, MM, and JH assisted in the protocol design and the data analyses, and helped to write the report.
Funding: This study was funded by Unilever. The Longitudinal Study of Aging Danish Twins (LSADT) received grants from the US National Institutes of Health (grant No NIA P01 AG008761). The Danish Aging Research Center is supported by a grant from the VELUX foundation. No funders had any role in the study design, analysis, or writing of this paper.
Competing interests: None declared.
Ethical approval: The study was approved by the regional scientific ethical committee in Denmark (Case No VF20040241).
Data sharing: No additional data available.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.