Randomised comparison of three methods of administering a screening questionnaire to elderly people: findings from the MRC trial of the assessment and management of older people in the community
BMJ 2001; 323 doi: https://doi.org/10.1136/bmj.323.7326.1403 (Published 15 December 2001) Cite this as: BMJ 2001;323:1403
- Liam Smeeth, clinical lecturer in epidemiology (a),
- Astrid E Fletcher, professor of epidemiology and ageing (a),
- Susan Stirling, research fellow (a),
- Maria Nunes, research officer (b),
- Elizabeth Breeze, lecturer (a),
- Edmond Ng, research fellow (a),
- Christopher J Bulpitt, professor of geriatric medicine (b),
- Dee Jones, principal research fellow (c)
- a Centre for Ageing and Public Health, London School of Hygiene and Tropical Medicine, London WC1E 7HT
- b Section of Care of the Elderly, Imperial College, Faculty of Medicine, Hammersmith Campus, London W12 0NN
- c Research Team for Care of Elderly People, University Department of Geriatric Medicine, University of Wales College of Medicine, Cardiff CF64 2XX
- Correspondence to: L Smeeth
- Accepted 27 September 2001
Objective: To compare three different methods of administering a brief screening questionnaire to elderly people: post, interview by lay interviewer, and interview by nurse.
Design: Randomised comparison of methods within a cluster randomised trial.
Setting: 106 general practices in the United Kingdom.
Participants: 32 990 people aged 75 years or over registered with participating practices.
Main outcome measures: Response rates, proportion of missing values, prevalence of self reported morbidity, and sensitivity and specificity of self reported measures by method of administration of questionnaire for four domains.
Results: The response rate was higher for the postal questionnaire than for the two interview methods combined (83.5% v 74.9%; difference 8.5%, 95% confidence interval 4.4% to 12.7%, P<0.001). The proportion of missing or invalid responses was low overall (mean 2.1%) but was greater for the postal method than for the interview methods combined (4.1% v 0.9%; difference 3.2%, 2.7% to 3.6%, P<0.001). With a few exceptions, levels of self reported morbidity were lower in the interview groups, particularly for interviews by nurses. The sensitivity of the self reported measures was lower in the nurse interview group for three out of four domains, but 95% confidence intervals for the estimates overlapped. Specificity of the self reported measures varied little by method of administration.
Conclusions: Postal questionnaires were associated with higher response rates but also higher proportions of missing values than were interview methods. Lower estimates of self reported morbidity were obtained with the nurse interview method and to a lesser extent with the lay interview method than with postal questionnaires.
What is already known on this topic
The optimum method of administering a brief multidimensional screening assessment to elderly people is not known
What this study adds
Postal questionnaires produce a higher response rate than interviews by nurses or lay interviewers but also higher proportions of missing data
Interview by nurses and to a lesser degree by lay interviewers is associated with lower levels of self reported morbidity than are postal questionnaires
Multidimensional assessment was originally developed in response to studies finding high levels of undetected medical or social problems among elderly people, 1 2 showing the need for a systematic approach to detection of problems. The effectiveness of multidimensional assessment for elderly people has been assessed in several randomised trials, with mixed results, and considerable uncertainty exists about the optimal method of administering an initial screening questionnaire. 3 4 Self administered postal questionnaires have several advantages over face to face interviews.5 They are cheaper, large numbers can be completed more quickly, interviewers do not have to be recruited and trained, and high rates of response have been obtained among elderly people.6-8
Since 1990, primary care teams in England and Wales have been required to offer an annual screening assessment to all patients aged 75 and over (the “over 75 check”).9 Although the contract for general practitioners in England and Wales recommends that people aged 75 and over are invited to “participate in a consultation,” a postal approach has been advocated for initial screening.10-13 The national service framework for older people includes a recommendation that elderly people should receive some form of single assessment “which is matched to their individual circumstances.”14 The consultation papers for the single assessment process state that either postal or face to face methods may be appropriate.15
We present a randomised comparison of three different methods of administering a screening questionnaire: post, interview by lay interviewer, and interview by nurse. All three methods have previously been used in randomised controlled trials of multidimensional screening assessments of elderly people, but none of these trials compared different methods.16-18 We compared rates of response, proportions of missing data, estimates of prevalence, and the validity of the results obtained. The data presented are from the baseline assessments of the Medical Research Council trial of the assessment and management of older people in the community.
The MRC trial
The MRC trial of the assessment and management of older people in the community is a community based randomised controlled trial comparing different approaches to multidimensional screening for people aged 75 years and over. The trial was designed to recruit 108 practices with an average of 500 eligible patients per practice, stratified to provide a representative sample in terms of mortality (standardised mortality ratio) and deprivation (Jarman score) of general practices in the United Kingdom. All practices were part of the MRC general practice research framework. In each practice, all patients aged 75 years or over registered with the practice were included in the study, unless they were resident in a long stay hospital or nursing home or were terminally ill.
We randomised practices to two groups: targeted screening and universal screening. We invited all participants to have a brief screening assessment. All participants in the universal screening arm were then invited to have a more detailed assessment. In the targeted screening arm, only people found to have a pre-specified level of problems during the brief assessment were invited to have a detailed assessment. The figure shows the study design, number of eligible participants, and rates of response.
A statistician drew up a computer generated randomisation list, stratified by tertiles of Jarman score and standardised mortality ratio; practices were randomly allocated centrally at the London School of Hygiene and Tropical Medicine as they were recruited to the trial. Because of the nature of the intervention, participants and researchers could not be blinded to the group assignment. One hundred and six practices participated in the study. The original target was 108 practices, with 18 in each of the six different combinations of universal or targeted and post, lay interview, or nurse interview (see fig). The sample sizes were based on the main trial outcomes (mortality and admissions to hospital or institution).
One practice in the universal-post group split up after randomisation, and the participating general practitioner consequently had a smaller number of patients. For this reason we recruited a further practice to that group, giving a total of 19 practices. Three practices withdrew at a late stage (after randomisation but before data collection) and were not replaced: hence three of the six groups include only 17 practices.
Data collection took place between 1995 and 1999. Before starting the assessments, the nurses and lay interviewers attended a training session. A few practices joined the study late, in which case the training took place at the practice. The nurses involved in the study were mostly practice nurses involved in practice based research; some of the nurses devoted all their time to research.
We obtained ethical approval for all aspects of the study from the relevant ethics committees.
Assessment of participants
We invited all participants in both arms of the study to have a brief screening assessment consisting of a range of health related questions covering the areas specified in the 1990 contract. We randomised practices to one of three methods of administering the brief assessment: postal questionnaire, interview by lay interviewer, or interview by nurse. We used 26 screening questions plus questions about smoking, alcohol intake, and physical activity included largely for purposes of epidemiological research. We present results for these three items here because they are specifically recommended by the national service framework for older people for inclusion in a screening assessment.14 Most of the questions had a graded response. For example, possible answers to questions about hearing and vision were “no difficulty,” “a little difficulty,” and “a lot of difficulty.”
We then invited all participants in the universal screening arm to have a more comprehensive detailed assessment by the trained nurse. In the targeted screening arm, only participants with a predetermined number and type of problems at the brief assessment went on to have the detailed assessment. We do not present the results of the detailed assessments in the targeted arm of the trial here because the participants were not a representative group of people. For four of the domains included in the brief assessment, more accurate and thorough assessments were undertaken in the detailed assessment.
Hearing—We gave participants the whispered voice test, in which specified numbers and letters are spoken in a whisper at full expiration by a tester standing 15 cm behind the patient. The test was performed with participants wearing hearing aids if they used them, thus testing participants' everyday hearing. The whispered voice test has been found to have a sensitivity of between 80% and 100% and a specificity of between 80% and 89% when compared with measurement of hearing loss in the range 30–40 decibels by pure tone audiometry.19-23
Vision—We measured participants' distance visual acuity at 3 m with a Glasgow acuity chart.24 For this study, we compared the screening question about vision with a binocular visual acuity cut-off point equivalent to a Snellen acuity of less than 6/18. Interventions such as cataract extraction would usually be offered when a visual acuity of less than 6/18 is found,25 and this level of visual impairment is below the level required to be legally permitted to drive in the United Kingdom.26
Depression—Participants completed the 15 item version of the geriatric depression scale.27-28 All questions are answered yes or no. Using a cut-off score of ≥6 to indicate depression, the 15 item version of the geriatric depression scale has been found to have a sensitivity in the range 78-85% and a specificity in the range 74-82% when compared with the Diagnostic and Statistical Manual of Mental Disorders29 or Geriatric Mental State30 criteria.31-33 We used a cut-off score of ≥6 in this study.
Cognition—Participants completed the mini-mental state examination.34 This examination is a widely used test of cognitive function and has been shown to be both valid and reliable.34-36 It has two sections: a verbal section with a maximum score of 21 and a performance section (involving, for example, copying a drawing) with a maximum score of 9. For physical or educational reasons, not all people are able to complete the performance section. The nurse administering the questionnaire decided whether participants were able to complete the performance section. Cut-off points of less than 17 for the whole test or less than 12 for the verbal section were used in the main trial to indicate the need for referral and were used to indicate likely cognitive impairment in this study.
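The cut-off rule for the mini-mental state examination can be sketched in code. This is an illustration only: the function name is hypothetical, and it assumes that the verbal cut-off (less than 12) applies when the performance section could not be completed and that the whole test cut-off (less than 17) applies otherwise. The text gives the two cut-offs but does not spell out how they were combined.

```python
from typing import Optional


def likely_cognitive_impairment(verbal_score: int,
                                performance_score: Optional[int]) -> bool:
    """Flag likely cognitive impairment from mini-mental state scores.

    verbal_score: 0 to 21 (verbal section).
    performance_score: 0 to 9, or None if the section was not completed.
    Assumption: the verbal cut-off is used only when the performance
    section is missing; otherwise the whole test cut-off is used.
    """
    if not 0 <= verbal_score <= 21:
        raise ValueError("verbal score must be between 0 and 21")
    if performance_score is None:
        # Performance section not completed: verbal cut-off of less than 12
        return verbal_score < 12
    if not 0 <= performance_score <= 9:
        raise ValueError("performance score must be between 0 and 9")
    # Whole test completed: cut-off of less than 17 out of 30
    return verbal_score + performance_score < 17
```

For example, a participant who scored 11 on the verbal section and could not attempt the performance section would be flagged, whereas one who scored 12 and 5 on the two sections (17 in total) would not.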
All data were recorded on specially designed forms and scanned electronically. We analysed the data with Stata 6 software. Because we used stratified cluster sampling by general practice, additional variance could have arisen because observations on individuals within clusters may be correlated (that is, individuals within the same practice may be more similar than individuals in different practices). All analyses took account of the cluster design in the estimation of standard errors.37-39 We analysed all participants with available data in the groups to which they were initially randomised.
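To illustrate why the cluster design matters, the sketch below estimates a proportion from practice level counts and compares a cluster based standard error with the naive binomial one. It uses a common linearisation (ratio estimator) formula and invented counts for four hypothetical practices; the paper does not state which Stata commands or estimator were used, so this should not be read as the authors' method.

```python
import math
from typing import Sequence, Tuple


def clustered_proportion(events: Sequence[int],
                         sizes: Sequence[int]) -> Tuple[float, float, float]:
    """Return (proportion, cluster based SE, naive binomial SE).

    events[i] is the number of people with the outcome in cluster i,
    sizes[i] is the number of people observed in cluster i.
    """
    k = len(events)
    if k < 2 or k != len(sizes):
        raise ValueError("need matching counts for at least two clusters")
    total = sum(sizes)
    p = sum(events) / total
    # Linearisation variance: between-cluster spread of (y_i - p * n_i)
    resid_ss = sum((y - p * n) ** 2 for y, n in zip(events, sizes))
    var_cluster = (k / (k - 1)) * resid_ss / total ** 2
    # Naive variance treats every individual as independent
    var_naive = p * (1 - p) / total
    return p, math.sqrt(var_cluster), math.sqrt(var_naive)


# Hypothetical data: responders out of 50 invited in each of four practices
p, se_cluster, se_naive = clustered_proportion([40, 45, 30, 48],
                                               [50, 50, 50, 50])
```

With these invented counts the cluster based standard error is larger than the naive one, which is the pattern expected when individuals in the same practice resemble each other.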
Because of the large number of possible comparisons, we performed hypothesis testing only for comparisons of particular interest. For example, in looking at the sensitivity and specificity of self reported measures by method of questionnaire administration there are 48 possible significance tests. We present confidence intervals to aid interpretation.
From 42 278 eligible participants, we obtained data from the brief screening assessment for 32 990 people, an overall response rate of 78.0%. Men were more likely to respond than women (80.5% v 76.7%, P<0.001), and this sex difference persisted after adjustment for age: the adjusted odds ratio for response comparing men with women was 1.22 (95% confidence interval 1.16 to 1.29, P<0.001). Data on sex were missing for eight eligible participants and could not be clarified as they had moved away. Responders were slightly younger than non-responders (median 80.3 years v 81.0 years, P<0.001). Table 1 shows rates of response by method of administration of the brief screening assessment broken down by age and sex.
The response rates were 83.5% for the postal method, 73.9% for the lay interview method, and 75.9% for the nurse interview method. The mean response rate for the two interview methods was 74.9%, a difference from the postal method of 8.5% (4.4% to 12.7%, P<0.001). The response rate for the nurse interview method was 2% higher (−4.4% to 8.5%, P=0.53) than that for the lay interview method.
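The headline response figures can be checked with simple arithmetic. The sketch below uses only the numbers reported in the text; the variable names are illustrative. It assumes that the "mean response rate for the two interview methods" is the simple average of the two reported rates, which reproduces the published 74.9%.

```python
# Figures reported in the text
eligible = 42278
responders = 32990
postal_rate = 83.5   # per cent
lay_rate = 73.9      # per cent
nurse_rate = 75.9    # per cent

# Overall response rate as a percentage
overall_rate = 100 * responders / eligible

# Assumption: simple (unweighted) mean of the two interview rates
interview_mean = (lay_rate + nurse_rate) / 2

# Difference computed from the rounded, published rates
difference = postal_rate - interview_mean
```

Recomputing from the rounded rates gives a difference of 8.6 percentage points rather than the published 8.5. The small gap is consistent with the published difference having been calculated from unrounded values, although the text does not say so.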
The response rate fell with increasing age in both men and women. In addition, the higher response rate seen for the postal approach was not apparent among the oldest age groups. This may have been because of the higher levels of home visiting undertaken by the nurse or lay interviewer in the oldest age groups. For both nurse interview and lay interview methods, around 30% of assessments were undertaken in people's homes (31.2% and 30.0%, P=0.78). The proportion of assessments undertaken in people's own homes rose from around 16% in the 75–80 year age group to around 75% in the 90 years and over age group, with no significant differences between the two interview methods.
Proportion of missing or invalid responses and prevalence of problems
Table 2 shows the proportion of missing or invalid responses and the prevalence of problems reported. The proportions of missing or invalid responses were higher in the postal questionnaire group than in the interview groups for all questions. The mean proportion of missing responses was 2.1% overall: 4.1% for postal questionnaire, 0.6% for lay interview, and 1.1% for nurse interview. The mean proportion of missing responses was 3.2% higher (2.7% to 3.6%, P<0.001) for the postal approach than for the two interview methods combined. The difference between the two interview methods was small: 0.5% higher (0.12% to 0.93%, P=0.017) for the nurse interview group.
Four of the questions (as indicated in table 2) were in two parts: the first part determined whether a domain applied, and this was followed by a branching question to quantify the problem. For example, the first question about incontinence asked if there was ever a problem and was followed by a question to determine frequency. For these domains, a missing response refers to one or both parts of the question being missing. The patterns of missing responses for these four domains were similar to the patterns for the other domains.
In the postal questionnaire group, 21% of responders stated that someone had helped them to fill in the questionnaire. The mean proportion of missing responses was slightly lower for people who had received help filling in the questionnaire (3.8% v 4.4%, P=0.03).
For 22 out of the 26 screening domains the prevalence of self reported problems was higher for the postal group than for the interview groups (the exceptions were medications, hearing, depression, and financial difficulty). Some of the observed differences were small, but larger differences were seen for many domains. To a much lesser degree, a similar pattern was seen in the lay interview group and the nurse interview group, with slightly lower levels of problems reported in the nurse interview group across all but two domains. For the three additional questions that were not originally intended for screening purposes (physical activity, smoking, and alcohol intake) no clear differences were seen in the answers obtained by the three methods.
All eligible people in the universal screening arm of the trial were invited to have a more detailed assessment by a nurse. Of 21 241 people, 15 126 (71.2%) responded; 64% of detailed assessments took place in general practice surgeries and 36% in people's homes. The median time between the brief assessment and the detailed assessment was 13 days (interquartile range 7 to 21 days).
Sensitivity and specificity of self reported measures
We compared four of the domains included in the brief assessment with more accurate and objective assessments undertaken in the detailed assessment. Table 3 shows the sensitivity and specificity for the self reported measures. The sensitivity varied by method of administration, with a lower sensitivity for the nurse interview method for three of the four domains assessed (depression was the exception). However, the 95% confidence intervals for the three different methods overlapped within each domain. The sensitivity of the questions was generally low (all less than 51%). The specificity of the self reported measures was high for all domains, with little variation by method of questionnaire administration.
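For readers less familiar with the terms, sensitivity and specificity can be computed from a two by two table of the screening question against the detailed assessment. The sketch below uses invented counts, not data from table 3, and the function name is hypothetical.

```python
from typing import Tuple


def sensitivity_specificity(true_pos: int, false_neg: int,
                            true_neg: int, false_pos: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a screening question.

    true_pos:  problem reported and confirmed at detailed assessment
    false_neg: problem not reported but found at detailed assessment
    true_neg:  problem not reported and not found
    false_pos: problem reported but not found
    """
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one person with and one without the condition")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts for one domain and one method of administration
sens, spec = sensitivity_specificity(true_pos=40, false_neg=60,
                                     true_neg=850, false_pos=50)
```

In this invented example the question picks up 40% of people with the condition and correctly classifies about 94% of people without it, a pattern of low sensitivity and high specificity similar in kind to that described above.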
The main strengths of this study are its randomised design and the fact that it is the largest study of its kind yet undertaken in this age group. Our results are directly relevant to the “single assessment” described in the national service framework for older people.15 The response rate was substantially higher for the postal questionnaire than for either interview method. Previous studies have found higher response rates,8 no difference, 40 41 or lower response rates for postal questionnaires.42-44 It is likely that people's ability or willingness to travel to the general practice affected the response rates in the interview groups. It is also possible that people prefer filling in a questionnaire.
Men were more likely to respond than women for all methods. As far as we are aware this is a novel finding. However, with the exception of one study that found no sex difference in response rates,45 previous studies of screening older people have not analysed sex differences of responders and non-responders.
The proportion of missing or invalid responses in this study was low overall, but the proportion of missing responses was significantly greater for the postal method, in line with previous work. 41 43 46 Two previous studies have found that for intimate or sensitive issues the proportions of missing data were higher for the interview method. 8 40 We did not see this phenomenon in this study.
Although the reliability (that is, repeatability) of postal questionnaires and interview administered questionnaires has been compared and generally been found to be high, 6 8 the validity of the information obtained by the two approaches is less well established. In a study of outcomes after surgery, participants had a tendency to give more positive or optimistic answers in a self completed questionnaire than in an interview.8 In studies of alcohol dependence, participants were less likely to report excessive drinking and less likely to report adverse symptoms in self administered questionnaires than in interviews. 47 48 However, in another study people were less likely to give a socially undesirable response to an interviewer than in a postal questionnaire.49 In a household health survey in which individual responses were compared with health records, the accuracy of responses was higher for postal questionnaires.40
The levels of self reported morbidity in our study were substantially lower in the two interview groups than in the postal questionnaire group. In addition, slightly lower levels of morbidity were reported to nurse interviewers than to lay interviewers. These findings are difficult to explain with any certainty. The response rate in the postal group was higher. The higher levels of morbidity observed in the postal group could have occurred if people with higher levels of morbidity were more likely to respond to a postal questionnaire than to an interview assessment. The higher proportions of missing values for the postal approach could also have inflated estimates of morbidity if people who did not have a problem were more likely to miss out a question because it did not seem relevant to them. Another possibility is that people were more likely to report potentially embarrassing problems, such as incontinence, in a postal questionnaire than in a face to face interview. However, the patterns of both missing data and reported prevalence were the same for potentially sensitive or embarrassing questions as for questions unlikely to cause embarrassment.
In the four domains for which it was possible to assess the diagnostic accuracy of the self reported measures used in the brief assessment (visual acuity, whispered voice test, geriatric depression scale, and mini-mental state examination), specificity was high and similar for the three methods across all four domains. The sensitivity was somewhat lower for the nurse interview group for three of the domains. The explanation for this finding is not clear. The relatively low sensitivities of simple questions (by any of the three methods of administration) for the detection of poor vision, hearing impairment, depression, and cognitive problems suggest that caution is needed in the use of simple questions for case finding.
In everyday practice the rates of response, proportions of missing data, and validity of the answers are likely to depend on the particular questions being asked and the skills of the interviewer as well as on the method of administration. The nurse interviewers used in this study were usually the practice nurses, and the lay interviewers were often clerical staff in the practice. Both groups received identical training in administering the brief questionnaire. They were fairly typical practice staff, not highly trained researchers, and this represents a feasible use of practice staff. The use of a postal questionnaire is supported by the higher response rate and because it is likely to be much cheaper than using an interviewer. The only clear disadvantage of the postal technique in this study was a higher proportion of missing or invalid responses, but even this higher level was only around 4%.
Whether the differences observed for the three methods affect health outcomes (mortality, hospital admission rates, and quality of life) and cost effectiveness will be answered by the ongoing randomised trial.
We thank the nurses, general practitioners, other staff, and patients in the participating practices; everyone at the MRC general practice research framework coordinating centre, particularly Jeannett Martin and Nicky Fasey; clerical staff at London School of Hygiene and Tropical Medicine (Janbibi Mazar and Rakhi Kabawala) and Hammersmith Hospital (Ruth Peters) for all their work on the study; Amina Latif (research officer) and Elaine Springer (clerical officer) of the University of Wales College of Medicine for administrative assistance; Judith Nickson (University of Cambridge) and Jennifer Evans and Richard Wormald (Moorfields Eye Hospital) for advice and help with training the research nurses; Alistair Tulloch (University of Oxford) for advice; and the trial steering committee—J Grimley Evans (chair), A Haines (previous chair), K Luker, C Brayne, M Vickers, M Drummond, S Lonsdale, and L Davies.
Contributors: LS had the idea for the study, analysed the data, and wrote the paper. AEF is the principal investigator and CJB and DJ are co-investigators of the MRC trial of the assessment and management of older people in the community and designed and implemented the trial. SS devised and carried out the randomisation procedure, monitored data collection, and took part in training the nurses and lay interviewers. MN and EB were involved in administering the study and editing data. EN took part in data management and cleaning. All authors commented on drafts of the paper. LS and AEF are the guarantors.
Funding: The MRC trial of the assessment and management of older people in the community was funded by the UK Medical Research Council, the Department of Health, and the Scottish Office. LS is funded by a research fellowship from London NHS Executive.
Competing interests: None declared.