Epidemiological surveys use various study designs and range widely in size. At one extreme a case-control investigation may include fewer than 50 subjects, while at the other, some large longitudinal studies follow up many thousands of people for several decades. The main study designs will be described in later chapters, but we here discuss important features that are common to the planning and execution of surveys, whatever their specific design.
The success of data collection requires careful preparation. The first and often the most difficult question is “Why am I doing this survey?” Many studies start with a general hope that something interesting will emerge, and they often end in frustration. The general interest has first to be translated into precisely formulated, written objectives. Every survey should be reasonably sure to give an adequate answer to at least one specific question. This initial planning requires some idea of the final analysis; and it may be useful at the outset to outline the key tables for the final report, and to consider the numbers of cases expected in their major cells.
Every study needs a primary purpose. It is easy to argue “While we have the subjects there, let’s also measure…”; but overloading, whether of investigators or subjects, must be avoided if it in any way threatens the primary purpose. Sometimes subsidiary objectives may be pursued in subsamples (every nth subject, or in a particular age group) or by recalling some subjects for a second examination: when their initial contact has been favourable then response to recall is usually good.
Before planning the detail of a study, it is wise to carry out a library search of the relevant background publications. Occasionally this may show the answer to the study question without any need for further data collection; or it may uncover useful sources of published information, such as the registrar general’s mortality and cancer registry reports, which can form the basis of an analysis without the requirement for an expensive and time consuming field survey. Even when survey work remains necessary, experience in earlier related investigations may guide the design or indicate pitfalls to be avoided.
The overriding need in an epidemiological survey is to examine a representative sample of adequate size in a standardised and sufficiently valid way. This determines the choice of examination methods and the points where these differ from those of clinical practice. Methods must be acceptable, and if possible noninvasive, or else cooperation suffers and the study group becomes unrepresentative. They must be relatively cheap and quick, or not enough subjects can be examined: with fixed resources the need for detail conflicts with the need for numbers. Most important of all, methods and observers must be capable of rigorous standardisation, even if this excludes the benefits of clinical judgement.
Information abstracted from existing records
Sometimes adequately standardised information is already available from existing records. For example, in a study to examine the long term incidence of hypothyroidism after treatment with radioiodine for thyrotoxicosis, it was possible to identify treated patients and obtain the information needed to follow them up (name, date of birth, sex, address, etc) by searching hospital files. When existing records are exploited in this way, the required information is normally abstracted on to a specially designed form or even direct on to a portable computer.
The design of the abstraction form or of the computer program for inputting data should take into account the layout of the source material. Having to flick repeatedly backwards and forwards through the source record is not only tedious and time consuming, but may also increase the chance of error. Each abstracted record should be identified by a serial number, and should include sufficient information to permit easy access back to the source material for checking and to obtain additional data if required. When data are not abstracted direct on to computer, later transfer to computer will often be facilitated by numerical coding, in which case coding boxes can be provided on the right hand side of the abstraction form. Some items of data (for example, dates of birth) can easily be written direct into the coding boxes. Others, such as occupation, may need to be recorded in words and coded later as a separate exercise. Time spent writing is minimised if non-numerical information is, when possible, ringed or ticked rather than having to be written out. To minimise the chance of error, any reformulation of numerical data (for example, derivation of age at hospital admission from date of birth and date of admission) should be carried out by the computer after data entry, and not as part of the abstraction process. When coding data, allowance must be made for the possibility of missing information.
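The derivation of age at admission mentioned above is the sort of reformulation best left to the computer. A minimal sketch (the dates and the function name are hypothetical illustrations, not from the text):

```python
from datetime import date

def age_at(event: date, birth: date) -> int:
    """Completed years of age at the event date."""
    before_birthday = (event.month, event.day) < (birth.month, birth.day)
    return event.year - birth.year - before_birthday

# Hypothetical subject: born 15 June 1954, admitted 2 March 1990
print(age_at(date(1990, 3, 2), date(1954, 6, 15)))  # 35
```

Deriving the age once, in code, avoids the arithmetic slips that creep in when abstractors compute it by hand at the record.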
Epidemiological data are often obtained by means of questionnaires. These may be either self administered (that is, completed by the subject) or administered at interview. Self administered questionnaires are easier to standardise because the possibility of systematic differences in interviewing technique is avoided. On the other hand, they are limited by the need to be unambiguously understood by all subjects. An interviewer may be essential to collect information on complex topics.
Good design of questionnaires requires skill. The language used should be clear and simple. Two short questions, each covering one point, are better than one longer question which covers two points at once. A question that has been used successfully in a previous study has obvious advantages. The order of questions should take into account the sensitivities of the person to whom they are addressed – it is better to start with “What is your date of birth?” than launch straight into “Have you ever been treated for gonorrhoea?” – and should be designed to facilitate recall. For example, all questions relating to one phase of the person’s life might be grouped together. As a check on the reliability of information, it may sometimes be helpful to include overlapping questions. In a study of risk factors for back pain, some people reported that their jobs entailed driving for more than four hours a day but did not involve more than two hours sitting. This suggests that they had not properly understood the questions. An important consideration is whether to use closed or open ended questions. Closed ended questions, with one box for each possible answer (including “don’t know”) are more readily answered and classified, but cannot always collect information in the detail that is required. When interviewers are used then the wording with which they ask questions should be standardised as far as is compatible with the need to obtain useful information. As in abstracting existing records, the forms used to record answers to questions should be designed for ease and accuracy of completion and to simplify subsequent coding and analysis.
Physical examination and clinical investigations
Methods of physical examination should be designed to reduce variation within and between observers. Often, a quantitative measurement (for example, respiratory rate) is easier to standardise than a qualitative judgement (whether someone is tachypnoeic or not). Standardisation of laboratory assays can be improved by careful specification of the method by which specimens should be collected and stored and by rigorous quality control of the analysis.
Whatever method of data collection is adopted, it is usually worth trying it out in a pilot survey before embarking on the main study. Identification of practical snags at this stage can save much difficulty later. In large studies the questionnaire or record design should be discussed with the statistician who will later be concerned in the analysis.
In a small study the doctor himself may do all the work, but in large surveys he will need helpers. If an epidemiological examination technique requires skill and clinical judgement it has probably been insufficiently standardised: if it is adequately standardised it can usually be taught to any intelligent person.
The figure shows how two observers had distinct but opposite time trends in their performances during the early stages of a survey of skinfold thickness. Such training effects, which are common, should have been completed before the start of the main study: new staff need supervised practice under realistic field conditions followed by pre-survey testing.
Figure: Trend in mean values for triceps skinfold thickness obtained by two observers in the same survey
Despite all precautions, observer differences may persist. Observers should therefore be allocated to subjects in a more or less random way: if, for example, one person examined most of the men, and another most of the women, then observer differences would be confounded with true sex differences. To maintain quality control throughout the survey each examiner’s identity should be entered on the record, and results for different examiners may then be compared.
Most surveys and trials are smaller than the investigator would wish, lack of numbers often setting a limit to some desirable subgroup analysis. This is inevitable. What can be avoided is discovering only at the final analysis that numbers do not permit achievement even of the study’s primary objective. To prevent this disappointment the purpose of the study has first to be formulated in precise statistical terms. If the aim is to estimate prevalence, then sample size will depend on the required accuracy of that estimate. (Table 5.1 gives some examples.) Sampling error is proportionally greater for less common conditions; that is to say, to achieve the same level of confidence requires a larger sample if prevalence is low.
Table 5.1 95% confidence limits for various rates and sample sizes

| Estimated prevalence (%) | 95% confidence limits (n = 500) | 95% confidence limits (n = 1000) |
| --- | --- | --- |
| 2 | 1.0 – 3.7 | 1.2 – 3.1 |
| 10 | 7.5 – 13.0 | 8.2 – 12.0 |
| 20 | 16.6 – 23.8 | 17.6 – 22.6 |
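The limits in Table 5.1 can be reproduced by computer. The sketch below (an illustration, not part of the original text) computes the exact (Clopper-Pearson) confidence interval for an observed proportion by bisection on the binomial distribution:

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def exact_ci(cases: int, n: int, level: float = 0.95):
    """Clopper-Pearson ('exact') confidence interval for a proportion."""
    alpha = 1 - level

    def solve(keep_lower):  # bisect on p between 0 and 1
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if keep_lower(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # lower limit: the p at which P(X >= cases) falls to alpha/2
    lower = 0.0 if cases == 0 else solve(
        lambda p: 1 - binom_cdf(cases - 1, n, p) <= alpha / 2)
    # upper limit: the p at which P(X <= cases) falls to alpha/2
    upper = 1.0 if cases == n else solve(
        lambda p: binom_cdf(cases, n, p) > alpha / 2)
    return lower, upper

# 10 cases in a sample of 500 (2% observed prevalence)
lo, hi = exact_ci(10, 500)
print(f"{lo:.1%} - {hi:.1%}")  # 1.0% - 3.7%, as in Table 5.1
```

Running the same function with 20 cases in 1000 reproduces the narrower interval in the right hand column, illustrating why a low prevalence demands the larger sample.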
Techniques also exist for calculating sample sizes required for estimating, with specified precision, the mean value of a variable, or for identifying a given difference in prevalence or mean values between two populations. These techniques may be found in textbooks or (better) by consulting a statistician; but either way the investigators must first know exactly what they want to achieve.
When the study sample is selected from a larger study population, statistical inference will be more rigorous if the selection process is random, or effectively random; that is to say, if each individual in the study population has a known (usually identical) non-zero probability of selection. To achieve this a census or listing of the study population is first required. In a survey of adults in a hospital district the electoral register will probably serve. In an occupational group there will be a payroll, and in a school there are class registers. In general practice there is an age-sex register. To choose a simple random sample the listed people are numbered serially. Numbers within the appropriate range are then read off from a table or computer generated list of random numbers until enough people have been selected.
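The draw from a serially numbered list can be sketched in a few lines (the population and sample sizes here are hypothetical):

```python
import random

population = list(range(1, 5001))  # serial numbers 1 to 5000
rng = random.Random(20)            # fixed seed so the draw is reproducible

# 100 distinct serial numbers; every listed person is equally likely
# to be chosen, which is what makes the sample a simple random sample
sample = rng.sample(population, 100)
```

The random number table of older surveys is here replaced by the computer's generator, but the principle is identical.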
It may be that an investigator wishes to choose a sample in which certain subgroups (particular ages, for instance, or high risk categories) are relatively overrepresented. To achieve this he may divide the study population into subgroups (strata) and then draw a separate random sample from each, while adjusting the various sample sizes to suit the investigation’s requirements. This is a stratified random sample.
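A stratified draw differs only in that the sampling is carried out separately within each stratum. In the sketch below the strata, their serial number ranges, and the sample sizes are all hypothetical; note that the oldest group is deliberately oversampled (50 of 500 listed, against 50 of 3000 in the youngest):

```python
import random

rng = random.Random(7)

# Hypothetical age strata: (list of serial numbers, sample size to draw)
strata = {
    "16-44": (list(range(1, 3001)), 50),
    "45-64": (list(range(3001, 4501)), 50),
    "65+":   (list(range(4501, 5001)), 50),
}

# A separate simple random sample is drawn from each stratum
sample = {name: rng.sample(people, k) for name, (people, k) in strata.items()}
```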
The study population may be large and widely scattered – for example, all the general practices in a city – but for the sake of convenience the investigator may wish to concentrate his survey in a few areas only. This can be done by drawing first a random sample of practices, and then, within these practices, drawing a random sample of individuals. Such two stage sampling works well, but there is some loss of statistical efficiency, especially if only a few units are selected at the first stage.
Most people are willing to take part in medical surveys provided that they trust the investigators, just as patients will nearly always help their own doctors in their research. In population studies, however, there has usually been no previous contact. The selected subjects need an explanation of the purpose of the study, of why they in particular have been asked to take part, of what is expected from them, and what if anything they will get out of it (for instance a medical check up or a report on the research findings). Local general practitioners, too, need to know what is going on. Time given to preparatory public relations is always well spent.
Response must be made as easy as possible. If attendance at a centre is required, it is better to send everyone a provisional appointment than to expect them to reply to a letter asking whether they are willing to attend. Provision of transport may be welcomed. Often the difference between a mediocre response and a good one is tactful persistence, including second invitations (perhaps by recorded delivery), telephone calls, identifying the reasons for non-attendance, and home visits.
The level of response that is acceptable depends both on the study question and on the population in which the question is being asked. Problems arise because non-responders may be atypical. For example, in a survey of coronary risk factors among adults registered with a group practice, those at highest risk may be the least inclined to complete a questionnaire or attend for examination. If a response rate of 85% were achieved, an estimated prevalence of heavy alcohol consumption of 3% among the responders could be substantially too low if most of the non-responders drank heavily. On the other hand an estimated 50% prevalence of smokers would not need major revision, even if all of the non-responders smoked.
What matters is how unrepresentative non-responders are in relation to the study question. It is not important whether they are atypical in other respects. In a survey to evaluate the association between serum IgE concentrations and ventilatory function it would not matter if non-responders had an unusually high frequency of respiratory disease, provided that the relation of their ventilatory function to IgE was not unrepresentative.
Assessment of the likely bias resulting from incomplete response is ultimately a matter of judgement. However, two approaches may help the assessment. Firstly, a small random sample can be drawn from the non-responders, and particularly vigorous efforts made to encourage their participation, including home visits. The findings for this subsample will then indicate the extent of bias among non-responders as a whole. Secondly, some information is generally available for all people listed in the study population. From this it will be possible to contrast responders and non-responders with respect to characteristics such as age, sex, and residence. Differences will alert the investigator to the possibility of bias.
In addition, it may help to put absolute bounds on the uncertainty arising from non-response by making extreme assumptions about the non-responders. For example, if the aim of a survey were to estimate a disease prevalence, what would be the prevalence if all of the non-responders had the disease, or none of them?
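The arithmetic of such extreme assumptions is simple. With hypothetical figures (1000 invited, 850 responding, 26 responders affected):

```python
# Hypothetical survey: 1000 invited, 850 respond, and 26 responders
# (about 3%) report the condition of interest
n, responders, cases = 1000, 850, 26

observed = cases / responders            # prevalence among responders
lowest = cases / n                       # assume no non-responder is affected
highest = (cases + n - responders) / n   # assume every non-responder is

print(f"{observed:.1%}, bounds {lowest:.1%} to {highest:.1%}")
# 3.1%, bounds 2.6% to 17.6%
```

The wide upper bound shows how a modest shortfall in response can, in the worst case, conceal a much higher prevalence of an uncommon condition.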
Small studies can sometimes be analysed manually with the help of a calculator. Nowadays, however, the analysis of epidemiological data is almost always carried out by computer. With recent advances in technology, all but the largest data sets can be handled satisfactorily on a personal computer. Moreover, a wide range of software packages is now available to assist epidemiological analysis.
The starting point for analysis by computer is the coding and entry of data. These procedures should be checked, usually by carrying them out in duplicate. In addition, once the data have been entered, further checks should be made to ensure that all codes are valid (for example, nobody should have 31 February as a birth date) and to look for any internal inconsistencies (such as a date of admission to hospital being earlier than the subject’s date of birth). Statistical analysis should only begin when the data set is as “clean” as possible.
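Such validity and consistency checks are easy to automate. A minimal sketch, in which the record layout and field names are hypothetical:

```python
from datetime import date

# Hypothetical abstracted records: serial number plus dates as ISO strings
records = [
    {"id": 1, "birth": "1954-06-15", "admitted": "1990-03-02"},
    {"id": 2, "birth": "1960-02-31", "admitted": "1991-07-20"},  # invalid code
    {"id": 3, "birth": "1972-01-10", "admitted": "1968-05-01"},  # inconsistent
]

def check(rec):
    """Return a list of problems found in one record."""
    errors = []
    try:
        birth = date.fromisoformat(rec["birth"])
        admitted = date.fromisoformat(rec["admitted"])
    except ValueError:          # e.g. 31 February is not a valid date
        return ["invalid date code"]
    if admitted < birth:        # internal inconsistency between fields
        errors.append("admitted before birth")
    return errors

for rec in records:
    for err in check(rec):
        print(rec["id"], err)
```

Every flagged record can then be traced back, via its serial number, to the source material for correction before analysis begins.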
With the ready availability of software packages, it is tempting for medical investigators to embark on analyses they do not fully understand, and in the process they may use inappropriate statistical techniques. For this reason it is preferable to obtain advice from a statistician when carrying out all but the simplest analyses. As with the earlier stages of data processing, statistical calculations should all be checked.