Time to event (survival) data
BMJ 1998; 317 doi: http://dx.doi.org/10.1136/bmj.317.7156.468 (Published 15 August 1998). Cite this as: BMJ 1998;317:468
- ICRF Medical Statistics Group, Centre for Statistics in Medicine, Institute of Health Sciences, Oxford OX3 7LF
- Department of Public Health Sciences, St George's Hospital Medical School, London SW17 0RE
In many medical studies an outcome of interest is the time to an event. Such events may be adverse, such as death or recurrence of a tumour; positive, such as conception or discharge from hospital; or neutral, such as cessation of breast feeding. It is conventional to talk about survival data and survival analysis, regardless of the nature of the event. Similar data also arise when measuring the time to complete a task, such as walking 50 metres.
The distinguishing feature of survival data is that at the end of the follow up period the event will probably not yet have occurred in every patient. For the patients in whom it has not occurred the survival time is said to be censored, indicating that the observation period was cut off before the event occurred. We do not know when (or, indeed, whether) the patient will experience the event, only that he or she has not done so by the end of the observation period.
Censoring may also occur in other ways. Patients may be lost to follow up during the study, or they may experience a “competing” event which makes further follow up impossible. For example, patients being followed to a cardiac event may die from some other disease or in an accident.
In most survival studies patients are recruited over a period and followed up to a fixed date beyond the end of recruitment. Thus the last patients recruited will be observed for a shorter period than those recruited first and will be less likely to experience the event. An important assumption, therefore, is that patients' survival prospects (prognosis) stay the same throughout the study (although this will not matter too much in a randomised trial). We also assume that patients lost to follow up have the same prognosis as those remaining in the study.
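The recruitment and censoring scheme described above can be made concrete with a small sketch. It assumes each patient is stored as a pair: the time observed and a flag saying whether the event was seen. The function name `follow_up`, its parameters, and all dates are hypothetical and chosen only for illustration.

```python
from datetime import date

def follow_up(entry, study_end, event=None, lost=None):
    """Return (days_observed, event_observed) for one patient."""
    # Observation stops at whichever comes first: the event, loss to
    # follow up, or the fixed closing date of the study.
    # The event is listed first so that it wins a tie on the same date.
    stops = []
    if event is not None:
        stops.append((event, True))
    if lost is not None:
        stops.append((lost, False))
    stops.append((study_end, False))
    stop_date, observed = min(stops, key=lambda s: s[0])
    return (stop_date - entry).days, observed

study_end = date(1995, 1, 1)  # hypothetical closing date

# Recruited early, event observed during follow up.
early = follow_up(date(1990, 1, 1), study_end, event=date(1990, 1, 31))
# Recruited late, still event free when the study closes: censored.
late = follow_up(date(1994, 1, 1), study_end)
# Lost to follow up before the closing date: also censored.
dropped = follow_up(date(1992, 1, 1), study_end, lost=date(1992, 3, 1))

print(early, late, dropped)
```

A censored record still carries information: the patient is known to have been event free for at least the recorded time.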
Table 1 shows the survival times of 44 patients in a randomised trial. Several patients in each group were still alive at the end of the study, while one was lost to follow up. In such a study we wish to compare the survival times of the two groups of patients. Statistical methods such as t tests cannot cope with the uncertainty in the data caused by censoring. Patients with censored data contribute valuable information and they should not be omitted from the analysis. It would also be wrong to treat the observed time (at censoring) as the survival time. We cannot tell, for example, whether the patient in the control group who was still alive at 127 months would have lived longer than the patient in the prednisolone group who died after 143 months. Rather we need recourse to a specialised set of statistical methods that have been developed for handling such data. We shall consider methods for graphical display and analysis of survival data in subsequent Statistics Notes.
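A small simulation may help show why neither shortcut works. The sketch below does not use the trial data in Table 1. It invents survival times from an exponential distribution with a mean of 60 months and gives each patient between 24 and 72 months of possible follow up; these distributions and numbers are assumptions made purely for illustration. It then compares the true mean survival with the mean obtained by treating censoring times as survival times, and with the mean obtained by omitting censored patients.

```python
import random

def simulate(n=5000, mean_survival=60.0, seed=1):
    """Simulate true survival times and what a study would observe."""
    rng = random.Random(seed)
    true_times, observed, events = [], [], []
    for _ in range(n):
        t = rng.expovariate(1.0 / mean_survival)  # true survival time (months)
        c = rng.uniform(24.0, 72.0)               # follow up available (months)
        true_times.append(t)
        observed.append(min(t, c))                # what the study records
        events.append(t <= c)                     # True if death was observed
    return true_times, observed, events

def mean(values):
    return sum(values) / len(values)

true_times, observed, events = simulate()

true_mean = mean(true_times)
# Shortcut 1: treat the time at censoring as if it were the survival time.
naive_mean = mean(observed)
# Shortcut 2: omit the censored patients altogether.
deaths_only_mean = mean([t for t, e in zip(observed, events) if e])
censored_fraction = 1 - sum(events) / len(events)

print(f"true mean survival:         {true_mean:.1f} months")
print(f"censored times as survival: {naive_mean:.1f} months")
print(f"censored patients omitted:  {deaths_only_mean:.1f} months")
print(f"fraction censored:          {censored_fraction:.2f}")
```

Under these assumptions both shortcuts understate survival, because the longest survivors are exactly the patients most likely to be censored.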
Implicit in the preceding discussion is that survival should be evaluated in a cohort of patients followed forwards in time from a particular time point, such as diagnosis or randomisation, even if the cohort is identified retrospectively. An alternative, and potentially highly misleading, approach is to take a group of people experiencing the event of interest, perhaps in a certain time interval, and ascertain the elapsed time since the start of the relevant preceding time span. For example, we might take all newly diagnosed diabetics and find out when they first experienced certain symptoms. Similarly we might take birth as the start of the time period of interest for a group of individuals who have died and investigate associations between age at death and other variables.
Analyses of such data can cause serious problems. A good example is the highly dubious finding that left handed people die on average seven years younger than right handed people [2]. In this study those dying at old ages were survivors from a cohort born 70 or more years ago while those dying young may have been born at any time, and so on average will have been born later. Such studies make strong implicit assumptions: in essence, that the prevalence of the risk factor(s), the characteristics of the population at risk, and the survival (prognosis) remain unchanged over many decades [3]. These assumptions will usually be untenable and may also be untestable. Using this study design we would certainly find that people who use electric guitars or even personal computers die much younger than those who do not. The differing longevity in relation to handedness [2] would have arisen if the prevalence of left handedness had increased over the past 80 years. Proper prospective studies have found no evidence of an effect of handedness on lifespan [4, 5].
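The handedness artefact can be reproduced in a simulation in which handedness has no effect on lifespan at all. The sketch below is not a reconstruction of the cited study. Every number in it is an invented assumption: births spread evenly from 1880 to 1990, the recorded prevalence of left handedness rising linearly from 2% to 12% over that period, and a single lifespan distribution (mean 72 years, standard deviation 15) shared by both groups. It then looks only at people dying in a three year window and compares their mean ages at death.

```python
import random

def deaths_in_window(n_births=300_000, window=(1989.0, 1992.0), seed=2):
    """Collect ages at death, by handedness, for deaths inside the window."""
    rng = random.Random(seed)
    ages = {"left": [], "right": []}
    for _ in range(n_births):
        birth = rng.uniform(1880.0, 1990.0)
        # Assumed prevalence of left handedness, rising with year of birth.
        p_left = 0.02 + 0.10 * (birth - 1880.0) / 110.0
        hand = "left" if rng.random() < p_left else "right"
        # The same lifespan distribution for both groups: no true effect.
        lifespan = max(0.0, rng.gauss(72.0, 15.0))
        death = birth + lifespan
        # Keep only people who die within the sampling window.
        if window[0] <= death < window[1]:
            ages[hand].append(lifespan)
    return ages

ages = deaths_in_window()
mean_left = sum(ages["left"]) / len(ages["left"])
mean_right = sum(ages["right"]) / len(ages["right"])

print(f"left handed:  n={len(ages['left'])}, mean age at death {mean_left:.1f}")
print(f"right handed: n={len(ages['right'])}, mean age at death {mean_right:.1f}")
```

Under these assumptions the left handed deaths are younger on average, simply because left handed people in the sample tend to have been born later. Following each birth cohort forwards in time would show no difference, since the lifespans were drawn from one distribution.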
The same design was used in a study of long term survival in prostate cancer. All patients dying in a three year period who had been treated with palliative intent were “followed from death to diagnosis” [6], a period of up to 30 years. The authors reported that the proportion of deaths due to cancer increased with length of survival. This finding cannot be trusted because of the problems noted above, which are common to all such studies [3]. Subjects with long survival times must have been diagnosed decades ago, whereas those with short survival times may include some patients diagnosed recently. The observed association could be a spurious consequence of improved treatment, earlier diagnosis, or some other change over time. The same error was seen recently in the BMJ [7].
Retrospective studies can be valuable, but this design should be avoided when studying survival times. Whenever possible, times to an event of interest should be studied in a definable cohort of individuals followed forwards in time.