Rapid responses are electronic comments to the editor. They enable our users to debate issues raised in articles published on bmj.com. A rapid response is first posted online. If you need the URL (web address) of an individual response, simply click on the response headline and copy the URL from the browser window. A proportion of responses will, after editing, be published online and in the print journal as letters, which are indexed in PubMed. Rapid responses are not indexed in PubMed and they are not journal articles. The BMJ reserves the right to remove responses which are being wilfully misrepresented as published articles.
12 February 2003
Margaret May
statistician
Jonathan Sterne, Matthias Egger
Dept. Social Medicine, Bristol University, Bristol BS6 2PR
Parametric survival models versus Kaplan-Meier estimates
Editor
Lundin et al.1 are to be congratulated on their web-based system for individualised survival estimation in breast cancer, based on the Finprog Study (http://finprog.primed.info). In their system, the user enters the profile of a breast cancer patient, which is matched against a database of 2,032 patients. Patients with identical profiles are retrieved and used to display a Kaplan-Meier estimate of cumulative survival probabilities over time.
The authors claim as an advantage of the system that "clinicians and researchers can obtain survival estimates based on actual data, rather than inferential estimates generated by a regression formula." However, any regression formula is based on actual data. More importantly, survival estimates from a regression model may be substantially more precise than Kaplan-Meier estimates when there are few patients in particular strata. For example, a patient aged 35 or under with positive lymph node status and lobular histologic type who attempted to use Lundin et al.'s system would find their survival estimate to be based on 2 patients and 1 death, and hence to be very imprecise.
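The imprecision in a stratum of this size can be checked by hand. The following is a minimal Python sketch, not the code of Lundin et al.'s system: it computes the Kaplan-Meier (product-limit) estimate with Greenwood's standard error and a plain symmetric 95% interval. The follow-up times are hypothetical values chosen only to give 2 patients and 1 death.

```python
import math

def kaplan_meier(times, events):
    """Product-limit estimate with Greenwood standard errors.

    times  -- follow-up time of each patient
    events -- 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival, standard_error), one per death time.
    """
    pairs = list(zip(times, events))
    at_risk = len(pairs)
    survival, greenwood_sum = 1.0, 0.0
    steps = []
    for t in sorted(set(times)):
        deaths = sum(e for u, e in pairs if u == t)
        leaving = sum(1 for u, _ in pairs if u == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            if at_risk > deaths:
                greenwood_sum += deaths / (at_risk * (at_risk - deaths))
                se = survival * math.sqrt(greenwood_sum)
            else:
                se = float("nan")  # Greenwood's formula is undefined when no one remains
            steps.append((t, survival, se))
        at_risk -= leaving
    return steps

def plain_ci(survival, se, z=1.96):
    """Symmetric interval, truncated to the range 0 to 1."""
    return max(0.0, survival - z * se), min(1.0, survival + z * se)

# Hypothetical stratum: 2 patients, 1 death at 1.5 years, 1 censored at 4 years
t, s, se = kaplan_meier([1.5, 4.0], [1, 0])[0]
low, high = plain_ci(s, se)
print(f"S({t}) = {s:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

With these assumed data the estimate is 50% survival, and the truncated interval runs from 0 to 1, so the data alone exclude nothing.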
A second advantage claimed by Lundin et al. is that the output of their system is "a survival curve for the entire available follow up period and not just for a single time point." However, use of parametric survival models makes it straightforward to estimate survival probabilities for all times during the follow up period, for each combination of values of the prognostic variables.2
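To illustrate the point, a parametric model gives a closed-form survival function that can be evaluated at any time during follow up. The sketch below assumes a Weibull proportional hazards form, which is one common choice and is not specified in this letter; the function name and all parameter values are hypothetical.

```python
import math

def weibull_survival(t, scale, shape, log_hazard_ratio=0.0):
    """S(t) = exp(-scale * t**shape * exp(log_hazard_ratio)).

    log_hazard_ratio is the sum of the regression coefficients for the
    patient's prognostic categories (0 for the reference patient).
    """
    return math.exp(-scale * t ** shape * math.exp(log_hazard_ratio))

# Hypothetical parameters: survival at several times (years) for one profile
curve = [(t, weibull_survival(t, scale=0.10, shape=0.8, log_hazard_ratio=0.5))
         for t in (0.5, 1.0, 2.0, 3.0)]
for t, s in curve:
    print(f"t = {t} years: estimated survival {s:.3f}")
```

Because the curve is a smooth function of time and of the covariates, the same few fitted parameters give an estimate at every time point for every profile.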
We have modelled the prognosis of drug-naïve HIV positive patients starting potent antiretroviral therapy using data from the Antiretroviral Cohort Collaboration (ART-CC), a collaboration of thirteen HIV observational cohorts based in Europe and North America that supplied information on 12,574 patients2 (http://www.art-cohort-collaboration.org). We identified five prognostic factors: CD4 cell count, viral load, AIDS at start of therapy, age and transmission group. These allocate patients to eighty strata, in some of which there were few events and no deaths at all, so that estimation of survival probabilities using Kaplan-Meier curves is impossible. When regression models are used to estimate survival probabilities, estimates for strata with few or no events borrow strength from the pattern of events across all categories of the prognostic variables (at the cost of additional assumptions, for example that there is no interaction between the effects of different prognostic factors). Evaluation of the prognostic model in data from new patients has shown the model to be accurate and generalisable.
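The arithmetic behind "borrowing strength" can be sketched as follows. The category levels below are assumptions chosen for illustration (the letter names the five factors and the total of eighty strata, but does not list every level). Under a model with no interactions, each factor contributes one parameter per non-reference level, so far fewer quantities are estimated than there are strata.

```python
from itertools import product

# Illustrative levels only: an assumed breakdown that yields eighty strata
factors = {
    "cd4_count": ["<50", "50-99", "100-199", "200-349", ">=350"],
    "log_viral_load": ["<=5", ">5"],
    "aids_at_start": ["no", "yes"],
    "age": ["<50", ">=50"],
    "transmission": ["injecting drug use", "other"],
}

# Every combination of levels is one stratum needing its own Kaplan-Meier curve
strata = list(product(*factors.values()))
n_strata = len(strata)

# A main-effects regression needs one coefficient per non-reference level
n_main_effects = sum(len(levels) - 1 for levels in factors.values())

print(n_strata, "strata versus", n_main_effects, "regression coefficients")
```

Each coefficient is estimated from all patients who share that level, which is why a stratum with few or no events still receives a usable estimate, provided the no-interaction assumption is reasonable.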
The figure contrasts the estimates from the parametric survival model for new AIDS event or death with those calculated using the Kaplan-Meier (KM) method for patients who were injecting drug users, had AIDS at start of therapy and had log viral load >5. The figure shows the survival curves for a) 119 patients (28 events) with CD4 cell count <50 and aged <50, b) 11 patients (2 events) with CD4 cell count 100-199 and aged 50 or more, c) 3 patients with CD4 cell count 50-99 and aged 50 or more, and d) 4 patients (0 events) with CD4 cell count <50 and aged 50 or more. The model and KM curves agree quite closely in a) and b), but the 95% confidence interval (CI) for the model is much narrower than that for the KM estimate, particularly in the smaller group. In c) the estimates do not agree; moreover, the KM CI at one year, which starts at 5% survival, is too broad to be useful for prediction, and the KM estimate peters out altogether at less than two years, as all patients have either progressed to AIDS, died or been lost to follow up.
Figure d) shows a similar sized group to that in c) but by chance there were no events in this group despite having a worse risk profile (lower CD4 cell count). The KM estimate is therefore 100% survival, a completely misleading prediction. The four graphs show that KM estimates of survival are less precise than parametric survival model estimates, particularly for groups with few patients. Furthermore, in very small groups they may also be very inaccurate.
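How easily a group of four can show no events by chance is a short calculation. The sketch below is a simplification that assumes every patient is followed for the whole period and has the same, independent event probability; the 30% risk used in the example is hypothetical and is not taken from the ART-CC data.

```python
def prob_no_events(n_patients, event_prob):
    """Chance that none of n independent patients has an event."""
    return (1 - event_prob) ** n_patients

def largest_compatible_risk(n_patients, alpha=0.05):
    """Largest event probability for which zero events among n patients
    still has probability of at least alpha."""
    return 1 - alpha ** (1 / n_patients)

# Hypothetical true risk of 30% in a group of 4 patients
chance = prob_no_events(4, 0.30)
upper = largest_compatible_risk(4)
print(f"P(no events) = {chance:.2f}; risk up to {upper:.2f} is compatible")
```

Under these assumptions roughly one such group in four would show no events at all, and an event risk of about one half could not be excluded, even though the KM estimate for that group is 100% survival.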
Margaret May
Statistician
Jonathan Sterne
Reader in Medical Statistics and Epidemiology
Department of Social Medicine, University of Bristol, Bristol BS6 2PR, UK
Matthias Egger
Professor of Epidemiology
Department of Social and Preventive Medicine, University of Bern, Switzerland
1. Lundin J, Lundin M, Isola J, Joensuu H. A web-based system for individualised survival estimation in breast cancer. BMJ 2003;326:29.
2. Egger M, May M, Chêne G, et al. Prognosis of HIV-1-infected patients starting highly active antiretroviral therapy: a collaborative analysis of prospective studies. Lancet 2002;360:119-129.
FIGURE LEGEND
Figure showing the survival curve predicted from the parametric model and 95% CI with the corresponding KM estimate and 95% CI for patients who were injecting drug users, had AIDS at start of therapy and had log viral load >5 and had a) CD4 <50, age <50, b) CD4 100-199, age >50, c) CD4 50-99, age >50, d) CD4 <50, age >50.
Competing interests: None declared