Information In Practice Infopoints

A web-based system for individualised survival estimation in breast cancer

BMJ 2003; 326 doi: (Published 04 January 2003) Cite this as: BMJ 2003;326:29
  1. Johan Lundin, researcher,
  2. Mikael Lundin, researcher,
  3. Jorma Isola, professor,
  4. Heikki Joensuu, professor
  1. Department of Oncology, University of Helsinki, FIN-00290, Helsinki, Finland
  2. Institute of Medical Technology, University of Tampere and Tampere University Hospital, FIN-33014 Tampere, Finland
  3. Molecular/Cancer Biology Research Programme and Department of Oncology, University of Helsinki, FIN-00290, Helsinki, Finland

    Clinicians want prognostic tools that not only aid prognostic classification, but also give quantitative probabilities of survival.1 We describe a way of generating survival estimates that uses existing survival data and generates survival curves online dynamically.

    The website

    On the website a selection of prognostic factors is available for case-match survival estimation (figure). The default selection in the drop down list for each factor is “all,” which means that no selection has been made for that factor. The user can enter a prognostic factor profile by selecting any of the categories in the drop down lists. The software then queries the database to retrieve data on patients with matching prognostic profiles and known outcome and calculates a survival curve according to the Kaplan-Meier product-limit method using the actual survival data of all matching patients. The number of patients at risk, the confidence intervals for the Kaplan-Meier estimates, and the median survival time are also displayed. The user can compare two factor profiles by clicking the “two profiles” option. The distribution of patients according to vital status, therapy received, or a specific prognostic factor can also be displayed as a table or a chart (figure). The website also contains basic information about survival statistics and the prognostic factors, including guidelines for selecting variables and interpreting the results.
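
    The case-match step and the product-limit calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software behind the website: the `Patient` record, the factor names (`nodal_status`, `tumour_size`), and the six toy patients are invented for the example and are not taken from the FinProg series.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Patient:
    factors: Dict[str, str]  # hypothetical factor name -> category
    time: float              # follow up time in years
    event: bool              # True if the event was observed, False if censored


def match(patients: List[Patient], profile: Dict[str, str]) -> List[Patient]:
    """Keep patients who agree with every selected category; "all" means no selection."""
    selected = {k: v for k, v in profile.items() if v != "all"}
    return [p for p in patients
            if all(p.factors.get(k) == v for k, v in selected.items())]


def kaplan_meier(patients: List[Patient]) -> List[Tuple[float, int, float]]:
    """Return (time, number at risk, survival estimate) at each distinct event time."""
    curve = []
    survival = 1.0
    at_risk = len(patients)
    for t in sorted({p.time for p in patients}):
        at_t = [p for p in patients if p.time == t]
        events = sum(p.event for p in at_t)
        if events:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1 - events / at_risk
            curve.append((t, at_risk, survival))
        # Patients with an event or censored at t leave the risk set afterwards.
        at_risk -= len(at_t)
    return curve


def median_survival(curve: List[Tuple[float, int, float]]) -> Optional[float]:
    """First time at which the estimate falls to 0.5 or below; None if never reached."""
    for t, _, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy cohort, invented for illustration only.
cohort = [
    Patient({"nodal_status": "negative", "tumour_size": "small"}, 1.0, True),
    Patient({"nodal_status": "negative", "tumour_size": "large"}, 2.0, False),
    Patient({"nodal_status": "negative", "tumour_size": "small"}, 3.0, True),
    Patient({"nodal_status": "negative", "tumour_size": "large"}, 4.0, True),
    Patient({"nodal_status": "negative", "tumour_size": "small"}, 5.0, False),
    Patient({"nodal_status": "positive", "tumour_size": "large"}, 0.5, True),
]
profile = {"nodal_status": "negative", "tumour_size": "all"}
matched = match(cohort, profile)
curve = kaplan_meier(matched)
```

    Leaving a factor at “all” simply drops it from the filter, so the same function serves any combination of selected factors.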

    The database

    This case-match survival estimation system could be applied to any clinical database with time to event information. At our website it is applied to the FinProg breast cancer series,2 which includes individual clinical data on women diagnosed with breast cancer in 1991-2 in five regions of Finland (representing about 50% of the Finnish population). The minimum prognostic factor information was available for 2842 patients (91% of all cases in the five regions and 51% of all breast cancers diagnosed in Finland in 1991-2). After we excluded some cases (in situ carcinoma, bilateral tumour, etc), 2032 patients were left in the final data set. All personal identification information was deleted before the database was linked to the website. Distant disease free survival is calculated from the date of diagnosis to the occurrence of metastases outside the locoregional area, and survival is calculated from the date of diagnosis to death from breast cancer. Patients who died of other causes or were lost to follow up are censored.
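
    The survival and censoring rules in the paragraph above can be expressed as a small function that turns a patient's dates into a follow up time and an event indicator. The sketch below is an assumption about how such a rule might be coded; the function and parameter names are hypothetical, the dates are invented, and the choice to censor patients who died of other causes at their date of death is one common convention rather than something the text specifies.

```python
from datetime import date
from typing import Optional, Tuple


def breast_cancer_survival(
    diagnosis: date,
    last_follow_up: date,
    death_date: Optional[date] = None,
    died_of_breast_cancer: bool = False,
) -> Tuple[float, bool]:
    """Return (follow up time in years, event indicator).

    The event is death from breast cancer. Death from another cause and
    loss to follow up both yield a censored observation (event False).
    """
    # Follow up ends at death if the patient died, otherwise at last contact.
    end = death_date if death_date is not None else last_follow_up
    years = (end - diagnosis).days / 365.25
    event = death_date is not None and died_of_breast_cancer
    return years, event


# Invented example dates, for illustration only.
died_of_cancer = breast_cancer_survival(
    date(1991, 3, 1), date(2001, 3, 1),
    death_date=date(1996, 3, 1), died_of_breast_cancer=True)
died_other_cause = breast_cancer_survival(
    date(1991, 3, 1), date(2001, 3, 1),
    death_date=date(1996, 3, 1), died_of_breast_cancer=False)
lost_to_follow_up = breast_cancer_survival(
    date(1991, 3, 1), date(1994, 3, 1))
```

    Distant disease free survival would follow the same pattern, with the date of the first distant metastasis taking the place of the date of death from breast cancer.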

    The prognostic factors available on the website are those recommended by the National Institutes of Health Consensus Development Panel3 and the International Consensus Panel on the Treatment of Primary Breast Cancer.4

    The advantages

    The advantage of this system is that clinicians and researchers can obtain survival estimates based on actual data, rather than inferential estimates generated by a regression formula. The output is a survival curve for the entire available follow up period and not just for a single time point. However, the robustness and accuracy of the Kaplan-Meier estimates depend on the quality of the underlying data set.

    Users are not restricted to a single model but can enter any prognostic factor data they have available. However, the more variables selected, the fewer patients will match the selected categories and the more uncertain the survival estimates will be. Users should therefore study the variable selection guide on the website and select the most important variables first, keeping an eye on the number of matching patients and the confidence intervals.
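
    The link between the number of matching patients and the uncertainty of the estimate can be illustrated numerically. The sketch below assumes Greenwood's formula with a plain normal approximation for the 95% confidence interval; the text does not say which interval method the website uses, so this is an illustrative choice. The function name and the toy observations are invented.

```python
import math
from typing import List, Tuple


def km_with_greenwood(
    observations: List[Tuple[float, bool]], horizon: float
) -> Tuple[float, float, float]:
    """Kaplan-Meier estimate at `horizon` with an approximate 95% interval.

    `observations` holds (time, event) pairs; event False means censored.
    The interval uses Greenwood's variance and is clipped to [0, 1].
    """
    survival = 1.0
    variance_sum = 0.0
    at_risk = len(observations)
    for t in sorted({time for time, _ in observations}):
        if t > horizon:
            break
        here = [event for time, event in observations if time == t]
        events = sum(here)
        if events:
            survival *= 1 - events / at_risk
            if at_risk > events:  # the term is undefined once everyone has failed
                variance_sum += events / (at_risk * (at_risk - events))
        at_risk -= len(here)
    se = survival * math.sqrt(variance_sum)
    return survival, max(0.0, survival - 1.96 * se), min(1.0, survival + 1.96 * se)


# Toy data: the same pattern of outcomes in 5 and in 50 matching patients.
few = [(1.0, True), (2.0, False), (3.0, True), (4.0, True), (5.0, False)]
many = few * 10

s_few, lo_few, hi_few = km_with_greenwood(few, horizon=3.0)
s_many, lo_many, hi_many = km_with_greenwood(many, horizon=3.0)
```

    Both groups give the same point estimate at three years, but the interval for the five patient group is far wider, which is why the number of matching patients deserves attention as more factors are selected.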

    Our system also responds to a demand for openness, giving other researchers the possibility not only to check published results but also to explore the database further. It offers a compromise between demands to make raw data available and the reluctance of many researchers to release their valuable datasets entirely. This interactive, web based system may facilitate explorative analyses of prognostic factors and could be applied to a variety of diseases.


