# Estimating treatment effects for individual patients based on the results of randomised clinical trials

BMJ 2011; 343 doi: http://dx.doi.org/10.1136/bmj.d5888 (Published 03 October 2011) Cite this as: BMJ 2011;343:d5888

- Johannes A N Dorresteijn, epidemiologist and medical doctor^{1}
- Frank L J Visseren, professor of vascular medicine, epidemiologist, and internist^{1}
- Paul M Ridker, Eugene Braunwald professor of medicine, epidemiologist, and cardiologist^{2}
- Annemarie M J Wassink, internist and postdoctoral researcher^{1}
- Nina P Paynter, assistant professor of epidemiology^{2}
- Ewout W Steyerberg, professor of medical decision making, and methodologist^{3}
- Yolanda van der Graaf, professor of epidemiology and imaging^{4}
- Nancy R Cook, associate professor of biostatistics and epidemiology^{2}

^{1}Department of Vascular Medicine, University Medical Center Utrecht, PO Box 85500, 3508 GA Utrecht, Netherlands

^{2}Division of Preventive Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA

^{3}Department of Public Health, Erasmus Medical Center, Rotterdam, Netherlands

^{4}Julius Center for Health Sciences and Primary Care, Utrecht, Netherlands

- Correspondence to: F L J Visseren F.L.J.Visseren{at}umcutrecht.nl

- Accepted 12 August 2011

## Abstract

**Objectives** To predict treatment effects for individual patients based on data from randomised trials, taking rosuvastatin treatment in the primary prevention of cardiovascular disease as an example, and to evaluate the net benefit of making treatment decisions for individual patients based on a predicted absolute treatment effect.

**Setting** As an example, data were used from the Justification for the Use of Statins in Prevention (JUPITER) trial, a randomised controlled trial evaluating the effect of rosuvastatin 20 mg daily versus placebo on the occurrence of cardiovascular events (myocardial infarction, stroke, arterial revascularisation, admission to hospital for unstable angina, or death from cardiovascular causes).

**Population** 17 802 healthy men and women who had low density lipoprotein cholesterol levels of less than 3.4 mmol/L and high sensitivity C reactive protein levels of 2.0 mg/L or more.

**Methods** Data from the Justification for the Use of Statins in Prevention trial were used to predict rosuvastatin treatment effect for individual patients based on existing risk scores (Framingham and Reynolds) and on a newly developed prediction model. We compared the net benefit of prediction based rosuvastatin treatment (selective treatment of patients whose predicted treatment effect exceeds a decision threshold) with the net benefit of treating either everyone or no one.

**Results** The median predicted 10 year absolute risk reduction for cardiovascular events was 4.4% (interquartile range 2.6-7.0%) based on the Framingham risk score, 4.2% (2.5-7.1%) based on the Reynolds score, and 3.9% (2.5-6.1%) based on the newly developed model (optimal fit model). Prediction based treatment was associated with more net benefit than treating everyone or no one, provided that the decision threshold was between 2% and 7%, and thus that the number willing to treat (NWT) to prevent one cardiovascular event over 10 years was between 15 and 50.

**Conclusions** Data from randomised trials can be used to predict treatment effect in terms of absolute risk reduction for individual patients, based on a newly developed model or, if available, existing risk scores. The value of such prediction of treatment effect for medical decision making is conditional on the NWT to prevent one outcome event.

**Trial registration number** Clinicaltrials.gov NCT00239681.

## Introduction

Usually the results of trials are implemented in clinical practice by either treating all patients (in the case of a positive trial result) or treating no one (in the case of a negative trial result), expecting the treatment effect for every patient to be similar to the average treatment effect in the original trial. Clinicians intuitively know that this idea is oversimplified because in reality some patients benefit more than average from treatment, whereas others do not or may even be harmed.1 2 3 4 5 6

The direct translation of trial results to individual patients in clinical practice is, however, complicated by some important limitations. The treatment effects of randomised trials are typically expressed in terms of relative risks or hazard ratios at a group level—that is, treatment versus control. Yet treatment that is associated with a considerable reduction in relative risk will still result in a modest absolute effect when the incidence rate of the disease is low. Absolute risk reduction is usually more informative because it combines the relative risk reduction and the incidence rate of the disease outcome.1 2 The absolute risk reduction is sometimes expressed in trial reports as the number needed to treat (NNT). Still, implicit in the use of estimates at group level is that all patients are at average risk and all have the same likelihood of response to treatment. Usually at least one of these two assumptions is untrue because the expected absolute risk reduction resulting from treatment often depends on the characteristics of individual patients.1 2 3 4 5 6

Although prespecified subgroup analyses take a step towards identifying those characteristics of patients that modify the treatment effect, some important limitations are retained. In subgroup analyses the study cohort is typically divided according to the presence or absence of a single patient characteristic such as diabetes, age (below or above a certain limit), or sex, and the effect of the intervention is presented accordingly. However, these univariable analyses do not fully incorporate all available patient characteristics and are less well powered but still return relative, rather than absolute, average effect measures at a group level.2 3 4

A more comprehensive approach towards making well informed decisions about treatment is to predict the treatment effect for individual patients based on all relevant characteristics together.1 2 3 4 5 7 Although not yet widely appreciated, data from randomised controlled trials usually provide an opportunity to develop models for the prediction of a treatment effect on the basis of individual patient characteristics.1 5 Such models can enable clinicians to estimate a treatment effect for individual patients in terms of absolute risk reduction for the disease of interest. This can be done before the start of intended treatment, and therefore decisions about treatment can be based on such predictions.4 Moreover, individualised predictions of treatment effect provide an opportunity to determine which implications the results of randomised trials should have in clinical practice.6 Making treatment decisions on the basis of a predicted treatment effect for individual patients may in some situations result in more net benefit on a group level than treating all patients (in the case of a positive trial result) or treating no one (in the case of a negative trial result). Although this approach is occasionally used in the research of cancer8 9 10 and cardiovascular disease,11 12 13 the full potential has yet to be recognised by both researchers and clinicians.

We developed and evaluated methods for predicting treatment effect using rosuvastatin in individual patients in a primary prevention setting based on data from the Justification for the Use of Statins in Prevention trial.14 This study was a randomised, double blind, placebo controlled, multicentre trial that showed on average a 44% relative risk reduction in major vascular events in those treated with rosuvastatin. As the trial was carried out in a primary prevention cohort at moderate absolute risk for cardiovascular disease, the overall treatment effect was modest for average absolute risk reduction. Therefore the trial represents a typical situation in which the prediction of treatment effect can be used to identify those who will benefit from treatment. We predicted treatment effects for individual patients based on data from randomised trials, taking rosuvastatin treatment in primary prevention of cardiovascular disease as an example, and evaluated the net benefit of making treatment decisions for individual patients based on predicted absolute treatment effect.

## Methods

The design, rationale, and outcomes of the Justification for the Use of Statins in Prevention trial are described in detail elsewhere.14 15 16 Briefly, the trial evaluated the effect of rosuvastatin 20 mg daily compared with placebo on the occurrence of myocardial infarction, stroke, arterial revascularisation, admission to hospital for unstable angina, or death from cardiovascular causes among 17 802 apparently healthy men and women who had low density lipoprotein cholesterol levels of less than 3.4 mmol/L (130 mg/dL) and high sensitivity C reactive protein levels of 2.0 mg/L or more. After a median follow-up of 1.9 years the hazard ratio for occurrence of the primary end point was 0.56 (95% confidence interval 0.46 to 0.69), favouring rosuvastatin.14 Univariable subgroup analyses (for example, for age, sex, smoking status, ethnicity, and Framingham risk score) showed no significant deviations from this effect size.

### Estimating treatment effects for individual patients

We estimated the baseline 10 year risk for cardiovascular events (myocardial infarction, stroke, arterial revascularisation, admission to hospital for unstable angina, or death from cardiovascular causes) for individual patients if untreated, using the existing Framingham risk score17 and the Reynolds risk score, without prior updating or refitting of the coefficients (see box).18 19 Based on the assumption that treatment effect increases linearly with baseline risk (fig 1), we estimated the patient’s residual risk when given treatment by multiplying baseline risk by the overall relative effect measure (relative risk or hazard ratio) from the original trial report. Consequently the estimated absolute risk reduction achieved by treatment with rosuvastatin for 10 years (10 year treatment effect) is equal to the difference between baseline risk and residual risk: individual treatment effect = (1 − overall relative effect measure from trial) × baseline risk derived from an existing prediction model.
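The multiplication described in this paragraph can be sketched in a few lines of Python. This is an illustration only, not the code used for the analyses: the function name `predicted_treatment_effect` and the example baseline risk of 10% are assumptions, whereas the default relative effect of 0.56 is the overall hazard ratio reported for the trial.

```python
def predicted_treatment_effect(baseline_risk_pct, relative_effect=0.56):
    """Return (residual risk, absolute risk reduction), both in percentage points.

    baseline_risk_pct: 10 year risk without treatment, taken from an existing
        risk score such as Framingham or Reynolds.
    relative_effect: overall hazard ratio (or relative risk) from the trial.
    """
    residual_risk_pct = relative_effect * baseline_risk_pct
    # Equivalent to (1 - relative_effect) * baseline_risk_pct.
    arr_pct = baseline_risk_pct - residual_risk_pct
    return residual_risk_pct, arr_pct


# Hypothetical patient with a 10% baseline 10 year risk (assumed value).
residual, arr = predicted_treatment_effect(10.0)
print(f"residual risk {residual:.1f}%, absolute risk reduction {arr:.1f}%")
```

Because the relative effect is held constant, the predicted absolute risk reduction scales in direct proportion to baseline risk, which is the linearity assumption referred to above.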

#### Prediction of treatment effect for individual patients

##### 10 year treatment effect (absolute risk reduction)=baseline risk without treatment−residual risk with rosuvastatin treatment

###### Baseline 10 year absolute risk for cardiovascular events (%) without treatment

Framingham risk score based method: risk as calculated from the Framingham risk score, published in the Adult Treatment Panel III guidelines17

Reynolds risk score based method: risk as calculated from the Reynolds risk score, derived for women from the Women’s Health Study18 and for men from the Physicians Health Study19

Optimal fit model*:

$$\left(1 - 0.985433^{\,5 \times \exp(B)}\right) \times 100\%$$

where B = 0.09379363 × age in years + 3.34656382 (if male) − 0.03698750 × age in years (if male) + 0.81823698 (if current smoker) + 0.54045383 (if using blood pressure lowering drugs) + 0.60281674 (if family history of premature coronary heart disease) − 6.9932

###### Residual 10 year absolute risk for cardiovascular events (%) with rosuvastatin treatment

Framingham risk score: 0.56 × baseline 10 year absolute risk for cardiovascular events (%) without treatment

Reynolds risk score: 0.56 × baseline 10 year absolute risk for cardiovascular events (%) without treatment

Optimal fit model*:

$$\left(1 - 0.985433^{\,5 \times \exp(B)}\right) \times 100\%$$

where B = 0.09379363 × age in years + 3.34656382 (if male) − 0.03698750 × age in years (if male) + 0.81823698 (if current smoker) + 0.54045383 (if using blood pressure lowering drugs) + 0.00932154 (if family history of premature coronary heart disease) − 7.484613

*In the optimal fit model for residual risk, the treatment effect of rosuvastatin is expressed as a different coefficient for family history of premature coronary heart disease and a different constant term
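As a hedged illustration of how the optimal fit model in the box might be evaluated, the sketch below codes both versions of B and takes their difference. The coefficients and the value 0.985433 are copied from the box; the function names, the 0/1 coding of the indicator variables, and the example patient are assumptions made for this sketch.

```python
import math

BASE_SURVIVAL = 0.985433  # value taken from the box
EXPONENT_FACTOR = 5       # multiplier of exp(B) taken from the box


def linear_predictor(age, male, smoker, bp_drugs, family_history, treated):
    """Compute B from the box; indicator arguments are coded 0 or 1 (assumed)."""
    # Treatment changes only the family history coefficient and the constant.
    family_coef = 0.00932154 if treated else 0.60281674
    constant = 7.484613 if treated else 6.9932
    return (0.09379363 * age
            + (3.34656382 - 0.03698750 * age) * male
            + 0.81823698 * smoker
            + 0.54045383 * bp_drugs
            + family_coef * family_history
            - constant)


def ten_year_risk_pct(age, male, smoker, bp_drugs, family_history, treated):
    """10 year absolute risk (%) according to the optimal fit model."""
    b = linear_predictor(age, male, smoker, bp_drugs, family_history, treated)
    return (1 - BASE_SURVIVAL ** (EXPONENT_FACTOR * math.exp(b))) * 100


# Hypothetical patient: 65 year old man, non-smoker, no blood pressure
# lowering drugs, no family history (all values assumed for illustration).
patient = dict(age=65, male=1, smoker=0, bp_drugs=0, family_history=0)
baseline = ten_year_risk_pct(**patient, treated=False)
residual = ten_year_risk_pct(**patient, treated=True)
print(f"baseline {baseline:.1f}%, residual {residual:.1f}%, "
      f"treatment effect {baseline - residual:.1f}%")
```

Unlike the risk score based methods, the relative effect here is not a single multiplier: it differs between patients with and without a family history of premature coronary heart disease.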

Alternatively we developed a new prediction model (optimal fit model) based on trial data only (see web extra appendix 1). A theoretical advantage of this strategy over using existing risk scores is that the model may be better calibrated to the population of interest. Furthermore, such a model is not based on the assumption that treatment effect increases linearly with baseline risk: modification of the treatment effect by patient characteristics can be tested and, if significant, included in the model. Importantly, even in the absence of subgroup effects defined by univariable characteristics, a multivariable adjusted prediction model may contain such modifications of treatment effect. In situations where no existing prediction models are available, developing a new prediction model may be the only option.

### Performance of the prediction models

We assessed the calibration of the predictions based on the Framingham risk score, the Reynolds risk score, and the optimal fit model. To do this we plotted the observed Kaplan-Meier survival for cardiovascular events at two years within 10ths of the predicted survival against the mean predicted two year survival of each 10th, and we calculated the P value from the Hosmer-Lemeshow test. We derived two year risk estimates for the Framingham risk score and the Reynolds risk score from the 10 year predicted risks, based on the assumption that the hazard rate is constant and thus survival is exponential over time. Discrimination was assessed by calculation of the C statistic.
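Under a constant hazard, survival at time t is exp(−λt), so survival at two years equals the 10 year survival raised to the power 2/10. A minimal sketch of that conversion follows; the function name `rescale_risk` and the example risk of 10% are assumptions, not part of the original analysis.

```python
def rescale_risk(risk, from_years=10.0, to_years=2.0):
    """Convert a cumulative risk (as a proportion) between time horizons.

    Assumes a constant hazard rate, so that survival is exponential in time:
    S(to_years) = S(from_years) ** (to_years / from_years).
    """
    if not 0.0 <= risk < 1.0:
        raise ValueError("risk must be a proportion in [0, 1)")
    survival = 1.0 - risk
    return 1.0 - survival ** (to_years / from_years)


# Hypothetical 10 year predicted risk of 10% (assumed value).
two_year = rescale_risk(0.10)
print(f"two year risk: {two_year:.2%}")
```

The two year risk is slightly more than one fifth of the 10 year risk because cumulative risk grows less than linearly when survival declines exponentially.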

### Assessment of net benefit

We determined the value of individualised predictions of treatment effect for medical decision making using the previously described net benefit assessment method.5 This method calculates the impact of different treatment strategies using the event rates and the treatment rates in study participants. We considered the following approaches to rosuvastatin treatment of patients without previous vascular disease or diabetes mellitus and with low levels of low density lipoprotein cholesterol: treat all, treat no one, or treat based on prediction (that is, the selective treatment of patients whose predicted treatment effect exceeds a decision threshold). To facilitate clinical interpretation we extrapolated the observed event rates at two years to 10 years (see web extra appendix 2 for an explanation of the net benefit assessment method, with a sample calculation).

The decision threshold used for prediction based treatment represents the estimated harms of treatment, such as excess risk for adverse reactions, monetary costs, and the discomfort of sustaining treatment (fig 1). Notably, estimation of the harms of treatment is also needed to calculate and interpret the net benefit of one treatment strategy over another. One research team proposed to estimate the decision threshold by weighing the harms of treatment against the harms of an outcome event.5 20 For example, if the harms of a cardiovascular event are assumed to be 20 times worse than those of rosuvastatin treatment for 10 years, the appropriate decision threshold is 5% (1 divided by 20), and only those individuals whose predicted 10 year absolute treatment effect exceeds 5% should be advised to start rosuvastatin treatment. Usually, however, the level of the decision threshold is not discussed, but rather the maximum acceptable number needed to treat (NNT). For this purpose we propose to rename the NNT that is associated with clinical equipoise as the number willing to treat (NWT). If rosuvastatin treatment of 20 people for 10 years is assumed to be exactly as harmful as one outcome event (for example, a case of myocardial infarction), doctors would be willing to treat up to 20 patients to prevent one event, therefore the NWT is 20. The NWT is the inverse of the decision threshold but generally more intuitive to clinicians.
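The relation between the NWT and the decision threshold, and the resulting treatment rule, can be written out as a short sketch. The function names and the two predicted treatment effects in the usage lines are illustrative assumptions.

```python
def decision_threshold_pct(nwt):
    """Decision threshold (% absolute risk reduction) as the inverse of the NWT."""
    return 100.0 / nwt


def advise_treatment(predicted_arr_pct, nwt):
    """Advise treatment only if the predicted effect exceeds the threshold."""
    return predicted_arr_pct > decision_threshold_pct(nwt)


print(decision_threshold_pct(20))  # 5.0, as in the example in the text
# Hypothetical predicted 10 year treatment effects of 6.1% and 3.9%.
print(advise_treatment(6.1, nwt=20))  # True
print(advise_treatment(3.9, nwt=20))  # False
```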

The main harms resulting from rosuvastatin treatment include monetary costs and the discomfort of taking the drug daily, since multiple trials, including the Justification for the Use of Statins in Prevention trial, show that treatment with rosuvastatin 20 mg daily is not associated with an increased risk of adverse reactions, except for a small increase in the probability of newly diagnosed diabetes. Moreover, particularly among those with impaired fasting glucose (the group most likely to develop diabetes), large risk reductions in macrovascular disease are observed. None the less, the appropriate decision threshold is subjective and may differ between countries and over time. For this reason we did not make any assumptions about the severity of the harms associated with treatment but calculated the net benefit for a range of values of NWT. To graphically represent the net benefit assessment results for this range of values of NWT in a decision curve, we applied locally weighted scatter plot smoothing.5 20

Analyses were done using open source statistical software, R version 2.10.0 (R Foundation for Statistical Computing, www.R-project.org).

## Results

Table 1 shows the baseline clinical characteristics of the Justification for the Use of Statins in Prevention cohort. Overall, 140 events were observed in the rosuvastatin treated group (8853 participants) and 251 in the placebo treated group (8857 participants). Data related to 92 participants were excluded from the analyses owing to missing data for one or more predictor variables. The Kaplan-Meier survival curves of both treatment groups have been published previously and did not show any remarkable aberrations at the two year follow-up.14

The box shows the models used for the prediction of the treatment effect of rosuvastatin. The final optimal fit model contains terms for age, sex, age-sex interaction, smoking, blood pressure lowering drugs, and family history of premature coronary heart disease. Importantly, the study population was selected to have low density lipoprotein cholesterol levels of less than 3.4 mmol/L (130 mg/dL) and high sensitivity C reactive protein levels of 2.0 mg/L or more. This might have contributed to the fact that neither lipids nor high sensitivity C reactive protein were selected in the final optimal fit model. The model contained one treatment-covariate interaction: a family history of premature coronary heart disease was the only patient characteristic that affected the rosuvastatin treatment effect.

Calibration and discrimination of all three prediction methods were moderate. The C statistic of the Framingham model based predictions in the Justification for the Use of Statins in Prevention cohort was 0.65 (95% confidence interval 0.62 to 0.68), almost equal to the C statistic of the predictions based on the Reynolds model (0.66, 0.63 to 0.69). As expected, the optimal fit model performed a little better (C statistic 0.71, 0.68 to 0.74) because discrimination was tested in the same cohort from which the model was developed. The Reynolds risk score somewhat overestimated risk for cardiovascular events within the highest 10th of predicted risk, resulting in some lack of fit as evidenced by a significant Hosmer-Lemeshow statistic (fig 2).

The Framingham risk score, the Reynolds risk score, and the optimal fit model can be applied to calculate the predicted 10 year treatment effect of using rosuvastatin for two patient scenarios (table 2). Likewise, the 10 year treatment effect was predicted for every individual within the Justification for the Use of Statins in Prevention cohort; figure 3 presents the distributions of these predictions. Coloured bars indicate how the predicted 10 year treatment effect in each of the two patient scenarios relates to that of all participants within the study cohort. The median predicted 10 year absolute risk reduction for all participants in the Justification for the Use of Statins in Prevention trial was 4.4% (interquartile range 2.6-7.0%) according to the Framingham based model, 4.2% (2.5-7.1%) according to the Reynolds based model, and 3.9% (2.5-6.1%) according to the optimal fit model.

### Net benefit assessment

Web extra appendix 2 shows an example of a net benefit calculation. In this example the net benefit of prediction based treatment using the Framingham risk score is compared with the net benefit of treating all patients, assuming that 20 is the appropriate NWT. Similar calculations were carried out for a range of values for NWT and also for prediction based treatment using the Reynolds risk score and optimal fit model. The net benefit of treating no one serves as a reference and is equal to zero. The net benefit of the other strategies represents the resulting decrease in the event rate minus the cost of treatment.

Treatment of all patients is more beneficial than treating no one if the NWT is high (little harm, treat even at low risk) but not if the NWT is low (considerable harm, treat at high risk only; table 3 and fig 4). If the NWT is about 20, then the benefits of treating all patients and treating no one are equivalent (zero). Prediction based treatment is associated with the same net benefit as treating all patients for high values of NWT, and the net benefit curves of prediction based treatment converge to zero (treat no one) for lower values of NWT (fig 4). For a range of NWT (between about 15 and 50), prediction based treatment is the preferred strategy. Notably, the net benefits of prediction based treatment based on the optimal fit model and the Framingham or Reynolds risk score were similar. Therefore the assumption that treatment effect increases linearly with baseline cardiovascular risk (fig 1) appears to hold in this example.

Interpreting the size of the net benefit advantage of one strategy over another is complex. One study proposed to imagine that the same net benefit value was achieved by an infallible prediction model that identifies a certain percentage of people as being not at risk for the outcome and thus not in need of treatment. Such a fictitious infallible prediction model reduces the treatment rate without increasing the event rate.5 If this method is applied to the present data this means that for a NWT of 30, the net benefit advantage of prediction based treatment (mean net benefit over all three methods is 0.0228) over treating all patients (net benefit is 0.0165), is equivalent to that of treatment by a fictitious infallible prediction model that reduces the treatment rate by 19% without increasing the event rate. Likewise, if the NWT is 20, the mean advantage of prediction based treatment over treating all patients is equal to a 16% reduction of the treatment rate.
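The arithmetic behind this interpretation can be sketched as follows. The sketch assumes that net benefit equals the decrease in event rate relative to treating no one, minus the treatment rate divided by the NWT, which is consistent with the description in the text but is not quoted from web extra appendix 2. The function names and the event rates in the first usage example are hypothetical; the net benefit values 0.0228 and 0.0165 are those reported above.

```python
def net_benefit(event_rate_untreated, event_rate_strategy, treatment_rate, nwt):
    """Decrease in event rate minus the cost of treatment (assumed form).

    All rates are proportions; each treated patient is assumed to cost
    1/NWT of an outcome event.
    """
    return (event_rate_untreated - event_rate_strategy) - treatment_rate / nwt


def equivalent_treatment_rate_reduction(nb_strategy, nb_reference, nwt):
    """Treatment rate reduction by an infallible model giving the same gain.

    Sparing a proportion r of patients from treatment without extra events
    raises net benefit by r / NWT, so r = (difference in net benefit) * NWT.
    """
    return (nb_strategy - nb_reference) * nwt


# Treating no one is the reference strategy and has a net benefit of zero.
print(net_benefit(0.10, 0.10, treatment_rate=0.0, nwt=30))

# Hypothetical event rates for treating everyone (values assumed).
print(net_benefit(0.10, 0.056, treatment_rate=1.0, nwt=30))

# Net benefit values reported in the text for an NWT of 30.
reduction = equivalent_treatment_rate_reduction(0.0228, 0.0165, nwt=30)
print(f"{reduction:.0%}")  # 19%, matching the figure quoted in the text
```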

### Translation to clinical practice

Figure 5 illustrates how the findings could be translated to clinical practice. Treatment of all patients is the strategy of choice if the 10 year NWT is 50 or more. Treating no one is preferable if the 10 year NWT is 15 or less. If the NWT is between 15 and 50, prediction based treatment results in the most net benefit. Because the three prediction methods resulted in similar net benefit, treatment prediction based on existing risk scores is most appropriate in clinical practice. These risk scores are already externally validated and more easily implemented. This means that if, for example, the 10 year NWT to prevent one cardiovascular event is 20, patients with a baseline risk (for example, from the Framingham score or Reynolds score) of 11.4% or more (95% confidence interval 9.3% to 16.1%) benefit from treatment. Likewise, if the 10 year NWT is 30, patients with a baseline risk of 7.6% or more (6.2% to 10.8%) benefit from treatment. These findings do not contradict the current guidelines, which also recommend treating those whose risk for cardiovascular events exceeds a certain threshold.21 However, our findings do suggest that the optimal treatment threshold may be lower than is often assumed.
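The baseline risk thresholds quoted in this paragraph appear to follow from dividing the decision threshold (1/NWT) by the relative risk reduction (1 − hazard ratio), using the trial hazard ratio of 0.56 and its 95% confidence limits of 0.46 and 0.69. That reading is inferred from the numbers rather than stated as a formula in the text, and the function name below is an assumption.

```python
def baseline_risk_threshold_pct(nwt, hazard_ratio):
    """Baseline 10 year risk (%) above which predicted benefit exceeds 1/NWT.

    Solves (1 - hazard_ratio) * baseline_risk = 100 / nwt for baseline_risk.
    """
    decision_threshold_pct = 100.0 / nwt
    return decision_threshold_pct / (1.0 - hazard_ratio)


for nwt in (20, 30):
    point = baseline_risk_threshold_pct(nwt, 0.56)
    lower = baseline_risk_threshold_pct(nwt, 0.46)  # stronger effect, lower bar
    upper = baseline_risk_threshold_pct(nwt, 0.69)  # weaker effect, higher bar
    print(f"NWT {nwt}: {point:.1f}% ({lower:.1f}% to {upper:.1f}%)")
```

Run as written, the loop reproduces 11.4% (9.3% to 16.1%) for an NWT of 20 and 7.6% (6.2% to 10.8%) for an NWT of 30.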

## Discussion

The direct translation of results of trials to individual patients in clinical practice is often difficult because not all patients respond to treatment in the same way as the average patient enrolled in a trial. This is because the effect of treatment often depends on the characteristics of individual patients. In the present study we have shown how data from randomised clinical trials can be used to predict absolute treatment effects for individual patients, taking patient characteristics into account. In addition, we have assessed the added value of such predictions for medical decision making.

Implementation of an individualised prediction of treatment effect in clinical practice is not necessarily complicated. Several prediction rules are already available for estimating baseline risk for vascular events in primary prevention—for example, the Framingham risk score and Reynolds risk score. The example from the Justification for the Use of Statins in Prevention trial shows that estimation of an individual treatment effect can be as easy as multiplying the individual baseline risk, as estimated from the Framingham risk score or the Reynolds risk score, by the average relative treatment effect from the trial report. If, however, risk scores are not yet available in a certain area of medicine, a new prediction model to estimate individual treatment effect can be developed from the trial data. The methods described in this paper can thus be applied to various medical specialties. Online calculators and integration of prediction models in electronic patient record systems could facilitate the widespread use of prediction of treatment effect in clinical practice. The trial example used in this article also shows that even when discrimination and calibration of a prediction model are moderate, the net benefit of treatment assignment according to prediction can still be superior to both treating all patients within the study domain and treating no one for a certain range of NWT (in this example between about 15 and 50).

Prediction of treatment effect for individual patients may enable doctors to practise individualised medicine in an evidence based manner. It could help to make better informed treatment decisions and perhaps motivate patients to adhere to treatment. Presentation of the net benefit of all possible strategies of treatment assignment for a spectrum of NWT is useful in this respect because the NWT possibly varies with patient and provider preferences. This is especially true when treatment is associated with important adverse reactions. For example, treatment with tissue plasminogen activator for acute myocardial infarction is associated with an increased risk for intracranial haemorrhage that also varies according to individual patient characteristics.6 13 If patients have difficulties understanding the concept of risk, the predicted individual treatment effect (expressed in terms of absolute risk reduction) can be expressed as a NNT (the number of similar patients that needs to be treated to prevent one outcome event; table 2), which might be more intuitive, and this can be compared to the appropriate NWT.

Prediction of treatment effect for individual patients might also facilitate the work of practice guideline committees that aim to make well informed decisions about indications for treatment on a group level. When the trial results are presented using the methods presented in this paper, the remaining issue that guideline committees need to focus on is the appropriate NWT. The NWT is estimated by weighing the total harms of treatment (for example, adverse reactions, monetary costs, discomfort of sustaining treatment) against the harms of the outcome event of interest (cardiovascular event). For any given NWT three possible treatment strategies must be considered: treat everyone, treat no one, or treat based on prediction (selective treatment of patients whose predicted treatment effect exceeds a decision threshold). When the NWT is agreed on, the trial results can be used to estimate the net benefit of each strategy (table 3 and fig 4) and to determine the optimal treatment strategy (fig 5). The treatment strategy with the highest net benefit for the appropriate value of NWT results in the most favourable trade-off between treatment rate and event rate. Applying this strategy in clinical practice leads to more selective treatment of patients who will benefit from treatment.

Previously, risk stratified reporting of trial results was proposed as a method for presenting heterogeneity of treatment effects in trials.3 7 In line with this, the relative risk and NNT for participants of the Justification for the Use of Statins in Prevention trial within subgroups of estimated baseline risk were published earlier.22 23 24 Stratified analysis of treatment effects in subgroups of the total study cohort may, however, lead to imprecision owing to loss of statistical power. Moreover, existing risk scores are not available for many diseases, invalidating the risk stratified approach. Also, risk based stratification may still obscure important modification of relative treatment effect that can be discovered or excluded (as in the Justification for the Use of Statins in Prevention trial example) by a multivariate model for predicting treatment effect based on trial data. Also, the cut-off values for defining subgroups of estimated baseline risk are usually predefined, whereas the methods shown in this paper allow searching for the treatment threshold that is associated with maximum net benefit.

Although data from clinical trials have been used before to predict treatment effects for individual patients, evidence supporting the added value of individualised prediction of treatment effect for clinical practice has been sparse.8 9 10 11 12 13 Expensive and long lasting impact trials were needed to show the benefit of prediction based treatment.25 In this article we show that the net benefit assessment methods, described previously, provide a more efficient and readily available opportunity for evaluating the potential net benefit of prediction based treatment and for determining implications of contemporary trial results for clinical practice.5 This report also shows that the added value of individualised prediction of treatment effect for medical decision making may not be universal but instead is conditional on the NWT.

### Limitations and challenges of the study

Limitations of using trial data for individualised predictions of treatment effect generally include short and variable follow-up times, whereas meaningful predictions of cardiovascular event risk usually cover a 10 year period. This is particularly true for the Justification for the Use of Statins in Prevention trial because the study was discontinued early, but few clinical trials have a follow-up period as long as 10 years. Thus the predictions and observations usually need to be extrapolated. Furthermore, as with conventional trial reports, generalisability of the results may be problematic. Trial participants are often selected on the basis of strict eligibility criteria and are healthier and more compliant with treatment than are patients in clinical practice.6 In this example, the results apply to patients without manifest vascular disease or diabetes, but additional eligibility criteria of the trial were low levels of low density lipoprotein cholesterol and increased levels of high sensitivity C reactive protein. Hence application of trial based predictions of treatment effect to the general population may be suboptimal. This is especially true for newly fitted models because important risk factors (such as low density lipoprotein cholesterol and high sensitivity C reactive protein in our example) may not be included in the prediction model if all trial participants had similar characteristics.

Apart from these practical constraints, many feel reluctant to interpret the implications of subgroup analyses let alone multivariate prediction of treatment effect, because over-accuracy and chance findings may occur.26 Predictions of treatment effect should therefore be based on existing risk scores developed in external data when possible.2 7 Yet even when validated risk scores are available, as in our example, developing a new prediction model fit to the trial data can help to confirm the assumption that treatment effect increases linearly with baseline risk (fig 1). Moreover, it should be stressed that the estimated treatment effects in prediction models originating from randomised trials are not subject to confounding bias, because treatment was allocated randomly in the study population. Over-fitting can be minimised by careful and preferably prespecified selection of candidate predictors and shrinkage of the model coefficients when needed. Web extra appendix 3 summarises considerations that need to be taken into account when applying the methods described in this paper to other trial datasets.

### Conclusions

Data from randomised trials can be used to predict treatment effect in terms of absolute risk reduction for individual patients before the start of intended treatment. Predictions could be based on existing risk scores, if available, or a newly developed model. The value of such prediction of treatment effect for medical decision making is conditional on the number willing to treat (NWT) to prevent one outcome event. Prediction based treatment may result in positive net benefit for a range of NWT values, even when model calibration and discrimination are moderate. The methods shown in this paper could therefore become a routine part of reporting clinical trials and be used in everyday clinical practice.
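The dependence of net benefit on the NWT can be made concrete with a small sketch. It assumes that the harm of treating one patient is weighted as 1/NWT events and that prediction based treatment means treating patients whose predicted absolute risk reduction is at least 1/NWT. For simplicity it computes an expected net benefit from predicted values only, whereas an evaluation in trial data would rely on observed event rates. The function name `net_benefit` and the four example predictions are hypothetical.

```python
def net_benefit(predicted_arr, nwt, strategy="prediction"):
    """Expected net benefit per patient of a treatment strategy.

    net benefit = events prevented per patient
                  - (proportion treated) * (1 / NWT)

    predicted_arr: predicted absolute risk reductions, one per patient
    nwt: number willing to treat to prevent one outcome event
    strategy: "prediction" (treat if predicted ARR >= 1/NWT),
              "all" (treat everyone) or "none" (treat no one)
    """
    if nwt <= 0:
        raise ValueError("NWT must be positive")
    if not predicted_arr:
        raise ValueError("at least one prediction is required")
    threshold = 1.0 / nwt  # harm of treating one patient, in event units
    if strategy == "all":
        treated = list(predicted_arr)
    elif strategy == "none":
        treated = []
    elif strategy == "prediction":
        treated = [arr for arr in predicted_arr if arr >= threshold]
    else:
        raise ValueError("unknown strategy: " + strategy)
    n = len(predicted_arr)
    events_prevented = sum(treated) / n
    treatment_rate = len(treated) / n
    return events_prevented - treatment_rate * threshold


if __name__ == "__main__":
    # Hypothetical predicted absolute risk reductions for four patients.
    arr = [0.01, 0.02, 0.05, 0.08]
    for strategy in ("none", "all", "prediction"):
        print(strategy, net_benefit(arr, nwt=25, strategy=strategy))
```

Repeating the calculation over a range of NWT values shows where prediction based treatment outperforms treating everyone or no one, which is the comparison that matters for decision making.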

#### What is already known on this topic

In clinical practice some patients benefit more than average from treatment, whereas others do not or may even be harmed

Implementing trial results by treating all or no patients, expecting the treatment effect for everyone to be similar to the average treatment effect in the original trial, may not lead to optimal benefit

#### What this study adds

Data from randomised trials can be used to predict treatment effect in terms of absolute risk reduction for individual patients

Predictions could be based on existing validated risk scores, if available, or a new prediction model fitted to the trial data

The value of such prediction of treatment effect for medical decision making is conditional on the number willing to treat (NWT) to prevent one outcome event

## Notes

**Cite this as:** *BMJ* 2011;343:d5888

## Footnotes

Contributors: JAND designed and carried out the data analyses, interpreted the results, and drafted the manuscript. FLJV conceived the research question, designed the data analyses, interpreted the results, and revised the manuscript for important intellectual content. PMR conceived the research question, collected the data, designed the data analyses, interpreted the results, revised the manuscript for important intellectual content, and is guarantor for the validity of the data and analyses. AMJW and NPP designed the data analyses, interpreted the results, and revised the manuscript for important intellectual content. EWS and YvdG conceived the research question, designed the data analyses, interpreted the results, and revised the manuscript for important intellectual content. NRC conceived the research question, collected the data, designed the data analyses, interpreted the results, and revised the manuscript for important intellectual content.

Funding: The Justification for the Use of Statins in Prevention trial was investigator initiated. The sponsor of the study collected the trial data and monitored the study sites but had no role in the conduct of the analyses or drafting of the report. All statistical analyses were done by the investigators.

Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: PMR is the principal investigator of the investigator initiated Justification for the Use of Statins in Prevention trial, which was funded by AstraZeneca (Wilmington, Delaware). PMR received grant support from Novartis and Roche; consulting fees from Siemens Medical Systems, ISIS, and Vascular Biogenetics; and is listed as a co-inventor on patents held by the Brigham and Women’s Hospital that relate to the use of inflammatory biomarkers in cardiovascular disease that have been licensed to Siemens Medical Systems (Erlangen, Germany) and AstraZeneca. FLJV’s department receives grant support from Merck, the Netherlands Organisation for Health Research and Development, and the Catharijne Foundation Utrecht; and speaker fees from Merck and AstraZeneca. JAND, AMJW, NPP, EWS, YvdG, and NRC have no relationships with industry that might have an interest in the submitted work in the previous three years. All authors have no non-financial interests that may be relevant to the submitted work.

Ethical approval: The protocol for the Justification for the Use of Statins in Prevention trial was approved by the local institutional review boards at each participating centre. All study participants provided written informed consent before taking part.

Data sharing: No additional data available.

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.