Benefits and harms associated with hormone replacement therapy: clinical decision analysis
BMJ 2004; 328 doi: https://doi.org/10.1136/bmj.328.7436.371 (Published 12 February 2004) Cite this as: BMJ 2004;328:371
- Cosetta Minelli, research associate in evidence synthesis1,
- Keith R Abrams, professor of medical statistics1,
- Alex J Sutton, lecturer in medical statistics1,
- Nicola J Cooper, research fellow in health services research1
- 1Centre for Biostatistics and Genetic Epidemiology, Department of Health Sciences, University of Leicester, Leicester LE1 6TP
- Correspondence to: K R Abrams
Objective To evaluate harms and benefits associated with use of combined hormone replacement therapy (HRT) for five years in women with different baseline risks for breast cancer.
Design Probabilistic clinical decision analysis.
Setting Hypothetical population of white UK women aged 50 years with different baseline risks for breast cancer.
Main outcome measure Gain or loss in quality adjusted life years (QALYs).
Results Women free of menopausal symptoms showed a net harm from HRT use, which increased with increasing baseline risk of breast cancer. Those with a baseline risk of 1.2% would expect a loss of 0.4 months in perfect health (- 0.03 QALYs, 95% credibility interval - 0.05 to - 0.01). The main analysis showed HRT to be on average beneficial in women with symptoms, with benefit decreasing with increasing baseline risk of breast cancer. The results were sensitive to the assumed value of quality of life with menopausal symptoms; a contour plot was therefore developed to show the probability of net harm for a range of different values and baseline risks.
Conclusions HRT for primary prevention of chronic diseases in women without menopausal symptoms is unjustified. Perceived quality of life in women with symptoms should be taken into account when deciding on HRT. Thus, a decision analysis tailored to an individual woman is more appropriate in clinical practice than a population based approach.
Although hormone replacement therapy (HRT) is mainly used to relieve menopausal symptoms, it has shown benefits in several chronic diseases such as osteoporosis, colorectal cancer, and depression; it may also have a protective role in dementia and cognitive decline in postmenopausal women.1–7 HRT is also associated with an increased risk of breast cancer, stroke, and venous thromboembolism.2 3 8 Ovarian and endometrial cancers have been associated with unopposed oestrogens and unopposed or sequential regimens, respectively, and continuous treatment with combined oestrogen and progestogen seems to protect against endometrial cancer.9–14 In contrast to observational studies, recent randomised controlled trials showed an increased risk of coronary heart disease associated with HRT.2 13 15 16 A modest increase in the risk of gallstones and possibly gallbladder cancer has been reported, but there is still a paucity of evidence.17–19
The prevention of chronic diseases, particularly osteoporosis, has been a strong consideration in the prescription of HRT, but potential risks need to be reviewed.13 14 16 20 Several decision analyses have assessed the risks and benefits of HRT, yet the only one that took into account recent evidence from randomised controlled trials, particularly the women's health initiative trial (16 600 women), was qualitative.13 21–28 The women's health initiative trial addressed this issue by defining a “global index,” summarising the balance of risks and benefits of HRT. It could not recommend HRT for primary prevention of chronic diseases; in fact the trial was halted on the basis of an interim analysis. In balancing harms and benefits, the researchers did not consider menopausal symptoms, which could have been achieved by using quality of life (QoL). Therefore, although these results are important, they are not sufficient evidence on which to develop a strategy for HRT use. Given the potential harms of HRT, future large trials might be difficult to plan: based primarily on the women's health initiative trial, a large UK trial (22 000 women by the end of 2016) was halted.29
We performed a decision analysis on the benefits and harms of HRT based on the best currently available evidence. We considered combined HRT in women free of menopausal symptoms (when HRT might be given for primary prevention of chronic diseases) and in women with symptoms.
Benefits associated with combined HRT were relief of menopausal symptoms, prevention of hip fractures, and decreased risk of colorectal and endometrial cancers. Harms were an increased risk of breast cancer, coronary heart disease, pulmonary embolism, and stroke (fig 1).
Quality adjusted life years (QALYs) were used to study the impact of each outcome. These are standard measures, which evaluate deterioration in length of life along with changes in quality of years of life left when comparing different health states.30 QALYs are calculated by multiplying the QoL weight of a particular disease by time with the disease. We used a utility based QoL weight, a number between 0 and 1 (0 for death, 1 for perfect health), reflecting the “desirability” of that condition for each year spent in it, so that 1 - QoL weight represents the loss of utility associated with the disease for each year.31 We therefore evaluated the “net gain” in QALYs obtained with HRT, with a positive value for overall benefit and a negative value for overall harm. Our target population was white UK women aged 50, with or without menopausal symptoms, who had used combined HRT for five years.
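The arithmetic described above can be sketched in a few lines. This is only an illustration: the function names are ours and do not come from the paper, and the weight of 0.89 is the breast cancer value quoted later in the text.

```python
def qalys(qol_weight: float, years: float) -> float:
    """QALYs accrued over `years` spent in a state with the given QoL weight."""
    if not 0.0 <= qol_weight <= 1.0:
        raise ValueError("QoL weight must lie between 0 (death) and 1 (perfect health)")
    return qol_weight * years


def qaly_loss(qol_weight: float, years: float) -> float:
    """Utility lost relative to perfect health: (1 - QoL weight) for each year."""
    if not 0.0 <= qol_weight <= 1.0:
        raise ValueError("QoL weight must lie between 0 (death) and 1 (perfect health)")
    return (1.0 - qol_weight) * years


# Five years lived with a condition weighted 0.89 (breast cancer weight quoted in the text).
print(qalys(0.89, 5))      # QALYs accrued over the five years
print(qaly_loss(0.89, 5))  # QALYs lost compared with five years in perfect health
```

The two quantities always sum to the number of years lived, which is why the model can work with losses (1 - QoL weight) alone.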
We used the net benefit model, originally described by Glasziou and Irwig.32 The basic equation on which the model is based is net benefit = (risk level x risk reduction) - harm. We extended the model to include multiple outcomes of benefit and harm by assuming additive effects for both and by considering the associated uncertainty. This approach consisted of subtracting the harm of an intervention from its benefit, with both expressed as a common measure of effect. The measure we chose was QoL, as this allowed us to evaluate the impact on overall health of relief of menopausal symptoms from HRT.33 We estimated the net benefit of HRT in patients with different baseline risks for breast cancer, the most relevant adverse outcome given the magnitude of the relative risk associated with HRT and the background incidence and mortality in the study population (table). The model identifies a baseline risk threshold above which potential harms outweigh benefits.
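The basic equation and its additive extension to several outcomes can be sketched as follows. This is a minimal illustration and not the authors' WinBUGS model: the `Outcome` class, the function names, and every number are hypothetical placeholders, and the sketch ignores parameter uncertainty.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    name: str
    baseline_risk: float    # five year cumulative incidence without HRT
    relative_change: float  # relative risk reduction (benefit) or increase (harm)
    qalys_per_case: float   # QALYs gained per case prevented, or lost per case caused


def expected_qalys(outcome: Outcome) -> float:
    """Risk level x relative change x QALY impact per case."""
    return outcome.baseline_risk * outcome.relative_change * outcome.qalys_per_case


def net_benefit(benefits: list[Outcome], harms: list[Outcome]) -> float:
    """Additive net benefit: summed QALY gains minus summed QALY losses."""
    gain = sum(expected_qalys(o) for o in benefits)
    loss = sum(expected_qalys(o) for o in harms)
    return gain - loss


# Hypothetical placeholder values, chosen only to show the arithmetic.
benefits = [Outcome("hip fracture", 0.010, 0.30, 1.0)]
harms = [Outcome("breast cancer", 0.012, 0.25, 2.0)]
print(net_benefit(benefits, harms))  # a negative value means net harm
```

Expressing gains and losses in the same unit is what allows them to be summed and subtracted directly.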
Figure 1 shows the model. The average loss in QALYs from harms was calculated by multiplying the estimated number of cases caused (five year cumulative incidence multiplied by the relative risk increase of the disease) by the loss in QoL associated with the disease over five years. Since QoL was evaluated only for women who were still alive with the disease, we also considered the impact of HRT on mortality by multiplying the number of cases caused by the five year cause specific mortality. Thus, the effect of HRT on the increase in risk is limited to the five year treatment period, but the effect on QoL and mortality is over five years from the occurrence of the disease. As the effect on QALYs of deaths caused would continue for the remainder of a woman's expected lifetime, we projected the impact of mortality on overall QALYs, based on the average life expectancy of a UK woman of 80 years and an annual discount rate of 3%.34 35
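The projection of mortality over the remaining lifetime can be illustrated with a short sketch. Several points here are our assumptions and are not stated in the paper: each lost year is discounted from the end of that year, lost years are valued at a QoL weight of 1, and the inputs in the example are hypothetical.

```python
def discounted_life_years(age_at_death: int, life_expectancy: int = 80,
                          discount_rate: float = 0.03) -> float:
    """Present value of the life years lost by dying at `age_at_death`.

    Assumes end-of-year discounting, one common convention that may
    differ from the one used in the paper.
    """
    years_lost = max(0, life_expectancy - age_at_death)
    return sum(1.0 / (1.0 + discount_rate) ** t for t in range(1, years_lost + 1))


def qalys_lost_to_deaths(cases_caused: float, five_year_mortality: float,
                         age_at_death: int) -> float:
    """Excess cases x cause specific mortality x discounted life years lost."""
    return cases_caused * five_year_mortality * discounted_life_years(age_at_death)


# Hypothetical inputs: 0.003 excess cases per woman, 20% five year mortality, death at 55.
print(discounted_life_years(55))
print(qalys_lost_to_deaths(0.003, 0.20, 55))
```

With a discount rate of zero the first function simply returns the number of years between death and the assumed life expectancy of 80.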
We followed the same structure for benefits, but we considered the relative risk reduction of the outcome associated with HRT, with prevented rather than excess cases of the disease, and associated deaths.
For breast cancer, we considered baseline risks ranging from 0.1% to 50% rather than an average risk. We assessed high risk levels to determine the suitability of HRT in women with BRCA1 or BRCA2 gene mutations, in whom the risk of breast cancer can approach 50%.36 37 For menopausal symptoms we only considered the relative risk reduction in symptoms and the impact on QoL because symptoms were experienced by the whole target population and did not lead to any cause specific mortality.
Our model was developed within a bayesian framework using non-informative prior distributions for all model parameters, so that the analysis did not draw on information external to the data, which might rest on subjective prior knowledge.38 A bayesian approach was adopted because of its flexibility in statistical modelling.39 40 Model parameters were estimated by using Markov chain Monte Carlo simulation methods (WinBUGS 1.4; MRC Biostatistics Unit, Cambridge).39 Results are reported with credibility intervals, analogous to confidence intervals from a classical approach.
Our model is based on three assumptions: the additive nature of the QoL lost or gained for risks or benefits of HRT; the constant increase in relative risk of breast cancer associated with HRT for all levels of baseline risk, with an absolute risk increase increasing linearly with baseline risk; and the constant absolute risk increase and absolute risk reduction associated with HRT for all other outcomes at all levels of baseline risk of breast cancer. The assumption of additive effects implies that there is no interaction in the effect of HRT on different outcomes at any level (baseline risk, relative risk increase or relative risk reduction, QoL weight).
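Taken together, these assumptions make the net benefit a linear function of the baseline risk of breast cancer, which is why a single risk threshold can be identified. The notation below is ours and is introduced only for illustration: r is the five year baseline risk of breast cancer, p a five year baseline risk for any other outcome, and q the QALYs gained or lost per case.

```latex
\begin{aligned}
\mathrm{NB}(r) &= \underbrace{\sum_{j \in \text{benefits}} p_j\,\mathrm{RRR}_j\,q_j
  \;-\; \sum_{k \in \text{other harms}} p_k\,\mathrm{RRI}_k\,q_k}_{B\ \text{(constant in } r\text{)}}
  \;-\; r\,\mathrm{RRI}_{\mathrm{bc}}\,q_{\mathrm{bc}}, \\[4pt]
r^{*} &= \frac{B}{\mathrm{RRI}_{\mathrm{bc}}\,q_{\mathrm{bc}}},
  \qquad \mathrm{NB}(r) < 0 \ \text{whenever}\ r > r^{*}.
\end{aligned}
```

If B is zero or negative, as may happen when there are no symptoms to relieve, the threshold is not positive and HRT is harmful at every level of baseline risk.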
Data for model
The table gives the data used in the model. When available, we selected incidence, mortality, and QoL data for the target population; otherwise we used data on more general populations, such as a broader age category or populations that included men (see bmj.com).
Relative risk reduction and relative risk increase
We calculated the relative risk reduction for benefits and relative risk increase for harms from the relative risk of developing each outcome in HRT users compared with non-users—that is, relative risk reduction = 1 - relative risk and relative risk increase = relative risk - 1. The data were based on three randomised controlled trials, reviewed in a recent meta-analysis, although for the heart and estrogen/progestin replacement study II trial we used updated results.3 14 16 We excluded a fourth trial included in the review since the intervention was unopposed oestrogen therapy.41 We estimated the pooled relative risk using a fixed effect meta-analysis, due to the small number of trials (table). As none of the three trials considered relief of menopausal symptoms, we based the relative risk on a recent meta-analysis of trials carried out by the Cochrane Collaboration.42
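The two conversions and a fixed effect pooling step can be sketched as below. The paper fitted its model within a bayesian framework in WinBUGS; the sketch swaps in the classical inverse variance method on the log relative risk scale, and the three trial estimates are hypothetical values, not the data used by the authors.

```python
import math


def relative_risk_reduction(relative_risk: float) -> float:
    """RRR = 1 - RR, used for outcomes on which HRT is beneficial."""
    return 1.0 - relative_risk


def relative_risk_increase(relative_risk: float) -> float:
    """RRI = RR - 1, used for outcomes on which HRT is harmful."""
    return relative_risk - 1.0


def pool_fixed_effect(relative_risks, standard_errors):
    """Inverse variance fixed effect pooling of log relative risks.

    `standard_errors` are the standard errors of the log relative risks.
    Returns the pooled relative risk and its 95% confidence interval.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    log_rrs = [math.log(rr) for rr in relative_risks]
    pooled_log = sum(w * y for w, y in zip(weights, log_rrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    lower = math.exp(pooled_log - 1.96 * pooled_se)
    upper = math.exp(pooled_log + 1.96 * pooled_se)
    return math.exp(pooled_log), (lower, upper)


# Hypothetical relative risks and standard errors for three trials.
pooled_rr, interval = pool_fixed_effect([1.26, 1.30, 1.10], [0.12, 0.30, 0.40])
print(pooled_rr, interval, relative_risk_increase(pooled_rr))
```

A fixed effect model assumes that all trials estimate the same underlying relative risk, so each trial is weighted only by its precision.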
Quality of life
We chose QoL data based on the methods used, in particular the time trade-off method. For this method participants are asked to make trade-offs between a shorter life span in “perfect” health compared with a longer life span with the condition under study. Respondents were individuals in the community, not affected by the disease. Exceptions were endometrial cancer and menopausal symptoms for which data based on time trade-off and unaffected women in the community, respectively, were not found.
Although several studies investigated QoL associated with menopausal symptoms, only two used the time trade-off method: QoL weights per year 0.63 (95% confidence interval 0.58 to 0.68) and 0.75 (0.66 to 0.83).33 43 The value of 0.63 was for 104 women taking HRT for mild or severe symptoms and who recalled their QoL before starting HRT, with the assumption that recall represented the “no treatment” state. As we had reservations about this, particularly bias towards a lower estimate, we selected the 0.75 value. This was assessed using hypothetical scenarios depicting mild and severe symptoms in 63 postmenopausal women. Although this value was less extreme, the result that women might give up a quarter of a year of life to live the rest of the year without menopausal symptoms is surprising, particularly when considering the much higher QoL weights reported for all other outcomes, for example 0.89 for breast cancer.
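The link between a time trade-off answer and a QoL weight can be made explicit with a short sketch. The function name is ours, and the formula is the standard ratio used for chronic health states, which we assume matches the studies cited.

```python
def time_trade_off_utility(years_in_perfect_health: float,
                           years_with_condition: float) -> float:
    """Utility implied when a respondent is indifferent between the two options."""
    if years_with_condition <= 0:
        raise ValueError("years_with_condition must be positive")
    if not 0 <= years_in_perfect_health <= years_with_condition:
        raise ValueError("the shorter life span must lie between 0 and the longer one")
    return years_in_perfect_health / years_with_condition


# Giving up a quarter of a year to live the remaining three quarters without symptoms.
print(time_trade_off_utility(0.75, 1.0))
```

The larger the share of life a respondent is willing to give up, the lower the implied weight for the condition.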
We therefore carried out our analysis across a range of QoL values (0.4 to 1.0). This not only allowed us to assess the sensitivity of the results to the QoL weight assumed for menopausal symptoms, but also enabled us to obtain results tailored to individual women according to their perceived QoL with symptoms and their baseline risk of breast cancer.
Our model showed a net harm associated with HRT use for five years in asymptomatic women, which increased with baseline risk of breast cancer (fig 2). The loss in QALYs associated with HRT use for five years was 0.2 months in perfect health (- 0.02 QALYs, 95% credibility interval - 0.04 to 0.00) for women at low risk (0.7%), 0.4 (- 0.03, - 0.05 to - 0.01) for women at average risk (1.2%), 2.4 (- 0.20, - 0.40 to - 0.03) for women at high risk (12%), and 9.7 (- 0.81, - 1.63 to - 0.09) for women at very high risk (50%).
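The conversion from QALYs to months in perfect health is a simple rescaling, sketched here with the four point estimates reported above. The helper names and the 365.25 day year are our choices.

```python
MONTHS_PER_YEAR = 12
DAYS_PER_YEAR = 365.25


def qalys_to_months(qalys: float) -> float:
    """Express a QALY gain or loss as months in perfect health."""
    return qalys * MONTHS_PER_YEAR


def qalys_to_days(qalys: float) -> float:
    """Express a QALY gain or loss as days in perfect health."""
    return qalys * DAYS_PER_YEAR


# Point estimates reported in the text for asymptomatic women, by baseline risk.
reported = {"low (0.7%)": -0.02, "average (1.2%)": -0.03,
            "high (12%)": -0.20, "very high (50%)": -0.81}
for label, value in reported.items():
    months = abs(qalys_to_months(value))
    days = abs(qalys_to_days(value))
    print(f"{label}: {months:.1f} months ({days:.0f} days) lost")
```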
Our main analysis showed HRT to be on average beneficial in women with menopausal symptoms, with the magnitude of benefit decreasing with increasing baseline risks of breast cancer. The results are, however, sensitive to the value assumed for the QoL associated with symptoms: figure 2 shows the results for a QoL value of 0.75 (95% confidence interval 0.66 to 0.83). For a baseline risk greater than 25% the probability of net harm was greater than 2.5%. In particular, the net benefit for women at low, average, high, and very high baseline risk was, respectively, 10.7 months in perfect health (QALYs 0.89, 95% credibility interval 0.56 to 1.26), 10.6 months (0.88, 0.55 to 1.25), 8.5 months (0.71, 0.33 to 1.12), and 1.2 months (0.10, - 0.78 to 0.89).
Results were robust when using multiway sensitivity analyses to assess the impact of different QoL values for all outcomes, except menopausal symptoms. In particular, when using extreme QoL values of 0.5 and 1 for each outcome, QALYs were 0.88 to 0.89 for women at low baseline risk of breast cancer, 0.87 to 0.89 for women at average risk, 0.65 to 0.73 for women at high risk, and - 0.13 to 0.17 for women at very high risk.
An alternative approach for dealing with the uncertainty in QoL values is to consider the implication for an individual woman according to her baseline risk of breast cancer and the utility she ascribes to her own menopausal symptoms. The contour plot in figure 3 provides a graphical representation of the probability of net harm associated with use of combined HRT for five years for different combinations of utilities and baseline risks. For example, a 50 year old woman with a baseline risk of breast cancer over five years of 5.4% (calculated using the Gail predictive model44) and who attributes a utility of 0.90 to her menopausal symptoms (she would rather live four and a half years without symptoms than five years with symptoms), would have a probability of overall harm between 0 and 1%. The data on QoL for menopausal symptoms in this plot were modelled without uncertainty, since the utility is assumed to be obtained directly from the woman rather than being an estimate for an average woman. The probabilities of net harm shown in the plot represent the most plausible point estimates derived from our model.
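One way to obtain a probability of net harm from a simulation-based model is to count the share of simulated net benefit values that fall below zero. The sketch below assumes that approach; the draws are a hypothetical stand-in generated from a normal distribution with made-up parameters, whereas in the paper they would come from the Markov chain Monte Carlo output.

```python
import random


def probability_of_net_harm(net_benefit_draws) -> float:
    """Share of simulated net benefit values (in QALYs) that are below zero."""
    draws = list(net_benefit_draws)
    if not draws:
        raise ValueError("at least one draw is required")
    return sum(1 for d in draws if d < 0) / len(draws)


# Hypothetical stand-in for posterior draws at one combination of utility and baseline risk.
rng = random.Random(2004)  # fixed seed so the example is reproducible
draws = [rng.gauss(0.10, 0.40) for _ in range(10_000)]
print(probability_of_net_harm(draws))
```

Repeating the calculation over a grid of utilities and baseline risks would give the values from which a contour plot of this kind can be drawn.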
According to our decision analysis, HRT for the primary prevention of chronic diseases in asymptomatic women is unjustified, with a net harm that increases with baseline risk of breast cancer. The loss in QALYs associated with HRT use for five years ranged from around seven days in perfect health for women with the lowest baseline risk (0.7%) to two and a half months for those with a high baseline risk (12%).
In women with menopausal symptoms the magnitude of net benefit associated with HRT depended on the QoL value assumed for the symptoms. This value varies widely owing to the substantial variability in severity of symptoms and perception of their impact on everyday life reported by women.33 43 Given these limitations a tailored approach to an individual woman, based on her own utility and baseline risk, would be more appropriate than a population based approach for decision making.
Simple graphs based on a decision model can help clinicians and women to make informed decisions about HRT.45 Our contour plot could be used as a decision aid as it predicts the probability of net harm in a woman with a specific baseline risk of breast cancer who can express the utility of her own menopausal symptoms. In this way women can make a decision based on what they consider an acceptable probability.
We chose the published QoL estimate of 0.75, derived using the time trade-off method, but previous decision analyses on HRT have used the higher QoL weight of 0.99 based on experts' rather than women's opinion.23 46 Using the higher value in our model showed a probability of net harm of less than 2.5% for women with a five year cumulative baseline risk of breast cancer below 0.7%, considerably lower than the UK and US average.
The reporting of the QoL component of large high quality studies would add significantly to the evidence base. However, trials that have evaluated the effectiveness of HRT on prevention of chronic diseases might not necessarily provide the best evidence. In fact, since the effect of HRT on vasomotor symptoms is now well established, informed consent may be difficult to obtain from women who perceive their symptoms as distressing.42 A subgroup analysis of women with menopausal symptoms at baseline in the women's health initiative trial showed that although HRT was effective at improving vasomotor symptoms, it had no major impact on health related QoL except a small benefit on sleep disturbance.47 Women previously receiving HRT with moderate or severe symptoms during a washout period were “discouraged,” but not excluded, from participating in the study, which might have resulted in selection bias by including women who did not attribute major importance to their symptoms.47
The discordance between our results and those of the subgroup analysis might reflect the use of different approaches for measuring health states: preference based methods such as time trade-off in our analysis and health state measures in the subgroup analysis. Agreement between the two methods is poor to moderate, as shown by studies that have incorporated both assessments.48 For example, utilities assessed using the time trade-off method compared with several measures of health status in patients with coronary heart disease produced a correlation of only 0.38 to 0.52.49 Preference based measures, although less sensitive than psychometric based measures to changes in health state over time, are considered more suitable for making decisions about the suitability of interventions. This is because they assess QoL as a whole, thus including aspects that may not be properly addressed by health state measures, but which the individual may consider important.50
Our decision analysis is based on the net benefit model. Compared with the more prevalent probabilistic decision models, such as Markov models, the net benefit model has a simpler structure, although its complexity does depend on the number of outcomes considered.
Limitations of study
We considered only the average risk for outcomes, apart from breast cancer. This limitation could be overcome by using a multidimensional model that allowed baseline risks of additional outcomes to vary, thus producing a net benefit response surface.51
Our model assumes that the harms and benefits from HRT do not continue once treatment stops at five years. However, there is evidence that the effect of HRT declines after stopping treatment, and this could easily be incorporated using a Markov decision modelling approach.52 53
A limitation of any decision model is its dependency on the data used. The results of our model showed sensitivity to QoL for menopausal symptoms whereas they were robust to the assumptions on all other QoL values. Moreover, our results applied to white UK women aged 50 years, although our model could be easily customised to other populations.
In theory the utilities for all outcomes in our analysis could be individualised to one patient.54 This would be difficult to implement in practice, given the number of outcomes considered here, and no straightforward graphical aid could help in decision making. Moreover, the results of our sensitivity analyses illustrate the robustness of the results when varying all utilities other than those for menopausal symptoms. We used utilities derived using time trade-off independently for each outcome, which does not allow a comparison of the attitudes to risk with different outcomes.
What is already known on this topic
HRT is not justified for primary prevention of chronic disease among postmenopausal women
Such therapy may be justified to relieve menopausal symptoms that reduce quality of life, but evidence is scant
What this study adds
Fifty year old women who use HRT for five years to relieve menopausal symptoms may find treatment beneficial
Overall, benefits depend highly on the severity of symptoms and the associated effect on quality of life
Another limitation is that we could not properly assess the assumptions on which our model is based. In particular, the assumption of additive risks and benefits of HRT is unlikely to be satisfied for some outcomes, since interactions might be present. For example, reduced physical activity after a hip fracture might affect the risk of cardiovascular disease. A further step would be to extend the model to allow for interactions, although with a large number of outcomes a Markov model would be better.
We did not consider the risk of ovarian cancer, which seems to be associated with only the use of unopposed oestrogens, nor the effect of HRT on depression and cognitive activities and gallbladder disease, due to the paucity of evidence.9
Women with menopausal symptoms on average benefit from HRT, a result that concurs with the recommendations of the UK Medicines and Healthcare Products Regulatory Agency.55 The results, however, depend on the QoL attributed to symptoms, which in turn varies greatly with severity of symptoms and women's perceptions. Thus, a decision analysis tailored to an individual woman would be more appropriate in clinical practice than a population based approach.
Editorial by McPherson
Detailed data are on bmj.com
Contributors CM undertook all analyses and drafted the paper. KRA was involved in the planning and supervision of the project and redrafted the paper; he will act as guarantor for the paper. AJS gave advice on the structure of model and revised the paper. NJC gave advice on the population of the net benefit model and revised the paper.
Competing interests KRA has received research funding from Schering Health Care, a manufacturer of combined HRT, to evaluate a levonorgestrel emitting intrauterine device.
Ethical approval Not required.