- L J Middleton, medical statistician1,
- R Champaneria, research associate1,
- J P Daniels, research fellow1,
- S Bhattacharya, professor of reproductive medicine2,
- K G Cooper, consultant gynaecologist3,
- N H Hilken, IT co-coordinator1,
- P O’Donovan, consultant gynaecologist4, professor of medical innovation5,
- M Gannon, consultant senior lecturer in obstetrics and gynaecology6,
- R Gray, unit director1,
- K S Khan, professor of obstetrics-gynaecology and clinical epidemiology7
- on behalf of the International Heavy Menstrual Bleeding Individual Patient Data Meta-analysis Collaborative Group
- 1Birmingham Clinical Trials Unit, University of Birmingham, Birmingham B15 2TT
- 2Section of Applied Clinical Sciences, Division of Applied Health Sciences, School of Medicine and Dentistry, University of Aberdeen, Aberdeen Maternity Hospital, Foresterhill, Aberdeen AB25 2ZD
- 3Department of Obstetrics and Gynaecology, Aberdeen Royal Infirmary, Foresterhill, Aberdeen AB25 2ZN
- 4Bradford Royal Infirmary (Bradford Teaching Hospitals NHS Foundation Trust), Bradford BD6 6RJ
- 5University of Bradford, Bradford BD7 1DP
- 6Department of Obstetrics and Gynaecology, Midland Regional Hospital, Mullingar, Co Westmeath, Republic of Ireland
- 7Academic Department of Obstetrics and Gynaecology, Division of Reproductive and Child Health, University of Birmingham, Birmingham Women’s Hospital, Birmingham B15 2TG
- Correspondence to: L Middleton
- Accepted 6 June 2010
Objective To evaluate the relative effectiveness of hysterectomy, endometrial destruction (both “first generation” hysteroscopic and “second generation” non-hysteroscopic techniques), and the levonorgestrel releasing intrauterine system (Mirena) in the treatment of heavy menstrual bleeding.
Design Meta-analysis of data from individual patients, with direct and indirect comparisons made on the primary outcome measure of patients’ dissatisfaction.
Data sources Data were sought from the 30 randomised controlled trials identified after a comprehensive search of the Cochrane Library, Medline, Embase, and CINAHL databases, reference lists, and contact with experts. Raw data were available from 2814 women randomised into 17 trials (seven trials including 1359 women for first v second generation endometrial destruction; six trials including 1042 women for hysterectomy v first generation endometrial destruction; one trial including 236 women for hysterectomy v Mirena; three trials including 177 women for second generation endometrial destruction v Mirena).
Eligibility criteria for selecting studies Randomised controlled trials comparing hysterectomy, first and second generation endometrial destruction, and Mirena for women with heavy menstrual bleeding unresponsive to other medical treatment.
Results At around 12 months, more women were dissatisfied with outcome with first generation hysteroscopic techniques than with hysterectomy (13% v 5%; odds ratio 2.46, 95% confidence interval 1.54 to 3.9, P<0.001), but hospital stay (weighted mean difference 3.0 days, 2.9 to 3.1 days, P<0.001) and time to resumption of normal activities (5.2 days, 4.7 to 5.7 days, P<0.001) were longer for hysterectomy. Unsatisfactory outcomes were comparable with first and second generation techniques (odds ratio 1.2, 0.9 to 1.6, P=0.2), although second generation techniques were quicker (weighted mean difference 14.5 minutes, 13.7 to 15.3 minutes, P<0.001) and women recovered sooner (0.48 days, 0.20 to 0.75 days, P<0.001), with fewer procedural complications. Indirect comparison suggested more unsatisfactory outcomes with second generation techniques than with hysterectomy (11% v 5%; odds ratio 2.3, 1.3 to 4.2, P=0.006). Similar estimates were seen when Mirena was indirectly compared with hysterectomy (17% v 5%; odds ratio 2.2, 0.9 to 5.3, P=0.07), although this comparison lacked power because of the limited amount of data available for analysis.
Conclusions More women are dissatisfied after endometrial destruction than after hysterectomy. Dissatisfaction rates are low after all treatments, and hysterectomy is associated with increased length of stay in hospital and a longer recovery period. Definitive evidence on effectiveness of Mirena compared with more invasive procedures is lacking.
Heavy menstrual bleeding is a common problem in women of reproductive age.1 It is often incapacitating and expensive to treat and can severely affect a woman’s quality of life.2 3 Many women are not happy with medical treatment and end up undergoing surgery.4 Hysterectomy was once the only surgical option for heavy menstrual bleeding, and almost half of the hysterectomies currently performed worldwide are carried out for this reason.5 Endometrial destruction techniques, which aim to destroy or remove the endometrial tissue,6 have become increasingly popular alternatives, and, as a result, the number of hysterectomies in the United Kingdom declined by 64% between 1995 and 2002.7 These techniques were introduced in the 1980s, with rollerball ablation and transcervical resection emerging as the main approaches under direct hysteroscopic vision.8 Subsequently, second generation non-hysteroscopic techniques have been developed, which are easier to perform. Here, devices are sited and activated to treat the whole endometrial cavity simultaneously without visual control. Destruction is achieved through various methods, including high temperature fluids and bipolar electrical or microwave energy. Intrauterine devices were initially introduced as contraceptives, but the addition of progestogen resulted in reduced menstrual bleeding. Mirena, the levonorgestrel releasing intrauterine system, provides a non-surgical alternative, which is reversible and spares fertility.9
Women and clinicians now have a greater choice of treatment, although evidence to support decision making is inadequate. In the UK, guidelines from the National Institute for Health and Clinical Excellence10 recommend the use of Mirena in the first instance for women with benign heavy menstrual bleeding, followed by endometrial destruction, if drug treatments fail to resolve symptoms. Syntheses of evidence from randomised controlled trials comparing these treatments have been limited,11 12 13 partly because of scarcity of head to head comparisons and variation in outcome measurements used to evaluate effectiveness. We undertook a meta-analysis of data from individual patients from all relevant trials to address previous deficiencies in evidence synthesis. This sort of meta-analysis has several advantages over traditional reviews of published data,14 including the ability to carry out data checks, standardise analytical methods, and undertake subgroup analyses.
We sought data on individual patients from randomised controlled trials of hysterectomy, endometrial destruction techniques, and Mirena to examine their relative efficacy as second line treatment for heavy menstrual bleeding. The systematic review was conducted according to a protocol (www.bctu.bham.ac.uk/systematicreview/hmb/protocol.shtml) that was designed with widely recommended methods15 16 and complied with guidelines for reporting meta-analysis.17
Literature search and study selection
We searched the Cochrane Library, Medline (1966-2010), Embase (1980 to May 2010), and CINAHL databases (1982 to May 2010) using relevant terms and word variants for population and interventions (see appendix 1 on bmj.com). We also hand searched the bibliographies of all relevant primary articles and reviews to identify any articles missed by the electronic searches. Experts were contacted to identify further studies. To identify any ongoing randomised controlled trials, we searched the Meta-Register of Controlled Trials and the ISRCTN register. No language restriction was applied.
Studies were selected in a two step process. Firstly, we scrutinised the citations identified by the electronic searches and obtained full manuscripts of all the citations that met, or were thought likely to meet, the predetermined inclusion criteria based on patients’ entry criteria (women with heavy menstrual bleeding or abnormal/excessive/prolonged uterine bleeding that was unresponsive to other medical treatment) and study design, the latter limited to randomised controlled trials. We then considered four categories of intervention: hysterectomy (performed abdominally, vaginally, or laparoscopically); “first generation” endometrial destruction techniques (using operative hysteroscopy, including endometrial laser ablation, transcervical resection of the endometrium (TCRE), and rollerball endometrial ablation); “second generation” endometrial destruction techniques (those that use a “blind” device to simultaneously treat the whole cavity, including thermal balloon (Cavaterm, Thermachoice, and Vesta), microwave (Microsulis), laser (ELITT), bipolar radio frequency (NovaSure), cryoablation, and hydrothermal ablation); and a levonorgestrel releasing intrauterine system (Mirena). We compared these categories of treatment against each other; studies were excluded from the meta-analysis if a comparison between relevant categories did not exist, although we also requested data from studies making a comparison within these categories to allow further exploration of possible predictors of the primary outcome measure.
Data collection and study quality assessment
We made repeated attempts to contact corresponding authors via post, email, or telephone to access data. When initial attempts failed, we attempted personal contact via our links through the British and European Societies for Gynaecological Endoscopy. Authors were asked to supply anonymised data for each of the prespecified outcome measures and were invited to become part of the collaborative group with joint ownership of the final publication. When investigators declined to take part in the study or could not be contacted, two independent reviewers (RC and LJM) extracted published data from manuscripts using predesigned proformas. Any disagreements were resolved by consensus or arbitration by a third reviewer (JPD). Received data were merged into a master database, specifically constructed for the review. The data were cleaned and results cross checked against published reports of the trials. When discrepancies existed we contacted authors for clarification.
Authors of the protocol reviewed all relevant outcome measures to be used in the meta-analysis from articles identified in the literature search. Level of satisfaction with treatment was the most commonly measured outcome across all identified studies, with 21 out of 30 (70%) using this measure, and we used it as our primary outcome measure. Dissatisfaction rates are presented to simplify interpretation of statistical output. Responses of “very satisfied” or “satisfied” were taken as a positive response; likewise, “very dissatisfied” or “dissatisfied” were taken as a negative response. Where a “not sure” or “uncertain” response was given, it was conservatively taken to be a negative rating of treatment, although we carried out a sensitivity analysis to test the robustness of this assumption. For a small number of studies,18 19 20 21 we used surrogate outcomes for satisfaction (major problem resolved/improvement of health state/menstrual symptoms successfully treated/degree of recommendation). This assumption was also tested by sensitivity analysis without these studies (indicated in the results section where important) (see appendix 2 on bmj.com). A more disease specific quality of life tool22 would have been the ideal choice for primary measure, but relevant data were not available from the identified studies. We have shown from the data in this review, though, that there is a strong relation between dissatisfaction and patients’ quality of life (see results section).
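The coding rule described above can be sketched in a few lines of Python. This is an illustration only: the response labels, the function names, and in particular the handling of uncertain responses in the sensitivity analysis (here they are dropped from the denominator) are assumptions, since the text does not say how the sensitivity analysis recoded them.

```python
# Hypothetical response labels; the wording varied between trials.
NEGATIVE = {"very dissatisfied", "dissatisfied"}
POSITIVE = {"very satisfied", "satisfied"}
UNCERTAIN = {"not sure", "uncertain"}


def is_dissatisfied(response, uncertain_as_negative=True):
    """Return True/False for dissatisfied, or None if the response is excluded."""
    r = response.strip().lower()
    if r in NEGATIVE:
        return True
    if r in POSITIVE:
        return False
    if r in UNCERTAIN:
        # Primary analysis: conservative, counted as dissatisfied.
        # Sensitivity analysis (assumed form): excluded from the denominator.
        return True if uncertain_as_negative else None
    raise ValueError(f"unrecognised response: {response!r}")


def dissatisfaction_rate(responses, uncertain_as_negative=True):
    """Proportion dissatisfied among the responses that are kept."""
    flags = [is_dissatisfied(r, uncertain_as_negative) for r in responses]
    kept = [f for f in flags if f is not None]
    return sum(kept) / len(kept)


sample = ["satisfied", "not sure", "dissatisfied", "very satisfied"]
primary = dissatisfaction_rate(sample)                                   # uncertain counted as negative
sensitivity = dissatisfaction_rate(sample, uncertain_as_negative=False)  # uncertain excluded
```

Comparing the two rates for each trial arm shows how much a conclusion depends on the conservative treatment of uncertain responses.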
Other outcome measures were bleeding scores (ranging from a minimum of zero with no upper limit),23 amenorrhoea rate (converted from a bleeding score of zero where data existed, otherwise as reported), heavy bleeding rate (converted from bleeding scores23 of >100 where data existed, otherwise as reported), EQ-5D utility score,24 SF-36 scores,25 duration of surgery/hospital stay, rates of general anaesthesia, postoperative pain score (standardised from visual analogue and ordinal scale scores on to a scale of 0-10), time to return to work/normal activities/sexual activity, dysmenorrhoea/dyspareunia rate, and proportion undergoing subsequent ablation/hysterectomy or discontinuing use of Mirena. Predefined subgroups were age at randomisation (≤40 v >40), parity (nulliparous v parous), length of uterine cavity (≤8 v >8 cm), presence or absence of fibroids/polyps, and, when available, severity of bleeding at baseline (bleeding score ≤350 or >350).
We assessed all selected trials for their methodological quality by using received datasets when available in addition to the reported information. Quality was scrutinised by checking adequacy of randomisation, group comparability at baseline (examining baseline characteristics for any substantive differences), blinding (where appropriate), use of intention to treat analysis, completeness of follow-up, compliance, reliability by using a priori estimation of sample size, and generalisability by using a description of the sample recruited. Adequacy of randomisation was assessed with sub-questions examining information on sequence generation, the process of allocation, and allocation concealment.
To minimise the possibility of bias, we combined data on individual patients and aggregate data in a two stage approach.26 Data on individual patients were reduced to aggregate data to allow studies with only aggregate data to be combined with those with data on individual patients. Unless specifically stated, all estimates shown are from all available data (both individual patient data and aggregate data). Point estimates and 95% confidence intervals were calculated for individual studies at each time point. Differences in effect estimates between trials and the predefined subgroups of patients are displayed with odds ratio plots, with heterogeneity investigated by using Cochran’s Q27 and I² statistics.28 Subgroup analyses to explore the causes of heterogeneity were undertaken if the P values of these tests were <0.1. Differences between studies contributing data on individual patients and those with only aggregate data were examined in the same fashion to check that the latter results were consistent with those for which we received individual data. Limited data were available for studies comparing Mirena with endometrial destruction, so we compared Mirena with first and second generation studies combined as well as separately. We used assumption-free “fixed effect” methods to combine dichotomous outcome measures and estimate pooled odds ratios using the method of Peto,29 and, for continuous variables, calculated weighted mean differences30 at each time point. Data at less than 12 months were combined and are described as results at six months. Results from the limited number of studies with follow-up longer than two years are not referred to in the text but are given in the appendices (see bmj.com).
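For readers unfamiliar with the pooling step, the following Python sketch shows one standard way to compute a Peto fixed effect odds ratio from per-trial 2×2 counts, together with Cochran’s Q and I². It is a minimal illustration, not the analysis code used for this review (which was run in Revman and SAS); the function names and the example counts are hypothetical.

```python
import math


def peto_components(events_a, n_a, events_b, n_b):
    """Return (O - E, V) for arm A of one trial's 2x2 table."""
    n = n_a + n_b
    m = events_a + events_b          # total events in the trial
    expected = n_a * m / n           # expected events in arm A under no effect
    variance = n_a * n_b * m * (n - m) / (n ** 2 * (n - 1))  # hypergeometric variance
    return events_a - expected, variance


def peto_pooled(trials):
    """Pool trials given as (events_a, n_a, events_b, n_b) tuples."""
    comps = [peto_components(*t) for t in trials]
    # Trials with zero variance (no events, or all events) carry no information.
    comps = [(oe, v) for oe, v in comps if v > 0]
    sum_oe = sum(oe for oe, _ in comps)
    sum_v = sum(v for _, v in comps)
    log_or = sum_oe / sum_v          # pooled log odds ratio
    se = 1 / math.sqrt(sum_v)
    # Cochran's Q in its Peto form: sum((O-E)^2 / V) - (sum(O-E))^2 / sum(V)
    q = sum(oe ** 2 / v for oe, v in comps) - sum_oe ** 2 / sum_v
    df = len(comps) - 1
    # I^2: share of variation beyond chance, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "or": math.exp(log_or),
        "ci": (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)),
        "q": q,
        "df": df,
        "i2": i2,
    }


# Hypothetical counts: (dissatisfied in arm A, n in arm A, dissatisfied in arm B, n in arm B)
example = peto_pooled([(10, 100, 5, 100), (8, 80, 4, 80)])
```

Because each trial contributes only its own observed minus expected count, women are compared only with women randomised in the same trial, which is why pooled estimates of this kind can differ from a crude odds ratio computed on summed counts.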
The primary outcome measure of dissatisfaction was investigated comprehensively by using received data. Results at 12 months, when most studies had collected data, were used as the focus for analysis. When responses were not available at this time point, data were substituted, in the first instance, from two years and failing that six months. Estimates of dissatisfaction at any time were also examined, along with an analysis allowing for the correlation of the repeated measurements using generalised estimating equations (individual patient data only).31 If we could not directly compare treatments, we made indirect estimates32 using a logistic regression model33 allowing for trial and treatment.34 For example, we estimated the effect for the comparison of hysterectomy versus second generation endometrial destruction using the common comparators of first generation endometrial destruction and Mirena. It should be noted that for this particular analysis we assumed that there are no systematic differences between the sets of trials that could bias the indirect measure.32
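The logic of an indirect comparison can be illustrated with the simpler adjusted indirect comparison through a single common comparator (often attributed to Bucher and colleagues). This is a swapped-in technique for illustration, not the logistic regression model used in this review, and the input odds ratios below are hypothetical. The sketch assumes each input is an odds ratio with a symmetric 95% confidence interval on the log scale.

```python
import math

Z = 1.959964  # normal quantile for a two sided 95% interval


def log_or_and_se(odds_ratio, lower, upper):
    """Recover the log odds ratio and its standard error from a 95% CI."""
    return math.log(odds_ratio), (math.log(upper) - math.log(lower)) / (2 * Z)


def indirect_or(or_ab, ci_ab, or_cb, ci_cb):
    """Odds ratio for A v C through common comparator B.

    log OR(A v C) = log OR(A v B) - log OR(C v B); the variances add
    because the two sets of trials are independent.
    """
    log_ab, se_ab = log_or_and_se(or_ab, *ci_ab)
    log_cb, se_cb = log_or_and_se(or_cb, *ci_cb)
    log_ac = log_ab - log_cb
    se_ac = math.sqrt(se_ab ** 2 + se_cb ** 2)
    return (
        math.exp(log_ac),
        math.exp(log_ac - Z * se_ac),
        math.exp(log_ac + Z * se_ac),
    )


# Hypothetical inputs: A v B odds ratio 2.0 (1.0 to 4.0), C v B odds ratio 1.0 (0.5 to 2.0)
estimate, low, high = indirect_or(2.0, (1.0, 4.0), 1.0, (0.5, 2.0))
```

The interval for the indirect estimate is always wider than that of either direct comparison, which is one reason indirect results warrant cautious interpretation.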
Access to data from individual patients also allowed the inclusion of patient level covariates to examine possible predictors of dissatisfaction. Firstly, we considered covariates individually, while allowing for differences between trial estimates by including this parameter in the model. If considered statistically important (P<0.1), we included covariate parameters together in a multivariable analysis to examine adjusted estimates. In addition to the analysis of the primary outcome measure described above, as a sensitivity analysis, we also used data from individual patients to explore the effect observed in compliance rates for comparisons between first and second generation endometrial destruction (unfortunately there were insufficient data to extend this analysis to comparisons with Mirena). For example, for those women who were “satisfied” with treatment but subsequently underwent a hysterectomy, positive responses were substituted with negative ones. The relation between dissatisfaction and responses from the SF-36 quality of life questionnaire was examined at the patient level by using a regression model allowing for trial. We used Revman v5.0 (Cochrane Collaboration, Denmark) and SAS v9.2 (SAS Institute, Cary, USA) software for analyses.
Trials and patients
We identified 556 potentially relevant citations by electronic searches. After detailed evaluation, 30 trials were eligible for inclusion in the review (fig 1). Of these trials, seven compared hysterectomy with endometrial destruction techniques. Six of these studies involved first generation techniques.18 35 36 37 38 39 The seventh study used a combination of first and second generation techniques in equal proportions40 and was included here as a first generation comparison, with a sensitivity analysis performed without the trial. One study compared hysterectomy with Mirena.41 Fourteen studies compared first generation endometrial destruction techniques with second generation techniques,19 42 43 44 45 46 47 48 49 50 51 52 53 54 and eight studies compared Mirena with endometrial destruction, three of which were first generation20 55 56 and five second generation.21 57 58 59 60 Appendix 2 on bmj.com shows the characteristics of these studies. Data from a further five studies (including one unpublished study),61 62 63 64 which involved comparisons within first and second generation endometrial destruction, were also received.
Trials that compared hysterectomy with endometrial destruction and those that compared first and second generation endometrial destruction involved women of a similar age, with average ages of 40.6 (SD 5.1) and 41.0 (SD 4.9), respectively. Women in trials comparing Mirena with endometrial destruction were slightly older, with an average age of 43.6 (SD 3.5). Eligibility criteria for women with uterine pathology varied between trials; inclusion of women with fibroids was generally limited by size or number of the fibroids. When affected women were included, they amounted to a maximum of 30% of the women in each individual study.
We received a high proportion of data from trials involving hysterectomy (7/8 studies, 1278/1363 women) and less from trials of endometrial destruction techniques (7/14 studies, 1359/2448 women) and those that compared Mirena with endometrial destruction (3/8 studies, 177/494 women) (see appendix 2 on bmj.com). Overall, we received some data on individual patients from 65% (2814/4305) of women involved in the trials, although only eight studies were able to provide all requested variables.36 37 42 43 46 49 54 60 The remaining studies had some missing information, with limited details on follow-up of patients covering subsequent operations (for example, hysterectomy after Mirena). Details on how we used data from studies providing data from individual patients are in the section on statistical analysis.
The methodological quality of the studies was variable (fig 2 and appendix 3 on bmj.com). More than half the studies failed to give adequate information about their randomisation procedure and details of allocation concealment. There was a general lack of true intention to treat analysis, with some studies stating that an intention to treat analysis had been performed yet analysing only those women who had received treatment. For four studies that reported per protocol analyses,43 46 49 58 intention to treat analyses were undertaken with the available data on individual patients, although it was not always clear if patients who deviated from protocol were followed up correctly in these cases. Small sample sizes often lacked a sensible justification, especially in studies involving Mirena. In the nine trials involving Mirena, only four had more than 80% of women with Mirena in situ 12 months after randomisation.
Dissatisfaction as an outcome measure
Data from four studies that provided data from individual patients on both outcomes36 49 51 58 showed that satisfied patients had significantly increased scores in seven of eight domains of the SF-36 quality of life questionnaire when compared with dissatisfied patients in the analysis of change from baseline scores, including the general health perception (7.4 points, 95% confidence interval 3.1 to 11.8, P<0.001) and mental health (10.5 points, 5.4 to 15.6, P<0.001) domains (table 1). Differences from absolute values (not adjusted for baseline score) were highly significant (P<0.001) in all eight domains in favour of satisfied patients.
Effectiveness in reducing dissatisfaction with treatment
Hysterectomy v first generation endometrial destruction—More women were dissatisfied at 12 months after first generation endometrial destruction than after hysterectomy (13% (57/454) v 5% (23/432); odds ratio 2.5, 1.5 to 3.9, P<0.001) (fig 3), with no significant heterogeneity between study estimates (P=0.9, I²=0%). This estimate of effect size was consistent with, although slightly less than, the estimate from the repeated measures analysis (individual patient data only) over all time points (3.8, 2.2 to 6.5, P<0.001) and an analysis using dissatisfaction at any time point (3.4, 2.1 to 5.3, P<0.001). There was no evidence of any differences between subgroups (see data collection and study quality assessment section), including between studies providing individual patient data or aggregate data (test for heterogeneity P=0.9).
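As a rough arithmetic check on the counts quoted above (57/454 v 23/432), a crude odds ratio with a Woolf (log scale) confidence interval can be computed as in the sketch below. This ignores the stratification by trial used in the Peto analysis, so it is not expected to reproduce the reported 2.5 (1.5 to 3.9) exactly; the function name is hypothetical.

```python
import math


def crude_odds_ratio(events_a, n_a, events_b, n_b, z=1.959964):
    """Unstratified odds ratio (A v B) with a Woolf 95% confidence interval."""
    a, b = events_a, n_a - events_a   # arm A: events, non-events
    c, d = events_b, n_b - events_b   # arm B: events, non-events
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of the log odds ratio
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


# First generation endometrial destruction (57/454) v hysterectomy (23/432)
or_crude, ci_low, ci_high = crude_odds_ratio(57, 454, 23, 432)
```

The crude estimate comes out at roughly 2.55 (about 1.5 to 4.2), in line with the stratified result reported in the text.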
First v second generation endometrial destruction techniques—Similar rates of dissatisfaction were seen with first and second generation endometrial destruction (12% (123/1006) v 11% (110/1034); odds ratio 1.2, 0.9 to 1.6, P=0.2; test for heterogeneity P=0.7, I²=0%) (fig 4). Comparable estimates were obtained from the repeated measures analysis of data from individual patients (1.2, 0.8 to 1.7, P=0.3), the analysis of dissatisfaction at any time (1.2, 0.9 to 1.6, P=0.2), and an analysis adjusted for patients who went on to receive hysterectomy (1.3, 0.9 to 1.7, P=0.1). Results were consistent over all subgroups, including those studies providing data from individual patients or only aggregate data (test for heterogeneity P=0.8).
Mirena v endometrial destruction techniques—Rates of dissatisfaction with Mirena and second generation endometrial destruction were similar (18% (17/94) v 23% (23/102); odds ratio 0.8, 0.4 to 1.5, P=0.4) (fig 5). The combined estimate of this and the one study that compared Mirena with first generation endometrial destruction20 (test for differences between subgroups P=0.2) also showed no evidence of a difference (0.9, 0.5 to 1.8, P=0.9; test for heterogeneity over all studies P=0.1, I²=54%). Overall rates of dissatisfaction were 17% (22/128) for Mirena and 18% (25/137) for both first and second generation endometrial destruction. Lack of data from individual patients prohibited any further investigation of subgroups or repeated measures. Sensitivity analysis performed without two studies that used surrogates for dissatisfaction significantly reduced the data available for analysis but did not change the findings.
Indirect comparisons of hysterectomy with second generation endometrial destruction techniques and Mirena—Indirect estimates (fig 6) suggest that hysterectomy is also preferable to second generation endometrial destruction (5% (23/432) v 11% (110/1034); odds ratio 2.3, 1.3 to 4.2, P=0.006) in terms of patients’ dissatisfaction. This was confirmed by the repeated measures analysis (individual patient data only) over all three time points (3.1, 1.6 to 5.9, P<0.001). The evidence to suggest hysterectomy is preferable to Mirena was weaker (5% (23/432) v 17% (22/128); 2.2, 0.9 to 5.3, P=0.07), but given the lack of precision from Mirena comparisons this was not a surprising result and should be interpreted cautiously.
Predictors of dissatisfaction—For second generation endometrial destruction, data from individual patients showed that the length of the uterine cavity was the strongest predictor of dissatisfaction (P=0.02), with shorter cavities (≤8 cm v >8 cm) being associated with reduced rates (odds ratio 0.6, 0.4 to 0.9; P=0.02) (table 2). Absence of fibroids/polyps also showed a trend towards reduced dissatisfaction (P=0.07), although no further adjusted estimates including both parameters were attempted as only three studies had data on fibroids/polyps. There were no convincing associations with any of the variables for hysterectomy or first generation endometrial destruction.
Effectiveness in improving other outcomes
Hysterectomy v endometrial destruction and Mirena—These comparisons focused on recovery times and quality of life because estimates of postoperative menstrual blood loss are redundant after hysterectomy (see appendix 2 on bmj.com). Endometrial destruction offered quicker surgery (weighted mean difference 32 minutes, 30 to 34 minutes, P<0.001), shorter hospital stay (3.0 days, 2.9 to 3.1 days, P<0.001), faster recovery periods (time to return to normal activities 5.2 days, 4.7 to 5.7 days, P<0.001), and less postoperative pain (2.5 points, 2.2 to 2.9 points, P<0.001), although estimates of differences for some of these results should be used with caution given the high variability between studies (see appendix 4 on bmj.com). One study suggested no obvious difference in EQ-5D utility score,40 while another suggested differences in favour of hysterectomy in the general health (9.6 points, 5.7 to 13.5 points, P<0.001), social functioning (24 points, 21 to 27 points, P<0.001), and vitality (13 points, 9.3 to 16 points, P<0.001) domains of the SF-36 questionnaire (change from baseline).18 Relatively few perioperative adverse events were associated with hysterectomy (0.5%-2% each), but urinary tract infections were more common (8%, 43/530) than with endometrial destruction (2%, 9/585) (odds ratio 4.4, 2.5 to 7.8, P<0.001). Of the women who were initially treated with endometrial destruction, 15% (38/246) had undergone a hysterectomy by two years. There were no differences in EQ-5D scores at six or 12 months in the single study comparing hysterectomy with Mirena (see appendix 5 on bmj.com), while the only significant effect observed in the SF-36 questionnaire was in the pain domain (change from baseline), favouring hysterectomy (weighted mean difference 9.6 points, 2.7 to 16.6 points, P=0.007). All results were consistent over subgroups.
First v second generation endometrial destruction techniques—The proportion of women with amenorrhoea or still experiencing heavy bleeding was similar in both groups at all time points, apart from at two years, where there was a difference of borderline significance in favour of second generation techniques (odds ratio 0.64, 0.41 to 0.99, P=0.04, for amenorrhoea; 0.54, 0.30 to 0.97, P=0.04, for heavy bleeding) (see appendix 6 on bmj.com). Change from baseline analysis of bleeding scores showed no evidence of a difference at any of the time points. Two studies49 51 using the SF-36 questionnaire and one small study42 using the EQ-5D questionnaire showed no consistent difference between first and second generation techniques in terms of change from baseline results. Second generation endometrial destruction was quicker (weighted mean difference 14.5 minutes, 13.7 to 15.3 minutes, P<0.001) and less likely to require general anaesthesia (odds ratio 0.16, 0.12 to 0.20, P<0.001), although highly significant heterogeneity makes estimates difficult to interpret. Lower use of general anaesthesia with second generation endometrial destruction translated to a slightly quicker time to normal activities (weighted mean difference 0.48 days, 0.20 to 0.75 days, P<0.001) and time to return to work (1.36 days, 0.69 to 2.03 days, P<0.001). Postoperative pain was similar after either method. Adverse events were relatively low in both groups (<2% each), but perioperative complications such as uterine perforation (odds ratio 0.20, 0.07 to 0.57, P=0.003), excessive bleeding (0.14, 0.03 to 0.55, P=0.005), fluid overload (0.12, 0.04 to 0.36, P<0.001), and cervical laceration (0.12, 0.05 to 0.33, P<0.001) were lower with second generation techniques.
The number of women requiring a subsequent hysterectomy was lower for second generation endometrial destruction, but these differences were not large enough to be significant within the first two years (odds ratio 0.77, 0.47 to 1.24, P=0.3, at 12 months; 0.68, 0.41 to 1.13, P=0.1, at two years). Overall rates were 3% (74/2265) and 8% (71/939) at these time points. Any differences among subgroups were confined to single time points only. Results from studies providing data from individual patients were consistent with those with only aggregate data.
Mirena v endometrial destruction techniques—More women experienced heavy bleeding after endometrial destruction at six months (odds ratio 4.3, 1.8 to 10.6, P=0.001) and at two years (13.0, 2.0 to 84.2, P=0.007), although not at 12 months (1.4, 0.6 to 2.97, P=0.5), when the largest number of women was evaluated (see appendices 7 and 8 on bmj.com). Rates of amenorrhoea were similar at all time points. Change in bleeding scores favoured endometrial destruction only at 12 months (weighted mean difference 38 points, 15 to 60 points, P<0.001). Other outcome measures could not separate the two treatments. Two studies provided SF-36 change from baseline scores, and no differences were found in any of the domains.58 60 The number of women subsequently undergoing a hysterectomy was similar at each time point; rates at 12 months were 2% (2/86) for endometrial destruction and 7% (6/89) for Mirena (odds ratio 0.36, 0.09 to 1.48, P=0.2). A high proportion of women originally prescribed Mirena discontinued use of this treatment: 16% (30/191) at 12 months rising to 28% (29/105) by two years. Reported adverse events were low with Mirena; only around 3% reported an expelled or migrated coil within the first month. These results were from first and second generation studies combined, when first generation data existed, and were consistent over both types of endometrial destruction.
In women undergoing second line treatment for heavy menstrual bleeding, both first and second generation endometrial destruction techniques were associated with greater dissatisfaction than hysterectomy, although rates were low for all treatments and absolute differences were small. Recovery times and length of hospital stay were longer for hysterectomy. Dissatisfaction levels with second generation techniques were slightly lower than those associated with first generation techniques. In addition, second generation methods were quicker, associated with faster recovery times, and associated with fewer adverse procedural events and could be carried out under local anaesthesia. Fewer women subsequently underwent hysterectomy after second generation compared with first generation endometrial destruction, but this difference was not significant. Shorter uterine cavity length was associated with lower levels of dissatisfaction for second generation endometrial destruction. Comparisons of endometrial destruction with a levonorgestrel releasing intrauterine system (Mirena) suggest comparable efficacy, although studies of Mirena were generally small and consequently imprecise. Substantial discontinuation of use of Mirena was noted and makes interpretation of findings for this treatment difficult. The primary outcome measure of dissatisfaction with treatment was shown to be strongly related to reduced quality of life.
Strengths and limitations
Access to data from individual patients enabled a more rigorous analysis than is possible from published data. We used optimal methods, complying with guidelines on reporting of systematic reviews and meta-analyses.65 An extensive literature search was conducted, with no language restrictions, minimising the risk of missing information. The collection of data from individual patients allowed us to use previously unreported data, improve the assessment of study quality, standardise outcome measures, undertake intention to treat analysis, and use optimal analytical methods. Subgroup, repeated measures, and multivariable analyses would not have been possible without the collection of individual data.
We were unable to retrieve individual data from at least 35% of randomised women because researchers did not agree to collaborate or could not be contacted. The data that we did receive were sometimes incomplete and on occasion failed quality checks and so were unusable. The review’s inferences are also limited by the inconsistent outcome measures used across trials: studies involving endometrial destruction and Mirena focused on comparing reduction in bleeding, while hysterectomy trials focused on women’s satisfaction, quality of life, and use of resources.
We found that more women were dissatisfied after endometrial destruction than after hysterectomy, though this should be placed in the context of the longer operating time, total hospital stay, and recovery period for hysterectomy. Rates of dissatisfaction were relatively low for endometrial destruction, and it is an effective alternative for women with abnormal uterine bleeding who do not want amenorrhoea. While this review has shown that hysterectomy is a relatively safe operation, other studies with a more comprehensive follow-up of large populations have shown higher levels of morbidity after hysterectomy.6 In contrast, endometrial destruction has low rates of complication.66 All these factors need to be taken into account when considering any potential benefit of hysterectomy.
We found that second generation techniques, such as thermal balloon ablation (Thermachoice and Cavaterm),46 67 68 the Novasure device,48 or microwave (Microsulis),51 69 were at least as effective as first generation techniques. Moreover, they are simpler and quicker, require less skill on the part of the operator, and can be attempted under local anaesthetic. Importantly, fewer operative complications have been recorded. Thus they are clearly preferable to first generation techniques. The association between shorter uterine cavity length and lower dissatisfaction with second generation endometrial destruction could be because endoscopic treatment is technically more difficult, though given the borderline statistical significance it could also have arisen by chance.
The comparisons involving Mirena were encouraging, and given that it is a relatively cheap and minimally invasive procedure, it could be considered first if drug treatment for heavy bleeding fails.70 It could even be an alternative to oral drug treatment as a first line agent, but we did not examine this question in our review. The current body of evidence comparing Mirena with more invasive techniques, however, is limited and prevents us from drawing any strong conclusions about this treatment. Furthermore, research on Mirena presents some specific difficulties in interpretation because of the high proportion of women discontinuing treatment. This can be seen in the trial by Hurskainen et al,41 71 which compared Mirena with hysterectomy. While the study was well conducted and reported, the lack of further analysis of the primary outcome measure (EuroQol EQ-5D) makes questionable the interpretation that there was no evidence of a difference. Of the 119 women allocated to Mirena, 24 (20%) had undergone hysterectomy before the main analysis time point at 12 months, with a further 13 (11%) no longer using the Mirena. Unfortunately, missing data from individual patients in this trial meant we could not investigate further.
Implications for practice
Our review provides evidence that hysterectomy reduces dissatisfaction compared with endometrial destruction, and this information should be used as part of consultation with women making a choice about treatment options when initial drug treatment fails to control heavy menstrual bleeding. Endometrial destruction is satisfactory for a high proportion of women, but, if complete cessation of bleeding is sought, then hysterectomy could be offered. Although the evidence is not strong, our findings concur with a recent NICE recommendation that women should be offered Mirena before more invasive procedures, particularly as this can be offered in primary care.11
Implications for research
Further investment in a randomised controlled trial comparing hysterectomy with second generation endometrial destruction would be of limited value, given the similar efficacy of first and second generation techniques. Questions remain about the long term clinical effectiveness of all the treatments; evidence from trials with longer term follow-up (four years or more) is limited to a handful of studies involving differing comparisons.40 69 72 73 71 74 In particular, comparisons of Mirena with surgical treatments require further research. While the small studies included in this review have indicated promising results for this treatment, the substantial levels of non-compliance make interpretation of outcomes difficult and cast some doubt on the equivalent efficacy conclusions. The cost effectiveness of all these treatments is currently being examined in a concurrent study. Issues such as discontinuation of Mirena will be important factors.
Meta-analysis of data from individual patients is an extremely powerful tool if used correctly75 and provides the most definitive possible synthesis of the available evidence. Such collaborative meta-analyses are well established in cancer research and have greatly influenced clinical practice, resulting in striking improvements in, for example, survival after breast cancer.28 Clinicians in specialty groups, such as gynaecology, need to be aware that contributing data from individual patients is certainly as important as conducting the original research, if not more so. Consensus on optimal outcome measures would also be helpful for meta-analysis.
What is already known on this topic
Less invasive alternatives to hysterectomy for the treatment of heavy menstrual bleeding, such as endometrial destruction and the levonorgestrel releasing intrauterine system (Mirena), have become increasingly popular
What this study adds
More women are dissatisfied after first or second generation endometrial destruction techniques than after hysterectomy, although rates are low after all treatments
Second generation non-hysteroscopic endometrial destruction techniques are preferable to first generation techniques
Dissatisfaction with treatment in studies of heavy menstrual bleeding is associated with reduced quality of life
Cite this as: BMJ 2010;341:c3929
We thank all authors of identified trials for sending us their trial data and the British and European Societies of Gynaecological Endoscopy for their support and help with the review.
Members of collaborative group
The following authors provided us with data on individual patients: J Abbott, University of New South Wales, Sydney, Australia; J Barrington, Torbay Hospital, South Devon; S Bhattacharya, University of Aberdeen, Aberdeen Maternity Hospital, Aberdeen; M Y Bongers, Maxima Medical Centre, Veldhoven, Netherlands; J-L Brun, Hopital Universitaire Pellegrin, Bordeaux, France; R Busfield, data supplied by M Sowter, Auckland Obstetric Centre, New Zealand; T J Clark, Birmingham Women’s Hospital, Birmingham; J Cooper (2004 trial), data supplied by Microsulis Medical, Hampshire; K G Cooper, Aberdeen Royal Infirmary, Aberdeen; S L Corson (2001 trial), data supplied by Boston Scientific Corporation, Marlborough, USA; K Dickersin, Johns Hopkins Bloomberg School of Public Health, USA; N Dwyer, Weston General Hospital, Weston Super Mare; M Gannon, Midland Regional Hospital, Mullingar, Ireland; J Hawe, Countess of Chester Hospital, Chester; R Hurskainen, University of Helsinki, Finland; W R Meyer, data supplied by Ethicon, Johnson and Johnson, New Jersey, USA; H O’Connor, Coombe Women’s Hospital, Dublin 8, Ireland; S Pinion, Aberdeen Royal Infirmary, Aberdeen; A M Sambrook, Aberdeen Royal Infirmary, Aberdeen; W H Tam, Chinese University of Hong Kong, Prince of Wales Hospital, Hong Kong, China; I A A van Zon-Rabelink, Medical Spectrum Twente, Enschede, Netherlands; E Zupi, Tor Vergata University, Rome, Italy
Contributors: KSK, JPD, and SB conceived the idea for the review and, with KC, RG, RC, and LJM, developed the protocol. JPD and RC carried out literature searches and retrieved the identified papers. RC acted as the group secretariat for the collection and amalgamation of individual patient data. RC and LJM were first and second reviewers for data extraction for trials where individual patient data were not available. JPD acted as the third reviewer if consensus could not be reached between RC and LJM. RC and JPD were first and second reviewers for the trial quality data extraction. NHH developed the master database. LJM performed the statistical analysis and wrote the initial draft of the manuscript and all subsequent drafts with input from the other members of the writing committee (RC, JPD, SB, KC, NHH, RG, POD, MG, and KSK) and the remainder of the collaborative group. LJM and KSK are guarantors.
Funding: This review was funded by the Health Technology Assessment Programme of the National Institute for Health Research (05/45/02).
Competing interests: All authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any institution for the submitted work; no financial relationships with any institutions that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work. SB, KC, POD, and MG were authors of papers included in the review.
Ethical approval: Not required.
Data sharing: No additional data available.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.