NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
Grant MD, Marbella A, Wang AT, et al. Menopausal Symptoms: Comparative Effectiveness of Therapies [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2015 Mar. (Comparative Effectiveness Reviews, No. 147.)
Overview
Agents
Almost 20 specific agents were included in the literature search. Additional unique nonprescription agents were identified as well. Agents were categorized according to the scheme in Table 6. Hormones were further classified according to estrogen dose and route of administration (see Appendix D for dose categorization by route of administration). The hormone general category in Table 6 includes estrogen alone, estrogen/progestogen, testosterone, and progesterone alone. “Menopausal hormone therapy” in the text refers to estrogen (for women without uteri) and estrogen/progestogen (for women with intact uteri). When testosterone or progesterone was used alone, this was explicitly stated. No trials of compounded estrogen formulations met inclusion criteria. A discussion of compounded hormone therapies appears at the end of the KQ1 results section.
Results are organized by Key Question. For KQ1, the results are presented by the six outcome categories: vasomotor symptoms, quality of life, psychological symptoms, sexual function, urogenital atrophy, and sleep disturbance. Within each of these six categories, there are the following sections: a summary table of the included trials; a presentation of the quantitative synthesis (either network meta-analysis or pairwise comparisons) for those trials with data that was amenable to pooling; a strength of evidence assessment for the evidence that was synthesized; a summary of the trials that were not amenable to a quantitative synthesis; and key points.
KQ2 and KQ3 results are presented by condition: breast cancer; gallbladder disease; colorectal cancer; coronary heart disease, stroke, and venous thromboembolism; endometrial cancer; osteoporotic fractures; and ovarian cancer. KQ3 includes an additional discussion of adverse events.
KQ4 results are organized by the six outcome categories, as listed in the KQ1 description.
Results of Literature Searches
The literature search identified 9,655 records, with an additional 72 records identified through the gray literature search and hand searching of bibliographies. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)78 diagram shown in Figure 2 depicts the flow of search screening and study selection. Of the 9,727 abstracts screened, 1,355 full-text articles were assessed for inclusion. For KQ1, 735 full-text articles were screened, with 271 records included. Twelve of those records presented results for two distinct trials, so those publications were given two unique reference numbers and were counted as two trials, for a total of 283 trials included in KQ1. For KQ2, a systematic review by Nelson et al.,28 published in May 2012, contained the most current literature review addressing the same outcomes as this Key Question. This systematic review therefore became the primary source for KQ2. For KQ3, 72 articles were screened, with 14 studies included: eight RCTs, two cohort studies, and four case-control studies. Twenty-seven trials from KQ1 included subgroup analyses of interest and were the evidence base for KQ4.
The list of excluded studies with reasons for exclusion is presented in Appendix B.
Key Question 1. Effectiveness of Different Treatments for Postmenopausal Symptoms
Description of Included Studies
Two hundred and eighty-three trials were included in this Key Question, providing results for the following outcomes: vasomotor symptoms (211 trials), quality of life (125 trials), psychological symptoms (90 trials), sexual function (76 trials), urogenital atrophy (63 trials), and sleep disturbance (48 trials). Some trials contributed results to more than one outcome.
Evidence synthesis was dependent on the number of trials with comparators and outcomes that could be appropriately pooled. When the number of trials allowed for a synthesis of outcomes by comparator group, either meta-analyses or pairwise comparisons were performed. Strength of evidence was then determined. When there were not enough trials for certain comparators and outcomes, synthesis was not possible and strength of evidence was not determined. Descriptions of these trials are provided.
Results for KQ1 are presented by outcome. Within each of these six categories, there are the following sections: a summary table of the included trials; a presentation of the quantitative synthesis (network meta-analysis and/or pairwise comparisons) for trial data amenable to pooling; a strength of evidence rating for synthesized evidence; a summary of the trials that were not amenable to a quantitative synthesis; and key points.
Navigating Key Question 1 Results
Owing to the use of different outcome scales, all results were quantified in a standardized effect metric, the standardized mean difference (SMD). Interpreting results when continuous effect measures and multiple scales are used is challenging; it is difficult to infer proportions of women achieving minimally clinically important improvements.79, 80 The GRADE Working Group has suggested alternative approaches to SMDs for analysis and interpretation of continuous outcomes: transformation to a common scale, conversion to relative or absolute effects, ratios of means, and analysis in minimally important difference (MID) units. Still, none is a substitute for differences in clinically meaningful response between treatments. With the exception of vasomotor symptoms, the alternative approaches were judged less than satisfactory, owing to the large number of instruments used (e.g., the need to define an MID for each).
Still, as a guide for interpretation and as noted in the methods, with control-group event rates of 20 to 60 percent, SMDs can be expressed as odds ratios: magnitudes of -0.2, -0.3, -0.4, -0.5, 0.4, 0.6, and 0.75 correspond to odds ratios of 0.7, 0.6, 0.5, 0.4, 2, 3, and 4, respectively. For example, the placebo response rate of women with vasomotor symptoms can range from approximately 20 to 40 percent.81-83
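The correspondence between SMDs and odds ratios can be illustrated with a short sketch. It assumes the widely used logistic approximation ln(OR) = (pi / sqrt(3)) x SMD; the conversion actually applied in this review is described in its methods and may differ, and the function names are illustrative only.

```python
import math

# Assumption: logistic-distribution approximation, ln(OR) = (pi / sqrt(3)) * SMD.
# This is one common conversion, not necessarily the one used in the review.
LOGIT_SCALE = math.pi / math.sqrt(3)  # about 1.814


def smd_to_odds_ratio(smd: float) -> float:
    """Convert a standardized mean difference to an approximate odds ratio."""
    return math.exp(LOGIT_SCALE * smd)


def odds_ratio_to_smd(odds_ratio: float) -> float:
    """Inverse conversion: approximate SMD for a given odds ratio."""
    return math.log(odds_ratio) / LOGIT_SCALE


if __name__ == "__main__":
    # A selection of SMD magnitudes; negative values favor the treatment.
    for smd in (-0.5, -0.4, -0.3, -0.2, 0.4, 0.6, 0.75):
        print(f"SMD {smd:+.2f} -> OR {smd_to_odds_ratio(smd):.2f}")
```

Under this approximation the conversion is symmetric: an SMD of x and an SMD of -x map to reciprocal odds ratios.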
Except for sexual function and psychological outcomes, results are displayed first as a grid or matrix displaying comparisons among multiple treatments or agents. When a network meta-analysis was performed (vasomotor symptoms, quality of life, and sleep outcomes), all comparisons, both direct and indirect, are represented as estimated by the model. For pairwise results, only direct comparisons are displayed. Table 7 displays how comparisons are presented in the grid or matrix form. Forest plots for pairwise comparisons can be found in the appendixes. When a network meta-analysis was performed, a table of rank efficacy for treatments is shown. Finally, a graphical representation is provided as a caterpillar plot that summarizes all pooled estimates or forest plots, which can be found in the appendixes. Note that for the network meta-analyses, the plot incorporates all possible comparisons between agents in the analyses, whereas for others, only pairwise pooled (not single-trial) comparisons are shown.
Strength of evidence ratings are provided in the text and in tabular form for comparisons with placebo involving multiple trials and between active comparators where multiple trials were able to be pooled (e.g., between different estrogen doses or routes of administration). All comparisons represented by single trials were judged insufficient.
Vasomotor Symptoms
Key Points
- A total of 211 trials including over 53,000 women examined treatment of vasomotor symptoms with prescription agents (estrogen, SSRIs, SNRIs, gabapentin, progestogens, eszopiclone, and clonidine) and nonprescription agents (isoflavones, black cohosh, vitamin E, flax seed, St. John’s wort, ginseng, and a variety of herbs and other agents).
- Study quality was generally rated poor (75 percent). The sole funding source was industry for 105 trials and public for 31 trials. A combination of industry and public funding was noted in 12 trials. Funding was not identified for 63 trials.
- Amelioration of vasomotor symptoms was measured using a number of different patient-reported outcomes; most trials included some metric of hot flushes.
- Strength of evidence of the comparative effectiveness of agents in relieving vasomotor symptoms is as follows:
- There is high strength of evidence that estrogen is the most effective agent for relieving vasomotor symptoms. Combined results of trials that included a total of more than 22,000 women showed that the SMD is -0.5 or lower, corresponding to approximately two to three fewer hot flushes per day, compared with placebo.
- There is high strength of evidence that SSRIs or SNRIs improve vasomotor symptoms compared with placebo: SMD -0.35 (95% CI: -0.46 to -0.24; 13 trials, n=4,037).
- There is moderate strength of evidence that gabapentin improves vasomotor symptoms compared with placebo: SMD -0.28 (95% CI: -0.38 to -0.19; 5 trials, n=1,936).
- There is low strength of evidence, owing to inconsistency, potential bias, and potential reporting bias, that isoflavones improve vasomotor symptoms compared with placebo: SMD -0.31 (95% CI: -0.41 to -0.22; 35 trials, n=4,022).
- There is low strength of evidence that black cohosh (SMD -0.31, 95% CI: -0.46 to -0.15; 4 trials, n=663) or ginseng (SMD -0.17, 95% CI: -0.43 to 0.09; 3 trials, n=513) improves vasomotor symptoms compared with placebo.
- There is insufficient evidence on the effectiveness of other agents.
- Analyses comparing effectiveness of treatments show estrogens alleviate vasomotor symptoms best, with the following mean rankings (1 being best, 9 worst—placebo ranked 8.9): high-dose estrogens (1.9), standard-dose estrogens (1.3), and low-dose estrogens (2.9). The nonhormone treatments were ranked much lower: SSRI/SNRI (4.9), gabapentin (5.6), isoflavones (5.9), black cohosh (6.7), and ginseng (7.0).
Included Trials
Of the 283 trials included in this review for KQ1, treatment effects on vasomotor symptoms were reported in 211 trials (74.6 percent). The trials included over 53,000 women enrolled at more than 3,800 sites. Twenty-two trials (10.4 percent) were multinational whereas 189 (89.6 percent) nonmultinational trials were conducted in 30 countries including Ecuador, Estonia, Greece, Islamic Republic of Iran, Norway, Singapore, Spain, Switzerland, Ukraine, Austria, Sweden, Thailand, Japan, Finland, Hong Kong, Netherlands, Brazil, Denmark, France, India, South Korea, Taiwan, China, Turkey, Australia, Canada, Germany, United Kingdom, Italy, and the United States (in order of increasing numbers with 71 United States trials).
The mean ages of women enrolled in individual trials ranged from 43.8 to 63.5 years (not reported in 28 trials). The average number of years since menopause (4.1 years overall) was reported in 70 trials (33.1 percent). Race or ethnicity was reported in 76 trials (36.0 percent) (Table 8). The presence or absence of a uterus was stated in 158 trials (74.9 percent), and most (n=90, 42.7 percent) enrolled women in either category. Mean body mass index was noted in approximately two-thirds of trials and ranged from 17.3 to 29.3 kg/m2. Other trial characteristics are shown in Table 8.
Approximately two-thirds of trials randomized women to 2 arms and the remainder to multiple arms. Followup ranged from 4 weeks (for trials of centrally acting agents including SSRIs, SNRIs, and gabapentin) to more than 5 years, with a mean of 24.7 weeks. The most commonly studied agents were hormones (116 trials, 55.0 percent) administered by various routes and isoflavones (40 trials, 19.0 percent). Agents examined in fewer trials included SSRIs, SNRIs, eszopiclone, clonidine, methyldopa, gabapentin, black cohosh, St. John’s wort, ginseng, flax seed, vitamin E, dong quai, DHEA, other herbal ingredients, and combinations of nonprescription agents.
Vasomotor symptoms were ascertained and reported in different ways; 93 trials (55.9 percent) used two or three metrics. The most common metric was hot flush frequency, reported daily, weekly, or both, and sometimes monthly. Daily occurrence was analyzed if reported, followed by weekly, and then monthly. Other instruments and metrics included hot flush severity, night sweats, indices combining frequency and severity of hot flushes, visual analogue scales, graphic rating scales, women experiencing greater than 50 or 80 percent improvement, and vasomotor scale components (e.g., Greene Climacteric Scale, MENQOL, WHQ, MRS, Kupperman Menopausal Index). The vasomotor domains of specific scales were as follows:
- Greene Climacteric Scale includes one hot flush and one night sweat item each rated 0 (none) to 3 (severe).
- WHQ includes one hot flush and one night sweat item rated as 0 (not at all) to 3 (definitely).
- MENQOL vasomotor domain includes hot flushes, night sweats, and sweating items scaled from 0 (not at all bothered) to 6 (extremely bothered).
- Kupperman Menopausal Index includes one hot flush item, scaled from 0 (none) to 3 (severe).
- MRS includes a rating of hot flushes and sweating, scaled from 0 (none) to 4 (very severe).
Some measure of hot flush frequency was reported in 132 trials (62.6 percent), hot flush severity in 63 (29.9 percent), night sweats in 25 (11.8 percent), combined hot flush and night sweats in 19 (9.0 percent), Greene vasomotor scale in 26 (12.3 percent), Kupperman vasomotor in 21 (10.0 percent), MENQOL vasomotor in 25 (11.8 percent), WHQ vasomotor in 11 (5.2 percent), MRS in 9 (4.5 percent), and another measure in 33 (15.6 percent). We included in the analyses the most commonly reported outcome metric (hot flush frequency), followed by the next most common (severity), and so on. Overall, 147 (69.7 percent) trials reported hot flush frequency, severity, and/or night sweats.
Most trials were rated as poor quality (n=158, 74.9 percent); 26 (12.3 percent) fair and 24 (11.4 percent) good quality. The funding source was not stated for 63 trials (29.9 percent), 105 (49.8 percent) appeared wholly industry sponsored, 12 (5.7 percent) reported some industry funding, and 31 (14.7 percent) funding only from public sources. Table 8 displays further detail summarizing trial and patient characteristics.
Evidence Synthesis
Meta-Analysis
Treatments studied in multiple trials and of likely greatest clinical interest included estrogens (high-, standard-, and low/ultralow-dose), SSRI/SNRIs, gabapentin, isoflavones, black cohosh, and ginseng. Comparisons between one or more nonplacebo treatments were reported for all treatments except ginseng and gabapentin. Comparative efficacy of these agents was examined in a network meta-analysis including results from 157 trials. Figure 3 displays the network of comparisons. Data were most extensive for estrogens (n=133 comparisons) followed by isoflavones (n=37), SSRI/SNRIs (n=14), black cohosh (n=8), gabapentin (n=5), and ginseng (n=3) (comparisons exceed trial total owing to multi-arm trials).
Four trials were examined only in sensitivity analyses owing to inconsistencies with the network and clinically or numerically improbable estimates. One trial84 found black cohosh superior to fluoxetine (SMD -0.49, 95% CI: -0.94 to -0.05). SMDs from three trials were judged not numerically plausible: one reported effectively complete resolution of hot flushes with both estrogen and isoflavones,85 and two reported no placebo effect and SMD magnitudes inconsistent with other placebo comparisons (SMD -1.81, 95% CI: -2.26 to -1.36 for black cohosh86 and SMD -3.13, 95% CI: -4.33 to -1.94 for isoflavones87). The network estimates were otherwise generally consistent (Appendix F, Figure F-11 and Table F-1), but these results suggested examining the influence of black cohosh trial results. Additionally, owing to the large number of trials and their various reported characteristics, other sensitivity analyses were also performed. The sensitivity analyses comprised networks 1) restricted to trials specifying vasomotor symptoms as a primary outcome or requiring symptoms for inclusion, 2) excluding trials judged to have included women without vasomotor symptoms, 3) excluding all black cohosh trials (owing to some evidence of inconsistency), 4) restricted to trials rated good or fair quality, 5) restricted to trials examining effects on moderate to severe hot flushes, and 6) excluding trials focused on disease prevention.
To facilitate interpreting effects across multiple scales that required pooling standardized effect sizes, we transformed effects79, 80 to hot flush frequencies. Predicted comparative reductions in daily hot flushes corresponding to standardized effect sizes were obtained by fitting a piecewise regression model (quadratic for SMDs less than 0 and linear otherwise) to pooled results from trials reporting hot flush reductions accompanying standard dose estrogen, low dose estrogen, and SSRI/SNRIs. The transformation from standardized effects to hot flush frequency reduction assumes that the relationship between SMDs and hot flushes applies across the various scales. That assumption cannot be tested, and the results are therefore appropriately used only to assist interpretation. However, as the majority of data pooled were obtained from some hot flush measure, the predicted estimates are plausibly accurate and are similar in magnitude to those reported in placebo comparison meta-analyses restricted to studies reporting hot flush frequencies.25 Finally, these results were similar when the conversion was restricted to trials reporting moderate-to-severe hot flushes.
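The form of such a piecewise model can be sketched as follows. The parameterization is an assumption made for illustration (a line through the origin plus a quadratic term active only for negative SMDs, which keeps the curve continuous at 0); the review does not give its exact specification here. The coefficients and data in the example are synthetic, not estimates from the included trials.

```python
def design_row(smd):
    """Features for the assumed model: a linear term everywhere, plus a
    quadratic term that is nonzero only when smd < 0."""
    return smd, min(smd, 0.0) ** 2


def fit_piecewise(smds, hot_flush_diffs):
    """Least-squares fit of diff = a*smd + b*min(smd, 0)^2 (no intercept).

    Solves the 2x2 normal equations directly.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for smd, y in zip(smds, hot_flush_diffs):
        x1, x2 = design_row(smd)
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        t1 += x1 * y
        t2 += x2 * y
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - s12 * t2) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


def predict(smd, a, b):
    """Predicted difference in daily hot flushes (negative = fewer)."""
    x1, x2 = design_row(smd)
    return a * x1 + b * x2


if __name__ == "__main__":
    # Synthetic data generated from hypothetical coefficients.
    true_a, true_b = 4.0, -2.0
    smds = [-0.8, -0.6, -0.4, -0.2, 0.0, 0.2, 0.4]
    diffs = [predict(s, true_a, true_b) for s in smds]
    a, b = fit_piecewise(smds, diffs)
    print(f"fitted a={a:.3f}, b={b:.3f}")
```

Because the quadratic term vanishes at and above 0, the fitted curve is linear for nonnegative SMDs and curved for negative ones, matching the description above.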
Table 9 and Figure 4 display estimated SMDs and 95% credible intervals from the fitted model. Negative values represent comparative improvements in vasomotor symptoms. In Table 9, the bottom row shows SMDs comparing each treatment with placebo, the next row up shows SMDs comparing each treatment with ginseng, and so forth. Of all comparators, estrogens appeared the most effective in relieving vasomotor symptoms; only the credible interval for the indirect comparison of low/ultralow dose estrogens with gabapentin did not exclude 0. The magnitudes of effect for SSRI/SNRIs, isoflavones, gabapentin, black cohosh, and ginseng were substantially lower. Table 10 and Figure 5 display rankings of efficacy with estrogens consistently the highest ranked, followed by SSRI/SNRIs, gabapentin, isoflavones, black cohosh, and ginseng. Similar results for effect magnitudes were obtained across the sensitivity analyses, with some differences in credible intervals and rankings attributable to smaller numbers of included trials (Appendix F, Tables F-2 through F-13).
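Mean rankings of this kind are typically derived by ranking treatments within each posterior draw of the network model and averaging the ranks. The sketch below illustrates that calculation under stated assumptions: the draws are simulated from hypothetical normal distributions rather than taken from the review's fitted model, and the treatment names and values are placeholders.

```python
import random


def mean_ranks(draws):
    """Mean rank per treatment across posterior draws.

    Each draw maps treatment name -> SMD versus placebo. Rank 1 goes to the
    most negative SMD (greatest improvement in symptoms).
    """
    totals = {name: 0.0 for name in draws[0]}
    for draw in draws:
        ordered = sorted(draw, key=draw.get)  # most negative first
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    return {name: total / len(draws) for name, total in totals.items()}


def simulate_draws(means, sds, n_draws=2000, seed=0):
    """Hypothetical stand-in for MCMC output from a network meta-analysis."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        draw = {name: rng.gauss(means[name], sds[name]) for name in means}
        draw["placebo"] = 0.0  # reference treatment
        draws.append(draw)
    return draws


if __name__ == "__main__":
    # Hypothetical effect sizes, for illustration only.
    means = {"treatment_a": -0.6, "treatment_b": -0.3}
    sds = {"treatment_a": 0.05, "treatment_b": 0.10}
    for name, rank in sorted(mean_ranks(simulate_draws(means, sds)).items()):
        print(f"{name}: mean rank {rank:.2f}")
```

A useful consistency check is that mean ranks across k treatments sum to k(k+1)/2, apart from rounding in reported values.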
Figure 6 displays effects transformed to comparative daily hot flush frequency reductions. Compared with placebo, estrogens were accompanied by reductions of two to three hot flushes per day, and the remaining agents by reductions of approximately one or fewer.
Finally, Table 11 displays results from pairwise meta-analyses for all direct comparisons. Heterogeneity was evident for comparisons of standard dose estrogen and isoflavones with placebo—both including a large number of comparisons. This is most likely attributable to underlying clinical heterogeneity and samples of women having a wide range of symptoms.
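For readers unfamiliar with how between-trial heterogeneity enters a pairwise pooled estimate, the following is a minimal sketch of random-effects pooling. It assumes the DerSimonian-Laird estimator of tau2, which is a common choice but is not confirmed here as the method used for Table 11; the trial effects and variances in the example are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate using the DerSimonian-Laird tau2.

    effects: per-trial SMDs; variances: their within-trial variances.
    Returns (pooled SMD, (lower, upper) 95% CI, tau2).
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures dispersion of trial effects around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)
    # Random-effects weights add tau2 to each within-trial variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2


if __name__ == "__main__":
    # Hypothetical trial results, for illustration only.
    effects = [-0.70, -0.35, -0.55, -0.20]
    variances = [0.010, 0.020, 0.015, 0.030]
    pooled, ci, tau2 = dersimonian_laird(effects, variances)
    print(f"pooled SMD {pooled:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f}), "
          f"tau2 {tau2:.3f}, tau {math.sqrt(tau2):.2f}")
```

The square root of tau2 is the between-study standard deviation of true effects, so a tau2 of 0.02 corresponds to a standard deviation of about 0.14.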
Estrogen Compared With Placebo
There were 101 pairwise comparisons of placebo with estrogen—nine high-dose (one good, one fair, and seven poor quality trials), 39 standard dose (three good, six fair, and 30 poor quality trials), and 53 with low/ultralow dose (two good, nine fair, and 42 poor quality trials). The magnitudes of pooled SMDs for all doses of estrogen were comparatively large and the estimates precise. Although most trials were rated poor quality, given consistency over a large number of comparisons the strength of evidence that estrogens (of any dose) improve hot flush symptoms is rated high.
Estrogen Compared With Estrogen
Comparisons among estrogens included 12 high versus standard dose (one good, one fair, and 10 poor quality trials), five high versus low/ultralow dose (all poor quality trials), and 24 standard versus low/ultralow dose (one good, four fair, and 19 poor quality trials). Direct effects were derived from 41 trials, of which, five were rated as good or fair quality. Pooled estimates differed only between standard and low/ultralow dose categories. However, heterogeneity was substantial in the pairwise analysis (tau2=0.02 or a between-study effect standard deviation of 0.14). Moreover, there was no apparent dose-response across high, standard, and low/ultralow dose estrogens compared with placebo—respective SMDs -0.50, -0.64, and -0.55. The strength of evidence that there is similar improvement in vasomotor symptoms across estrogen doses is rated moderate.
Isoflavones Compared With Placebo
Pairwise comparisons of isoflavones with placebo were included from 35 trials (five good, two fair, and 28 poor quality). The funnel plot and Egger test (p=0.017) were consistent with possible publication bias. Limiting the pairwise analysis to the seven fair and good quality trials yielded an SMD of -0.12 (95% CI: -0.31 to 0.08; tau2=0.04). SMDs in seven trials favored placebo (see Figure F-5 in Appendix F). The strength of evidence that isoflavones improve hot flush symptoms compared with placebo is rated low.
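The Egger test referred to above can be sketched in its standard regression form: each trial's standardized effect (SMD divided by its standard error) is regressed on its precision (1 divided by the standard error), and an intercept far from 0 suggests funnel plot asymmetry. The code below is a minimal illustration with hypothetical inputs; it returns the intercept and its t statistic (to be compared against a t distribution with n - 2 degrees of freedom) and is not the software used in the review.

```python
import math


def egger_intercept(effects, std_errors):
    """Egger regression: standardized effect on precision, ordinary least squares.

    Returns (intercept, standard error of intercept, t statistic).
    Requires at least three trials.
    """
    n = len(effects)
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = resid_ss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + x_bar ** 2 / sxx))
    # Guard against a perfect fit, where the t statistic is undefined.
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


if __name__ == "__main__":
    # Hypothetical trials in which smaller studies show larger effects.
    effects = [-0.15, -0.25, -0.30, -0.55, -0.80, -0.60]
    std_errors = [0.05, 0.08, 0.12, 0.20, 0.30, 0.35]
    intercept, se, t_stat = egger_intercept(effects, std_errors)
    print(f"intercept {intercept:.2f} (SE {se:.2f}), t = {t_stat:.2f}")
```

A negative intercept in this setup indicates that less precise trials tend to report effects more favorable to treatment, the pattern expected under publication bias.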
Gabapentin Compared With Placebo
Comparisons of gabapentin with placebo were pooled from five trials (one good and two poor quality; two trials not rated owing to lack of complete publication). The estimated SMD was precise and significantly different from placebo. The strength of evidence that gabapentin improves hot flush symptoms compared with placebo is rated moderate.
SSRI/SNRI Compared With Placebo
There were 13 comparisons of SSRIs or SNRIs (including escitalopram, venlafaxine, desvenlafaxine, citalopram, fluoxetine, and paroxetine) with placebo (four good, three fair, and six poor quality trials). The SMD was precise and the effect differed from placebo (-0.37; 95% CrI: -0.51 to -0.23). The estimate was similar when limited to the good and fair quality trials in a pairwise analysis (-0.33; 95% CI: -0.42 to -0.24; tau2=0.006) or to trials of venlafaxine or desvenlafaxine alone (-0.36; 95% CI: -0.55 to -0.17; tau2=0.04; 6 trials). The strength of evidence that SSRIs or SNRIs improve hot flush symptoms compared with placebo is rated high.
Black Cohosh Compared With SSRI
Oktem et al.84 compared black cohosh with fluoxetine for treatment of menopausal symptoms; 120 women were randomized, and 85 (70.1 percent) were evaluated at 12 weeks. Trial quality was rated poor. Using a “monthly hot flush score,” the authors reported black cohosh superior to fluoxetine (SMD -0.49; 95% CI: -0.94 to -0.05). (As noted earlier, this trial result was not included in the network owing to inconsistency.)
Black Cohosh Compared With Placebo
Four trials compared black cohosh with placebo (two poor and two good quality) with a pooled SMD of -0.24 (95% CrI: -0.46 to -0.03). The strength of evidence that black cohosh improves hot flush symptoms compared with placebo is rated low.
Ginseng Compared With Placebo
Three trials compared ginseng with placebo (one fair and two poor quality)88, 89 yielding a pooled SMD of -0.20 (95% CrI: -0.51 to 0.12). The strength of evidence that ginseng improves vasomotor symptoms compared with placebo is rated low.
Different Routes of Estrogen Administration
Ten trials90-99 compared different routes (oral, topical, and nasal) of estrogen administration employing similar doses (one good and nine poor quality). Nine trials used a standard estrogen dose. Routes of administration were compared in network analysis demonstrating no differences between routes. Results are displayed in Figure 7. All credible intervals overlapped and SMDs were close to 0 (topical versus oral: -0.07, 95% CrI: -0.39 to 0.20; topical versus nasal: 0.02, 95% CrI: -0.27 to 0.29; oral versus nasal: 0.09, 95% CrI: -0.12 to 0.33). The strength of evidence that the effect of estrogens improving vasomotor symptoms does not differ according to route of administration is rated high.
Trials Not Pooled
If there were fewer than three trials with the same comparators, pooled analyses (meta-analysis or paired comparisons) could not be performed.
Progesterone and Other Hormones Compared With Placebo
Five trials (Table 12) were identified that compared progesterone in different doses, either with estrogen100, 101 or alone,102-104 for relief of vasomotor symptoms. Three of the trials administered progesterone through a cream,102-104 one through a patch,100 and one orally.101 Among the trials using cream, one found significant vasomotor symptom relief with low doses of progesterone,104 with a standardized mean difference of -1.67 (95% CI: -2.26 to -1.06). The other two progesterone cream trials reported no significant symptom relief.102, 103 Rozenberg et al. reported that both sequential and continuous administrations of transdermal estrogens/progesterones were as effective as a combination estrogen patch and oral progesterones.100 Gambacciani et al. reported equally significant improvements in vasomotor symptoms among several combinations of estrogens/progesterones.101 Because trials studied different therapy combinations, the strength of evidence was not rated.
Other Prescription Agents Compared With Placebo
One trial compared eszopiclone, a sedative hypnotic, with placebo for the relief of vasomotor symptoms (Table 13).105 In this randomized, double-blind, placebo-controlled crossover trial, half the participants (n=30) received eszopiclone patches for four weeks, followed by a two-week washout period, and then four weeks of placebo patches. The other half of the participants (n=29) received the placebo patches first, followed by the eszopiclone patches. There was no difference between eszopiclone and placebo in the relief of vasomotor symptoms.105
One trial compared clonidine with placebo and reported mean change in weekly hot flushes (Table 13).106 In this double-blind, placebo-controlled crossover trial, treatment lasted four weeks. Treatment with clonidine resulted in 19.2 fewer hot flushes per week while 13.1 fewer hot flushes per week were reported during the placebo phase. The SMD was -0.08 (95% CI: -0.51 to 0.35).
Other Nonprescription Agents Compared With Placebo
Twenty-seven trials, not appropriate for pooling, compared nonprescription treatments with placebo for the relief of vasomotor symptoms (Table 14). Nonprescription treatments included various herbal or plant extracts,107-125 black cohosh,126-128 St. John’s wort,126, 128, 129 DHEA,130 and other nutritional supplements.131-133 Eleven of the trials showed significant improvements in vasomotor symptoms compared with placebo: two trials that combined black cohosh with St. John’s wort,126, 128 and one trial each of Nutrafem® (mung beans and eucommia bark),108 pine extract,110 isoflavones/lactobacilli/magnolia bark,111 rheum rhaponticum,112 Femal® (pollen and pistil extract),113 Estro-G 100 (cynanchum wilfordii, phlomis umbrosa, angelica gigas),119 Jiawei Qing’e Fang,120 a combination of Chinese herbs,123 and a combination of micronutrients.133 The variety of treatments and dosages among these 27 trials did not allow for pooling effects.
Estrogen Compared With a Nonprescription Agent
Two trials (Table 15) compared estrogen, with or without progestin, with a nonprescription treatment, pueraria mirifica134 and licorice135 in one trial each, for the relief of vasomotor symptoms. Pueraria mirifica is a highly estrogenic herb found in Thailand and licorice is a plant with estrogenic properties. In the pueraria mirifica trial, both hormone therapy and pueraria mirifica reduced hot flushes equally well. After three months of followup, pueraria mirifica reduced the average Greene score from 2.1 to 0.55 and estrogen treatment reduced the score from 2.1 to 0.35.134 In the licorice trial, only the estrogen and progestin treatment significantly reduced the number of hot flushes, though the difference between the two treatment groups was not significant.135
Nonprescription Agents Compared
Four trials (Table 16) compared nonprescription agents for relief of vasomotor symptoms. In one trial, two different doses of pueraria mirifica were equally effective in relieving vasomotor symptoms,136 and in another trial, two different doses of isoflavones were equally effective in relieving vasomotor symptoms.137 One trial compared isoflavones alone with isoflavones and magnolia bark. Both treatments were equally effective in relieving vasomotor symptoms.138 In a trial comparing vitamin E with isoflavones, isoflavones significantly improved vasomotor symptoms compared with vitamin E. After one year of followup, 41.9 percent of the isoflavones group and 16.1 percent of the vitamin E group reported no more hot flushes (p<0.05).139
Trials Without Quantifiable or Poolable Data
Five trials lacked sufficient data to estimate an effect size or would have yielded a problematic estimate. Results of these trials would not have affected the overall outcomes presented above.
Raynaud et al. conducted a three-arm trial using transdermal patches with low, standard, and high doses of estrogen.140 All doses were considered effective, using percent reporting greater than a 50 percent reduction in weekly hot flushes as an outcome: 99.2 percent of women treated with the low dose patch, 100 percent of women treated with the standard dose patch, and 97 percent of the women treated with the high dose patch.140
Hidalgo et al. conducted a trial comparing two different doses of a treatment that combined isoflavones, primrose oil, and vitamin E. Both doses worked similarly in reducing the Blatt-Kupperman hot flush score.141
A trial comparing oral (n=35), gel (n=25), and patch (n=28) administrations of estrogen with or without progestogen collected information on complete symptom relief of vasomotor symptoms. The authors reported the following percentages experiencing complete vasomotor symptom relief: oral 62 percent, gel 95 percent, and patch 100 percent.99
In the series of SMART (Selective estrogens, Menopause, And Response to Therapy) trials, low and standard doses of conjugated estrogens were combined with different doses of bazedoxifene and compared with placebo. The SMART-1 trial performed an analysis on a subset of subjects who had greater than or equal to seven moderate to severe hot flushes per day (n=216). Lobo et al. reported that all treatment dosages significantly reduced the frequency of hot flushes, but the number in each treatment group was not provided.142
Gupta et al. conducted a trial comparing conjugated equine estrogen, DHEA, and placebo. The authors did not report the proportions of women experiencing vasomotor symptoms at baseline for any of the groups. At followup, 36 percent of the placebo group, 12 percent of the estrogen group, and 16 percent of the DHEA group reported hot flushes.143
Strength of Evidence Ratings—Vasomotor Symptoms
Table 17 summarizes strength of evidence ratings.
Quality of Life
Key Points
- A total of 125 trials including over 58,000 women reported some measure of quality of life or general well-being after treatment with prescription (estrogen, SSRIs, SNRIs) and nonprescription agents (isoflavones, black cohosh, vitamin E, flax seed, ginseng, and a variety of herbs and other agents).
- Study quality was generally rated poor (73 percent). Industry was reported as the sole funding source for 55 trials, public funds alone supported 22 trials, and a combination of industry and public funding supported 9 trials. Funding support was not stated for 39 trials.
- Results were reported from a variety of scales—a majority used menopause-specific instruments.
- Strength of evidence of the comparative effectiveness of agents for improving measures of quality-of-life scores is as follows:
- There is high strength of evidence that estrogen of any dose is effective in improving measures of quality of life compared with placebo. Combined results of trials that included a total of more than 35,000 women showed SMDs between 0.40 and 0.55 compared with placebo. In a network meta-analysis, estrogens of any dose consistently ranked higher than SSRI/SNRI, isoflavones, black cohosh, or ginseng.
- There is high strength of evidence that SSRIs or SNRIs improve quality-of-life measures compared with placebo: SMD 0.28 (95% CI: 0.17 to 0.37; 6 trials, n=3,518).
- Strength of evidence ratings for other agents compared with placebo were either low (ginseng, isoflavones) or insufficient (black cohosh).
- Analyses comparing effectiveness of treatments show estrogens improve quality-of-life symptoms best, with the following mean rankings (1 being best, 8 worst; placebo ranked 7.8): standard dose estrogens (1.6), high dose estrogens (1.8), and low dose estrogens (3.6). The nonhormone treatments were ranked much lower: SSRI/SNRIs (4.9), isoflavones (5.1), black cohosh (5.9), and ginseng (5.5).
Included Trials
Of the 283 trials included in this review, 125 (44.2 percent) reported general well-being or quality-of-life outcomes (69 trials specified these as a primary outcome). Fifty-nine trials examined hormone treatment effects on these outcomes, including the following comparators: placebo (40 trials), other hormones (16 trials), and nonprescription treatments (three trials). Fifty-four trials examined nonprescription treatment effects including the following comparators: placebo (44 trials), other nonprescription treatments (three trials), hormones (two trials), and SSRIs (one trial). Nonprescription treatments included isoflavones, ginseng, black cohosh, DHEA, herbal extracts, and vitamins and minerals. Seven trials examined the effect of SSRI/SNRIs on quality of life compared with placebo (six trials) and nonprescription treatments (one trial). Desvenlafaxine, escitalopram, and fluoxetine were the SSRI/SNRIs included in the trials.
The 125 trials were conducted in over 29 countries; 16 trials were multinational. Trials conducted in single countries were most commonly from the United States (n=19), Italy (n=10), Germany (n=7), Australia (n=5), Brazil (n=5), and Turkey (n=5). Other countries included Austria, Canada, China, Denmark, France, Hong Kong, Netherlands, Taiwan, United Kingdom, India, South Korea, Thailand, Japan, Belgium, Ecuador, Estonia, Finland, Norway, Poland, Singapore, Spain, Sweden, Switzerland, and Ukraine. The trials were conducted in over 2,400 sites. Length of followup ranged from 8 to 187 weeks.
General well-being and quality-of-life outcomes were reported using a variety of scales, both general health-related quality-of-life scales and menopause-specific quality-of-life scales. A majority of the trials used menopause-specific scales (n=90), which focus on physical and psychological symptoms relating to menopause. Several trials used general health-related quality-of-life measures that include broader domains, such as the Short Form-36 (SF-36, sometimes referred to as Rand-36), EuroQol, Utian QOL, and 15D. The most common scales in the included trials were: Kupperman Menopausal Index (n=59), Greene Climacteric Scale (n=20), Menopause Rating Scale (MRS) (n=10), Menopause-specific Quality of Life (MENQOL) (n=14), and SF-36 (n=4). The following are brief descriptions of commonly used scales:
- The Kupperman Index is a numerical index that scores 11 menopausal symptoms: hot flushes, paresthesia, insomnia, nervousness, melancholia, vertigo, weakness, arthralgia or myalgia, headache, palpitations, and formication. Each symptom is rated from 0 to 3 according to severity, where 0 = no symptoms and 3 = most severe. The scores are weighted and a total sum is calculated. The maximum score is 51 points, with a higher score indicating a worse quality of life.
- The Greene Climacteric Scale includes 21 questions covering five domains: anxiety, depression, somatic symptoms, vasomotor symptoms, and sexual function. Each question is answered on a four-point Likert scale (0 – “not at all”; 1 – “a little”; 2 – “quite a bit”; 3 – “extremely”). The answers to all 21 questions are summed to give a total quality-of-life measure; a higher score indicates a worse quality of life.
- MENQOL consists of 29 questions covering four domains: vasomotor, psychosocial, physical, and sexual. The scoring for each question is 1 – “No”, 2 – “Yes, but not at all bothered” through 8 – “Yes, extremely bothered.” The scores for each question are summed for a total quality-of-life score, in which the higher score indicates a worse quality of life.
- MRS scores 11 menopausal symptoms: hot flushes, heart discomfort, sleep problems, depressive mood, irritability, anxiety, physical and mental exhaustion, sexual problems, bladder problems, vaginal dryness, and joint and muscular discomfort. Each item is scored from 0 – “none” to 4 – “extremely severe.” The scores are summed for a total quality-of-life score, in which a higher score indicates a worse quality of life.
- SF-36, or Rand-36, is a general quality-of-life scale, not created specifically for menopausal women. This scale consists of 36 questions covering the following eight domains: physical functioning, role limitations caused by physical health problems, role limitations caused by emotional problems, social functioning, emotional well-being, energy/fatigue, pain, and general health perceptions. The answer to each question is transformed linearly to a 0-100 score and then all items in one domain are averaged. This scale can be used to produce outcomes on a total quality of life, subscores for each of the domains, a physical health subscore, or a mental health subscore. For this scale, the higher the score, the better the quality of life.
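The scoring rules described above can be illustrated with a minimal Python sketch. The function names are hypothetical. The Kupperman item weights shown are the commonly cited ones and are an assumption, because this report states only that scores are weighted and that the maximum is 51. The SF-36 helper also assumes that all items in a domain share one raw range and direction, which simplifies the published scoring rules.

```python
# Illustrative scoring helpers; names are hypothetical.
# ASSUMPTION: the Kupperman weights below are the commonly cited ones and are
# not taken from this report, which gives only the 0-3 ratings and the maximum of 51.
KUPPERMAN_WEIGHTS = {
    "hot flushes": 4,
    "paresthesia": 2,
    "insomnia": 2,
    "nervousness": 2,
    "melancholia": 1,
    "vertigo": 1,
    "weakness": 1,
    "arthralgia or myalgia": 1,
    "headache": 1,
    "palpitations": 1,
    "formication": 1,
}


def kupperman_total(severity):
    """Weighted sum of 0-3 severity ratings over the 11 symptoms."""
    if set(severity) != set(KUPPERMAN_WEIGHTS):
        raise ValueError("ratings are required for exactly the 11 listed symptoms")
    if any(rating not in (0, 1, 2, 3) for rating in severity.values()):
        raise ValueError("each severity rating must be 0, 1, 2, or 3")
    return sum(KUPPERMAN_WEIGHTS[name] * rating for name, rating in severity.items())


def summed_score(ratings, n_items, low, high):
    """Unweighted total for scales scored as a plain sum of item ratings."""
    if len(ratings) != n_items:
        raise ValueError(f"expected {n_items} item ratings")
    if any(not low <= rating <= high for rating in ratings):
        raise ValueError(f"each rating must lie between {low} and {high}")
    return sum(ratings)


def sf36_domain_score(raw_answers, low, high):
    """Rescale each raw answer linearly to 0-100, then average within a domain.

    ASSUMPTION: every item shares the same raw range and is already oriented
    so that a higher raw answer means better health.
    """
    rescaled = [100.0 * (answer - low) / (high - low) for answer in raw_answers]
    return sum(rescaled) / len(rescaled)


# Worst possible totals for three of the symptom scales described above.
worst_kupperman = kupperman_total({name: 3 for name in KUPPERMAN_WEIGHTS})
worst_greene = summed_score([3] * 21, n_items=21, low=0, high=3)
worst_mrs = summed_score([4] * 11, n_items=11, low=0, high=4)
print(worst_kupperman, worst_greene, worst_mrs)  # prints: 51 63 44
```

Under the assumed weights, the worst-case Kupperman total reproduces the 51-point maximum given above. Note that the symptom scales and SF-36 run in opposite directions, so scores must be aligned before they are compared or pooled.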
Study quality was generally rated as poor (72.8 percent), with 18 good and 16 fair quality trials. Industry funding was indicated in 64 trials and public funding was reported in 31 trials. Table 18 describes additional trial and patient characteristics.
Evidence Synthesis for Quality of Life
Meta-Analysis
Treatments of greatest clinical interest and studied in multiple trials were compared in a network meta-analysis in addition to pairwise analyses: estrogens (according to dose), SSRI/SNRIs, isoflavones, black cohosh, and ginseng. Figure 8 displays the network and comparisons included. Data were most extensive for estrogens (72 comparisons), followed by isoflavones (24 comparisons), and SSRI/SNRIs (7 comparisons). The result from a trial concluding that women taking black cohosh had considerably better general well-being than those given fluoxetine84 was not incorporated in the main network analysis because the effect was qualitatively inconsistent (opposite in direction) with the other results. Finally, in sensitivity analyses, we excluded eight trials utilizing general quality-of-life measures.
Table 19 displays estimated standardized mean differences and 95% credible intervals from the fitted model. In the bottom row are SMDs comparing each treatment with placebo, in the penultimate row are SMDs comparing each treatment with ginseng, and so forth. Compared with placebo, the greatest improvements in quality-of-life scores were reported in women taking estrogens. The results suggested greater improvements with standard compared with low/ultralow dose estrogens (95% CrI: 0.01 to 0.29). Compared with placebo, SSRI/SNRIs and isoflavones were associated with effects of lesser magnitude that differed from 0. Neither black cohosh nor ginseng had statistically significant effects in the network analysis, although the pairwise result was consistent with an effect for ginseng. A sensitivity analysis excluding trials that used general health-related quality-of-life scales yielded comparable effect sizes and credible intervals and did not substantively change these results (Appendix G, Tables G-1 and G-2).
Figure 9 displays the SMDs estimated from the network. Table 20 lists comparative treatments ranked with accompanying uncertainty; a lower ranking represents greater improvement in reported quality-of-life scores. Although there is overlap of the credible intervals, estrogens appear to be superior to other agents in the network. Finally, Table 21 displays pooled effects from pairwise meta-analyses. There was little discrepancy with the network analysis, indicating that the network-estimated direct and indirect effects are likely accurate representations.74
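One common way to obtain mean treatment rankings with uncertainty from a Bayesian network model is to rank the treatments within each posterior draw and then average the ranks. The sketch below illustrates that idea only. It assumes posterior draws of each treatment's SMD against a common reference are available, and it is not a description of the software or procedure used in this review. The function name and the two draws are hypothetical.

```python
def mean_ranks(draws, higher_is_better=True):
    """Average rank of each treatment across posterior draws (1 = best).

    draws: list of dicts mapping treatment name -> sampled effect versus a
    common reference. Ties within a draw are broken by dictionary order.
    """
    totals = {treatment: 0 for treatment in draws[0]}
    for draw in draws:
        ordered = sorted(draw, key=draw.get, reverse=higher_is_better)
        for rank, treatment in enumerate(ordered, start=1):
            totals[treatment] += rank
    return {treatment: total / len(draws) for treatment, total in totals.items()}


# Two hypothetical draws, for illustration only (not data from this review).
example_draws = [
    {"treatment A": 0.6, "treatment B": 0.3, "placebo": 0.0},
    {"treatment A": 0.2, "treatment B": 0.4, "placebo": 0.0},
]
example_ranks = mean_ranks(example_draws)
print(example_ranks)
```

A mean rank that is not a whole number, such as those reported in the Key Points, reflects draws in which the ordering of treatments changes, which is how overlapping credible intervals show up in a ranking.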
Estrogen Compared With Placebo
There were 48 pairwise comparisons of estrogen with placebo: five with high-dose estrogen (one fair and four poor quality trials), 26 with standard dose (two good, six fair, and 18 poor quality trials), and 17 with low/ultralow dose (one good, six fair, and 10 poor quality trials). The estimated SMDs for high, standard, and low/ultralow estrogen doses were 0.76 (95% CI: 0.48 to 1.03; tau2=0.06), 0.55 (95% CI: 0.41 to 0.69; tau2=0.10), and 0.36 (95% CI: 0.27 to 0.45; tau2=0.05), respectively. The funnel plot of the standard dose estrogen−placebo comparison exhibited asymmetry, but this was attributable to three large trials focused on prevention and using general quality-of-life instruments.35, 144, 145 The mean ages of women in those trials were at the upper end of the distribution (62.8 to 63.6 years); excluding those trials yielded a symmetric funnel plot and an SMD of 0.64 (95% CI: 0.46 to 0.82; tau2=0.17; 23 trials) with notable heterogeneity. Further limiting the pooling by excluding poor quality trials resulted in an SMD of 0.65 (95% CI: 0.38 to 0.92; tau2=0.09; 6 trials). The magnitudes of pooled standardized mean differences for all dose categorizations of estrogen are large and the estimates are precise. Although many trials were rated poor quality, with consistency over a large number of comparisons, the strength of evidence that estrogens of any dose improve quality-of-life scores compared with placebo is rated high.
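The pooled SMDs above are reported with a between-trial variance (tau2), which indicates random-effects pooling. As a minimal sketch of how such a pooled estimate, confidence interval, and tau2 can be obtained, the following uses the DerSimonian-Laird estimator. That choice is an assumption made for illustration, because this section does not name the estimator used. The function name and the example inputs are hypothetical.

```python
import math


def dersimonian_laird(effects, std_errors):
    """Random-effects pooled estimate, 95% CI, and tau^2 (DerSimonian-Laird).

    ASSUMPTION: the estimator is chosen for illustration; the report gives
    tau2 values but does not state in this section how they were estimated.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_weight

    # Cochran's Q measures dispersion of trial effects around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = total_weight - sum(w ** 2 for w in weights) / total_weight
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Re-weight each trial by the inverse of its within- plus between-trial variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    pooled_se = math.sqrt(1.0 / sum(re_weights))
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, ci, tau2


# Hypothetical inputs for illustration only (not trial data from this review).
pooled, ci, tau2 = dersimonian_laird([0.2, 0.8], [0.1, 0.1])
print(round(pooled, 2), round(tau2, 2))
```

When trial results disagree more than their standard errors would predict, tau2 rises and the interval widens, which is why a larger tau2 is described above as notable heterogeneity.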
Estrogen Compared With Estrogen
Seven trials (all poor quality) compared high with standard dose estrogens, three trials (all poor quality) compared high with low dose, and twelve trials (five fair and seven poor quality) compared standard with low dose estrogens. Pooled estimates showed little or no difference between dose categories: high versus standard (SMD: -0.06; 95% CI: -0.16 to 0.04; tau2=0.00); high versus low/ultralow (SMD: 0.04; 95% CI: -0.25 to 0.33; tau2=0.04); and standard versus low/ultralow (SMD: 0.13; 95% CI: 0.02 to 0.24; tau2=0.02). Although there was a difference between standard and low/ultralow dose estrogens, the magnitude of effect was small. Additionally, there was no evidence for dose response. The strength of evidence that changes in reported quality-of-life scores do not meaningfully differ by estrogen dose is rated moderate.
Estrogen Compared With Isoflavones
A single trial (poor quality) compared standard dose estrogens with isoflavones (SMD: 0.22; 95% CI: -0.25 to 0.70).
SSRI/SNRI Compared With Placebo
There were six trials that compared SSRI/SNRIs with placebo (three good, one fair, and two poor quality). The standardized mean difference was 0.27 (95% CI: 0.17 to 0.39; tau2=0.01). The strength of evidence that SSRI/SNRIs improve quality of life among menopausal women is rated high.
Isoflavones Compared With Placebo
There were 24 trials comparing isoflavones with placebo (three good and 21 poor quality). The standardized mean difference was 0.27 (95% CI: 0.17 to 0.37; tau2=0.02). Funnel plot asymmetry was notable and Egger test significant (p=0.03). The pooled SMD from the three good quality trials was 0.19 (95% CI: -0.20 to 0.57). The strength of evidence that isoflavones improve quality-of-life scores compared with placebo is rated low.
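Funnel plot asymmetry of the kind noted above is often quantified with Egger's regression: each trial's effect divided by its standard error is regressed on its precision (1 divided by the standard error), and an intercept far from zero suggests small-study effects. The sketch below computes only the intercept and its standard error. The p-value is omitted because it requires a t distribution, and the function name and inputs are hypothetical rather than taken from this review.

```python
import math


def egger_intercept(effects, std_errors):
    """Intercept (and its standard error) from Egger's regression.

    Regresses the standardized effect (effect / SE) on precision (1 / SE)
    by ordinary least squares. Requires at least three trials.
    """
    n = len(effects)
    if n < 3:
        raise ValueError("at least three trials are needed")
    x = [1.0 / se for se in std_errors]
    y = [effect / se for effect, se in zip(effects, std_errors)]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = residual_ss / (n - 2)
    intercept_se = math.sqrt(residual_var * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, intercept_se


# Hypothetical pattern in which smaller trials (larger SE) report larger effects.
asym_intercept, asym_se = egger_intercept(
    [0.1, 0.2, 0.5, 0.9], [0.05, 0.1, 0.2, 0.4]
)
print(round(asym_intercept, 2))
```

If every trial estimated the same effect regardless of size, the intercept would be zero. A positive intercept, as in this hypothetical pattern, is the signature of the asymmetry that led to the low strength of evidence rating here.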
Black Cohosh Compared With Placebo
Four trials comparing black cohosh with placebo reported quality-of-life outcomes (two poor quality, one fair, and one good). The pooled SMD was 0.26 (95% CI: -0.15 to 0.66; tau2=0.14). The strength of evidence that black cohosh improves quality-of-life scores is rated insufficient.
Ginseng Compared With Placebo
Three trials (one fair and two poor quality), including 513 women, compared ginseng with placebo, resulting in a pooled SMD of 0.19 (95% CI: 0.01 to 0.36; tau2=0.00). The strength of evidence that ginseng improves quality-of-life scores is rated low.
Trials Not Pooled
Different Routes of Estrogen Administration
Seven trials compared similar estrogen doses administered through different routes (Table 22).90-94, 146, 147 (See Appendix D for dose categorization by route of administration.) Three trials compared estrogen spray with estrogen patch, two compared oral estrogen with estrogen spray, one compared oral estrogen with estrogen patch, and one compared estrogen patches administered sequentially or combined. These trials were not included in the meta-analyses. Six of the seven trials showed no difference between the routes of administration, with all routes improving quality of life. One trial comparing an estradiol patch with an estradiol spray found that both routes significantly improved quality of life, with the spray improving significantly more than the patch.91 These results support a conclusion, limited by trial quality, that route of administration does not determine estrogen effectiveness with respect to changes in quality-of-life scores. The strength of evidence that quality-of-life scores do not differ by route of estrogen administration is rated moderate.
Estrogen Compared With a Nonprescription Agent
One trial compared estrogen/progestin with a nonprescription treatment, pueraria mirifica,134 and reported quality-of-life outcomes (Table 23). Pueraria mirifica is a highly estrogenic herb found in Thailand. Both hormone therapy and pueraria mirifica improved quality of life similarly. After three months of followup, pueraria mirifica reduced the total modified Greene score from 29.0 to 12.6 and estrogen/progestin treatment reduced the score from 32.3 to 9.6.134
Different Doses of Same Nonprescription Treatments
Three trials compared different doses of the same nonprescription treatments and reported quality-of-life outcomes (Table 24).136, 137, 141 Two trials compared two doses of isoflavones and reported significant improvements in quality of life in both groups, with no between-group difference.137, 141 The other trial compared two doses of pueraria mirifica and also reported significant improvements in quality of life in both groups, with no difference between doses.136
SSRI/SNRIs Compared
One trial compared two different SSRI/SNRIs, desvenlafaxine and escitalopram, and reported quality-of-life outcomes (Table 25).148 The trial was of good quality and reported that both antidepressants improved quality-of-life scores significantly, without a difference between groups.
Nonprescription Agents Compared With Placebo
Twenty-three trials compared nonprescription treatments with placebo (Table 26). Three trials tested DHEA,130, 149, 150 three trials used herbal extracts,109, 119, 122 two trials combined isoflavones and black cohosh,127, 151 two trials combined black cohosh and St. John’s wort,126, 128 two trials used dong quai,115, 123 and two trials tested flaxseed.131, 152 St. John’s wort,129 rheum rhaponticum,112 pollen extract,113 a vitamin/mineral mixture,132 dioscorea alata,117 green tea,153 pomegranate seed oil,118 maritime pine extract,125 and ovaria bovis121 were compared with placebo in one trial each.
The three DHEA trials (two of poor quality), with a total of 365 participants, reported inconsistent results. Two trials of oral DHEA compared with placebo did not find significant differences in quality of life among study groups.130, 150 One trial compared three different doses of DHEA in vaginal ovules with placebo and found improvements in quality-of-life scores with two of the three doses compared with placebo.149 The strength of evidence that DHEA improves quality-of-life scores was rated insufficient.
The two trials that combined black cohosh with St. John’s wort reported significant improvements in quality of life compared with placebo. One trial with 77 women had a standardized mean difference of 0.78 (95% CI: 0.31 to 1.24)126 and the other trial with 294 women had a standardized mean difference of 0.39 (95% CI: 0.16 to 0.62).128
Of the remaining trials, three found significant improvements in quality of life compared with placebo: a trial (n=64) using a mixture of Cynanchum wilfordii, Phlomis umbrosa, and Angelica gigas;119 a trial (n=75) using a combination of isoflavones and black cohosh;151 and a trial (n=108) using dong quai.123
Trials Without Quantifiable or Poolable Data
Below is a description of four trials that did not have data that could be analyzed by the standardized method or pooled because of the reporting metric. Results of these trials would not have affected the overall outcomes presented above.
The Estonian Postmenopausal Hormone Therapy Trial compared 0.625 mg estrogen plus 2.5 mg medroxyprogesterone acetate with placebo.154 Quality of life was measured using the EQ-5D developed by the EuroQol group. No baseline measures were reported. Post-treatment median EQ-5D scores showed no significant difference in quality of life between the treatment and placebo groups.
A randomized blinded trial (n=152) compared two different doses of black cohosh (39 mg and 127.3 mg) and reported median Kupperman Index scores as a measure of quality of life.155 Both black cohosh doses improved quality-of-life scores equally.
Foidart et al. compared a low-dose estrogen vaginal pessary with placebo and reported total Kupperman Index scores as a quality-of-life outcome. Kupperman Index scores decreased more with estrogen-alone therapy compared with the placebo.156
Pandit et al. compared a micronutrient supplement with placebo and reported the percentage of women with negative well-being as an outcome. In the placebo group, the percentage with negative well-being was 48.3 percent at baseline and decreased to 24.0 percent after 12 weeks of followup. In the group treated with micronutrients, the percentage was 55.2 percent at baseline and decreased to 0.0 percent at followup.133
Psychological Symptoms
Key Points
- A total of 108 trials including over 52,000 women reported at least one psychological outcome measure (depressive symptoms, anxiety, and/or global psychological well-being) in women treated with prescription (estrogen, testosterone, SSRIs, SNRIs) and nonprescription agents (isoflavones, black cohosh, ginseng, DHEA, herbal extracts, and others).
- Study quality was generally rated poor (71 percent). Funding was reported as provided by industry alone in 41 trials, by public sources in 21 trials, and by both industry and public sources in 10 trials; the type of funding was not stated for 36 trials.
- Psychological outcomes were reported using a variety of scales in three domains: global, anxiety, and depressive symptoms.
- Strength of evidence of comparative effectiveness of agents in treating psychological symptoms is as follows:
- There is high strength of evidence that, compared with placebo, an SSRI or SNRI is accompanied by improved depressive symptoms: SMD -0.43 (95% CI: -0.60 to -0.26; 5 trials, n=2,882); anxiety symptoms (outcomes for SNRI only): SMD -0.31 (95% CI: -0.50 to -0.12; 3 trials, n=2,688); and global psychological well-being: SMD -0.42 (95% CI: -0.60 to -0.24; 6 trials, n=3,021).
- There is high strength of evidence that, compared with placebo, estrogens are accompanied by improved depressive symptoms: SMD -0.36 (95% CI: -0.53 to -0.20; 18 trials, n=2,104); anxiety symptoms: SMD -0.34 (95% CI: -0.50 to -0.18; 13 trials, n=1,718); and global psychological well-being: SMD -0.26 (95% CI: -0.40 to -0.13; 14 trials, n=3,386).
- There is low strength of evidence that, compared with placebo, isoflavones are accompanied by improved depressive symptoms: SMD -0.29 (95% CI: -0.49 to -0.09; 7 trials, n=1,055); and global psychological well-being: SMD -0.11 (95% CI: -0.22 to 0.01; 7 trials, n=1,228); and moderate strength of evidence for improved anxiety symptoms: SMD -0.30 (95% CI: -0.46 to -0.14; 7 trials, n=853).
- There is insufficient evidence that gabapentin is accompanied by improved global psychological well-being compared with placebo: SMD -0.23 (95% CI: -0.48 to 0.02; 2 trials; n=252).
- There is insufficient evidence on the effectiveness of other agents and comparators on psychological outcomes.
Included Trials
Of the 283 trials included in this review for KQ1, 108 (38.2 percent) reported psychological outcomes in three domains: global, anxiety, and depressive symptoms (50 trials specified at least one as a primary outcome). Trials often reported outcomes in more than a single domain: global (n=61), anxiety (n=48), and depressive symptoms (n=61). Fifty-two trials examined hormones compared with placebo (34 trials), other hormones (13 trials), and nonprescription agents (five trials). Other comparator categories are shown in Table 28.
The 108 trials originated from 24 different countries and 10 trials were described as multinational. Nonmultinational trials were conducted in the United States (n=24), United Kingdom (n=7), and six each from Turkey, Italy, Germany, and Canada; other countries included China, Hong Kong, India, Taiwan, Australia, Ecuador, France, Japan, Netherlands, Norway, Poland, Singapore, Ukraine, Finland, Sweden, Austria, Denmark, and Brazil. The trials were conducted at over 2,000 sites. Length of followup ranged from four to 192 weeks.
Psychological symptoms were reported using a variety of scales. The most common scales were: Greene (12 anxiety, 12 depressive symptoms, 15 global), WHQ (10 anxiety, 18 depressive symptoms, one global), MENQOL (22 global), Beck (four anxiety, eight depressive symptoms), Hamilton (six anxiety, seven depression), SF-36 (nine global), and Kupperman (six anxiety, six depressive symptoms). Additional scales used include CES-D, Hospital Anxiety and Depression Scale, Psychological General Well-Being, MRS, Profile of Mood States, and the Bond and Lader Mood Rating Scale. The following are brief descriptions of the most commonly used scales:
- The Greene anxiety subscale consists of six items, with scores ranging from 0 to 18.157 Questions include heart beating quickly and strongly, feeling tense or nervous, difficulty sleeping, excitable, attacks of panic, and difficulty concentrating. The Greene depressive symptom subscale consists of five items, with scores ranging from 0 to 15. Questions include feeling tired or lacking in energy, loss of interest in most things, feeling unhappy or depressed, crying spells, and irritability. Total psychological scores range from 0 to 33. Higher scores indicate more severe symptoms.
- The WHQ can be administered as a 23- or 37-item instrument. The 37-item version includes four items in the anxiety assessment: I get very frightened or panic feelings for apparently no reason at all, I feel anxious when I go out of the house on my own, I get palpitations or a sensation of “butterflies” in my stomach or chest, and I feel tense or “wound up.” The depressive symptom score includes seven items: I feel miserable and sad, I have lost interest in things, I still enjoy the things I used to, I feel life is not worth living, I have a good appetite, I am more irritable than usual, and I have feelings of well-being. Total scores on subscales are 0 to 1 (some scales reversed according to the construct probed). Higher scores indicate more severe symptoms.
- The MENQOL psychosocial score is derived from seven items (scored 1 for “not bothered” to 8 for “extremely bothered”): being dissatisfied with my personal life; feeling anxious or nervous; experiencing poor memory (no or yes); accomplishing less than I used to; feeling depressed, down, or blue; being impatient with other people; and feelings of wanting to be alone. Higher scores indicate more severe symptoms.68
- The Beck anxiety inventory and Beck depression inventory each include 21 items, scored from 0 for “not at all” to 3 for “severely bothered,” with total scores ranging from 0 to 63. The Beck anxiety inventory lists symptoms common to anxiety such as numbness, heart pounding, trembling, shaking, indigestion, and flushing.158 The Beck depression inventory assesses mood, satisfaction, appetite, sleep, weight, and sexual activity. Higher scores indicate more psychological distress.159
- The Hamilton scales are completed by a health care professional following an examination of the patient. These scales measure both mental distress and physical complaints related to anxiety and depression.160, 161 The Hamilton anxiety scale consists of 14 items with a total score of 0 to 56. The depression scale consists of 21 items with a total score of 0 to 52. Higher scores indicate worse psychological health.
- The SF-36 mental health score consists of five items. The items assess nervousness, cheerfulness, peacefulness, depressive symptoms, and happiness. Scores are summed, then normalized to a 0-100 scale. Higher scores indicate improvement in mental health.162
- Kupperman measures insomnia, nervousness, and melancholia.163 Summed total scores range from 0 to 16. Higher scores indicate more severe symptoms. Hospital Anxiety & Depression Scale (HADS) includes 14 items (seven depression and seven anxiety), with higher scores indicating more severe symptoms. The Psychological General Well Being is a 22-item derivative of the General Well Being Index Menopause Rating Scale, in which a higher score indicates better mental health.
In many cases, the presence of climacteric symptoms and/or anxious depressive disorders was required for inclusion in the study. However, women were often excluded if they were taking psychoactive drugs, had too high a score on the assessment tool, or had suicidal thoughts. Table 28 further describes the trial and patient characteristics.
Evidence Synthesis for Psychological Symptoms
Standardized mean differences were calculated to allow comparison of outcomes across different psychological symptom scales. Analyses were performed according to domain: anxiety, depressive symptoms, and global measures of psychological well-being. Either there were few trials reporting comparisons between different estrogen doses or, in the single instance where there were multiple comparisons, there was little apparent difference between doses (standard versus low/ultralow doses for the global domain, SMD -0.06; 95% CI: -0.14 to 0.02; tau2=0.00, 9 trials). Estrogens were therefore combined in the analyses (results according to dose can be found in Appendix H). Because results from large trials focused on prevention with estrogen35, 144, 145 showed lesser effects, pooled effects including and excluding those trial results were estimated. In addition, trial results were pooled for isoflavones, SSRI/SNRIs, and gabapentin compared with placebo.
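To show why standardizing allows results from different scales to be compared, the following minimal sketch computes a standardized mean difference as Cohen's d with a pooled standard deviation, together with a commonly used approximate standard error. Whether the review used this form or a small-sample corrected variant is not stated in this section, so the formula is an assumption. The function name and all input values are hypothetical.

```python
import math


def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Cohen's d (treatment minus control) with an approximate standard error.

    ASSUMPTION: pooled-SD Cohen's d is used for illustration; the report does
    not state in this section which SMD variant was calculated.
    """
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    d = (mean_t - mean_c) / math.sqrt(pooled_var)
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c)))
    return d, se


# Hypothetical trials on two symptom scales with different ranges.
# Lower scores mean fewer symptoms, so a negative SMD favors treatment.
d_short_scale, se_short_scale = standardized_mean_difference(10, 5, 50, 12, 5, 50)
d_long_scale, se_long_scale = standardized_mean_difference(20, 10, 50, 24, 10, 50)
print(round(d_short_scale, 2), round(d_long_scale, 2))  # prints: -0.4 -0.4
```

The two hypothetical trials differ by 2 and 4 raw points, yet both correspond to an SMD of -0.4 because each difference is expressed in units of that scale's standard deviation.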
Table 29 displays effect estimates for psychological outcomes (forest plots shown in Appendix H) and Figure 10 displays a caterpillar plot for the comparisons.
SSRI/SNRI Compared With Placebo
Global
Six trials compared an SSRI or SNRI with placebo and reported a global measure of psychological well-being (three good and three poor quality).164-168 Compared with placebo, the pooled SMD for improved well-being on a global scale was -0.42 (95% CI: -0.60 to -0.24; tau2=0.03); limited to the three high quality trials -0.38 (95% CI: -0.56 to -0.20; tau2=0.02). The strength of evidence that an SSRI or SNRI is accompanied by improved psychological well-being compared with placebo is rated high.
Depressive Symptoms
Five trials compared an SSRI or SNRI with placebo and reported depressive symptoms (two good and three poor quality).168-172 Compared with placebo, the pooled SMD for improved reported depressive symptoms was -0.43 (95% CI: -0.60 to -0.26; tau2=0.02); limited to the three high quality trials -0.37 (95% CI: -0.60 to -0.15; tau2=0.02). The strength of evidence that an SSRI or SNRI is accompanied by improved depressive symptoms compared with placebo is rated high.
Anxiety
Three trials compared an SNRI (desvenlafaxine) with placebo and reported some measure of anxiety (two good and one poor quality trial).168, 171, 172 The pooled SMD for improvement in reported anxiety symptoms for desvenlafaxine compared with placebo was -0.31 (95% CI: -0.50 to -0.12; tau2=0.02). The strength of evidence that desvenlafaxine is accompanied by improved anxiety symptoms compared with placebo is rated high.
Estrogens Compared With Placebo
Global
Sixteen trials included one or more estrogen-placebo comparisons and reported some global measure of psychological well-being (two good, six fair, and eight poor quality).144, 145, 173-186 Compared with placebo, the pooled SMD for improved well-being on a global scale from all trials was -0.18 (95% CI: -0.27 to -0.10; tau2=0.01), and excluding two large disease prevention focused trials it was -0.26 (95% CI: -0.40 to -0.13; tau2=0.04). There was no indication of potential reporting bias. The strength of evidence that estrogens are accompanied by improved psychological well-being compared with placebo is rated high.
Depressive Symptoms
Twenty trials reported some measure of depression for estrogen compared with placebo (two good, one fair, and 17 poor quality).35, 145, 148, 173, 174, 179, 180, 185, 187-198 Compared with placebo, the pooled SMD for fewer reported depressive symptoms was -0.31 (95% CI: -0.44 to -0.18; tau2=0.05), and excluding two large disease prevention focused trials it was -0.36 (95% CI: -0.53 to -0.20; tau2=0.07), with no indication of reporting bias. The strength of evidence that estrogens are accompanied by improved depressive symptoms compared with placebo is rated high.
Anxiety
Some measure of anxiety was reported in 14 trials (one good, one fair, and 12 poor quality).35, 173, 179, 180, 185, 187, 190-192, 195-199 The pooled SMD for less reported anxiety symptoms for estrogen compared with placebo was -0.30 (95% CI: -0.48 to -0.12; tau2=0.08), and excluding one large disease prevention focused trial it was -0.34 (95% CI: -0.50 to -0.18; tau2=0.05). Reporting bias was not suspected. The strength of evidence that estrogens are accompanied by improved anxiety symptoms compared with placebo is rated high.
Gabapentin Compared With Placebo
Global
Two trials compared gabapentin with placebo and reported a global measure of psychological well-being (both rated poor quality).42, 200 Compared with placebo, the pooled SMD for improved well-being on a global scale was -0.23 (95% CI: -0.48 to 0.02; tau2=0.0). The strength of evidence that gabapentin is accompanied by improved psychological well-being compared with placebo is rated insufficient.
Isoflavones Compared With Placebo
Global
Seven trials compared isoflavones with placebo and reported a global measure of psychological well-being (four good, one fair, and two poor quality trials).152, 201-206 Pooled estimates show no significant difference in global measures compared with placebo (SMD: -0.11; 95% CI: -0.22 to 0.01; tau2=0.00). The strength of evidence that isoflavones are accompanied by improved global psychological well-being compared with placebo among menopausal women is rated low.
Depressive Symptoms
Nine trials compared isoflavones with placebo and reported a measure of depressive symptoms (one good and eight poor quality).87, 201, 202, 205, 207-211 Pooled analyses showed a significant improvement in depressive symptoms among the group treated with isoflavones compared with placebo (SMD: -0.29; 95% CI: -0.49 to -0.09; tau2=0.05). Four of the trials, including the two largest,201, 202 showed SMDs close to 0, whereas in three of the smallest87, 209, 210 the calculated SMDs were large (-0.65 to -0.78), indicating potential for reporting bias. The strength of evidence that isoflavones are accompanied by improved depressive symptoms compared with placebo is rated low.
Anxiety
Seven trials compared isoflavones with placebo and reported a measure of anxiety symptoms (one good and six poor quality trials).87, 201, 205, 207, 209, 210, 212 The pooled effect was consistent with an improvement in anxiety among women treated with isoflavones compared with the placebo—SMD -0.30 (95% CI: -0.46 to -0.14; tau2=0.01). The strength of evidence that isoflavones improve reported anxiety symptoms compared with placebo among menopausal women is rated moderate.
Trials Not Pooled
Different Routes of Estrogen Administration
Four trials (Table 30) compared similar doses of estrogen administered through different routes (see Appendix D for dose categorization by route of administration). Three of the trials reported that changes in psychological symptoms were similar with the following routes of administration: sequential compared with combined progestogen added to estrogen patches,146 oral compared with transdermal patch,98 and nasal spray compared with transdermal patch.90 One trial compared oral, skin gel, and transdermal patch administration of estrogen. Akhila et al. reported that the skin gel and the transdermal patch significantly improved global psychological scores compared with oral estrogen.99
Given the different treatments and outcomes, the strength of evidence was not rated.
Estrogen Compared With Estrogen Plus Testosterone
One trial (Table 31) compared an estrogen/progestogen skin gel (n=53) with an estrogen/progestogen plus testosterone skin gel (n=53) and reported depressive symptoms, anxiety, and global psychological well-being using the Psychological General Well-Being scale.213 The trial was rated poor quality and reported no difference between groups in depressive symptom scores. Significant improvements were reported in both anxiety scores and global scores in the testosterone group.
Progesterone Alone Compared With Placebo
Two trials (Table 32) compared progesterone skin cream with placebo and reported psychological outcomes. One compared four different progesterone skin cream doses (5 mg, 20 mg, 40 mg, and 60 mg) with placebo skin cream and reported Greene psychological scores. The trial was rated fair quality and found no significant difference in global psychological scores between any of the doses of progesterone skin cream and placebo.102 The other trial compared a 32 mg progesterone skin cream with placebo and reported Greene anxiety and depression scores and MENQOL global psychological scores. None of the psychological measures improved significantly in the treatment group compared with the placebo group.103
Estrogen Compared With Nonprescription
Two trials compared hormone treatments with black cohosh and reported psychological outcomes. Both trials found psychological outcomes for black cohosh similar to those for hormone treatments. One 12-week, 3-arm trial compared black cohosh, standard dose estrogen plus progesterone, and standard dose estrogen plus MPA. The authors reported that all three treatments were accompanied by significantly improved overall MENQOL psychological score, Hospital Anxiety Score, and Hospital Depression Score, with no statistically significant difference between the treatments.214 The other trial compared black cohosh with an ultralow-dose estrogen/progestogen patch and reported anxiety outcomes.215 Both treatments were accompanied by significantly improved anxiety (p<0.001 for both arms of the trial). There was no significant difference between the treatments (Table 33).
Prescription Compared With Placebo
One randomized, double-blind trial (Table 34) compared eszopiclone, a treatment used for insomnia (n=30), with placebo (n=29) and reported the Beck anxiety score as an outcome.105 The trial was rated poor quality and found a significant improvement in anxiety among the treatment group with a wide confidence interval (SMD: -0.57; 95% CI: -1.10 to -0.05).
Nonprescription Agents Compared With Placebo
Twenty-five trials (Table 35) compared various nonprescription agents with placebo and reported 41 psychological outcomes (depressive symptoms [n=11], anxiety [n=14], and global psychological well-being [n=16]) (6 good, 4 fair, and 15 poor quality). Three trials compared DHEA with placebo.130, 149, 150 Two trials each examined black cohosh86, 216 and maritime pine extract.110, 125 One trial each examined: Er-Xian decoction,123 micronutrients,133 homeopathic remedy,121 Jiawei Qing’e Fang,120 Chinese medicinal herbs,122 Estro-G 100®,119 nutritional supplement,132 dioscorea alata,117 green tea polyphenols,153 St. John’s wort,129 herbal extract,107 black cohosh with plant extracts,109 isoflavones with magnolia bark,111 rheum rhaponticum,112 flaxseed,152 black cohosh plus St. John’s wort,128 gingko biloba with ginseng,89 and ginseng.88
Trials reporting significant improvements compared with placebo were:
- Zhong et al.: improved global psychological well-being with Er-Xian decoction (SMD: -0.56; 95% CI: -0.95 to -0.18).123
- Schellenberg et al.: improved global psychological well-being with both doses of black cohosh (6.5 mg, SMD: -0.43; 95% CI: -0.81 to -0.05 and 13 mg, SMD: -0.96; 95% CI: -1.36 to -0.56).86
- Chang et al.: improved depressive symptoms and anxiety with Estro-G 100® (SMD: -0.69; 95% CI: -1.22 to -0.17 and SMD: -1.04; 95% CI: -1.58 to -0.50).119
- Hsu et al.: improved anxiety and global psychological well-being with dioscorea alata (SMD: -0.95; 95% CI: -1.50 to -0.36 and SMD: -0.78; 95% CI: -1.36 to -0.20).117
- Labrie et al.: inconsistent improvements in global psychological well-being with different doses of vaginal DHEA.149
- Yang et al.: improved depressive symptoms and anxiety with maritime pine extract (SMD: -0.41; 95% CI: -0.73 to -0.09 and SMD: -0.81; 95% CI: -1.14 to -0.48).110
- Mucci: improved depressive symptoms and anxiety with a combination of isoflavones and magnolia bark (SMD: -0.72; 95% CI: -1.15 to -0.28 and SMD: -0.96; 95% CI: -1.40 to -0.52).111
- Heger et al.: improved anxiety and global psychological well-being with rheum rhaponticum (SMD: -0.77; 95% CI: -1.16 to -0.38 and SMD: -0.50; 95% CI: -0.88 to -0.12).112
- Uebelhack et al.: improved depressive symptoms and global psychological well-being with a combination of black cohosh and St. John’s wort (SMD: -1.32; 95% CI: -1.57 to -1.07 and SMD: -0.39; 95% CI: -0.62 to -0.16).128
- Osmers et al.: improved global psychological well-being with black cohosh (SMD: -0.28; 95% CI: -0.51 to -0.04).216
Nonprescription Compared With Nonprescription
One trial (Table 36) compared isoflavones with isoflavones plus magnolia bark and reported depressive symptoms and anxiety outcomes,138 and one trial compared two different doses of isoflavones.137 The isoflavones plus magnolia bark trial was rated poor quality and found no difference in depressive symptom scores or anxiety scores between the two groups.138 The trial comparing different doses of isoflavones reported that both doses significantly improved the Greene psychological scale scores, with no difference between the groups.137
SSRI/SNRIs Compared
One randomized double-blind trial (Table 37) compared flexible-dose desvenlafaxine (100 to 200 mg/d) with flexible-dose escitalopram (10 to 20 mg/d) and reported Hamilton depression and anxiety scores.217 The trial was rated good quality. The antidepressants were equally effective in reducing both depressive symptoms and anxiety scores (SMD: -0.10; 95% CI: -0.30 to 0.10, and SMD: -0.05; 95% CI: -0.25 to 0.15, respectively).
SSRI Compared With Nonprescription
One trial (Table 38) compared black cohosh with fluoxetine, reporting depressive symptoms and global psychological measures.84 After 12 weeks of followup, Oktem et al. reported that both treatments were accompanied by similar improvements in the SF-36 global mental health score and the Beck Depression Score. The trial was rated poor quality.
Trials With No Quantifiable Data
Seven trials did not allow determination of standardized effect estimates because of the way results were reported. Five reported depressive symptom outcomes and two reported global psychological outcomes. Results of these trials would not have affected the overall outcomes presented above.
Gupta et al. conducted a one-year trial, comparing a standard dose of oral estrogen alone (n=25), DHEA (n=25), and placebo (n=25). At baseline, no women reported depressive symptoms. At followup (unspecified time), 4 percent of the estrogen alone treatment group, 0 percent of the DHEA group, and 16 percent of the placebo group reported depressive symptoms.143
In a subset of women enrolled in the Kronos Early Estrogen Prevention Study (KEEPS) group, Raz et al. reported changes in the Profile of Mood States among the placebo and low-dose estrogen/progestogen groups. For this particular analysis, the oral and patch low-dose estrogen/progestogen groups were combined. Depressive symptom scores improved in 15 percent of the placebo group and in 42 percent of the treatment group.218
Yalamanchili et al. conducted a four-arm trial with placebo, calcitriol, standard dose estrogen/progestin, and standard dose estrogen/progestin plus calcitriol. The Geriatric Depression Scale measured depressive symptoms among the four groups. None of the treatment groups experienced significant differences compared to placebo: calcitriol (p=0.77); estrogen/progestin (p=0.46), and estrogen/progestin plus calcitriol (p=0.98).219
Liske et al. performed a 12-week trial comparing black cohosh with placebo and reported median Self-Rating Depression Scale scores. The placebo group had a baseline median of 44.5 and a 12 week median of 37.0. The black cohosh group had a baseline median of 44.0 and a 12 week median of 36.0.155
Strickler et al. conducted a four-arm randomized trial of two different doses of raloxifene, conjugated equine estrogen, and placebo. Women’s Health Questionnaire anxiety and depressive symptoms scores were measured. Estrogen alone improved psychological scores more than placebo, but statistical significance is unknown because these two arms of the trial were not analyzed separately.199
Auerbach et al. conducted a randomized trial comparing pomegranate seed oil with placebo, reporting MRS II global mental health scores. The women receiving pomegranate seed oil had a baseline median score of 4.0 and a 12 week followup score of 2.0. The women in the placebo group had a baseline median score of 6.0 and a 12 week followup score of 4.5. The baseline median scores were significantly different. There was not a significant difference in change scores between the two groups.118
Davis et al. performed a randomized crossover trial that compared a standard-dose estrogen spray with a standard-dose estrogen patch.91 Both treatments significantly improved global psychological well-being scores. No significant difference between the two treatments was found. No quantifiable data between the groups were provided.
Sexual Function
Key Points
- A total of 94 trials, including over 28,000 women, reported sexual function outcomes of treatment with hormones, SSRI/SNRIs, or nonprescription agents such as isoflavones, DHEA, and herbal extracts.
- Study quality was generally rated poor (75 percent). Funding was provided by industry for 49 trials, public sources in 17 trials, industry and public sources in two trials, and the type of funding was not stated for 25 trials.
- Sexual function outcomes were reported using a variety of scales, representing four domains of sexual function: global, pain, interest, and activity frequency.
- Strength of evidence of relative effectiveness of agents in ameliorating symptoms of sexual function is as follows:
- There is high strength of evidence that vaginal estrogen reduces pain during sex compared with placebo: SMD -0.54 (95% CI: -0.73 to -0.34; 10 trials, n=3,205).
- There is moderate strength of evidence that oral estrogen reduces pain compared with placebo: SMD -0.22 (95% CI: -0.35 to -0.09; 4 trials, n=1,661).
- There is high strength of evidence that estrogen improves global measures of sexual function compared with placebo: SMD 0.27 (95% CI: 0.19 to 0.35; 15 trials, n=4,228).
- There is insufficient strength of evidence that an SSRI or SNRI improves global measures of sexual function compared with placebo.
- There is low strength of evidence that isoflavones improve global measures of sexual function compared with placebo: SMD 0.24 (95% CI: -0.12 to 0.61; 4 trials, n=586).
- There is moderate strength of evidence that estrogens improve measures of sexual interest compared with placebo: SMD 0.18 (95% CI: 0.10 to 0.26; 7 trials, n=2,213).
- There is insufficient strength of evidence that SNRIs improve measures of sexual interest compared with placebo.
- There is insufficient strength of evidence that isoflavones improve measures of sexual interest compared with placebo.
- There is moderate strength of evidence that testosterone improves measures of sexual activity compared with placebo: 1.17 additional satisfying sexual episodes per 4 weeks (95% CI: 0.88 to 1.46; 8 trials, n=2,820).
Included Trials
Of the 283 trials included in this review, 94 (33.2 percent) reported sexual function outcomes (39 trials specified sexual function as a primary outcome). Sixty-one trials examined hormone treatment effects and sexual function, with the following comparators: placebo (34 trials), other hormones (23 trials), and nonprescription treatments (three trials). Twenty-eight trials examined the effects of nonprescription treatments compared with placebo; nonprescription treatments included isoflavones, DHEA, herbal extracts, and ginseng. Five trials compared SSRIs or SNRIs with placebo.
Trials were conducted in more than 22 countries and 18 trials were multinational. Single country trials were conducted in the United States (n=20), Australia (n=8), Italy (n=5), Canada (n=4), China (n=4), United Kingdom (n=4), Taiwan (n=4), Denmark (n=3), Brazil (n=3), and Germany (n=3), with two or fewer trials conducted in Hong Kong, India, Sweden, Turkey, Croatia, Ecuador, Japan, Netherlands, Norway, Singapore, Spain, Thailand, and Ukraine. The trials were conducted at over 2,300 sites. Length of followup ranged from 8 to 260 weeks. Additional trial characteristics are shown in Table 40.
Sexual function was reported using a variety of measures and scales. The domains of sexual activity assessed fell into four broad categories: global (i.e., assessed two or more domains), pain (dyspareunia), interest, or activity frequency. If results for more than one domain were reported in a trial, all were included. Forty-four trials reported a global measure (MENQOL, WHQ, MRS, and McCoy scales were most common, though others were also used); 29 reported pain during intercourse, 23 reported interest, and eight reported frequency of satisfying sexual episodes (activity). Specific items in the different scales include:
- Greene Climacteric Scale rated a single question, “loss of interest in sex,” scaled from zero (none) to three (severe)—15 trials.
- Menopause-specific Quality of Life (MENQOL) assessed sexual function in three questions scaled from zero (not bothered) to eight (extremely bothered)—22 trials.
- Women’s Health Questionnaire assessed sexual function using three questions on interest, pain, and activity, rated in a 4-point scale, with higher scores indicating more severe symptoms—10 trials.
- Self-reported dyspareunia (yes/no)—21 trials.
- Satisfying sexual episodes—eight trials.
- The remaining trials used other sexual function scales.
Study quality was generally poor (74.5 percent), with 14 trials judged good and 10 trials judged fair quality. Length of followup ranged from 8 weeks to 260 weeks. Industry funding was indicated in 51 trials, public funding in 17 trials, and two trials reported both industry and public funding. Table 40 describes additional trial and patient characteristics.
Evidence Synthesis for Sexual Function
Standardized mean differences were calculated to allow comparisons of outcomes from different sexual function scales. Analyses were conducted by domain (pain, global, activity, and interest), by route of administration (oral or vaginal), and by uterine status (all intact, all absent, or mixed) when possible. Pooling was considered possible for pairwise comparisons where evidence included at least two trials. Pooling of the following comparators and conditions was performed:
- Pain: vaginal estrogens versus placebo (n=10); oral estrogens versus placebo (n=4); all estrogens (either vaginal or oral) versus placebo (n=14)
- Global: all estrogens (either vaginal or oral) versus placebo (n=15); SSRI/SNRI versus placebo (n=2); isoflavones versus placebo (n=4)
- Activity: testosterone versus placebo in trials with women with/without uteri mixed or trials with women with intact uteri (n=4); testosterone versus placebo in trials with all women without intact uteri (n=4); testosterone versus placebo all trials combined (n=8)
- Interest: all estrogens versus placebo (n=7); isoflavones versus placebo (n=5); SNRI versus placebo (n=2)
Results are shown in Table 41, Figure 11 and Figure 12.
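To illustrate the calculation described above, the following Python sketch computes a standardized mean difference for a single trial and pools several trials with a random-effects model. It is not the review's own code. The Hedges small-sample correction and the DerSimonian-Laird estimator of tau2 are assumptions chosen for illustration (this section does not state which estimator was used), and the trial values are hypothetical.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference (treatment minus control) and its variance."""
    df = n_t + n_c - 2
    # Pooled standard deviation across the two arms
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor (assumed here)
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return j * d, j ** 2 * var_d

def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooling; returns (pooled SMD, 95% CI, tau2)."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures dispersion of trial effects around the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-trial variance to each trial variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2

# Hypothetical trials: (mean_t, sd_t, n_t, mean_c, sd_c, n_c)
trials = [
    (2.1, 1.5, 60, 2.6, 1.6, 58),
    (3.0, 2.0, 120, 3.5, 2.1, 118),
    (1.8, 1.2, 40, 2.5, 1.3, 42),
]
results = [hedges_g(*t) for t in trials]
pooled, ci, tau2 = pool_random_effects([g for g, _ in results], [v for _, v in results])
print(f"SMD {pooled:.2f} (95% CI: {ci[0]:.2f} to {ci[1]:.2f}; tau2={tau2:.2f})")
```

Because each trial's effect is divided by its own pooled standard deviation, results measured on different scales become unit-free and can be combined, which is the reason for using the SMD in this section.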
Estrogen Compared With Placebo (Pain)
Fourteen trials compared estrogens with placebo and reported pain during sex. Ten trials compared vaginal estrogens with placebo (two fair and eight poor quality)179, 220-227 and four trials compared oral estrogens with placebo (all poor quality).187, 228-230 In the pooled result, any estrogen improved reported pain during sex compared with placebo (SMD -0.45; 95% CI: -0.61 to -0.29; tau2=0.07).
Analyses by route of administration were consistent with a larger effect for vaginal estrogens (SMD -0.54; 95% CI: -0.73 to -0.34; tau2=0.07) than for oral estrogens (SMD -0.22; 95% CI: -0.35 to -0.09; tau2=0.01).
The strength of evidence that vaginal estrogens improve reported pain during sex among menopausal women compared with placebo is rated high. The strength of evidence that oral estrogens compared with placebo improve reported pain during sex among menopausal women is rated moderate.
Estrogen Compared With Placebo (Global)
Fifteen trials compared estrogens with placebo and reported a global measure for sexual function (two good, four fair, and nine poor quality).35, 173, 177, 181-184, 186, 187, 192, 198, 199, 231-233 Because various routes of administration were used—oral, topical, nasal, and vaginal (10, three, one, and one respectively)—all trial results were combined for analysis. Estrogens significantly improved global measures of sexual function compared with placebo (SMD 0.27; 95% CI: 0.19 to 0.35; tau2=0.00). The strength of evidence that estrogens improve a global assessment of sexual function compared with placebo is rated high.
SSRI/SNRI Compared With Placebo (Global)
Two trials compared antidepressants with placebo and reported sexual function outcomes as a global measure (one good and one fair quality).164, 167 The pooled SMD was 0.27 (95% CI: 0.01 to 0.52; tau2=0.00). The strength of evidence that SNRIs improve a global assessment of sexual function compared with placebo is rated insufficient.
Isoflavones Compared With Placebo (Global)
A global measure of sexual function was reported in four trials comparing isoflavones with placebo (two good, one fair, and one poor quality).152, 204, 206, 211 The pooled SMD was 0.24 (95% CI: -0.12 to 0.61; tau2=0.10), accompanied by substantial heterogeneity. The strength of evidence that isoflavones improve a global assessment of sexual function compared with placebo is rated low.
Estrogens Compared With Placebo (Interest)
Seven trials compared estrogens with placebo and assessed interest in sex (one fair and six poor quality).175, 179, 180, 197, 228, 230, 234 Routes of estrogen administration included oral, vaginal, and topical (oral in five trials). The pooled SMD was consistent with an increase in reported sexual interest—0.18 (95% CI: 0.10 to 0.26; tau2=0.00). The strength of evidence that estrogens improve sexual interest compared with placebo is rated moderate.
SNRI Compared With Placebo (Interest)
Two trials compared desvenlafaxine with placebo (both good quality).168, 235 The combined SMD from the trials was 0.16 (95% CI: -0.07 to 0.39; tau2=0.02). The strength of evidence that desvenlafaxine improves sexual interest compared with placebo is rated insufficient.
Isoflavones Compared With Placebo (Interest)
Five trials compared isoflavones with placebo and assessed sexual interest (one good and four poor quality).87, 201, 205, 236, 237 The pooled effect was not statistically significant, the confidence interval wide, and there was substantial heterogeneity—SMD 0.26 (95% CI: -0.001 to 0.52; tau2=0.52). The strength of evidence that isoflavones improve sexual interest compared with placebo is rated insufficient.
Testosterone Compared With Placebo (Activity)
Eight trials compared testosterone with placebo and assessed satisfying sexual episodes (one fair and seven poor quality). The outcome was the number of episodes per four-week period. One episode per four-week period is the suggested minimal clinically important improvement.73 Four trials, administering testosterone by patch, included only women without intact uteri and ovaries,238-241 two trials, one patch and one oral testosterone, included only women with intact uteri and ovaries,242, 243 and two trials, both using patches, included women with and without intact uteri and ovaries.244, 245 Combining the eight trials showed that testosterone significantly improved sexual activity compared with placebo by 1.17 episodes per 4 weeks (95% CI: 0.88 to 1.46; tau2=0.00). Analyses limited to the four trials including only women without intact uteri and ovaries also showed significant improvements in episodes compared with placebo (1.05; 95% CI: 0.64 to 1.45).
The strength of evidence that testosterone increases the number of satisfying sexual episodes compared with placebo is rated moderate.
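As a reading aid for the pooled testosterone estimate, the short Python sketch below recovers an approximate standard error from the reported 95% confidence interval and compares the interval with the suggested minimal clinically important improvement of one episode per four weeks. It assumes the interval is a symmetric normal-based interval (estimate plus or minus 1.96 standard errors), which this section does not state; the variable names are illustrative only.

```python
def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error implied by a symmetric 95% CI (assumption)."""
    return (upper - lower) / (2 * z)

# Pooled difference in satisfying sexual episodes per 4 weeks, as reported in the text
estimate, lower, upper = 1.17, 0.88, 1.46
mcid = 1.0  # suggested minimal clinically important improvement

se = se_from_ci(lower, upper)
z_vs_null = estimate / se                 # test statistic against no difference
ci_excludes_zero = lower > 0              # statistically significant at the 5% level
ci_entirely_above_mcid = lower >= mcid    # whole interval exceeds the MCID?

print(f"SE = {se:.3f}, z = {z_vs_null:.1f}")
print("CI excludes zero:", ci_excludes_zero)
print("CI entirely above MCID:", ci_entirely_above_mcid)
```

Under these assumptions the interval excludes zero, but its lower bound (0.88) falls below one episode, so the data are compatible with a true effect slightly smaller than the suggested clinically important improvement.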
Trials Not Pooled
Estrogen Compared With Placebo
One trial compared an ultralow dose estrogen patch with a placebo patch.231 The MENQOL sexual subscore decreased in both groups: -0.8 (SD: 1.6) in the placebo group; -1.0 (SD: 1.7) in the estrogen group. The difference between the groups was not significant (Table 42).
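For orientation, the reported change scores can be put on the standardized scale used elsewhere in this section. The Python sketch below is a rough illustration only: the group sizes are not given here, so equal group sizes are assumed, no small-sample correction is applied, and the sign convention (negative favoring estrogen, because lower MENQOL scores indicate less bother) is also an assumption.

```python
import math

def smd_equal_n(mean_t, sd_t, mean_c, sd_c):
    """Cohen's d assuming equal group sizes (hypothetical; sizes not reported here)."""
    sd_pooled = math.sqrt((sd_t ** 2 + sd_c ** 2) / 2)
    return (mean_t - mean_c) / sd_pooled

# Change in MENQOL sexual subscore: estrogen -1.0 (SD 1.7), placebo -0.8 (SD 1.6)
d = smd_equal_n(-1.0, 1.7, -0.8, 1.6)
print(round(d, 2))  # prints -0.12
```

A standardized difference of this size is small, which is consistent with the nonsignificant between-group difference reported for the trial.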
Estrogen Compared With Estrogen or Other Hormones
Five trials compared different doses of estrogen. Four of the five trials compared standard with low doses246-249 and one trial compared standard with a high dose.250 Two trials measured global sexual function, two measured sexual interest, and one measured pain during sexual activity. In all five trials, there were improvements in sexual function with estrogens, with no statistically significant differences among the estrogen doses (Table 43).
One trial randomized women to either 0.625 mg esterified estrogens or 0.625 mg esterified estrogens plus 1.25 mg methyltestosterone.142 The outcome was a global measure of sexual function. After 16 weeks’ followup, the group receiving testosterone with estrogen improved significantly compared with the estrogen alone group, with a standardized mean difference of 0.39 (95% CI: 0.12 to 0.66).
Due to the variety in outcome measures, synthesizing these data was not possible; because of treatment heterogeneity, strength of evidence was not rated.
Different Routes of Estrogen Administration
Ten trials (Table 44) compared similar estrogen doses using different routes of administration. Two trials used a vaginal ring in one treatment group and vaginal cream in another251, 252; two trials used oral estrogens in one arm and estrogen patches in another98, 253; one trial used patches, with progestogen added in either a combined or a sequential regimen146; and one trial each used the following pairs of routes of administration: patch/spray,91 oral/ring,254 ring/tablet,255 oral/cream,256 and ring/pessary.257 Five trials reported a global sexual function outcome, four reported pain, and one reported sexual interest. No trial found a significant difference in outcomes between routes of administration. These results on route of administration, combined with the findings from the analysis of vaginal and oral estrogens compared with placebo in diminishing pain during sex, suggest that global and pain outcomes also do not differ according to route of administration (strength of evidence moderate, Table 49).
Other Prescription Agents Compared With Placebo
One placebo-controlled trial examined ospemifene, an estrogen receptor agonist/antagonist, and measured change in severity of pain during intercourse. The ospemifene group experienced a significant decrease in pain compared with the placebo group (Table 45).258
Estrogen Compared With Nonprescription Agents
Two trials (Table 46) compared estrogen/progestogen therapy with nonprescription treatments. One examined pueraria mirifica for the treatment of pain relating to sexual function.134 Pueraria mirifica is an herb found in Thailand and considered highly estrogenic. This small study, with a sample size of 60 women, did not find a significant difference between groups. The other trial compared two arms of estrogen/progestogen therapy with black cohosh.214 The hormone therapy arms experienced more improvement in MENQOL sexual subscores than the black cohosh arm, but the differences between the groups were not significant.
SSRI/SNRIs Compared
One trial compared a serotonin-norepinephrine reuptake inhibitor (desvenlafaxine) with a selective serotonin reuptake inhibitor (escitalopram) and reported the Change in Sexual Functioning Questionnaire score as an outcome (Table 47).148
Nonprescription Agents Compared With Placebo
Eighteen trials (Table 48) compared nonprescription agents with placebo and reported sexual function outcomes. The domains of the outcomes were global (n=11), interest (n=4), and pain (n=3).
Three trials compared isoflavones with placebo, and measured pain during intercourse. Two of the trials reported statistically significant improvements in pain,87, 237 while one trial reported no difference in pain compared with placebo.207
Two trials compared ginseng with placebo, with one trial reporting a global sexual function outcome88 and one reporting on sexual interest.89 Neither trial reported significant improvements in either outcome.
Two trials compared maritime pine extract with placebo and reported global sexual function outcomes. The trial administering 200 mg pine extract reported significant improvements in sexual function compared with placebo,110 and the trial administering 30 mg pine extract reported no difference in sexual function compared with placebo.125
Two of the 18 trials compared DHEA with placebo and reported global sexual function outcomes.149, 150 One was a four-arm trial with increasing doses of DHEA which were administered through a vaginal ovule and the other was a two-arm trial administering DHEA orally. The trial using vaginal ovules showed significant improvements in global sexual function in the two higher doses compared with placebo,149 while the trial using orally administered DHEA did not show a difference compared with placebo.150 Due to the variety of dosages and treatments, pooling was not appropriate.
The remaining nine trials tested different treatments compared with placebo: Dang Gui Buxue Tang,107 black cohosh with plant extracts,109 Jiawei Qing’e Fang,120 St. John’s wort,129 rheum rhaponticum,112 dioscorea alata,117 a homeopathic remedy,121 Chinese medicinal herbs,122 and Er-Xian decoction.123 None of these trials reported a significant improvement in the sexual function outcome measured.
Trials With No Quantifiable Data
Six trials did not have data that could be analyzed by the standardized effect size methods. Results of these trials would not have affected the overall outcomes presented above.
In a double-blind trial, women were randomized to either a progesterone skin cream (n=38) or a placebo skin cream (n=42), and were followed for 12 weeks. Sexual function outcomes were measured by the Greene sexual function subscore and reported as baseline median and post-treatment median. Similar improvements were seen in both study groups.103
In a trial comparing a mixture of 12 Chinese herbs (n=28) with placebo (n=27), the sexual function subscore for the MENQOL was reported. Followup was 12 weeks. Baseline measures were provided for both the placebo and treatment groups, but followup measures were provided for only the group treated with the Chinese herbs. The authors report that there was no statistical difference in sexual function between the two groups.114
Nathorst-Boos et al. conducted a 26-week, double-blind, crossover trial of 53 women, adding a testosterone skin gel or a placebo gel to already existing hormone treatments. Median values of components of the McCoy sex questionnaire were reported. Pain during intercourse did not improve significantly with the testosterone treatment compared with placebo. However, frequency of sexual activity increased significantly more in the testosterone treatment group.213
Long et al. conducted a 12-week randomized trial on hysterectomized women, comparing a standard dose of oral estrogen alone (n=37) with a standard dose of estrogen administered through a vaginal cream (n=36). The oral estrogen group reported 63 percent dyspareunia at baseline, 33.3 percent at followup. The estrogen vaginal cream group reported 66.7 percent dyspareunia at baseline, 20.0 percent at followup. Neither route of administration increased the number of satisfying sexual episodes per week.256
Lima et al. conducted a 12-week randomized trial, comparing an isoflavone vaginal gel with a placebo vaginal gel. At baseline, 100 percent of the women reported dyspareunia. At followup, 40 percent of the placebo group reported dyspareunia and 3.3 percent of the women receiving the isoflavone gel reported dyspareunia.259
Gupta et al. conducted a one-year trial, comparing a standard dose of oral estrogen alone (n=25), DHEA (n=25), and placebo (n=25). At baseline, no women reported a loss of libido. At followup (unspecified time), 4 percent of the estrogen alone treatment group, 0 percent of the DHEA group, and 36 percent of the placebo group reported a loss of libido.143
Urogenital Atrophy
Key Points
- Seventy-one trials, including more than 20,000 women, reported on urogenital atrophy outcomes of treatment with estrogen, ospemifene, or nonprescription agents such as isoflavones, black cohosh, and herbal extracts.
- Study quality was typically rated as poor (80 percent). Industry was the only funding source for 31 trials and public sources for 9 trials. Both public and industry funding was reported for 2 trials and support not stated for 29 trials.
- Results were reported using a variety of scales. The most common outcome was vaginal dryness.
- Strength of evidence of relative effectiveness of agents in ameliorating symptoms of vaginal atrophy is as follows:
- There is high strength of evidence that vaginal estrogens improve urogenital atrophy symptoms compared with placebo: SMD -0.44 (95% CI: -0.65 to -0.23; 12 trials, n=3,419).
- There is high strength of evidence that nonvaginal estrogens improve urogenital atrophy symptoms compared with placebo: SMD -0.36 (95% CI: -0.35 to -0.26; 14 trials, n=5,141).
- There is high strength of evidence that ospemifene improves urogenital atrophy symptoms compared with placebo: SMD -0.75 (95% CI: -1.05 to -0.45; 3 trials, n=1,889).
- There is low strength of evidence that isoflavones improve symptoms of urogenital atrophy compared with placebo.
- There is insufficient evidence to determine whether any other nonprescription agent improves symptoms of vaginal atrophy compared with placebo.
Included Trials
Of the 283 total included trials in this review, 71 (25.1 percent) reported urogenital atrophy outcomes (40 trials specified urogenital atrophy symptoms as a primary outcome). Forty-seven trials examined effects of hormones including the following comparators: placebo (28 trials), other hormones (16 trials), and nonprescription treatments (two trials). Twenty trials examined the effects of nonprescription treatments such as isoflavones, black cohosh, and herbal extracts.
Ten trials were multinational and the remainder performed in over 25 different countries including the United States (n=14), Italy (n=7), Germany (n=6), Brazil (n=2), Hong Kong (n=2), South Korea (n=2), Taiwan (n=2), Thailand (n=2), United Kingdom (n=2), and single trials in 12 other countries (Austria, Belgium, Canada, China, Croatia, Ecuador, France, Netherlands, Norway, Spain, Turkey, Ukraine). The trials were conducted at over 1,900 sites with followup ranging from 12 to 260 weeks.
Urogenital atrophy outcomes were reported using a variety of metrics. The most common were:
- Vaginal dryness on a dichotomous scale.
- Vaginal dryness severity score, ranging from 0 (none) to 3 (severe).
- The Menopause Rating Scale (MRS) with a single item rating vaginal dryness on a five-point scale from 0 (none) to 4 (extremely severe).
- Several researchers devised their own outcome measurement for urogenital symptoms, either patient or physician assessed. Different researchers used different combinations of the following symptoms, assigning scores, resulting in an overall urogenital score: vaginal discomfort, loss of libido, dyspareunia, vaginal dryness, vaginal itching, and incontinence.
- Dryness improvement.
- The Modified Greene Climacteric Scale including a single item assessing vaginal dryness on a scale from 0 (none) to 3 (most severe).
- Visual analog scale.
- The Kupperman Menopausal Index, which rates vaginal dryness on a scale from 0 (none) to 3 (most severe).
Forty-nine trials (69.0 percent) reported some measure of vaginal dryness, 16 (22.5 percent) vaginal atrophy, 4 (5.6 percent) the Greene domain, 6 (8.4 percent) the Menopause Rating Scale, and 12 (16.9 percent) included or reported a different urogenital outcome measure.
Study quality was generally rated poor (80.3 percent), with nine fair and five high quality trials. Industry funding was indicated in 31 trials and public funding was reported in 11 trials. Table 50 describes other trial and patient characteristics.
Evidence Synthesis for Urogenital Atrophy
SMDs were calculated to allow comparing outcomes across the different scales. Pooling was performed for pairwise comparisons where evidence included three or more trials. Pairwise analyses of estrogen treatments were conducted separately for vaginal and nonvaginal administration. Pooling was performed for the following comparators versus placebo: vaginal estrogens according to dose, nonvaginal estrogens according to dose, ospemifene, isoflavones, and black cohosh. Results are displayed in Figure 13. Forest plots for pairwise comparisons are displayed in Appendix J.
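As an illustration of how outcomes on different scales can be placed on a common metric, the following minimal sketch computes a standardized mean difference with a small-sample (Hedges) correction. The function name, the choice of correction, and all input values are assumptions for illustration; the trial data are hypothetical and are not taken from the included trials.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference (treatment minus control).

    Uses the pooled standard deviation and the common small-sample
    correction factor J = 1 - 3 / (4 * df - 1), with df = n_t + n_c - 2.
    """
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / pooled_sd
    j = 1 - 3 / (4 * df - 1)
    return j * d

# Hypothetical trial A: vaginal dryness severity score, 0 (none) to 3 (severe)
g_a = hedges_g(mean_t=1.0, sd_t=0.8, n_t=50, mean_c=1.4, sd_c=0.8, n_c=50)

# Hypothetical trial B: single-item score on a 0 to 4 scale
g_b = hedges_g(mean_t=1.5, sd_t=1.0, n_t=40, mean_c=2.0, sd_c=1.0, n_c=40)

# Both results are unitless, so they can be compared or pooled.
print(round(g_a, 3), round(g_b, 3))
```

Negative values indicate lower (better) symptom scores in the treatment arm, matching the sign convention used for the urogenital atrophy results.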
Estrogen Compared With Placebo
Vaginal Estrogens
There were 13 trials that examined vaginal estrogens compared with placebo (Table 51). The routes of administration in the trials included creams, rings, ovules, and pessaries. One trial compared high-dose estrogens with placebo,179 three trials compared standard-dose estrogens with placebo,221, 226, 232 and nine trials compared low/ultralow dose estrogens with placebo.156, 220, 222-225, 227, 260 One trial was rated high quality, two fair, and thirteen poor. Pooled results (Table 51) showed that any vaginal estrogen significantly improved reported urogenital atrophy symptoms compared with placebo (SMD -0.44; 95% CI: -0.65 to -0.23; tau2=0.11; 12 comparisons). One potential outlier222 was apparent (Appendix J); including it increased the estimated effect size and heterogeneity (SMD -0.54; 95% CI: -0.77 to -0.31; tau2=0.15). Pooled effects for standard and low/ultralow dose estrogens (Table 51) were consistent with significant improvement in urogenital atrophy symptoms compared with placebo. There was a single high-dose estrogen trial (two estrogen arms versus placebo ring);179 in one arm a significant effect was noted (SMD -0.36; 95% CI: -0.63 to -0.09) but not in the other (the result for both arms combined is shown in Table 51). SMDs for standard and low/ultralow dose vaginal estrogens compared with placebo were -0.42 (95% CI: -0.61 to -0.23; tau2=0.00; three comparisons) and -0.46 (95% CI: -0.73 to -0.18; tau2=0.18; eight comparisons), respectively. Although heterogeneity was present, the strength of evidence that vaginal estrogens improve urogenital atrophy symptoms compared with placebo is rated high.
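To illustrate how a pooled SMD and a between-trial variance (tau2) of the kind reported above can arise, the following sketch pools trial-level effects under a random-effects model. It assumes the DerSimonian-Laird estimator, which is a common choice but is not stated in this section to be the method used in this review. The trial values are hypothetical and serve only to show how adding an extreme trial raises both the effect size and tau2.

```python
import math

def pool_random_effects(effects, variances, z=1.96):
    """Random-effects pooling with the DerSimonian-Laird tau2 estimator.

    Returns (pooled estimate, (CI low, CI high), tau2).
    """
    w = [1 / v for v in variances]                     # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0    # truncated at zero
    w_re = [1 / (v + tau2) for v in variances]         # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, (pooled - z * se, pooled + z * se), tau2

# Hypothetical per-trial SMDs and their variances (not data from this review)
effects = [-0.2, -0.5, -0.8]
variances = [0.02, 0.02, 0.02]
pooled, ci, tau2 = pool_random_effects(effects, variances)

# Adding one extreme hypothetical trial, analogous to including an outlier
pooled_out, ci_out, tau2_out = pool_random_effects(
    effects + [-1.6], variances + [0.02]
)
print(round(pooled, 2), round(tau2, 2), round(pooled_out, 3), round(tau2_out, 4))
```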
Nonvaginal Estrogens
Fourteen trials comparing nonvaginal estrogens with placebo were pooled (Table 52). Routes of administration included oral, transdermal patch, and skin gel. One trial examined high-dose estrogens,187 six trials standard-dose estrogens,35, 176, 190, 228, 261, 262 and eight trials low/ultralow dose.176, 182, 220, 229, 231, 263-265 One trial included three arms, comparing placebo with both a standard and low estrogen dose.176 Two trials were rated good quality, one fair, and eleven poor. Analyses by estrogen dose (high, standard, and low/ultralow) showed improvement in urogenital atrophy symptoms in every dose category (Table 52). For any estrogen dose, the pooled estimate showed little heterogeneity: SMD -0.35 (95% CI: -0.44 to -0.26; tau2=0.01; 14 trials). The strength of evidence that nonvaginal estrogens improve urogenital atrophy symptoms compared with placebo is rated high.
Isoflavones Compared With Placebo
Five trials compared isoflavones with placebo.87, 111, 207, 236, 266 Isoflavone doses ranged from 60 mg per day to 350 mg per day. Treatment arm enrollment ranged from 44 to 60 women. The pooled estimate was consistent with improved urogenital atrophy symptoms among women taking isoflavones (Table 52; SMD -0.48; 95% CI: -0.77 to -0.18; tau2=0.07). However, all trials were rated poor quality. The strength of evidence that isoflavones improve urogenital atrophy symptoms compared with placebo is rated low.
Ospemifene Compared With Placebo
Three trials compared ospemifene with placebo for its effect on clinical signs of vulvar and vaginal atrophy.258 The trials were rated fair quality. The magnitude of the pooled SMD was greater than for any other agent (-0.75; 95% CI: -1.05 to -0.45; tau2=0.06). The strength of evidence that ospemifene improves urogenital atrophy symptoms compared with placebo is rated high.
Trials Not Pooled
Estrogen Compared With Placebo
One trial (Table 53) compared low-dose estrogen alone with placebo, administered through vaginal rings.224 The estrogen treatment did not improve urogenital symptoms compared with placebo.
Estrogen Compared With Other Hormones
One trial (Table 54) compared estrogen/progestin versus estrogen/progestin plus testosterone.243 Estrogen/progestin doses were identical in both groups, with the experimental group receiving 2 mg testosterone. Both groups reported significant improvements in vaginal dryness. There was no difference in the magnitude of improvement between the groups.
Different Routes of Estrogen Administration
Eight trials compared similar estrogen doses administered by different routes (Table 55)90, 99, 146, 251, 255-257, 267 (see Appendix G for dose categorization by route). One trial showed a significant improvement in urogenital symptoms administering estrogen via pessary compared with tablet.267 A three-armed trial reported that a vaginal gel and a patch significantly improved urogenital symptoms compared with oral estrogens.99 All other trials reported no difference between the routes of administration. Given the heterogeneity among routes of administration in these trials, no strength of evidence ratings were assigned.
Nonprescription Agents Compared With Placebo
Eleven trials (Table 56) compared nonprescription agents with placebo. Four trials examined plant extracts,112, 115, 119, 123 two dehydroepiandrosterone (DHEA),130, 149 and one trial each tested isoflavones gel,259 isoflavones and berberine,111 St. John’s wort and black cohosh mix,128 a homeopathic remedy,121 and black cohosh alone.216 The two isoflavones trials reported significant improvements in urogenital symptoms compared with placebo.111, 259 Findings among the two DHEA trials were inconsistent, with one trial noting significant improvements in urogenital symptoms149 and the other reporting a nonsignificant finding.130 Heger et al. reported significant improvements using rheum rhaponticum,112 Uebelhack et al. reported significant improvements with the St. John’s wort and black cohosh combination,128 and Osmers et al. reported significant improvements with black cohosh alone.216 Due to the variety of dosages and treatments, pooling was not appropriate.
Nonprescription Agents Compared With Nonprescription Agents
One trial (Table 57) compared isoflavones versus isoflavones combined with pine bark extract,138 one trial compared two different doses of isoflavones,137 and one trial compared different dosages of pueraria mirifica.136 Agosto et al. reported a minimal improvement with the addition of pine bark extract to isoflavones compared with isoflavones alone. The other two trials reported no difference between dosages.
Trials With No Quantifiable Data
Publications from six trials lacked sufficient data to estimate effect sizes (SMD or other). Results of these trials would not have affected the overall outcomes presented above. One trial was rated fair quality268 and the remainder rated poor quality.
Schulman et al.268 compared placebo with a low dose estrogen patch given with two different progestin doses. Only post-treatment data were reported, and after 12 weeks, vaginal dryness was less frequent in both treatment arms compared with placebo (p=0.013 and p=0.016).
Le Donne et al. conducted a 3-month, randomized, double-blind trial comparing 5 mg hyaluronic acid (n=31) with 97 µg genistein (n=31), administered through vaginal suppository.269 Outcomes were reported as median genital score and both treatments provided significant relief of symptoms.
A randomized, double-blind trial compared the effect of pomegranate seed oil (n=43) with placebo (n=38).118 Outcomes were reported as pre- and post-median scores in the urogenital domain of the Menopause Rating Scale. Women in the treatment and placebo arms experienced the same improvement in scores.
A trial comparing menopausal hormone therapy (n=30) with pueraria mirifica (n=30) reported that neither treatment affected vaginal dryness significantly.134 The outcome was measured by the modified Greene Climacteric Scale.
Buckler et al. conducted a 24-week trial comparing a low-dose estrogen/progestin oral treatment with a high-dose estrogen/progestin vaginal ring. Both treatments lowered the vaginal dryness symptom intensity score, but no variance estimates or p-values were provided.
Gupta et al. conducted a trial comparing a standard dose of oral estrogen, 25 mg of DHEA, and placebo. The authors report a lower frequency of vaginal dryness in the treatment groups compared with placebo. No baseline frequencies were provided.143
Sleep Disturbance
Key Points
- A total of 56 trials including over 44,000 women reported on sleep outcomes in women treated with prescription agents (estrogen, SSRIs, gabapentin) and nonprescription agents (isoflavones, St. John’s wort, pine bark extract, rheum rhaponticum, ginseng, dioscorea alata, DHEA, pomegranate seed oil, and herbal extract).
- Forty-five of 56 trials were rated as poor quality. Eighteen trials reported only industry funding, 10 were publicly funded, seven trials were funded by both industry and public sources, and funding support was not noted in 21 trials.
- Results were reported from a variety of scales. The most common outcome reported was the proportion with insomnia (13 trials).
- Strength of evidence of relative effectiveness of agents on improving measures of sleep is as follows:
- There is high strength of evidence that estrogens are accompanied by improved measures of sleep compared with placebo: SMD 0.32 (95% CI: 0.24 to 0.46; 22 trials).
- There is moderate strength of evidence from placebo-controlled trials and direct comparisons that there is no significant difference between standard and low/ultralow dose estrogens in their effect on sleep measures.
- There is low strength of evidence that SSRIs, gabapentin, or isoflavones are accompanied by improved measures of sleep compared with placebo.
- There is insufficient evidence to determine whether any other agent, prescription or nonprescription, is effective in improving measures of sleep compared with placebo or other agent.
Included Trials
Of the 283 trials included in this review, 56 (19.8 percent) reported sleep outcomes (26 as a primary outcome). The most common nonplacebo comparators included hormones (n=28), isoflavones (n=6), and nonprescription agents such as ginseng and herbal extracts (Table 59).
Seven trials were multinational and the others were performed in over 22 different countries, including Australia (n=3) and South America (n=3), with the most from Europe (n=17) and the United States (n=11). The trials were conducted at over 1,551 sites, with followup ranging from 4 weeks in the gabapentin trials to 260 weeks.
Sleep outcomes were reported using a variety of measures and scales. The most commonly reported was the proportion with insomnia (13 trials). Other measurements included subscales of the Women’s Health Questionnaire (WHQ) (10 trials), Kupperman Menopausal Index (10 trials), Greene Climacteric Scale (eight trials), WHI Insomnia Rating Scale (two trials), and Menopause Rating Scale (MRS) (two trials). Other trials reported sleep using graphic rating scales. Following are brief descriptions of the most commonly used scales:
- WHQ consists of nine domains, with three questions comprising the sleep domain: waking early, sleeping badly for the rest of the night, and difficulty in falling asleep. The questions are answered on a 4-point scale; the answers are converted to binary scores, and the total score is then divided by the number of questions in the domain. WHQ domain scores range from 0 to 1, with higher scores indicating more severe symptoms.270
- Kupperman Index assesses 11 menopausal symptoms, including insomnia. Each symptom is scored from 0 (no symptoms) to 3 (most severe).163
- Greene Climacteric Scale has a single question about difficulty in sleeping, which is scored on a 4-point scale, from 0 (none) to 3 (severe).
- WHI Insomnia Rating Scale consists of four questions: trouble falling asleep, waking several times at night, waking up earlier than planned, and trouble falling back asleep. A 5-point scale is used to answer the questions and is coded so that the higher score indicates more severe insomnia.72
- MRS includes one question encompassing difficulty in falling asleep, difficulty in sleeping through the night, and waking up early, scaled from 0 (none) to 4 (extremely severe).
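The WHQ domain scoring described above (4-point answers reduced to binary scores, then averaged over the questions in the domain) can be sketched as follows. The response coding is an assumption made for illustration: answers are taken to be coded 1 to 4, with codes 1 and 2 treated as symptomatic. The scoring manual for the instrument should be consulted for the actual coding and for any reverse-scored items.

```python
def whq_domain_score(responses, symptomatic=(1, 2)):
    """Score one WHQ domain from 4-point item responses.

    responses: iterable of item answers coded 1-4 (assumed coding).
    symptomatic: codes counted as symptom present (assumed to be 1 and 2).
    Returns a value from 0 to 1; higher means more severe symptoms.
    """
    responses = list(responses)
    if not responses:
        raise ValueError("at least one item response is required")
    for r in responses:
        if r not in (1, 2, 3, 4):
            raise ValueError(f"response out of range: {r!r}")
    binary = [1 if r in symptomatic else 0 for r in responses]
    return sum(binary) / len(binary)

# Hypothetical answers to the three sleep-domain questions
sleep_score = whq_domain_score([1, 3, 2])  # two of three items symptomatic
print(sleep_score)
```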
Study quality was generally rated as poor (45 of the 56 trials). Funding sources were unreported in 21 trials, industry funding was noted in 25 trials, and solely public funding was cited in 10 trials. Table 59 summarizes trial and patient characteristics.
Evidence Synthesis for Sleep Disturbance
Standardized mean differences were calculated to allow comparing outcomes across different sleep scales. Pairwise pooling was performed for comparisons where evidence was available from two or more trials. To summarize the body of evidence, results from trials and treatments included in the pairwise analyses, with estrogens included as a single category, were incorporated in a network meta-analysis providing both direct and indirect estimates. To facilitate clinical interpretation, we included for reference the results from the single trial of eszopiclone, an agent approved for use in insomnia. Forest plots are displayed in Appendix K.
Estrogen Compared With Placebo
Estrogen-placebo comparisons were performed in 22 trials (24 comparisons). One trial compared high-dose estrogens with placebo,187 14 trials compared standard-dose estrogens with placebo (two good, two fair and 10 poor quality),35, 144, 145, 154, 173, 181, 183, 186, 190, 191, 196, 199, 228, 261 and nine trials compared low-dose estrogens with placebo (two fair and seven poor quality).181, 183, 186, 192, 198, 263-265, 271 Analyses according to estrogen dose showed improvements in sleep compared with placebo in each category: standard dose SMD of 0.24 (95% CI: 0.17 to 0.31) and low/ultralow dose SMD of 0.46 (95% CI: 0.29 to 0.64) (Table 60). Excluding trials focused on disease prevention from the standard estrogen dose category yielded an SMD of 0.45 (95% CI: 0.29 to 0.62; tau2=0.03). When any estrogen dose was compared with placebo, the estimated SMD was 0.32 (95% CI: 0.24 to 0.46; tau2=0.02). The strength of evidence that estrogen improves sleep disturbance compared with placebo is rated as high.
Estrogen Compared With Estrogen
Five trials included comparisons of standard with low/ultralow dose estrogen (two fair and three poor quality).181, 183, 186, 249, 272 No difference was apparent in effect on sleep metrics with a confidence interval including 0—SMD -0.08 (95% CI: -0.16 to 0.01; tau2=0.00). The strength of evidence that standard and low/ultralow dose estrogens do not differ in improving sleep disturbance is rated as moderate.
SSRIs Compared With Placebo
Two trials167, 169 compared SSRIs with placebo and assessed sleep outcomes (one rated good and one rated poor quality). Sleep metrics were improved with treatment compared with placebo: SMD 0.46 (95% CI: 0.24 to 0.68). The strength of evidence that SSRIs improve sleep disturbance compared with placebo is rated as low.
Gabapentin Compared With Placebo
Two trials42, 273 compared gabapentin treatment with placebo (one fair and one poor quality), yielding a pooled SMD of 0.37 (95% CI: 0.18 to 0.49). The strength of evidence that gabapentin improves sleep disturbance compared with placebo is rated low.
Isoflavones Compared With Placebo
Six trials compared isoflavones with placebo (one good and five poor quality).87, 207, 210, 211, 266, 274 The pooled SMD (0.37, 95% CI: 0.10 to 0.67) was consistent with better reported sleep, with some heterogeneity (tau2=0.06 or tau=0.25) and a wide confidence interval. The strength of evidence that isoflavones improve sleep disturbance compared with placebo is rated as low.
Ginseng Compared With Placebo
Two trials88, 89 compared ginseng with placebo (both rated poor quality). The pooled SMD suggested no effect on measures of sleep disturbance: SMD 0.13 (95% CI: -0.05 to 0.32). The strength of evidence that ginseng improves sleep disturbance compared with placebo is rated as insufficient.
Eszopiclone Compared With Placebo
One randomized, double-blind trial compared eszopiclone, a treatment used for insomnia (n=30), with placebo (n=29) and reported Insomnia Severity Index scores.105 The trial was rated as poor quality and showed a substantial effect (SMD: 1.08; 95% CI: 0.53 to 1.62).
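The width of a confidence interval around an SMD follows largely from the arm sizes. As a rough check, the sketch below applies a standard large-sample approximation for the standard error of an SMD to the reported estimate and arm sizes (SMD 1.08; n=30 and n=29). The formula is an assumption about how such an interval can be approximated, not a statement of how the interval in this review was derived.

```python
import math

def smd_confidence_interval(smd, n_t, n_c, z=1.96):
    """Approximate CI for a standardized mean difference.

    Uses the large-sample variance
        (n_t + n_c) / (n_t * n_c) + smd**2 / (2 * (n_t + n_c)).
    """
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + smd**2 / (2 * (n_t + n_c)))
    return smd - z * se, smd + z * se

low, high = smd_confidence_interval(1.08, n_t=30, n_c=29)
print(round(low, 2), round(high, 2))
```

Under this approximation the interval is close to the reported 0.53 to 1.62. A small discrepancy in the upper bound would be expected from rounding or from a small-sample correction.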
Network Meta-Analysis
Table 61 and Figure 14 summarize SMDs from the network meta-analysis and Table 62 displays treatment rankings. Although the effect of eszopiclone on sleep is direct, for the other agents the impact might plausibly be exerted through treatment of menopausal symptoms alone (e.g., estrogens) or by both symptom relief and sedative effect (e.g., SSRI and gabapentin). The SMDs and ranking results suggest that, whatever the mechanism, effects on sleep disturbances are similar when estrogens, SSRIs, or gabapentin are used to treat menopausal symptoms.
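The idea of an indirect estimate can be illustrated with the simplest case: two agents that were each compared with placebo but not with each other. The sketch below applies an adjusted indirect comparison (the difference of the two placebo-controlled estimates, with variances added) to the pairwise SMDs reported above for SSRIs and gabapentin. This is a simplification and not the network model used in this review. Standard errors are back-calculated from the reported interval widths, which is approximate because the gabapentin interval is not symmetric around its estimate.

```python
import math

def se_from_ci(low, high, z=1.96):
    """Back-calculate a standard error from a 95% CI width (approximate)."""
    return (high - low) / (2 * z)

def indirect_comparison(est_a, ci_a, est_b, ci_b, z=1.96):
    """Indirect estimate of A versus B through a common comparator.

    Assumes the two placebo-controlled estimates are independent.
    Returns (difference, (CI low, CI high)).
    """
    diff = est_a - est_b
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    return diff, (diff - z * se, diff + z * se)

# Pairwise SMDs versus placebo as reported in the text above
ssri = (0.46, (0.24, 0.68))
gabapentin = (0.37, (0.18, 0.49))

diff, ci = indirect_comparison(ssri[0], ssri[1], gabapentin[0], gabapentin[1])
print(round(diff, 2), tuple(round(x, 2) for x in ci))
```

The resulting interval spans zero, which is in line with the observation that effects on sleep appear similar across these agents.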
Trials Not Pooled
Estrogens
One trial compared estrogen in similar doses and reported sleep outcomes (Table 63). The trial compared a vaginal ring and a vaginal tablet, both delivering low estrogen doses. The authors report that neither improved sleep outcomes significantly.255
Other Nonprescription Agents Compared With Placebo
Nine trials compared nonprescription agents with placebo (Table 64): St. John’s wort,129 pine bark extract,110 rheum rhaponticum,112 isoflavones,111 dioscorea alata,117 DHEA,130 herbal extract,119 ovaria bovis,121 and black cohosh.86 St. John’s wort, DHEA, and ovaria bovis did not improve sleep outcomes significantly, compared with placebo.121, 129, 130 Pine bark extract,110 rheum rhaponticum,112 isoflavones,111 dioscorea alata,117 herbal extract,119 and black cohosh (two different doses)86 were reported to significantly improve sleep compared with placebo.
Nonprescription Agents Compared
Two trials (Table 65) compared nonprescription treatments with other nonprescription treatments. One trial compared isoflavones with isoflavones plus magnolia bark, and reported that the treatment group with magnolia bark experienced marginally significant improvements in sleep compared with the group treated with isoflavones alone.138 Another trial compared two different dosages of isoflavones combined with vitamin E. The group treated with higher doses of isoflavones experienced better sleep outcomes compared to the lower isoflavones dose group.141
Trials With No Quantifiable Data
Publications from seven trials lacked sufficient data to estimate effect sizes (SMD or other). Results of these trials would not have affected the overall outcomes presented above. One trial was not rated because it was an abstract;139 the remaining trials were rated poor quality.
Lubbert et al. compared two standard dose estradiol/progestogen transdermal patches, one continuous and one cyclical.146 Using the Menopause Rating Scale, the groups reported similar percentages in sleep improvement: 84.6 percent in the continuous group and 84.1 percent in the cyclical group.
In a trial comparing standard dose estrogen/progestin with 50 mg pueraria mirifica, both groups reported improvements in the modified Greene insomnia subscale.134 The estrogen group experienced a mean change score of -1.8 and the pueraria mirifica group reported a mean change score of -1.2; there was not a significant difference between the groups.
Zervoudis et al. compared isoflavones with vitamin E and reported insomnia after 52 weeks of followup.139 Insomnia decreased in 35.4 percent of the isoflavones group and in 16.1 percent of the vitamin E group. The difference between the groups was not significant.
Auerbach et al. compared pomegranate seed oil with placebo and used the Menopause Rating Scale sleeping disorder score as an outcome.118 Median scores were reported at baseline and at 12 weeks followup. Both groups had a median score of 3.0 at baseline, with the placebo group reporting a score of 2.0 and the pomegranate seed oil group reporting a score of 1.0 after 12 weeks of followup.
Pandit et al. compared a micronutrient supplement with placebo and reported insomnia rates as an outcome.133 The placebo group reported 64 percent with insomnia at baseline and 60 percent after 12 weeks. The micronutrient supplement group reported 51.7 percent with insomnia at baseline and 24.1 percent after 12 weeks of treatment.
Gupta et al. compared a standard dose of conjugated equine estrogen, DHEA, and placebo and reported insomnia rates as an outcome.143 The placebo group reported 0 percent insomnia at baseline, and 20 percent after followup. Both the estrogen group and the DHEA group reported 0 percent at baseline and 0 percent after followup. The trial was conducted for 52 weeks, though the time the followup measures were taken was not specified.
Kohama et al. compared 30 mg maritime pine extract with placebo and used the WHQ sleep domain (four items) as an outcome.125 The placebo group experienced 21 percent improvement in sleep scores, which was statistically significant. The maritime pine extract group experienced 27.8 percent improvement in sleep scores, which was also statistically significant. The authors report that the difference between groups was also statistically significant (p=0.0025).
Key Question 2. Long-Term Effects of Menopausal Hormone Therapy Preparations
This Key Question addresses the long-term effects of hormone therapies on breast cancer; gallbladder disease; colorectal cancer; coronary heart disease, stroke, and venous thromboembolism; endometrial cancer; osteoporotic fractures; and ovarian cancer among women taking hormone therapies for menopausal symptom relief. Systematic reviews and meta-analyses provided the evidence base.
As detailed in the Methods, selection was based on AHRQ guidance on incorporating existing SRs in comparative effectiveness reviews275 and on a modified version of the AMSTAR tool.52 First, systematic reviews and meta-analyses identified from the literature search were screened for relevance. Next, the selected AMSTAR criteria were added as inclusion criteria to enable the assessment of potential bias: (1) at least two electronic sources were searched and key words and/or MeSH® terms were stated, (2) trial inclusion/exclusion criteria were adequately described, and (3) trial quality (risk for bias) of included studies was assessed and documented. Thirty SRs met these criteria. Of the 30 systematic reviews, the one with the most current literature search was the 2012 review conducted by Nelson et al. for the U.S. Preventive Services Task Force (USPSTF) comparing menopausal hormone therapy with placebo for the prevention of chronic conditions.28 This report was comprehensive, addressing most outcomes included in this Key Question. Accordingly, this report was adopted as the primary source for KQ2.
The Nelson et al. systematic review included 51 publications from nine RCTs collectively enrolling over 36,000 participants: the Women’s Health Initiative (WHI) combination estrogen plus progestin trial (referred to hereafter as “estrogen/progestin”),15, 276-279 WHI estrogen-alone trial,277, 280, 281 WHI Memory Study (WHIMS),282 WHI Study of Cognitive Aging (WHISCA),283 Heart and Estrogen/Progestin Replacement Study (HERS and HERS-II),284, 285 Women’s International Study of Long Duration Oestrogen After Menopause (WISDOM),286 Oestrogen in the Prevention of Reinfarction Trial (ESPRIT),287 Estrogen Memory Study (EMS),288 and Ultra-Low-Dose Transdermal Estrogen Assessment (ULTRA).289 The report also included a WHI followup published subsequent to the literature search.290
Among the trials identified by Nelson et al., four met our inclusion criteria for this Key Question: WHI estrogen/progestin, WHI estrogen-alone, HERS/HERS-II, and ESPRIT. WHIMS and WHISCA were excluded because outcomes were not those included in this Key Question. ULTRA was excluded due to a sample size of less than 250 women per arm, and WISDOM and EMS were excluded because of short followup periods. Hazard ratios and 95% confidence intervals were abstracted from nine articles from the four trials (Table 67). Nelson et al. rated the overall quality of the body of evidence as fair, based on the number, quality, and size of studies; consistency of results between studies; and directness of effect.51 Details of the study quality ratings from the nine articles included can be found in Appendix L, quality assessments.
Women enrolled in the trials were on average older than the target population of this review. Although there is overlap in the age groups, women seeking symptom relief are in general younger than the populations of WHI (mean age of 63 years) and HERS (mean age of 67 years). We identified observational studies from the original literature search enrolling peri- and recently menopausal women in order to inform the applicability discussion. The clinical content expert was also queried regarding relevant publications. Consistency among trials with older populations and observational studies with younger populations was addressed in the strength of evidence discussion. These steps were added to those outlined in AHRQ guidance (which notes “the exact process needs to be flexible and will likely evolve”).37
Subsequent to our initial literature search, the Cochrane Collaboration published a review of long-term menopausal hormone therapy effects.291 Although the literature search included trials through February 2012, three months later than the Nelson report, this review derived a majority of data (70 percent) from the WHI and HERS trials, just as the Nelson report did. Attributable risks calculated in the Cochrane review were similar to those reported by Nelson et al. For those reasons, the Nelson report remained the primary source for this Key Question.
Also subsequent to our initial literature search, the Danish Osteoporosis Prevention Study (DOPS) results were published. DOPS was a prospective, randomized, open-label, blinded-endpoint (PROBE design)292 controlled trial of recently postmenopausal white women, mean age 50 years. Half of the women enrolled were randomized to treatment or no treatment, and the remainder made a personal choice to use menopausal hormones or not. Treatment duration was 11 years and mean followup was 16 years. Only results from the randomized population and only those reported separately for estrogen-only and estrogen/progestin groups were included here.
Finally, evidence concerning potential long-term benefits of hormone therapy was included as part of the decision-making process selecting treatments for menopausal symptoms. However, this review does not address the use of hormone therapy for preventing chronic conditions.
Breast Cancer
Summary
Three trials in the Nelson report provided data on breast cancer incidence: WHI estrogen/progestin,276 WHI estrogen-only,281 and HERS-II.285 All three trials administered oral conjugated equine estrogens (CEE) with the addition of medroxyprogesterone acetate in the estrogen/progestin trials. Mean followup ranged from 5.2 years in the WHI estrogen/progestin trial to 6.8 years in the HERS-II trial.
In the WHI trial, estrogen/progestin increased breast cancer risk compared with placebo whereas estrogen alone reduced the risk (Table 68 and Table 69). HERS-II found no significant increase in breast cancer risk in women using estrogen/progestin (Table 68).
Using only WHI data, the review by Nelson et al. estimated that the use of estrogen/progestin increased invasive breast cancer incidence by eight events per 10,000 woman-years (95% CI: 3 to 14). However, the use of estrogen-only therapy reduced invasive breast cancer incidence by eight events per 10,000 woman-years (95% CI: 1 to 14).28 A 2012 update to the WHI report found consistent results for both estrogen/progestin and estrogen-only therapies.290 The authors of this update caution that despite the risk reduction found in the estrogen-only trial, the use of estrogen for breast cancer risk reduction remains unsupported, particularly among the subgroup of women at increased breast cancer risk.
DOPS reported breast cancer incidence rates for women with natural menopause (Table 68), and for women undergoing hysterectomy (Table 69). Women experiencing natural menopause in the treatment arm received a standard dose of estradiol with the progestin NETA, and women in the treatment arm who had undergone a hysterectomy received a standard dose of estradiol. After 11 years of treatment and a total 16 years followup, compared with no treatment there were no significant differences in breast cancer incidence among those receiving estrogen/progestin or estrogen alone.293
Applicability
Evidence informing breast cancer risk in younger populations can be found in secondary analyses of the WHI trial296 and in the Million Women Study, a large observational study.297 In addition to focusing on younger women, these studies also explored potential treatment factors modifying breast cancer risk, including hormone treatment duration and time from menopause onset to hormone initiation—the so-called “gap time” (findings summarized in Table 70).
In an analysis combining the WHI estrogen/progestin trial and the WHI observational study, women using estrogen/progestin therapy with a gap time of less than five years were at greater risk of breast cancer compared to women initiating therapy later.296 However, there was no evidence in the WHI estrogen-only trial that women starting therapy soon after menopause were at increased breast cancer risk.298
The Million Women Study conducted in the United Kingdom also examined gap time and breast cancer risk, but reported some findings inconsistent with the WHI. Women taking estrogen/progestin experienced increased risk of breast cancer, whether gap time was less than five years (RR: 2.04; 95% CI: 1.97 to 2.12) or greater than five years (RR: 1.53; 95% CI: 1.38 to 1.69).297 Women taking estrogen alone, with a gap time less than five years, experienced increased risk of breast cancer (RR: 1.43; 95% CI: 1.36 to 1.49), but did not experience an increased risk if gap time was greater than five years (RR: 1.05; 95% CI: 0.89 to 1.23).297
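Whether risk differs by gap time can be gauged by comparing the two relative risks on the log scale. The sketch below back-calculates standard errors from the reported confidence intervals and forms a ratio of relative risks. It treats the two subgroup estimates as independent, which is an assumption made here for illustration and may not hold if both were estimated against a shared reference group. This calculation is not part of the cited study or of this review.

```python
import math

def log_se_from_ci(low, high, z=1.96):
    """Standard error of log(RR), back-calculated from a 95% CI."""
    return (math.log(high) - math.log(low)) / (2 * z)

def ratio_of_relative_risks(rr1, ci1, rr2, ci2, z=1.96):
    """Ratio RR1/RR2 with an approximate 95% CI (independence assumed)."""
    log_ratio = math.log(rr1) - math.log(rr2)
    se = math.sqrt(log_se_from_ci(*ci1) ** 2 + log_se_from_ci(*ci2) ** 2)
    return math.exp(log_ratio), (
        math.exp(log_ratio - z * se),
        math.exp(log_ratio + z * se),
    )

# Estrogen/progestin: gap time under five years versus over five years
rrr_ep, ci_ep = ratio_of_relative_risks(2.04, (1.97, 2.12), 1.53, (1.38, 1.69))

# Estrogen alone: gap time under five years versus over five years
rrr_e, ci_e = ratio_of_relative_risks(1.43, (1.36, 1.49), 1.05, (0.89, 1.23))

print(round(rrr_ep, 2), tuple(round(x, 2) for x in ci_ep))
print(round(rrr_e, 2), tuple(round(x, 2) for x in ci_e))
```

A ratio above 1 with an interval excluding 1 would suggest a higher relative risk with shorter gap time, subject to the independence caveat above.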
When assessing treatment duration, the WHI combined trial and observational study reported that longer use combined with a short gap time was associated with increased breast cancer risk. Among women who initiated estrogen/progestin therapy soon after menopause and had 10 years of use, the estimated HR was 2.19 (95% CI: 1.56 to 3.08).296
The Million Women Study reported that women using estrogen/progestin longer than five years, regardless of gap time, were at increased risk of breast cancer. However, the study also found that women using estrogen alone for longer than five years were at increased breast cancer risk only if gap time was less than five years.297
Trends in breast cancer incidence in relation to trends in hormone use should be noted. The WHI published a report in July 2002 explaining that the trial was stopped early because the number of invasive breast cancer events indicated that risks of menopausal hormone therapy were exceeding benefits.15 Subsequently, the number of prescriptions for estrogen/progestogen dropped 66 percent and for estrogen dropped 33 percent in January to June 2003 compared to the previous year.299 In 2003, invasive breast cancer incidence decreased 10.6 percent in women 60 to 64 and 14.3 percent in women 65 to 69.300
Conclusions
Two large RCTs, WHI276 and HERS-II,285 and one smaller RCT, DOPS,293 examined breast cancer risk accompanying estrogen/progestin treatment. WHI and HERS were rated fair quality and DOPS poor quality. The hazard ratios are consistent, showing an increased risk of breast cancer, although statistical significance was demonstrated only in the WHI trial. The measures were direct and precise. The strength of evidence is rated high that estrogen/progestin therapy increases breast cancer risk.
One large RCT, the WHI estrogen-alone trial,281 and one small RCT, DOPS,293 examined breast cancer risk associated with estrogen-alone treatment. The hazard ratios are consistent, showing a decreased risk of breast cancer, although statistical significance was demonstrated only in the larger WHI trial. Trial quality was rated fair. An update to the WHI study cautions that results may not apply to subgroups of women, such as those at increased risk of breast cancer. The point estimate from the DOPS trial indicated a decreased breast cancer risk, but the sample size was small, resulting in a wide confidence interval. The findings are also inconsistent with the results of the observational Million Women Study. The strength of evidence is rated low that estrogen alone decreases breast cancer risk.
Gallbladder Disease
Summary
Two trials reported gallbladder disease incidence: WHI estrogen/progestin277 and WHI estrogen-only.277 Oral conjugated estrogens (CEE) were administered in both trials with the addition of medroxyprogesterone acetate in the estrogen/progestin trial. Women with prior gallbladder disease or cholecystectomy were excluded. Both trials found an increased incidence of gallbladder disease with estrogen/progestin and estrogen alone compared to placebo (Table 71 and Table 72).
Using WHI data, Nelson et al. calculated additional gallbladder disease events (defined as cholecystitis and cholelithiasis) attributable to menopausal hormone therapy. Estrogen/progestin use was associated with an additional 20 gallbladder disease events per 10,000 woman-years (95% CI: 11 to 29), and estrogen-only therapy with an additional 33 events per 10,000 woman-years (95% CI: 20 to 45).28
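An excess of events per 10,000 woman-years can also be expressed as its reciprocal, the approximate number of woman-years of treatment per one additional event. The Python sketch below does that arithmetic for the figures quoted above. This re-expression is not reported by Nelson et al.; the function name `woman_years_per_event` is illustrative, and the calculation assumes the excess rate is constant over time.

```python
def woman_years_per_event(excess_per_10000):
    """Convert an excess of events per 10,000 woman-years into woman-years per one event."""
    return 10_000 / excess_per_10000


# Additional gallbladder disease events per 10,000 woman-years as quoted above:
# (point estimate, lower 95% bound, upper 95% bound).
reported = {
    "estrogen/progestin": (20, 11, 29),
    "estrogen only": (33, 20, 45),
}

summary = {}
for therapy, (point, low, high) in reported.items():
    # A larger excess means fewer woman-years per event, so the bounds swap places.
    summary[therapy] = (
        round(woman_years_per_event(point)),
        round(woman_years_per_event(high)),
        round(woman_years_per_event(low)),
    )

for therapy, (point, low, high) in summary.items():
    print(f"{therapy}: about {point} woman-years per additional event ({low} to {high})")
```

Under these assumptions, the estrogen/progestin figure corresponds to roughly one additional event per 500 woman-years of use, and the estrogen-only figure to roughly one per 303 woman-years.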
Applicability
Though the WHI trials enrolled an older population, the increased risk of gallbladder disease among women using menopausal hormone therapy is supported by results from large observational cohort studies of younger populations. The Nurses’ Health Study found a relative risk for gallbladder disease of 2.1 (95% CI: 1.9 to 2.4)301 and the Million Women Study 1.64 (95% CI: 1.58 to 1.69) for all current menopausal hormone therapy users.302 In the Atherosclerosis Risk in Communities Study, compared to women who never used menopausal hormone therapy, former users had an age-adjusted relative risk for gallbladder disease of 1.84 (95% CI: 1.3 to 2.6) and current users had a risk of 1.76 (95% CI: 1.3 to 2.4).303 Finally, risks may differ according to route of administration. In an analysis of the Million Women Study, transdermal administration was found to confer a lesser relative risk (1.17, 95% CI: 1.10 to 1.24) of gallbladder disease than all users (1.64, 95% CI: 1.58 to 1.69).302
Conclusions
The evidence for estrogen/progestin treatment and gallbladder disease risk consists of one large RCT, the WHI trial.277 Trial quality was rated as fair. Consistency is unknown, but results from the trial are supported by the results of several large observational studies. The measures are direct and precise. The strength of evidence is rated moderate that estrogen/progestin increases gallbladder disease risk.
The evidence for treatment with estrogen alone and gallbladder disease risk consists of one large RCT, the WHI trial.277 Trial quality was rated fair. Consistency is unknown, but the results of the trial are supported by the results of several large observational studies. The measures are direct and precise. The strength of evidence is rated moderate that estrogen alone increases gallbladder disease risk.
Colorectal Cancer
Summary
Three trials reported colorectal cancer incidence: WHI estrogen/progestin,278, 294 WHI estrogen-only,281 and HERS-II.285 Oral conjugated equine estrogen (CEE) was used in all three trials with the addition of medroxyprogesterone acetate in the estrogen/progestin trials. The WHI estrogen/progestin trial showed a protective effect on colorectal cancer incidence, while the other two trials (HERS and WHI estrogen-only) reported no effect of menopausal hormone therapy on colorectal cancer incidence (Table 73 and Table 74).
Applicability
Several large observational studies following younger populations also examined menopausal hormone therapy and colorectal cancer risk: the Breast Cancer Detection Demonstration Project (BCDDP),304 the Nurses’ Health Study,305 and the Molecular Epidemiology of Colon Cancer Study.306
The BCDDP reported that women treated with estrogen/progestogen for 2 to 5 years had a relative risk for colorectal cancer of 0.52 (95% CI: 0.32 to 0.87), but results for women treated fewer than 2 years and women treated more than 5 years were nonsignificant.304 Women treated with estrogen alone for more than 10 years had a relative risk of 0.69 (95% CI: 0.56 to 0.96), but no association was evident in women treated for fewer than 10 years (e.g., 5 to 9 years of use RR 0.74 [95% CI: 0.53 to 1.02]).304 Current hormone users (75 percent of person-time was estrogen alone and 25 percent estrogen/progestogen) in the Nurses’ Health Study had a colorectal cancer relative risk of 0.65 (95% CI: 0.50 to 0.83). This same relationship was not found in past users.305 The Molecular Epidemiology of Colon Cancer Study reported an odds ratio for colon cancer among hormone users of 0.37 (95% CI: 0.22 to 0.62), adjusting for age, sex, aspirin use, statin use, sports activities, family history of colon cancer, ethnic group, and vegetable consumption level.306
A meta-analysis that included observational studies as well as the two trials cited here (WHI and HERS) reported a relative risk for colorectal cancer of 0.83 (95% CI: 0.79 to 0.86) for ever users of estrogen alone, and a relative risk for colorectal cancer of 0.81 (95% CI: 0.75 to 0.87) for ever users of estrogen/progestogen.307
Although the meta-analysis showed a protective effect of menopausal hormone use, the observational studies show either no effect or a protective effect for certain subgroups of hormone users. Two of the large studies combined estrogen/progestogen and estrogen-only users into one broad category of hormone users in the analyses.
Conclusions
The evidence for estrogen/progestin therapy and colorectal cancer risk consists of two large RCTs, the WHI trial278, 294 and HERS-II.285 The quality of both trials was rated as fair. Results are inconsistent, with WHI reporting a protective effect and HERS-II reporting no effect. The evidence is direct. The estimates are imprecise (HERS-II with a wide confidence interval). The strength of evidence is rated low that estrogen/progestin therapy decreases colorectal cancer risk.
The evidence informing estrogen therapy alone and colorectal cancer risk consists of one large RCT, the WHI trial.281 Trial quality was rated as fair. The results do not show a significant relationship between estrogen therapy and colorectal cancer risk. Consistency is unknown with only one trial, though the intervention, postintervention, and overall measures all show no effect. The measures are direct and precise. The strength of evidence is rated moderate that estrogen therapy alone does not affect colorectal cancer risk.
Coronary Heart Disease, Stroke, and Venous Thromboembolic Events
Summary
Three trials examined the incidence of coronary heart disease, stroke, or venous thromboembolic events: WHI estrogen/progestin,278 WHI estrogen-only,281 and ESPRIT.287 Oral conjugated estrogen (CEE) was administered in the WHI trials and estradiol valerate (E2V) in the ESPRIT trial. The WHI trials found that neither hormone therapy increased mortality due to coronary heart disease or myocardial infarction. However, both therapies were associated with an increased incidence of stroke (Table 75 and Table 76). Using WHI data, Nelson et al. calculated that estrogen/progestin therapy resulted in nine more strokes per 10,000 woman-years (95% CI: 2 to 15), and estrogen-only therapy resulted in 11 more strokes per 10,000 woman-years (95% CI: 2 to 20). Deep venous thrombosis (DVT) events were also increased with both estrogen/progestin and estrogen-only therapies. Estrogen/progestin resulted in 12 more DVT events per 10,000 woman-years (95% CI: 6 to 17) and estrogen-only therapy resulted in seven more DVT events per 10,000 woman-years (95% CI: 1 to 14).28
ESPRIT did not find significant relationships between estrogen-only treatment and stroke, pulmonary embolism, deep venous thrombosis, or mortality due to coronary heart disease, possibly due to a smaller sample size (n=1,017).
Applicability
Because the WHI and HERS trials administered hormones for primary or secondary CHD prevention, they enrolled older women whose ages overlapped those of the target population of this review. Consequently, hormone therapy was often initiated later after menopause than when used to treat menopausal symptoms. In the WHI trials, hormone therapy was begun more than 5 years after menopause in 16 percent of women previously using hormones and in 90 percent of women without prior hormone use.47 The potential modifying effects of age and time since menopause at hormone therapy initiation on CHD incidence have been examined in secondary analyses of the WHI trials,308 the WHI trials and observational study combined,47 and in the Nurses’ Health Study.45
In the WHI estrogen-only trial, the hazard ratios for CHD among women less than 10 years, 10 to 19 years, and 20 or more years since menopause were 0.48 (95% CI: 0.20 to 1.17), 0.96 (95% CI: 0.64 to 1.44), and 1.12 (95% CI: 0.86 to 1.46), respectively (p=0.15 for trend).308 In the estrogen/progestin trial, corresponding hazard ratios were 0.88 (95% CI: 0.54 to 1.43), 1.23 (95% CI: 0.85 to 1.77), and 1.66 (95% CI: 1.14 to 2.41) (p=0.05 for trend). Trends in CHD risk were not significantly modified by age at randomization in the estrogen-only (p=0.12) or estrogen/progestin (p=0.70) trials. Stroke risks were unaffected by age or years since menopause in either the WHI estrogen-only or estrogen/progestin trial.
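Hazard ratio intervals of this kind are usually symmetric on the log scale, which allows an approximate standard error to be recovered from the reported bounds. The Python sketch below does this for the estrogen/progestin hazard ratios quoted above. It assumes the intervals are Wald-type 95% intervals on the log scale (not stated in the source), and the function name `log_scale_summary` is illustrative.

```python
import math

Z_95 = 1.96  # normal quantile assumed for a two-sided 95% interval


def log_scale_summary(lower, upper):
    """Return (approximate SE of the log hazard ratio, geometric midpoint of the interval)."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    se = (log_upper - log_lower) / (2 * Z_95)
    midpoint = math.exp((log_lower + log_upper) / 2)
    return se, midpoint


# CHD hazard ratios in the WHI estrogen/progestin trial by years since menopause,
# as quoted above: (reported HR, lower bound, upper bound).
reported = {
    "< 10 years": (0.88, 0.54, 1.43),
    "10 to 19 years": (1.23, 0.85, 1.77),
    ">= 20 years": (1.66, 1.14, 2.41),
}

results = {}
for stratum, (hr, lower, upper) in reported.items():
    se, midpoint = log_scale_summary(lower, upper)
    results[stratum] = (hr, se, midpoint)
    print(f"{stratum}: HR {hr}, SE(log HR) ~ {se:.3f}, interval midpoint ~ {midpoint:.2f}")
```

If the assumption holds, the geometric midpoint of each interval should fall close to the reported hazard ratio, which serves as a quick consistency check on transcribed estimates.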
Prentice et al.47 subsequently reexamined both the WHI trials and WHI observational study in further detail, individually and combined, according to years since menopause (less than 5, 5 to 14, 15 or more years) and whether prior hormone therapy had been taken. In the combined trial and observational study analysis, there was no evidence for modification of CHD risk by time since menopause with estrogen alone or estrogen/progestin for prior or first time hormone users. In women with prior hormone use fewer than 2 years menopausal, estrogen/progestin therapy was accompanied by an increased CHD risk (HR 3.03, 95% CI: 1.36 to 6.75).
In a novel reanalysis to account for the potential biases of observational studies, Hernán et al. examined the association between estrogen/progestin therapy and CHD incidence in 35,575 women initiating estrogen/progestin in the Nurses’ Health Study. The hazard ratio for CHD was 1.42 (95% CI: 0.92 to 2.20) in the two years following initiation, compared with 0.96 (95% CI: 0.78 to 1.18) over the entire follow-up examined. Among women with prior hormone use and fewer than 10 years since menopause, during the two years after starting estrogen/progestin the hazard ratio for CHD was 1.33 (95% CI: 0.66 to 2.64) versus 0.77 (95% CI: 0.54 to 1.09) subsequently; among women 10 or more years from menopause corresponding hazard ratios were 1.48 (95% CI: 0.83 to 2.64) and 1.05 (95% CI: 0.77 to 1.43). For women without prior hormone use generally similar findings were noted, with the exception of a significant protective effect among women fewer than 10 years postmenopausal after two years of estrogen/progestin (HR 0.58, 95% CI: 0.37 to 0.90).45
Overall, these analyses of age and time since menopause support the conclusion that the WHI CHD risks are applicable to recently menopausal women. Finally, although the WHI did not address route of administration, observational data from the Million Women Study found no increased relative risk of VTE with transdermal estrogen-only administration (0.82, 95% CI: 0.64 to 1.06).309
Conclusions
The evidence for estrogen/progestin therapy and coronary heart disease consists of one large RCT, the WHI trial.278 The trial did not find a significant relationship between treatment and overall coronary heart disease, myocardial infarctions, or death from coronary heart disease. Trial quality was rated as fair. The strength of evidence is rated moderate that estrogen/progestin increases coronary heart disease risk.
The evidence for estrogen/progestin therapy and venous thromboembolic events consists of one large RCT, the WHI trial.278 There were significant relative increases in all three outcomes: stroke, pulmonary embolism, and DVT. Trial quality was rated as fair. With one trial, consistency is unknown, although all three measures show increased risk. The strength of evidence is rated moderate that estrogen/progestin therapy increases the risk of stroke, pulmonary embolism, and DVT.
The evidence concerning estrogen therapy and coronary heart disease consists of one large RCT, the WHI trial,281 and one small RCT, the ESPRIT trial.287 The WHI trial reported total MI, CHD death, and overall CHD. The ESPRIT trial reported only CHD death. All four measures show no effect of estrogen therapy. Both trials were rated fair quality. Consistency is unknown for total MI and overall CHD because only one trial reported those measures. CHD death was consistent between the two trials. The strength of evidence is rated moderate that estrogen does not affect coronary heart disease risk.
The evidence for estrogen therapy and venous thromboembolic events consists of one large RCT, the WHI trial,281 and one small RCT, the ESPRIT trial.287 The WHI trial found significant increases in stroke and DVT. ESPRIT also found increases in stroke and DVT events, though the increases were not significant, possibly due to the small sample size. Both trials were rated fair quality. The strength of evidence is rated high that estrogen therapy increases venous thromboembolic risk.
Endometrial Cancer
Summary
Two trials (Table 77) reported the incidence of endometrial cancer: WHI estrogen/progestin278 and HERS/HERS-II.285 Both trials administered oral conjugated equine estrogen (CEE) with medroxyprogesterone acetate (MPA). Followup ranged from 5.2 years in WHI to 6.8 years in HERS/HERS-II. No significant differences in endometrial cancer incidence were observed in the trials of estrogen/progestin therapies. The increased risk of endometrial cancer when using estrogen-only therapies has previously been established.48
Applicability
Two large observational studies of younger women, the Nurses’ Health Study310 and the European Prospective Investigation into Cancer and Nutrition,311 reported that the risk of endometrial cancer accompanying menopausal hormone therapy differed depending on whether the progestin was administered continuously or sequentially when added to estrogen therapy. The European study showed an increased risk of endometrial cancer when progestin was administered sequentially (HR: 1.52; 95% CI: 1.00 to 2.29), and a decreased risk of endometrial cancer when progestin was administered continuously (HR: 0.24; 95% CI: 0.08 to 0.77).311 The Nurses’ Health Study reported an RR of 3.00 (95% CI: 1.43 to 6.28) when progestin was added sequentially for 1 to 8 days; an RR of 1.25 (95% CI: 0.76 to 2.04) when progestin was added sequentially for 9 to 18 days; and an RR of 1.34 (95% CI: 0.88 to 2.04) when progestin was added continuously.310 Further research in this area is necessary.
Conclusions
The evidence concerning estrogen/progestin therapy and endometrial cancer included two large RCTs, the WHI trial278 and HERS/HERS-II.285 Both trials administered estrogen with progestin added continuously. Point estimates from both trials showed a protective effect, but small numbers of cases resulted in wide nonsignificant confidence intervals. Both trials were rated as fair quality. Results are consistent between these trials. The measures are imprecise with wide confidence intervals. The strength of evidence is rated moderate that estrogen with continuous progestin therapy does not increase endometrial cancer risk.
Osteoporotic Fractures
Summary
Three trials reported the incidence of osteoporotic fractures: WHI estrogen/progestin,15 WHI estrogen-only,280 and HERS/HERS-II.285 Oral conjugated estrogen (CEE) was administered in all three trials. Followup ranged from 5.2 years in the WHI trial to 6.8 years in the HERS/HERS-II trial.
The HERS/HERS-II trial did not detect an effect on fracture incidence with estrogen/progestin therapy. In the WHI trials, both estrogen/progestin and estrogen alone were associated with lowered osteoporotic fracture incidence (Table 78 and Table 79). Based on the WHI estimates, estrogen/progestin therapy resulted in 46 fewer fractures per 10,000 woman-years (95% CI: 29 to 63), and estrogen-only therapy resulted in 56 fewer fractures per 10,000 woman-years (95% CI: 37 to 75). Decreased incidences of hip and vertebral fractures were observed for both therapies as well. Per 10,000 woman-years, estrogen/progestin therapy resulted in six fewer hip fractures (95% CI: 1 to 10) and six fewer vertebral fractures (95% CI: 1 to 11). Estrogen-only therapy resulted in seven fewer hip fractures (95% CI: 1 to 12) and six fewer vertebral fractures (95% CI: 1 to 12).28
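To give a sense of scale, a rate difference per 10,000 woman-years can be projected onto a cohort of a given size and treatment duration. The Python sketch below does so for the total fracture estimates quoted above. The cohort of 1,000 women treated for 5 years is purely hypothetical and chosen for illustration, the function name `expected_fewer_events` is not from the source, and the projection assumes the rate difference stays constant over time and across women.

```python
def expected_fewer_events(rate_per_10000_wy, women, years):
    """Project a rate difference per 10,000 woman-years onto a cohort of given size and duration."""
    return rate_per_10000_wy * women * years / 10_000


# Hypothetical cohort, for illustration only (not taken from the WHI trials).
HYPOTHETICAL_WOMEN = 1_000
HYPOTHETICAL_YEARS = 5

# Fewer total fractures per 10,000 woman-years, as quoted above.
fewer_ep = expected_fewer_events(46, HYPOTHETICAL_WOMEN, HYPOTHETICAL_YEARS)
fewer_eo = expected_fewer_events(56, HYPOTHETICAL_WOMEN, HYPOTHETICAL_YEARS)

print(f"estrogen/progestin: about {fewer_ep:.0f} fewer fractures")
print(f"estrogen only: about {fewer_eo:.0f} fewer fractures")
```

Under these assumptions, the hypothetical cohort accrues 5,000 woman-years, so the projected differences are half the per-10,000 figures.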
Applicability
The WHI and HERS trials have older but overlapping populations compared to the target population of this review. Additional evidence for younger populations was not identified.
Conclusions
The evidence concerning estrogen/progestin therapy and osteoporotic fractures consists of two large RCTs, the WHI trial15 and HERS/HERS-II.285 The WHI trial found significant decreases in hip, vertebral, other, and total fractures. The HERS trial did not find significant relationships, possibly due to a small sample size, as seen with the wide confidence intervals in the estimates. Both trials were rated as fair quality. While results were inconsistent, the measures were direct, and the WHI estimates were precise. The strength of evidence is rated moderate that estrogen/progestin therapy decreases osteoporotic fracture risk.
The evidence for estrogen therapy and osteoporotic fractures consists of the WHI trial.280 The trial reported significant reductions in hip, vertebral, and total osteoporotic fractures. Trial quality was rated as fair. Consistency is unknown with one trial. The measures are direct and precise. The strength of evidence is rated moderate that estrogen therapy decreases the risk of osteoporotic fractures.
Ovarian Cancer
Summary
One trial reported ovarian cancer incidence: WHI estrogen/progestin.279 This trial administered oral conjugated estrogen (CEE) with the addition of medroxyprogesterone acetate. The hazard ratio was consistent with an increased risk for ovarian cancer, although the wide confidence interval includes 1.00 (Table 80).
No RCTs in the Nelson report provided evidence for an association between estrogen alone and ovarian cancer.
Applicability
Two large observational studies with younger populations have reported on risks of ovarian cancer among women treated with estrogen/progestogen: the European Prospective Investigation into Cancer and Nutrition and the Cancer Prevention Study II (CPS-II). Both studies found a nonsignificant relationship between estrogen/progestogen use and ovarian cancer incidence: the European study an adjusted HR of 1.20 (95% CI: 0.89 to 1.62)312 and CPS-II an adjusted RR for former estrogen/progestogen users of 1.40 (95% CI: 0.86 to 2.28) and for current estrogen/progestogen users of 1.18 (95% CI: 0.79 to 1.76).313
A systematic review and meta-analysis of menopausal hormone therapy and ovarian cancer risk was conducted by Greiser et al.295 The review included 30 case control studies, seven cohort studies, four cancer registry studies, and one randomized controlled trial. The summary risk estimate for ovarian cancer with use of estrogen/progestogen was 1.1 (95% CI: 1.0 to 1.2).295
The evidence reviewed was judged consistent with the WHI results.
Conclusions
The evidence concerning estrogen/progestogen therapy and ovarian cancer consists of one large RCT, the WHI trial.279 The trial reported an increased risk of ovarian cancer, but the findings were not statistically significant. Trial quality was rated as fair. Consistency is unknown with one trial, but results from two large observational studies and a meta-analysis also show increased but nonsignificant risk. Measures were direct. Evidence is imprecise (wide CI) due to few events. The strength of evidence is rated low that estrogen/progestin therapy increases ovarian cancer risk.
Key Question 3. Nonhormone Other Benefits/Harms
This Key Question addresses the long-term effects of nonhormone therapies on the following conditions: breast cancer; gallbladder disease; colorectal cancer; coronary heart disease, stroke, venous thromboembolism; endometrial cancer; osteoporotic fractures; and ovarian cancer. Eight randomized controlled trials, two cohort studies and four case-control studies formed the evidence base (Table 82 through Table 85) (detailed inclusion criteria listed in Methods section, Tables 2 and 3). We excluded population-based dietary studies and studies reporting intermediate outcomes.
Evidence examining associations of nonhormone therapies with breast cancer; colorectal cancer; coronary heart disease, stroke, and venous thromboembolism; osteoporotic fractures; and ovarian cancer was identified314-327 (Table 82). No evidence was identified evaluating associations with endometrial cancer or gallbladder disease.
Also addressed are agent-specific harms of nonhormone therapies, summarized following the analyses on long-term effects.
Breast Cancer
Summary
Many studies evaluating soy or herbal preparations and breast cancer incidence were identified but did not meet inclusion criteria, being either population based dietary studies or reporting only intermediate outcomes (Appendix B). Included studies and results are summarized in Table 86. Two case control studies326, 327 and one cohort study325 examining isoflavones, black cohosh, ginseng, St. John’s wort, and dong quai met inclusion criteria. Three studies on vitamin E intake317-319 and two studies on SSRI/SNRI use314, 316 also met inclusion criteria.
The Heart Outcomes Prevention Evaluation (HOPE) and its extension, HOPE–The Ongoing Outcomes trial (HOPE-TOO), examined vitamin E (400 IU daily) and breast cancer incidence.317 The trial enrolled women with vascular disease or diabetes (n=9,541 in HOPE, with n=7,030 continuing in HOPE-TOO). Followup in HOPE was 6 years, with an additional 4 years in HOPE-TOO. A second RCT,318 the Women’s Health Study (WHS), enrolled healthy women aged 45 years or older with a 10-year average followup. Participants in the treatment group took 600 IU of vitamin E every other day. A third RCT, the Women’s Antioxidant Cardiovascular Study, administered 600 IU of vitamin E every other day to women at high risk for cardiovascular disease. Followup averaged 9.4 years.319
A number of studies have investigated a possible antidepressant-breast cancer association. We excluded those enrolling women of all ages because of difficulty assessing modification by age on any exposure-disease association. We also excluded studies that reported results for any antidepressants and not specifically for SSRI/SNRIs. Two case-control studies met inclusion criteria. Chien et al.314 enrolled women aged 65 to 79 diagnosed with invasive breast cancer. Information on history of antidepressant use in the 20 years prior to the cancer diagnosis was collected, and results were reported for all antidepressants and for subgroups of antidepressants: tricyclics (TCA), SSRIs, and triazolopyridines. Wernli et al.316 investigated women 20 to 69 years of age, but subgroup analyses for women aged 50 years or older were provided. Results were reported for all antidepressants combined, as well as for specific types of antidepressants (SSRI, TCA, and SNRI).
Rebbeck et al. conducted a population based case control study on the association of hormone related supplements and breast cancer risk.327 Cases were women in Pennsylvania and New Jersey, 50 to 79 years of age with newly diagnosed breast cancer (n=949), matched by age and race with 1,524 controls. Prior to telephone interviews, postcards were mailed to participants with names of hormone related supplements commonly used to relieve menopausal symptoms, such as isoflavones, black cohosh, dong quai, and ginseng.327 MARIE (Mammary carcinoma Risk Factor Investigation), a case control study in Germany, investigated associations between herbal preparations used to alleviate menopausal symptoms and breast cancer risk.326 Cases (n=3,257) were women 50 to 74 years of age identified through the Hamburg cancer registry, matched through population registries by age and region to controls (n=6,646). Breast cancer risk and the use of isoflavones, black cohosh, and St. John’s wort were assessed.326 A subset of the VITAL (Vitamins And Lifestyle) cohort study investigated the long term use of supplements and breast cancer risk.325 Women aged 50 to 76 years, residing in the western Washington state area, were followed for a mean of six years. Vitamin and supplement use during the ten year period prior to baseline was determined. Breast cancer risk and the use of isoflavones, black cohosh, dong quai, and St. John’s wort were assessed.325
Conclusions
The evidence concerning vitamin E and breast cancer risk consists of three large RCTs.317-319 The population of one trial318 was healthy women over 45 years of age and the other two trials focused on women with vascular disease or diabetes.317, 319 Participants received vitamin E supplements or placebo. The trials—with followups of up to 10 years and sample sizes of 7,030,317 39,876,318 and 8,171319—found no significant benefit of vitamin E for preventing breast cancer. All trials were rated as good quality (Table 83). The results are consistent among all three trials. The measures are direct and the narrow confidence interval around the null in the larger trial318 indicates precision. The strength of evidence is rated high that vitamin E does not affect breast cancer risk.
The evidence for SSRI/SNRI use and breast cancer risk consists of two case-control studies.314, 316 Both observational studies were rated poor quality (Table 85). Results are consistent and direct. One study had a small sample size and imprecise measures. The strength of evidence is rated insufficient that SSRIs affect breast cancer risk.
The evidence for isoflavones and breast cancer risk consists of two case control studies326, 327 and one cohort study.325 None of the studies detected an association between isoflavones supplement use and breast cancer risk (Table 86). The case control studies were rated poor quality (Table 85) and the cohort study was rated fair quality (Table 84). The results among all three studies are consistent. The measures are direct, but the wide confidence intervals indicate imprecision. The strength of evidence is rated insufficient that isoflavones affect breast cancer risk.
The evidence for black cohosh and breast cancer risk consists of two case control studies326, 327 and one cohort study.325 One case control study reported a decreased breast cancer risk among black cohosh users.327 The other case control study reported a point estimate suggesting a decreased risk, but the confidence interval upper limit is 1.0.326 The cohort study found no association between black cohosh use and breast cancer325 (Table 86). The case control studies were rated poor quality (Table 85) and the cohort study was rated fair quality (Table 84); in the context of observational studies all associations were weak.329 Results were inconsistent among the three studies, but measures were direct. Wide confidence intervals indicate imprecise measures. The strength of evidence is rated insufficient that black cohosh affects breast cancer risk.
The evidence examining St. John’s wort and breast cancer risk consists of one case control study326 and one cohort study.325 Neither study detected an association between St. John’s wort and breast cancer risk (Table 86). The case control study was rated poor quality (Table 85) and the cohort study was rated fair quality (Table 84). The results are consistent between the two studies. The measures are direct, but wide confidence intervals indicate imprecision. The strength of evidence is rated insufficient that St. John’s wort affects breast cancer risk.
The evidence concerning dong quai and breast cancer risk consists of one case control study327 and one cohort study.325 Neither of the studies reported an association between dong quai and breast cancer risk (Table 86). The case control study was rated poor quality (Table 85) and the cohort study was rated fair quality (Table 84). The results are consistent. The measures are direct but imprecise. The strength of evidence is rated insufficient that dong quai affects breast cancer risk.
The evidence for ginseng and breast cancer risk consists of one case control study.327 The study reports no association between ginseng and breast cancer (Table 86). The study was rated poor quality (Table 85). Consistency is unknown with a single study. The measure was direct. The confidence interval is wide indicating imprecision. The strength of evidence is rated insufficient that ginseng affects breast cancer risk.
Gallbladder Disease
No studies evaluating associations between nonhormone therapies used for menopausal symptom relief and gallbladder disease were identified.
Colorectal Cancer
Summary
Two included studies (Table 87) evaluated vitamin E and colorectal cancer. Dietary studies of soy and colorectal cancer incidence and one study reporting results in men and women combined were excluded.
One large RCT,318 the Women’s Health Study (WHS), investigated the long-term effects of taking 600 IU of vitamin E every other day. The trial population was healthy women aged 45 years or older and followup an average of 10 years. The second trial, the Women’s Antioxidant Cardiovascular Study, also administered 600 IU of vitamin E every other day. The trial enrolled women aged 40 years or older with cardiovascular disease risk factors. Average followup was 9.4 years.319
Conclusions
Two large RCTs examined the effect of vitamin E on colorectal cancer. One trial, with a sample size of 39,876 and a followup of 10 years, found no statistically significant benefit of vitamin E in the prevention of colon cancer (RR=1.00).318 The second trial, with a sample size of 8171 and a followup of 9.4 years, reported a protective effect (RR=0.63), but the estimate was not statistically significant (95% CI: 0.34 to 1.15).319 The trials were rated as good quality (Table 83). The estimates were consistent and direct. The measure for the larger trial was precise, though the smaller trial had a wider confidence interval. The strength of evidence is rated high that vitamin E does not affect colorectal cancer risk.
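The link between the reported interval (0.34 to 1.15) and the lack of statistical significance can be checked with a short calculation. The following is a minimal sketch, assuming the interval was computed on the log scale with a normal approximation; that assumption, and the function names, are illustrative and are not taken from the trial report.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Back-calculate the standard error of log(RR) from a 95% CI.

    Assumes the interval is symmetric on the log scale (an assumption,
    not something stated in the trial report).
    """
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def two_sided_p(rr, lower, upper):
    """Approximate two-sided p-value for the null hypothesis RR = 1."""
    z = math.log(rr) / se_from_ci(lower, upper)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal
    return math.erfc(abs(z) / math.sqrt(2))

# Values reported for the Women's Antioxidant Cardiovascular Study
rr, ci_low, ci_high = 0.63, 0.34, 1.15

includes_null = ci_low <= 1.0 <= ci_high
p_value = two_sided_p(rr, ci_low, ci_high)

print(f"CI includes RR=1: {includes_null}")
print(f"approximate two-sided p-value: {p_value:.2f}")
```

Because the interval spans 1, the approximate p-value exceeds 0.05, which matches the report's statement that the estimate was not statistically significant.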
Coronary Heart Disease, Stroke, or Venous Thromboembolism
Summary
The literature examining the potential effect of soy (isoflavones) on the prevention of cardiovascular disease is large, but limited to population based dietary studies or those reporting intermediate outcomes. Consequently, the studies were excluded. Three RCTs were identified that met inclusion criteria: two administered vitamin E318, 324 and one examined desvenlafaxine (Table 88).328
The Women’s Health Study318 examined vitamin E supplementation and cardiovascular disease among healthy women aged 45 years or older. The average length of followup was 10 years. Outcomes included overall cardiovascular events, myocardial infarction, stroke, and cardiovascular death. In the Women’s Antioxidant Cardiovascular Study, 600 IU vitamin E was prescribed every other day to women over age 40 at increased risk for cardiovascular disease.324 The average followup was 9.4 years, and outcomes included myocardial infarction, stroke, and cardiovascular death.
One RCT investigated the safety of desvenlafaxine given to healthy postmenopausal women who were seeking treatment for vasomotor symptoms.328 This phase 3 RCT administered desvenlafaxine 100 mg per day and followed the participants for one year. Safety outcomes measured were: coronary heart disease related deaths, new myocardial infarctions, new onset unstable angina requiring hospitalization, and unscheduled revascularization procedures.
Conclusions
The evidence comparing vitamin E with placebo and the risk for cardiovascular events, myocardial infarction, stroke, and cardiovascular death consists of two trials. The samples were large with mean followups of 9.4 and 10 years. Neither trial found a statistically significant benefit of vitamin E in the prevention of overall cardiovascular events, including myocardial infarction and stroke.318, 324 The WHS report found a significant protective effect on cardiovascular death,318 but the Women’s Antioxidant Cardiovascular Study did not.324
Both trials were rated good quality (Table 83). Consistent results were reported for cardiovascular events overall, as well as for myocardial infarction and stroke when analyzed separately. The measures are direct and precise. The strength of evidence is rated high that vitamin E does not affect overall cardiovascular event risk, including myocardial infarction and stroke.
The WHS trial reported a statistically significant benefit of vitamin E in the prevention of cardiovascular death,318 whereas the Women’s Antioxidant Cardiovascular Study did not.324 The WHS result is uncertain because it is inconsistent not only with the other trial, but also with the WHS results, which showed no difference in the number of overall cardiovascular events. Additionally, there are well-described inaccuracies in the ascertainment of cardiovascular deaths as coded in death certificates.330 Although the trial is of good quality, the outcome may be inaccurate and potentially biased. The strength of evidence is rated low that vitamin E decreases cardiovascular death risk.
The evidence for an association between desvenlafaxine and the risk for cardiovascular events consists of one phase 3 RCT.328 During one year of followup, one woman in the placebo group experienced an acute myocardial infarction, one woman in the desvenlafaxine group experienced a probable stroke, and another woman in the desvenlafaxine group experienced a probable transient ischemic attack. The trial was rated fair quality. Consistency is unknown with one trial. The measures are direct, but imprecise, with small numbers of events resulting in wide confidence intervals. The strength of evidence is rated insufficient that desvenlafaxine affects cardiovascular event risk.
Endometrial Cancer
Summary
No studies meeting inclusion criteria evaluating the effect of nonhormone agents on endometrial cancer were identified. However, we briefly note a report from a working group of 22 clinical and research experts in the field of women’s health and botanicals convened by the North American Menopause Society.331 The group evaluated current evidence on health effects of isoflavones in peri- and postmenopausal women, including both menopausal symptom relief and long-term benefits and harms. There was no description provided on how articles were chosen for inclusion in the report. The publication discusses several large population based studies on soy consumption and the risk of endometrial cancer, which are not applicable for this current review.332-334 The Society paper also reviewed several RCTs on soy treatment and endometrial hyperplasia—an intermediate outcome.
Conclusions
The strength of evidence is rated insufficient that treatment with soy products affects endometrial cancer risk.
Osteoporotic Fractures
Summary
We identified three trials evaluating the effect of soy (isoflavones) on osteoporotic fractures321-323 (which were incorporated in a meta-analysis335) and one observational study of the association between antidepressants and osteoporotic fractures.320
Spangler et al. (2008) analyzed data from participants of the Women’s Health Initiative Observational Study, focusing on depressive symptoms, antidepressant use, and bone fractures.320 After adjusting for depressive symptoms, as well as demographic, lifestyle, and reproductive factors, the investigators found SSRI use associated with an increased risk of fractures at any site. Analysis by fracture site found SSRI users with increased fracture risk in spine and other sites.
Bolaños et al. (2010) performed an indirect treatment comparison, comparing a meta-analysis of three isoflavones versus placebo trials with a meta-analysis of ten hormone replacement therapy versus placebo trials, for the reduction of vertebral fractures. A search through the trials register of Cochrane Osteoporosis Treatment Trial Group, Cochrane Controlled Trials, MEDLINE®, EMBASE®, ProQuest, BIREME, Trip Database, LILACS, and Scielo through September 2009 was conducted. The Jadad scale336 was used to assess the quality of the RCTs. The three isoflavones trials compared ipriflavone at a dosage of 600 mg/day plus a calcium supplement with a calcium supplement alone. The pooled estimate for isoflavones versus placebo did not show a significant reduction in vertebral fractures. The authors concluded that isoflavones therapy was “similar” to menopausal hormone therapy for preventing vertebral fracture using a simple calculation of the indirect odds ratios, but did not apply the methods necessary to appropriately estimate indirect effects and assess consistency.337 Because the appropriate statistical methods were not used, the meta-analysis is not included in our evidence table. The three RCTs321-323 are included in our review (Table 89).
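To illustrate what a more formal indirect estimate involves, the sketch below applies one commonly used approach, the adjusted indirect comparison often attributed to Bucher and colleagues, in which the two pooled log odds ratios against the common comparator are subtracted and their variances are added. Whether this is the method the cited guidance337 has in mind is an assumption. The input odds ratios and intervals are hypothetical placeholders, not values from Bolaños et al. or from the included trials.

```python
import math

def log_or_and_se(odds_ratio, lower, upper, z_crit=1.96):
    """Return log(OR) and its standard error, back-calculated from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(odds_ratio), se

def adjusted_indirect_comparison(a_vs_c, b_vs_c, z_crit=1.96):
    """Indirect OR of A versus B through a common comparator C.

    Each argument is a tuple (OR, lower 95% limit, upper 95% limit).
    The variances add, so the indirect estimate is always less precise
    than either pooled estimate it is built from.
    """
    log_a, se_a = log_or_and_se(*a_vs_c, z_crit=z_crit)
    log_b, se_b = log_or_and_se(*b_vs_c, z_crit=z_crit)
    log_ab = log_a - log_b
    se_ab = math.sqrt(se_a ** 2 + se_b ** 2)
    return (
        math.exp(log_ab),
        math.exp(log_ab - z_crit * se_ab),
        math.exp(log_ab + z_crit * se_ab),
    )

# HYPOTHETICAL pooled estimates, for illustration only
isoflavones_vs_placebo = (0.60, 0.25, 1.44)
hormone_vs_placebo = (0.65, 0.45, 0.94)

or_ind, lo_ind, hi_ind = adjusted_indirect_comparison(
    isoflavones_vs_placebo, hormone_vs_placebo
)
print(f"indirect OR = {or_ind:.2f} (95% CI {lo_ind:.2f} to {hi_ind:.2f})")
```

With inputs like these, the indirect interval is wide and includes 1. Such a result fails to show a difference, which is not the same as showing that the two treatments are similar. Consistency also cannot be assessed in this sketch, because it would require direct head-to-head evidence to compare against the indirect estimate.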
Conclusions
The evidence for SSRI use and osteoporotic fractures consists of one large prospective cohort study (n=93,675) with 7212 SSRI users followed for a mean of 7.4 years.320 Hazard ratios were consistent with an increased risk for fractures in all sites, but the risks were only significant for wrist, other, and all sites. The study was rated fair. Consistency is unknown with a single study. The measures were direct and precise. The strength of evidence is rated low that SSRIs increase osteoporotic fracture risk.
The evidence for the effect of soy (isoflavones) on osteoporotic fractures consists of three trials. Two trials each enrolled fewer than 100 participants, who were followed for two years,321, 322 and one trial of 474 women had a followup of three years.323 One trial was rated fair quality and two trials were rated poor quality. The results were inconsistent, with the larger trial reporting no effect and the two smaller trials showing a potential protective effect of isoflavones. The measures were direct, but imprecise due to the small sample sizes. The strength of evidence is rated insufficient that isoflavones affect osteoporotic fracture risk.
Ovarian Cancer
Summary
One trial (Table 90) examining the effect of vitamin E on ovarian cancer was identified.319 The Women’s Antioxidant Cardiovascular Study, a double blind placebo-controlled trial, administered 600 IU of vitamin E every other day to women aged 40 years or older and at risk for cardiovascular disease. The study found that vitamin E had no effect on ovarian cancer incidence.
Conclusions
The evidence for vitamin E and ovarian cancer consists of one RCT. The single trial, with a sample size of 8171, reports a protective, though not statistically significant, effect.319 The trial is rated good quality. Consistency is unknown with one trial. The measure is direct, but imprecise due to the small number of cases resulting in a wide confidence interval. The strength of evidence is rated insufficient that vitamin E affects ovarian cancer risk.
Strength of Evidence—Nonhormone Other Benefits/Harms
Table 91 summarizes strength of evidence ratings for the effects of nonhormone menopausal therapies on breast, ovarian, endometrial, and colorectal cancers, cardiovascular disease, and gallbladder disease.
Compounded Hormone Therapies
There is insufficient evidence regarding the safety and efficacy of compounded “bioidentical” hormone therapy for treatment of menopausal symptoms. We were unable to identify any clinical trials evaluating compounded hormone therapy for menopausal symptoms that met our criteria for inclusion. One randomized trial compared the pharmacokinetics of an estrogen-containing compounded “bioidentical” cream and a conventional “bioidentical” patch, but the outcomes did not include vasomotor symptoms or other harms/benefits, the study length was less than the 12-week duration for hormone trials, and the number of participants was too low for inclusion in this review (NCT00864214).338 Four evidence-based position statements from professional societies and special committee reports were reviewed and included in the report to illustrate the general consensus that evidence-based research on compounded hormone therapy is lacking.19, 32, 33, 38-40 Given the growing interest in and increasing prescriptions of compounded hormones, the limitations in the evidence base regarding the safety and efficacy of these therapies emphasize the priority that should be given to future research. Many claims regarding the safety, efficacy, and superiority of compounded hormones have not been supported, and FDA has voiced concern over pharmacies misleading women and practitioners with unsupported claims of safety and greater efficacy than FDA-approved menopausal hormone therapies.
Adverse Events
Summary
Among KQ1 trials of nonhormone prescription therapies used to treat menopausal symptoms, 12 trials reported adverse events. Six trials reported adverse events for desvenlafaxine,148, 166, 168, 171, 235, 339 three reported events for gabapentin,200, 340 two reported events for escitalopram,148, 167 and one reported events for clonidine (Appendix M, Table M-1a and Table M-1b).106 The most common adverse events reported were in the following categories: nervous system (12 of 12 trials), gastrointestinal (11 of 12 trials), general disorders and administration site conditions (10 of 12 trials), and eye (6 of 12 trials). The highest incidence of reported events was from a trial with desvenlafaxine (47.8 percent gastrointestinal)339 and from a trial with clonidine (52.4 percent nervous system) (Appendix M, Table M-1a and Table M-1b).106
Among KQ1 trials of nonprescription therapies to treat menopausal symptoms, 16 trials reported adverse events. Nine trials reported adverse events with the use of soy (isoflavones) treatments,112, 201, 203, 205, 207, 208, 341-343 three with black cohosh,128, 216, 344 three with plants or multibotanicals,108, 134, 344 one with St. John’s wort,128 and one with DHEA (Appendix M, Table M-2a and Table M-2b).150 One trial reported adverse events for both a nonhormone prescription therapy (fluoxetine) and a nonprescription therapy (black cohosh), and this trial’s results were added to Appendix M, Tables M-1a and M-1b. The most common adverse events reported were in the following categories: gastrointestinal (15 of 16 trials), nervous system (11 of 16 trials), musculoskeletal (10 of 16 trials), reproductive system/breast (10 of 16 trials), and general disorders and administration site conditions (8 of 16 trials). The highest incidences of reported events were from a single soy trial (52.5 percent gastrointestinal and 25.4 percent reproductive system/breast).342
In addition to adverse events reported among KQ1 trials, a systematic review of black cohosh adverse events345 and a meta-analysis of black cohosh and hepatotoxicity346 were identified. The systematic review did not focus on postmenopausal women, but the authors discussed several case reports of potential liver problems, such as acute hepatitis and autoimmune hepatitis, related to the use of black cohosh when used to treat menopausal symptoms. Causal associations were difficult to discern because in some cases, herbal preparations were taken with black cohosh, and in one case, a relapse occurred after the black cohosh treatment had been stopped months earlier.345 The meta-analysis included five RCTs with a total of 1,020 peri- and postmenopausal women. There was no significant difference in liver function parameters (alanine aminotransferase, aspartate aminotransferase, and γ-glutamyltranspeptidase) among treatment groups and placebo groups.346
Key Question 4. Effectiveness of Treatments for Menopausal Symptoms in Selected Subgroups
This Key Question addresses the effectiveness of therapies for menopausal symptoms among subgroups of women. The evidence base consisted of the randomized controlled trials that also included subgroup analyses. Subgroups of interest included age, BMI, race, severity of menopausal symptoms, time since menopause, and uterine status.
Twenty-seven trials reported relevant subgroup analyses.93, 94, 114, 127, 144, 145, 167, 172, 175, 183, 199, 201, 216, 234, 244, 248-250, 266, 347-354 Results of the subgroup analyses are presented by outcome category: vasomotor symptoms, sexual function, psychological symptoms, quality of life, sleep disturbance, and urogenital symptoms. Within each outcome category, there is an evidence base table of trials by subgroup and type of treatment, trial quality assessments, and summaries. Detailed results tables are in Appendix N. Strength of evidence could not be assigned owing to the variety of treatments, outcome measures, and subgroup definitions.
Vasomotor Symptoms
Nineteen trials reported subgroup analyses for vasomotor outcomes: ten were hormone therapy trials,175, 183, 234, 249, 250, 348-351, 353 one was an SSRI (escitalopram) trial,167 and eight were nonprescription therapy trials.114, 127, 201, 216, 266, 347, 352, 354 Results are summarized in Table 92.
Vasomotor Symptoms by Age
Three trials included subgroup analyses on vasomotor symptoms by age (Appendix N, Table N-1). Two administered estrogens348, 349 and one Chinese medicinal herbs.114
In a trial comparing three doses of estrogen skin gel (all low dose) with placebo, Hedrick et al. reported significant improvements in number of moderate to severe hot flushes and night sweats in all treatment arms. When analyzed by age (<50, 50 to 59, and ≥60), significant improvements were observed only in women aged 50 to 59 years. Significant improvements with the younger and the older age groups may not have been detected due to smaller sample sizes in those subgroups.348
Rigano et al. compared a standard dose estrogen patch with placebo on menopausal symptoms and reported significant improvements among women using the patch. Subgroup analyses by age (48 to 50, 51 to 53, and 54 to 56) found improvements in proportions with hot flushes in all groups.349
In a trial comparing Chinese medicinal herbs with placebo, Davis et al. reported improved vasomotor symptoms for both the treatment group and the placebo group, with no between-group difference. Analyses for women younger than 55 years and women 55 years of age or older showed significant improvement in the MENQOL vasomotor score only among younger women treated with medicinal herbs.114
Vasomotor Symptoms by BMI
Two trials conducted subgroup analyses on vasomotor symptoms by BMI. Both interventions were nonprescription, one using two different doses of isoflavones201 and one using Chinese medicinal herbs (Appendix N, Table N-2).114
In the isoflavones trial, Tice et al. reported equivalent improvements in vasomotor symptoms among the placebo and two isoflavones treatment groups. Subgroup analyses on women with a BMI <25 and ≥25 kg/m2 found no effect modification by BMI. The numbers in each subgroup were not provided and significance tests were not performed.201
The trial with Chinese medicinal herbs reported that MENQOL vasomotor scores were similar between the treatment and placebo groups in the overall study population. Subgroup analyses on women with BMI ≤25 and >25 kg/m2 found that women with BMI ≤25 kg/m2 experienced significantly reduced vasomotor symptoms compared with placebo, while women with higher BMI did not.114
Vasomotor Symptoms by Race
One trial conducted a subgroup analysis on vasomotor symptoms by race (African-American, White, and other) (Appendix N, Table N-3).167 The trial compared an SSRI (escitalopram, 10 to 20 mg) with placebo and reported significant improvements in vasomotor symptoms for the treatment group compared with the placebo. In the subgroup analysis, compared with placebo, daily total hot flushes and night sweats decreased among White, but not African-American women, in the SSRI group.
Vasomotor Symptoms by Severity of Symptoms
Nine trials included subgroup analyses on vasomotor symptoms according to severity of symptoms (Appendix N, Table N-4). Three trials administered estrogen/progestin: one included a placebo comparator234 and two compared different estrogen/progestin doses.249, 250 Two trials compared isoflavones with placebo;266, 354 a three-arm trial compared estrogen/progestin, isoflavones, and placebo;353 one trial compared black cohosh with placebo;352 one trial compared equol with placebo;347 and one trial compared black cohosh plus isoflavones with placebo.127
In the trial comparing standard dose estrogen/progestin with placebo, Maki et al. presented mean change in total hot flushes for women with baseline symptom severity scores above and below the overall mean. A significant decrease in vasomotor symptoms was observed only in the subgroup that was above the mean, possibly reflecting a floor effect in women with lesser symptoms.234
In a three arm trial with two standard doses of estrogen/progestin and one high dose of estrogen/progestin, Pitkin et al. measured weekly moderate to severe hot flushes for women and reported that all doses were accompanied by significant reductions. Subgroup analyses for women with 30 or more hot flushes per week at baseline and in those with fewer than 30 at baseline found significant hot flush reductions, regardless of estrogen dose or severity of symptoms at baseline.250
In a three-arm trial comparing three doses of estrogen (ultralow, low, and standard), mean daily total hot flushes decreased significantly among all treatment groups. Subgroup analysis on only women experiencing three or more hot flushes per day also showed that all estrogen doses were equally effective.249
The two isoflavones trials conducted analyses focusing only on women with more severe symptoms: a Kupperman Index score greater than 20266 and women with 4 or more hot flushes and night sweats per day.354 One trial reported a significant improvement in moderate to severe hot flushes among women treated with isoflavones compared with placebo,266 but the other trial found equally significant improvements in total hot flushes and night sweats in both isoflavones and placebo groups.354
The three-arm trial of estrogen/progestin, isoflavones, and placebo reported significant reductions in daily total hot flushes in the two treatment groups compared with placebo for the whole trial population. A subgroup analysis limited to women with more severe symptoms (a hot flush score >5), also showed significant improvements in total hot flushes among both the treatment groups compared with placebo.353
One study compared black cohosh with placebo and reported similar reductions in total hot flushes among both the black cohosh and placebo groups in the larger study sample. However, in a subgroup analysis limited to women with more severe symptoms (Kupperman Index ≥20), women treated with black cohosh experienced significant improvements compared with placebo.352
In the equol trial, Aso et al. reported significant reductions in hot flushes in the treatment group compared with placebo.347 Separate analyses on women with fewer than 3 hot flushes per day and women with 3 or more per day were performed. Both subgroups experienced decreases in daily hot flushes, but the difference was only significant in the subgroup with more severe symptoms.347
Verhoeven et al. conducted a trial comparing the effects of a supplement containing both isoflavones and black cohosh with placebo. Reductions in total hot flushes between the groups were similar in the whole trial sample, as well as in the subgroup with more severe symptoms (≥9 hot flushes per day).127
Vasomotor Symptoms by Time Since Menopause
Six trials conducted subgroup analyses on vasomotor symptoms by time since menopause (Appendix N, Table N-5). Two trials compared estrogen plus bazedoxifene with placebo;183, 351 one trial compared estrogen/progestin with placebo;175 one trial compared estrogen with placebo;350 and two trials compared nonprescription treatments (Chinese medicinal herbs114 and black cohosh216) with placebo.
The two estrogen/bazedoxifene trials were part of the Selective Estrogens, Menopause, and Response to Therapy (SMART) trials. In these trials, low and standard dose estrogens were combined with bazedoxifene and compared with placebo. Women in both treatment groups experienced significant reductions in MENQOL vasomotor scores compared with women in the placebo group. When subgroup analyses were conducted on women less than 5 years and 5 years or more since menopause, the estrogen groups in both trials experienced significant reductions in vasomotor scores compared with placebo, regardless of time since menopause.183, 351
In the Baerug et al. (1998) trial, two low-dose estrogen/progestin groups were compared with placebo. All three groups experienced significant improvements in vasomotor symptoms, and the differences between the estrogen groups and the placebo group were also significant. Subgroup analysis comparing late perimenopausal and postmenopausal women showed that mean weekly hot flushes were similarly improved in both treatment groups compared with placebo.175
Simon et al. (2001) compared standard dose estrogen with placebo and found significant improvements in vasomotor symptoms over placebo. Subgroup analysis was conducted on four subgroups (0 to ≤6 months since last menses; 6 to ≤12 months since last menses; 12 to ≤36 months since last menses; and >36 months since last menses). Fewer daily moderate-to-severe hot flushes were observed in all subgroups with estrogen, but the reduction was significant only in the two later menopausal groups (12 to ≤36 months since last menses and >36 months since last menses). Significant improvements in the earlier menopausal groups may not have been detected because of the smaller sample sizes in those subgroups.350
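The point about small subgroups can be made concrete with an approximate power calculation. The sketch below uses a normal approximation for a two-arm comparison of means with equal group sizes. The standardized effect size and the sample sizes are hypothetical and are not taken from Simon et al. or any other included trial.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def approx_power(std_effect, n_per_arm, z_crit=1.96):
    """Approximate power of a two-sided, two-sample z-test.

    std_effect is the true difference in means divided by the common
    standard deviation. The negligible contribution of the opposite
    rejection tail is ignored.
    """
    noncentrality = std_effect * math.sqrt(n_per_arm / 2.0)
    return normal_cdf(noncentrality - z_crit)

# HYPOTHETICAL standardized treatment effect, identical in every subgroup
HYPOTHETICAL_EFFECT = 0.5

for n in (20, 50, 100):
    power = approx_power(HYPOTHETICAL_EFFECT, n)
    print(f"n per arm = {n:3d}  approximate power = {power:.2f}")
```

Under these assumptions, the same true effect is detected far less often in a subgroup with 20 women per arm than in one with 100 per arm, so a nonsignificant result in a small subgroup is weak evidence that the treatment does not work in that subgroup.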
Davis et al. (2001) compared Chinese medicinal herbs with placebo and reported improvements in vasomotor symptoms in both groups. Subgroup analysis for women experiencing less than 4 and 4 or more years of amenorrhea was performed. MENQOL vasomotor score and total daily hot flushes and night sweats were reported. There were no significant differences in vasomotor outcomes between the two subgroups.114
Osmers et al. (2005) compared black cohosh with placebo and noted significant improvements in vasomotor symptoms in the black cohosh group compared with placebo. A subgroup analysis on early and late climacteric women was performed. The difference in changes from placebo on the Menopause Rating Scale for hot flushes was significant in both early (p<0.002) and late (p<0.006) climacteric women.216
Vasomotor Symptoms by Uterus Status
One trial reported subgroup analyses of vasomotor outcomes by uterus status (absent or intact) (Appendix N, Table N-6). Three low doses of estrogen skin gel (0.25 mg, 0.50 mg, and 1.0 mg) were compared with placebo.348 No vasomotor outcomes for the study groups as a whole were reported. Among women with absent uteri, the number of moderate to severe hot flushes decreased significantly in women treated with 0.25 mg and 1.0 mg estrogen gel, and severity of flushes decreased significantly only in the 1.0 mg estrogen gel group. Among women with intact uteri, the number of moderate to severe hot flushes decreased significantly in the 0.50 mg and 1.0 mg treatment groups, and severity of hot flushes decreased significantly in all treatment groups.348
Sexual Function
Seven trials included subgroup analyses of sexual function outcomes (Table 93). Six were estrogen therapy trials144, 145, 183, 244, 349, 351 and one was a nonprescription therapy trial.114
Sexual Function by Age
Four trials conducted subgroup analyses on sexual function by age (Appendix N, Table N-7). Two trials compared estrogen alone treatment with placebo;144, 349 one trial compared estrogen/progestin with placebo;145 and one trial compared a nonprescription treatment with placebo.114
Two trials were part of the Women’s Health Initiative. One trial compared standard dose estrogen with placebo144 and one compared standard dose estrogen/progestin with placebo.145 Both conducted subgroup analyses on women aged 50 to 54 years with moderate to severe vasomotor symptoms. Sexual satisfaction scores improved significantly compared with placebo in the estrogen/progestin trial, but not in the estrogen alone trial.
Rigano et al. (2001) compared a standard dose estrogen patch with placebo and assessed sexual activity by age subgroups (48 to 50, 51 to 53, and 54 to 56). Estrogen treatment resulted in more women reporting decreased sexual activity compared with placebo, with the strongest effect in the oldest age group.349
Davis et al. (2001) compared Chinese medicinal herbs with placebo. There was no significant difference in MENQOL sexual score in the trial population as a whole. In subgroup analyses for women younger than 55 years and 55 years or older, improvement in MENQOL sexual score was seen in both age groups treated with herbs, but not statistically distinguishable compared with placebo.114
Sexual Function by BMI
Davis et al.114 (Appendix N, Table N-8) compared Chinese medicinal herbs with placebo and found no difference in MENQOL sexual score. A subgroup analysis was conducted for women with BMI ≤25 and >25 kg/m2. Neither BMI subgroup experienced a significant difference in MENQOL sexual score with treatment compared with placebo.114
Sexual Function by Time Since Menopause
Three trials reported subgroup analyses for sexual function according to time since menopause (Appendix N, Table N-9). Two compared estrogen plus bazedoxifene with placebo (SMART trials),183, 351 and one Chinese medicinal herbs with placebo.114
In the estrogen/bazedoxifene trials, low or standard dose estrogens were combined with bazedoxifene and compared with placebo. MENQOL sexual scores were compared in analyses for women less than 5 and 5 or more years since menopause. In both trials, the estrogen/bazedoxifene treatment significantly improved sexual scores only in women menopausal for 5 years or more.183, 351
Davis et al. compared Chinese medicinal herbs with placebo and did not detect a difference in the study groups in MENQOL sexual score. Subgroup analyses of women who were amenorrheic for less than 4 or 4 or more years were performed. The difference in change over placebo in MENQOL sexual score was slightly lower in participants with more than 4 years amenorrhea, but the difference was not statistically significant.114
Sexual Function by Uterus Status
A single trial conducted subgroup analyses on sexual function by uterus status (Appendix N, Table N-10).244 In this three arm trial of testosterone (0.15 mg and 0.30 mg) compared with placebo, number of satisfying sexual episodes per week did not differ among the three groups. Subgroup analyses were conducted among women with natural menopause and women with surgical menopause. Among women with natural menopause, significant improvements in number of satisfying sexual episodes per week were reported for both the 0.15 mg testosterone group (p=0.02) and the 0.30 mg testosterone group (p<0.001) compared with placebo. In women with surgical menopause, no significant differences from placebo were observed.244
Psychological Symptoms
Eight trials with subgroup analyses reported psychological outcomes (Table 94). Five were hormone therapy trials,144, 145, 183, 199, 351 one was a desvenlafaxine trial,172 one was a Chinese medicinal herb trial,114 and one was a black cohosh trial.216
Psychological Symptoms by Age
Three trials conducted subgroup analyses on psychological symptoms by age.
Two of the trials were part of the Women’s Health Initiative trials. One tested standard dose estrogen with placebo144 and one tested standard dose estrogen/progestin with placebo.145 Both trials conducted subgroup analyses on women aged 50 to 54 with moderate to severe vasomotor symptoms. The researchers combined the Center for Epidemiological Studies Depression Scale plus two items from the Diagnostic Interview Schedule as a psychological outcome measure. Neither of the trials found a significant difference in psychological measures in the treatment groups compared with placebo within this subgroup.
One trial compared Chinese medicinal herbs with placebo. A subgroup analysis was conducted on women younger than 55 years of age and women 55 years of age or older. Neither age group showed a statistically significant difference in MENQOL psychological score between the treatment and placebo groups.114
Psychological Symptoms by BMI
One trial conducted subgroup analyses on psychological symptoms by BMI (Appendix N, Table N-12).114 Davis et al. compared Chinese medicinal herbs with placebo in women with a BMI ≤25 and women with a BMI >25. Neither subgroup experienced significant differences in MENQOL psychological score between the treatment and placebo groups.
Psychological Symptoms by Time Since Menopause
Six trials conducted subgroup analyses on psychological symptoms by time since menopause (Appendix N, Table N-13). Two trials compared estrogen plus bazedoxifene with placebo;183, 351 two trials compared nonprescription treatments (Chinese medicinal herbs114 and black cohosh216) with placebo; one trial compared estrogen with placebo;199 and one trial compared an SNRI (desvenlafaxine) with placebo.172
The two estrogen/bazedoxifene trials were part of the Selective Estrogens, Menopause, and Response to Therapy (SMART) trials. In these trials, low-dose and standard-dose estrogens were combined with bazedoxifene and compared with placebo. When subgroup analyses were conducted on women who were less than 5 years menopausal compared with women menopausal for 5 years or more, none of the treatment groups in either of the subgroups experienced significant reductions in MENQOL psychological scores compared with placebo.183, 351
Strickler et al. compared a standard dose of conjugated equine estrogen with placebo and reported no difference in WHQ anxiety scores between the two study groups. A subgroup analysis was conducted on women less than 4 years postmenopausal and women who were postmenopausal for 4 years or more. In neither subgroup did WHQ anxiety scores change significantly with treatment compared with placebo.199
Kornstein et al. compared an SNRI (10 mg desvenlafaxine) with placebo and reported significant improvements in Hamilton depression scores in the treatment group compared with placebo. A subgroup analysis on perimenopausal and postmenopausal women found that both subgroups experienced significant improvements in depressive symptom scores following desvenlafaxine treatment compared with placebo.172
Davis et al. compared Chinese medicinal herbs with placebo among subgroups of women experiencing amenorrhea for less than 4 years and women experiencing amenorrhea for 4 years or more. The MENQOL psychological scores did not change significantly in either of the subgroups.114
Osmers et al. compared black cohosh (40 mg) with placebo and reported a significant improvement in Menopausal Rating Scale psychological scores for the black cohosh group compared with the placebo group. A subgroup analysis by time since menopause found a marginally significant improvement in psychological scores among the early climacteric women (p=0.05), but no significant change among the late climacteric women (p=0.08).216
Psychological Symptoms by Comorbidities
One trial conducted subgroup analyses on psychological symptoms by comorbidities (Appendix N, Table N-14).199 Strickler et al. (2000) reported no difference in WHQ anxiety scores among women treated with standard dose estrogen compared with placebo. Subgroup analyses were conducted on women with a baseline anxiety score of less than 3.5 and women with a baseline anxiety score of 3.5 or more. A significant reduction in WHQ anxiety scores was observed only in the subgroup with higher baseline anxiety scores.199
Quality of Life
Nine trials conducting subgroup analyses reported quality-of-life outcomes (Table 95). Six were hormone therapy trials,93, 94, 183, 234, 248, 351 two were black cohosh trials,216, 352 and one was an isoflavones trial.266
Quality of Life by Severity of Symptoms
Five trials conducted subgroup analyses on quality of life by severity of symptoms (Appendix N, Table N-15). One trial compared estrogen/progestin with placebo;234 one trial compared an estradiol spray with an estradiol patch;93 one trial compared oral estradiol plus dydrogesterone with spray estradiol plus dydrogesterone;94 one trial compared isoflavones with placebo;266 and one trial compared black cohosh with placebo.352
Maki et al. compared standard dose estrogen/progestin with placebo and performed subgroup analyses on symptomatic women (hot flush severity score of ≥1.2 at baseline) and asymptomatic women (hot flush severity score of <1.2 at baseline). Two different quality-of-life scales were used as outcomes: the total Greene Climacteric Scale (GCS), a menopause-specific quality-of-life scale in which a lower score indicates a better quality of life, and the Utian Quality of Life (QOL) Scale, a general health quality-of-life scale in which a higher score indicates a better quality of life. A significant improvement in quality of life was reported with the Utian QOL among symptomatic women in the treatment group compared with placebo. There was no difference on the GCS between the subgroups.234
Lopes et al. compared an estradiol patch with an estradiol spray and reported equivalent significant improvements in total Kupperman Index scores with both routes of administration in the whole study population. A subgroup analysis on women with more severe symptoms, more than seven hot flushes per day, also found both routes of administration providing significant improvements in total Kupperman Index scores, with no difference between the routes.93
Mattsson et al. compared standard doses of oral estradiol with standard doses of estradiol spray and found equivalent significant improvements in total Kupperman Index scores with both routes of administration. A subgroup analysis on women with more severe symptoms, more than seven hot flushes per day, also found both routes of administration providing significant improvements in total Kupperman Index scores, with no difference between the routes.94
In a trial comparing isoflavones with placebo, Lee et al. reported a significant improvement in total Kupperman Index score in the treatment group compared with placebo. A subgroup analysis on women with more severe symptoms, defined as a Kupperman Index score greater than 20 at baseline, also found a significant improvement in quality of life in the isoflavones group compared with the placebo group.266
In a trial comparing black cohosh with placebo, Frei-Kleiner et al. reported no difference in median Kupperman Index score between the study groups. In a separate analysis of the subgroup of women with a Kupperman Index score greater than 20 at baseline, the authors reported a significant improvement in quality of life among those treated with black cohosh compared with placebo.352
Quality of Life by Time Since Menopause
Four trials conducted subgroup analyses on quality of life by time since menopause (Appendix N, Table N-16). Two trials compared estrogen plus bazedoxifene with placebo;183, 351 one trial compared low dose and standard dose estrogen/progestin therapy;248 and one trial compared black cohosh with placebo.216
The two estrogen/bazedoxifene trials were part of the Selective Estrogens, Menopause, and Response to Therapy (SMART) trials. In these trials, low dose and standard dose estrogens were combined with bazedoxifene and compared with placebo. When subgroup analyses were conducted on women who were less than 5 years menopausal compared with women menopausal for 5 years or more, all the treatment groups experienced significant reductions in total MENQOL scores compared with placebo, regardless of time since menopause.183, 351
Loh et al. (2002) compared low-dose estrogen/progestin with standard-dose estrogen/progestin and used Kupperman Index as an outcome. Both hormone treatments were efficacious in improving overall Kupperman Index scores in the whole study population. Subgroup analyses were performed on women whose time since menopause was less than 3 years and women whose time since menopause was 3 years or more. Total Kupperman Index scores were improved equally in both subgroups by both low-dose and standard-dose groups.248
In the Osmers et al. (2005) trial, black cohosh was compared with placebo, and a significant improvement in total Menopause Rating Scale (MRS) score was reported in the black cohosh group. Subgroup analyses were performed on early climacteric women and late climacteric women. For both early and late climacteric women, significant improvements in the black cohosh group compared with the placebo group were observed.216
Sleep Disturbance
Five trials conducting subgroup analyses reported sleep disturbance outcomes (Table 96). All five trials tested hormone therapies.144, 145, 183, 349, 351
Sleep Disturbance by Age
Three trials conducted subgroup analyses on sleep disturbance by age categories (Appendix N, Table N-17). Two trials compared estrogen with placebo144, 349 and one trial compared estrogen/progestin with placebo.145
Two of the trials were part of the Women’s Health Initiative trials. One tested standard dose estrogen alone with placebo144 and one tested standard dose estrogen/progestin with placebo.145 Both trials conducted analyses on the subgroup of women aged 50 to 54 with moderate to severe vasomotor symptoms. The researchers used mean change in WHI sleep score as an outcome measure. Women treated with estrogen alone experienced significant improvements in sleep scores in the treatment group as a whole, but did not have significant improvements in sleep scores in the subgroup of younger women with more severe symptoms.144 Women treated with estrogen/progestin experienced significant improvements in sleep scores in the treatment group as a whole, as well as in the subgroup of younger women with more severe symptoms.145
Rigano et al. compared a standard dose estrogen transdermal patch with placebo. Sleep disturbance measures were not provided for the population as a whole, only by age subgroups (48 to 50, 51 to 53, and 54 to 56). Women receiving hormone therapy in all age groups reported less insomnia compared to the women receiving placebo. Significance between subgroups was not calculated.349
Sleep Disturbance by Time Since Menopause
Two trials, comparing estrogen plus bazedoxifene with placebo, conducted subgroup analyses on sleep disturbance by time since menopause (Appendix N, Table N-19). The two estrogen/bazedoxifene trials were part of the Selective Estrogens, Menopause, and Response to Therapy (SMART) trials. In these trials, low dose and standard dose estrogens were combined with bazedoxifene and compared with placebo.183, 351 Subgroup analyses were conducted in both trials on women who were less than 5 years menopausal and on women who were menopausal for 5 years or more. Lobo et al. measured mean difference in Quality of Sleep Score, and reported no significant improvements in sleep among women less than 5 years menopausal; however, significant improvements were found in women menopausal for 5 years or more who received the estrogen/bazedoxifene treatments.351 Utian et al. reported significant improvements in Medical Outcome Survey sleep scores among the treatment groups as a whole, and in both subgroups of early and late menopausal women.183
Urogenital Atrophy
One trial conducted subgroup analyses and reported urogenital atrophy outcomes (Table 97). The trial compared black cohosh with placebo.216
Urogenital Symptoms by Time Since Menopause
One trial conducted subgroup analyses on urogenital symptoms by time since menopause (Appendix N, Table N-19). The trial compared black cohosh with placebo and found improved Menopausal Rating Scale scores in the treatment group. When subgroup analysis was conducted on early and late climacteric women, Osmers et al. found that in both early and late climacteric women, black cohosh improved urogenital atrophy compared with placebo.216