St John's wort for depression—an overview and meta-analysis of randomised clinical trials

BMJ 1996;313:253 (Published 03 August 1996)
  1. Klaus Linde, scientific assistant (a)
  2. Gilbert Ramirez, codirector (b)
  3. Cynthia D Mulrow, professor of medicine (b)
  4. Andrej Pauls, consultant psychiatrist (c)
  5. Wolfgang Weidenhammer, biostatistician (a)
  6. Dieter Melchart, project leader (a)

  (a) Projekt “Munchener Modell,” Ludwig-Maximilians-Universitat, Kaiserstrasse 9, 80801 Munich, Germany
  (b) San Antonio Cochrane Center, Audie L Murphy Memorial Veterans Hospital, San Antonio, TX 78284, USA
  (c) Private Practice for Neurology and Psychiatry, 80796 Munich

  Correspondence to: Dr Linde
  • Accepted 24 April 1996


Objective: To investigate if extracts of Hypericum perforatum (St John's wort) are more effective than placebo in the treatment of depression, are as effective as standard antidepressive treatment, and have fewer side effects than standard antidepressant drugs.

Design: Systematic review and meta-analysis of trials revealed by searches.

Trials: 23 randomised trials including a total of 1757 outpatients with mainly mild or moderately severe depressive disorders: 15 (14 testing single preparations and one a combination with other plant extracts) were placebo controlled, and eight (six testing single preparations and two combinations) compared hypericum with another drug treatment.

Main outcome measures: A pooled estimate of the responder rate ratio (responder rate in treatment group/responder rate in control group), and the numbers of patients reporting side effects and dropping out because of them.

Results: Hypericum extracts were significantly superior to placebo (ratio = 2.67; 95% confidence interval 1.78 to 4.01) and similarly effective as standard antidepressants (single preparations 1.10; 0.93 to 1.31, combinations 1.52; 0.78 to 2.94). There were two (0.8%) drop outs for side effects with hypericum and seven (3.0%) with standard antidepressant drugs. Side effects occurred in 50 (19.8%) patients on hypericum and 84 (35.9%) patients on standard antidepressants.

Conclusion: There is evidence that extracts of hypericum are more effective than placebo for the treatment of mild to moderately severe depressive disorders. Further studies comparing extracts with standard antidepressants in well defined groups of patients and comparing different extracts and doses are needed.

Key messages

  • There is evidence from randomised trials that hypericum extracts are more effective than placebo for the treatment of depressive disorders, but it is not known whether they are more effective for certain disorders than others

  • Current evidence is inadequate to establish whether hypericum is as effective as other antidepressants and if it has fewer side effects

  • Additional trials should be conducted to compare hypericum with other antidepressants in well defined groups of patients; to investigate long term side effects; and to evaluate the relative efficacy of different preparations and doses


Depression is a common disorder with an estimated lifetime prevalence of 17% in the United States.1 Many people with depression are treated in primary care settings.2 3 Given the complexity of differential diagnosis and treatment of depression, it is often difficult for primary care practitioners to decide whether antidepressant drugs are indicated. Some practitioners and patients are reluctant to use antidepressants because of associated side effects. Additional treatment modalities with little risk, credible benefit, and moderate costs could be a useful addition to depression management in primary care settings.

Extracts of the plant Hypericum perforatum (popularly called St John's wort), a member of the Hypericaceae family, have long been used in folk medicine for a range of indications including depressive disorders (fig 1). Extracts of St John's wort are licensed in Germany for the treatment of anxiety and depressive and sleep disorders. In 1993 more than 2.7 million prescriptions were counted for the seven most popular preparations in Germany.4 Hypericum extracts contain at least 10 constituents or groups of components that may contribute to their pharmacological effects. These include naphthodianthrones (for example, hypericins, on whose content most of the available preparations are standardised), flavonoids (for example, quercetin), xanthones, and bioflavonoids.5 The mechanism of action of the postulated antidepressant effects is unclear.6

Fig 1

Blossom of St John's wort (Hypericum perforatum)

In the past 10 years several randomised clinical trials have compared the effects of pharmaceutical preparations of St John's wort with placebo and common antidepressants. Recently, a systematic review of these trials was published in a phytomedical journal7; this review, however, focused on the assessment of methodological quality and did not include an analysis of effect sizes. Our objective was to provide a comprehensive overview, including a quantitative meta-analysis, of the existing evidence for the antidepressant activity of extracts of St John's wort. Specifically we investigated whether extracts of hypericum are more effective than placebo in the treatment of depression, are as effective as standard antidepressive treatment, and have fewer side effects than standard antidepressant drugs.



Published and unpublished eligible trials were searched for by full text searches in Medline SilverPlatter CD-ROM 1983-94 (screening titles and available abstracts of all hits from searches for the terms “St John's wort,” Johanniskraut, “hyperic*”); full text searches in Psychlit and Psychindex 1987-94 CD-ROM; additional online searches in Medline (1966 onwards) and Embase (1974 onwards); searches in the private database Phytodok, Munich; checking bibliographies of obtained articles; and contacting pharmaceutical companies and authors. There were no language restrictions.


The following parameters had to be met for study inclusion: firstly, design—randomised or quasi-randomised (for example, alternation) controlled trials; secondly, types of participants—people with depressive disorders; thirdly, types of interventions—comparison of preparations of St John's wort (alone or in combination with other plant extracts) with placebo or other antidepressants; and, finally, outcome measures—all clinical outcome measures such as depression scales or symptoms. Trials which measured physiological parameters only were excluded. At least two reviewers assessed the eligibility of each trial, and there were no disagreements. A complete list of all of the included trials is available from us.


The methodological quality of each trial was assessed by at least two reviewers using a scale developed by Jadad et al8 (with items on random allocation, blinding, and description of dropouts and withdrawals) and a scale developed by ourselves (additional items on concealment of randomisation and comparability at baseline).


Primary study characteristics and results were extracted by at least two independent reviewers. Questionnaires were then sent to authors or sponsors, or both, of all studies for checks of the correctness of extracted data and to obtain missing information (response rate 13/23 from both authors and drug manufacturers).

The trials used various methods to measure treatment effects. The most consistently used instruments were the Hamilton depression scale9 (used in 17 trials) and the clinical global impressions index9 (used in 12 trials).

The Hamilton depression scale is an observer rated scale focusing mainly on somatic symptoms. The original version includes 17 items, but a more recent one with 21 items is also in use. Most studies using this scale also report the number of “treatment responders” (patients with a score less than 10 or less than 50% of the baseline score, or both). If available, we extracted means (SD) before, during, and after treatment as well as the number of “responders.”
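The responder definition above can be written out as a small rule. The following is only a sketch: the trials differed in which criterion they applied, so the `rule` parameter and the function name are hypothetical conveniences, not something taken from the trial reports.

```python
def is_responder(baseline: float, final: float, rule: str = "either") -> bool:
    """Classify a patient as a treatment responder on the Hamilton scale.

    rule (hypothetical switch, chosen for illustration):
      "absolute" -> final score below 10
      "relative" -> final score below 50% of the baseline score
      "either"   -> at least one of the two criteria holds
      "both"     -> both criteria hold
    """
    absolute = final < 10
    relative = final < 0.5 * baseline
    if rule == "absolute":
        return absolute
    if rule == "relative":
        return relative
    if rule == "both":
        return absolute and relative
    if rule == "either":
        return absolute or relative
    raise ValueError(f"unknown rule: {rule}")
```

For example, a patient falling from 16 to 9 points meets the absolute criterion but not the relative one, so the classification depends on which definition a trial adopted.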

The clinical global impressions index is an observer rated instrument with three items (severity of illness, global improvement, and an efficacy index). We extracted the number of patients rated as “much improved” or “very much improved” for global improvement.

Additionally, we extracted all reported means (SD) for other rating scales and numbers for “treatment responders” from other global assessments.


For analyses comparing hypericum with placebo and standard antidepressants, numbers of treatment responders and treatment failures according to the Hamilton depression scale (first preference), the clinical global impressions index subscale for global improvement (second preference), or another global responder criterion were entered in a 2 × 2 table. Odds ratios and rate ratios (relative risks) were calculated on an intention to treat basis (with ratios greater than 1 representing a superiority of hypericum v control). A Mantel-Haenszel method was used for odds ratio measures and a variance weighted procedure for rate ratios. Calculation of summary estimates was preceded by homogeneity testing at an α level of 0.10. Summary measures of both fixed effects and random effects were estimated. For the present report the rate ratio represents the effect estimate.
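As an illustration of the kind of calculation described, the sketch below pools log rate ratios with inverse variance weights, computes Cochran's Q for the homogeneity test, and derives a random effects estimate. The text does not name the random effects estimator; DerSimonian-Laird is assumed here, and the function names and the example counts are hypothetical. The sketch also assumes every trial has at least one responder in each arm.

```python
import math

def log_rate_ratio(resp_t, n_t, resp_c, n_c):
    """Return (log rate ratio, variance) for one trial's 2 x 2 table."""
    rr = (resp_t / n_t) / (resp_c / n_c)
    var = 1 / resp_t - 1 / n_t + 1 / resp_c - 1 / n_c
    return math.log(rr), var

def pool(trials, z=1.96):
    """Pool trials given as (responders_treated, n_treated,
    responders_control, n_control) tuples."""
    effects = [log_rate_ratio(*t) for t in trials]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]

    # Fixed effects: inverse variance weighted mean of the log rate ratios.
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se_fixed = math.sqrt(1 / sum(w))

    # Cochran's Q; compare with a chi-square distribution on k - 1
    # degrees of freedom (alpha = 0.10 in the review).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(trials) - 1

    # Random effects (DerSimonian-Laird, an assumption of this sketch).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]
    rand = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_rand = math.sqrt(1 / sum(w_re))

    def ci(est, se):
        return math.exp(est - z * se), math.exp(est + z * se)

    return {
        "rr_fixed": math.exp(fixed), "ci_fixed": ci(fixed, se_fixed),
        "rr_random": math.exp(rand), "ci_random": ci(rand, se_rand),
        "q": q, "df": df, "tau2": tau2,
    }

# Hypothetical counts, for illustration only (not data from the review).
example = pool([(20, 40, 10, 40), (30, 50, 12, 48), (15, 35, 9, 36)])
```

When the trials are homogeneous, the between-trial variance estimate is zero and the fixed and random effects estimates coincide.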

Cohen's d (standardised mean difference) was calculated with a pooled SD for all studies contributing to a continuous data hypothesis. The summary effect sizes from these analyses were converted to rate ratios by using techniques for converting effect sizes to a binomial effect size display.10 Additionally, variance weighted mean differences were calculated for scores on the Hamilton depression scale.
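A minimal sketch of this conversion follows, assuming the usual form of the effect size display: d is converted to a correlation r with the equal group size formula, and the displayed "success" rates are 0.5 + r/2 and 0.5 - r/2. The exact variant applied in the review is not stated in the text, and the function names are illustrative. The caller is assumed to orient the means so that a positive d favours hypericum (on the Hamilton scale, lower scores are better).

```python
import math

def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference using the pooled SD of both groups."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    return (mean_t - mean_c) / pooled_sd

def besd_rate_ratio(d):
    """Convert d to a rate ratio via the effect size display.

    Assumes equal group sizes for the d-to-r conversion.
    """
    r = d / math.sqrt(d ** 2 + 4)
    return (0.5 + r / 2) / (0.5 - r / 2)
```

Under these assumptions d = 0 maps to a rate ratio of 1, and larger positive values of d map to progressively larger ratios.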


We originally identified 37 randomised or possibly randomised trials that evaluated preparations containing extracts of hypericum (V Wienert et al, third phytotherapy congress, Lubeck-Travemunde, 1991; M Bernhardt et al, fifth phytotherapy congress, Bonn, 1993).11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 Most of the trials were identified through reviews (especially through that of Harrer and Schulz46), bibliographies of papers, and the complementary database Phytodok, while our original online and CD-ROM searches in Medline, Embase, Psychlit, and Psychindex revealed fewer than one third of the trials. Fourteen trials were excluded from our analysis as they included only healthy volunteers and investigated physiological parameters34 35 36 37 38 (V Wienert et al, third phytotherapy congress, Lubeck-Travemunde, 1991); studied disorders other than depression39 40 41 42; investigated only pharmacodynamics43; or did not include a placebo or antidepressant control group44 45 (M Bernhardt et al, fifth phytotherapy congress, Bonn, 1993).

Table 1 gives an overview of the 23 randomised clinical trials that compared extracts of hypericum with placebo or another treatment in depressive patients. Fifteen trials with 1008 patients were placebo controlled (14 on single preparations,11 12 13 14 15 16 17 18 19 20 21 22 23 24 one on a combination with four other plant extracts25) and eight trials (six on single preparations26 27 28 29 30 31 and two on a combination of hypericum and valeriana32 33) with 749 patients compared hypericum with other antidepressant or sedative drugs. With the exception of two trials,30 31 all had treatment and observation periods of four to eight weeks.

Table 1

Description of randomised double blind controlled trials of hypericum for depression: patients and interventions (grouped for preparations)


A heterogeneous group of patients were included in the trials, and classification of depressive disorders was inconsistent. Most reports stated that patients suffered from mild to moderately severe depression, but this statement did not always correlate with the severity of symptoms according to the Hamilton depression scale or other scales (see, for example, Schmidt et al22 in table 1). The trials were performed in private practices of psychiatrists (explicitly stated in nine trials), internists (seven trials), general practitioners (six trials), and obstetricians (one trial). The number of trial centres varied between one and 50 (table 1).

Seven different single preparations and two combinations were tested. Daily doses of hypericin, which is the reference substance for pharmaceutical standardisation, and the dose of total extract varied considerably (between 0.4 and 2.7 mg and 300 and 1000 mg, respectively).

Most trials had reasonable to good methodology. Ten trials scored 80% or more of possible points in both of the assessment systems used.11 12 15 16 19 24 25 27 29 32 For seven trials there was an explicit statement in the published reports that concealment of randomisation was done by consecutively numbered preparations; for a further 10 trials this information was obtained from the authors or the sponsors. Twenty trials were described as double blind (success of blinding was not tested in any trial), one was single blind, and two were open. Six of the double blind trials used fluid preparations. As hypericum extracts have a characteristic taste a certain degree of unblinding seems possible.


Thirteen trials that compared a single hypericum preparation with placebo provided data on “treatment responders.” There were 94 (22.3%) responders in the placebo groups versus 225 (55.1%) in the hypericum groups (pooled rate ratio 2.67; 95% confidence interval 1.78 to 4.01; fig 2). The results were similar if studies presenting data for responders to the Hamilton depression scale (rate ratio 2.71; 1.97 to 3.74; n = 11) or those to the clinical global impressions index (2.54; 1.61 to 4.00; n = 5) were analysed separately. In the only trial that investigated a combination the rate ratio was 2.00 (1.14 to 3.52) for responders to the clinical global impressions index.
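The pooled rate ratio is not the same quantity as the crude ratio of the two overall responder percentages. As a quick check using only the percentages quoted above:

```python
# Overall responder percentages quoted in the text.
placebo_rate = 22.3 / 100
hypericum_rate = 55.1 / 100

# Crude ratio obtained by collapsing all trials into one comparison.
crude_ratio = hypericum_rate / placebo_rate
print(round(crude_ratio, 2))  # 2.47
```

The crude figure (about 2.47) differs from the pooled estimate of 2.67 because the variance weighted procedure combines the trial-specific ratios rather than the collapsed counts.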

Fig 2

Treatment responders and rate ratios in randomised controlled trials of hypericum extracts

Analyses of the scores on the Hamilton depression scale at different weeks after treatment indicated a significant effect of hypericum over placebo (table 2). Mean scores after treatment of the patients receiving hypericum in the nine trials providing data for analysis were 4.4 points (95% confidence interval 3.5 to 5.3) better than those of patients receiving placebo.

Table 2

Overview of effect size estimates for outcomes measured in randomised clinical trials of hypericum



Three trials of single preparations and two trials of combinations comparing hypericum and standard antidepressants provided numbers of “treatment responders.” There were 101 (63.9%) responders with single hypericum preparations and 93 (58.5%) with standard antidepressant treatment (1.10; 0.93 to 1.31), and 88 (67.7%) with combinations versus 66 (50%) with standard antidepressants (1.52; 0.78 to 2.94; fig 2). The scores on the Hamilton depression scale after treatment were slightly better in patients on single hypericum preparations than in those on standard antidepressants (weighted mean difference 1.01; −0.4 to 2.4). One trial21 was analysed separately as its study model was completely different from all other trials. It compared depressive symptoms over two weeks in 30 patients who had been informed that they had to undergo an amputation and who were treated with 50 mg imipramine or hypericum (mean Hamilton depression scale after treatment 5.0 with hypericum and 4.7 with imipramine).


In the six trials of single hypericum preparations two (0.8%) patients in the test groups and seven (3.0%) patients in the groups receiving standard medications dropped out from the study because of side effects (odds ratio 0.56; 0.15 to 2.08). Total rates of drop out were 4.0% for hypericum and 7.7% for standard antidepressants (0.61; 0.27 to 1.38). The numbers of patients reporting side effects were 50 (19.8%) and 84 (35.9%) (0.39; 0.23 to 0.68).

In the two trials of the combination of hypericum and valeriana only one patient on amitriptyline dropped out because of side effects, and total drop out rates were similar in test and control groups (15.4% v 14.3%; 1.18; 0.46 to 3.05). Nineteen (14.6%) patients reported side effects with the tested combination and 35 (26.5%) with amitriptyline or desipramine (0.49; 0.23 to 1.04).

In the placebo controlled trials 0.4% of hypericum patients and 1.6% of placebo patients dropped out because of side effects (total drop outs 8.2% v 10.0%; percentages of patients reporting side effects 4.1% v 4.8%).


This overview summarises the evidence from randomised clinical trials for a treatment for mild to moderately severe depression which is highly popular in German speaking countries but virtually unknown in the English speaking world. If we had restricted our literature search to English language publications, as is often done in meta-analyses,47 we would not have identified a single trial in our initial search in spring 1994.

During our search we tried to identify unpublished trials. Personal communication with researchers revealed two further trials for which data were not available to us and which are unlikely to be published. One trial was reported as having had negative, the other positive results. A matter of concern is the amount of double publication in our trial set. Several trials were published more than once without reference to previous publication; one trial was even published five times with two different first authors.24

The studies compared in our review were quite heterogeneous regarding patients and interventions. The classification of depression was not uniform and in some studies quite vague. Patients were not only recruited by psychiatrists but also by general practitioners, internists, or gynaecologists. A great variety of patients have probably been studied in the trials summarised.

Pooling studies of different preparations of a herb is problematic, even when, as in the case of hypericum, extracts are “standardised” on a characteristic component. As hypericins are probably not the only relevant component in hypericum preparations it could be that different preparations vary in their content of substances contributing to the antidepressive effects. Furthermore, daily doses of extract and amount of total hypericin varied considerably among trials. Given the large number of possible sources of variation on one side and the relatively small number of trials, we refrained from performing subset analyses.

For the reasons cited above and despite promising results of reported trials, interpretation of the evidence is still difficult. We believe there is good evidence that hypericum is better than placebo in treating some depressive disorders. We do not yet know if hypericum is better in treating certain depressive disorders than others, nor do we know whether different preparations of hypericum are equally effective or what the optimum dosages are. Hypericum preparations may work as well as other antidepressants, but the evidence is still insufficient because of the limited number of patients included in trials. Hypericum seems to have fewer short term side effects than some other antidepressants. Phototoxicity in animals has been reported after ingestion of extremely high doses of hypericum (about 30 to 50 times higher than therapeutic doses).48 Drug monitoring studies suggest that side effects are rare and mild,49 50 51 52 although observation periods did not exceed eight weeks. Information on long term side effects is lacking.

Future clinical trials on hypericum should compare its effects with those of other antidepressants and not with placebo. Different preparations of hypericum have to be compared and dose response investigations should be carried out. Longer trials with formal standard mechanisms for the assessment of side effects are needed to evaluate relative efficacy and safety compared with other antidepressants. Types of depression among study participants should be delineated better to determine whether hypericum works for milder forms of depression or for major and severe forms as well. Because available clinical trials suggest that hypericum might become an important tool for the management of depressive disorders, especially in primary care settings, such further research is highly desirable.


  • Funding: No specific funding.

  • Conflict of interest: None.

