- Kathleen M Griffiths, visiting fellow
- Helen Christensen, senior fellow
- Correspondence to: K M Griffiths
- Accepted 17 July 2000
Objectives: To evaluate quality of web based information on treatment of depression, to identify potential indicators of content quality, and to establish if accountability criteria are indicators of quality.
Design: Cross sectional survey.
Data sources: 21 frequently accessed websites about depression.
Main outcome measures: (i) Site characteristics; (ii) quality of content—concordance with evidence based depression guidelines (guideline score), appropriateness of other relevant site information (issues score), and subjective rating of site quality (global score); and (iii) accountability—conformity with core accountability standards (Silberg score) and quality of evidence cited in support of conclusions (level of evidence score).
Results: Although the sites contained useful information, their overall quality was poor: the mean guideline, issues, and global scores were only 4.7 (range 0-13) out of 43, 9.8 (6-14) out of 17, and 3 (0.5-7.5) out of 10 respectively. Sites typically did not cite scientific evidence in support of their conclusions. The guideline score correlated with the two other quality of content measures, but none of the content measures correlated with the Silberg accountability score. Content quality was superior for sites owned by organisations and sites with an editorial board.
Conclusions: There is a need for better evidence based information about depression on the web, and a need to reconsider the role of accountability criteria as indicators of site quality and to develop simple valid indicators of quality. Ownership by an organisation and the involvement of a professional editorial board may be useful indicators. The study methodology may be useful for exploring these issues in other health related subjects.
The web represents an unprecedented opportunity to provide high quality, accessible healthcare information to consumers and health providers. In the absence of editorial controls, however, the information may be of low quality and potentially harmful.1
In an influential paper Silberg et al proposed that accountability standards (disclosure of authorship, ownership, and currency of information) may be useful indicators of the quality of web based health information.2 These accountability criteria have been widely assumed to reflect website quality,3-5 but their validity as indicators of quality of content has not been investigated. Moreover, there have been few systematic studies of the actual quality of the content of health information on the web,6-9 and these studies have typically used textbook summaries5 or author opinion8 as the gold standard for assessing content quality rather than meta-analyses of the available evidence.10 Finally, no published studies have systematically evaluated the quality of mental health websites even though mental disorders are a common cause of disability and the World Health Organisation has predicted that depression will be the second largest cause of disability within 20 years.11 Since only a minority of people with depressive disorders receive treatment,12 websites are potentially useful for encouraging depressed people to seek help.
In this study we aimed to survey websites that a “typical” user might access when searching for information on depression. We evaluated the quality of the information on the treatment of depression (including comparison with evidence based guidelines and meta-analyses) and the relation between content quality and accountability indicators and other site characteristics.
Selection of sites
To identify potential sites for our survey, we used two search engines, Direct Hit (www.directhit.com/) and MetaCrawler (www.go2net.com/search.html), to conduct searches in March 1999 using the key word “depression.” Direct Hit returns 10 “popular” sites based on analyses of previous user activity for a query (primarily frequency of “clickthroughs” from a result list). MetaCrawler integrates the results for a query from several well known search engines including Alta Vista, Excite, Infoseek, Lycos, WebCrawler, Yahoo, LookSmart, Thunderstone, and Mining Co. The usefulness of Direct Hit and MetaCrawler in identifying popular sites has not been the subject of formal independent evaluation. However, in the absence of any other suitable search engine tools, the list of sites yielded by our search methodology provided the best available approximation to a list of depression sites that would be most commonly encountered by a “typical” user.
We excluded sites not relevant to depression or no longer active and one site concerned solely with seasonal affective disorder. All other sites identified by Direct Hit (n=9) and the highest ranked sites from the MetaCrawler search (n=11) were included in our analysis. We rated separately a “stand alone” book imported from a third party source by one site. We identified and printed out site material by systematically following all internal links. We excluded external links, news sections (typically internally contradictory), sections relating to bipolar disorder and schizophrenia, and book reviews.
We each independently evaluated the sites' characteristics, content, and accountability. We resolved any disagreements soon after rating by discussion and reference to site material.
Characteristics of the site
We rated each site as to its purpose, scope, ownership, country of origin and for involvement of a drug company, professional editorial board, or health professional. We also rated sites according to whether they promoted products or services and whether they contained a disclaimer or qualifier regarding information provided.
Quality of content
Guideline score—We assessed concordance between site information and best practice by using a 43 item rating scale developed from the evidence based guidelines on clinical practice for treating depression published by the Agency for Health Care Policy and Research (AHCPR).13 These guidelines are one of a set of US federal guidelines developed according to the general principles outlined in the US Institute of Medicine's guidelines for developing evidence based guidelines on clinical practice.14 The guidelines were developed by a multidisciplinary panel from systematic reviews of the scientific evidence (meta-analyses of randomised controlled trials using modified “intention to treat” analyses) and underwent extensive review by all panel members, a methodologist, 28 scientific reviewers, and 73 organisations. Each item in our rating scale corresponded to one of the statements in the guidelines. The scale covered the use of drugs, psychotherapy, combined drugs and psychotherapy, and electroconvulsive therapy. Topics included effectiveness, indications, selection within a treatment type, failure to respond, and frequency of visits. We calculated a guideline score for each site by counting the number of items on the scale for which site information agreed with the guidelines. We also calculated a core guideline score (out of 5) from a subset of key items relating to indications for and effectiveness of the four major treatment types (see box 1).
Box 1 : Core guidelines adapted from AHCPR clinical practice guidelines13
Antidepressant drugs are an effective treatment for major depressive disorder
Antidepressants are the first line treatment for moderate to severe depression or psychotic, melancholic, or atypical symptoms (overeating, oversleeping, weight gain), at patient's request, if psychotherapy unavailable, or if previous response to drug. (Site must have identified at least one of the above qualifiers to be rated as in agreement with guideline)
Psychotherapy can be an effective first line treatment for mild to moderate depression
Initial treatment with a combination of drugs and psychotherapy is reasonable in only some circumstances (such as chronic prior course of illness or poor recovery between depressive episodes, psychotherapy alone or drugs alone only partly effective, history of psychosocial problems both during and outside depressive episodes, history of poor adherence to treatment)
Electroconvulsive therapy may be effective in certain cases of severe depression
Issues score—We evaluated other treatment issues with a 17 item scale designed to assess the appropriateness of site information about important treatment and management issues not adequately or not directly evaluated by the guideline scale (such as the importance of seeking help, discussion of side effects, depression in young people, and the relation between depression and suicide risk).
Global score—We each provided a subjective judgment of the overall quality of a site (score out of 10) and then calculated an average score for each site. There was a moderately high correlation between our scores (r=0.69, P=0.001), and the mean scores for each of us did not differ significantly (mean difference 0.38 (SD 1.5), t(20)=1.16), suggesting acceptable inter-judge agreement despite the unstructured and subjective nature of the task.
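The reported t statistic for the difference between raters can be checked from the summary figures alone. The sketch below assumes that the SD of 1.5 is the standard deviation of the 21 paired differences (one pair per site, consistent with 20 degrees of freedom); the inputs are the rounded values given in the text, so the result is approximate.

```python
import math


def paired_t_from_summary(mean_diff, sd_diff, n_pairs):
    """Paired t statistic and degrees of freedom from summary statistics."""
    standard_error = sd_diff / math.sqrt(n_pairs)
    return mean_diff / standard_error, n_pairs - 1


# Rounded values reported in the text; n_pairs = 21 sites is an assumption.
t_stat, df = paired_t_from_summary(mean_diff=0.38, sd_diff=1.5, n_pairs=21)
print(f"t({df}) = {t_stat:.2f}")
```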
Interventions recommended—We rated the sites according to whether each of a range of interventions was mentioned; was said or implied to be effective or useful or was recommended as a first line, second line, or adjunct treatment for all or some groups; and was said to be ineffective or was not recommended. Interventions denoted effective but explicitly not recommended were coded as “not recommended.”
Sources of help recommended—We rated potential sources of help for depression as recommended, not recommended, or not mentioned.
Silberg score—Sites were rated on a 9 point scale according to Silberg et al's criteria of authorship (whether authors and their affiliations and credentials were clearly identified), attribution (whether sources and references were mentioned), disclosure (whether ownership of the site and sponsorship was disclosed), and currency (whether the site had been modified in the past month and year and whether the date the site was created or modified was specified).2
Level of evidence score—We recorded the stated level of evidence associated with each intervention using a 5 point scale adapted from a previously published scale of hierarchy of evidence (box 2).15 We based the evidence scores only on information explicitly provided by the site and not on our knowledge of the cited study or relevant literature.
Box 2 : Quality of evidence rating system (adapted from National Health and Medical Research Council15)
Level 1—Evidence obtained from a review of all relevant randomised controlled trials
Level 2—Evidence obtained from at least one randomised controlled trial
Level 3—Evidence obtained from controlled trials without randomisation
Level 4—Evidence obtained from multiple time series with or without intervention
Level 5—Other evidence (such as opinions or policies of respected authorities based on clinical experience, descriptive studies, or reports of expert committees; summary by writers using a variety of written material; expert testimony; reference to the philosophy of a particular practitioner; reference to personal experience)
We assessed site quality and accountability as a function of site characteristics using Mann-Whitney tests, Kruskal-Wallis analyses followed by Mann-Whitney tests, or Fisher's exact probability tests. We calculated non-parametric confidence intervals for the main findings using the procedure outlined by Campbell and Gardner.16 We calculated intercorrelations between variables using Pearson's correlation tests and Phi tests.
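To show what a Mann-Whitney comparison of two groups of sites involves, the sketch below computes the U statistic by direct pairwise comparison. The scores are invented for illustration and are not data from the study. Counting ties as 0.5 and reporting the smaller of the two U values are common conventions assumed here, not details taken from the analysis.

```python
def mann_whitney_u(group_a, group_b):
    """Return the smaller Mann-Whitney U statistic for two samples.

    Each pair (a, b) contributes 1 if a > b and 0.5 if a == b (assumed tie rule).
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Hypothetical guideline scores (not the study's data).
organisation_sites = [6, 8, 5, 13, 7]
individual_sites = [2, 4, 0, 5, 3]

u_stat = mann_whitney_u(organisation_sites, individual_sites)
print(f"U = {u_stat}")
```

A small U indicates that the two groups overlap little in rank; the P value would then be obtained from tables or statistical software.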
Of the 21 sites included in our analysis, 19 were US based, one was European, and the remaining site was of unknown origin. The principal purpose of the sites was to provide information or educational material (10 sites), links (4), a consumer forum (1), or information in combination with either links or consumer forum or both (6). Table 1 lists other characteristics of the sites.
Quality of content
The mean guideline, issues, and global scores were 4.7 out of 43, 9.8 out of 17, and 3 out of 10 respectively (table 1), indicating little concordance with guideline recommendations, inadequate consideration of management or treatment issues, and generally low overall ratings.
In part, the low guideline score reflected poor coverage: on average, the sites lacked material relevant to over two thirds of the guideline items. However, the information that the sites did provide was often inaccurate: in the case of the five core guidelines (box 1) most of the sites (average 58%) contradicted or provided material inconsistent with the guidelines.
Sites usually recognised that antidepressants and psychotherapy are effective but were often inaccurate in the specified indications for these treatments. For example, many sites emphasised one form of these treatments over the other regardless of the severity of the depression and other important factors; almost half of the sites recommended combined use of antidepressants and psychotherapy as a first line treatment when this is not recommended by the AHCPR guidelines. Sites were often internally inconsistent, especially when material was derived from more than one author or source.
Between eight and 13 of the sites failed to discuss contraindications for drugs, failed to recognise individual differences in the effects of antidepressants, or did not identify the importance of switching drugs as required. Few sites acknowledged that chronic and subsequent episodes of depression may require a different approach, that the management and treatment of depression in young people may differ from that for adults, or that the availability of treatment may be a factor in selecting treatments. However, most of the sites discussed side effects and the long term nature of antidepressant treatments, and most of those that mentioned herbal or dietary supplements included some discussion of their side effects.
All sites indicated that depression can be treated, most indicated that depression should be treated, and only one failed to mention the risk of suicide in depression. Although most sites mentioned effective treatments such as selective serotonin reuptake inhibitors, tricyclic antidepressants, psychotherapy, and cognitive therapy, fewer than half mentioned several important evidence based conventional treatments (such as newer antidepressants, interpersonal therapy, behaviour therapy, and cognitive behaviour therapy) and only six recommended St John's wort despite level 1 evidence suggesting it is effective for mild depression.
All sites promoted consultation with a health professional for diagnosis or treatment, and most provided a list of contact organisations for further information or assistance. All sites recommended a doctor as a source of help (see table 2). Psychiatrists, psychopharmacologists, psychologists, and psychotherapists are professionals with expertise in delivering known effective treatments for depression, but six sites did not mention any of these professionals as potential sources of help. Sites were as likely to recommend websites, family members, the clergy, or friends as they were to recommend psychiatrists (table 2).
The mean Silberg score was 5.4 out of 9 (table 1). Most of the sites clearly specified the authors of the web content (13 sites) and their credentials (11 sites) and affiliations (11 sites). Nine of the sites mentioned at least some sources and references on the site (although such information was typically not comprehensive). All but one site disclosed an owner of the site, and three mentioned sponsors. Most sites indicated when the site had been created or modified: most had been modified in the past year, and nine had been modified in the past month.
The 21 sites mentioned a total of 53 different interventions, but sites typically did not provide supporting scientific information or refer in general terms to the level of evidence available to support their recommendations. Since most sites mentioned antidepressants and psychotherapy and these therapies are supported by level 1 evidence, we analysed the data for these interventions further (taking the highest level of evidence across generic and individual forms of each treatment type). Only five sites mentioned any scientific evidence in support of the use of antidepressants, and only one of these referred to level 1 evidence. Similarly, only three sites that recommended or did not recommend psychotherapy cited scientific evidence in support of their conclusions, and only one site cited level 1 evidence.
Association between quality of content, accountability, and site characteristics
The guideline score was significantly correlated with the other two measures of quality of content (with global score, r=0.53, P<0.05; with issues score, r=0.74, P<0.01). However, none of the measures of quality of content correlated significantly with the Silberg accountability score (r=-0.5 to 0.21). Of the sites offering recommendations about psychotherapy, those citing scientific evidence were more likely to achieve an above median guideline score (Phi=0.50, P=0.034) and showed a tendency to achieve above median issues scores (Phi=0.45, P=0.058). There was no comparable significant relationship for antidepressants.
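The significance levels quoted for the correlations between the guideline score and the other content measures can be checked by converting r to a t statistic with n-2 degrees of freedom. The sketch below assumes that all 21 sites contributed to each correlation, which is not stated explicitly, and uses standard two tailed critical values of t for 19 degrees of freedom.

```python
import math

# Two tailed critical values of t for 19 degrees of freedom (standard tables).
T_CRITICAL_DF19 = {0.05: 2.093, 0.01: 2.861}


def correlation_t(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


N_SITES = 21  # assumption: every site contributed to each correlation

for measure, r in [("global score", 0.53), ("issues score", 0.74)]:
    t_value = correlation_t(r, N_SITES)
    if abs(t_value) > T_CRITICAL_DF19[0.01]:
        level = "P<0.01"
    elif abs(t_value) > T_CRITICAL_DF19[0.05]:
        level = "P<0.05"
    else:
        level = "not significant at 0.05"
    print(f"guideline v {measure}: r={r}, t={t_value:.2f}, {level}")
```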
As table 1 shows, sites owned by organisations had significantly higher guideline and issues scores than those owned by individuals (difference in mean guideline scores 3.8 (95% confidence interval 1 to 6), U=21, P=0.016; difference in mean issues scores 2.1 (0 to 4), U=26, P=0.043), as did sites with an editorial board compared with others (difference in mean guideline scores 3.8 (0 to 7), U=16, P=0.05; difference in mean issues scores 2.1 (0 to 4), U=16, P=0.05). Only sites owned by organisations reported scientific evidence to support their endorsement of antidepressants and psychotherapy. Sites owned by organisations were significantly more likely than individually owned sites to cite scientific evidence in support of antidepressants (50% v 0%, Fisher's exact test P=0.03), as were sites involving drug companies compared with others (75% v 13%, P=0.03).
However, there was no significant association between the total Silberg score and site characteristics. In fact, analyses of individual Silberg items showed that sites owned by organisations and those involving drug companies were less likely than their counterparts to indicate the author's identity, affiliation, and credentials. Thus, for sites owned by organisations, author's identity was given by 36% (v 90% of others, Fisher's exact test P=0.024), affiliation by 27% (v 80%, P=0.03), and credentials by 27% (v 80%, P=0.03). For sites involving drug companies, author's identity was given by none (v 77%, P=0.012), affiliation by none (v 65%, P=0.04), and credentials by none (v 65%, P=0.04).
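The Fisher's exact probabilities for author identity can be approximately reproduced from the percentages above. The cell counts in the sketch below are assumptions inferred from those percentages and the total of 21 sites (36% taken as 4 of 11 organisation sites, 90% as 9 of 10 others; none of 4 drug company sites, 77% as 13 of 17 others), since the underlying counts are not given in the text. The two sided P value is taken as the sum of the probabilities of all tables no more likely than the observed one, which is one common definition and may differ from the one used by the original software.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(k):
        # Hypergeometric probability of k "yes" outcomes in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    observed = table_probability(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Small tolerance guards against floating point ties with the observed table.
    return sum(
        table_probability(k)
        for k in range(k_min, k_max + 1)
        if table_probability(k) <= observed * (1 + 1e-9)
    )


# Assumed counts: (identity given, identity not given) for each group.
p_organisation = fisher_exact_two_sided(4, 7, 9, 1)   # organisations v others
p_drug_company = fisher_exact_two_sided(0, 4, 13, 4)  # drug company sites v others

print(f"organisations v others: P={p_organisation:.3f}")
print(f"drug company sites v others: P={p_drug_company:.3f}")
```

Under these assumed counts the results round to the reported values of 0.024 and 0.012.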
There was no association between the characteristics of a site and the total number of sources of help endorsed or whether psychiatrists, psychopharmacologists, psychologists, or psychotherapists were specifically nominated as sources of help. However, sites involving health professionals were more likely than other sites to endorse the general category “mental health specialists/professionals” as a source of help: 71% of sites involving a psychiatrist or medical practitioner and 100% of sites involving a psychologist v 10% of other sites (P<0.05 for each case).
In our review of 21 popular websites containing information about treating depression we found that the quality of this information was poor. This finding reinforces concerns raised by other studies which have found inadequate quality6-9 or poor coverage17 of important health issues on the web. There is a need to improve the accuracy and coverage of information about depression on the web with regard to the relative effectiveness of different treatments, the main indications for particular treatments, important management issues such as duration of treatment, reviewing and changing treatments, and the relevance of professional expertise and patient preferences. Sites should also warn readers that tricyclic antidepressants are ineffective for adolescents and that drugs may not be the first line of treatment for this age group.18
Our findings raise questions about the usefulness of Silberg et al's accountability criteria as indicators of website quality2 and suggest that further investigation of indicators of quality is warranted. Particular site characteristics (such as ownership by an organisation or existence of a professional editorial board) may prove more useful indicators of content quality than disclosure of information per se. Our results also suggest that the number of different types of interventions mentioned may be a predictor of site quality, as may the citation of scientific evidence in support of recommended treatments.
The critical question is whether the attributes which were associated with the better quality sites about depression are valid indicators of the quality of other types of health related sites. Our methodology could be used to address this question and to identify those attributes that are common predictors of quality for different medical subjects. The methodology lends itself to replication in different subjects since any systematically produced set of guidelines can serve as a rating scale with which to evaluate websites.
It is possible that the inadequacies we documented are not restricted to websites but reflect the beliefs and level of knowledge of many health professionals. McClung et al have reported that even medical teaching centres disseminated inadequate reviews on the web.7 It is unlikely that the AHCPR guidelines are outdated or inadequate since a review of more recent evidence concluded that the major AHCPR conclusions are still applicable and that, when rigorously implemented, the guidelines result in improved outcomes compared with usual care.19 The guidelines have been criticised for their failure to recommend psychotherapy as a first line treatment for severe depression,20 but, although there is some evidence to support this criticism,21 few studies have directly compared the efficacy of different treatments for severe depression and the findings have been inconsistent. By contrast, a large number of randomised controlled trials have demonstrated the efficacy of antidepressants in treating severe depression.
Despite their generally low scores for content quality, many sites did contain important and potentially useful information. It is even possible that a formal evaluation might show that such sites improve the mental health outcomes of those who visit them. Silberg et al have referred to the importance of distinguishing the flowers from the weeds on the internet superhighway. However, a single site, whether owned by a consumer or a health professional, may grow both flowers and weeds. The real challenge is to devise strategies that selectively eliminate the weeds but leave the flowers to bloom.
What is already known on this topic
Depression is a major source of disability in the community
Websites offer an opportunity to disseminate information to the public about effective treatments
However, little is known about the quality of existing sites about depression or about indicators of a good health website
What this study adds
An audit of 21 popular websites revealed that the general quality of information on the treatment of depression is poor
Currently popular criteria for evaluating the quality of websites were not indicators of content quality, but sites with an editorial board and sites owned by organisations were of higher quality than others
Contributors: KG conceived and designed the study, rated websites, analysed and interpreted the data, and wrote the paper. HC conceived and designed the study, rated websites, analysed data, and edited the paper. Jo Medway collected and organised data and commented on the manuscript. Ailsa Korten calculated confidence intervals for non-parametric data and commented on statistical procedures. Andrew MacKinnon contributed to discussions during the design phase. KG and HC are guarantors for the paper.
Funding: This work was supported by grant 973302 from the National Health and Medical Research Council.
Competing interests: None declared.
This article is part of the BMJ's randomised controlled trial of open peer review. Documentation relating to the editorial decision making process is available on the BMJ's website