Examination of instruments used to rate quality of health information on the internet: chronicle of a voyage with an unclear destination

BMJ 2002; 324 doi: http://dx.doi.org/10.1136/bmj.324.7337.569 (Published 9 March 2002)
Cite this as: BMJ 2002;324:569
  1. Anna Gagliardi, senior research associate (a)
  2. Alejandro R Jadad (ajadad{at}uhnres.utoronto.ca), professor (b)
  (a) Graduate Department of Health Policy, Management and Evaluation, Faculty of Medicine, University of Toronto, Toronto, ON, Canada
  (b) Departments of Health Policy, Management and Evaluation, and Anaesthesia, University Health Network, University of Toronto, Toronto, ON, Canada
  Correspondence to: A R Jadad, Director, Centre for Global eHealth Innovation, University Health Network, Toronto General Hospital, Fraser Elliott Building, 4th Floor, 190 Elizabeth Street, Toronto, ON M5G 2C4, Canada
  See Education and debate p 606

    Abstract

    Objective: This study updates work published in 1998, which found that of 47 rating instruments appearing on websites offering health information, 14 described how they were developed, five provided instructions for use, and none reported the interobserver reliability and construct validity of the measurements.

    Design: All rating instrument sites noted in the original study were visited to ascertain whether they were still operating. New rating instruments were identified by duplicating and enhancing the comprehensive search of the internet and the medical and information science literature used in the previous study. Eligible instruments were evaluated as in the original study.

    Results: 98 instruments used to assess the quality of websites in the past five years were identified. Many of the rating instruments identified in the original study were no longer available. Of 51 newly identified rating instruments, only five provided some information by which they could be evaluated. As with the six sites identified in the original study that remained available, none of these five instruments seemed to have been validated.

    Conclusions: Many incompletely developed rating instruments continue to appear on websites providing health information, even when the organisations that gave rise to those instruments no longer exist. Many researchers, organisations, and website developers are exploring alternative ways of helping people to find and use high quality information available on the internet. Whether they are needed or sustainable and whether they make a difference remain to be shown.

    What is already known on this topic

    The rapid growth of healthcare websites in the 1990s was accompanied by initiatives to rate their quality, including award-like symbols on websites

    A systematic review of the reliability and validity of such rating instruments, published in 1998, showed that they were incompletely developed

    What this study adds

    Few of the rating instruments identified in 1998 remain functional; 51 new instruments were identified

    Of the 51 newly identified instruments, 11 were not functional, 35 were available but provided no information, and five provided information but were not validated

    Many researchers, organisations, and website developers are exploring alternative ways of helping people to find high quality information on the internet

    Introduction

    The quality of health information on the internet became a subject of interest to healthcare professionals, information specialists, and consumers of health care in the mid-1990s. Along with the rapid growth of healthcare websites came a number of initiatives, both academic and commercial, that generated criteria by which to ensure, judge, or denote the quality of websites offering health information. Some of these rating instruments took the form of logos resembling “awards” or “seals of approval” and appeared prominently on the websites on which they were bestowed.

    In 1997 we undertook a review of “award-like” internet rating instruments in an effort to assess their reliability and validity.1 We hypothesised that if the rating instruments were flawed they might influence healthcare providers or consumers relying on them as indicators of accurate information. Instruments were eligible for review if they had been used at least once to categorise a website offering health information and revealed the rating criteria by which they did so. The rating instruments were evaluated according to, firstly, a system for judging the rigour of the development of tools to assess the quality of randomised controlled trials2 and, secondly, whether their criteria included three indicators suggested as appropriate for judging the quality of website content.3 4 These indicators were authorship (information about authors and their contributions, affiliations, and relevant credentials), attribution (listing of references or sources of content), and disclosure (a description of website ownership, sponsorship, underwriting, commercial funding arrangements, or potential conflicts of interest). These criteria were selected for use in the original study because they could be rated objectively.

    Our original study found that of 47 rating instruments identified, 14 described how they were developed, five provided instructions for use, and none reported the interobserver reliability and construct validity of the measurements. The review showed that many incompletely developed instruments were being used to evaluate or draw attention to health information on the internet.

    The purpose of this study is to update the previous review of award-like rating instruments for the evaluation of websites providing health information and to describe any changes that may have taken place in the development of websites offering health information to practitioners and consumers with respect to the quality of their content.

    Methods

    We visited the websites describing each of the rating instruments noted in the original study to ascertain whether they were still operating. If internet service was disrupted for technical reasons or if sites were not available on first visit, we attempted a connection on one further occasion.

    The search strategies, inclusion and exclusion criteria, and techniques for data extraction were similar to those used in the original review.1 We used the following sources to identify new rating instruments:

    • A search to 7 September 2001 of Medline, CINAHL, and HealthSTAR (from December 1997) using [(top or rat: or rank: or best) and (internet or web) and (quality or reliab: or valid:)]

    • A search of the databases Information Science Abstracts, Library and Information Science Abstracts (1995 to September 2001), and Library Literature (1996 to September 2001) using [(rat: or rank: or top or best) and (internet or web or site) and (health:)]

    • A search to September 2001 using the search engines Lycos (lycos.com), Excite (excite.com), Yahoo (yahoo.com), HotBot (hotbot.com), Infoseek (go.com), Looksmart (looksmart.com), and Google (google.com) with [(rate or rank or top or best) and (health)]. Open Text (opentext.com) and Magellan (magellan.com), which were used in the first study, no longer function as internet search engines

    • A review of messages about rating instruments and the quality of health related websites posted to the Medical Library Association listserv medlib-l (listserv.acsu.buffalo.edu/archives/medlib-l.html) and the Canadian Health Libraries Association listserv canmedlib-l (lists.mun.ca/archives/canmedlib.html)

    • A search of the American Medical Informatics Association's 1998, 1999, 2000, and 2001 annual symposium programmes (http://www.amia.org/) for mention of health information on the internet

    • A search of the Journal of Medical Internet Research (September 1999 to September 2001) for mention of evaluations of the quality of health information on the internet (http://www.jmir.org/)

    • A search of the online archive of the magazine Internet World (http://www.internetworld.com/) (January 2000 to September 2001) for mention of health information on the internet.
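    The boolean strategies listed above can be read as AND-combinations of OR-groups. The following is a minimal sketch of that logic, assuming that a trailing colon (as in "rat:") denotes unlimited truncation, as it does in some bibliographic search interfaces. The function names and the sample titles are hypothetical and are not part of the original search.

```python
import re

def term_matches(term, words):
    """Return True if any word satisfies the term.

    Assumption: a trailing ':' means truncation, so 'rat:' matches
    'rate', 'rating', 'rated' (and also unrelated words such as 'ratio').
    """
    if term.endswith(":"):
        prefix = term[:-1]
        return any(word.startswith(prefix) for word in words)
    return term in words

def matches_query(text, groups):
    """Each inner list is an OR-group; every group must match (AND)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return all(any(term_matches(t, words) for t in group) for group in groups)

# The strategy quoted for Medline, CINAHL, and HealthSTAR.
MEDLINE_QUERY = [
    ["top", "rat:", "rank:", "best"],
    ["internet", "web"],
    ["quality", "reliab:", "valid:"],
]

# Hypothetical titles, for illustration only.
print(matches_query("Rating the quality of health sites on the web", MEDLINE_QUERY))
print(matches_query("Validity of blood pressure monitors", MEDLINE_QUERY))
```

    Prefix truncation is deliberately broad, which is why a strategy of this kind returns many irrelevant records that must then be screened against the inclusion criteria by hand.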

    Table 1

    Summary of criteria for rating instruments

    We also reviewed relevant articles referenced in identified studies and links available on identified websites. We did not search the discussion list Public Communication of Science and Technology, which was consulted in the original study.

    We stopped searching for rating instruments on 22 September 2001. Rating instruments were eligible for inclusion in the review if it was possible to link from their award-like symbol to an available website describing the criteria used by an individual or organisation to judge the quality of websites on which the award was bestowed. We excluded rating instruments from review if they were used only to rate sites offering non-health information or did not provide any description of their rating criteria. In contrast to the initial study, we did not contact the developers of rating instruments to request information about their criteria if it was not publicly available on their website.

    We identified the website, group, or organisation that developed each eligible rating instrument, along with its web address. The two authors independently evaluated each rating instrument according to its validity (number of items in the instrument, availability of rating instructions, information on the development of rating criteria, and evaluation of interobserver reliability) and incorporation of the proposed criteria for evaluation of internet sites: authorship, attribution, and disclosure.2–4
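    The items recorded for each instrument can be pictured as a simple record. The sketch below is one hypothetical way to hold them; the class name, field names, and the example instrument are our own illustration and do not reproduce the data extraction form used in the study.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class InstrumentAssessment:
    """One reviewer's assessment of a single rating instrument (illustrative)."""
    name: str
    number_of_items: Optional[int]           # None if the instrument does not say
    has_rating_instructions: bool
    describes_criteria_development: bool
    reports_interobserver_reliability: bool
    authorship: bool                          # authors, affiliations, credentials
    attribution: bool                         # references or sources of content
    disclosure: bool                          # ownership, sponsorship, conflicts

    def content_criteria_met(self) -> List[str]:
        """Names of the three proposed content criteria the instrument includes."""
        flags = {
            "authorship": self.authorship,
            "attribution": self.attribution,
            "disclosure": self.disclosure,
        }
        return [name for name, met in flags.items() if met]

# A made-up instrument, not one of those reviewed.
example = InstrumentAssessment(
    name="Example Award (hypothetical)",
    number_of_items=5,
    has_rating_instructions=False,
    describes_criteria_development=True,
    reports_interobserver_reliability=False,
    authorship=True,
    attribution=False,
    disclosure=True,
)
print(example.content_criteria_met())
```

    Keeping one such record per reviewer per instrument would make it straightforward to compare the two independent evaluations item by item.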

    Results

    Fourteen rating instruments identified in the original study provided a description of their rating criteria and were therefore eligible for review. Six of these continued to function. Of the remaining eight instruments, four were no longer in operation and four had converted to a directory format. Table 1 summarises the review of the six functioning instruments. Our evaluation of one of these instruments, OncoLink's editors' choice awards, differed from that in the original study because the organisation does not provide information about the instrument on its website.

    Of the 33 rating instruments identified in the original study that were not eligible for review, three continued to function. These were Best Medical Resources on the Web (priory.com/other.htm), Dr Webster's website of the day (drWebster.com), and HealthSeek quality site award (healthseek.com). None of these rating instruments revealed its rating criteria, and they therefore remained ineligible for review. Of the remaining rating instrument websites, 10 were no longer in operation, five had been subsumed by or merged with another organisation and had a different name or purpose, and 15 still offered a website but did not function as a rating instrument.

    We newly identified 51 rating instruments. Eleven of these were identified as award-like symbols on a website offering health information, but the website of the organisation from which they originated was no longer operating (table 2). Of the remaining 40 rating instruments, 35 were associated with an active website but did not reveal the criteria by which they judge websites and were ineligible for evaluation (table 3). Five award sites discussed their evaluation criteria and were assessed (table 1). Although three of these five rating instruments exhibited one or more of the characteristics of authorship, attribution, and disclosure, none reported on the reliability and validity of the measurements or provided instructions on how to obtain the ratings.
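    The counts reported in this section can be checked against one another. The short tally below uses only figures stated in the text; the dictionary keys are our own shorthand for the categories described.

```python
# Instruments from the original study that were eligible for review.
ORIGINAL_ELIGIBLE = {
    "still functioning": 6,
    "no longer in operation": 4,
    "converted to directory format": 4,
}

# Instruments from the original study that were not eligible for review.
ORIGINAL_INELIGIBLE = {
    "still functioning": 3,
    "no longer in operation": 10,
    "subsumed or merged": 5,
    "website remains but no longer rates": 15,
}

# Newly identified instruments.
NEW = {
    "originating website not operating": 11,
    "active but criteria not revealed": 35,
    "criteria described and assessed": 5,
}

def total(counts):
    """Sum the counts in one category breakdown."""
    return sum(counts.values())

original_total = total(ORIGINAL_ELIGIBLE) + total(ORIGINAL_INELIGIBLE)
overall_total = original_total + total(NEW)

print(total(ORIGINAL_ELIGIBLE), total(ORIGINAL_INELIGIBLE), original_total)
print(total(NEW), overall_total)
```

    The subtotals agree with the 14 eligible and 33 ineligible instruments of the original study (47 in all) and with the 51 newly identified instruments, giving 98 instruments overall.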

    Table 2

    Newly identified award sites not available

    Table 3

    Newly identified available award sites not eligible for review

    Discussion

    During the past five years, we have identified a total of 98 different rating instruments that have been used to assess the quality of websites. Many of the rating instruments identified in the original study were no longer available. Fifty one additional rating instruments have been developed since 1997, and many of these had also stopped functioning. Of 51 newly identified rating instruments, only five provided some information by which they could be evaluated. As with the six rating instrument sites identified in the original study that remained available, none of these seems to have been validated. Many incompletely developed rating instruments continue to appear on websites providing health information, even when the organisations that gave rise to them no longer exist. Surprisingly, many of these rating instruments, of questionable utility and without association to an operable entity, are featured on the US Department of Health and Human Services Healthfinder website (www.healthfinder.gov/aboutus/awards.htm), which uses a detailed and rigorous selection process for the development of its own content.

    Our initial questions remain unanswered. Is it desirable or necessary to assess the quality of health information on the internet? If so, is it an achievable goal given that quality is a construct for which we have no gold standard? Some effort has been made to identify whether the presence of rating instrument awards influences consumers of health information,5 but whether validated rating instruments would have an impact on the competence, performance, behaviour, and health outcomes of those who use them remains unclear.

    Our search of the literature and the internet revealed that a large number of researchers, organisations, and website developers are exploring alternative ways to help people find and use high quality information available on the internet. Many reviews of healthcare information on the internet have been conducted, overall and for specific diseases or conditions.6–12 Examination of over 90 reviews concluded that the validity of health information available on websites is highly variable across different diseases and populations, and is in many cases potentially misleading or harmful (G Eysenbach, personal communication, 2001). Several organisations, including government and non-profit entities, have developed criteria by which to organise and identify valid health information (table 4). Other groups, such as the OMNI Advisory Group for Evaluation Criteria (omni.ac.uk) and the Collaboration for Critical Appraisal of Information on the Net (http://www.medcertain.org/), are refining technical mechanisms by which users of the internet can easily locate quality health information in a transparent manner based on evaluative meta-information labelling and indexing.13–15 The impact of these efforts remains unclear.

    Table 4

    Initiatives to organise and identify valid health information on the internet

    More recently, a European project recommended the accreditation of healthcare related software, telemedicine, and internet sites.16 Its authors suggested that software be subject to a mechanism similar to the marking of electrical goods, that national regulatory bodies be identified for telemedicine, and that a European certification of integrity scheme be developed for websites. Citing the many impediments to voluntary quality assurance for websites, they suggested the development of criteria, modifiable according to the needs of special interest groups, that would be used by accredited agencies to self label conforming websites (not only those offering health information) with a EuroSeal. Monitoring of integrity would be ongoing through cryptographic techniques.

    In conclusion, our updated study shows that award systems based on non-validated rating instruments continue to be produced but that most stop functioning soon after their release. Alternative strategies are now flourishing; whether they are valid, needed, or sustainable, and whether they make a difference, is the subject of further research.

    Acknowledgments

    Contributors: AG conducted the searches, extracted relevant data, evaluated eligible instruments, and drafted the manuscript. ARJ developed the idea for the original study, independently evaluated eligible instruments, edited the manuscript, and is guarantor for this paper.

    Footnotes

    • Funding: ARJ was supported by funds from the University Health Network, the Rose Family Chair in Supportive Care, and a Premier's Research Excellence Award from the Ministry of Energy, Science and Technology of Ontario.

    • Competing interests: None declared.

    References

    1. 1.
    2. 2.
    3. 3.
    4. 4.
    5. 5.
    6. 6.
    7. 7.
    8. 8.
    9. 9.
    10. 10.
    11. 11.
    12. 12.
    13. 13.
    14. 14.
    15. 15.
    16. 16.
