Gunther Eysenbach, Unit for Medical Informatics, Epidemiology, and Public Health,
Department of Dermatology, University Hospital Erlangen,
Hartmannstrasse 14, 91052 Erlangen, Germany
Correspondence to: Dr Eysenbach
Gunther.Eysenbach{at}derma.med.uni-erlangen.de
The principal dilemma of the internet is that, while its
anarchic nature is desirable for fostering open debate without
censorship, this raises questions about the quality of information
available, which could inhibit its usefulness. While the internet
allows "medical minority interest groups to access information of
critical interest to them so that morbidity in these rare conditions
can be lessened,"1 it also gives quacks such as the
"cancer healer" Ryke Geerd Hamer a platform
(http://www.geocities.com/HotSprings/3374/index.htm).2-4
Quality is defined as "the totality of characteristics of an
entity that bear on its ability to satisfy stated and implied needs."5 For quality to be evaluated, these needs have
to be defined and translated into a set of quantitatively or
qualitatively stated requirements for the characteristics of an entity
that reflect the stated and implied needs. So how can we define
consumers' "needs" in the case of medical information on the
internet?
The quality of medical information is particularly important because
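misinformation could be a matter of life or death.6 Thus,
studies investigating the "quality of medical information" on the
various internet venues (websites,7 mailing lists and newsgroups,8 9
and email communication between patients and doctors10) are mostly
driven by the concern of possible endangerment for patients by low
quality medical information. Thus, quality control measures should aim
for the Hippocratic injunction "first, do no harm."
Most papers published so far about the problem of quality of medical
internet information focus on assessing reliability, but, as box 1
shows, this should be only one aspect of quality measures aiming for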
"first, do no harm." Another should be to provide context. Although
these two problems are different in nature and different measures may
be proposed to solve them, we discuss a common measure that could solve
both aspects at the same time: assigning "metadata" to internet
information; both evaluative metadata to help consumers assess
reliability and descriptive metadata to provide
context.
Box 1: Why internet information is different from printed information
Characteristics of internet that make information and
communication over this medium "special"
Examples of "context deficit"
Benchmarks
Ideally, the success of methods of quality control and evaluation
would be tested by their impact on morbidity, mortality, and quality of
life. Such benchmarks would, however, be extremely difficult to
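measure.12 Therefore, measures of process and
structure13 could be used as more indirect indicators of
quality, for example, reliability, provision of context, qualification
of authors, use or acceptance of this information by consumers, etc.
Filtering and selecting information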
Table 1 shows different systems for quality control of information
on the internet. If quality control at the time of production is not
possible or not desirable,14 it could be decentralised and
consist of selecting the products complying with the quality requirements
of a consumer. Such selection may consist of downstream filtering (by
consumers) and upstream filtering (by an
intermediary).
Table 1.
Selection by third parties (upstream filtering)
Box 2: Drawbacks of upstream filtering
Volatility
Questionable validity and reliability of rating instruments
Rating cannot take into account users' context and needs
Users have to check a review service explicitly before or after reading a web page to check its rating
Summary points
The quality of information on the internet is extremely variable, limiting its use as a serious information source
A possible solution may be self labelling of medical information by web authors in combination with a systematised critical appraisal of health related information by users and third parties using a validated standard core vocabulary
Labelling and filtering technologies such as PICS (platform for internet content selection) could supply professionals and consumers with labels to help them separate valuable health information from dubious information
Doctors, medical societies, and associations could critically appraise internet information and act as decentralised "label services" to rate the value and trustworthiness of information by putting electronic evaluative and descriptive "tags" on it
Indirect "cybermetric" indicators of quality determined by computer programs could complement human peer review
Today, many reviewed indexes (review services) rate medical
websites.15 16 In this "upstream filtering" approach, third parties
set quality criteria and also perform the
evaluations, usually by means of a few human reviewers. This is one
possible form of "distributed" quality management, but it has
problems (see box 2).
The internet is too dynamic and rapidly
changing to be reviewed by a few such filtering services. A solution
for this problem could be that more and more highly specialised
services could evolve, serving the special needs of certain user groups
and focusing on certain internet venues, including newsgroups and
mailing lists.8
A recent systematic review assessing 47 rating
instruments for medical websites concluded that "many incompletely
developed instruments to evaluate health information exist on the
internet. It is unclear, however, whether they should exist in the
first place, whether they measure what they claim to measure, or
whether they lead to more good than harm."15 Many of
these services merely provide a badge or "seal of approval" or
assign stars, medals, apples, thumbs, or sunglasses to
websites,15 16 which may, at best, give users a rough
idea of the reliability of the website (leaving aside that the rating
itself may be of questionable reliability and validity).
Quality criteria are fixed by third parties, and consumers
may have different requirements from the reviewers. A link to a
document written by an expert scientist and rated four stars by another
expert may be useless for a patient. Equally, a document written for
general practitioners may be of limited use for medical specialists.
How many users who
end up directly on a website because they used a search engine take the
effort to make a second search of reviewed indexes for the rating of
that site? How many users further try to obtain the ratings from
different rating services in order to compare them and to estimate
their reliability and interobserver variance? And if they did so, how
should they interpret one service rating the website two stars and
another rating it three sunglasses?
Filtering by the user (manual downstream filtering)
An approach that circumvents some of the problems of upstream
filtering (especially that of the volatility of internet information)
is that of third parties communicating selection criteria to users
(without any attempt to rate internet information themselves) to help
consumers to evaluate ("filter") information "manually" on
their own.17 The huge drawback of this approach is that it
does not really help consumers to find high quality information
quickly, as they have to check manually each entity (website, email,
news article) against the given set of quality criteria.
Filtering by the user supported by software (automatic downstream filtering)
We therefore propose to focus on a third approach, automatic
downstream filtering. Here, quality criteria are set up by third
parties and translated into a computer readable vocabulary, and the
filtering is done, at least partly, by users' software.
Electronic labels
The World Wide Web Consortium has recently developed a set of
technical standards called PICS (platform for internet content selection)18-21 that enable people to distribute
electronic descriptions or ratings of digital works across the internet
in a computer readable form. PICS was originally developed to support
applications for filtering out pornography and other offensive
material, to protect children. An information provider that wishes to
offer descriptions of its own materials can directly embed labels in
electronic documents or other items (such as images); for example, such
labels may indicate whether the content is appropriate for particular
audiences such as minors, patients, etc.
Perhaps even more important, independent third parties, so called label
services, can describe or evaluate material: human reviewers or
automatic software (see below) rate websites and create electronic
labels. An end user's software will automatically check at the label
bureau(s) that the user is subscribed to while accessing a website or
retrieving any other kind of digital information. The software further
interprets the computer readable labels and checks them against the
requirements defined by the user. It may then, for example, display a
warning if the information is aimed at a different audience or if the
website is known to contain misleading health information, etc.
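To make the idea of an embedded label concrete, the following minimal sketch builds a label in the general shape of the PICS-1.1 label syntax and wraps it in an HTML META element. The rating service URL, the page URL, and the category names (audience, source) are hypothetical placeholders rather than the actual med-PICS vocabulary, and the sketch ignores optional label fields such as dates and signatures.

```python
def build_pics_label(service_url, page_url, ratings):
    """Return a PICS-1.1 style label for one page.

    `ratings` maps category names to numeric values; the names used
    in the example below are illustrative assumptions.
    """
    pairs = " ".join(f"{name} {value}" for name, value in ratings.items())
    return f'(PICS-1.1 "{service_url}" labels for "{page_url}" ratings ({pairs}))'


def as_meta_tag(label):
    """Wrap a label in a META element for embedding in an HTML head.

    Single quotes delimit the attribute because the label itself
    contains double quotes; a label containing a single quote is rejected.
    """
    if "'" in label:
        raise ValueError("label must not contain single quotes")
    return f"<META http-equiv=\"PICS-Label\" content='{label}'>"


label = build_pics_label(
    "http://labels.example.org/medpics",   # hypothetical rating service
    "http://www.example.org/eczema.html",  # hypothetical page
    {"audience": 2, "source": 4},          # hypothetical categories and values
)
print(as_meta_tag(label))
```

A third party label service would distribute the same kind of label from its own server instead of embedding it in the page.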
The quality criteria (in PICS terms "rating categories") and their scales are together called rating vocabulary. We have developed a prototype core vocabulary, med-PICS, for possible use with medical information.22 This vocabulary contains descriptive categories such as the intended audience (from "kids" to "highly specialised researcher"), which could be used by authors to provide "context," and evaluative categories such as "source rating" (from "highly trustworthy" to "known to provide wrong or misleading information"), which could be used by third party label services.
The main advantages of automatic downstream filtering would be:
The exact quality requirements can be set by the user, not by
the rating service alone. The rating service describes the information
with values on defined scales in different categories, and the user
determines the thresholds. For example, a user could tell the software,
"I want only material that is suitable for patients, which relates to
the healthcare setting in Britain, and which is rated of at least
medium reliability."
The software could automatically check one or more rating
services in the background, without the user having explicitly to
consult a rating service before or after entering a website or
retrieving any other kind of information.
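The two advantages above can be sketched in a few lines: the user states the requirements, and the software checks the labels from every subscribed rating service in the background. This is only an illustration under stated assumptions. The label fields (audiences, country, reliability), the four step reliability scale, and the service names are hypothetical and are not taken from med-PICS.

```python
# Hypothetical ordinal scale for an evaluative category, worst to best.
RELIABILITY_SCALE = ["misleading", "low", "medium", "high"]


def acceptable(label, audience, country, min_reliability):
    """Check one service's label against the user's own requirements."""
    reliable_enough = (
        RELIABILITY_SCALE.index(label["reliability"])
        >= RELIABILITY_SCALE.index(min_reliability)
    )
    return (
        audience in label["audiences"]
        and label["country"] == country
        and reliable_enough
    )


def services_objecting(labels_by_service, **requirements):
    """Return, sorted, the rating services whose label fails the requirements."""
    return sorted(
        name
        for name, label in labels_by_service.items()
        if not acceptable(label, **requirements)
    )


# Labels for one page as returned by two hypothetical label services.
labels = {
    "service_a": {"audiences": {"patients"}, "country": "GB", "reliability": "high"},
    "service_b": {"audiences": {"specialists"}, "country": "GB", "reliability": "medium"},
}

# "Suitable for patients, British healthcare setting, at least medium reliability."
objections = services_objecting(
    labels, audience="patients", country="GB", min_reliability="medium"
)
print(objections)  # a non-empty list could trigger a warning in the browser
```

Because the thresholds live on the user's side, the same labels can serve a patient and a specialist with different results.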
The idea of assigning standardised metadata to medical information on the internet is not new,23 but the key difference of using an infrastructure such as PICS is that not only can authors include metadata but third parties can also associate metadata to all kinds of information (see table 2). Until now metadata were primarily thought of as descriptive (provided by authors), but in the future metadata could also be evaluative (provided by third parties).
Who should evaluate and how
PICS is merely an infrastructure for distributing metadata, not a method per se to evaluate information. The questions of who should evaluate and how still remain.
Today, most of the rating of medical information is done by organisations, publishers, and sometimes individuals. We think that in the future more people from the medical community should evaluate internet information while they surf the internet. We propose a collaboration of medically qualified internet users, consisting of volunteers who, for example, get a program or browser extension that allows them to rate medical websites in a standard format. These ratings could be transmitted to one or several medical label databases, which could be used by consumers.
If thousands of doctors continuously took part in a global rating
project we might be able to keep pace with the dynamics of the
internet. With this true "bottom up" approach, one could also
easily evaluate the rating instruments in terms of variation among
observers. Further, the heterogeneity of the reviewers would take
account of the many different perspectives and backgrounds that
consumers may have as well.
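Variation among observers could be quantified in several ways; the text does not name a statistic. As one possible choice, the sketch below computes Cohen's kappa, a chance corrected measure of agreement between two raters who have rated the same set of websites. The ratings shown are invented for illustration.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance corrected agreement between two raters on the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented ratings of four websites: 1 = trustworthy, 0 = not trustworthy.
doctor_1 = [1, 1, 0, 0]
doctor_2 = [1, 0, 0, 0]
kappa = cohens_kappa(doctor_1, doctor_2)
print(round(kappa, 2))
```

With many volunteer raters, a multi-rater statistic would be needed instead; the two rater case is shown only because it is the simplest.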
Beyond peer review: automatic and semiautomatic methods of assessing quality
Traditional peer review has many problems: reviewers
are human and can make factually incorrect judgments, and peer
reviewing is very time consuming. We therefore propose that more work
should be done to explore the potential of computers to determine
indirect quality indicators by means of automatic (mathematical)
methods. Current research suggests that "web
surfing" follows strong mathematical patterns,24 and
work in the new discipline of "cybermetrics" has indicated
promising methods for measuring the impact of websites and for
distinguishing
low quality websites from high quality sites by analysis of user
behaviour, user patterns, complexity of the website, etc (box 3). Of
course, the specificity of such indicators is low (a popular website
with many users may still harm with unreliable information), but they
are sensitive and, once the methods are established and validated, easy
to obtain.
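The trade-off between sensitivity and specificity can be made explicit once an indicator has been compared with reference judgments. The following sketch assumes a hypothetical indicator ("popular site") and invented reference ratings; it only shows how the two figures would be calculated, not how well any real indicator performs.

```python
def sensitivity_specificity(indicator_positive, truly_good):
    """Return (sensitivity, specificity) of a binary quality indicator.

    indicator_positive[i] is True if the indicator flags site i as high
    quality; truly_good[i] is the reference judgment for the same site.
    """
    pairs = list(zip(indicator_positive, truly_good))
    tp = sum(1 for flag, good in pairs if flag and good)
    fn = sum(1 for flag, good in pairs if not flag and good)
    tn = sum(1 for flag, good in pairs if not flag and not good)
    fp = sum(1 for flag, good in pairs if flag and not good)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one good and one poor site")
    return tp / (tp + fn), tn / (tn + fp)


# Invented data for seven sites: three are good, four are poor.
good = [True, True, True, False, False, False, False]
popular = [True, True, True, True, True, True, False]
sensitivity, specificity = sensitivity_specificity(popular, good)
print(sensitivity, specificity)
```

In this invented example every good site is popular, but so are most poor ones, which is the pattern of high sensitivity and low specificity described above.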
Conclusion and call for action
While suggestions for an agreed formal international standard for medical publications on the internet, enforced by appropriate peer or government organisations,26 are probably not realistic, there should at least be a core standard for labelling health related information. In our proposed collaboration for critical appraisal of medical information on the internet,22 organisations, associations, societies, institutions, and individuals interested in reviewing, assessing, and compiling medical information will be invited to join the discussion.
The internet, a decentralised medium by nature, not only allows access
to information distributed on various computers but also allows a
distributed management of quality with decentralised quality control
and evaluation. Filtering techniques and infrastructures such as PICS
may help to move from the present oligarchic approach, in which a few
review services attempt to rate all the information on the internet,
towards a truly distributed, democratic, collaborative rating.
Acknowledgments
Funding: Partly supported by a grant of the German Research Net Association (DFN-Verein), Berlin, and the German Research Ministry (BMBF), Bonn, grant No TK 598-VA/I3.
Conflict of interest: None.
References
Vocabulary. 2nd ed. Geneva: International Organization for Standardization, 1994. (1994-04-01.)
acute renal failure caused by oil of wormwood purchased through the internet. N Engl J Med 1997;337:825.
a free market in information will conflict with a controlled market in health care [editorial]. BMJ 1996;312:3-4.
let the reader and viewer beware. JAMA 1997;277:1244-1245.
navigating to knowledge or to Babel? JAMA 1998;279:611-614.
(Accepted 16 July 1998)
J A Muir Gray, NHS Executive Anglia and Oxford,
Department of Health Institute of Health Sciences, Oxford OX3 7LF
graym{at}rdd-phru.cam.ac.uk
The Goldsmiths' Company was founded in London in 1327 and
has flourished for over 650 years. It never traded gold but specialised in the assay of gold and other precious metals. The Goldsmiths' Company has flourished because it has been an independent assay service, measuring the quality of gold and stamping the gold with a
hallmark to indicate to the public the purity of the metal with an
explicit system of measurement (the word "carat" derives from the
Arabic for the carob bean, for the beans of the carob are of uniform
size and can be used as standard weights).
Knowledge hallmarks are needed to perform the function of gold
hallmarks, and the Cochrane logo has already become a knowledge hallmark, clearly defining the quality of knowledge because readers can
look at the Cochrane Collaboration Handbook and see the
methods used to produce and appraise the Cochrane Reviews. Journal
titles have been another hallmark, but the dependability and
credibility of that hallmark are fading as doubts increase about the
rigour of the assay method called peer review and evidence shows that even in prestigious journals the assay procedure is flawed and unreliable. Worryingly, all the flaws in the assay procedure seem to
overemphasise the strength of the positive effect of new interventions and treatments, with a significant increase in the positive effect of
the treatment resulting from poor trial design (table 1) and biased
reporting (table 2).
This is a problem in the paper world and will be even more of a problem
in the electronic world, in part because
electronic journals are so easy to create. Every time information on
the world wide web has been critically reviewed or assayed, the quality has been shown to be very variable. Even more worrying, it is hard, and
sometimes impossible, to assess the quality of a website because the
necessary evidence is not present.
Table 1.
When the printing press was invented, there was concern that the printed word would give undue credibility to an idea or proposition. The same applied to the world wide web when it started, although people now have a healthier scepticism for anything on the web because of the rapid growth of electronic junk. However, the web is an important means of communication, and will become increasingly important when it becomes available on digital television. Already tools have been developed to monitor the quality of healthcare information: DISCERN and the National Centre for Information Quality are examples of initiatives taken to help the public appraise the quality of information provided to them. What is needed, however, is a common standard based on the intellectual equivalent of carob beans, with an Honourable Company of Healthcare Knowledgesmiths to run the assay procedure in an independent and disinterested way so that people can not only distinguish gold from a base metal but also know whether they are reading 24 carat or 18 carat knowledge.
References
Maurizio Bonati, Laboratory for Mother and Child
Health, Istituto di Ricerche Farmacologiche "Mario Negri," Via
Eritrea 62, 20157 Milan, Italy
Correspondence to: Dr Bonati
mother_child{at}irfmn.mnegri.it
Interest in searching the world wide web for health
related information continues to grow, increasing the need for
internet resources to be accountable to doctors and the
public.1 Function, structure, and content of a website are
the main aspects used to evaluate material on the
internet.2 Although we have not yet developed reliable
methods for evaluating the effects (the impact) of such material on
clinical practice or on a user's behaviour, improved technology today
allows for the control of function and structure of a
website.
Eysenbach and Diepgen propose the use of a promising automatic
"downstream filtering" system of metadata based on PICS technology. This uses a rating vocabulary that contains descriptive and evaluative categories based on rating instruments already available for evaluating health information on the internet. The authors suggest that assessing quality of information depends not only on evaluating its reliability but also on the provision of context; a valid idea in that it resembles
the traditional system of submitting and publishing scientific
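articles. Thus, providing descriptive tags (metadata) for context and
content (like supplying keywords for articles submitted for
publication) would allow more accurate searches by web browsers.
The problem lies in assigning tags for reliability of information.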
Guidelines for every aspect of health care do not exist, so each
"rater" in the authors' proposed collaboration for critical appraisal of medical information on the internet would assign his or
her own values. The benefits of having many raters need to be weighed
against the possibility of having unqualified or uninformed medical
workers (and lay people) judge web information incorrectly. Who would
then check the raters? It has, after all, been found that doctors are
also sources of incorrect, outdated information on the
internet.3
Thus far, more attention has been paid to presentation and reliability
than to the accuracy of the content material.4 To determine the accuracy of medical information on the internet we need
to compare it with the best evidence.2 The evidence based
methodology and the Cochrane Collaboration are two useful examples of
critical appraisal that should also characterise future evaluations of
websites. In the meantime, interaction and feedback may be markers of
high quality for websites: allowing a user to submit comments or
questions demonstrates a serious intention by the authors to both
improve the information supplied by them and to become respectable
sources of health information in the long run.
This is just a starting point for the demystification of medicine and
the development of real partnerships between all parties concerned. We
must find ways of producing, validating, and diffusing appropriate
information in a manner that involves users (consumers) in order to
guarantee a non-authoritarian practice, access for all to healthcare
information, and high quality information on the internet.
References
Subbiah Arunachalam, M S Swaminathan Research
Foundation, Taramani Third Cross Street, Chennai 600 113, India
subbiah_a{at}hotmail.com
Interest in how new information technology can be used to
improve health is growing steadily. Telemedicine is making it
possible to erase geographical constraints on the provision of health
care. However, the information revolution is not a worldwide
phenomenon: in India today there are fewer than two main telephone
lines per 100 people. Even in Western countries such as the United
States there is a wide disparity in terms of access to telephones
and computers between poor communities (inner city populations, blacks,
and Hispanics) and the suburban elite.
Access to technology is only a part of the problem. There are three
aspects to provision of information: collection, distribution and
dissemination, and authentication and quality control. While the
internet is good at the first two, the information it provides is not
thought to be very dependable or reliable. Eysenbach and Diepgen
address this problem with regard to medical information and suggest
"distributed quality management" as a possible solution. They argue
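their case (that questions of both relevance and reliability can be
tackled by a common measure) very well. In particular, their proposal
that both "top down" and "bottom up" approaches involving peer
review by a large body of people should be used is attractive and could
be cost effective.
There are good examples of achieving quality assurance through a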
combination of centralised and decentralised approaches in other
specialties. The United Nations Environment Programme has the maESTro
(Managing Environmentally Sound Technologies) program, which operates
from Japan and which verifies with the developer of the technology
as well as cross checks with databases
(http://www.unep.or.jp/ietc/ESTdir/maestro/introduction.html). In
physics the e-Print Archive, based in Los Alamos, works well. Usually,
if someone wants to comment on a preprint, he or she directs it to the
author, but some do forward their comments to the archive, thus making
them available to the worldwide audience.
Unlike in physics and technological information services, in
medicine a whole range of people, and not only experts, take part
in the information exchange, both inputting and searching. Well known
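sites such as those of the BMJ, JAMA, and Human Genome News
(http://www.ornl.gov/TechResources/Human_Genome/publicat/hgn/vgn3/01eyes.html)
are dependable, but what about all the material in usenet groups,
listservs, and email messages? In this respect medicine is closer to
astrology than to the hard sciences, hence the need for assuring
quality. We should encourage doctors and biomedical researchers, as
well as institutions, to comment on what they see on the internet.
Also, agencies such as Magellan and Starting Point (web search engines
that also evaluate websites) perform the function of third party
evaluators. Ultimately, the reliability of the meta-analysis approach
(gaining new insights by amalgamating existing data from different
sources) would depend on the weights we give to different constituents
in the distributed, democratic, and collaborative process
of rating suggested by Eysenbach and Diepgen. Another problem, not just
with medical information but with any information, is the cost of
standardisation of vocabulary, evaluation procedures, etc. Who will
pay?
Certain new developments in searching the world wide web, such as the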
"hyperlink induced topic search" developed by Jon Kleinberg of
Cornell University and being evaluated by IBM and Digital (now Compaq)
for implementation, can help to reduce the time taken to find relevant
medical information in an internet search (see http://www.almaden.ibm.com/cs/k53/clever.html). This is similar to the
citation links in journal literature that form the basis of the
Science Citation Index. But it is still unclear whether the system for the internet will be as powerful as the citation indexes
in clustering related material through cognitive links.
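For readers unfamiliar with the "hyperlink induced topic search" (HITS), the sketch below shows its core idea in the textbook form: pages that are linked to by good "hubs" become good "authorities," and pages that link to good authorities become good hubs. The tiny link graph is invented, and the sketch leaves out the query dependent step of selecting a subgraph of pages; it is not the implementation evaluated by IBM or Digital.

```python
def hits(links, iterations=20):
    """Return (hub, authority) scores for a directed link graph.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # Hub score: sum of authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, ())) for p in pages}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth


# Invented graph: two index pages both link to the same review article.
graph = {"index_a": ["review"], "index_b": ["review"], "review": []}
hub_scores, authority_scores = hits(graph)
print(max(authority_scores, key=authority_scores.get))
```

The analogy with citation indexing is direct: an authority plays the role of a frequently cited paper, and a hub that of a good review article or bibliography.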
Finally, we live in the real world, and there can be no ideal solution
to our problems. Every time we find a way to overcome a problem, those
that create the problem do things to make our solutions inadequate. But
scepticism should not hold us back from looking for ways to make the
internet the ultimate source of easily accessible and reliable
information.
Assuring quality and relevance of internet
information in the real world
© BMJ 1998