Commentary: measuring quality and impact of the world wide web
BMJ 1997; 314 doi: http://dx.doi.org/10.1136/bmj.314.7098.1879 (Published 28 June 1997) Cite this as: BMJ 1997;314:1879
- Jeremy C Wyatt, senior research fellow
- ICRF Centre for Statistics in Medicine, Institute for Health Sciences, PO Box 777, Oxford OX3 7LF
The world wide web gives patients and professionals access to thousands of pages of clinical information, some of which are assessed by Impicciatore et al above.1 However, although the web makes it absurdly easy to disseminate information, by allowing anonymous authors to conceal commercial or other conflicts of interest2 it does not help readers to discriminate between genuine insight and deliberate invention.3 Thus, recent proposals for improving the accountability of medical information on the internet2 will enhance its value. Sometimes, though, checking whether a web site passes the criteria of Silberg et al for explicit authorship and sponsorship, attribution of sources, and dating of material2 is not enough, as Impicciatore et al show.1 For example, most doctors would recommend to patients or junior colleagues only those web sites whose content seemed of adequate quality. Some clinicians might go further and have to satisfy themselves that a site was well constructed, easy to use, and had a beneficial impact on doctors and patients.
Thus, for many purposes, evaluation of web sites needs to go beyond mere accountability to assessing the quality of their content, functions, and likely impact (see table 1)–similar to the assessment of electronic textbooks, telemedicine, and decision support systems,4 5 6 where the same issues arise.
Evaluating the content and structure of a web site
Since internet philosophy declares that anyone can set up a web site7 there is a risk that, through ignorance or bias, the content of the site may not be correct even if the original information sources were reliable. Impicciatore et al showed that parents searching for information about treating a feverish child could either receive good advice or be advised to administer aspirin, putting their child at risk of Reye's syndrome, according to which web site they visited.1 These investigators compared the information available on each site with statements in a reputable textbook, but such statements often disagree with contemporary systematic reviews of the literature.8 Thus, to determine the accuracy of web material we need to compare it with the best evidence, which usually means a meta-analysis of the appropriate kind of evidence. For effectiveness of treatment this is randomised trials,9 but for risk factors it is cohort studies, and for diagnostic accuracy it is blinded comparisons of the test with a standard.10
An important advantage of publishing on the internet is that it allows regular, even hourly, updating,7 so that patients and professionals using the world wide web expect material to be more up to date than paper sources. The easiest way to assess timeliness is to check the date on web pages,2 but, since the material may not have been current even then, independent comparison with the most up to date facts obtained elsewhere is preferable.
Even if the content is correct and up to date, people must be able to read and understand it. The web allows information to be communicated in many ways–as diagrams, animations, linked pages, flashing red capitals on a blue background, etc–which may not always improve legibility and comprehension.11 Asking visitors to a web site to record their satisfaction with the material is unlikely to reveal problems with comprehension, as visitors may not realise that they have misunderstood or may blame themselves. For web sites intended for the general public, it is useful to decide a minimum reading age for the material; a word processor's grammar checker can then be used to assess the text's readability and reading age. The reading age that a text demands is often underestimated; for example, the minimum reading age for this paragraph is 18 years. However, such measures are less revealing than asking subjects to answer questions based on the material.
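As a rough illustration of the kind of readability measure mentioned above, the following minimal sketch computes the Flesch-Kincaid grade level, one widely used formula based on sentence length and syllables per word. It is not necessarily the formula any particular grammar checker applies. The syllable counter is a crude vowel-group heuristic, and converting a US school grade to a reading age by adding 5 years is an assumed rule of thumb; the function names are illustrative only.

```python
import re


def count_syllables(word):
    """Crude heuristic: one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level from word, sentence and syllable counts."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def approximate_reading_age(text):
    """Assumed rule of thumb: reading age is roughly US grade plus 5 years."""
    return flesch_kincaid_grade(text) + 5


if __name__ == "__main__":
    plain = "Give the child a drink."
    technical = "Administer antipyretic medication immediately."
    for sample in (plain, technical):
        print(round(approximate_reading_age(sample), 1), sample)
```

Scores from very short passages are unstable, and a formula of this kind measures only surface features of the text, which is why it cannot replace testing comprehension directly.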
Evaluating functions of a web site
One major concern of web site developers is how easily web users can find their site. While some site addresses are published in journals (such as the BMJ's “Netlines”), many users locate material by following links from other sites or conducting a search with a web search engine.7 Thus, we need to measure how many steps typical users take to locate the site and what other advice they come across on the way. Returning to Impicciatore et al,1 we do not know which of their 41 sites anxious parents would have found first; they might never have seen the misleading ones in real searches. Thus, evaluators should first identify the subset of web sites which typical users do locate and then assess the quality of these.
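One way to put a lower bound on the number of steps needed to reach a site is to treat pages as nodes in a link graph and find the shortest chain of links from a starting point. The sketch below does this with a breadth-first search; it also returns the pages passed through, which are the ones whose advice a user would meet on the way. The link graph and page names are hypothetical, and real users will often take longer routes than the shortest one, so this complements observation of actual searches and does not replace it.

```python
from collections import deque


def shortest_click_path(links, start, target):
    """Return the shortest list of pages from start to target, or None.

    links maps each page to the pages it links to (hypothetical data).
    """
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == target:
            path = []
            while page is not None:  # walk back to the start
                path.append(page)
                page = parents[page]
            return path[::-1]
        for nxt in links.get(page, []):
            if nxt not in parents:
                parents[nxt] = page
                queue.append(nxt)
    return None


if __name__ == "__main__":
    # Hypothetical link structure, for illustration only.
    links = {
        "search-results": ["site-a", "site-b"],
        "site-a": ["site-c"],
        "site-c": ["fever-advice"],
    }
    path = shortest_click_path(links, "search-results", "fever-advice")
    print(path, "steps:", len(path) - 1)
```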
Since some web sites are complex, a second question is how easily users can locate relevant material within the site. It is useful to compare users' ease of navigating through the site with the ease of using a printout of the material or the paper documents from which the web site is derived, to judge if the electronic medium makes information easier, or more difficult,12 to locate.
A third functional issue is whether the web site is actually used, and by whom. Most “server” software for web sites logs each access to each page together with the abbreviated internet address of the requesting computer. However, such records of use must be interpreted carefully: accesses to a page may be accidental, casual browsing by “info-tourists,” or by users en route to another page. Since most server logs do not distinguish repeated visits to a page by the same individual, visits to a page cannot be equated with visitors. To collect more information, users can be asked to fill in web forms, but, as with paper questionnaires, most usually fail to do this, casting serious doubt on the generality of the data.13 Even if data on use are genuine, comparison of rates of use between different sites needs to be simultaneous rather than historical, given the exponential growth in the use of the internet.
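The difference between visits and visitors can be made concrete with a small sketch that counts, for each page, both the total number of accesses and the number of distinct requesting hosts. The log format here (one "host page" pair per line) is a simplifying assumption and not the format of any particular server. Distinct hosts are still only a proxy for distinct people, since several people may share one computer and one person may use several.

```python
from collections import Counter, defaultdict


def summarise_accesses(log_lines):
    """Map each page to (total accesses, number of distinct hosts).

    Each line is assumed to hold a host and a page separated by
    whitespace; lines that do not fit this shape are skipped.
    """
    hits = Counter()
    hosts = defaultdict(set)
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        host, page = parts
        hits[page] += 1
        hosts[page].add(host)
    return {page: (hits[page], len(hosts[page])) for page in hits}


if __name__ == "__main__":
    # Hypothetical log entries, for illustration only.
    log = [
        "host-a.example /fever.html",
        "host-a.example /fever.html",
        "host-b.example /fever.html",
        "host-a.example /index.html",
    ]
    for page, (accesses, distinct) in summarise_accesses(log).items():
        print(page, "accesses:", accesses, "distinct hosts:", distinct)
```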
Evaluating the impact of a web site
For those investing resources in a web site, a key question is its likely impact on clinical processes and patient outcomes and its cost effectiveness compared with other methods for delivering the same information.5 Tentative answers to this question can be obtained by studying the impact of the site on the knowledge of sample users in laboratory settings, but its real impact on clinical practice can be studied only in the field. Randomised trials comparing the effects of providing the same information in two different ways raise problems familiar to evaluators of other kinds of information resource,4 5 6 such as contamination of the management of patients in one arm of the trial by the management of patients in the other arm, and Hawthorne effects. There do not seem to be any published trials of the effects of the world wide web on clinical practice, but such assessments are clearly essential to justify large scale expenditure on computer networking and web sites and to define adverse effects.
Methodology of evaluation
There are two key issues common to many evaluation studies: choosing appropriate subjects and making reliable, valid measurements.
Choosing appropriate subjects
Studies of information technology often use poorly selected subjects, typically enthusiasts for the technology in question.5 The reported details about the users or clinical setting may be insufficient to know if they are representative of all patients or professionals who might use the information resource. This problem is particularly acute when response rates are low.6 For example, in a survey of users a key question is what were the views or demographic profile of those–typically the majority–who used the web site but did not respond to online questions?
Making reliable, valid measurements
Measuring complex human attributes such as intelligence or ease of navigating a web site is hard, requiring systematic testing and refinement of pilot questions.6 13 Two major factors determine whether such data are useful: reliability (are the data stable across distinct but similar individuals, or the same individual tested on two occasions?) and validity (is the question measuring what we think it is measuring?). A further issue is anchoring of measurement results so that we can interpret, say, a navigation score of 3 in terms of something known, such as the ease of navigating a printed document. Reliability and validity are extremely sensitive to details of the wording of questions, so ambiguity and vagueness must be eliminated.13 However, it is not unusual to find web forms containing poorly worded questions such as, “Age: 20 to 40 years? 40 to 60 years? 60 and above?” Ideally, investigators would have access to a library of previously validated measurement methods, such as those used for quality of life. However, few methods are available for testing the effects of information resources on doctors and patients, so investigators must usually develop their own and conduct studies to explore their validity and reliability.6
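The age question quoted above shows two common faults in response categories: some ages fit more than one option and others fit none. The minimal sketch below checks a set of age brackets for both. It assumes that each bracket includes both of its end points, that an upper bound of None means "and above", and that ages from 0 to 120 should be covered; these are assumptions made for illustration.

```python
def check_brackets(brackets, min_age=0, max_age=120):
    """Return (gaps, overlaps): ages matching no bracket or more than one.

    brackets is a list of (low, high) pairs, inclusive at both ends;
    high may be None for an open-ended top category.
    """
    gaps, overlaps = [], []
    for age in range(min_age, max_age + 1):
        matches = sum(
            1 for low, high in brackets
            if low <= age and (high is None or age <= high)
        )
        if matches == 0:
            gaps.append(age)
        elif matches > 1:
            overlaps.append(age)
    return gaps, overlaps


if __name__ == "__main__":
    # The categories quoted in the text, read as inclusive ranges.
    quoted = [(20, 40), (40, 60), (60, None)]
    gaps, overlaps = check_brackets(quoted)
    print("ages with no option:", gaps)
    print("ages with two options:", overlaps)
```

Under these assumptions, a respondent aged 40 or 60 could tick two boxes and anyone under 20 has none to tick. A check of this kind catches only structural faults; ambiguity in the wording itself still needs pilot testing with real respondents.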
Although surfing the web provides an excellent method for patients and professionals to access clinical knowledge,2 7 unless we evaluate the quality of clinical sites and their effects on users, we risk drowning in a sea of poor quality information. Improved technology is not the answer to making better use of this enticing resource. We need to be clearer about the web's clinical role and the evaluation problems that it raises–how to recruit suitable subjects, develop valid and reliable methods of measurement, and carry out many more rigorous evaluations.