A Guide to the Internet
for Medical Practitioners

by m.pallen@ic.ac.uk


3. The World Wide Web


Abstract


The world wide web, or the web, provides a uniform user friendly interface to the Internet. Web pages can contain text and pictures and are interconnected by hypertext links. The addresses of web pages are recorded as uniform resource locators (URLs), transmitted by hypertext transfer protocol (HTTP) and written in hypertext mark up language (HTML). Programs that allow you to use the web are available for most operating systems. Powerful on line search engines make it relatively easy to find information on the web. Browsing through the web--"net surfing"--is both easy and enjoyable. Contributing to the web is not difficult, and the web opens up new possibilities for electronic publishing and electronic journals.



The world wide web (1) (WWW, W3, or simply "the web") is the crowning glory of the Internet, providing a uniform, user friendly interface to the Net. It allows information to be presented in a sophisticated and attractive format, interlacing pictures with text. Simply by clicking on highlighted text, you can surf the net or search for information.

The Web has fuelled such an explosion of interest in the Internet that it is easy to forget quite how new it all is. Although Tim Berners-Lee and his coworkers first put forward proposals for the world wide web in 1989-90, (2) the web was catapulted to success only with the release of Macintosh (3) and Windows (4) versions of Marc Andreessen's world wide web client program (or "web browser"), "Mosaic," in the autumn of 1993 (5). Since then, the Web has shown astonishing exponential growth.



Anatomy of the web


The web page is the basic unit of information on the web. Four elements are needed for its creation, transmission, or retrieval: hypertext; uniform resource locators (URLs); hypertext transfer protocol (HTTP); and hypertext markup language (HTML).



Hypertext

Hypertext (6) underpins the web. The term "hypertext" was coined by Ted Nelson in the 1960s, although the concept was probably first proposed by Vannevar Bush in the 1940s. (7) The hypertext idea was later incorporated into the Macintosh program Hypercard (8) and now also features in the help program built into the Windows operating system.

Hypertext, in its most basic form, functions like an electronic footnoting system, where a hypertext link takes you to more detailed information about the issue in question. However, hypertext links on the world wide web, unlike conventional footnotes, may lead you to information held anywhere on the Internet--in another town or country or even on another continent--and the information you receive may be a text, graphics, or sound file, or even a movie. Unlike gopher's menus (see box), the web's hypertext links can be embedded within any part of a web page and are not necessarily arranged in a hierarchical format. How hypertext links are highlighted varies according to the web browser--most browsers use underlining or display hypertext in a different colour (fig 3.1).

In following a series of the web's hypertext links, you trace a path through "information space" rather than through geography. A single click of your mouse on a web page in Europe might take you to a web site in Asia, Africa, the Americas, or Oceania. Hypertext encourages you to follow non-linear paths through information--metaphorically you explore a web rather than a tree.



URLs

URLs (9) (10)--uniform resource locators--are the standard form of address used in web hypertext links to retrieve or send information. You can use an URL not only to access other web pages but also to interact with other Internet services such as gopher (see box), file transfer protocol (FTP), or email.

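Most URLs contain three pieces of information: the protocol to be used, the name of the server to be contacted, and the path to the file or other resource on that server.

The format is typically: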
protocol://server-name/path
For example, this URL will take you to a web page at Queen Mary Westfield:
http://www.qmw.ac.uk/~rhbm001/index.html
whereas this URL will take you to the Queen Mary Westfield gopher (see box):
gopher://gopher.qmw.ac.uk
and this URL will retrieve a file from Queen Mary Westfield's FTP server:
ftp://ftp.qmw.ac.uk/pub/aids/aids9409.rmf
Other examples of URLs can be seen in the references to the paper version of this article.
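
The three-part structure can be seen by taking the example URLs apart with a short program. The sketch below is only an illustration: it uses the urlsplit function from Python's standard urllib.parse module, and the helper name split_url is invented here rather than taken from any of the tools discussed in this article.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return the (protocol, server name, path) parts of a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


# The example URLs quoted in the text above.
examples = [
    "http://www.qmw.ac.uk/~rhbm001/index.html",
    "gopher://gopher.qmw.ac.uk",
    "ftp://ftp.qmw.ac.uk/pub/aids/aids9409.rmf",
]

for url in examples:
    protocol, server, path = split_url(url)
    # An empty path means that no particular file was named.
    print(f"protocol={protocol!r} server={server!r} path={path!r}")
```

Note that the gopher example has an empty path: it names a server but no particular file on it.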

There are three ways in which you can use an URL to take you somewhere on the web. Firstly, you can simply click on a hypertext link on a web page. In this case, the URL is buried inside the link and is invisible to you, the user. Secondly, you can supply your web browser with the URL by typing it in manually. This option is useful if you find the URL for an interesting site printed in a magazine, journal, or book. Finally, if you are using a suitable operating system, you can cut and paste an URL from an electronic document (such as an email message) into a window on your web browser.



HTTP

HTTP (hypertext transfer protocol) is the protocol used to transfer information on the web. (11) An HTTP connection is a short lived affair, lasting just long enough for a web page or image to be requested by the browser, then sent by the web server. As a separate HTTP connection is needed to retrieve each image on a web page, displaying graphics-rich pages can be time consuming for some clients or servers. The web browser Netscape speeds things up by running multiple connections in parallel when accessing such pages.
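
To make the request and response cycle more concrete, here is a minimal sketch of what a browser might send when it asks for a page, together with the arithmetic behind the one-connection-per-image point. It assumes the simple HTTP/1.0 request format; the function names are invented for illustration, the Host line is an optional extra included by assumption, and nothing is actually sent over the network.

```python
def build_get_request(server, path):
    """Build the text of a minimal HTTP/1.0 GET request for one resource."""
    # HTTP/1.0 does not require the Host header; it is included here
    # as an assumption of this sketch because it is commonly sent.
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {server}\r\n"
        "\r\n"
    )


def connections_needed(image_count):
    """One connection for the page itself plus one per inline image."""
    return 1 + image_count


request = build_get_request("www.qmw.ac.uk", "/~rhbm001/index.html")
print(request)
print(connections_needed(image_count=5))  # a page with five inline images
```

The blank line at the end of the request tells the server that the request is complete; the server then sends back the page and closes the connection.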



HTML

Web pages are plain text documents, marked up in hypertext markup language, HTML, (12) which defines the format of areas of text by "tagging" them. (13) The tags are interpreted by a web browser so that tagged areas are displayed differently from the normal style of text (for example, as headings or as lists--fig 3.1).


Fig 3.1a: A typical web page as viewed through a web browser


<B>The Microbial Underground</B> is a collection of 
Web pages that contain Medical, Microbiological and 
Molecular Biological material, with links to other such 
material on the internet. Pride of place goes to my, as yet 
incomplete, <a 
HREF="http://www.qmw.ac.uk/~rhbm001/intro.html">On-Line 
Course in Medical Bacteriology</a>, probably the first 
WWW course on Medical Bacteriology in the world.


<p>
<img src="http://www.qmw.ac.uk/~rhbm001/construction.GIF">
<p>


<B>The Microbial Underground</B> opened on 23rd 
March 1995 and is still very much under construction. You 
can expect to see plenty of changes over the coming weeks 
and months. One change that has already happened is that the 
Microbial Underground has moved from its original site at 
Imperial College to a <A HREF="moving.html">new site at 
QMW</A>.

Fig 3.1b: The HTML version of the same page


Tags usually surround the tagged area in the format <tag>text</tag>. For example, a top level (level 1) heading is tagged as follows:

<h1>This is a top-level heading</h1>

Tags are usually defined functionally, rather than visually, so that the way in which a given tag is displayed varies from browser to browser. Take, for example, text tagged with <h1></h1>. Both Mosaic and Netscape would display this text in a large, bold font; WinWeb would also underline the text; Lynx would centre the text and then capitalise all the letters.
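
The idea that one functional tag can be given several visual treatments can be sketched in a few lines. The toy functions below are not taken from any real browser: they merely imitate, in plain text, two of the styles described above (centred capitals, and underlining), and their names are invented for this example.

```python
def render_h1_lynx_style(text, width=40):
    """Centre the heading and capitalise every letter."""
    return text.upper().center(width)


def render_h1_underlined(text):
    """Show the heading with a row of dashes underneath it."""
    return text + "\n" + "-" * len(text)


# The same functional tag, <h1>, mapped to two different visual styles.
renderers = {
    "centred capitals": render_h1_lynx_style,
    "underlined": render_h1_underlined,
}

heading = "This is a top-level heading"
for style, render in renderers.items():
    print(f"[{style}]")
    print(render(heading))
```

The author of the page writes the tag only once; the choice of renderer belongs to the reader's browser.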

Hypertext links are marked up in a special format:

 <A HREF="URL">hypertext</A> 
For example:
<A HREF="http://www.qmw.ac.uk/~rhbm001/index.html">The Microbial Underground</A>
Links marked up in this way can be used to take the reader elsewhere in the same document, to other HTML documents, or to completely different types of file.
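
Because links follow this fixed pattern, a program can pick them out of a page quite easily. The following sketch uses Python's standard html.parser and urllib.parse modules to list the links in a fragment adapted from fig 3.1b. The class name LinkCollector is invented here, and the base address used to resolve the relative link "moving.html" is an assumption made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the HREF target of every <A> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...>
        # and <a href=...> are treated alike.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


page = (
    'Pride of place goes to my <a HREF="http://www.qmw.ac.uk/~rhbm001/intro.html">'
    "On-Line Course in Medical Bacteriology</a>. The site has moved to a "
    '<A HREF="moving.html">new site at QMW</A>.'
)

# Assumed address of the page that contains the fragment above.
collector = LinkCollector("http://www.qmw.ac.uk/~rhbm001/index.html")
collector.feed(page)
for link in collector.links:
    print(link)
```

The second link is written relative to the current page, so it is resolved against the page's own address before it can be followed.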



Web browsers


There are now web browsers for almost all operating systems. (14) The simplest web browsers, such as Lynx, (15) allow users with text only Internet access to explore the web, albeit missing out on its graphical richness. If you do not have full Internet access, but do have a UNIX account, try typing "lynx" next time you are logged on; if that does not work, try one of the open access sites that allow free exploration of the web using a text only browser. (16) (17)

NCSA Mosaic (18) was the first widely used web browser that not only displayed on line images but also provided access to gopher and other non-web Internet resources. Although Mosaic revolutionised the web, it has now been replaced by Netscape Navigator (19) as the most commonly used web browser. The latest version of Netscape Navigator supports extensions to HTML (20) that, for example, allow authors to specify a background to web pages. Both Mosaic and Netscape are available free of charge to users in educational settings.

What if you have only email access to the Internet? You can still access the web by using a "WebMail" server to retrieve the text of WWW pages. (21) Send an email to webmail@curia.ucc.ie with the command GO <URL> in the body of the message, where <URL> is the URL of the document you wish to retrieve. There is no easy way to use this system to follow links embedded in the web page: you have to send any URLs found in the retrieved document back to the WebMail server in a fresh message. Nor can you enter data into forms using this approach. Another alternative, still under development, is Agora (22); for more information send an email to agora@mail.w3.org or agora@www.undp.org, with the command WWW in the body of the message.
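
A request to the WebMail server is just an ordinary email message, as the sketch below suggests. It composes such a message with Python's standard email library but does not send it. The sender address is hypothetical, and the command syntax (the word GO followed by the URL) is an assumption based on the description above.

```python
from email.message import EmailMessage


def build_webmail_request(url, sender):
    """Compose (but do not send) a request to the WebMail server."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = "webmail@curia.ucc.ie"  # address given in the text
    # Assumed command syntax: the word GO followed by the URL.
    message.set_content(f"GO {url}")
    return message


request = build_webmail_request(
    "http://www.qmw.ac.uk/~rhbm001/index.html",
    sender="reader@example.org",  # hypothetical address
)
print(request)
```

Following a link from the page that comes back would mean composing another message of the same kind with the new URL.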



Searching the web


One of the great strengths of the web that underpins its astonishing growth is that anyone with access to a web server can contribute to the web; apart from local rules as to what may be placed on the server, you do not need anyone's permission to publish on the web. (23) This is also a weakness, as it means that no one holds a central index of the web; the information you want may be out there somewhere, but you may not be able to find it.

Let us imagine that you are giving some lectures on orthopaedic surgery and want some images related to the subject. There are two approaches you can try. The first is to start your search at a site that provides a hierarchical index of the web. (24) (25) (26) (27) You might begin your search on the top menu of the popular web site, Yahoo (24). You select the link to "health," then to "medicine." (28)



Fig 3.3 Starting a search on the WWW


This takes you to a list of medical specialties, from which you select "orthopaedics." You find five orthopaedic institutes and one orthopaedic organisation listed. You select the hypertext link to orthopaedics at Queen's University, Belfast. This takes you to a web page (29) detailing the activities and resources of the orthopaedic department there.
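
Searching a hierarchical index amounts to walking down a tree of menus, one choice at a time. The sketch below models this with nested Python dictionaries. The index contents are a toy reduction of the route just described, not a copy of Yahoo's real menus, and the function name follow is invented for the example.

```python
# A toy hierarchical index: each menu maps a link name to a sub-menu
# (another dict) or to a list of sites. The entries are illustrative only.
index = {
    "Health": {
        "Medicine": {
            "Orthopaedics": [
                "Orthopaedics at Queen's University, Belfast",
            ],
        },
    },
}


def follow(menu, path):
    """Follow a sequence of link names down through nested menus."""
    node = menu
    for name in path:
        node = node[name]  # raises KeyError if the link does not exist
    return node


sites = follow(index, ["Health", "Medicine", "Orthopaedics"])
print(sites)
```

The weakness of this approach is visible in the code: you can reach a site only if you guess the right branch at every level.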



Fig 3.4 Web based orthopaedic information at Queen's University, Belfast


Now all you need do is click on the link to the Orthopaedic Image Database (30) to access the department's collection of slides.



Fig 3.5 Orthopaedic images on the web


An alternative approach to finding information on the web is to use one of the "search engines" that collect and index data published on the web. The most comprehensive of these is Lycos, (31) which currently holds information on over five million web pages. You can query several search engines one at a time via the Configurable Unified Search Interface (CUSI), (32) or in parallel, using SavvySearch (33). You choose to use SavvySearch. You type the query term "orthopaedic" into the search window and check the "images" box to tell SavvySearch what sort of sites to search for (34). SavvySearch then lists numerous links, most of which take you back to Queen's University.
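
How might a search engine answer a query so quickly? One common technique, shown here purely as an illustration and not as a description of how Lycos or SavvySearch actually work, is to build an index in advance that maps each word to the pages containing it. The page names and texts below are invented for the example.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page addresses containing it."""
    index = defaultdict(set)
    for address, text in pages.items():
        for word in text.lower().split():
            index[word.strip('.,;:()"')].add(address)
    return index


def search(index, *terms):
    """Return the addresses of pages that contain every query term."""
    matches = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*matches)) if matches else []


# Hypothetical page addresses and text, invented for this example.
pages = {
    "page-a": "Orthopaedic image database: slides of fractures.",
    "page-b": "Lecture notes on orthopaedic surgery.",
    "page-c": "Images of the heart.",
}

index = build_index(pages)
print(search(index, "orthopaedic"))
print(search(index, "orthopaedic", "image"))
```

This toy index matches whole words only, so a query for "image" does not find a page that mentions only "images"; real search engines handle such variations in their own ways.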

Once you have found an interesting site on the web, you can easily revisit it by using your history list or bookmark list. The history list is a temporary log of all of the sites that you have visited since you opened the program; you can scroll back through to revisit any of the sites listed, or just go back to the most recent site with a "go back" command. Your bookmark list is similar to the history list, except that it contains only those sites that you have deemed important enough to bookmark and, unlike the history list, it survives quitting and restarting the program.
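
The difference between the two lists is essentially one of storage: the history list lives in the program's memory, whereas the bookmark list is written to disk. The toy model below illustrates this. The class name, the JSON file format, and the file name are all assumptions made for the sketch and do not describe any particular browser.

```python
import json
import tempfile
from pathlib import Path


class BrowserSession:
    """Toy model: history lives in memory, bookmarks live in a file."""

    def __init__(self, bookmark_file):
        self.bookmark_file = Path(bookmark_file)
        self.history = []   # lost when the program quits
        self.position = -1  # index of the current site in the history
        if self.bookmark_file.exists():
            self.bookmarks = json.loads(self.bookmark_file.read_text())
        else:
            self.bookmarks = []

    def visit(self, url):
        self.history.append(url)
        self.position = len(self.history) - 1

    def go_back(self):
        """Step back one site in the history, if there is one."""
        if self.position > 0:
            self.position -= 1
        return self.history[self.position] if self.history else None

    def bookmark(self, url):
        """Save a site to the bookmark file so that it survives a restart."""
        if url not in self.bookmarks:
            self.bookmarks.append(url)
            self.bookmark_file.write_text(json.dumps(self.bookmarks))


with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "bookmarks.json"  # hypothetical file name
    session = BrowserSession(path)
    session.visit("http://www.qmw.ac.uk/~rhbm001/index.html")
    session.visit("http://www.qmw.ac.uk/~rhbm001/intro.html")
    session.bookmark("http://www.qmw.ac.uk/~rhbm001/intro.html")
    print(session.go_back())

    restarted = BrowserSession(path)  # simulates quitting and reopening
    print(restarted.history)    # the history has gone
    print(restarted.bookmarks)  # the bookmark is still there
```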



Surfing the web


The web has turned net surfing into a pleasurable pastime and an occupational hazard--fascinating but also distracting. Simply by following one hypertext link after another in a stream of consciousness fashion, one can travel vast distances in both information and geographical space, often stumbling on hidden treasures.

To illustrate this point, let us begin a net surfing session, starting again in Yahoo's Health:Medicine Section (28). A click on the Education link takes us to Yahoo's list of 11 links to medical education sites. We choose what seems to be the most comprehensive site, Shlomi Codish's index on medical education, (35) which is held on a server at the Ben-Gurion University of the Negev in Israel. Here, we select a link to a list of web based educational material on cardiology; from this list we select a link entitled "The Virtual Heart" (36). This takes us to some stunning web pages on the heart maintained by the Franklin Institute Science Museum in Philadelphia. These describe all aspects of heart function and disease and contain links to movies, to sound files, and to numerous other cardiology pages on the web.

We start to download a movie of a trip down a coronary artery (37). That will take us 10 minutes or more, so in the meantime we open a new window on our web browser and explore the institute's information about Benjamin Franklin. A link takes us to the American Declaration of Independence (38). The bad press given to King George III in that document prompts searches that take us to a review of "The Madness of King George" (39) and to information on the crystal structure of porphobilinogen deaminase (a defective version of which was probably responsible for George's illness) (40). Finally the movie arrives (fig 3.6) and our net surfing experience ends.



Fig 3.6 Travelling down a coronary artery



Contributing to the web


The richness of the web stems largely from the fact that it is so easy to contribute material to it. (23) You need no training in computer science or graphic design, just the ability to write HTML and access to a web server. Creating good quality web pages is not difficult as long as you follow the basic rules spelt out in several good introductions to web authorship (most of them available only on the web) (23) (41) (42) (43) (44) (45) (46) (47) (48) (49). How easy it will be for you to gain access to a web server will vary. Some academic institutions or departments have very strict rules governing who may put what on the web, whereas others allow any bona fide academic to launch web pages. For those outside academia, several commercial Internet service providers will make space on a web server available to their customers, albeit at a cost.

What should you include in your web pages? This will depend on whether you are an official author of your department's or institution's web pages or whether you are acting independently. Departmental and institutional web sites usually include: general information about the department or institution, with local phone numbers and email addresses; specific information about local research, educational, and clinical interests; and links to related sites elsewhere on the web. Several British medical schools now have their own home pages (50) and Brighton Health Care Trust (51) has led the way in producing the first non-academic NHS hospital site on the web. Many professional associations, such as the Association of Clinical Biochemists, (52) are also now on the web. Many individuals have created their own "home pages", (53) containing some biographical data, a picture, and links to sites that interest the author.



Publishing on the web


Although some scientists have made their research papers available through the Internet for many years, the arrival of the web presents a more attractive and powerful medium for electronic publishing. Several medical journals have now moved on to the web (54) (55) (table 3.1). Some provide only a table of contents for each issue; others, like the BMJ, publish a selection of their articles in electronic form; yet others are available in a full text format.



Table 3.1
Selected biomedical journals accessible via the web


For the professional medical publisher moving on to the web, having to use HTML is presently a problem, as:

There are several solutions to these problems. One approach, followed by the BMJ, is to mark up the publication in a more powerful markup language, SGML, (56) that can be interpreted both as HTML and by desktop publishing software. Another is to make documents accessible through the web, not as web pages, but in one of several portable document formats (e.g. Adobe Acrobat, (57) Common Ground, (58) Envoy, (59) Replica (60)). Material produced by desktop publishing software can be exported to these formats. This approach also affords more control over how the publication will appear to the reader. Communicable Diseases Report, (61) Morbidity and Mortality Weekly Report, (62) and the Journal of Biological Chemistry are all published in Acrobat's PDF format.

Medical and scientific publication on the Internet raises some problems. There is anxiety that material can be freely distributed without being peer reviewed. While this might be a problem for the naive reader, it is usually quite clear whether electronically published information has been peer reviewed or not. On the plus side, implementation of electronic peer review is likely to overcome many of the deficiencies of current peer review practice. (63) (64)

Another problem is the evanescent nature of pages on the web; links expire as web pages are moved or removed, or as servers close down. This makes it difficult to cite data on the web. Authors citing an article in the paper version of the BMJ can be fairly certain that readers a hundred years from now will still be able to find that article; for those providing a reference to the equivalent item on the web, there is no guarantee that readers will be able to find it the following week. Indeed, the BMJ changed all of its URLs while these articles were being written.

Finally, there are potential ethical problems with the reproduction of clinical data, such as images of patients, on the web. In particular, it is unclear whether additional permission should be sought from patients when existing photographic images are transferred onto the web. As a result of such considerations, the orthopaedics department at Queen's University, Belfast, has now restricted access to the image database (30) to local users.



Box: Gopher, the forerunner of the web

Gopher (65) (66) (67) is an Internet service similar to, but predating, the web. Gopher presents you with a series of menus; selecting items from a menu might take you to other menus, to text based information, or to any other sort of file. Indeed, these menu driven links, like the web's hypertext links, can take you to information anywhere on the Internet. Like the web, gopher allows easy "net surfing"--you keep making choices at each menu to see what turns up next.

The term "gopherspace" is often used to denote the information space served by gopher. You can search gopherspace with a tool called "Veronica", (68) (69) which you can access on most gophers by selecting "Other Gopher and Information Services" at the main menu, then "Searching through Gopherspace using Veronica."

Although Gopher can be used through dedicated client programs, (70) (71) (72) via certain open access UNIX sites, or even via email, (21) it is now often accessed through the web; most web browsers also act as gopher clients, and hypertext links can be made from the web pages to gopher sites using gopher specific URLs.

There are numerous medical and biological gophers, (73) covering almost everything from AIDS (74) to the WHO (75). Readers keen to explore medical gopherspace should start at lists held at the National Institute of Allergy and Infectious Diseases gopher, (73) or alternatively perform a Veronica search with the keywords "Medical Gophers" (76).



Acknowledgments

I thank the following citizens of cyberspace (most of whom I have never met IRL) for their help in preparing this series of articles:
  • Ross Anderson
  • Clive Baldock
  • Amy Brenen
  • Ted Coles
  • John Cox
  • Chris Derrett
  • Paul Drummond
  • Georg Fuellen
  • David Harper
  • Tony Helman
  • Tom Heydeman
  • Rick Jones
  • George Kernohan
  • Iain Kewley
  • Ronald LaPorte
  • Frank Norman
  • Brian Payne
  • Mike Prentice
  • Anthony Redmond
  • Jon Rogers
  • Anne Summers
  • Sheila Teasdale
  • Harriet Thompson
  • John Togno
  • Ben Toth
  • Lesley West
  • Brendan Wren
I thank my wife for her proofreading and her patience.



Note added in proof

As a result of the ethical problems discussed in the final paragraph of this article, the Department of Orthopaedics at Queen's University, Belfast has now restricted access to the image database cited in reference 30 of this WWW version of the article (ref. 7 in the dead tree version) to local users.



References


1. World Wide Web FAQ
2. WorldWideWeb: Proposal for a HyperText Project
3. NCSA Mosaic For The Macintosh 2.0 FAQ
4. NCSA Mosaic for Microsoft Windows: Frequently Asked Questions
5. WWW Project History
6. Hypertext Places: Useful WWW sites regarding hypertext, hypermedia, and world wide web.
7. As We May Think by Vannevar Bush
8. Short History of Hypertext
9. WWW FAQ: What is an URL?
10. WWW Names and Addresses, URIs, URLs, URNs, URCs
11. Overview of HTTP
12. HTML - Working and Background
13. Understanding HTML and SGML
14. World Wide Web FAQ: Obtaining and using web browsers
15. About Lynx
16. Telnet to Oxford University Radcliffe Science Library
17. Getting started with WWW
18. NCSA Mosaic Home Page
19. Welcome to Netscape
20. EXTENSIONS TO HTML
21. Accessing The Internet By E-Mail: Doctor Bob's Guide to Offline Internet Access, 4th Edition July 1995
22. Agora
23. Running A WWW Service
24. Yahoo
25. Yahoo: Computers and Internet:Internet:World Wide Web:Indices to Web Documents
26. TradeWave Galaxy
27. The WWW Virtual Library: Subject Catalogue
28. Yahoo:Health:Medicine
29. Information maintained by Orthopaedic Surgery, QUB
30. Orthopaedic Image Database
31. Lycos Home Page
32. CUSI
33. SavvySearch
34. SavvySearch query for orthopaedic images
35. Comprehensive list of Educational Medical Sites on the web
36. The Virtual Heart
37. Trip down a coronary artery
38. Declaration of Independence
39. Review - The Madness of King George
40. Tetrapyrrole Biosynthesis
41. Yahoo: Computers and Internet:Internet:World Wide Web:Authoring
42. Yale C/AIM WWW Style Manual
43. Yale C/AIM WWW Style Manual: World Wide Web Authoring Resources
44. A Beginner's Guide to HTML
45. "The HTML Sourcebook" Order at Blackwell's
46. "HTML Authoring for Fun and Profit" Order at Blackwell's
47. Graham AIS. The HTML sourcebook. John Wiley & Sons Ltd., 1995.
48. Morris M. HTML Authoring for Fun and Profit. Prentice-Hall, 1995.
49. Top 10 ways to make your WWW service a flop
50. Yahoo:Health:Medicine:Medical Schools
51. Brighton Health Care NHS Trust home page
52. Association of Clinical Biochemists Home Page
53. Yahoo:Entertainment:People
54. MedWeb: Electronic Newsletters and Journals
55. Welcome to HyperJournal
56. TEI Guidelines for Electronic Text Encoding and Interchange 2: A Gentle Introduction to SGML
57. Adobe Acrobat
58. Common Ground
59. Envoy
60. Replica
61. CDSC Home Page
62. MMWR
63. Implementing Peer Review on the Net: Scientific Quality Control in Scholarly Electronic Journals
64. Harnad S. Implementing Peer Review on the Net: Scientific Quality Control in Scholarly Electronic Journals. In: Peek R, Newby G, eds. Electronic Publishing Confronts Academia: The Agenda for the Year 2000. Cambridge MA: MIT Press, 1995.
65. Gopher FAQ via FTP
66. Gopher FAQ via WWW
67. Gopher FAQ via Gopher
68. Veronica FAQ
69. Veronica Home
70. TurboGopher
71. Gopher Software Distribution
72. Windows gopher clients
73. Bio and Medical Gophers and Info. Sites
74. AIDS/HIV Gophers
75. WHO gopher
76. Veronica Search: Medical Gophers