February 1993 - Volume 9:4
By Judy Smith and Daniel Updegrove
Franklin, Penn's online library catalog; PennInfo, our campus-wide information system; and Whois, the online electronic mail directory, are three popular resources on PennNet, the campus data network. Since passwords are not required for these systems, many users are, in fact, affiliates of other universities, government agencies, or industrial firms. Their link to PennNet is the Internet, a network of more than one million computers in forty countries.
Penn students, faculty, and staff, in turn, can access thousands of resources around the world: library catalogs, campus information systems, directories, databases, and archives. The number of institutions joining the Internet, the number of individuals with access, and the number of resources being contributed to this public domain continue to grow rapidly. But how does one navigate in such a vast sea of information?
Until recently, only intrepid researchers and networking gurus understood enough about network addressing, user command interfaces, and technical tricks to use the Internet for more than electronic mail. New navigation tools--easy to use, widely available, and free--have dramatically changed this. Now anyone can, in a matter of minutes, learn to explore from Sweden to Singapore in search of scholarly, technical, or avocational treasure. (See the "Internet Hunt" sidebar for examples.)
Five Internet navigation tools--Archie, Gopher, Veronica, WAIS, and World-Wide Web--are introduced in this article, followed by instructions on accessing Gopher, which provides links to the other four tools, as well as to PennInfo. As with PennInfo, these tools are usable via any computer that can emulate a VT100 terminal, but additional power and ease of use are available with versions that operate as clients on Macintoshes and other workstations with IP/ethernet connections.
Archie
The Internet community has been amassing text, image, software, and database resources for over twenty years. Historically, these resources have been stored in public repositories known as anonymous FTP servers. FTP is the Internet-standard high-speed file transfer protocol, used for exchange of private information by trusted parties with passwords as well as for publishing information without passwords, i.e., anonymously.
Hundreds of archives now exist but, up until a year ago, no one tracked them. Archie (ARCHIvE server) was developed at McGill University to index the contents of all FTP servers and provide keyword searching of the index. Its approach is simple but powerful: Every night it re-indexes roughly one thirtieth of the servers; the result is a database that is completely refreshed each month.
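The nightly rotation described above can be sketched in a few lines of Python. This is an illustration only, not Archie's actual code: the 30-night cycle, the function names, and the example.org host names are assumptions, and matching keywords against file names is a simplification.

```python
from typing import Dict, List, Tuple

CYCLE_NIGHTS = 30  # assumption: "roughly one thirtieth" of the servers per night


def servers_for_night(servers: List[str], night: int,
                      cycle: int = CYCLE_NIGHTS) -> List[str]:
    """Return the servers due for re-indexing on a given night (0-based)."""
    slot = night % cycle
    # Server i belongs to slot i % cycle, so every server is visited
    # exactly once per cycle.
    return [name for i, name in enumerate(servers) if i % cycle == slot]


def search(index: Dict[str, List[str]], keyword: str) -> List[Tuple[str, str]]:
    """Return (server, file name) pairs whose file name contains the keyword."""
    wanted = keyword.lower()
    return [(server, name)
            for server, names in sorted(index.items())
            for name in names
            if wanted in name.lower()]


if __name__ == "__main__":
    # Hypothetical server list: 90 servers means 3 are refreshed each night.
    servers = [f"ftp{i}.example.org" for i in range(90)]
    print(servers_for_night(servers, night=0))
```

Because each server falls into exactly one nightly slot, thirty consecutive nights cover the whole list, which is why the index is fully refreshed each month.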
Although Archie enables you to locate information, it does not allow you to view or retrieve the information. To do that, you need FTP software on an IP-connected workstation or host (see Penn Printout, March 1992).
WAIS
Wide Area Information Servers (WAIS), a joint project of Apple Computer, Dow Jones, KPMG Peat Marwick, and Thinking Machines Corporation, provides a uniform interface to many full-text databases, together with a sophisticated "relevance search" capability. You can search any WAIS database using any word or phrase, and the system will return a menu of documents, ordered from most to least relevant. WAIS databases are commonly collections of related data (The Bryn Mawr Classical Review), primary source documents (Clinton speeches), or reference works (CIA World Factbook, Roget's Thesaurus). There are currently almost 400 WAIS databases, and new ones appear frequently.
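To make the idea of a relevance-ordered result menu concrete, here is a minimal Python sketch. It does not reproduce the WAIS ranking method, which this article does not describe; it simply counts how often the query words occur in each document. The function names and sample documents are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def rank(documents: Dict[str, str], query: str) -> List[Tuple[str, int]]:
    """Return (title, score) pairs, highest score first.

    Assumption: score = total occurrences of the query words in the
    document. Documents with a score of zero are left off the menu.
    """
    terms = tokenize(query)
    scored = []
    for title, body in documents.items():
        counts = Counter(tokenize(body))
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((title, score))
    # Sort by descending score; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


if __name__ == "__main__":
    docs = {
        "Speech A": "health care reform and health insurance",
        "Speech B": "care for the economy",
        "Speech C": "trade policy",
    }
    for title, score in rank(docs, "health care"):
        print(score, title)
```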
Since it can be difficult to determine the focus of a WAIS database from its name, a Directory of Servers, itself a WAIS database, was developed. You can search this directory for topics that interest you, and it will suggest WAIS databases for you to explore. For example, you could search the directory using the keyword "religion," and you would be referred to three WAIS databases: the Book of Mormon, the Qur'an, and the Bible.
Gopher
Gopher began as the University of Minnesota's version of PennInfo, a menu-driven campus-wide information system (CWIS). Gopher's simplicity as a distributed, client/server CWIS led to its rapid adoption by other institutions, some of which developed new client or server software for desktop or host computers and contributed it to the Gopher software archive (accessible via anonymous FTP, naturally). Soon thereafter, Minnesota offered to provide a menu of all Gopher servers that any other Gopher could access. The result was what networkers have been talking about for years: an interoperating set of information systems linking several hundred organizations around the world, all with a common user interface!
The next step in Gopher's evolution was the addition of gateways to FTP, Telnet (the Internet standard remote terminal protocol), Archie, WAIS, and WWW. Gopher was thus transformed from an integrated set of CWIS programs into the most successful Internet navigation tool. But success became problematic: As the worldwide menu structure grew, locating information became increasingly tedious. Something like Archie was needed to help researchers locate information quickly in this new, ever-expanding "Gopherspace."
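A Gopher menu can point to documents, other menus, search services, and Telnet sessions on any server because each menu line carries its own item type, selector, host, and port, separated by tabs. The Python sketch below parses one such line as laid out in the Gopher protocol specification. The sample line and host name are hypothetical, and only a few item types are mapped.

```python
from typing import NamedTuple

# A few of the one-character item types defined by the Gopher protocol.
ITEM_TYPES = {
    "0": "text file",
    "1": "menu",
    "7": "search",
    "8": "telnet session",
}


class MenuItem(NamedTuple):
    kind: str      # human-readable item type
    title: str     # text shown to the user
    selector: str  # string the client sends to the server to fetch the item
    host: str      # server that holds the item
    port: int      # TCP port on that server


def parse_menu_line(line: str) -> MenuItem:
    """Parse one tab-separated Gopher menu line into a MenuItem."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 4 or not fields[0]:
        raise ValueError(f"not a Gopher menu line: {line!r}")
    # The first character of the first field is the type; the rest is the title.
    type_char, title = fields[0][0], fields[0][1:]
    kind = ITEM_TYPES.get(type_char, "other")
    return MenuItem(kind, title, fields[1], fields[2], int(fields[3]))


if __name__ == "__main__":
    # Hypothetical menu line; 70 is the standard Gopher port.
    sample = "1Other Gopher servers\t1/others\tgopher.example.edu\t70\r\n"
    print(parse_menu_line(sample))
```

Since every item names its own host and port, a menu on one campus can list items held anywhere, which is what lets a single menu tree span hundreds of organizations.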
Veronica
In November 1992, a search tool, Veronica, was contributed to Gopher by a team from the University of Nevada at Reno. The original Veronica ("Very Easy, Rodent-Oriented, Net-wide Index to Computerized Archives," a comic acronym if ever there was one) provides a search through all menus using a single keyword. The result is a dynamically created menu of all Gopher resources that contain the keyword in their menus. Now, a second Veronica search tool has appeared--an indexed WAIS database extended to allow Boolean searches of menu documents. Although both searches are limited to words in menus (as opposed to the full text of documents), the combination of Veronica and Gopher results in a powerful capability to search for and retrieve information from all over the Internet, with the location of the information effectively irrelevant.
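The two kinds of menu search can be illustrated with a short Python sketch. It is not Veronica's implementation or query syntax, which this article does not specify. The entry format, function names, and sample menu titles are assumptions chosen for the example.

```python
import re
from typing import Iterable, List, Set, Tuple

MenuEntry = Tuple[str, str]  # (menu title, host); a simplified, assumed format


def title_words(title: str) -> Set[str]:
    """Return the set of lowercase words in a menu title."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def keyword_search(entries: Iterable[MenuEntry], keyword: str) -> List[MenuEntry]:
    """Single-keyword search: keep entries whose title contains the word."""
    return boolean_search(entries, all_of=[keyword])


def boolean_search(entries: Iterable[MenuEntry],
                   all_of: Iterable[str] = (),
                   any_of: Iterable[str] = (),
                   none_of: Iterable[str] = ()) -> List[MenuEntry]:
    """Boolean search: AND over all_of, OR over any_of, NOT over none_of."""
    all_of = [w.lower() for w in all_of]
    any_of = [w.lower() for w in any_of]
    none_of = [w.lower() for w in none_of]
    results = []
    for title, host in entries:
        words = title_words(title)
        if not all(w in words for w in all_of):
            continue
        if any_of and not any(w in words for w in any_of):
            continue
        if any(w in words for w in none_of):
            continue
        results.append((title, host))
    return results


if __name__ == "__main__":
    # Hypothetical menu entries gathered from several Gopher servers.
    menus = [
        ("Library Catalogs", "gopher.a.example.edu"),
        ("Campus Phone Directory", "gopher.b.example.edu"),
        ("Library Hours", "gopher.c.example.edu"),
    ]
    print(keyword_search(menus, "library"))
    print(boolean_search(menus, all_of=["library"], none_of=["hours"]))
```

As in the text, only the words in menu titles are examined; the documents behind the menus are never read.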
World-Wide Web (WWW)
WWW was developed as a hypertext system at the European Laboratory for Particle Physics (CERN) in Geneva. It allows links within and between WWW documents and, like Gopher, provides access to other Internet resources and navigation tools. Although much admired by many in the Internet community for the elegance of its design, WWW has not proliferated as Gopher has, in part because WWW services are more complex to create and maintain, and in part because security restrictions at CERN limited Internet access.
Access to Gopher
From the PennNet annex: prompt, issue the command telnet gopher. You will be presented with a main menu that includes local and remote Gopher servers, the Gopher-PennInfo gateway (that is, PennInfo menus and documents accessible via the Gopher user interface), and Archie, Veronica, WAIS, and WWW. Alternatively, use the "worldwide" command from within PennInfo. These navigation tools currently qualify for only "best effort" support from Data Communications and Computing Services (DCCS); no formal training, local documentation, or CRC help is planned. DCCS seeks feedback about the use of these tools; send comments to Al D'Souza, Director of Program Management, at 898-2429 or via e-mail to firstname.lastname@example.org.
To obtain Gopher client (or server) software, FTP to ftp.upenn.edu or boombox.micro.umn.edu. Log in as "anonymous," use "guest" as your password, and change your directory to pub/gopher. Gopher clients may be set to "point to" the server at Minnesota; you are encouraged to reset your client to point to gopher.upenn.edu.
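For readers who would rather script the anonymous FTP steps above, the following Python sketch performs the same login, directory change, and listing with the standard ftplib module. The function name is hypothetical, and the hosts named in this article may not answer FTP requests at the time you read this, so treat it as an outline rather than a tested recipe.

```python
from ftplib import FTP
from typing import List


def list_gopher_archive(ftp: FTP, directory: str = "pub/gopher") -> List[str]:
    """Log in anonymously, change to the archive directory, and list it.

    `ftp` is an already connected ftplib.FTP object (or anything with the
    same login/cwd/nlst methods).
    """
    # The login name and password follow the instructions in the article.
    ftp.login(user="anonymous", passwd="guest")
    ftp.cwd(directory)
    return ftp.nlst()


if __name__ == "__main__":
    # Host taken from the article; the connection will fail if the
    # server no longer offers anonymous FTP.
    with FTP("boombox.micro.umn.edu", timeout=30) as connection:
        for name in list_gopher_archive(connection):
            print(name)
```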
To obtain Archie, WAIS, or WWW client software for an IP/ethernet-connected computer, simply use Veronica and Gopher. (Alternatively, search PennInfo by the keyword "navigation" to determine FTP archive addresses, and then use FTP.) From a WWW client, point to http://www-penninfo.upenn.edu:1962/ for access to PennInfo.
Home again
The joint development effort, client/server paradigm, and no-cost distribution of these navigation tools are characteristic of the Internet. Consider:
Sidebar 1: Internet Hunt
Below is a sampling of questions from the December and January Internet Hunts. This monthly treasure hunt was created by Rick Gates, Director of Library Automation at the University of California at Santa Barbara. For more information use the keywords "internet hunt" to search PennInfo.
Sidebar 2: Further Reading
Any one of the following books on the Internet is recommended for further reading:
JUDY SMITH is a Technical Writer for ISC Communications Group. DANIEL UPDEGROVE, Associate Vice Provost for Information Systems and Computing, is Executive Director of DCCS.