Archie – Search Engine – Doyle's Space

Now we just Google it. In the early days of the internet, we had to Archie it. Archie is a tool for indexing FTP archives, allowing users to easily identify specific files. The name derives from the word “archive” without the v.

The origins of the Internet date back to the development of packet switching and research commissioned by the United States Department of Defense in the late 1960s to enable the time-sharing of computers. The linking of commercial networks and enterprises by the early 1990s marked the beginning of the transition to the modern Internet.

The File Transfer Protocol (FTP) is a standard communication protocol used for the transfer of computer files from a server to a client on a computer network. FTP is built on a client–server model architecture using separate control and data connections between the client and the server.
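
To make the two-connection design a little more concrete: in FTP's passive mode, the server answers the client's PASV command on the control connection with a 227 reply naming the address and port the client should open for the data connection. The Python sketch below parses such a reply. The sample reply string and the `parse_pasv_reply` helper are made up for illustration and are not part of any library.

```python
import re


def parse_pasv_reply(reply):
    """Extract (host, port) for the data connection from a 227 PASV reply.

    The reply carries six numbers: four address bytes and two port bytes,
    where port = p1 * 256 + p2.
    """
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(n) for n in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


# Made-up reply, as it might arrive on the control connection.
host, port = parse_pasv_reply("227 Entering Passive Mode (192,0,2,10,19,137)")
print(host, port)
```

The commands and replies travel on one connection, while the file itself (or a directory listing) travels on the second connection opened to the returned host and port.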

The Archie Search Engine has often been associated with the Archie comics. In fact, this association served as an inspiration for naming some search engines that came after it: Jughead and Veronica. Jughead is the acronym for “Jonzy’s Universal Gopher Hierarchy Excavation and Display”, and Veronica is the acronym for “Very Easy Rodent-Oriented Net-wide Index to Computer Archives”.

Both Jughead and Veronica were search engine systems for Gopher, a protocol developed by Mark McCahill in 1991, at the University of Minnesota.

Archie began as a project for students and volunteer staff at the McGill University School of Computer Science^[1] in 1987, when Peter Deutsch (systems manager for the School), Alan Emtage, and Bill Heelan were asked to connect the School to the Internet. The earliest versions of Archie would simply search a list of public anonymous File Transfer Protocol (FTP) sites using the Telnet^[2] protocol and create an index of the FTP files.

To view the contents of a file, it first had to be downloaded. The index was updated on a regular basis: Archie contacted each site roughly once a month (so as not to waste too many resources of the remote servers) and requested a listing. These listings were stored in local files to be searched using the Unix grep^[3] command. The developers populated the engine’s servers with databases of anonymous FTP host directories. Because the listings were plugged into a searchable database of FTP sites, users could find files by name. Archie neither recognized natural language requests nor indexed the content inside the files, so users had to know the name of the file they wanted. The ability to index the content inside the files was first introduced by Gopher^[4].
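
The approach of collecting listings and then searching them by name can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Archie's actual code: the host names, file paths, and the `search_index` helper are hypothetical, and the in-memory dictionary stands in for the local listing files.

```python
import re

# Hypothetical directory listings, one per anonymous FTP host.
# Archie gathered these roughly once a month and saved them to local files.
LISTINGS = {
    "ftp.example.edu": [
        "/pub/gnu/emacs-18.59.tar.Z",
        "/pub/games/nethack.tar.Z",
    ],
    "ftp.example.org": [
        "/mirrors/gnu/emacs-18.59.tar.Z",
        "/docs/README",
    ],
}


def search_index(pattern, listings=LISTINGS):
    """Return (host, path) pairs whose file name matches the regular expression.

    Only names are matched, never file contents, which mirrors the
    limitation described above.
    """
    regex = re.compile(pattern)
    hits = []
    for host, paths in sorted(listings.items()):
        for path in paths:
            filename = path.rsplit("/", 1)[-1]
            if regex.search(filename):
                hits.append((host, path))
    return hits


for host, path in search_index(r"emacs"):
    print(f"{host}:{path}")
```

A search returns the host and path of each match, and the user then fetches the file from that host over FTP. A query for a word that appears only inside a file finds nothing.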

The idea came to Emtage after he had spent countless hours trying to locate information for the students and staff of the faculty. This need became the basis of Emtage’s world-changing invention: rather than continuing to do the searching himself, he set out to write software that would allow students and staff to come in and search the index themselves. Today, Archie is considered the original search engine.

Many of the techniques that Emtage and his colleagues and fellow students established are now used by Google, Yahoo, and every other major Internet search engine. At its peak, Archie had 30 servers up and running, and more than half of all Internet traffic in Canada was running through them. Archie never carried a price; like all great Internet tools, it was free to use. Archie has since been reduced to a single server at the University of Warsaw’s Interdisciplinary Centre for Mathematical and Computational Modelling in Poland, where it is still in use, though primarily for educational or historical purposes.

Footnotes

[1] McGill University is an English-language public research university located in Montreal, Quebec, Canada. Founded in 1821 by a royal charter granted by King George IV, the university bears the name of James McGill, a Scottish merchant whose bequest in 1813 formed the university’s precursor, University of McGill College (or simply, McGill College); the name was officially changed to McGill University in 1885. McGill’s main campus is on the slope of Mount Royal in downtown Montreal in the borough of Ville-Marie, with a second campus situated in Sainte-Anne-de-Bellevue, 19 miles west of the main campus on Montreal Island. The university is one of two members of the Association of American Universities located outside the United States, alongside the University of Toronto, and is the only Canadian member of the Global University Leaders Forum (GULF) within the World Economic Forum. McGill offers degrees and diplomas in over 300 fields of study, with the highest average entering grades of any Canadian university.
[2] Telnet (short for “teletype network”) is a client/server application protocol that provides access to virtual terminals of remote systems on local area networks or the Internet. Telnet consists of two components: (1) the protocol itself, which specifies how two parties communicate, and (2) the software application that provides the service. User data is interspersed in-band with Telnet control information in an 8-bit byte-oriented data connection over the Transmission Control Protocol (TCP). Telnet was developed in 1969 beginning with RFC 15, extended in RFC 855, and standardized as Internet Engineering Task Force (IETF) Internet Standard STD 8, one of the first Internet standards. Telnet transmits all information, including usernames and passwords, in plaintext, so it is not recommended for security-sensitive applications such as remote management of routers. Telnet’s use for this purpose has waned significantly in favor of SSH. Some extensions to Telnet that would provide encryption have been proposed.
[3] grep is a command-line utility for searching plain-text data sets for lines that match a regular expression. Its name comes from the ed command g/re/p (globally search for a regular expression and print matching lines), which has the same effect. grep was originally developed for the Unix operating system, but was later made available for all Unix-like systems and some others such as OS-9.
[4] The Gopher protocol is a communication protocol designed for distributing, searching, and retrieving documents in Internet Protocol networks. The design of the Gopher protocol and user interface is menu-driven and presented an alternative to the World Wide Web in its early stages, but ultimately fell into disfavor, yielding to HTTP. The Gopher ecosystem is often regarded as the effective predecessor of the World Wide Web.

Sources

Wikipedia
Stackscale
Mashable
History-Computer
Web Design Museum

Author: Doyle

I was born in Atlanta, moved to Alpharetta at 4, lived there for 53 years and moved to Decatur in 2016. I've worked at such places as Richway, North Fulton Medical Center, Management Science America (Computer Tech/Project Manager) and Stacy's Compounding Pharmacy (Pharmacy Tech).
