Search engines are the largest of the finding aids and often search the full text of many Web pages. The resources found are often numerous, numbering in the millions (representing billions of words), though not always well selected for quality or subject relevance. Even though they can be difficult to use and frequently result in many false drops, for many topics they are essential. However, be prepared to sift through many inconsequential pages to find what you want.

Search engines use spiders or robots (a type of software tool) to continually comb vast domains of Internet documents for information. The resources they find are placed in the search engine's databases, which are made available for searching by the user.
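To make the idea of a spider concrete, here is a minimal sketch in Python. It does not touch the network: the tiny "Web" below, its page names, and the `crawl` function are all made up for illustration, and a real robot is far more elaborate.

```python
from collections import deque

# Hypothetical miniature "Web": each page has some text and links to other pages.
WEB = {
    "a.html": {"text": "Home page", "links": ["b.html", "c.html"]},
    "b.html": {"text": "About the moon", "links": ["a.html", "c.html"]},
    "c.html": {"text": "About tides", "links": ["d.html", "missing.html"]},
    "d.html": {"text": "Contact", "links": []},
}

def crawl(web, start):
    """Visit every page reachable from `start`, following links breadth-first,
    and store the text of each page found in a database (a plain dict here)."""
    database = {}
    queue = deque([start])
    seen = {start}
    while queue:
        url = queue.popleft()
        page = web.get(url)
        if page is None:
            continue  # broken link: nothing to store
        database[url] = page["text"]
        for link in page["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return database

database = crawl(WEB, "a.html")
print(list(database))
```

The spider starts from one page, records what it finds, and follows every link it has not already seen, which is why pages that nothing links to may never be found.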

Users can connect to a search engine site and enter keywords to query the index. The best matches, whether Web pages or other Internet resources, are then returned as hits.
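A keyword query against an index can be sketched as follows. This is only an illustration under simple assumptions: the three pages are invented, and the index is a basic word-to-pages table rather than the design of any particular search engine.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for documents a spider has collected.
PAGES = {
    "page1.html": "The moon orbits the Earth",
    "page2.html": "Lunar eclipses are visible from Earth",
    "page3.html": "Search engines index the Web",
}

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits

index = build_index(PAGES)
print(sorted(search(index, "earth")))       # pages 1 and 2
print(sorted(search(index, "moon earth")))  # page 1 only
```

The search never reads the pages themselves; it only looks words up in the index built beforehand, which is what makes answering a query fast.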

Search engines vary according to the size of the index, the frequency of updating the index, the search options, the speed of returning a result set, the result set presentation, the relevancy of the items included in a result set, and the overall ease of use.

It is important to remember that most resources on the Internet are not subject-indexed the way resources listed in a card catalog are. The library cataloger spends much time and energy selecting the few words that best describe the resource and using those same words in a consistent manner. When a search engine selects a page based on a keyword request, it does not pay attention to how the word is used on that page. Pages created by many different people may use the same word in different ways and in different contexts. For example, one person may use the word "madonna" to describe a religious figure while someone else is discussing the rock star. Even more frustrating, a word may appear on a page only to say what is not going to be discussed, and the search engine cannot tell the difference.

Some search engines are now indexing Web documents by the meta tags in the documents' HTML, which the browser does not display to the reader. This means that the Web page author can have some influence over which keywords are used to index the document, and even over the description of the document that appears when it comes up as a search engine hit.
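As an illustration, the sketch below shows what such meta tags look like and how a program might read them. The page, its keywords, and its description are invented for the example, and the `MetaTagReader` class is a hypothetical helper built on Python's standard `html.parser` module, not the method of any particular search engine.

```python
from html.parser import HTMLParser

# A hypothetical page; the keywords and description are made up for illustration.
PAGE = """<html><head>
<title>Moon Facts</title>
<meta name="keywords" content="moon, lunar, astronomy">
<meta name="description" content="Basic facts about the Moon.">
</head><body><p>Visible text goes here.</p></body></html>"""

class MetaTagReader(HTMLParser):
    """Collect the name/content pairs of <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content:
                self.meta[name.lower()] = content

reader = MetaTagReader()
reader.feed(PAGE)

# Split the comma-separated keyword list the author supplied.
keywords = [k.strip() for k in reader.meta.get("keywords", "").split(",") if k.strip()]
print(keywords)
print(reader.meta.get("description"))
```

Note that the word "lunar" appears nowhere in the visible text of this page, yet the author has offered it as an index term.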

Unless the author of the Web document specifies the keywords for his or her document, it's up to the search engine to determine them. Each search engine has its own way of doing this, which is called its search algorithm. Some pull out words that are believed to be significant and ignore words believed to be insignificant. Some index every word on a page, some only part of the page. Sometimes words that are mentioned towards the top of a document and words that are repeated several times throughout the document are more likely to be considered important.
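The effect of position and repetition can be sketched with a toy scoring function. The weights here (double credit for a word among the first ten words of a page) are assumptions chosen for the example; real search engines do not publish their exact formulas.

```python
import re

def score(word, text, top_words=10, top_bonus=2.0):
    """Score a page for one keyword.

    Each occurrence counts 1 point; an occurrence among the first
    `top_words` words of the page counts `top_bonus` points instead.
    """
    target = word.lower()
    words = re.findall(r"[a-z0-9]+", text.lower())
    total = 0.0
    for position, w in enumerate(words):
        if w == target:
            total += top_bonus if position < top_words else 1.0
    return total

# Two hypothetical pages: one opens with the keyword and repeats it,
# the other mentions it once near the end.
page_a = "Moon phases explained. " + "filler " * 20 + "the moon again"
page_b = "filler " * 20 + "a passing mention of the moon"

print(score("moon", page_a))
print(score("moon", page_b))
```

Under these assumed weights the first page outranks the second, even though both contain the keyword.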

For the most part, search engines do not understand synonyms. A query on "moon" would not return a document that used only the word "lunar."
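The sketch below shows why literal keyword matching misses synonyms, and how a searcher can compensate by entering the related terms personally. The function names and the sample sentence are made up for the example.

```python
def matches(query_word, document):
    """Literal keyword matching: true only if the exact word appears."""
    words = document.lower().split()
    return query_word.lower() in words

def matches_any(query_words, document):
    """True if at least one of the searcher's alternative terms appears."""
    return any(matches(word, document) for word in query_words)

doc = "A lunar eclipse occurs several times a year"

print(matches("moon", doc))                   # the synonym is missed
print(matches("lunar", doc))                  # the exact word is found
print(matches_any(["moon", "lunar"], doc))    # supplying both terms works
```

The program has no idea that "moon" and "lunar" are related; it is the searcher who has to think of the alternatives.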

The better navigators allow more precision in searching via Boolean and/or proximity operators as well as through various weighting schemes (e.g., the frequency of occurrence of a chosen word in a document).
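Boolean and proximity operators can be sketched in a few lines. The operator names vary from one search engine to another, so the functions `match_and`, `match_or`, and `match_near` below are illustrative stand-ins, and the default distance of three words is an assumption.

```python
import re

def words_of(text):
    """Lowercase the text and split it into a list of words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def match_and(text, a, b):
    """Boolean AND: both words must appear somewhere in the text."""
    words = set(words_of(text))
    return a.lower() in words and b.lower() in words

def match_or(text, a, b):
    """Boolean OR: at least one of the words must appear."""
    words = set(words_of(text))
    return a.lower() in words or b.lower() in words

def match_near(text, a, b, distance=3):
    """Proximity: the two words must occur within `distance` words of each other."""
    words = words_of(text)
    positions_a = [i for i, w in enumerate(words) if w == a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= distance for i in positions_a for j in positions_b)

doc = "The space shuttle program ended long before the new lunar rocket flew"

print(match_and(doc, "shuttle", "rocket"))   # both words are present
print(match_near(doc, "shuttle", "rocket"))  # but they are far apart
print(match_near(doc, "lunar", "rocket"))    # these two are adjacent
```

A proximity search is stricter than AND: it rejects documents where the two words merely happen to appear on the same page.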

Some navigators are more useful for one subject than another. Most yield better results if one is able to do fairly precise searching on clearly defined concepts.

Some search engines also allow one to restrict one's search to certain parts of the Internet (e.g., Usenet or the Web) or to specific parts of Web documents (e.g., the title, author, abstract, keyword or URL).
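Restricting a search to one part of a document can be sketched as follows. The two records, their example.org addresses, and the `search_field` function are hypothetical; the point is only that the same keyword gives different hits depending on which field is searched.

```python
import re

# Hypothetical indexed documents, each split into named parts.
DOCUMENTS = [
    {"url": "http://example.org/moon.html",
     "title": "Moon Facts",
     "body": "Facts about tides and eclipses."},
    {"url": "http://example.org/tides.html",
     "title": "Ocean Tides",
     "body": "The moon causes tides."},
]

def search_field(documents, field, keyword):
    """Return the URLs of documents whose given field contains the keyword."""
    keyword = keyword.lower()
    hits = []
    for document in documents:
        words = re.findall(r"[a-z0-9]+", document[field].lower())
        if keyword in words:
            hits.append(document["url"])
    return hits

print(search_field(DOCUMENTS, "title", "moon"))
print(search_field(DOCUMENTS, "body", "moon"))
print(search_field(DOCUMENTS, "url", "tides"))
```

A title search tends to return fewer but more focused hits, since a word in the title usually signals what the page is mainly about.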

Because the Internet is always growing and because these search engines search in different ways and search different parts of the Internet, doing the same search using different search engines will often give you wildly differing results. Eventually one learns with practice which search engines work best for individual searching styles and topics. But the learning is a continuing process. The most important thing to remember is not to get stuck just using one method of searching or one search engine, especially if the results are poor.

Popular search engines include:

Full-text index of more than 30 million Web pages and over 14,000 news groups. A powerful and very fast search engine enables Web users to conduct precise searches for specific information by looking for phrases, specifying key words, using case-sensitive matches, and restricting searches to titles or other parts of a document.

A subject directory as well as a search engine, so use the most distinctive search terms you can. Allows Boolean AND and OR but does not allow proximity searching.

Infoseek guide
Infoseek is a combination of search engine and listing service. Part of the reason for Infoseek's success is its ability to search not only the Web and Usenet, but also Web frequently asked questions (FAQs), e-mail addresses, current news, and company listings. Use it when you want to search more than just the Web or Usenet newsgroups.

Infoseek Ultra
Has indexed the full text of over 50 million pages and is updated daily. Unique to this new search engine are automatic name recognition, recognition of capitalized word phrases without the need for quotes, and searching on all words. Plain English queries work, and it finds all word variants (e.g., "mice" will find "mouse").

Open Text
Gives you the choice of searching for a complete phrase, searching for groups of words using Boolean operators, or searching with proximity operators. Unlike services such as Excite, Lycos, and WebCrawler (which index only keywords), Open Text Index catalogs every word on every page it finds.

Small database, but great search engine and relevancy ranking. Good for quick searches.
WebCrawler handles searches in an orderly and logical manner. As a result, you'll almost always find what you're looking for, even if it takes more time than with other services. Use this search engine primarily for finding sites related to common topics.

Excite is known for its conceptual searching: it finds not only your keyword, but also many concepts related to it.

A search engine with first-rate speed and some unusual features, including the ability to limit searches to Web pages that contain specific technologies such as JavaScript or Shockwave.




Exploratorium Learning Studio
3601 Lyon Street
San Francisco, CA 94123
