|
Search engines are the largest of the finding
aids and often search the full text of many Web
pages. The resources found are often numerous,
numbering in the millions (representing billions of
words), though not always well selected for quality
or subject relevance. Even though they can be
difficult to use and frequently return many false
drops (irrelevant hits), for many topics they are
essential.
However, be prepared to sift through many
inconsequential pages to find what you want.
Search engines use spiders or robots (a type of
software tool) to continually comb vast domains of
Internet documents for information. Resources found
are placed in the search engine's databases, which
are made available for searching by the user.
Users can connect to a search engine site and
enter keywords to query the index. The best
matches, whether Web pages or other Internet
resources, are then returned as hits.
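The index-and-query cycle described above can be
sketched in a few lines of Python. This is only an
illustration, not how any particular search engine
is built: the pages, the example.org URLs, and the
function names build_index and search are all made
up for the example, and the "spider" step is
replaced by a small hard-coded set of documents.

```python
from collections import defaultdict

# Hypothetical pages standing in for documents a spider has fetched.
PAGES = {
    "http://example.org/a": "The moon orbits the earth",
    "http://example.org/b": "Lunar eclipses and the earth",
    "http://example.org/c": "Madonna in Renaissance painting",
}

def build_index(pages):
    """Map each lowercased word to the set of URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)

INDEX = build_index(PAGES)
print(search(INDEX, "earth"))       # both pages that mention "earth"
print(search(INDEX, "moon earth"))  # only the page with both words
```

Note that the user's query never touches the pages
themselves, only the index built from them.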
Search engines vary according to the size of the
index, the frequency of updating the index, the
search options, the speed of returning a result
set, the result set presentation, the relevancy of
the items included in a result set, and the overall
ease of use.
It is important to remember that most resources
on the Internet are not subject-indexed the way
resources listed in a card catalog are. The library
cataloger spends much time and energy selecting the
perfect few words that describe the resource and
using the same words in a consistent manner. When a
search engine selects a page based on a keyword
request, it does not pay attention to how the word
is used on that page. Pages created by many
different people may use the same word in different
ways and in different contexts. For example, one
person may use the word "madonna" to describe a
religious figure while someone else is discussing
the rock star. Even more frustrating is that a word
may be used on a page to describe what is not going
to be discussed, but the search engine cannot
differentiate.
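A small Python sketch makes the problem concrete.
The three page texts and the function name
keyword_hits are invented for illustration; the
matching shown is plain keyword lookup, which is
an assumption about the simplest case rather than
a description of any real engine.

```python
import re

# Hypothetical page texts: three different uses of the same word.
PAGES = {
    "art": "The Madonna appears in many Renaissance altarpieces.",
    "music": "Madonna released a new album and announced a tour.",
    "disclaimer": "This page covers Raphael and will not discuss Madonna.",
}

def keyword_hits(pages, keyword):
    """Return the names of pages whose text contains the keyword."""
    keyword = keyword.lower()
    return sorted(
        name
        for name, text in pages.items()
        if keyword in re.findall(r"[a-z]+", text.lower())
    )

print(keyword_hits(PAGES, "madonna"))
```

All three pages come back as hits, including the
one that mentions the word only to say it will not
be discussed.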
Some search engines are now indexing Web
documents by the meta tags in the documents' HTML,
which the browser does not display to the reader.
This means that the Web page author can have some
influence over which keywords are used to index the
document, and even over the description of the
document that appears when it comes up as a search
engine hit.
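As a rough sketch of how a program might read
those tags, the Python below pulls the "keywords"
and "description" meta tags out of a page using
the standard html.parser module. The sample page
and the names MetaTagParser and extract_meta are
made up for illustration; how a given search
engine actually uses the tags is not shown here.

```python
from html.parser import HTMLParser

class MetaTagParser(HTMLParser):
    """Collect the content of keywords and description meta tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if name in ("keywords", "description"):
            self.meta[name] = attrs.get("content") or ""

# Hypothetical page: the meta tags are in the head, not the visible body.
SAMPLE = """<html><head>
<title>Tide Tables</title>
<meta name="keywords" content="tides, moon, ocean">
<meta name="description" content="Daily tide tables for the bay.">
</head><body><p>Welcome.</p></body></html>"""

def extract_meta(html):
    """Return a dict of the meta tags found in an HTML string."""
    parser = MetaTagParser()
    parser.feed(html)
    parser.close()
    return parser.meta

print(extract_meta(SAMPLE))
```

The visible body of the sample page says only
"Welcome.", yet the author has supplied three
keywords and a one-line description.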
Unless the author of the Web document specifies
the keywords for his or her document, it's up to
the search engine to determine them. Each search
engine has its own way of doing this, which is
called its search algorithm. Some pull out words
that are believed to be significant and ignore
words believed to be insignificant. Some index
every word on a page, some only part of the page.
Sometimes words that are mentioned towards the top
of a document and words that are repeated several
times throughout the document are more likely to be
considered important.
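One way to picture such a scheme is the Python
sketch below. It is not any engine's actual
algorithm: the stop-word list, the decision to
treat the first quarter of the page as "the top,"
the bonus value, and the name keyword_weights are
all assumptions chosen to keep the example short.

```python
import re
from collections import Counter

# Assumed list of "insignificant" words to ignore.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "on"}

def keyword_weights(text, top_fraction=0.25, top_bonus=2.0):
    """Weight each word by how often it occurs, counting early words extra."""
    words = re.findall(r"[a-z]+", text.lower())
    cutoff = max(1, int(len(words) * top_fraction))
    weights = Counter()
    for position, word in enumerate(words):
        if word in STOP_WORDS:
            continue
        # Words near the top of the page earn the bonus weight.
        weights[word] += top_bonus if position < cutoff else 1.0
    return weights

TEXT = ("Tides of the bay. The moon pulls on the ocean "
        "and the tides follow the moon.")
for word, weight in keyword_weights(TEXT).most_common(3):
    print(word, weight)
```

Here "tides" ranks first because it appears both
at the top of the text and again later, while
"the" is dropped entirely.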
For the most part, search engines do not
understand synonyms. A query on "moon" would not
return a document that used only the word "lunar."
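The Python sketch below shows the gap. The two
documents, the hand-made SYNONYMS table, and the
function names are hypothetical; the second
function only illustrates what synonym handling
would have to add, and is not something the
engines described here are claimed to do.

```python
# Hypothetical documents: one says "moon", the other only "lunar".
DOCS = {
    "doc1": "Apollo astronauts walked on the moon",
    "doc2": "A lunar eclipse was visible last night",
}

def exact_search(docs, term):
    """Plain keyword match: the term itself must appear in the text."""
    term = term.lower()
    return sorted(d for d, text in docs.items()
                  if term in text.lower().split())

# Assumed, hand-made synonym table for illustration only.
SYNONYMS = {"moon": {"lunar"}}

def expanded_search(docs, term):
    """Match the term or any synonym listed for it."""
    term = term.lower()
    terms = {term} | SYNONYMS.get(term, set())
    return sorted(d for d, text in docs.items()
                  if terms & set(text.lower().split()))

print(exact_search(DOCS, "moon"))     # misses the "lunar" document
print(expanded_search(DOCS, "moon"))  # finds both
```

In practice the searcher usually has to supply the
synonyms, for example by running a second query.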
The better search engines allow more precision in
searching via Boolean and/or proximity operators as
well as through various weighting schemes (e.g.,
the frequency of occurrence of a chosen word in a
document).
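The difference between Boolean and proximity
operators can be sketched in Python as follows.
The function names, the sample text, and the
default distance of three words are assumptions
made for the example; each engine defines its own
operators and syntax.

```python
import re

def tokens(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def match_and(text, a, b):
    """Boolean AND: both words appear somewhere in the text."""
    words = set(tokens(text))
    return a.lower() in words and b.lower() in words

def match_or(text, a, b):
    """Boolean OR: at least one of the words appears."""
    words = set(tokens(text))
    return a.lower() in words or b.lower() in words

def match_near(text, a, b, distance=3):
    """Proximity: the words appear within `distance` words of each other."""
    words = tokens(text)
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

TEXT = ("Solar energy is discussed here. Much later the page turns "
        "to wind and water power and other energy sources.")

print(match_and(TEXT, "solar", "wind"))     # both words are on the page
print(match_near(TEXT, "solar", "wind"))    # but they are far apart
print(match_near(TEXT, "solar", "energy"))  # these two are adjacent
```

A proximity search is the stricter test: a page
can satisfy AND while the two words have nothing to
do with each other.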
Some search engines are more useful for one subject
than another. Most yield better results if one is
able to do fairly precise searching on clearly
defined concepts.
Some search engines also allow one to restrict
one's search to certain parts of the Internet
(e.g., Usenet or the Web) or to specific parts of
Web documents (e.g., the title, author, abstract,
keyword or URL).
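Restricting a search to one part of a document can
be pictured with the Python sketch below. The
three page records, their example.org URLs, and
the function name search_field are invented for
illustration, and the matching is a simple
substring test rather than any engine's real
method.

```python
# Hypothetical page records with separate fields.
PAGES = [
    {"url": "http://example.org/tides.html",
     "title": "Tide Tables",
     "body": "The moon drives the tides."},
    {"url": "http://example.org/moon.html",
     "title": "Observing Guide",
     "body": "Tips for viewing craters."},
    {"url": "http://example.org/news.html",
     "title": "Moon Landing Anniversary",
     "body": "A look back."},
]

def search_field(pages, field, term):
    """Return URLs of pages whose chosen field contains the term."""
    term = term.lower()
    return [p["url"] for p in pages if term in p[field].lower()]

print(search_field(PAGES, "title", "moon"))
print(search_field(PAGES, "url", "moon"))
print(search_field(PAGES, "body", "moon"))
```

The same keyword returns a different page for each
field, which is why a title-only or URL-only search
can cut a large result set down quickly.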
Because the Internet is always growing and
because these search engines search in different
ways and search different parts of the Internet,
doing the same search using different search
engines will often give you wildly differing
results. Eventually one learns with practice which
search engines work best for individual searching
styles and topics. But the learning is a continuing
process. The most important thing to remember is
not to get stuck just using one method of searching
or one search engine, especially if the results are
poor.
Popular search engines include:
AltaVista
Full-text index of more than 30 million Web pages
and over 14,000 newsgroups. This powerful and very
fast search engine enables Web users to conduct
precise searches for specific information by
looking for phrases, specifying key words, using
case-sensitive matches, and restricting searches to
titles or other parts of a document.
http://www.altavista.digital.com/
Lycos
A subject directory as well as a search engine, so
use the most distinctive search terms you can. Allows
Boolean AND and OR but does not allow proximity
searching.
http://www.lycos.com/
Infoseek Guide
Infoseek is a combination of search engine and
listing service. Part of the reason for Infoseek's
success is its ability to search not only the Web
and Usenet, but also Web frequently asked questions
(FAQs), e-mail addresses, current news, and company
listings. Use it when you want to search more than
just the Web or Usenet newsgroups.
http://guide.infoseek.com/
Infoseek Ultra
Has indexed the full text of over 50 million pages
and is updated daily. Unique to this new search
engine are automatic name recognition, recognition
of capitalized phrases without quotation marks, and
searching of every word. Plain-English queries
work, and it finds word variants, e.g., a search
for mice will also find mouse.
http://www.infoseek.com/Home?pg=ultra_home.html
Open Text
Gives you the choice of searching for a complete
phrase, searching for groups of words using Boolean
operators, or searching with proximity operators.
Unlike services such as Excite, Lycos, and
WebCrawler (which index only keywords), Open Text
Index catalogs every word on every page it
finds.
http://index.opentext.net/
WebCrawler
Small database, but great search engine and
relevancy ranking. Good for quick searches.
WebCrawler handles searches in an orderly and
logical manner. As a result, you'll almost always
find what you're looking for, even if it takes more
time than with other services. Use this search
engine primarily for finding sites related to
common topics.
http://webcrawler.com/
Excite
Excite is known for its conceptual searching: it
finds not only your keyword, but also lots of
concepts related to it.
http://www.excite.com/?1ag
HotBot
A search engine with first-rate speed and some
unusual features, including the ability to limit
searches to Web pages that contain specific
technologies such as JavaScript or Shockwave.
http://www.hotbot.com/
IF YOU HAVE ANY QUESTIONS OR NEED HELP
WITH ANY WEB SEARCH, EMAIL US AT
studio@exploratorium.edu
PLEASE LET US KNOW WHICH SLN SCHOOL YOU
ARE PART OF.
|