INTRODUCTION TO LOGIC USED IN
ELECTRONIC SEARCHING
Named after the nineteenth-century mathematician
George Boole, Boolean logic is a form of algebra in
which all values are reduced to either TRUE or
FALSE. There are three Boolean operators that can
be used to manipulate values and help ensure that
one's online searches will yield valuable results.
These operators, part of a system that is more than
a century old, act as extremely effective filters
for finding just the information one needs in
databases and on the Web. Most of today's search
engines support some form of Boolean query. Check
the help section of a favorite search engine to
find out whether it allows Boolean searches.
While most search engines and databases
use Boolean logic, they don't always express this
concept using the same terms. Therefore, it is
important to have a basic understanding of logical
systems so one can recognize the underlying process
whenever one encounters a new search engine or
database. We will be using examples from AltaVista,
a popular Web search engine.
Boolean Operators
AND, OR, NOT...they're simple words. Use them
correctly and the result will be relevant hits
instead of thousands of unrelated ones.
AND
The AND operator makes sure ALL the terms
one requests appear in the search results. If one
types hockey the result will be a lot of
hits on that sport. If one types sharks the
result will be a lot of hits, mostly on the animal.
But if one types hockey AND sharks, there is
a better chance of retrieving hits about the San
Jose, CA team.
AltaVista example: macaroni and cheese
The above would retrieve records or sites
containing both those terms in any order, even in
separate fields or paragraphs, with any number of
characters in between. Macaroni could be an
author's name and cheese could be in the title. One
could retrieve a recipe for a cheese sauce for
macaroni. Using AND returns very broad and
sometimes very surprising results, particularly
when searching full-text files, such as those on
the Web.
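The filtering effect of AND can be pictured as the intersection of two sets of pages. The following is a minimal Python sketch of that idea, not a description of how AltaVista works; the three sample pages and the helper name pages_containing are invented for illustration.

```python
# Hypothetical mini-collection: page id -> text (invented for illustration).
PAGES = {
    "p1": "hockey scores and schedule for the season",
    "p2": "sharks are fish found in every ocean",
    "p3": "the san jose sharks hockey team won last night",
}

def pages_containing(term):
    """Return the set of page ids whose text contains the word `term`."""
    return {pid for pid, text in PAGES.items() if term in text.split()}

hockey = pages_containing("hockey")   # pages about the sport
sharks = pages_containing("sharks")   # pages about the animal or the team

# AND keeps only the pages that appear in both sets (set intersection).
hockey_and_sharks = hockey & sharks
print(sorted(hockey_and_sharks))  # prints ['p3']
```

Each single-term search returns two pages here, while the AND search returns only the one page that mentions both terms.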
OR
Use OR to retrieve records or pages that
contain EITHER of two or more terms. For example,
Microsoft OR Netscape will find information
that mentions either or both companies. OR
is frequently used to search for synonymous terms
or a variety of specific ways of expressing a
general concept.
AltaVista Examples:
hives or urticaria
minority or asian or latino
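In set terms, OR is a union: a page qualifies if it appears in either set of hits. The sketch below assumes a tiny invented collection of pages and a hypothetical helper, pages_containing; it illustrates the logic only and is not any search engine's implementation.

```python
# Hypothetical mini-collection: page id -> text (invented for illustration).
PAGES = {
    "p1": "hives can follow an allergic reaction",
    "p2": "urticaria is treated with antihistamines",
    "p3": "chronic urticaria also called hives",
    "p4": "macaroni and cheese recipe",
}

def pages_containing(term):
    """Return the set of page ids whose text contains the word `term`."""
    return {pid for pid, text in PAGES.items() if term in text.split()}

hives = pages_containing("hives")
urticaria = pages_containing("urticaria")

# OR keeps every page that appears in at least one set (set union).
hives_or_urticaria = hives | urticaria
print(sorted(hives_or_urticaria))  # prints ['p1', 'p2', 'p3']
```

Searching on either synonym alone would miss one of the three relevant pages; the OR search finds all of them.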
NOT
Use NOT to ensure that certain words won't
appear in the search selections. For example,
modems NOT internal will narrow the search
to external modems. Of course one could have used
external AND modems to search for the same
concept, and, all-in-all, using AND is
usually preferable. As we will discuss later,
NOT is a very powerful operator that should
be used with caution.
AltaVista Examples:
AltaVista wants you to use NOT at the beginning
of a search statement and AND NOT in the middle:
clinton AND NOT whitewater
macaroni AND NOT cheese
NOT gold AND silver
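AND NOT corresponds to a set difference: start with the pages that match the first term and remove those that also match the second. A minimal Python sketch, assuming an invented three-page collection and a hypothetical helper named pages_containing:

```python
# Hypothetical mini-collection: page id -> text (invented for illustration).
PAGES = {
    "p1": "macaroni and cheese recipe",
    "p2": "macaroni salad recipe",
    "p3": "cheese sauce for pasta",
}

def pages_containing(term):
    """Return the set of page ids whose text contains the word `term`."""
    return {pid for pid, text in PAGES.items() if term in text.split()}

macaroni = pages_containing("macaroni")
cheese = pages_containing("cheese")

# AND NOT removes every page in the second set from the first set.
macaroni_and_not_cheese = macaroni - cheese
print(sorted(macaroni_and_not_cheese))  # prints ['p2']
```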
Adjacency
Many Web search engines let you search on a
phrase: if you place the words in double quotes,
the words must appear immediately next to one
another and in the order you type them.
AltaVista examples:
"hot air balloons"
"american psychological association"
"gone with the wind"
AltaVista also allows the use of a proximity
operator, NEAR. NEAR limits the results to
documents in which the terms occur within ten
words of each other.
AltaVista example: art NEAR renaissance
This would retrieve
Art and Architecture of the Italian Renaissance
and also
The Northern Renaissance and Art
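Phrase and proximity matching both depend on word positions rather than on mere presence. The following Python sketch shows one way to express the two tests; the function names contains_phrase and near are hypothetical, and counting "within ten words" as a difference of at most ten word positions is an assumption, since the exact rule AltaVista applies is not stated here.

```python
def contains_phrase(text, phrase):
    """True if the words of `phrase` occur adjacent and in order in `text`."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    # Slide a window of n words across the text and compare.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def near(text, a, b, window=10):
    """True if words `a` and `b` occur within `window` positions of each
    other, in either order (an assumed reading of NEAR)."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

print(contains_phrase("we rode in hot air balloons today", "hot air balloons"))
print(near("Art and Architecture of the Italian Renaissance", "art", "renaissance"))
print(near("The Northern Renaissance and Art", "art", "renaissance"))
```

All three calls print True. The phrase test fails if the words are present but reordered, while the proximity test accepts either order as long as the words are close together.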
Combining Operators
The Boolean and proximity operators perform
different functions even when they are used to
relate the same topics. In search strategy an
AND links ideas to create a compound
subject. Java AND silicon valley will limit
the results to only Web pages which include both
subjects. Use of the AND operator will yield
a number of items less than or equal to the number
in either of the two sets, while use of the
OR operator will yield a number of items
greater than or equal to the number in either of
the sets.
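The size relationships described above can be checked with a few lines of Python. The page numbers below are invented; only the relationships between the counts matter.

```python
# Hypothetical result sets, written as sets of page numbers (invented).
java = {1, 2, 3, 4, 5}
silicon_valley = {4, 5, 6}

both = java & silicon_valley      # java AND silicon valley
either = java | silicon_valley    # java OR silicon valley

# AND can never return more than the smaller of the two sets.
assert len(both) <= min(len(java), len(silicon_valley))
# OR can never return fewer than the larger of the two sets.
assert len(either) >= max(len(java), len(silicon_valley))
# The two counts are tied together: |A OR B| = |A| + |B| - |A AND B|.
assert len(either) == len(java) + len(silicon_valley) - len(both)

print(len(both), len(either))  # prints: 2 6
```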
The OR operator is often used when an
idea can be expressed by synonyms in order to
broaden the search. For example, the search
statement java OR coffee will cover the
topic of coffee drinks better than a search for
either topic alone, because an article which uses
the term coffee may not use the word
java and vice versa.
Java AND silicon valley could result in
hits about drinking caffeinated beverages in that
part of the San Francisco Bay Area, or about the
Java programming language. This type of
linguistic problem can be avoided by using the
NOT operator to restrict the search. java
AND silicon valley NOT coffee will exclude
most hits on caffeinated beverages.
The best searches combine operators. By
understanding the way Boolean operators manipulate
topics and by being able to picture compound
subject headings, one can enter precise and logical
search statements which, when applied to computer
databases and the Internet, will result in the
wanted information. An accurately formulated
request ensures an effective and thorough search of
the electronically stored materials and guards
against searches which are time-consuming and
yield few results.
A Note about NOT
NOT can be a very dangerous operator to
use because it eliminates every page which includes
the excluded term. There are many items that include both
the term wanted and the term not wanted and they
would all be eliminated. For example, if one wanted
information on lunar eclipses, the search strategy
would be lunar AND eclipses. It is tempting
to type eclipses NOT solar, but many pages
that have information on solar eclipses also have
information on lunar eclipses and one would miss
many relevant hits.
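The lunar eclipse example can be worked through with sets to see exactly which page is lost. This is a sketch over three invented pages, using a hypothetical helper named pages_containing.

```python
# Hypothetical mini-collection: page id -> text (invented for illustration).
PAGES = {
    "p1": "lunar eclipses explained",
    "p2": "calendar of solar and lunar eclipses",
    "p3": "solar eclipses viewing safety",
}

def pages_containing(term):
    """Return the set of page ids whose text contains the word `term`."""
    return {pid for pid, text in PAGES.items() if term in text.split()}

eclipses = pages_containing("eclipses")
solar = pages_containing("solar")
lunar = pages_containing("lunar")

with_not = eclipses - solar    # eclipses NOT solar
with_and = lunar & eclipses    # lunar AND eclipses

# Relevant pages that the NOT search throws away.
missed = with_and - with_not
print(sorted(with_not), sorted(with_and), sorted(missed))
# prints ['p1'] ['p1', 'p2'] ['p2']
```

Page p2 covers both kinds of eclipse, so the NOT search discards it even though it contains the information wanted.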
Venn Diagrams
Venn diagrams, invented by English logician John
Venn (1834-1923) to portray visually the algebra of
Boolean logic and set theory, are used to
illustrate the effects which the operators have on
the topics. If one imagines a rectangle to contain
all "Knowledge," then three circles inscribed within
its outline will represent different portions of
knowledge. The circles will naturally occur within
the block. The circles overlap because units of
information can contain many distinct values. If
circle A represents "ice cream," it contains
all the flavors of ice cream; if circle B
is "vanilla," it contains the universe of vanilla,
including vanilla beans, vanilla cookies, vanilla
soda, etc. Overlapping the two circles will create
an area common to both. This area will denote
"vanilla ice cream."
Venn diagrams can help novice searchers
visualize their search strategy.
Steps to Formulating and
Conceptualizing a Search
1. Identify concepts
When conducting any search, it is necessary to
break down the topic into its component concepts.
For example, if one wanted to find information on
the budget negotiations between President Clinton
and the Republicans, these are the concepts:
clinton, republicans, budget.
2. List keywords for each concept and their
synonyms
Once the concepts have been identified, one
needs to list keywords which describe each concept.
Some concepts may have only one keyword, while
others may have many. For example:
Concept 1: clinton, democrats
Concept 2: gingrich, republicans
Concept 3: budget, budget negotiations, budget
battle, budget impasse, budget deal
Depending on the focus of the search, other
keywords may be more appropriate.
3. Specify the logical relationships among the
keywords.
Venn diagrams may help. Once the keywords are
known, it is necessary to establish the logical
relationships among them using Boolean logic and
any or all of the logical or proximity operators.
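(clinton OR democrats) AND (gingrich OR
republicans) AND budget
The parentheses keep the synonyms for each concept
together. Check that the search engine supports
them before relying on this grouping.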
Notice that only the word budget is used
in the first pass of the search. This is because
one is never sure how much information is out there
and it is good to be as broad as possible in
the beginning while still being true to the search
goal. Later on, if many records or pages are
retrieved, one can narrow the search by adding
terms like impasse or negotiations.
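The three steps can be sketched as a small Python function that turns lists of keywords into a search statement: keywords within a concept are joined with OR, and the concepts are joined with AND. The function name build_query is hypothetical, and the use of parentheses for grouping and double quotes for multi-word keywords assumes the search engine accepts both; without grouping, an engine that evaluates AND before OR could read the statement differently from what was intended.

```python
def build_query(concepts):
    """Join keywords within a concept with OR, and concepts with AND."""
    groups = []
    for keywords in concepts:
        # Quote multi-word keywords so they are searched as phrases.
        terms = ['"%s"' % k if " " in k else k for k in keywords]
        group = " OR ".join(terms)
        # Parenthesize only when a concept has more than one keyword.
        groups.append("(%s)" % group if len(terms) > 1 else group)
    return " AND ".join(groups)

# First pass: only the broadest keyword for the budget concept.
concepts = [
    ["clinton", "democrats"],
    ["gingrich", "republicans"],
    ["budget"],
]
query = build_query(concepts)
print(query)
# prints (clinton OR democrats) AND (gingrich OR republicans) AND budget
```

To narrow a later pass, one could replace the last list with more specific keywords such as ["budget impasse", "budget negotiations"].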
Some search engines offer Boolean searching
without mentioning the logical operators by name.
For example, one might be asked to list the search
terms and choose that ALL of these terms be
searched. This denotes AND logic. Specifying
ANY of these terms denotes OR logic.
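The correspondence between the ALL and ANY options and AND and OR logic can be shown with Python's built-in all() and any() functions. The function name matches and the mode labels are hypothetical; a real search form may word its options differently.

```python
def matches(text, terms, mode):
    """Test one page against a list of terms using ALL (AND) or ANY (OR)."""
    words = set(text.lower().split())
    hits = [term in words for term in terms]
    if mode == "ALL":
        return all(hits)   # AND logic: every term must be present
    if mode == "ANY":
        return any(hits)   # OR logic: at least one term must be present
    raise ValueError("mode must be 'ALL' or 'ANY'")

page = "microsoft releases a new browser"
terms = ["microsoft", "netscape"]
print(matches(page, terms, "ANY"))  # prints True
print(matches(page, terms, "ALL"))  # prints False
```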
Tips on Conducting Searches
1. Read the directions at each search site.
The technique for formulating a search depends
on the search engine one is using. There is a wide
variety of options available among the different
search engines.
2. Know when to use the "advanced" mode.
If it is a multi-term search, use an advanced or
Boolean search option if it is available. This will
help specify the logical relationships among the
terms.
3. Include synonyms or alternate spellings in
the search statements and connect these terms with
OR logic.
This is especially helpful if very little is
retrieved by the original search.
4. Check spelling.
5. Take advantage of capitalization if the
search engine is case sensitive.
6. If the results are not satisfactory, repeat
the search using alternative terms.
7. If there are too many results, or results
that are not relevant:
- Add concept words
- Use vocabulary that is more specific to the
topic
- Narrow the search to an individual field in
a record or parts of the Web page such as Title,
Summary, First Heading, etc.
- Use the Boolean NOT to keep out records
containing terms not wanted
8. If there are too few results:
- If possible, drop off the least important
concept(s) to broaden the subject. For example,
if looking for hives AND strawberries AND
treatment and nothing is found, decide
whether the source of the hives is more
important, in which case one would choose
hives AND strawberries or if the cure is
more important, in which case one would choose
hives AND treatment.
- Use vocabulary which is more general,
for example: allergy instead of hives.
- Add alternate terms or spellings for
individual concepts and connect with OR
9. Try different sources within search engines
to diversify the results.
Sources can include Usenet newsgroups, Internet
FAQs, reviewed pages, and more. In commercial
databases, choose other databases.
10. Experiment with different search engines.
No two search engines work from the same
database or have exactly the same way of going
about a search.
11. Try Web sites which allow searching of
multiple search engines simultaneously.
Be aware that one will lose access to advanced
query options since not all engines offer them.
If you still can't find what you are
looking for, ask us: send email to
studio@exploratorium.edu.
Let us know where you have looked and what
strategies you have used. We will not only try to
find the information for you, but give you feedback
on your search approaches. That way we can learn
together.
Readings
1-AltaVista Help for Advanced Queries
http://altavista.digital.com/av/content/help_advanced.htm
2-University of Southern California "Boolean
logic"
http://www-lib.usc.edu/Info/ilsdoc/pac/boolean.html
Review Venn diagrams for AND, OR and NOT
3-Page, Adam, "The Search is Over: The
search-engine secrets of the pros," PC/Computing,
v. 9, #10, October 1996, pp. 143+
http://www.zdnet.com/pccomp/features/fea1096/sub2.html