INTRODUCTION TO LOGIC USED IN ELECTRONIC SEARCHING

 

Named after the nineteenth-century mathematician George Boole, Boolean logic is a form of algebra in which all values are reduced to either TRUE or FALSE. Three Boolean operators (AND, OR, and NOT) can be used to manipulate values and help ensure that one's online searches yield valuable results.

These operators, part of a system more than a century old, act as extremely effective filters for finding just the information one needs in databases and on the Web. Most of today's search engines support some form of Boolean query. Check the help section of a favorite search engine to find out whether it allows Boolean searches.

While most of the search engines and databases use Boolean logic, they don't always express this concept using the same terms. Therefore, it is important to have a basic understanding of logical systems so one can recognize the underlying process whenever one encounters a new search engine or database. We will be using examples from AltaVista, a popular Web search engine.

Boolean Operators

AND, OR, NOT...they're simple words. Use them correctly and the result will be relevant hits instead of thousands of unrelated ones.

AND

The AND operator makes sure ALL the terms one requests appear in every search result. If one types hockey, the result will be a lot of hits on that sport. If one types sharks, the result will be a lot of hits, mostly on the animal. But if one types hockey AND sharks, there is a better chance of retrieving hits about the San Jose Sharks, the hockey team from San Jose, CA.

AltaVista example: macaroni and cheese

The above would retrieve records or sites containing both of those terms in any order, even in separate fields or paragraphs, with any number of characters in between. Macaroni could be an author's name and cheese could be in the title. One could retrieve a recipe for a cheese sauce for macaroni. Using AND returns very broad and sometimes very surprising results, particularly when searching full-text files, such as those on the Web.
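As a rough illustration of why AND can surprise, the following Python sketch treats each record as a bag of words drawn from all of its fields. The records, the field names, and the search_and helper are invented for this example; it is not a description of how AltaVista works internally.

```python
# Toy records; the authors and titles are invented for illustration.
records = {
    1: {"author": "A. Macaroni", "title": "A History of Cheese"},
    2: {"author": "B. Cook", "title": "Macaroni and Cheese Recipes"},
    3: {"author": "C. Baker", "title": "Cheese Sauce Basics"},
}

def words(record):
    """Return the set of lowercase words found in any field of a record."""
    text = " ".join(record.values()).lower()
    return {w.strip(".,") for w in text.split()}

def search_and(*terms):
    """Return IDs of records containing every term, in any field or order."""
    return sorted(rid for rid, rec in records.items()
                  if all(t in words(rec) for t in terms))

print(search_and("macaroni", "cheese"))  # [1, 2]
```

Record 1 matches even though macaroni is only the author's name, which is exactly the kind of surprising hit described above.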

OR

Use OR to retrieve records or pages that contain EITHER of two or more terms. For example, Microsoft OR Netscape will find information that mentions either or both companies. OR is frequently used to search for synonymous terms or a variety of specific ways of expressing a general concept.

AltaVista Examples:

hives or urticaria

minority or asian or latino
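In set terms, OR is a union. Here is a minimal sketch, assuming each term maps to a made-up set of page IDs (the postings table and the search_or helper are hypothetical):

```python
# Hypothetical postings: which page IDs mention each term.
postings = {
    "hives": {1, 2, 5},
    "urticaria": {2, 7},
}

def search_or(*terms):
    """Return sorted IDs of pages mentioning at least one of the terms."""
    hits = set()
    for term in terms:
        hits |= postings.get(term, set())
    return sorted(hits)

print(search_or("hives", "urticaria"))  # [1, 2, 5, 7]
```

Page 2 uses both words but is returned only once; pages 1, 5, and 7 would each have been missed by a search on the other synonym alone.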

NOT

Use NOT to ensure that certain words won't appear in the search results. For example, modems NOT internal will narrow the search to external modems. Of course, one could have used external AND modems to search for the same concept and, all in all, using AND is usually preferable. As we will discuss later, NOT is a very powerful operator that should be used with caution.

AltaVista Examples:

AltaVista wants you to use NOT at the beginning of a search statement and AND NOT in the middle:

clinton AND NOT whitewater

macaroni AND NOT cheese

NOT gold AND silver
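One way to picture these statements is as set operations over page IDs. The sketch below uses invented data (all_pages and postings are hypothetical) and assumes that NOT binds more tightly than AND, so that NOT gold AND silver is read as (NOT gold) AND silver.

```python
# Invented page IDs for illustration only.
all_pages = {1, 2, 3, 4, 5}
postings = {
    "gold": {1, 2},
    "silver": {2, 3, 4},
    "macaroni": {1, 3, 5},
    "cheese": {3, 4},
}

def pages_without(term):
    """NOT term: every known page that does not mention the term."""
    return all_pages - postings.get(term, set())

# macaroni AND NOT cheese
macaroni_only = postings["macaroni"] & pages_without("cheese")
# NOT gold AND silver, read as (NOT gold) AND silver
silver_no_gold = pages_without("gold") & postings["silver"]

print(sorted(macaroni_only))   # [1, 5]
print(sorted(silver_no_gold))  # [3, 4]
```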

Adjacency

Many Web search engines let you search for a phrase: if you place the words in double quotes, they must appear immediately next to one another and in the order you typed them.

AltaVista examples:

"hot air balloons"

"american psychological association"

"gone with the wind"
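The difference between a phrase search and a plain AND can be sketched in a few lines of Python. The contains_phrase helper and the sample sentences are invented for illustration, and splitting on whitespace is a simplifying assumption about how words are recognized.

```python
def contains_phrase(text, phrase):
    """True if the words of phrase appear in text adjacent and in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    # Slide a window of n words across the text and compare each one.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

print(contains_phrase("Festival of hot air balloons", "hot air balloons"))  # True
print(contains_phrase("Balloons filled with hot air", "hot air balloons"))  # False
```

The second sentence contains all three words, so hot AND air AND balloons would match it, but the quoted phrase does not.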

 

AltaVista also allows the use of a proximity operator, NEAR. NEAR limits matches to pages in which the terms appear within ten words of each other, in either order.

AltaVista example: art NEAR renaissance

This would retrieve

Art and Architecture of the Italian Renaissance

and also

The Northern Renaissance and Art
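A proximity test of this kind might be sketched as follows. The near helper is hypothetical, and counting "within ten words" as a difference of at most ten word positions is an assumption made for this example.

```python
def near(text, first, second, window=10):
    """True if the two terms occur within `window` word positions of each other."""
    words = [w.strip(".,:;").lower() for w in text.split()]
    first_at = [i for i, w in enumerate(words) if w == first]
    second_at = [i for i, w in enumerate(words) if w == second]
    # Order does not matter, so compare absolute distances.
    return any(abs(i - j) <= window for i in first_at for j in second_at)

print(near("Art and Architecture of the Italian Renaissance", "art", "renaissance"))  # True
print(near("The Northern Renaissance and Art", "art", "renaissance"))  # True

# Twenty filler words separate the two terms here.
far_apart = "art " + "filler " * 20 + "renaissance"
print(near(far_apart, "art", "renaissance"))  # False
```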

Combining Operators

The Boolean and proximity operators perform different functions even when they are used to relate the same topics. In search strategy, an AND links ideas to create a compound subject. Java AND silicon valley will limit the results to only those Web pages which include both subjects. Use of the AND operator will yield a number of items less than or equal to the number in either of the two sets, while use of the OR operator will yield a number of items greater than or equal to the number in either of the sets.
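The size relationship can be checked with Python sets. The page IDs below are invented; only the inequalities matter.

```python
# Invented page IDs for two search terms.
java = {1, 2, 3, 4}
silicon_valley = {3, 4, 5}

both = java & silicon_valley    # AND: pages in both sets
either = java | silicon_valley  # OR: pages in at least one set

print(len(both), len(either))  # 2 5
```

Here AND yields 2 pages, no more than the smaller set (3), and OR yields 5, no fewer than the larger set (4).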

The OR operator is often used to broaden the search when an idea can be expressed by synonyms. For example, the search statement java OR coffee will cover the topic of coffee drinks better than a search for either term alone, because an article which uses the term coffee may not use the word java and vice versa.

Java AND silicon valley could result in hits about drinking caffeinated beverages in that part of the San Francisco Bay Area, or about the Sun Microsystems programming language. This type of linguistic problem can be avoided by using the NOT operator to restrict the search. java AND silicon valley NOT coffee will exclude most hits on caffeinated beverages.

The best searches combine operators. By understanding the way Boolean operators manipulate topics and by being able to picture compound subject headings, one can enter precise and logical search statements which, when applied to computer databases and the Internet, will retrieve the desired information. An accurately formulated request ensures an effective and thorough search of the electronically stored materials and guards against searches which are time consuming and yield few results.

A Note about NOT

NOT can be a very dangerous operator to use because it eliminates pages which include the chosen term. There are many items that include both the term wanted and the term not wanted and they would all be eliminated. For example, if one wanted information on lunar eclipses, the search strategy would be lunar AND eclipses. It is tempting to type eclipses NOT solar, but many pages that have information on solar eclipses also have information on lunar eclipses and one would miss many relevant hits.
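The lunar eclipse example can be made concrete with three invented page summaries. The pages dictionary and the has helper are hypothetical and exist only to show which pages each strategy keeps.

```python
# Invented page contents for illustration.
pages = {
    "a": "lunar eclipses explained",
    "b": "solar and lunar eclipses this year",
    "c": "solar eclipses safety guide",
}

def has(page, term):
    """True if the page text contains the term as a whole word."""
    return term in pages[page].split()

# eclipses NOT solar
with_not = sorted(p for p in pages if has(p, "eclipses") and not has(p, "solar"))
# lunar AND eclipses
with_and = sorted(p for p in pages if has(p, "lunar") and has(p, "eclipses"))

print(with_not)  # ['a']
print(with_and)  # ['a', 'b']
```

Page b covers both kinds of eclipse, so the NOT strategy throws it away while the AND strategy keeps it.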

Venn Diagrams

Venn diagrams, invented by English logician John Venn (1834-1923) to portray visually the algebra of Boolean logic and set theory, are used to illustrate the effects which the operators have on the topics. If one imagines a rectangle to contain all "Knowledge," then three circles inscribed within its outline will represent different portions of knowledge. The circles will naturally occur within the block. The circles overlap because units of information can contain many distinct values. If circle A represents "ice cream," it contains all the flavors of ice cream. If circle B represents "vanilla," it contains the universe of vanilla, including vanilla beans, vanilla cookies, vanilla soda, etc. Overlapping the two circles will create an area common to both. This area will denote "vanilla ice cream."

Venn diagrams can help novice searchers visualize their search strategy.

Steps to Formulating and Conceptualizing a Search

1. Identify concepts

When conducting any search, it is necessary to break down the topic into its component concepts. For example, if one wanted to find information on the budget negotiations between President Clinton and the Republicans, these are the concepts: clinton, republicans, budget.

2. List keywords for each concept and their synonyms

Once the concepts have been identified, one needs to list keywords which describe each concept. Some concepts may have only one keyword, while others may have many. For example:

Concept 1: clinton, democrats

Concept 2: gingrich, republicans

Concept 3: budget, budget negotiations, budget battle, budget impasse, budget deal

Depending on the focus of the search, other keywords may be more appropriate.

3. Specify the logical relationships among the keywords.

Venn diagrams may help. Once the keywords are known, it is necessary to establish the logical relationships among them using Boolean logic and any or all of the logical or proximity operators.

(clinton OR democrats) AND (gingrich OR republicans) AND budget

Notice that only the word budget is used in the first pass of the search. This is because one is never sure how much information is out there and it is good to be as broad as possible in the beginning while still being true to the search goal. Later on, if many records or pages are retrieved, one can narrow the search by adding terms like impasse or negotiations.

Some search engines offer Boolean searching without mentioning the logical operators by name. For example, one might be asked to list the search terms and then choose to search for ALL of these terms. This denotes AND logic. Choosing ANY of these terms denotes OR logic.
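The three steps above can be sketched as a small Python helper that joins the keywords of each concept with OR and then joins the concepts with AND. The build_query function is hypothetical, and it assumes the search engine accepts parentheses for grouping and double quotes for phrases; check the engine's help pages before relying on either.

```python
# Keywords per concept, taken from the budget example.
concepts = [
    ["clinton", "democrats"],
    ["gingrich", "republicans"],
    ["budget"],
]

def build_query(concepts):
    """OR the keywords within each concept, then AND the concepts together."""
    groups = []
    for keywords in concepts:
        # Quote multi-word keywords so they are treated as phrases.
        quoted = [f'"{k}"' if " " in k else k for k in keywords]
        group = " OR ".join(quoted)
        # Parenthesize only when a concept has more than one keyword.
        if len(quoted) > 1:
            group = f"({group})"
        groups.append(group)
    return " AND ".join(groups)

print(build_query(concepts))
# (clinton OR democrats) AND (gingrich OR republicans) AND budget
```

Grouping each concept keeps the OR terms together; without it, an engine that evaluates AND before OR would read the statement quite differently.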

Tips on Conducting Searches

1. Read the directions at each search site.

The technique for formulating a search depends on the search engine one is using. There is a wide variety of options available among the different search engines.

2. Know when to use the "advanced" mode.

If it is a multi-term search, use an advanced or Boolean search option if it is available. This will help specify the logical relationships among the terms.

3. Include synonyms or alternate spellings in the search statements and connect these terms with OR logic.

This is especially helpful if very little is retrieved by the original search.

4. Check spelling.

5. Take advantage of capitalization if the search engine is case sensitive.

6. If the results are not satisfactory, repeat the search using alternative terms.

7. If there are too many results, or results that are not relevant:

  • Add concept words
  • Use vocabulary that is more specific to the topic
  • Narrow the search to an individual field in a record or parts of the Web page such as Title, Summary, First Heading, etc.
  • Use the Boolean NOT to keep out records containing terms not wanted

8. If there are too few results:

  • If possible, drop the least important concept(s) to broaden the subject. For example, if looking for hives AND strawberries AND treatment and nothing is found, decide whether the source of the hives is more important, in which case one would choose hives AND strawberries, or whether the cure is more important, in which case one would choose hives AND treatment.
  • Use vocabulary that is more general, for example: allergy instead of hives.
  • Add alternate terms or spellings for individual concepts and connect with OR

9. Try different sources within search engines to diversify the results.

Sources can include Usenet newsgroups, Internet FAQs, reviewed pages, and more. In commercial databases, choose other databases.

10. Experiment with different search engines.

No two search engines work from the same database or have exactly the same way of going about a search.

11. Try Web sites which allow searching of multiple search engines simultaneously.

Be aware that one will lose access to advanced query options since not all engines offer them.
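The advice in tip 8 to drop the least important concept can be sketched as a simple loop. The pages, the search_all function, and the broaden helper below are all invented for illustration; a real search would query an engine rather than a dictionary.

```python
# Invented pages for illustration.
pages = {
    1: "hives after eating strawberries",
    2: "treatment options for hives",
}

def search_all(terms):
    """Return IDs of pages that contain every term (AND logic)."""
    return sorted(pid for pid, text in pages.items()
                  if all(t in text.split() for t in terms))

def broaden(terms_by_importance):
    """Drop the least important (last) term until the search finds something."""
    terms = list(terms_by_importance)
    while terms:
        hits = search_all(terms)
        if hits:
            return terms, hits
        terms.pop()
    return [], []

print(broaden(["hives", "strawberries", "treatment"]))
# (['hives', 'strawberries'], [1])
print(broaden(["hives", "treatment", "strawberries"]))
# (['hives', 'treatment'], [2])
```

The order of the list encodes the searcher's priorities: listing treatment last favors the source of the hives, and listing strawberries last favors the cure.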

If you still can't find what you are looking for, ask us: send email to studio@exploratorium.edu. Let us know where you have looked and what strategies you have used. We will not only try to find the information for you, but also give you feedback on your search approaches. That way we can learn together.

 

Readings

1-AltaVista Help for Advanced Queries

http://altavista.digital.com/av/content/help_advanced.htm

2-University of Southern California "Boolean logic"

http://www-lib.usc.edu/Info/ilsdoc/pac/boolean.html

Review Venn diagrams for AND, OR and NOT

3-Page, Adam, "The Search is Over: The search-engine secrets of the pros," PC/Computing, v. 9, #10, October 1996, pp. 143+

http://www.zdnet.com/pccomp/features/fea1096/sub2.html

 

Exploratorium Learning Studio
3601 Lyon Street
San Francisco, CA 94123
415-528-4343
studio@exploratorium.edu

© The Exploratorium

8/13/98