Keyword Search
What is Keyword Search?
- Keyword search is based on the occurrence of specific words in the page,
and is the most common form of text search on the Web.
- Unless the author of the Web document specifies the keywords for her
document (this is possible by using meta tags in later versions of
HTML), it's up to the search engine to determine them.
- Essentially, this means that search engines pull out and index words
that are believed to be significant.
- Words that are mentioned towards the top of a document and words that
are repeated several times throughout the document are more likely to
be deemed important.
- Some sites index every word on every page. Others index only part of
the document. For example:
- Lycos indexes the title, headings, subheadings and the hyperlinks to
other sites, along with the first 20 lines of text and the 100 words
that occur most often.
- Infoseek uses a full-text indexing system, picking up every word in
the text except commonly occurring stop words such as "a," "an,"
"the," "is," "and," "or," and "www."
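The ideas above (dropping stop words, favouring repeated words, and favouring words near the top of a document) can be sketched in a few lines of Python. This is only an illustrative sketch, not how Lycos or Infoseek actually worked: the function names, the `top_fraction` and `early_bonus` parameters, and the scoring rule are all assumptions made for the example. The stop-word list is just the one quoted above.

```python
import re
from collections import Counter

# Stop words taken from the examples in the text; a real list would be longer.
STOP_WORDS = {"a", "an", "the", "is", "and", "or", "www"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_keywords(text, top_fraction=0.2, early_bonus=2.0):
    """Score each non-stop word by how often it occurs.

    Occurrences within the first `top_fraction` of the tokens count
    `early_bonus` instead of 1.0. Both parameters are hypothetical
    tuning knobs, not values used by any real search engine.
    """
    tokens = tokenize(text)
    cutoff = max(1, int(len(tokens) * top_fraction))
    scores = Counter()
    for position, word in enumerate(tokens):
        if word in STOP_WORDS:
            continue
        scores[word] += early_bonus if position < cutoff else 1.0
    return scores


def build_index(documents):
    """Build an inverted index mapping word -> {doc_id: score}."""
    index = {}
    for doc_id, text in documents.items():
        for word, score in score_keywords(text).items():
            index.setdefault(word, {})[doc_id] = score
    return index


docs = {
    "d1": "The river bank is muddy and the river is wide",
    "d2": "A bank robbery",
}
index = build_index(docs)
print(index["river"])
print(index["bank"])
```

Looking up a query word in `index` returns the documents that contain it, with a higher score for documents where the word is repeated or appears early. Stop words such as "the" never enter the index at all.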
Problems With Keyword Searching
- Cannot distinguish different meanings of identically spelt words (e.g.
river bank; bank robbery).
- Cannot make the jump between a word typed in the query and
a relevant but different word in a document (e.g.
McDonald's; hamburger).
- May not recognise different forms of the same word (cook; cookery; cooks).
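All three problems show up with even the simplest exact-match search. The sketch below is a toy illustration: the `exact_match` function and the four sample documents are invented for the example and do not come from any real search engine.

```python
def exact_match(query, documents):
    """Return the ids of documents that contain the query as an exact word."""
    word = query.lower()
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if word in text.lower().split()
    )


# Hypothetical sample documents, one short line of text each.
documents = {
    "geo": "the river bank was flooded",
    "crime": "a bank robbery downtown",
    "food": "cheap hamburger restaurant",
    "recipes": "she cooks dinner",
}

# Same spelling, different meanings: both documents are returned.
print(exact_match("bank", documents))

# Related but different word: the hamburger page is not found.
print(exact_match("Macdonalds", documents))

# Different form of the same word: "cooks" does not match "cook".
print(exact_match("cook", documents))
```

The first query cannot separate the two senses of "bank"; the other two return nothing even though relevant documents exist.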
[Fri Feb 11 14:29:29 2000]