Information retrieval through search engines is a well-established field, but
building a good search system is challenging and expensive. When search becomes
a necessity, some sites or intranets incorporate search systems from sites that
allow you to search the entire web.
There are three different ways of searching the web:
· A search within your site or its sub-sites, e.g. a search within www.dice.com
and its sub-sites
· Metasearch, which involves searching across multiple sites, e.g.
www.clusty.com and www.dogpile.com
· A search across the entire web, using a general web search engine
The website http://searchenginewatch.com/ is a great resource for the latest
information on web searching. The information architect (IA) has to decide
whether their site needs to be searchable. They should be careful not to assume
that a search engine alone will satisfy all users’ information needs. Some
users forgo the search utility and prefer to browse the site to get a feel for
it. Before adding search functionality to their site, the IA should carefully
answer the following questions:
· Is there sufficient content in your site?
· Does the company have sufficient resources to invest in this effort? Is the
investment going to divert resources from more useful navigation systems?
· Are the time and the technical know-how available to invest in optimizing
your search system?
· Are there better alternatives to search?
· Will your site’s users actually bother to use its search system?
Planning the capacity of your site or intranet can sometimes be very tricky,
and it can determine whether to include a search system or not. When sites
become very popular, they grow organically and more and more features get piled
on haphazardly, leading to a navigation nightmare. Certain issues can help the
IA decide whether their site has reached the point of needing a search system:
· Your site has too much information to browse
· If the site has become fragmented, it can definitely use some help from a
search system
· Search can become a learning tool that helps improve the site through the
analysis of the search logs
· Search has become a user expectation; most users expect to find a search box
on every website they visit
· If your site has highly dynamic content, you should definitely include a
search system in it
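As a rough illustration of how search logs can feed back into site design, the
sketch below counts the most frequent queries and the queries that returned
nothing. The log format (query text plus a result count) and the sample entries
are assumptions made up for this example, not something a particular search
product is known to emit.

```python
from collections import Counter

# Hypothetical search log: (query, number_of_results) pairs.
search_log = [
    ("vacation policy", 12),
    ("Vacation Policy", 12),
    ("expense form", 0),
    ("org chart", 3),
    ("expense form", 0),
    ("vacation policy", 12),
]

def summarize_log(log, top_n=3):
    """Return the most common queries and the most common zero-result queries."""
    # Normalize case and whitespace so trivially different spellings are merged.
    normalized = [(query.strip().lower(), hits) for query, hits in log]
    popular = Counter(query for query, _ in normalized)
    no_results = Counter(query for query, hits in normalized if hits == 0)
    return popular.most_common(top_n), no_results.most_common(top_n)

top_queries, failed_queries = summarize_log(search_log)
print("Most frequent queries:", top_queries)
print("Queries with no results:", failed_queries)
```

Frequent queries hint at content that deserves a more prominent place in the
navigation, while repeated zero-result queries point to missing content or to
vocabulary the site does not use.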
The IA should base the decision to include search on the end users of the site,
and so they should know their site’s users. How well the IA knows those users
greatly influences whether search functionality belongs on the intranet or
website. The decision should be made with the users in mind rather than the
available technology: the search system interfaces directly with the site’s
users, so the user should be king in this decision.
A search system usually works as a three-part configuration. At the center is
the search engine, which holds the indexes built from indexed documents and
processes the queries that searchers submit through the search interface. The
engine returns the matching index entries as results for each query. Documents,
which usually include web pages and websites, serve as the input to the search
system. Indexing can be manual or automatic. Traditional manual systems for
compiling document indexes use cards, such as library catalogue cards, although
nowadays a good computerized personal reference system is preferable. For each
document acquired, the bibliographic identification elements are written or
typed on a card. For a journal article, the structure is: author's surname and
forenames; article title; periodical title; volume number; part number; date of
publication; pages. Keywords or descriptors of the contents should be recorded
as well. Alternatively, a short abstract or summary can be included (you can
often use the abstract written by the author). A standardized reference format
is recommended. In automatic indexing, spiders and robots crawl websites and
index pages according to their own rules, building large databases that contain
the indexes.
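To make the relationship between documents, index, and queries more concrete,
here is a minimal sketch of an inverted index, one common way an engine can map
words to the documents that contain them. The sample pages and the function
names are invented for illustration, and real engines add ranking, stop words,
and much more.

```python
import re
from collections import defaultdict

# Hypothetical documents standing in for crawled web pages.
documents = {
    "page1": "Search engines index documents.",
    "page2": "Users submit queries to the search engine.",
    "page3": "Browsing is an alternative to search.",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)

index = build_index(documents)
print(search(index, "search engine"))
```

The query is answered from the index alone; the documents themselves are only
read once, at indexing time. Note that "engines" and "engine" are treated as
different words here because this sketch does no stemming.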
Determining what to search can also be tricky. During search system design, the
IA decides whether to search and index the entire site or only specific pages,
documents, or zones within it. Sometimes it becomes necessary to define search
zones so that a search does not have to cover the entire site or intranet. It
might also be necessary to create a mini search site within the website itself,
scoped either to a sub-site or to a document type. Some sites might need to
incorporate web search as well, which involves searching through multimedia and
heterogeneous sites with diverse content. Search can cover the full text of the
information being requested or just the metadata about it. The IA also has to
decide what the search engine indexes for each document: all content words, or
only important words such as those found in the metadata fields. Indexes can
also be built for specific audiences, or by topic, recency, reading level, date
of update, user task, etc.
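The two choices above, limiting a search to a zone and searching metadata
instead of full text, can be sketched as follows. The `Page` fields, the zone
labels, and the sample pages are all hypothetical and chosen only to show how
the scope of a search changes its results.

```python
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    zone: str   # hypothetical zone label, e.g. "hr" or "products"
    title: str  # stands in for a metadata field
    body: str   # stands in for the full text

pages = [
    Page("/hr/leave", "hr", "Leave policy", "How to request vacation time."),
    Page("/hr/pay", "hr", "Payroll", "Pay dates and the leave balance report."),
    Page("/products/x1", "products", "X1 overview", "Customers rarely leave reviews."),
]

def search(pages, term, zone=None, metadata_only=False):
    """Return URLs of pages matching term, optionally limited to one zone."""
    term = term.lower()
    hits = []
    for page in pages:
        if zone is not None and page.zone != zone:
            continue  # the page lies outside the requested search zone
        searchable = page.title if metadata_only else f"{page.title} {page.body}"
        if term in searchable.lower():
            hits.append(page.url)
    return hits

print(search(pages, "leave"))                      # whole site, full text
print(search(pages, "leave", zone="hr"))           # one zone, full text
print(search(pages, "leave", metadata_only=True))  # whole site, titles only
```

Narrowing the zone or searching only metadata returns fewer results, which is
usually the point: fewer irrelevant hits at the cost of possibly missing some
relevant pages.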
Search algorithms find items with specified properties among a collection of
items. The items may be stored individually as records in a database; or may be
elements of a search space defined by a mathematical formula or procedure, such
as the roots of an equation with integer variables; or a combination of the
two, such as the Hamiltonian circuits of a graph. There are about 40 different
retrieval algorithms, which retrieve information in different ways. Most of
them employ pattern matching, and their results are judged by recall (the share
of relevant documents that were retrieved) and precision (the share of
retrieved documents that are relevant).
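Recall and precision are simple ratios, so a small worked example may help. The
document ids below are made up; the function only assumes that you know which
documents were retrieved and which are actually relevant.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: four documents came back, three were actually relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d5"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four retrieved documents are relevant (precision 0.50) and two
of the three relevant documents were found (recall about 0.67). Tuning an
algorithm to raise one of the two figures typically lowers the other.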
Query builders affect the outcome
of a search by souping up a query’s performance. They are usually invisible to
users and common examples include:
· Spell checkers
· Phonetic tools (the best-known of which is “Soundex”)
· Stemming tools that allow users to enter a term and retrieve documents
containing variants with the same stem
· Natural language processing tools
· Controlled vocabularies and thesauri
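To show what a phonetic query builder does behind the scenes, here is a sketch
of Soundex following the commonly described American Soundex rules: keep the
first letter, encode the following consonants as digits, skip vowels, and pad
or cut the code to four characters. It is an illustration of the idea rather
than the implementation used by any particular search engine.

```python
# Consonant groups that share a Soundex digit.
GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}

def soundex(name):
    """Return the four-character Soundex code for a name."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    encoded = first.upper()
    previous = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = CODES.get(ch, "")  # vowels (and y) have no code
        if code and code != previous:
            encoded += code
        previous = code  # a vowel resets this, so a repeated code counts again
    return (encoded + "000")[:4]

for name in ["Robert", "Rupert", "Smith", "Smyth"]:
    print(name, soundex(name))
```

Names that sound alike end up with the same code, so a query for "Smyth" can be
expanded to match documents that mention "Smith" without the user noticing.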
The IA will also need to decide beforehand how the search engine’s results are
to be presented. Here, there are two main issues to consider:
· Which content components to display for each retrieved document: display less
information to users who know what they’re looking for, and more information to
users who aren’t sure what they want. This also covers how many results to show
and how much information to show for each item.
· How to list or group the search results: by category, alphabetically,
chronologically, ranked by relevance, ranked by popularity, by users’ or
experts’ ratings, or by pay-for-placement (different sites bid for the right to
be ranked high, or higher, on users’ result lists).
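A few of the listing and grouping options above can be sketched with ordinary
sorting. The result records, their field names, and the relevance scores are
invented for this example; a real engine would supply its own.

```python
from datetime import date

# Hypothetical search results with made-up scores and dates.
results = [
    {"title": "Zoning rules", "score": 0.62, "updated": date(2009, 3, 1), "category": "Policy"},
    {"title": "Annual report", "score": 0.91, "updated": date(2008, 11, 15), "category": "Finance"},
    {"title": "Budget summary", "score": 0.75, "updated": date(2009, 6, 30), "category": "Finance"},
]

def sort_results(results, order="relevance"):
    """Return a new list of results in the requested order."""
    if order == "alphabetical":
        return sorted(results, key=lambda r: r["title"].lower())
    if order == "chronological":
        return sorted(results, key=lambda r: r["updated"], reverse=True)  # newest first
    if order == "relevance":
        return sorted(results, key=lambda r: r["score"], reverse=True)  # best match first
    raise ValueError(f"unknown order: {order}")

def group_by_category(results):
    """Group result titles by category, most relevant first within each group."""
    groups = {}
    for result in sort_results(results, "relevance"):
        groups.setdefault(result["category"], []).append(result["title"])
    return groups

for order in ("alphabetical", "chronological", "relevance"):
    print(order, [r["title"] for r in sort_results(results, order)])
print(group_by_category(results))
```

The same result set reads very differently depending on the order chosen, which
is why the choice should follow from what the site’s users are trying to do.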
Designing the search interface means putting together what to search, what to
retrieve, and how to present the results in a single interface. Because user
communities and search-technology functions vary so widely, there are also many
different types of search interfaces. Designing the search interface involves
considering the following variables:
· Level of searching expertise and motivation
· Type of information need
· Type of information being searched
· Amount of information being searched