Search Engine Technology

Conventional search engine technology falls into two main categories: crawler-based search engines and human-powered directories (Sullivan, 2001). Some search engines combine both techniques when presenting results for a user’s query. In such hybrid search engines, a crawler-based engine augments its results with human-generated lists, while a human-generated directory may augment its results with crawler-based results when it contains few or no entries for a particular query (Marendy, 2001).
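The hybrid fallback described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy data, and the threshold `MIN_DIRECTORY_RESULTS` are hypothetical and do not come from any real search engine.

```python
from typing import Callable, List

# Hypothetical threshold: with fewer directory hits than this, consult the crawler index.
MIN_DIRECTORY_RESULTS = 3


def hybrid_search(
    query: str,
    directory_search: Callable[[str], List[str]],
    crawler_search: Callable[[str], List[str]],
    min_results: int = MIN_DIRECTORY_RESULTS,
) -> List[str]:
    """Return directory results, topped up with crawler results when they are sparse."""
    results = list(directory_search(query))
    if len(results) < min_results:
        for url in crawler_search(query):
            if url not in results:  # skip pages the directory already listed
                results.append(url)
    return results


# Toy stand-ins for the two back ends (assumed data, for illustration only).
DIRECTORY = {"python": ["https://example.org/python-guide"]}
CRAWLED = {
    "python": [
        "https://example.org/python-guide",
        "https://example.com/python-tips",
    ]
}


def toy_directory_search(query: str) -> List[str]:
    return DIRECTORY.get(query, [])


def toy_crawler_search(query: str) -> List[str]:
    return CRAWLED.get(query, [])


hits = hybrid_search("python", toy_directory_search, toy_crawler_search)
```

Here the directory knows only one page for "python", so the crawler results are appended after it, with the duplicate removed.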

Many crawler-based search sites have added human-powered, directory-based topic browsing. The Web is organized as a tree of topics, similar to the Dewey Decimal System, in which the tree nodes are maintained by paid ontologists or special volunteers (Chakrabarti, 1999).

Besides the conventional technologies mentioned above, there is another approach in which the search engine contains no internal database. It merely provides Internet users with a common interface for searching a variety of search engines. After the user enters a query in the search field, it returns results gathered from multiple search engines.

1. Crawler-based search engines

Crawler-based search engines, such as Google, create their listings automatically. They are built upon three basic design elements: the crawler (or spider), the index, and the search engine software. In Google, web crawling (the downloading of web pages) is done by several distributed crawlers. A URLserver sends lists of URLs to be fetched to the crawlers. The fetched web pages are then sent to the storeserver, which compresses them and stores them in a repository. A crawler-based search engine crawls the web periodically in order to maintain an up-to-date index (Thelwall, 2001).
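The flow from URLserver to crawler to storeserver can be illustrated with a toy, single-process sketch. It is not Google's implementation: the class names mirror the components named above, but the in-memory `TOY_WEB`, the single crawl loop (instead of several distributed crawlers), and the way discovered links are fed straight back to the URL server are simplifying assumptions.

```python
import zlib
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://a.example/": ("page a about search", ["http://b.example/", "http://c.example/"]),
    "http://b.example/": ("page b about crawlers", ["http://a.example/"]),
    "http://c.example/": ("page c about indexes", []),
}


class URLServer:
    """Hands out URLs to fetch and remembers which ones were already queued."""

    def __init__(self, seeds):
        self.queue = deque(seeds)
        self.seen = set(seeds)

    def next_url(self):
        return self.queue.popleft() if self.queue else None

    def add(self, urls):
        for url in urls:
            if url not in self.seen:
                self.seen.add(url)
                self.queue.append(url)


class StoreServer:
    """Compresses fetched pages and keeps them in a repository."""

    def __init__(self):
        self.repository = {}

    def store(self, url, text):
        self.repository[url] = zlib.compress(text.encode("utf-8"))

    def load(self, url):
        return zlib.decompress(self.repository[url]).decode("utf-8")


def crawl(seeds, web=TOY_WEB):
    """Fetch every page reachable from the seeds and store it compressed."""
    url_server = URLServer(seeds)
    store_server = StoreServer()
    url = url_server.next_url()
    while url is not None:
        page = web.get(url)  # "download" the page from the toy web
        if page is not None:
            text, links = page
            store_server.store(url, text)
            url_server.add(links)  # simplification: links go straight back to the queue
        url = url_server.next_url()
    return store_server


store = crawl(["http://a.example/"])
```

Starting from a single seed, the loop reaches all three toy pages, and `store.load(url)` returns the original text of any stored page.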

2. Human-powered directories

Human-powered directories are also known as Web directories or hierarchical directories (Marendy, 2001). They organize pages into a hierarchical taxonomy based on categories, so that queries can be directed to a particular category to improve the relevance and quality of search results (Huang, 2001).
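As a rough sketch of how a category tree narrows a query, the following models the taxonomy as nested `Category` nodes. The class, its methods, and the sample entries are hypothetical, and matching is a plain substring test rather than anything a real directory would use.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Category:
    """One node of the topic tree; entries are short site descriptions."""

    name: str
    sites: List[str] = field(default_factory=list)
    children: Dict[str, "Category"] = field(default_factory=dict)

    def add(self, path: List[str], site: str) -> None:
        """File a site under the category given by path, creating nodes as needed."""
        node = self
        for part in path:
            node = node.children.setdefault(part, Category(part))
        node.sites.append(site)

    def find(self, path: List[str]) -> "Category":
        """Walk down the tree to the category named by path."""
        node = self
        for part in path:
            node = node.children[part]
        return node

    def search(self, term: str) -> List[str]:
        """Return matching entries in this category and all of its subcategories."""
        hits = [s for s in self.sites if term.lower() in s.lower()]
        for child in self.children.values():
            hits.extend(child.search(term))
        return hits


root = Category("Top")
root.add(["Computers", "Programming"], "Python tutorial")
root.add(["Science", "Biology"], "Python snakes of Asia")

everywhere = root.search("python")                      # both entries match
in_computers = root.find(["Computers"]).search("python")  # only the programming entry
```

Searching from the root returns both entries, while directing the same query to the Computers category filters out the unrelated biology page.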

Such directories are compiled through a variety of means. A site owner can submit a site directly for review and can include a short text description to be displayed to the user in the search results. If no description is supplied, the reviewer adds one if the page is to be included in the directory (Sullivan, 2001).

Human-powered directories are also compiled by reviewers who each maintain a category they are familiar with (Huang, 2001). The manually compiled lists can be interpreted as representing a judgement of “authority” for the pages that reviewers have assessed (Kleinberg, 2001). According to Brin and Page (1998), human-maintained directories cover popular topics effectively but are subjective, expensive to build and maintain, slow to improve, and cannot cover all esoteric topics.

3. Meta search engine (Eagan & Bender, 1996)

A meta search engine is a search engine that provides a common interface for searching a variety of search engines. You enter your search on the query line, and it sends your query to multiple search engines.

A meta search engine has no internal database. It simply acts as a front end for different search engines: it sends a user’s query to the search engines, then puts their results into a uniform format for display. The results display shows the title of the document, selected text or an abstract (depending on the search engine), the relevancy ranking, the URL, and the search engine from which the information came.
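The step of putting heterogeneous results into a uniform format might look like the sketch below. Everything here is an assumption made for illustration: `engine_a` and `engine_b` are made-up back ends returning canned data in two different shapes, and sorting the merged list by relevance is one possible display choice, not something the sources prescribe.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    """Uniform record holding the fields listed above."""

    title: str
    snippet: str
    relevance: float  # normalized to the range 0..1
    url: str
    engine: str


def engine_a(query: str) -> List[dict]:
    # Hypothetical engine returning dicts with a 0..100 score.
    return [{"title": "Web crawlers", "abstract": "How spiders work",
             "score": 80, "link": "http://a.example/crawlers"}]


def normalize_a(raw: dict, engine: str) -> Result:
    return Result(raw["title"], raw["abstract"], raw["score"] / 100, raw["link"], engine)


def engine_b(query: str) -> List[tuple]:
    # Hypothetical engine returning (url, title, text, rank) tuples with a 0..1 rank.
    return [("http://b.example/spiders", "Spiders", "Intro to spiders", 0.9)]


def normalize_b(raw: tuple, engine: str) -> Result:
    url, title, text, rank = raw
    return Result(title, text, rank, url, engine)


# Each back end is paired with the function that converts its raw output.
ENGINES = {
    "engine_a": (engine_a, normalize_a),
    "engine_b": (engine_b, normalize_b),
}


def meta_search(query: str, engines=ENGINES) -> List[Result]:
    """Send the query to every engine and merge the results into one list."""
    results = []
    for name, (search, normalize) in engines.items():
        for raw in search(query):
            results.append(normalize(raw, name))
    # Assumed display order: highest normalized relevance first.
    results.sort(key=lambda r: r.relevance, reverse=True)
    return results


merged = meta_search("spiders")
```

Keeping one small normalizer per engine means a new back end can be added without touching the merging logic.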

The advantages of a meta search engine are that it provides a single interface to a number of popular search engines, allows you to use some fairly sophisticated search options, and checks document URLs to make sure each link is valid.


References:

Brin, S. & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. In Proc. of WWW7.

Chakrabarti, S. (1999). Recent results in automatic Web resource discovery. Indian Institute of Technology Bombay, Department of Computer Science and Engineering. Vol. 31, Issue 4es. ACM Press, New York, USA.

Huang, L. (2001). A Survey On Web Information Retrieval Technologies.
Available at: http://citeseer.nj.nec.com/336617.html

Kleinberg, J. (2001). Authoritative sources in a Hyperlink Environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998, and February 2001.
Available at: http://www.cs.cornell.edu/home/kleinber/auth.ps

Marendy, P. (2001). A Review of World Wide Web searching techniques, focusing on HITS and related algorithms that utilise the link topology of the World Wide Web to provide the basis for a structure based search technology.
Available at: http://net.pku.edu.cn/~webg/papers/a-review-of-world.pdf

Sullivan, D. (2001). How Search Engines Work.
Available at: http://www.searchenginewatch.com/webmaster/work.html

Thelwall, M. (2001). The responsiveness of search engine indexes. Cybermetrics, 5(1).
Available at: http://www.cindoc.csic.es/cybermetrics/articles/v5i1p1.html

Eagan, A. & Bender, L. (1996). Spiders and Worms and Crawlers, Oh My: Searching on the World Wide Web.
Available at: http://www.library.ucsb.edu/untangle/eagan.html
