
Building the index

Posted by seonotes on April 19th 2006


Once a spider has found information on web pages, the search engine must store that information in a way that makes it useful.

The accumulated data is made accessible to users through two key components:

a) The information stored with the data.
b) The method by which the information is indexed.

In the simplest case, a search engine could store just the word and the URL where it was found. That would make the engine of limited use, because the entry says nothing about how the word is used on the page. To produce better results, an efficient search engine stores more than just the word and the URL.

A search engine might store the number of times that the word appears on a page. The engine might also assign a weight to each entry, with increasing values given to words that appear near the top of the document, in sub-headings, in links, in the meta tags or in the title of the page.
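The idea of storing a count and a weight with each word can be sketched as follows. This is a minimal illustration, not any real engine's method: the `LOCATION_WEIGHTS` values, the `index_page` helper and the example URL are all assumptions, and the "near the top of the document" factor is left out for brevity.

```python
from collections import defaultdict

# Hypothetical weights: illustrative values only, not figures
# used by any real search engine.
LOCATION_WEIGHTS = {"title": 5, "meta": 4, "link": 3, "subheading": 3, "body": 1}


def index_page(url, fields, index):
    """Record every word of a page with an occurrence count and a weight.

    `fields` maps a location name (for example "title") to the text found there.
    """
    for location, text in fields.items():
        weight = LOCATION_WEIGHTS.get(location, 1)
        for word in text.lower().split():
            entry = index[word].setdefault(url, {"count": 0, "weight": 0})
            entry["count"] += 1
            entry["weight"] += weight


index = defaultdict(dict)
index_page(
    "http://example.com/tea",
    {"title": "Green tea guide", "body": "How to brew green tea at home"},
    index,
)
print(index["tea"])  # {'http://example.com/tea': {'count': 2, 'weight': 6}}
```

The word "tea" appears once in the title and once in the body, so it ends up with a higher weight than a word such as "brew" that appears only in the body.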

Each search engine follows a different formula for assigning weight to the words in its index. This is one reason why a search for the same word on different search engines produces different lists.

Regardless of the specific combination of additional pieces of information a search engine stores, the data is encoded to save storage space.

For instance, the Google paper describes using 2 bytes, of 8 bits each, to store weighting information, such as whether the word was capitalized, to help in ranking the hit.

Each factor may take up 2 or 3 bits within the 2-byte grouping (8 bits = 1 byte).

This allows a great deal of information to be stored in a compact form. Once compacted, the information is ready for indexing.
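A rough sketch of packing several factors into a 2-byte grouping is shown below. The field layout (1 bit for capitalization, 3 bits for font size, 12 bits for word position) and the helper names `pack_hit` and `unpack_hit` are assumptions chosen for illustration; the text above does not specify the exact widths.

```python
import struct

# Assumed layout for illustration (16 bits in total):
#   bit 15      capitalized flag (1 bit)
#   bits 12-14  relative font size (3 bits, values 0-7)
#   bits 0-11   word position in the document (12 bits, values 0-4095)


def pack_hit(capitalized, font_size, position):
    """Pack three factors into exactly 2 bytes."""
    if not 0 <= font_size < 8:
        raise ValueError("font_size must fit in 3 bits")
    if not 0 <= position < 4096:
        raise ValueError("position must fit in 12 bits")
    value = (int(capitalized) << 15) | (font_size << 12) | position
    return struct.pack(">H", value)  # big-endian unsigned 16-bit integer


def unpack_hit(data):
    """Recover the three factors from a 2-byte value."""
    (value,) = struct.unpack(">H", data)
    return bool(value >> 15), (value >> 12) & 0b111, value & 0xFFF


packed = pack_hit(True, 3, 42)
print(len(packed), unpack_hit(packed))  # 2 (True, 3, 42)
```

Storing the same three values as separate Python objects would take far more space; the bit-level encoding is what keeps each entry down to 2 bytes.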

An index has a single purpose: to allow information to be found as quickly as possible.

One of the most effective ways to build an index is to build a hash table.

In this process, called hashing, a formula is applied to attach a numerical value to each word. The formula is designed to distribute the entries evenly across a predetermined number of divisions. This numerical distribution differs from the distribution of words across the alphabet, and that difference is the key to a hash table's effectiveness.

In English, some letters begin many words while others begin only a few. For instance, far more words begin with the letter “A” than with the letter “X”, so looking up a word under a popular letter could take longer than looking up one under a rare letter.

Hashing evens out this difference and reduces the average time it takes to find an entry, whether the word typed begins with “A” or “X”. It also separates the index from the actual entry.
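The contrast between grouping words by first letter and grouping them by hashed value can be seen in a small sketch. The word list, the choice of `zlib.crc32` as the formula and the value of `NUM_BUCKETS` are all assumptions made for illustration; a real engine would use its own formula and far more divisions.

```python
import zlib
from collections import Counter

NUM_BUCKETS = 4  # assumed number of divisions


def bucket_for(word):
    """Attach a numerical value to a word and map it to one of the divisions."""
    return zlib.crc32(word.encode("utf-8")) % NUM_BUCKETS


words = [
    "about", "above", "across", "after", "again", "against", "almost",
    "alone", "along", "always", "among", "around", "xenon", "xylophone",
]

# Grouping by first letter mirrors the uneven spread across the alphabet.
by_letter = Counter(word[0] for word in words)

# Grouping by hashed value ignores the spelling and depends only on the formula.
by_hash = Counter(bucket_for(word) for word in words)

print("by first letter:", dict(by_letter))
print("by hash bucket: ", dict(sorted(by_hash.items())))
```

With only fourteen words the hashed buckets will not be perfectly even, but they no longer follow the alphabet: the twelve "a" words are not forced into a single group.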

The hash table contains the hashed number along with a pointer to the actual data, which can be sorted in whichever way allows it to be stored most efficiently.

The combination of efficient indexing and effective storage makes it possible to return search results quickly.
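A minimal sketch of such a table, keeping the index separate from the stored data, might look like this. The `records` list, the `hash_word` and `lookup` helpers, the example URLs and the use of `zlib.crc32` are assumptions for illustration. The sketch also stores the word next to each pointer so that two words landing in the same division can be told apart, a detail the text above does not cover.

```python
import zlib

NUM_BUCKETS = 8  # assumed number of divisions

# The "actual data": stored separately, in whatever order suits storage.
records = [
    ("apple", ["http://example.com/fruit"]),
    ("xylophone", ["http://example.com/music"]),
    ("index", ["http://example.com/search", "http://example.com/books"]),
]


def hash_word(word):
    """Turn a word into the number of the division it belongs to."""
    return zlib.crc32(word.encode("utf-8")) % NUM_BUCKETS


# Each division holds (word, pointer) pairs; the pointer is a position in `records`.
buckets = [[] for _ in range(NUM_BUCKETS)]
for pointer, (word, _urls) in enumerate(records):
    buckets[hash_word(word)].append((word, pointer))


def lookup(word):
    """Find a word's URLs by scanning only the one division it hashes to."""
    for stored_word, pointer in buckets[hash_word(word)]:
        if stored_word == word:
            return records[pointer][1]
    return []


print(lookup("xylophone"))  # ['http://example.com/music']
```

Because `lookup` follows a pointer into `records`, the records themselves can be reordered or compacted without changing how the table is searched, as long as the pointers are updated.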


©2006-2010 SEO Notes