Research has shown that even the largest and most efficient search engines cover only a portion of the publicly available web. A widely cited 1999 study found that no search engine indexed more than 16% of the web. Because a crawler downloads only a fraction of all web pages, it is essential that the downloaded fraction contain the most relevant pages.

This requires a way of prioritizing web pages. The importance of a page is a function of its intrinsic quality, its popularity in terms of links, and even its URL.

Several researchers have studied scheduling policies for web crawling in detail. Their studies are summarized below:

a) Cho et al., 1998: conducted the first study on policies for crawl scheduling. The study concluded that a partial PageRank strategy works best when the crawler wants to download pages with high PageRank early. These results hold for a single domain only.
b) Najork and Wiener, 2001: performed a crawl of 328 million pages using breadth-first ordering and found that a breadth-first crawl captures pages with high PageRank early. Their explanation is that important pages have many links pointing to them from numerous hosts, so those links are found early.
c) Abiteboul et al., 2003: devised a crawling strategy based on an algorithm called OPIC (On-line Page Importance Computation).
d) Boldi et al., 2004: tested breadth-first ordering against random ordering and an omniscient strategy, using simulation on subsets of the web of 40 million pages from the .it domain and 100 million pages.
e) Baeza-Yates et al.: tested several crawling strategies using simulation on two subsets of the web of 3 million pages, from the .cl domain and the .gr domain.
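
The breadth-first ordering used in several of these studies can be sketched on a toy link graph. This is only a minimal illustration, not the method of any of the studies above: the graph, the page names, and the function `breadth_first_order` are assumptions made for the example. It shows why a page linked from several other pages ("hub") tends to be reached early.

```python
from collections import deque


def breadth_first_order(graph, seed):
    """Return pages in the order a breadth-first crawler would download them.

    `graph` maps a page name to the list of pages it links to
    (a hypothetical in-memory stand-in for fetching and parsing pages).
    """
    seen = {seed}
    frontier = deque([seed])
    order = []
    while frontier:
        page = frontier.popleft()          # oldest discovered page first
        order.append(page)
        for link in graph.get(page, []):
            if link not in seen:           # schedule each page only once
                seen.add(link)
                frontier.append(link)
    return order


# Toy graph with made-up page names: "hub" is linked from a, b and c.
toy_graph = {
    "seed": ["a", "b", "c"],
    "a": ["hub", "a1"],
    "b": ["hub", "b1"],
    "c": ["hub"],
    "a1": ["deep"],
}

crawl_order = breadth_first_order(toy_graph, "seed")
print(crawl_order)
```

In this toy graph the well-linked page is downloaded before the pages that have only one inbound link at the same or a greater depth.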

Path ascending crawling:

Web crawlers aim to collect as many resources as possible from a website by downloading them. In 2004, Cothey introduced the path-ascending crawler, which ascends to every path in each URL it intends to crawl. Given a URL ending in /hamster/monkey/page.html, for example, it also tries /hamster/monkey/, /hamster/, and /.

This crawler proved very effective at finding isolated resources, including resources that a regular crawl would never reach because no inbound link points to them.
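
The path-ascending step can be sketched as follows. This is a minimal illustration assuming the crawler simply strips one path segment at a time; the function name `ascending_paths` and the host `example.com` are made up for the example and are not taken from Cothey's work.

```python
from urllib.parse import urlsplit, urlunsplit


def ascending_paths(url):
    """List directory URLs from the deepest directory of `url` up to the root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Drop the final segment (the resource itself) unless the URL
    # already names a directory, i.e. its path ends with "/".
    if segments and not parts.path.endswith("/"):
        segments = segments[:-1]
    urls = []
    for depth in range(len(segments), -1, -1):
        path = "/" + "/".join(segments[:depth])
        if not path.endswith("/"):
            path += "/"
        # Query string and fragment are discarded on purpose.
        urls.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return urls


for candidate in ascending_paths("http://example.com/hamster/monkey/page.html"):
    print(candidate)
```

Each returned URL would be added to the crawl frontier alongside the original one, which is how resources with no inbound links can still be discovered.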

Focused crawling: In focused crawling, the importance of a page to the crawler is expressed as a function of the page's similarity to a given query.
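
One simple way to make "similarity to a query" concrete is cosine similarity over word counts. The sketch below is an assumption chosen for illustration; focused crawlers may use other similarity measures, and the names `term_counts` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Count lowercase alphanumeric tokens in `text`."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(page_text, query):
    """Cosine similarity between the word-count vectors of a page and a query."""
    page, q = term_counts(page_text), term_counts(query)
    dot = sum(page[term] * q[term] for term in q)
    norm = math.sqrt(sum(c * c for c in page.values())) * math.sqrt(
        sum(c * c for c in q.values())
    )
    return dot / norm if norm else 0.0


query = "web crawler scheduling"
relevant = cosine_similarity("A web crawler needs a scheduling policy", query)
unrelated = cosine_similarity("Recipes for baking bread at home", query)
print(relevant, unrelated)
```

A focused crawler could use such a score to rank the pages in its frontier, downloading the highest-scoring candidates first.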

Deep web crawling: Many pages cannot be reached by regular crawlers because no links point to them. These pages belong to the deep web and can be accessed only by submitting queries to a database.
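
Submitting queries in place of following links can be sketched as generating one request URL per candidate keyword. Everything specific here is hypothetical: the search endpoint `http://example.com/search`, the form field `q`, and the helper `build_query_urls` are assumptions for illustration, and a real deep web crawler would first have to discover the form and its fields.

```python
from urllib.parse import urlencode


def build_query_urls(form_action, field_name, keywords):
    """Build one GET URL per keyword for a (hypothetical) search form."""
    return [f"{form_action}?{urlencode({field_name: kw})}" for kw in keywords]


query_urls = build_query_urls(
    "http://example.com/search",   # assumed form action, not a real service
    "q",                           # assumed name of the query field
    ["web crawler", "deep web"],
)
for url in query_urls:
    print(url)
```

The result pages returned for each URL would then be handed to the ordinary crawling pipeline.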