Google’s web-crawling bot, Googlebot (also known as a spider), is the mechanism by which the search giant discovers new and updated web pages to add to its index.
While there are seemingly endless pages across the web (billions, in fact), Google is able to find practically all of them by using an enormous number of computers to “fetch” the data. The Googlebot algorithm determines which websites to crawl, how often to crawl them, and how many pages to fetch from each site.
This is how Google is able to return fast and accurate search results.
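Google does not publish how this scheduling works, but the idea of deciding which site to visit next, how often to return, and how many pages to fetch per visit can be sketched as a simple priority queue keyed on each site’s next due time. Everything in the sketch below (the field names, the example hosts, the revisit intervals and page budgets) is an illustrative assumption, not Google’s actual logic.

```python
import heapq
import time
from dataclasses import dataclass, field

@dataclass(order=True)
class ScheduledSite:
    next_crawl: float                              # when the site is next due (epoch seconds)
    host: str = field(compare=False)
    revisit_seconds: int = field(compare=False)    # how often to come back
    page_budget: int = field(compare=False)        # how many pages to fetch per visit

# Hypothetical per-site policies: frequently updated sites get short revisit
# intervals and larger budgets. Real values would come from observed change rates.
queue = [
    ScheduledSite(0.0, "news.example.com", revisit_seconds=3600, page_budget=500),
    ScheduledSite(0.0, "blog.example.org", revisit_seconds=86400, page_budget=50),
]
heapq.heapify(queue)

def next_site(now: float | None = None) -> tuple[str, int]:
    """Pop the site that is due soonest and reschedule its next visit."""
    now = time.time() if now is None else now
    site = heapq.heappop(queue)
    site.next_crawl = max(now, site.next_crawl) + site.revisit_seconds
    heapq.heappush(queue, site)
    return site.host, site.page_budget
```

In this toy model, a site that changes often simply surfaces from the queue more frequently and with a larger page budget; Google’s real system weighs many more signals.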
How it works
Googlebot’s crawl process begins with a list of webpage URLs generated from previous crawls and augmented with sitemap data provided by webmasters. As Googlebot visits each of these pages, it detects the links they contain and adds them to its list of pages to crawl. New sites, changes to existing sites, and dead links are noted and used to update the Google index.
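That loop (seed the queue from past crawls and sitemaps, fetch a page, extract its links, add any new ones back to the queue) is essentially a breadth-first crawl. The sketch below is a minimal stand-in for that idea, not Googlebot itself: the function names, the `max_pages` cap, and the seed URL are all illustrative assumptions, and it uses only the Python standard library.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_urls, max_pages=50):
    """Breadth-first crawl: start from seed URLs (e.g. past crawls plus
    sitemap entries), fetch each page, and queue newly discovered links."""
    frontier = deque(seed_urls)      # URLs waiting to be fetched
    seen = set(seed_urls)            # URLs already queued or fetched
    index = {}                       # URL -> raw HTML (a stand-in for "the index")

    while frontier and len(index) < max_pages:
        url = frontier.popleft()
        try:
            with urlopen(url, timeout=10) as response:
                html = response.read().decode("utf-8", errors="replace")
        except OSError:
            continue                 # dead link: skip (a real crawler would record it)

        index[url] = html

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)        # resolve relative links
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)

    return index

if __name__ == "__main__":
    pages = crawl(["https://example.com/"], max_pages=5)
    print(f"Fetched {len(pages)} pages")
```

Unlike this sketch, the real Googlebot distributes the frontier across many machines, respects robots.txt and crawl-rate limits, and feeds what it fetches into a separate indexing pipeline rather than a dictionary in memory.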