What is a search engine?
A search engine is internet-based software that searches an index of documents for a given keyword, phrase, or piece of text. The term most often refers to large web-based search engines that search through billions of pages on the internet.

A search engine consists of two main things: a database of information, and algorithms that compute which results to return, and how to rank them, for a given query.

In the case of web search engines like Google, the database consists of trillions of web pages, and the algorithms look at hundreds of factors to deliver the most relevant results.

When a user enters a query into a search engine, a Search Engine Results Page (SERP) is returned, ranking the found pages in order of their relevance. How this ranking is done differs across search engines.

Search engines often change their algorithms (the programs that rank the results) to improve user experience. They aim to understand how users search and give them the best answer to their query. This means giving priority to the highest quality and most relevant pages.
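
To make the "database plus ranking algorithm" idea concrete, here is a minimal Python sketch of a toy search engine. The page contents, the simple word-overlap score, and the function names are illustrative assumptions for this sketch, not how Google or any other real engine works.

# A toy "search engine": a tiny database of pages plus a very simple
# ranking algorithm that counts how many query words each page contains.

pages = {
    "https://example.com/coffee":   "how to brew coffee at home with a french press",
    "https://example.com/tea":      "green tea brewing temperature and steeping time",
    "https://example.com/espresso": "espresso machines and how to pull a perfect shot",
}

def score(query, text):
    # Relevance here is just the number of query words found in the page text.
    query_words = set(query.lower().split())
    page_words = set(text.lower().split())
    return len(query_words & page_words)

def search(query):
    # Score every page, drop pages with no match, and sort best-first,
    # which is roughly what a SERP does.
    results = [(score(query, text), url) for url, text in pages.items()]
    results = [(s, url) for s, url in results if s > 0]
    return [url for s, url in sorted(results, reverse=True)]

print(search("how to brew coffee"))
# -> ['https://example.com/coffee', 'https://example.com/espresso']
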
How Do Search Engines Work
Every search engine uses its own algorithm to rank web pages, making sure that only relevant results are returned for the query entered by the user. The results for a specific query are then shown on the Search Engine Results Page (SERP).
 
The algorithm is a complex and lengthy equation that calculates a value for any given site in relation to a search term. We don’t know exactly what the algorithm is, because search engines keep it a closely guarded secret from competitors and from people looking to game their way to the top spots.
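
Purely as an illustration of what such an equation might look like, the toy function below combines a few made-up factors with made-up weights. The factor names and numbers are assumptions for this sketch; real engines weigh hundreds of signals, and the real weights are unknown.

# Illustrative only: a made-up relevance "equation" combining a few factors.
# Real ranking algorithms use hundreds of signals with secret weights.

def page_value(term_matches, title_match, inbound_links, page_speed_seconds):
    content_score = 2.0 * term_matches           # how often the search term appears
    title_score   = 5.0 if title_match else 0.0  # bonus if the term is in the title
    link_score    = 1.5 * inbound_links          # crude "authority" signal
    speed_penalty = 0.5 * page_speed_seconds     # slower pages score lower
    return content_score + title_score + link_score - speed_penalty

# Two hypothetical pages competing for the same search term:
print(page_value(term_matches=4, title_match=True, inbound_links=10, page_speed_seconds=1.2))
print(page_value(term_matches=7, title_match=False, inbound_links=2, page_speed_seconds=3.0))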
 
Search engines have three primary functions:
 
1. Crawl
2. Index
3. Rank
Crawl:

Search engines run a number of computer programs called web crawlers (hence the word crawling) that are responsible for finding information that is publicly available on the Internet.

Each search engine has its own crawlers, small bots that scan websites across the World Wide Web. These bots scan every section, folder, subpage, and piece of content they can find on a website, including images, videos, and files in other formats (CSS, HTML, JavaScript, etc.).

Crawling is based on finding hypertext links that refer to other websites. By parsing these links, the bots are able to recursively find new sources to crawl.
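
Here is a minimal sketch of that link-following idea using only the Python standard library. The start URL, the page limit, and the class name are arbitrary choices for the example; a production crawler would also handle politeness delays, robots.txt, deduplication, and errors far more carefully.

# A minimal breadth-first crawler sketch: fetch a page, extract its links,
# and queue any new ones for crawling.

from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Collect the href of every <a> tag on the page.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_pages=5):
    seen, queue = set(), deque([start_url])
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", errors="ignore")
        except Exception:
            continue  # skip pages that fail to download
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            queue.append(urljoin(url, link))  # resolve relative links against the page URL
    return seen

print(crawl("https://example.com"))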

Why care about the crawling process?

There are a number of things you can do to make sure that crawlers can discover and access your website as quickly as possible and without problems.

1. Use robots.txt to specify which pages of your website you don’t want crawlers to access.

2. Top search engines like Google and Bing have webmaster tools you can use to give them more information about your website (number of pages, structure, etc.) so that they don’t have to discover it all themselves.

3. Use an XML sitemap to list all the important pages of your website so that crawlers know which pages to monitor for changes and which to ignore (a short sketch showing a robots.txt file and a sitemap in use follows this list).
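
As a sketch of how a crawler might respect those two files, the snippet below feeds a sample robots.txt into Python's standard urllib.robotparser and pulls page URLs out of a sample XML sitemap. The rules, paths, and URLs are placeholders invented for this example, not recommendations for any particular site.

# Sketch: how a crawler might read a robots.txt file and an XML sitemap.
# The rules and URLs below are placeholders, not real site configuration.

import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
Sitemap: https://www.example.com/sitemap.xml
"""

rules = RobotFileParser()
rules.parse(robots_txt.splitlines())
print(rules.can_fetch("*", "https://www.example.com/blog/post-1"))   # True
print(rules.can_fetch("*", "https://www.example.com/admin/login"))   # False

sitemap_xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/blog/post-1</loc></url>
</urlset>
"""

# List the URLs the sitemap asks crawlers to pay attention to.
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(sitemap_xml)
print([loc.text for loc in root.findall("sm:url/sm:loc", ns)])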

Examples of webmaster tools: Google Search Console (Google) and Bing Webmaster Tools (Bing).
Index:

Once the bots have crawled the data, it’s time for indexing. Search engines process and store the information they find in an index: a huge database of all the content they’ve discovered and deem good enough to serve up to searchers. The index is basically an online library of websites.

Your website has to be indexed in order to appear on the search engine results page. Keep in mind that indexing is a constant process: crawlers come back to each website to detect new data.
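
One common way to picture that library is an inverted index, which maps each word to the pages containing it. The sketch below builds one for two made-up pages; real indexes also store word positions, freshness signals, and much more.

# Sketch of an inverted index: for each word, remember which pages contain it.

from collections import defaultdict

crawled_pages = {
    "https://example.com/coffee": "how to brew coffee at home",
    "https://example.com/tea":    "how to brew green tea",
}

def build_index(pages):
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

index = build_index(crawled_pages)
print(index["brew"])    # both pages contain "brew"
print(index["coffee"])  # only the coffee page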

Rank:

When someone performs a search, search engines scour their index for highly relevant content and then order that content in the hope of answering the searcher's query. This ordering of search results by relevance is known as ranking. In general, you can assume that the higher a website is ranked, the more relevant the search engine believes that site is to the query.
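
As one simple, well-known stand-in for relevance scoring (not what any particular search engine actually uses), the sketch below ranks a few made-up pages for a query with TF-IDF, where words that appear often in a page but rarely across the whole collection count the most.

# Sketch: rank pages for a query with TF-IDF, a classic relevance measure.
# Real ranking algorithms weigh many more signals than word statistics.

import math

pages = {
    "https://example.com/coffee":   "coffee coffee brew french press coffee",
    "https://example.com/tea":      "tea brew green tea leaves",
    "https://example.com/espresso": "espresso shot crema espresso machine",
}

def tf_idf(word, text, all_texts):
    words = text.split()
    tf = words.count(word) / len(words)                  # how often the word appears in this page
    containing = sum(1 for t in all_texts if word in t.split())
    idf = math.log(len(all_texts) / (1 + containing))    # smoothed: rarer words weigh more
    return tf * idf

def rank(query):
    all_texts = list(pages.values())
    scores = {
        url: sum(tf_idf(w, text, all_texts) for w in query.lower().split())
        for url, text in pages.items()
    }
    # Highest score first, just like a SERP.
    return sorted(scores, key=scores.get, reverse=True)

print(rank("brew coffee"))  # the coffee page comes first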