Indexing websites by search engines: what you need to know
The first step in setting up an online business is to launch your website. The second is to make sure people can find it. For that, it is vital to understand what indexing is. Below we look at how search engines work and how to make their algorithms work for you.
What is site indexing?
Indexing is the process by which search engines crawl a site with special programs (crawlers). These programs primarily analyze the text content of a site, but they also parse its code, from basic HTML structures to complex scripts. In recent years, search engines have paid growing attention to graphics and multimedia content as well, although their handling of these formats is still far from perfect.
The result of a crawl is that basic information about the site is entered into the search index: a huge database of all active web resources that follow current SEO principles and do not violate search engine rules. Once indexed, a site can appear in search results for relevant queries, where users will see it. If a page is not in the database, it is very difficult to find, which certainly does not benefit the business.
How can I check the indexing of a website in search engines?
If the web resource belongs to you, the procedure is very simple: use the search engine's webmaster panel. For Google, this is Google Search Console. You need to:
1. Log in to your account. If the site has serious problems, they will already be visible on the main “Overview” page.
2. For more details, open the Index section and its Coverage report (called "Pages" in the current interface). It shows the status of the site, broken down into indexed and non-indexed pages.
3. Drill into the indexing report. Clicking a category name shows the list of specific pages in it. For problem pages, Google indicates why a page is not indexed: duplicate content, technical errors, improper redirects, or manual sanctions.
To check how someone else's site (a partner's, a competitor's, or any other business's) is indexed, use special search operators:
· site:domain URL — lists all indexed pages of a particular web resource. You will have to count them manually, which can be problematic for large platforms. Example: site:microsoft.com;
· site:page URL — shows a specific page and its subsections, which lets you check the indexing of a suspicious page that may contain errors. Example: site:microsoft.com/en-us/microsoft-365;
· cache:page URL — shows the archived copy of a page as it looked when it was indexed, together with the date and exact time of the crawl. Example: cache:microsoft.com/en-us/microsoft-365. Note that Google has since retired the cache: operator, so this command may no longer work there.
How do I set up site indexing?
The crawling process is almost fully automated: to get into search results, you do not need to contact Google engineers or write letters. However, natural indexing is slow. If you let the process run its course, the first crawl may start only a month after the web resource is created.
Fortunately, the first indexing of pages can and should be sped up. Before you begin, make sure the relevant pages are open to indexing in their HTML code (a robots meta tag set to "index, follow", or simply no noindex directive, plus a correct rel="canonical" link), are available to users, and return an HTTP 200 response. Next, you need to:
1. After creating a site, sign up for Google Search Console and confirm ownership of the web resource.
2. Once you have access to your account, open the "URL Inspection" field at the top of the interface.
3. Enter the page address. If it is not yet in the search engine's database, a warning will appear. You can also run a live test of the URL to check that it works.
4. To index a page manually, click the “Request Indexing” button. As a rule, this process takes from a few minutes to a day.
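Before submitting a page, the prerequisites from the checklist above (an HTTP 200 response and no noindex directive) can be verified with a short script. This is a rough sketch, not a full validator: the parsing is deliberately minimal and the network call omits error handling.

```python
# Sketch: check that a page answers HTTP 200 and carries no "noindex"
# directive in its <meta name="robots"> tags.
from html.parser import HTMLParser
import urllib.request


class RobotsMetaParser(HTMLParser):
    """Collects the content of <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            self.directives.append(attrs.get("content", "").lower())


def is_indexable(status_code, html):
    """True if the response looks eligible for indexing."""
    parser = RobotsMetaParser()
    parser.feed(html)
    blocked = any("noindex" in d for d in parser.directives)
    return status_code == 200 and not blocked


def check_url(url):
    # Network call; add timeouts and error handling in real use.
    with urllib.request.urlopen(url) as resp:
        return is_indexable(resp.status, resp.read().decode("utf-8", "replace"))
```

For example, `check_url("https://www.example.com/")` (a placeholder URL) returns True only when the page responds with 200 and its markup contains no noindex directive.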
It is important to remember that search bots have their limits: they can index only a certain number of pages per period of time (the so-called crawl budget). For a very large site, it is best to prioritize the sections likely to bring in the most organic traffic.
Another way to speed up and streamline the process is to create two files that guide the crawlers. The first is robots.txt, which lists pages and the rules for crawling them using the following directives:
· User-agent: — names the crawler of a particular search engine that the rules apply to;
· Sitemap: — indicates the path to the site map;
· Allow: — permits crawling;
· Disallow: — prohibits crawling;
· Crawl-delay: — sets a delay between page requests, usually so that scripts have time to execute (note that Google ignores this directive);
· Clean-param: — lists URL parameters to ignore during crawling, most often used to collapse links with tracking parameters such as UTM tags (supported by Yandex, not by Google).
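As an illustration, a minimal robots.txt using the directives above might look like this (the domain and paths are placeholders):

```
# Rules for all crawlers
User-agent: *
Disallow: /admin/
Disallow: /drafts/
Allow: /

# Path to the site map
Sitemap: https://www.example.com/sitemap.xml
```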
To get a site indexed, search engines need another file: sitemap.xml. This is a site map that lists the site's pages and the relationships between them, charting the best route for search robots.
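The sitemap.xml mentioned above is a plain XML file. A minimal example (the URLs and dates are placeholders) might look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://www.example.com/blog/</loc>
    <lastmod>2024-01-10</lastmod>
  </url>
</urlset>
```

Only the loc element is required for each page; lastmod, changefreq, and priority are optional hints for crawlers.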
Naturally, for large projects, writing and editing such files by hand is laborious, so special services are used to automate the process, such as MySiteMapGenerator, Ryte, or Small SEO Tools.
How long does it take for a page to be indexed?
The answer is individual for each site. As noted above, a new website can take a month to index, while a trusted resource with a long track record and an excellent reputation is usually crawled within minutes; new pages on some media platforms are picked up in a few seconds. On average, indexing takes 24 to 48 hours.
Keep in mind that a site's presence in the index does not mean it is visible to users. Indexing is a prerequisite for appearing in search results, but it can take a couple more days after a page is crawled before the link shows up on other people's screens.
How to speed up website indexing?
Do you regularly update your web resource, but traffic is not growing because pages are indexed slowly? At first, you can request indexing manually, especially for a small site, but even at this stage it is important to think about a long-term solution. To get your site indexed faster, we recommend:
· Build up your link mass. The logic is simple: if other resources link to your content, Google treats it as relevant, authoritative, and valuable. The page gains weight in the database, and search bots pay more attention to it.
· Fill the site with high-quality content. Although text quality alone does not directly affect indexing speed, it does shape a resource's authority. When ranking, Google uses both technical and behavioral factors: the number of clicks, how far users read, how long they spend studying the content, and so on. The more the text engages users, the more weight the page gains.
· Conduct regular content audits. Text should be readable and contain a sensible number of keywords. Internal linking (links between the site's pages that ease navigation) is also important. Avoid duplicate content, keyword stuffing, and low-quality texts that offer visitors no value.
· Update robots.txt and sitemap.xml files regularly. To open a site for indexing, you need to tell robots which pages to crawl. As the resource evolves, their list will change, which should be reflected in the instructions.
· Speed up site loading. Search engines penalize slow resources by lowering their rankings and limiting the number of pages they index. To speed up, optimize the amount of multimedia content, disable unnecessary scripts, or move to better hosting.
· Use third-party services. There are tools that speed up page indexing; their list changes constantly as they adapt to new search algorithms. Simple but effective methods also work, such as posting links to new pages on social media within interesting organic content.
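To illustrate the kind of automation the sitemap services above perform, here is a hedged sketch that builds a minimal sitemap.xml from a list of page URLs. A real generator would crawl the site to discover pages; here the list is supplied by hand, and the URLs are placeholders.

```python
# Sketch: generate a minimal sitemap.xml from a list of URLs.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) for the given page URLs."""
    ET.register_namespace("", SITEMAP_NS)  # emit the default sitemap namespace
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = page
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap(["https://www.example.com/", "https://www.example.com/blog/"]))
```

Regenerating the file like this on every site update keeps the instructions for robots current, as recommended above.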
If none of these methods works, check whether the page is blocked from indexing. If that step also turns up nothing, it is best to bring in specialists who can run a comprehensive diagnosis of the site and find the causes of the problems.
How to close a site from indexing?
Although crawling is important for a site, having every page indexed is not always valuable. Moreover, if the site contains draft copies of sections with duplicated content, or technical sections that do not follow SEO rules, indexing them can even be harmful.
Therefore, some sections must be closed from search robots using one of three methods:
1. The noindex meta tag. It is placed in the <head> section along with other meta tags and looks like this: <meta name="robots" content="noindex, nofollow">. To keep robots from indexing the page while still letting them follow the links posted on it, specify content="noindex, follow".
2. The robots.txt file. Pages not mentioned in it are crawled by default; to block them, use the Disallow: directive.
3. HTTP status 403. It is configured in the web server's admin panel and sends visitors to a page with an explanation. If you wish, you can block a page only for the robots of a particular search engine, or for visitors from a specific country.
That said, do not abuse the last method. When a crawler hits an error, it retries later, and after several such responses it permanently removes the page from the index; the page would then have to be restored manually.
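As an illustration of the third method, a web server rule of roughly this shape returns 403 for a blocked section. The example uses nginx syntax, and the paths and bot name are assumptions, not taken from the article:

```
# Deny access to the /drafts/ section for everyone.
location /drafts/ {
    return 403;
}

# Or block only a specific crawler by its User-Agent header.
location /internal/ {
    if ($http_user_agent ~* "Googlebot") {
        return 403;
    }
}
```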
Conclusions
In simple terms, indexing is the addition of a site and its individual pages to a search engine's database. It happens automatically, but the process can take a long time, so it is worth requesting a manual crawl the first time. After that, to keep indexing regular, build up link mass, optimize content quality, update the instructions for robots, and speed up site loading.
You can view a site's indexing history and check for errors in a webmaster panel such as Google Search Console. Remember that not all pages should be indexed: technical sections and drafts should be hidden from search engines. To do this, use HTML meta tags, directives in the robots.txt file, or the server's HTTP response.