How to identify and fix indexation bloat issues

Last Updated: April 26, 2017

Indexation bloat occurs when a search engine’s index contains pages from a website that should not be indexed, and it can cause problems if it is not monitored and policed properly. It is an extremely common SEO problem that can affect websites of any size, from small WordPress blogs to large Hybris and Magento ecommerce websites.

The more serious cases of indexation bloat usually occur on ecommerce websites, as they tend to use user-friendly faceted navigation and filter lists that let users quickly find the products they want. I’ve seen first-hand examples of simple Demandware and OpenCart websites with only a few hundred products having millions of URLs appear in Google’s index because the product filters were generating URLs.

Why is indexation bloat a problem?

When Google and the other search engines crawl your website, they don’t crawl it in its entirety. They work within a limited crawl allowance, often called crawl budget, and allowing or asking them to crawl unnecessary URLs wastes this resource. If search engines aren’t regularly crawling your “money” pages, and are instead getting stuck down other rabbit holes without picking up on updates, this could impact your organic performance.

Bloat can also lead to duplicate content issues. While internal content duplication isn’t as serious an issue as external duplication, it can dilute an individual page’s prominence and relevancy for search terms, as the search engines aren’t sure which URL to rank for those terms.

Identifying index bloat issues

One early indicator of index bloat is the number of pages appearing within search engine results. Note that the number of pages reported by the site: operator in Google and Bing search often differs from the figures shown in Google Search Console and Bing Webmaster Tools. This isn’t something to worry about.
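As a rough illustration of using the indexed-page count as an indicator, the Python sketch below compares an indexed-page estimate with the number of pages you expect to be indexed. The function names, the example figures and the 1.5 tolerance are all assumptions made for illustration, not values from any search engine guideline, and the site: estimate itself should be treated as approximate.

```python
def bloat_ratio(indexed_estimate: int, expected_pages: int) -> float:
    """Return how many indexed URLs exist per page you expect to be indexed."""
    if expected_pages <= 0:
        raise ValueError("expected_pages must be positive")
    return indexed_estimate / expected_pages


def looks_bloated(indexed_estimate: int, expected_pages: int,
                  tolerance: float = 1.5) -> bool:
    """Flag the site when the index holds noticeably more URLs than expected.

    tolerance is an arbitrary illustrative threshold, not a published guideline.
    """
    return bloat_ratio(indexed_estimate, expected_pages) > tolerance


# Hypothetical figures: 400 pages you want indexed,
# 1,200,000 results reported by a site: search.
print(looks_bloated(1_200_000, 400))  # True
print(looks_bloated(450, 400))        # False
```

A small overshoot is normal because the reported counts are estimates, which is why the check uses a tolerance rather than an exact match.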
Website monitoring

While there are ways to resolve index bloat, the best way to deal with it, in my experience, is to prevent it from happening at all. By checking Google Search Console and Bing Webmaster Tools monthly, specifically the crawl data, you can record what is and isn’t regular behavior for your website. Abnormal increases or spikes in “Pages crawled per day” and “Kilobytes downloaded per day” can indicate that Google is accessing more URLs than it has been.

Likewise, conducting a site: search within Google and Bing will show you how many URLs they have in the index, which you can compare against the rough number of pages you know your website has.

How can I fix indexation bloat?

Identifying that you have an index bloat issue is only step one; now you have to establish what is causing the bloat. These are some of the most common causes…
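As a companion to the monitoring advice under “Website monitoring”, here is a minimal Python sketch of one way to spot abnormal spikes in recorded “Pages crawled per day” figures. It assumes you have copied the daily numbers into a list by hand. The function name, the seven-day window and the 2x factor are hypothetical choices for illustration, not settings from Google Search Console or Bing Webmaster Tools.

```python
from statistics import median


def find_crawl_spikes(pages_crawled_per_day, window=7, factor=2.0):
    """Return indexes of days whose crawl count exceeds
    factor times the median of the preceding `window` days."""
    spikes = []
    for i in range(window, len(pages_crawled_per_day)):
        baseline = median(pages_crawled_per_day[i - window:i])
        if baseline > 0 and pages_crawled_per_day[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Hypothetical daily figures recorded from a crawl stats report.
daily = [510, 495, 520, 505, 500, 515, 490, 505, 1900, 2100, 520]
print(find_crawl_spikes(daily))  # [8, 9]
```

Using the median of the preceding days as the baseline keeps a single spike from inflating the baseline for the days that follow it.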

Source: How to identify and fix indexation bloat issues

About the author

Dan Taylor
