Gone are the days when you could easily hack search engines by loading a page with keywords and creating artificial backlinks.
Today, Google is consistently rolling out changes to its algorithms to reward quality.
Unlike in the past, having low-quality pages on your website can now negatively impact your overall ranking.
What’s a low-quality page?
It’s one that isn’t used or visited, is full of duplicate content from other pages, or has thin content or very low engagement in the eyes of Google. Some people call these “zombie pages”.
Is it OK to remove low-quality content like this? Yes!
Here’s the thing:
It’s entirely possible that you have dozens, hundreds, or thousands of low-quality pages on your site — in the eyes of Google — and you might not even realize it.
We call this problem index bloat.
It happens when Google has indexed a lot of URLs for your website that it views as low-quality.
In this article, we’ll show you:
- An example of index bloat
- Common causes
- The exact steps you can take to see if you have a problem
Note: We can help you spot and fix issues on your website that are harming your overall ranking. Contact us here.
Index Bloat: A Real-life Example
We recently started working with an eCommerce client and discovered something fascinating (and troubling) as we did our standard checks to evaluate their site.
After talking to them, we expected the site to have somewhere around 10,000 pages.
When we looked in Google Webmaster Tools (now Google Search Console), we saw — to our surprise — that Google had indexed 38,000 pages for the website. Find this chart here: Web Tools > Search Console > Google Index > Index Status.
That was way too high for the size of the site.
We also saw that the number had risen dramatically.
In July of 2017, the site had only 16,000 pages indexed in Google.
How a Hidden Technical Glitch Caused Massive Index Bloat
It took a while to figure out what had gone wrong with our client’s site.
Eventually, we found a problem in their software that was creating thousands of unnecessary product pages.
At a high level, any time the website sold out of its inventory for a brand (which happened often), the site’s pagination system created hundreds of new pages.
Put another way, the site had a technical glitch that was creating index bloat.
The company had no idea their site had this problem, which is common when the cause is a technical glitch rather than something anyone published on purpose.
For eCommerce sites that automatically generate new pages for products, brands, or categories, things like this can easily happen.
It’s one common cause of index bloat, but not the only one.
Other common causes include:
- Pages with too little original content
- Old blog posts, news releases, or case studies that get little to no traffic
- Search pages that get accidentally indexed by Google
Don’t think you’re safe just because your indexed page count looks steady.
Even if the overall number of pages on your site isn’t going up, you might still be carrying unnecessary pages from months or years ago — pages that could be slowly chipping away at your relevancy scores as Google makes changes to its algorithm.
The good news is: it’s relatively easy to identify and remove pages that are causing index bloat on your site.
We also have a free tool you can use that will help.
How to Identify and Remove Thin Content and Poor Performing Pages
Here’s the step-by-step process we use with our clients to identify and remove poor performing pages:
(1) Estimate the number of pages you should have
Estimate the number of products you carry, the number of categories, blog posts, and support pages, and add them together. Your total indexed pages should be something close to that number.
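As a rough sketch of that arithmetic, the snippet below adds up the page types and compares the total with the indexed count. The 38,000 indexed pages and the expectation of roughly 10,000 come from the example earlier in this article. The breakdown by page type and the function names are hypothetical, chosen only for illustration.

```python
def estimate_expected_pages(products, categories, blog_posts, support_pages):
    """Add up the page types the site is supposed to have."""
    return products + categories + blog_posts + support_pages


def bloat_ratio(indexed_pages, expected_pages):
    """Indexed URLs per page you actually intended to publish."""
    if expected_pages <= 0:
        raise ValueError("expected_pages must be positive")
    return indexed_pages / expected_pages


# Hypothetical breakdown that sums to roughly 10,000 pages.
expected = estimate_expected_pages(
    products=9_000, categories=400, blog_posts=500, support_pages=100
)
ratio = bloat_ratio(indexed_pages=38_000, expected_pages=expected)
print(f"Expected about {expected:,} pages; indexed/expected ratio = {ratio:.1f}")
```

A ratio close to 1 suggests the index roughly matches the site. A ratio well above 1, as in this example, is a sign that something is generating extra URLs.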
(2) Use the Cruft Finder Tool to find poor-performing pages
The Cruft Finder tool is a free tool we created to identify poor-performing pages. It’s designed to help eCommerce site managers find and remove thin content pages that are harming their SEO ranking.
The tool sends a Google query about your domain and — using a recipe of site quality parameters — returns page content we suspect might be harming your index ranking.
Mark any page that:
- Is identified by the Cruft Finder tool
- Gets very little traffic (as seen in Google Analytics)
These are pages you should consider removing from your site.
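If you export both lists, the marking step can be sketched in a few lines. This is a minimal illustration, not how the Cruft Finder tool works internally: the URLs, the session counts, and the threshold of 10 sessions are all made-up assumptions, and you would pick a threshold that fits your own traffic.

```python
def mark_pages(all_urls, flagged_urls, sessions_by_url, max_sessions=10):
    """Return {url: [reasons]} for every URL that meets at least one criterion.

    URLs missing from sessions_by_url are treated as having zero sessions.
    """
    flagged = set(flagged_urls)
    marks = {}
    for url in all_urls:
        reasons = []
        if url in flagged:
            reasons.append("flagged by tool")
        if sessions_by_url.get(url, 0) <= max_sessions:
            reasons.append("low traffic")
        if reasons:
            marks[url] = reasons
    return marks


# Hypothetical data for illustration only.
all_urls = [
    "/products/red-shoes",
    "/brand/acme?page=57",
    "/blog/2014-holiday-hours",
    "/blog/buying-guide",
]
flagged_urls = ["/brand/acme?page=57", "/blog/buying-guide"]
sessions_by_url = {
    "/products/red-shoes": 1200,
    "/brand/acme?page=57": 0,
    "/blog/2014-holiday-hours": 3,
    "/blog/buying-guide": 450,
}

marks = mark_pages(all_urls, flagged_urls, sessions_by_url)
for url, reasons in marks.items():
    print(url, "->", ", ".join(reasons))
```

Pages that meet both criteria are the strongest candidates for removal. Pages that meet only one deserve a closer look before you decide.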
(3) Decide what to keep and what to remove
For years, you’ve been told that adding fresh content on your site increases traffic and improves SEO. You should be blogging at least once a week, right?
Not necessarily. If a blog post has been on your website for years, has no backlinks pointing to it, and no one ever visits it, that old content could be hurting your rankings. You should remove that outdated content.
Recently, we deleted 90% of one client’s blog posts. Why? Because they weren’t generating backlinks or traffic.
If no one is visiting a URL, and it doesn’t add value to your site, it doesn’t need to be there. It’s using up crawl budget for no reason.
(4) Revise and revamp necessary pages with little traffic
If a URL has valuable content you want people to see — but it’s not getting any traffic — it’s time to restructure.
Could you consolidate pages? Could you promote the content better through internal links? Could you change your navigation to push traffic to that particular page?
Also, make sure that all your static pages have robust, unique content. When Google’s index includes thousands of pages on your site with sparse or similar content, it can lower your relevancy score.
(5) Make sure your search results pages aren’t being indexed
Not all pages on your site should be indexed. The main example of this is search results pages.
You almost never want search pages to be indexed, because other pages on your site have better-quality content and make better destinations for traffic. Search pages are not meant to be entry pages.
This is a common issue.
For example, here’s what we found using the Cruft Finder tool for one major retail site: over 5,000 search pages indexed by Google.
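If you have a list of indexed URLs for your own site, a quick way to size the problem is to count the ones that look like internal search results. The sketch below assumes search pages live under a `/search` path or carry a query parameter such as `q` or `s`. Those patterns are assumptions and vary by platform, so adjust them to match your site.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed patterns; check how your own platform builds search URLs.
SEARCH_PATH_PREFIXES = ("/search",)
SEARCH_QUERY_KEYS = {"q", "s", "query", "search"}


def is_search_results_url(url):
    """Guess whether a URL is an internal search results page."""
    parts = urlsplit(url)
    path = parts.path.lower()
    if any(path == p or path.startswith(p + "/") for p in SEARCH_PATH_PREFIXES):
        return True
    keys = {k.lower() for k in parse_qs(parts.query, keep_blank_values=True)}
    return bool(keys & SEARCH_QUERY_KEYS)


# Hypothetical indexed URLs for illustration only.
indexed_urls = [
    "https://example.com/search?q=red+shoes",
    "https://example.com/catalog?query=boots",
    "https://example.com/products/red-shoes",
    "https://example.com/blog/searching-tips",
    "https://example.com/?s=",
]
search_urls = [u for u in indexed_urls if is_search_results_url(u)]
print(f"{len(search_urls)} of {len(indexed_urls)} indexed URLs look like search pages")
```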
If you find this issue on your own site, follow Google’s instructions to get rid of search result pages.
We recommend reading their instructions carefully before you remove or noindex these pages. That page covers temporary versus permanent solutions, when to delete pages versus using a noindex tag, and more. If this gets too far into “technical SEO” for you, feel free to reach out to our SEO team for consultation or advice.
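Once a fix is in place, it helps to confirm that search pages really carry a noindex signal. Google recognizes noindex both in a robots meta tag and in an `X-Robots-Tag` HTTP response header. The sketch below checks a page's HTML and headers for either one. It is a simplified check under stated assumptions: the helper names are hypothetical, it takes the HTML and headers as inputs rather than fetching them, and it ignores header values scoped to a specific crawler.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def is_noindexed(html, headers=None):
    """True if the meta tags or the X-Robots-Tag header contain noindex."""
    header_directives = set()
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            header_directives.update(d.strip().lower() for d in value.split(","))
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives or "noindex" in header_directives


# Hypothetical pages for illustration only.
search_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
product_page = "<html><head><title>Red shoes</title></head></html>"

print(is_noindexed(search_page))
print(is_noindexed(product_page))
print(is_noindexed(product_page, {"X-Robots-Tag": "noindex"}))
```

Keep in mind that Google has to be able to crawl a page to see its noindex signal, so a page blocked in robots.txt can stay in the index even after you add the tag.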
The Results and Impact on Traffic and Revenue
What kind of impact can index bloat have on your results?
And what kind of positive effect have we seen after correcting it?
Here’s a graph of indexed pages from a recent client that was letting their search result pages get indexed — the same way we explained above. We helped them implement a technical fix so those pages wouldn’t be indexed anymore.
In the graph, the blue dot marks where the fix was implemented. The number of indexed pages continued to rise for a bit, then dropped significantly.
Year over year, here’s what happened to the site’s organic traffic and revenue:
3 Months Before the Technical Fix
- 6% decrease in organic traffic
- 5% increase in organic revenue
3 Months After the Technical Fix
- 22% increase in organic traffic
- 7% increase in organic revenue
Before vs. After
- 28 percentage point swing in organic traffic (from a 6% decrease to a 22% increase)
- 2 percentage point gain in organic revenue growth (from a 5% increase to a 7% increase)
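The before and after comparison is a difference between two year-over-year rates, so it is measured in percentage points. A short worked version of that arithmetic, using the figures above (the function name is just for illustration):

```python
def percentage_point_change(before_pct, after_pct):
    """Difference between two percentage rates, in percentage points."""
    return after_pct - before_pct


traffic_swing = percentage_point_change(before_pct=-6, after_pct=22)
revenue_swing = percentage_point_change(before_pct=5, after_pct=7)

print(f"Organic traffic: {traffic_swing} percentage points")  # 28
print(f"Organic revenue: {revenue_swing} percentage points")  # 2
```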
This process takes time.
For this client, it took three full months before the number of indexed pages returned to the mid-13,000s, where it should have been all along.
Note: Interested in a personalized strategy to reduce index bloat and raise your SEO ranking? We can help. Contact us here.