How to Optimize E-commerce Sitemaps with 1M+ Pages — Whiteboard Friday

Stevy Liakopoulou

The author's views are entirely their own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.

Discover how to optimize your e-commerce website’s sitemap with Stevy’s six steps in this edition of Whiteboard Friday.

Hello, Moz fans. Welcome to this edition of the Whiteboard Friday series, and this is your host, Stevy from Search Magic.

So today, we're going to talk about how we can optimize sitemaps with more than one million product pages inside, especially for e-commerce websites.

Recently, I worked on a website. I started doing technical analysis and digging further into Google Search Console, and what I found there scared me. Inside the sitemap, there were more than one million product pages that were not indexed. Google Search Console reported those pages as "Discovered - currently not indexed" and "Crawled - currently not indexed," meaning that Google knew about them but had left them out of its index.

So, I started doing further research on this and finding patterns, and today, I'm going to show you how I did it.

Hierarchy + organization

First of all, let's talk about hierarchy and organization inside sitemaps. We need to group URLs logically inside the sitemap in a way that mirrors the navigation of our website.

I recommend using sub-sitemaps based on products, categories, or any other topic that makes sense for your business. If your website supports more than one language, consider creating a separate sitemap for every language. Do not stuff the URLs from all the different languages inside a single sitemap.
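To make the grouping idea concrete, here is a minimal sketch of how sub-sitemaps per language and category could be planned and listed in a sitemap index. The file naming scheme, the function names, and the example.com URLs are assumptions made for illustration. The 50,000 URL cap per file comes from the sitemaps.org protocol.

```python
from collections import defaultdict
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol


def group_urls(pages):
    """Group (language, category, url) tuples into sub-sitemap buckets."""
    groups = defaultdict(list)
    for language, category, url in pages:
        groups[(language, category)].append(url)
    return groups


def plan_sitemap_files(groups, max_urls=MAX_URLS_PER_SITEMAP):
    """Map hypothetical file names to URL lists, splitting oversized groups."""
    files = {}
    for (language, category), urls in sorted(groups.items()):
        for part, start in enumerate(range(0, len(urls), max_urls), start=1):
            name = f"sitemap-{language}-{category}-{part}.xml"
            files[name] = urls[start:start + max_urls]
    return files


def render_sitemap_index(base_url, file_names):
    """Render a sitemap index that points at each sub-sitemap file."""
    entries = "\n".join(
        f"  <sitemap><loc>{escape(base_url + name)}</loc></sitemap>"
        for name in file_names
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</sitemapindex>"
    )


if __name__ == "__main__":
    pages = [
        ("en", "shoes", "https://example.com/en/shoes/runner"),
        ("el", "shoes", "https://example.com/el/shoes/runner"),
    ]
    files = plan_sitemap_files(group_urls(pages))
    print(render_sitemap_index("https://example.com/", files))
```

Keeping one bucket per language and category also makes it easier to see in Google Search Console which part of the catalog has indexing problems.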

Categories/subcategories with 0 or 1 product

Second, categories and subcategories with zero products or one product. Now we are talking about thin content pages.

So, if you still need some of those categories, consider merging them with other relevant categories. But if we are talking about categories that will not have products in the future, set up a 301 redirect to a relevant category and remove them from the sitemap.
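The rule above could be sketched as a small decision function. The Category fields and the action labels are hypothetical names chosen for this example. In practice, whether a category is still needed is a business decision, not something the code can infer.

```python
from dataclasses import dataclass


@dataclass
class Category:
    url: str
    product_count: int
    still_needed: bool  # hypothetical flag: will the category get products again?


def thin_category_action(category):
    """Suggest an action for a category, following the 0-or-1 product rule."""
    if category.product_count > 1:
        return "keep_in_sitemap"
    if category.still_needed:
        return "merge_with_relevant_category"
    return "redirect_301_and_remove_from_sitemap"


if __name__ == "__main__":
    for cat in [
        Category("/shoes/", 240, True),
        Category("/shoes/left-handed/", 1, True),
        Category("/discontinued-brand/", 0, False),
    ]:
        print(cat.url, "->", thin_category_action(cat))
```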

Product pages with 0 content/duplicated content

Number three, product pages with zero content, duplicated content, or even content copied from the manufacturer's feed. There is no one-size-fits-all solution here. So we need to understand which of those products are important for our business in terms of revenue and which are not.

So, make a list of the products that are very important in terms of revenue and start doing manual optimizations in each one of these products. That means crafting unique, compelling content. Add keywords in good positions. For example, optimize page-level meta descriptions; add FAQs; add some videos; add a unique image; update the schema markup. You can even use AI for product descriptions for different product variations.

On the other hand, if we are talking about products that are not that important right now for your business, you should either optimize them gradually after you have finished with the important products or add a noindex tag and remove them from the sitemap.
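One possible way to build that priority list is to flag products with empty, copied, or duplicated descriptions and then sort them by revenue. This is only a sketch: the dictionary keys, the reason labels, and the exact-match comparison after normalization are all assumptions, and a real catalog would likely need fuzzier duplicate detection.

```python
from collections import Counter


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def flag_thin_products(products):
    """Return flagged products, highest revenue first.

    Each product is a dict with the hypothetical keys: sku, description,
    manufacturer_description, revenue.
    """
    counts = Counter(normalize(p["description"]) for p in products)
    flagged = []
    for p in products:
        desc = normalize(p["description"])
        if not desc:
            reason = "empty"
        elif desc == normalize(p["manufacturer_description"]):
            reason = "copied_from_manufacturer"
        elif counts[desc] > 1:
            reason = "duplicated"
        else:
            continue  # the description looks unique, nothing to flag
        flagged.append({"sku": p["sku"], "reason": reason, "revenue": p["revenue"]})
    return sorted(flagged, key=lambda item: item["revenue"], reverse=True)
```

The top of the returned list is where manual optimization would start. The bottom is where gradual optimization or a noindex tag would be considered.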

Out-of-stock products

Okay, let's talk about out-of-stock products. According to John Mueller, out-of-stock products can be treated as soft 404 errors, and there is a high chance that they will be dropped from the search results. So we need to do research here. We have two scenarios.

The first scenario is for those products to be permanently out of stock.

Research and ask yourself if those products receive traffic. Do they have backlinks? Do they make money for the business? If the answer is yes, then set up a 301 redirect from those products to other relevant products, remove them from the sitemap, and remove them completely from the product feed.

If those products do not make any money, have no traffic, and have no backlinks, then return a 410 HTTP status for them and completely remove them from the sitemap and the product feed.
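The first scenario could be sketched like this. The function name and the returned keys are hypothetical, and treating any one of the three signals as enough to justify a redirect is an assumption about how to read the rule above.

```python
def permanently_out_of_stock_action(monthly_visits, backlinks, revenue):
    """Suggest how to retire a product that will never be restocked."""
    # Assumption: traffic, backlinks, or revenue alone is enough to make
    # the URL worth redirecting instead of returning 410.
    has_value = monthly_visits > 0 or backlinks > 0 or revenue > 0
    return {
        "http_status": 301 if has_value else 410,
        "needs_redirect_target": has_value,  # a relevant product must be chosen
        "keep_in_sitemap": False,
        "keep_in_product_feed": False,
    }


if __name__ == "__main__":
    print(permanently_out_of_stock_action(monthly_visits=320, backlinks=4, revenue=0))
    print(permanently_out_of_stock_action(monthly_visits=0, backlinks=0, revenue=0))
```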

Under scenario number two, we are going to talk about the temporarily out-of-stock products. Then, in that case, it all comes down to the user experience.

We need a way to ensure that when the user lands on a temporarily out-of-stock product page, the user gets a clear indication that this product will be back in the future. We need to notify the user and let search engines know about the availability of this product.

So, we need to update the schema markup and create internal link blocks where we can list other relevant products so we can keep the user inside our website. Plus, we need to add a clear ‘Notify Me’ button and give users the option to leave their email so they can receive a notification email saying, "Hey, this product is back in stock. Are you interested in this?"

And, of course, we are going to keep it inside the sitemap.
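As an illustration of the schema markup update, here is a minimal sketch that builds Product structured data with an availability value from schema.org. The function name, the product details, and the choice of OutOfStock over other availability values such as BackOrder are assumptions for this example.

```python
import json


def product_jsonld(name, url, price, currency, in_stock):
    """Build JSON-LD for a product page, reflecting current availability."""
    availability = (
        "https://schema.org/InStock" if in_stock else "https://schema.org/OutOfStock"
    )
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "url": url,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": availability,
        },
    }
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    # Hypothetical product that is temporarily out of stock.
    print(product_jsonld("Blue T-shirt", "https://example.com/blue-t-shirt", 19.9, "EUR", False))
```

The output would go inside a script tag of type application/ld+json on the product page, and it should be regenerated whenever the stock status changes.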

Products with multiple variations

Number five, products with multiple variations. Okay, so hear me out. Let's say that we have a T-shirt that comes in 20 different colors. Google is not going to index all those pages, because most business owners fail to make those pages unique compared to each other.

So, here we need to ask which of those 20 color variations generates the most traffic, has the most backlinks, or even makes the most money for the business. Which of these colors is the most popular with our customers? As soon as we finish this research, we will take this variation and set it as the canonical URL for all the other color variations.

We also add only this one to the sitemap and leave the rest of them out of it. But hear me out. Here, we need to constantly audit whether this product variation is still the most popular because, for example, in March, the black color might be the most popular, but during August, it might be the pink color. When that changes, we need to update the canonical tags and the sitemap again and adjust our strategy accordingly.
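Picking the canonical variation could be sketched as follows. The dictionary keys are hypothetical, and ranking by revenue first, then traffic, then backlinks is an assumption. The talk lists these signals without ordering them.

```python
from html import escape


def pick_canonical(variations):
    """Return the variation with the strongest signals.

    Assumption: revenue outranks traffic, which outranks backlinks.
    """
    return max(variations, key=lambda v: (v["revenue"], v["traffic"], v["backlinks"]))


def canonical_plan(variations):
    """List the single sitemap URL and the canonical tag for every variation."""
    canonical_url = pick_canonical(variations)["url"]
    tag = f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'
    return {
        "sitemap_urls": [canonical_url],
        "canonical_tags": {v["url"]: tag for v in variations},
    }


if __name__ == "__main__":
    shirts = [
        {"url": "https://example.com/t-shirt-black", "revenue": 900, "traffic": 4000, "backlinks": 12},
        {"url": "https://example.com/t-shirt-pink", "revenue": 1500, "traffic": 3100, "backlinks": 3},
    ]
    print(canonical_plan(shirts))
```

Rerunning this on fresh numbers each month is one way to catch the seasonal shift between colors.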

Errors

Number six, errors. Okay, do you have pages inside your sitemap that are reported as 404? Then, completely remove them from your sitemap.

Do you have pages that are reported with redirection issues? Make sure that those pages redirect correctly to the destination URLs, and then completely remove them from the sitemap.

Do you have pages that are reported with server issues? Then it's time to fix these on your server and watch the number of affected pages in the sitemap gradually decrease.

Do you have pages that are reported as soft 404 pages? Here, review them, because there might be pages that still make sense for your business, so you can optimize them accordingly and keep them inside the sitemap. But if there are pages that you do not need anymore, set up a 301 redirect and remove them completely from the sitemap.
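The error rules in this step could be turned into a triage pass over an exported report. The issue labels, the row format, and the decision to keep pages with server-side problems in the sitemap while they are being fixed are assumptions made for this sketch.

```python
def triage_sitemap_errors(rows):
    """Suggest an action per URL.

    Each row is a dict with the hypothetical keys url, issue, and an
    optional still_needed flag for soft 404 pages.
    """
    plan = []
    for row in rows:
        issue = row["issue"]
        if issue == "not_found":
            action, keep = "remove_from_sitemap", False
        elif issue == "redirect":
            action, keep = "verify_redirect_target_then_remove", False
        elif issue == "server_error":
            # Assumption: the page stays listed while the server is fixed.
            action, keep = "fix_on_server", True
        elif issue == "soft_404":
            if row.get("still_needed", False):
                action, keep = "optimize_page", True
            else:
                action, keep = "redirect_301_then_remove", False
        else:
            action, keep = "review_manually", True
        plan.append({"url": row["url"], "action": action, "keep_in_sitemap": keep})
    return plan


if __name__ == "__main__":
    report = [
        {"url": "/old-product", "issue": "not_found"},
        {"url": "/seasonal-page", "issue": "soft_404", "still_needed": True},
    ]
    for item in triage_sitemap_errors(report):
        print(item)
```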

That's all for today. I hope you enjoyed my strategies. I hope they are helpful. Have a great day.

Transcription by Speechpad

Stevy Liakopoulou

Stevy is an SEO Specialist at Search Magic, specializing in e-commerce websites. She participates in industry conferences both as a speaker and as an attendee, and her goal is to audit at least 300 websites.
