Crawling/indexing of near duplicate product pages
-
Hi,
Hope someone can help me out here. This is the current situation:
We sell stones/gravel/sand/pebbles etc. for gardens. I will take a type of pebbles and the corresponding pages/URL's to illustrate my question --> black beach pebbles.
- We have a 'top' product page for black beach pebbles on which you can find different types of quantities (differing from 20kg untill 1600 kg).
- There is not any search volume related to the different quantities
- The 'top' page does not link to the pages for the different quantities
- The content on the pages for the different quantities is not exactly the same (different price + slightly different content). But a lot of the content is the same.
Current situation:
- Most pages for the different quantities do not have internal links (about 95%)- But the sitemap does contain all of these pages.
- Because the sitemap contains all these URL's, google frequently crawls them (I checked the logfiles) and has indexed them.
Problems:
- Google spends its time crawling irrelevant pages --> our entire website is not that big, so these quantity URL's kind of double the total number of URL's.
- Having url's in the sitemap that do not have an internal link is a problem on its own
- All these pages are indexed so all sorts of gravel/pebbles have near duplicates.
My solution:
- remove these URL's from the sitemap --> that will probably stop Google from regularly crawling these pages
- Putting a canonical on the quantity pages pointing to the top-product page. --> that will hopefully remove the irrelevant (no search volume) near duplicates from the index
My questions:
- To be able to see the canonical, google will need to crawl these pages. Will google still do that after removing them from the sitemap?
- Do you agree that these pages are near duplicates and that it is best to remove them from the index?
- A few of these quantity pages do have intenral links (a few procent of them) because of a sale campaign. So there will be some (not much) internal links pointing to non-canonical pages. Would that be a problem?
Thanks a lot in advance for your help!
Best!
-
Hi Joseph, thanks for your reply, really helpful! 301 is not really an option, because these quantity URL's are sometimes used for promotions and need to be reachable. Therefore I guess canonicals are the second best solution.
We will implement the solution I described and see what will happen. Thanks again!
-
Hello there,
To answer your questions,
1. Google will still crawl your pages even if it's not from the sitemap unless you specify disallow from your robots.txt
2. If they are similar content with the main difference at "quantities" couldn't you consolidate them into one single page that lists all the quantities your company sell in and then 301 redirect the other pages to the consolidated one?
3. It doesn't seem like going to be causing any problem nor hurting your SEO performance, but you could always change these link to the canonical link.
Hope this helps,
Joseph Yap
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Redirect wordpress from /%post_id%/%postname%/ to /blog/%postname%/
Hi what is the code to redirect wordpress blog from site.com/%post_id%/%postname%/ to site.com/blog/%postname%/ We are moving the site to a new server and new url structure. Thanks in advance
Intermediate & Advanced SEO | | Taiger0 -
Links / Top Pages by Page Authority ==> pages shouldnt be there
I checked my site links and top pages by page authority. What i have found i dont understand, because the first 5-10 pages did not exist!! Should know that we launched a new site and rebuilt the static pages so there are a lot of new pages, and of course we deleted some old ones. I refreshed the sitemap.xml (these pages are not in there) and upload it in GWT. Why those old pages appear under the links menu at top pages by page authority?? How can i get rid off them? thx, Endre
Intermediate & Advanced SEO | | Neckermann0 -
Solution to Duplicate Pages within Shopify
Thanks in advance for your time and expertise. I am having issues with duplicate page content and titles on a client's Shopify subdomain. Examples below. Two questions: #1 How can I solve this issue? Do I block the duplicate pages from being crawled? With meta NoIndex? Establish the main page as the canonical version and stop obsessing? Other... #2 Is it a big concern or am I needlessly obsessing? Feels like a concern that needs to be addressed, but maybe not? Duplicate Page Content Examples: #1 URL: http://shop.shopvandevort.com #1 Duplicate URLs: http://shop.shopvandevort.com/collections/all; http://shop.shopvandevort.com/collections/all?page=1 #2 URL: http://shop.shopvandevort.com/collections/accessories #2 Duplicate URLs: http://shop.shopvandevort.com/collections/accessories; http://shop.shopvandevort.com/collections/types?q=Accessories Duplicate Page Title Examples: http://shop.shopvandevort.com/collections/vendors?q=For%20Love%20And%20Lemons http://shop.shopvandevort.com/collections/for-love-lemons http://shopvandevort.com/blog/tag/for-love-and-lemons/ http://shop.shopvandevort.com/collections/for-love-lemons?page=1 Thanks again for taking a look here, very much appreciated.
Intermediate & Advanced SEO | | AaronHurst0 -
Drop in indexed pages!
Hi everybody! I've been working on http://thewilddeckcompany.co.uk/ for a little while now. Until recently, everything was great - good rankings for the key terms of 'bird hides' and 'pond dipping platforms'. However, rankings have tanked over the past few days. I can't point my finger at it yet, but a site:thewilddeckcompany.co.uk search shows only three pages have been indexed. There's only 10 on the site, and it was fine beforehand. Any advice would be much appreciated,
Intermediate & Advanced SEO | | Blink-SEO0 -
How Long Does it Take for Rel Canonical to De-Index / Re-Index a Page?
Hi Mozzers, We have 2 e-commerce websites, Website A and Website B, sharing thousands of pages with duplicate product descriptions. Currently only the product pages on Website B are indexing, and we want Website A indexed instead. We added the rel canonical tag on each of Website B's product pages with a link towards the matching product on Page A. How long until Website B gets de-indexed and Website A gets indexed instead? Did we add the rel canonical tag correctly? Thanks!
Intermediate & Advanced SEO | | Travis-W0 -
Duplicated Pages and Forums
Does duplicate content hurt that particular duplicated content, or the entire site? There are some parts of my site that I don’t care about getting high rankings on search engines. For example, I have a forum and there are certain links that only logged in people can see. If you aren’t logged in, they will take you to a page where it tells u to log in. google, obviously not logged in, interprets this as lots and lots of the same duplicated page. Should I just leave it alone cause I dont care if those pages makes it to search engines. Will it not hurt the entire site? For example, can my homepage search rankings decrase? That leads to my next question. What is the best way to optimize a forum? Whenever someone posts a new post, it seems another url for the same forum thread is created..... which is obviously duplicated….in other words, if like 20 people post on a thread, i believe my site adds 20 urls for that page...anyone know how to fix this?
Intermediate & Advanced SEO | | waltergah0 -
Category Pages up - Product Pages down... what would help?
Hi I mentioned yesterday how one of our sites was losing rank on product pages. What steps do you take to improve the SERPS of product pages, in this case home/category/product is the tree. There isn't really any internal linking, except one link from the category page to each product, would setting up a host of internal links perhaps "similar products" linking them together be a place to start? How can I improve my ranking of these more deeply internal pages? Not just internal links?
Intermediate & Advanced SEO | | xoffie0 -
Domain Authority / Page Authority
I manage a site that has home page authority of 69, and overall domain authority of 63. To improve domain authority, would it help to remove some of the pages that have 0 page authority? There are over 1,000 pages to this site, and I always thought that the more pages you have, the better (generally). But, does it actually hurt the site to have pages that Google perceives as having 0 page authority, or does this have no bearing? Any insight would be greatly appreciated.
Intermediate & Advanced SEO | | DiscoverBoating0