Duplicate Content / Canonical Conundrum on E-Commerce Website
-
Hi all,
I’m looking for some expert advice on use of canonicals to resolve duplicate content for an e-Commerce site. I’ve used a generic example to explain the problem (I do not really run a candy shop).
SCENARIO
I run a candy shop website that sells candy dispensers and the candy that goes in them. I sell about 5,000 different models of candy dispensers and 10,000 different types of candy.
Much of the candy fits in more than one candy dispenser, and some candy dispensers fit exactly the same types of candy as others.
To make things easy for customers who need to fill up their candy dispensers, I provide a “candy finder” tool on my website which takes them through three steps:
1. Pick your candy dispenser brand (e.g. Haribo)
2. Pick your candy dispenser type (e.g. soft candy or hard candy)
3. Pick your candy dispenser model (e.g. S4000-A)
RESULT: The customer is then presented with a list of candy products that they can buy, on a URL like this:
Candy-shop.com/haribo/soft-candy/S4000-A
All of these steps are presented as HTML pages with followable/indexable links.
PROBLEM:
There is a duplicate content issue with the results pages, because a lot of the candy dispensers fit exactly the same candy (e.g. S4000-A, S4000-B and S4000-C). This means that the content on these pages is basically the same, because the same candy products are listed. I’ll call these the “duplicate dispensers”, e.g.:
Candy-shop.com/haribo/soft-candy/S4000-A
Candy-shop.com/haribo/soft-candy/S4000-B
Candy-shop.com/haribo/soft-candy/S4000-C
The page titles/headings change based on the dispenser model, but that’s not enough for Moz to deem the pages unique. I want to drive organic search traffic for the dispenser-model candy keywords, but I’m guessing that duplicate content like this is holding these dispenser pages back from ranking.
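To make the "duplicate dispensers" idea concrete, here is a minimal Python sketch of how such groups could be found from compatibility data. The `fits` mapping, the product IDs and the H200 model are made up for illustration; the real data source is not described in this thread.

```python
from collections import defaultdict

def group_duplicate_dispensers(fits):
    """Group dispenser models whose compatible candy sets are identical.

    `fits` (hypothetical structure) maps a model name to an iterable of
    candy product IDs. Returns one sorted list of models per group that
    contains two or more models.
    """
    groups = defaultdict(list)
    for model, candy_ids in fits.items():
        # frozenset ignores ordering, so the same products in a different
        # order still count as the same result page content.
        groups[frozenset(candy_ids)].append(model)
    return sorted(sorted(models) for models in groups.values() if len(models) > 1)

# Illustrative data only.
fits = {
    "S4000-A": ["gummy-bears", "cola-bottles"],
    "S4000-B": ["cola-bottles", "gummy-bears"],
    "S4000-C": ["gummy-bears", "cola-bottles"],
    "H200": ["lollipops"],
}
print(group_duplicate_dispensers(fits))
```

Each group returned is a set of result pages that would list exactly the same products.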
SOLUTIONS
1. Write unique content for each of the duplicate dispenser pages: Manufacturers add or discontinue about 500 dispenser models each quarter and I don’t have the resources to keep on top of this content. I would also question the real value of this content to a user when it’s pretty obvious what the products on the page are.
2. Pick one duplicate dispenser page to act as the canonical and point rel=canonical on all its duplicates at it. This doesn’t work because dispensers get discontinued, so I run the risk of randomly losing my canonicals, or of them changing as models become unavailable.
3. Create a single page listing all of the duplicate dispensers, and canonical all of the individual duplicate pages to that page.
e.g. Canonical: candy-shop.com/haribo/soft-candy/S4000-Series
Duplicates (which all point to canonical):
candy-shop.com/haribo/soft-candy/S4000-Series?model=A
candy-shop.com/haribo/soft-candy/S4000-Series?model=B
candy-shop.com/haribo/soft-candy/S4000-Series?model=C
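As a rough sketch of option 3 (not a description of any existing implementation), the canonical tag for each duplicate could be derived by dropping the `model` parameter from the URL. The `https` scheme and the function names are assumptions for illustration.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def canonical_url(url, drop_params=("model",)):
    """Return `url` without the query parameters that only select a model."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop_params
    ]
    # Rebuild the URL without a fragment; an empty query adds no "?".
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for a page's <head>."""
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url(url), quote=True))

# The https scheme is assumed; the URLs in the post omit it.
print(canonical_link_tag("https://candy-shop.com/haribo/soft-candy/S4000-Series?model=B"))
```

All three `?model=` variants would then emit the same tag, pointing at the S4000-Series page.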
PROPOSED SOLUTION
Option 3.
Anyone agree/disagree or have any other thoughts on how to solve this problem?
Thanks for reading.
-
Yes, AdWords conversion rates would give you that answer. The budget required depends on many factors, but you can reduce the keyword list by testing a sample of it rather than the complete list.
But at least at a macro level, if you discuss it with someone at your client who knows their market and consumers, you should start getting an idea.
Logic+common sense is a good start.
I would analyze that before starting to change the website.
But if you do it the other way round, you are not going to break anything. Duplicate content is not like a manual penalty: as far as I know, once you fix it and Google crawls the new version, the ranking is updated.
-
Thanks Max, your feedback makes complete sense.
KW volume analysis is a big job but manageable, though I'm not even sure where I'd start with analysing whether people buy or not based on certain organic KWs. I'd probably have to set up AdWords campaigns and test conversion rates? Across a long tail of keywords, getting statistically significant results is going to be expensive.
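To put a rough number on "expensive", here is a back-of-the-envelope Python sketch using the standard sample size approximation for comparing two proportions. The 2% and 3% conversion rates, the 200 keywords, and the 5% significance / 80% power settings are illustrative assumptions, not figures from the site.

```python
from math import ceil
from statistics import NormalDist

def clicks_per_keyword(p1, p2, alpha=0.05, power=0.8):
    """Approximate clicks needed per keyword to tell conversion rate p1 from p2.

    Uses the normal approximation for a two-sided two-proportion test:
    n = (z_alpha/2 + z_beta)^2 * (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Hypothetical inputs: distinguishing a 2% from a 3% conversion rate.
n = clicks_per_keyword(0.02, 0.03)
keywords = 200  # assumed size of the long-tail sample
print(n, "clicks per keyword,", n * keywords, "clicks for", keywords, "keywords")
```

Under these assumptions each keyword needs a few thousand paid clicks, which is why testing a long tail adds up quickly.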
Assuming that I don't have the resources to do that immediately, but that I do have a duplicate content issue (at least Moz seems to think so), am I better off "fixing" it with my proposed solution, or would you hold off until the KW analysis is done? This section of the site gets very little organic traffic at the moment, as it's a very competitive space and it doesn't have many inbound links, so the risk of causing damage is low. I'm reluctant to start promoting this section and linking to it if I know there's a significant underlying duplicate content problem.
You're right about the URL too - it actually starts /Candy-Dispenser-Candies-Refills/*, I didn't think I'd get picked up on that!
Thanks,
George
-
As a rule of thumb I would put the category before the brand in the URL structure. But...
In my opinion there's much more you should research before making a decision.
Have you analyzed your consumers' behavior? What keywords are they going to type into the Google search box?
Are they really searching by candy dispenser brand? By dispenser model name? By brand+model? Or do they not know much about candy dispenser manufacturers and models, and just search by some characteristics?
Don't be tricked by keyword volume. Maybe there are a lot of searches for a brand or model, but what is the intention behind those searches? To buy? To find information while planning to buy? To find information about a product they already bought, having learnt its name after making the purchase?
You should find that out before designing the URL structure, and before deciding how to mitigate the duplicate content risk.
What I mean is: there are characteristics of those dispensers that you want to use to differentiate pages targeting different keywords, and characteristics you can just put all on one page with a “dispenser configurator”.
-
Same scenario on our site: we have a Product Finder search that returns x results based on user criteria. My solution is to canonical-tag the search result pages to the root page, in my case advanced_search.php.
My thought process is this: if somebody is searching for a very specific product, I absolutely don't want them hitting a random search page; I want them to see my product page. This means that the search page is likely crap in the rankings, and that is by design.
There is nothing wrong with trying to capitalize on the search results, but isn't that what your categories and actual product pages are for?
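For anyone wanting to try the same approach, a minimal Python sketch of "canonical every search result page to the root search page" might look like this. The domain and query parameters are placeholders; only advanced_search.php comes from my setup.

```python
from urllib.parse import urlsplit, urlunsplit

def search_root_canonical(url):
    """Return the search page URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

# Placeholder domain and parameters, for illustration only.
url = "https://www.example.com/advanced_search.php?brand=haribo&type=soft"
print(search_root_canonical(url))
```

Every combination of search criteria then points at the same canonical URL, so none of the individual result pages compete with the product pages.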
Hope this helps,
Don