Duplicate Content & Canonicals
-
I am a bit confused about canonicals and whether they are "working" properly on my site. In Webmaster Tools, I'm showing about 13,000 pages flagged for duplicate content, but nearly all of them are showing two pages, one URL as the root and a second with parameters. Case in point, these two are showing as duplicate content:
http://www.gallerydirect.com/art/product/vincent-van-gogh/starry-night
We have a canonical tag on each of the pages pointing to the one without the parameters. Pages with other parameters don't show as duplicates, just one root and one dupe per listing.
So, am I not using the canonical tag properly? It is clearly listed as: Is the tag perhaps not formatted properly (I saw someone somewhere state that there needs to be a /> after the URL, but that seems rather picky for Google)? Suggestions?
-
Thanks, Dr. Pete.
I'll discuss the options with our dev team and see which one will cause the least amount of developer caffeine consumption.
-
Argh... sorry, I didn't even check/see that. Yeah, that may be a real problem - you're basically sending two canonicalization signals that are in conflict. Is there any way to hide the defaults? If the canonicals point to (A), but then (A) redirects to (B), Google may just ignore the canonical.
Unfortunately, your options are to either: (1) hope for the best, (2) canonical to the uglier URL, or (3) kill the redirect and set the default parameters on the server-side (without resetting the URL).
I am primarily seeing the canonical URL in Google's index, so I'm not sure it's actually causing you harm. It's just not an ideal situation.
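To make the conflict concrete, here is a minimal sketch of a check for a canonical URL that itself redirects. The `check_canonical` function and the `REDIRECTS` map are hypothetical names for illustration; the map is filled in by hand rather than fetched, and pairing the root URL with this particular parameterized URL is an assumption based on URLs quoted elsewhere in the thread.

```python
from urllib.parse import urlsplit


def check_canonical(canonical_url, redirects):
    """Follow a hand-built redirect map starting at the canonical URL.

    `redirects` maps a URL to the URL it redirects to. URLs that are not
    keys in the map are treated as final destinations.
    """
    chain = [canonical_url]
    while chain[-1] in redirects:
        next_url = redirects[chain[-1]]
        if next_url in chain:
            break  # redirect loop; stop following
        chain.append(next_url)
    final = chain[-1]
    return {
        "chain": chain,
        "final": final,
        # A canonical that redirects sends two signals that disagree.
        "conflict": canonical_url in redirects,
        "final_has_parameters": bool(urlsplit(final).query),
    }


CANONICAL = "http://www.gallerydirect.com/art/product/vincent-van-gogh/starry-night"
# Assumption: this is the default-parameter URL the root redirects to.
WITH_PARAMS = CANONICAL + "?substrate_id=3&product_style_id=8&frame_id=63&size=25x20"
REDIRECTS = {CANONICAL: WITH_PARAMS}

result = check_canonical(CANONICAL, REDIRECTS)
print(result["conflict"], result["final"])
```

Under option (2), `result["final"]` is the URL the canonical tag would point to instead. Under option (3), the entry would be removed from the redirect map and the conflict would disappear.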
-
Dr. Pete:
I'm looking into it to be sure, but I believe that you are correct in that this is an ad-tracking URL.
A follow-up question:
The URL that is the canonical version of each page would be in the format of
http://www.gallerydirect.com/art/product/vincent-van-gogh/starry-night
However, this exact URL redirects to one with default parameters for substrate, style and frame size:
Should we change our canonical from the first URL (without the parameters) to the second URL with the parameters? Or is that a moot point with Google?
-
While the properly closed tag should have "... />", that's generally only an issue in very isolated cases. I've never seen it interfere with a canonical tag. It's a harmless change to make (and it is more correct), but my gut reaction is that this will make no difference. Google should be honoring these canonicals.
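As a rough illustration of why the closing slash rarely matters, here is a minimal sketch using Python's standard `html.parser`. The `CanonicalFinder` class and `find_canonical` helper are hypothetical names for this example, and a lenient parser like this one is only an assumption about how a crawler might read the tag, not a description of Google's actual parser.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Called for <link ...>; the default handle_startendtag also
        # routes self-closing <link ... /> tags through this method.
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonical(html):
    """Return the list of canonical URLs declared in an HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.canonicals


URL = "http://www.gallerydirect.com/art/product/vincent-van-gogh/starry-night"
unclosed = '<head><link rel="canonical" href="%s"></head>' % URL
self_closed = '<head><link rel="canonical" href="%s" /></head>' % URL

# Both forms yield the same canonical URL with this parser.
print(find_canonical(unclosed) == find_canonical(self_closed))
```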
One odd thing I'm seeing. If I dig into the index, I'm finding the following page:
This may be an ad-tracking URL (?) and it's redirecting somehow (but not with a 301 or 302) to the non-canonical URL. This may be sending a mixed signal, and ideally it would redirect to the canonical version of the URL. I'm not sure where this version is coming from, so it's a bit hard to diagnose.
-
Hi Darin
The tag is not working because if you go into Google and enter the URL: http://www.gallerydirect.com/art/product/vincent-van-gogh/starry-night?substrate_id=3&product_style_id=8&frame_id=63&size=25x20 you will see that it is being indexed on Google.
If it's being indexed, then it runs the risk of duplicate content issues.
The tag definitely does need the /> at the end, so the correct usage of the tag would be: <link rel="canonical" href="http://www.gallerydirect.com/art/product/vincent-van-gogh/starry-night" />
I think if you implement that small change, there shouldn't be any problems.
Hope this helps.