Using robots.txt to deal with duplicate content
-
I have 2 sites with duplicate content issues.
One is a wordpress blog.
The other is a store (Pinnacle Cart).
I cannot edit the canonical tag on either site. In this case, should I use robots.txt to eliminate the duplicate content?
-
It will be any part of the URL that doesn't handle navigation, so look at what you can delete off the URL without breaking the link to the product page.
Take a look at this: http://googlewebmastercentral.blogspot.com/2009/10/new-parameter-handling-tool-helps-with.html
Remember, this will only work with Google!
This is another interesting video from Matt Cutts about removing content from Google: http://googlewebmastercentral.blogspot.com/2008/01/remove-your-content-from-google.html
-
If the urls look like this...
Would I tell Google to ignore p, mode, parent, or CatalogSetSortBy? Just one of those or all of those?
Thanks!!!
-
For Wordpress try : http://wordpress.org/extend/plugins/canonical/
also look at Yoast's Wordpress SEO plugin referenced on that page - I love it!
and for the duplicate content caused by the dymanic content on the pinnacle cart you can use the Google Webmasters tool to tell the Google to ignore certain parameters - go to Site configuration - Settings - Parameter handling and add the variables you wish to ignore to this list.
-
Hi,
The two sites are unrelated to each other so my concern is not duplicate content between the two, there is none.
However, on each of the sites I have the duplicate content issues. I do have admin privileges to both sites.
If there is a Wordpress plug in that would be great. Do you have one that you would recommend?
For my ecommerce site using pinnacle cart, I have duplicates because of the way people can search on the site. For example:
|
http://www.domain.com/accessories/
http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=date
http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=name
http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=price
|
These all show as duplicate content in my webmaster tools reports. I don't have the ability to edit each head tag of pages in order to add a canonical link on this site.
-
What are your intentions here? Do you intend to leave both sites running? Can you give us more information on the sites? Are they aged domains, is one/any/both of them currently attracting any inbound links, are they ranking? What is the purpose of the duplicate content?
Are you looking to redirect traffic from one of the sites to the other using 301 redirect?
Or do you want both sites visible - using the Canonical link tag?
(I am concerned that you say you 'cannot edit the tag'? Do you not have full Admin access to either site?
There are dedicated Canonical management plugins for Wordpress (if you have access to the wp-admin area)
You are going to need some admin priviledges to make any alterations to the site so that you can correct this.
Let us know a bit more please!
These articles may be useful as they provide detailed best practice info on redirects:
http://www.google.com/support/webmasters/bin/answer.py?answer=66359
http://www.seomoz.org/blog/duplicate-content-block-redirect-or-canonical
Check out this article on redirects
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate content and canonicalization confusion
Hello, http://bit.ly/1b48Lmp and http://bit.ly/1BuJkUR pages have same content and their canonical refers to the page itself. Yet, they rank in search engines. Is it because they have been targeted to different geographical locations? If so, still the content is same. Please help me clear this confusion. Regards
Technical SEO | | IM_Learner0 -
Duplicate Content in Dot Net Nuke
Our site is built on Dot Net Nuke. SEOmoz shows a very large amount of duplicate content because at the beginning each page got an extension in the following format: www.domain.com/tabid/110/Default.aspx The site additionally exists without the tabid... part. Our web developer says an easy fix with a canonical tag or 301 redirect is not possible. Does anyone have DNN experience and can point us in the right direction? Thanks, Ricarda
Technical SEO | | jsillay0 -
Duplicate Content
SEOmoz is reporting duplicate content for 2000 of my pages. For example, these are reported as duplicate content: http://curatorseye.com/Name=“Holster-Atlas”---Used-by-British-Officers-in-the-Revolution&Item=4158
Technical SEO | | jplill
http://curatorseye.com/Name=âHolster-Atlasâ---Used-by-British-Officers-in-the-Revolution&Item=4158 The actual link on the site is http://www.curatorseye.com/Name=“Holster-Atlas”---Used-by-British-Officers-in-the-Revolution&Item=4158 Any insight on how to fix this? I'm not sure where the second version of the URL is coming from. Thanks,
Janet0 -
Using Robots.txt
I want to Block or prevent pages being accessed or indexed by googlebot. Please tell me if googlebot will NOT Access any URL that begins with my domain name, followed by a question mark,followed by any string by using Robots.txt below. Sample URL http://mydomain.com/?example User-agent: Googlebot Disallow: /?
Technical SEO | | semer0 -
How much to change to avoid duplicate content?
Working on a site for a dentist. They have a long list of services that they want us to flesh out with text. They provided a bullet list of services, we're trying to get 1 to 2 paragraphs of text for each. Obviously, we're not going to write this off the top of our heads. We're pulling text from other sources and trying to rework. The question is, how much rephrasing do we have to do to avoid a duplicate content penalty? Do we make sure there are changes per paragraph, sentence, or phrase? Thanks! Eric
Technical SEO | | ericmccarty0 -
Robots.txt versus sitemap
Hi everyone, Lets say we have a robots.txt that disallows specific folders on our website, but a sitemap submitted in Google Webmaster Tools that lists content in those folders. Who wins? Will the sitemap content get indexed even if it's blocked by robots.txt? I know content that is blocked by robot.txt can still get indexed and display a URL if Google discovers it via a link so I'm wondering if that would happen in this scenario too. Thanks!
Technical SEO | | anthematic0 -
Does google recognize original content when affiliates use xml-feeds of this content
Hi, Concerning the upcoming (We're from the Netherlands) Panda release: -Could the fact that our affiliates use XML-feeds of our content effect our rankings in some way -Is it possible to indicate to google that content is yours? Kind regards, Dennis Overbeek [email protected] | ACSI publishing | www.suncamp.nl | www.eurocampings.eu
Technical SEO | | SEO_ACSI0 -
Duplicate Content
We have a main sales page and then we have a country specific sales page for about 250 countries. The country specific pages are identical to the main sales page, with the small addition of a country flag and the country name in the h1. I have added a rel canonical tag to all country pages to send the link juice and authority to the main page, because they would be all competing for rankings. I was wondering if having the 250+ indexed pages of duplicate content will effect the ranking of the main page even though they have rel canonical tag. We get some traffic to country pages, but not as much as the main page, but im worried that if we remove those pages and redirect all to main page that we will loose 250 plus indexed pages where we can get traffic through for odd country specific terms. eg searching for uk mobile phone brings up the country specific page instead of main sales page even though the uk sales pages is not optimized for uk terms other than having a flag and the country name in the h1. Any advice?
Technical SEO | | -Al-0