How to publish duplicate content legitimately without Panda problems
-
Let's imagine that you own a successful website that publishes a lot of syndicated news articles and syndicated columnists.
Your visitors love these articles and columns but the search engines see them as duplicate content.
You worry about being viewed as a "content farm" because of this duplicate content and getting the Panda penalty.
So, you decide to continue publishing the content and use...
<meta name="robots" content="noindex, follow">
This allows you do display the content for your visitors but it should stop the search engines from indexing any pages with this code. It should also allow robots to spider the pages and pass link value through them.
I have two questions.....
-
If you use "noindex" will that be enough to prevent your site from being considered as a content farm?
-
Is there a better way to continue publication of syndicated content but protect the site from duplicate content problems?
-
-
Good idea about attributing with rel=canonical.
Thanks!
-
Noindexing the syndicated articles should, in theory, minimize the likelihood of having a Panda problem, but it seems like Panda is constantly evolving. You will probably see some kind of drop in rankings as the number of indexed pages of for site will decrease. If you have say, 1000 pages total on the site and suddenly 900 are taken out of the index, this might be a problem. If it is a much smaller percentage of the site, you might not have a problem at all. Other than the number of indexed pages, I don't think you will have a problem once the syndicated stuff is noindexed.
It will probably take Google a while to re-index/un-index the pages, so hopefully it won't be a fast drop if there is one. In the long run, it is probably better to at least have the appearance of trying to do the right thing. Linking to the source, and maybe using rel=canonical tags to the original article would also be a good practice. -
Thank you, Nick.
We will be using the "noindex" only on the pages with syndicated content. This is a DreamWeaver site and it is easy to place the code on specific pages and does not use excerpts.
Do you still see a potential problem?
The question really is... "Could a site that contains a lot of syndicated content have a Panda problem if the pages that contain that content are noindexed?"
-
I am assuming you intend to use no index only on the duplicate content articles. Using no index on everything would also prevent your content from being indexed and found through Google.
If you are using Wordpress or something else that will allow showing excerpts, you could try making the article pages noindex and show only excerpts on the main page and category pages which would be indexed and followed. I think that would make the articles not appear in searches and avoid duplicate content penalties, while allowing the pages that show the excerpts to still be indexed and rank OK.
The idea here is that the pages showing the excerpts would have enough text to help the home and category pages to rank for the subject matter and hopefully not be seen as what it is - copied content.
You will probably eventually get caught by the Panda, but this may work as a temporary solution until you can get some original content mixed in.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does duplicate content not concern Rand?
Hello all, I'm a new SEOer and I'm currently trying to navigate the layman's minefield that is trying to understand duplicate content issues in as best I can. I'm working on a website at the moment where there's a duplicate content issue with blog archives/categories/tags etc. I was planning to beat this by implementing a noindex meta tag on those pages where there are duplicate content issues. Before I go ahead with this I thought: "Hey, these Moz guys seem to know what they're doing! What would Rand do?" Blogs on the website in question appear in full and in date order relating to the tag/category/what-have-you creating the duplicate content problem. Much like Rand's blog here at Moz - I thought I'd have a look at the source code to see how it was dealt with. My amateur eyes could find nothing to help answer this question: E.g. Both the following URLs appear in SERPs (using site:moz,com and very targeted keywords, but they're there): https://mza.bundledseo.com/rand/does-making-a-website-mobile-friendly-have-a-universally-positive-impact-on-mobile-traffic/ https://mza.bundledseo.com/rand/category/moz/ Both pages have a rel="canonical" pointing to themselves. I can understand why he wouldn't be fussed about the category not ranking, but the blog? Is this not having a negative effect? I'm just a little confused as there are so many conflicting "best practice" tips out there - and now after digging around in the source code on Rand's blog I'm more confused than ever! Any help much appreciated, Thanks
Technical SEO | | sbridle1 -
Duplicate content on charity website
Hi Mozers, We are working on a website for a UK charity – they are a hospice and have two distinct brands, one for their adult services and another for their children’s services. They currently have two different websites which have a large number of pages that contain identical text. We spoke with them and agreed that it would be better to combine the websites under one URL – that way a number of the duplicate pages could be reduced as they are relevant to both brands. What seamed like a good idea initially is beginning to not look so good now. We had planned to use CSS to load different style sheets for each brand – depending on the referring URL (adult / Child) the page would display the appropriate branding. This will will work well up to a point. What we can’t work out is how to style the page if it is the initial landing page – the brands are quite different and we need to get this right. It is not such an issue for the management type pages (board of trustees etc) as they govern both identities. The issue is the donation, fundraising pages – they need to be found, and we are concerned that users will be confused if one of those pages is the initial landing page and they are served the wrong brand. We have thought of making one page the main page and using rel canonical on the other one, but that will affect its ability to be found in the search engines. Really not sure what the best way to move forward would be, any suggestions / guidance would be much appreciated. Thanks Fraser .
Technical SEO | | fraserhannah0 -
Issue with duplicate content
Hello guys, i have a question about duplicate content. Recently I noticed that MOZ's system reports a lot of duplicate content on one of my sites. I'm a little confused what i should do with that because this content is created automatically. All the duplicate content comes from subdomain of my site where we actually share cool images with people. This subdomain is actually pointing to our Tumblr blog where people re-blog our posts and images a lot. I'm really confused how all this duplicate content is created and what i should do to prevent it. Please tell me whether i need to "noindex", "nofollow" that subdomain or you can suggest something better to resolve that issue. Thank you!
Technical SEO | | odmsoft0 -
How to deal with 80 websites and duplicated content
Consider the following: A client of ours has a Job boards website. They then have 80 domains all in different job sectors. They pull in the jobs based on the sectors they were tagged in on the back end. Everything is identical across these websites apart from the brand name and some content. whats the best way to deal with this?
Technical SEO | | jasondexter0 -
Https Duplicate Content
My previous host was using shared SSL, and my site was also working with https which I didn’t notice previously. Now I am moved to a new server, where I don’t have any SSL and my websites are not working with https version. Problem is that I have found Google have indexed one of my blog http://www.codefear.com with https version too. My blog traffic is continuously dropping I think due to these duplicate content. Now there are two results one with http version and another with https version. I searched over the internet and found 3 possible solutions. 1 No-Index https version
Technical SEO | | RaviAhuja
2 Use rel=canonical
3 Redirect https versions with 301 redirection Now I don’t know which solution is best for me as now https version is not working. One more thing I don’t know how to implement any of the solution. My blog is running on WordPress. Please help me to overcome from this problem, and after solving this duplicate issue, do I need Reconsideration request to Google. Thank you0 -
Duplicate Content
Hello All, my first web crawl has come back with a duplicate content warning for www.simodal.com and www.simodal.com/index.htm slightly mystified! thanks paul
Technical SEO | | simodal0 -
Question about duplicate content within my site
Hi. New here to SEOmoz and also somewhat new to SEO in general. A friend has asked me to help do some onsite SEO for their company's website. The company uses Drupal Content Management System. They have a couple product pages that contain a tabbed section for features, accessories, etc. When they built their tabs, they used a Drupal module called Quicktabs, by which each individual tab is created as a separate page and then pulled into the tabs from those pages. So, in essence, you now have instances of repeated content. 1) the page used to create the tab, and 2) the tab that displays on the product page. My question is, how should I handle the pages that were used to create the tabs? Should I make them NOINDEX? Thank you for your advice in advance.
Technical SEO | | aprilm-1890400 -
Duplicate content
I have to sentences that I want to optimize to different pages for. sentence number one is travel to ibiza by boat sentence number to is travel to ibiza by ferry My question is, can I have the same content on both pages exept for the keywords or will Google treat that as duplicate content and punish me? And If yes, where goes the limit/border for duplicate content?
Technical SEO | | stlastla0