How do I use public content without being penalized for duplication?
-
The NHTSA produces a list of all recalls for automobiles. In their "terms of use" it states that the information can be copied. I want to add that to our site, so there is an up-to-date list for our audience to see. However, I'm just copying and pasting. I'm allowed to according to NHTSA, but google will probably flag it right? Is there a way to do this without being penalized?
Thanks,
Ruben
-
I didn't think about other sites, but that's a fabulous point. Best to play it safe.
Thanks for your input!
- Ruben
-
My gut says that your idea to keep the content noindexed is best. Even if the content is unique it borders on the area of auto-generated. Now, I might change your mind if there were a lot of users that were actively interacting with most of these pages. If not though, then you'll end up having a large portion of your site consisting of auto-generated content that doesn't seem useful. Plus...it's also possible that other sites are using this information so you would end up having content that is duplicated on other sites too.
I could be wrong, but my gut says not to try to use this content for ranking purposes.
-
I appreciate the follow up, Marie. Please give me your thoughts on the following idea;
The NHTSA only posts the updates for the past month. If I noindex the page for now (which is what I'm doing) and wait five months, then what would happen? At that point, yes, the current month would be duplicated, but I'd have four months of "unique" content because the NHTSA deletes there's. Plus, I could add pictures of all the automobile, too. Do you think that would be enough to index it?
(I'm most likely going to keep it noindex, because this borders on shady, or at least, I could see google taking it that way, but just as a thought experiment, what do you think?) Or anyone else?
Thanks,
Ruben
-
To expand on EGOL's answer, if you are taking someone else's content (even with their permission) and wanting Google to index it then Google can see that you have a large amount of copied content on your site. This can trigger the Panda filter and can cause Google to consider your whole site as low quality.
You can add a noindex tag as EGOL suggested or you could use a canonical tag to show Google who the originator of the content is, but probably the noindex tag is easiest.
There is one other option as well. If you think it is possible that you can add significant value to the content that is being provided then you can still keep it indexed. If you can combine the recall information with other valuable information then that might be ok to index. But, you have to truly be providing value, not just padding the page with words to make it look unique.
-
Alright, that sounds good. Thanks!
-
I had a bunch of republished articles on my website, done mostly at the request of government agencies and universities. That site got hit in one of the early Panda updates. So, I deleted a lot of that content and to the rest I added this line above the tag
name="robots" content="noindex, follow" />
That tells google not to index the page but to follow the links and allow pagerank to flow through. My site recovered a few weeks later.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Ranking without use of keywords on page & without use of matching anchor text??
Howdy folks. So, here is a dilemma. One of competitors of ours is somehow ranking for a keyphrase "houston chronicle obituaries" without any usage of these keywords on the page, without any full or partial anchor text match ("chronicle" is not used anywhere). The rest of competitiors' rankings make sense. Any ideas?
Intermediate & Advanced SEO | | DmitriiK0 -
Using unique content from "rel=canonical"ized page
Hey everyone, I have a question about the following scenario: Page 1: Text A, Text B, Text C Page 2 (rel=canonical to Page 1): Text A, Text B, Text C, Text D Much of the content on page 2 is "rel=canonical"ized to page 1 to signalize duplicate content. However, Page 2 also contains some unique text not found in Page 1. How safe is it to use the unique content from Page 2 on a new page (Page 3) if the intention is to rank Page 3? Does that make any sense? 🙂
Intermediate & Advanced SEO | | ipancake0 -
Duplicate Content Question
We are getting ready to release an integration with another product for our app. We would like to add a landing page specifically for this integration. We would also like it to be very similar to our current home page. However, if we do this and use a lot of the same content, will this hurt our SEO due to duplicate content?
Intermediate & Advanced SEO | | NathanGilmore0 -
I have search result pages that are completely different showing up as duplicate content.
I have numerous instances of this same issue in our Crawl Report. We have pages showing up on the report as duplicate content - they are product search result pages for completely different cruise products showing up as duplicate content. Here's an example of 2 pages that appear as duplicate : http://www.shopforcruises.com/carnival+cruise+lines/carnival+glory/2013-09-01/2013-09-30 http://www.shopforcruises.com/royal+caribbean+international/liberty+of+the+seas We've used Html 5 semantic markup to properly identify our Navigation <nav>, our search widget as an <aside>(it has a large amount of page code associated with it). We're using different meta descriptions, different title tags, even microformatting is done on these pages so our rich data shows up in google search. (rich snippet example - http://www.google.com/#hl=en&output=search&sclient=psy-ab&q=http:%2F%2Fwww.shopforcruises.com%2Froyal%2Bcaribbean%2Binternational%2Fliberty%2Bof%2Bthe%2Bseas&oq=http:%2F%2Fwww.shopforcruises.com%2Froyal%2Bcaribbean%2Binternational%2Fliberty%2Bof%2Bthe%2Bseas&gs_l=hp.3...1102.1102.0.1601.1.1.0.0.0.0.142.142.0j1.1.0...0.0...1c.1.7.psy-ab.gvI6vhnx8fk&pbx=1&bav=on.2,or.r_qf.&bvm=bv.44442042,d.eWU&fp=a03ba540ff93b9f5&biw=1680&bih=925 ) How is this distinctly different content showing as duplicate? Is SeoMoz's site crawl flawed (or just limited) and it's not understanding that my pages are not dupe? Copyscape does not identify these pages as dupe. Should we take these crawl results more seriously than copyscape? What action do you suggest we take? </aside> </nav>
Intermediate & Advanced SEO | | JMFieldMarketing0 -
Duplicate content in Webmaster tools, is this bad?
We launched a new site, and we did a 301 redirect to every page. I have over 5k duplicate meta tags and title tags. It shows the old page and the new page as having the same title tag and meta description. This isn't true, we changed the titles and meta description, but it still shows up like that. What would cause that?
Intermediate & Advanced SEO | | EcommerceSite0 -
How to Remove Joomla Canonical and Duplicate Page Content
I've attempted to follow advice from the Q&A section. Currently on the site www.cherrycreekspine.com, I've edited the .htaccess file to help with 301s - all pages redirect to www.cherrycreekspine.com. Secondly, I'd added the canonical statement in the header of the web pages. I have cut the Duplicate Page Content in half ... now I have a remaining 40 pages to fix up. This is my practice site to try and understand what SEOmoz can do for me. I've looked at some of your videos on Youtube ... I feel like I'm scrambling around to the Q&A and the internet to understand this product. I'm reading the beginners guide.... any other resources would be helpful.
Intermediate & Advanced SEO | | deskstudio0 -
I need to add duplicate content, how to do this without penalty
On a site I am working on we provide a landing page summary (say top 10 information snippets) and provide a link 'see more' to take viewers to a page with all the snippets. Now those first 10 snippets will be repeated in the full list. Is this going to be a duplicate content problem? If so, any suggestions.
Intermediate & Advanced SEO | | oznappies0 -
Mobile version creating duplicate content
Hi We have a mobile site which is a subfolder within our site. Therefore our desktop site is www.mysite.com and the mobile version is www.mysite.com/m/. All URL's for specific pages are the same with the exception of /m/ in them for the mobile version. The mobile version has the specific user agent detection capabilities. I never saw this as being duplicate content initially as I did some research and found the following links
Intermediate & Advanced SEO | | peterkn
http://www.youtube.com/watch?v=mY9h3G8Lv4k
http://searchengineland.com/dont-penalize-yourself-mobile-sites-are-not-duplicate-content-40380
http://www.seroundtable.com/archives/022109.html What I am finding now is that when I look into Google Webmaster Tools, Google shows that there are 2 pages with the same Page title and therefore Im concerned if Google sees this as duplicate content. The reason why the page title and meta description is the same is simply because the content on the 2 verrsions are the exact same. Only layout changes due to handheld specific browsing. Are there any speficific precausions I could take or best practices to ensure that Google does not see the mobile pages as duplicates of the desktop pages Does anyone know solid best practices to achieve maximum results for running an idential mobile version of your main site?1