Remove Scraped Content?
-
I work for a site that is not the top result when you search Google for a snippet of text from its own content. I believe what happened is that they wrote blog posts and articles and added them to their own site and to article directories at the same time, and the article directories got cached first.
If we're not coming up first for our article, that means we are not believed to be the original author, correct?
Should I remove all content from our site where this is happening, even though we actually did create these articles?
-
I explained the answer to this in the second part of my original post.
-
I would hope the copies link back to your site wherever possible. If not, the page should at least carry its creation and last-updated dates where Google can see them. I would not leave anything to guesswork, though: make sure you have the links, and I would even show the date each post was published on the post itself, the way news articles do. It is just another indicator.
I would not remove the content if it did, in fact, originate from you.
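For anyone who wants to check this across many distributed copies, here is a rough sketch of how the "does the copy link back to us" check could be automated. It uses only Python's standard library and works on HTML you have already saved; the function name `find_backlinks`, the sample markup, and the `example.com` domain are all made up for illustration, and it says nothing about how Google itself treats those links.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class BacklinkFinder(HTMLParser):
    """Collects href values of <a> tags that point at a given domain."""

    def __init__(self, domain):
        super().__init__()
        self.domain = domain.lower()
        self.backlinks = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        host = (urlparse(href).hostname or "").lower()
        # Match the bare domain and any subdomain such as www.
        if host == self.domain or host.endswith("." + self.domain):
            self.backlinks.append(href)


def find_backlinks(html, domain):
    """Return every link in `html` whose host belongs to `domain`."""
    finder = BacklinkFinder(domain)
    finder.feed(html)
    return finder.backlinks


# Hypothetical copy of an article as it might appear on an article directory.
sample_copy = """
<article>
  <p>Five ways to keep a pool clean...</p>
  <p>Originally published at
     <a href="https://www.example.com/blog/pool-care">example.com</a>.</p>
  <a href="https://directory.example.net/more">More articles</a>
</article>
"""

print(find_backlinks(sample_copy, "example.com"))
```

With the sample above, only the `www.example.com` link is reported. An empty list for a given copy would mean that copy carries no link back to the original, which is the case worth chasing up with the directory.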
-
Yes, it was intentionally distributed. I would like to know whether the duplicate content on our site is being seen (by Google) as copied, not original, scraped, pulled from another source because we're so lazy we can't come up with any material of our own??
If this is the case, I will be removing the content, as the quality of the content sucks and there is quite a bit of it. Please, do not respond "if the content sucks, then why have it on your site..."
-
The term "scraped content" is most often used for content that has been grabbed from your website by a visiting robot.
Based upon your posting, the duplicate content that you are talking about was intentionally distributed.
-
Then how do you determine if Google is seeing content as scraped? As you know, Google has made it very clear recently how they feel about scraped content.
-
If we're not coming up first for our article, that means we are not believed to be the original author, correct?
Search engines cannot identify original authors (unless you use the rel="author" attribute, and then they are merely taking your word for it). They only know which page with the content was discovered first. The content could have been on other pages first, or it could have been published first offline. Search engines don't have divine powers.
The page that ranks first in the SERPs is the one that has the best combination of relevance, domain authority and other ranking factors. It has nothing to do with authorship.
Should I remove all content from our site where this is happening, even though we actually did create these articles?
I would not do that if the content is valuable for your visitors, has acquired links from other sites or if the content is pulling traffic from search.
The take-away from this is not to give your content away if you want to rank for it in search. Giving it away can create strong competitors and feed existing competitors.