Remove Scraped Content?
-
There is a site I work for whose content does not come up as the top result when I search Google for a snippet of its own text. I believe what happened is that they wrote blogs and articles, published them on their site, and submitted them to article directories at the same time, and the article directories got crawled and indexed first.
If we're not coming up first for our article, that means we are not believed to be the original author, correct?
Should I remove all content from our site where this is happening, even though we actually did create these articles?
-
I explained the answer to this in the second part of my original post.
-
I would hope you had a link, where possible, back to your site. If not, the page should at least carry a creation date and last-updated date that Google can see. I wouldn't leave anything to guesswork, though: make sure you have links, and I would even put the publication date on the post itself, the way news articles do. It's just another indicator.
I would not remove the content if, in fact, it did originate from you.
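To illustrate the dating advice above, here is a minimal sketch of a post with both a visible date and machine-readable dates, using schema.org Article microdata (the headline, dates, and body below are hypothetical placeholders, not from the site in question):

```html
<!-- Hypothetical article page: a human-visible date plus
     machine-readable schema.org properties crawlers can parse. -->
<article itemscope itemtype="https://schema.org/Article">
  <h1 itemprop="headline">Example Article Headline</h1>
  <!-- Visible publication date, as news articles display it -->
  <p>Published:
    <time itemprop="datePublished" datetime="2011-06-15">June 15, 2011</time>
  </p>
  <!-- Last-update date, exposed to crawlers only -->
  <meta itemprop="dateModified" content="2011-08-02">
  <div itemprop="articleBody">
    ...article text...
  </div>
</article>
```

This doesn't prove authorship by itself, but it gives search engines one more explicit signal about when your copy of the content was published.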
-
Yes, it was intentionally distributed. What I would like to know is whether Google sees the duplicate content on our site as copied, not original, scraped, or pulled from another source because we were too lazy to come up with material of our own.
If that is the case, I will remove the content, as the quality of it is poor and there is quite a bit of it. Please do not respond with "if the content sucks, then why have it on your site..."
-
The term "scraped content" most often refers to content that has been grabbed from your website by a visiting robot.
Based upon your posting, the duplicate content that you are talking about was intentionally distributed.
-
Then how do you determine whether Google is treating content as scraped? As you know, Google has made it very clear recently how they feel about scraped content.
-
If we're not coming up first for our article, that means we are not believed to be the original author, correct?
Search engines cannot identify original authors (unless you use the rel="author" attribute, and even then they are merely taking your word for it). They only know which page containing the content was discovered first. The content could have appeared on other pages first, or it could have been published offline first. Search engines don't have divine powers.
The page that ranks first in the SERPs is the one with the best combination of relevance, domain authority, and other ranking factors. It has nothing to do with authorship.
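For reference, the rel="author" attribute mentioned above is ordinary HTML link markup. A minimal sketch (the profile URL and author name are placeholders):

```html
<!-- Option 1: a link element in the page head pointing at
     the author's profile page. The href is hypothetical. -->
<head>
  <link rel="author" href="https://example.com/about/jane">
</head>

<!-- Option 2: an inline byline link within the article itself -->
<p>By <a rel="author" href="https://example.com/about/jane">Jane Author</a></p>
```

Either form only asserts authorship; as noted above, search engines take your word for it rather than verifying it.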
Should I remove all content from our site where this is happening, even though we actually did create these articles?
I would not do that if the content is valuable to your visitors, has acquired links from other sites, or is pulling traffic from search.
The take-away is not to give your content away if you want to rank for it in search. Giving it away can create strong competitors and feed existing ones.