What tools do you use to find scraped content?
-
This hasn’t been an issue for our company so far, but I like to be proactive. What tools do you use to find sites that may have scraped your content?
Looking forward to your suggestions.
Vic
-
Oh, this belongs to a different thread: http://moz.com/community/q/chinese-site-ranking-for-our-brand-name-possible-hack
-
Is this part of the original conversation, or something else? Which sites are these?
-
I'm not sure we have been scraped as such though, because the site in question has different content.
It looks as though the offending site has hacked another site (which redirects to the offending site) but the hacked site is ranking for our brand name. Our homepage has lost all rankings it had (our category and product pages seem fine) and has essentially disappeared.
Can anyone else shed any light?
-
Siteliner (Copyscape's big brother) is really great, and it's what we use first. I also have a bookmarklet for it that makes it faster and easier to use.
I also use Linda's method of searching for a bit of content in quotes. It's the easiest way to show an ecommerce client how much work they're going to require: paste three product descriptions into Google, watch the magic, and explain that the same would happen across all 15,000 products.
-
I spot check on a regular basis by taking a unique chunk out of a post, putting it in quotes, and doing a Google search on it. It's not comprehensive, but it is free. [And the main problems we have had with scrapers have been with sites that have taken huge portions of our content, not just an article or two, and a spot check roots those out.]
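If you do this spot check often, the quoted search can be built with a few lines of Python. This is only a rough sketch of the manual method described above, not a tool anyone in this thread mentioned: the function names, the 12-word chunk length, and the optional `-site:` exclusion of your own domain are my assumptions. It only builds the search URL; you still open it in a browser and read the results yourself.

```python
from typing import Optional
from urllib.parse import quote_plus


def pick_chunk(text: str, n_words: int = 12, start: int = 0) -> str:
    """Take n_words consecutive words from the post, beginning at word index `start`.

    The default of 12 words is an arbitrary assumption; pick a passage that is
    distinctive enough to be unique to your post.
    """
    words = text.split()
    return " ".join(words[start:start + n_words])


def exact_phrase_search_url(chunk: str, exclude_site: Optional[str] = None) -> str:
    """Build a Google search URL that looks for the chunk as an exact phrase.

    exclude_site is a hypothetical convenience: it appends a -site: operator
    so your own domain is left out of the results.
    """
    # Drop double quotes inside the chunk so they do not end the phrase early.
    phrase = chunk.replace('"', "")
    query = f'"{phrase}"'
    if exclude_site:
        query += f" -site:{exclude_site}"
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    post = (
        "I spot check on a regular basis by taking a unique chunk out of a post, "
        "putting it in quotes, and doing a Google search on it."
    )
    chunk = pick_chunk(post, n_words=10, start=8)
    print(exact_phrase_search_url(chunk, exclude_site="example.com"))
```

Like the manual version, this is not comprehensive: it checks one passage at a time, so it is best at catching sites that have copied large portions of your content.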
-
Thanks, Chris & Jonathan. I will look into Copyscape. Good stuff!
-
Yep, Copyscape is what I use. I use a WordPress plugin that calls the Copyscape API, and I check my main content every month or so with a single click.
-
Copyscape works well for us. You can scan a couple of pages for free, and then it's $0.05/page after that.