Duplicate Content - Bulk analysis tool?
-
Hi
I wondered if there's a tool to analyse duplicate content, either within your own site or on external sites, that lets you upload the URLs you want to check in bulk?
I used Copyscape a while ago, but I don't remember it having a bulk feature.
Thank you!
-
Great, thank you!
I'll give both a go!
-
Great, thanks.
Yes, I use Screaming Frog for this, but my question was about the actual page content. So yes, I want to see if other sites copy our content, but also whether we need to update our product content, as some products are very similar.
I'll check the batch process on Copyscape, thanks!
-
I have not used this tool in this way, but I have used it for other crawler projects related to content cleanup and it is rock solid. They have been very responsive to my questions about using the software. http://urlprofiler.com/
Duplicate content search is the next project on my list. Here is how they do it:
http://urlprofiler.com/blog/duplicate-content-checker/
You let URL Profiler crawl the section of your site that is most likely to be copied (say, your blog) and you tell it which section of your HTML to compare against (for example, the main content area rather than the header or footer). URL Profiler then uses proxies (you have to buy the proxies) to run Google searches on sentences from your content. It crawls those results to see whether any site in the Google SERPs contains sentences from your content word for word (or pretty close).
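To make the "search Google for sentences from your content" step more concrete, here is a rough Python sketch of the first half of that idea only: picking a few long sentences from a page's content text and wrapping them in quotes as exact-match queries. This is not URL Profiler's actual code. The function names `pick_sentences` and `to_exact_query`, the `min_words` and `limit` parameters, and the "longest sentences first" heuristic are all assumptions made for illustration, and the sketch does not perform any searches itself.

```python
import re


def pick_sentences(text, min_words=8, limit=3):
    """Return up to `limit` of the longest sentences in `text`.

    Assumption: longer sentences are less likely to match other pages
    by coincidence, so they make better exact-match probes.
    """
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    candidates = [s.strip() for s in sentences if len(s.split()) >= min_words]
    # Stable sort: sentences of equal length keep their original order.
    candidates.sort(key=lambda s: len(s.split()), reverse=True)
    return candidates[:limit]


def to_exact_query(sentence):
    """Wrap a sentence in double quotes to form an exact-phrase query string."""
    cleaned = sentence.replace('"', "").rstrip(".!?")
    return f'"{cleaned}"'


if __name__ == "__main__":
    content = (
        "Short intro. This mobility scooter folds flat in seconds and fits "
        "in the boot of most small cars. It also comes with a two year warranty."
    )
    for sentence in pick_sentences(content):
        print(to_exact_query(sentence))
```

You would paste the resulting quoted strings into a search engine by hand, or feed them to whatever tool you use for the lookups.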
I have played with Copyscape, but my markets are too niche for it to work for me. URL Profiler's logic is that you are searching the database that matters most: Google.
Good luck!
-
I believe you might be able to use List Mode in Screaming Frog to accomplish this, but it ultimately depends on what kind of duplicate content you want to check for. Do you simply want to find duplicate titles or duplicate descriptions? Or do you want to find pages whose text is similar enough to warrant concern?
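For the second case (pages with sufficiently similar text), one common approach is to compare pages by overlapping word sequences ("shingles") and score each pair with Jaccard similarity. The Python sketch below illustrates that technique under stated assumptions: it is not how Screaming Frog or Copyscape work internally, it expects you to have already extracted each page's body text into a dict keyed by URL, and the names `shingles`, `jaccard`, `similar_pairs` and the default `threshold` and `size` values are made up for this example.

```python
import re
from itertools import combinations


def shingles(text, size=5):
    """Return the set of overlapping `size`-word sequences in `text`."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pairs(pages, threshold=0.5, size=5):
    """Compare every pair of pages and return those scoring >= threshold.

    `pages` maps URL -> extracted body text (an assumed input format).
    """
    sets = {url: shingles(text, size) for url, text in pages.items()}
    pairs = []
    for u, v in combinations(sorted(sets), 2):
        score = jaccard(sets[u], sets[v])
        if score >= threshold:
            pairs.append((u, v, round(score, 2)))
    return pairs
```

Comparing every pair is fine for a few hundred product pages; for very large URL lists the number of pairs grows quickly and you would want something smarter.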
== Ooops! ==
It didn't occur to me that you were more interested in duplicate content caused by other sites copying your content rather than duplicate content among your list of URLs.
Copyscape does have a "Batch Process" tool, but it is only available to paid subscribers. It does work quite nicely, though.