Many Duplicate Content Flags
-
Not sure about you all, but I’m loving the new Moz Site Crawler. However, I've noticed that it is identifying a huge number of pages as duplicate content.
There are about 30,000 pages on this website, so we’ve had to build many templates to make the site scalable. Additionally, a URL rule was lost, which caused a significant number of duplicate pages to be created. I am working through the Moz crawl tool to identify duplicate pages, but I'm noticing that many pages under “Affected Pages” are actually unique content pages whose initial content is duplicated.
I read that Moz flags any pages with 90% or more overlapping content or code. My theory is that some templates are too similar, to the point that Moz reads them as duplicates. Has this happened to anyone else?
In addition, if Moz is flagging these similar pages as duplicate content, should we surmise that Googlebot is having the same issue? We have seen ranking issues for the actual duplicate pages but haven't experienced issues across the unique pages. They are hyperlocal pages, so we are able to see rankings quite easily.
-
Hey there,
Sam from Moz's Help Team here!
You're correct - our tool has a 90% tolerance for duplicate content, which means it will flag any pages that share 90% or more of the same code. This includes all the source code on the page, not just the viewable text.
You can run your own checks for percentage similarity using this tool: http://smallseotools.com/similar-page-checker/.
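For anyone who would rather script a rough check, here is a minimal sketch in Python. It is not Moz's actual algorithm, which isn't described in this thread. It simply compares the full source of two pages with the standard library's `difflib` and applies the 90% figure as an assumed threshold. The `build_page` helper and the sample pages are hypothetical stand-ins for real page source.

```python
from difflib import SequenceMatcher

# Assumed threshold, taken from the 90% figure quoted in this thread.
DUPLICATE_THRESHOLD = 0.90


def similarity_ratio(source_a: str, source_b: str) -> float:
    """Return a 0.0-1.0 similarity score for two page sources.

    autojunk=False stops difflib from ignoring frequently repeated
    characters, which matters for long, repetitive HTML.
    """
    return SequenceMatcher(None, source_a, source_b, autojunk=False).ratio()


def is_flagged_duplicate(source_a: str, source_b: str,
                         threshold: float = DUPLICATE_THRESHOLD) -> bool:
    """True when the two sources meet or exceed the similarity threshold."""
    return similarity_ratio(source_a, source_b) >= threshold


def build_page(template_items: int, body: str) -> str:
    """Hypothetical page: a shared nav/footer template around unique body text."""
    nav = "<nav>" + "<a href='/x'>menu item</a>" * template_items + "</nav>"
    footer = "<footer>" + "<a href='/y'>footer link</a>" * template_items + "</footer>"
    return f"<html><body>{nav}<main>{body}</main>{footer}</body></html>"


# Heavy template, one short unique line per page.
template_heavy_a = build_page(40, "<p>Plumbers in Springfield</p>")
template_heavy_b = build_page(40, "<p>Plumbers in Shelbyville</p>")

# Light template, long unique copy per page.
content_heavy_a = build_page(1, "<p>" + "Springfield plumbing guide. " * 40 + "</p>")
content_heavy_b = build_page(1, "<p>" + "Shelbyville roofing prices. " * 40 + "</p>")

for label, a, b in [
    ("template-heavy", template_heavy_a, template_heavy_b),
    ("content-heavy", content_heavy_a, content_heavy_b),
]:
    print(f"{label}: {similarity_ratio(a, b):.1%} flagged={is_flagged_duplicate(a, b)}")
```

If a template-heavy pair scores above the threshold even though the visible copy differs, that would be consistent with the templates, rather than the unique content, driving the flags.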
If we're identifying two pages as having duplicate content, it's likely that Google will run into the same issue. You can read a little more about duplicate content, and how to resolve it, on our resource page here: https://mza.bundledseo.com/learn/seo/duplicate-content
Let us know if we can help with anything else!
Related Questions
-
Need help fixing a duplicate content issue for my website. The Moz crawl is showing my website with https:// and https://www., but I have never used the URL https:// so I don’t understand why Moz is showing this
Moz is showing my URL with two different starts: https:// and the one I use, https://www. The problem is I don’t think I have ever used the URL without the www. at the start. How do I fix this?
Moz Bar | | jdp_uk0 -
Trying to duplicate Screaming Frog report
I am trying to use the crawl report to generate the Screaming Frog report shown below. I want to use it to calculate the internal page rank as Paul Shapiro outlined in this article. Google is not seeing my site the way I want it to, and I want to work on the site hierarchy and structure. I am not seeing where Type, Anchor, and Alt are listed in the crawl report - are they called something else? Is there another report I can run for the whole site to generate the data above? Does the Moz PA already calculate the internal PR of pages? Are there any additional resources or people that could help me with this? Thanks, SAM
Moz Bar | | SammyT0 -
How can I find duplicate pages from a Moz Crawl?
We have many duplicate pages that show up on the Moz Crawl, and we're trying to fix these but it's very difficult because I can't see a way to isolate the code where the duplicate is found. For instance, http://experiencemission.org/immersion/ is one of our main pages, and the crawl shows one duplicate of http://experiencemission.org/immersion. It appears that one of our staff manually edited the source code in one of our pages but forgot the trailing slash. This would be an easy fix but the problem is that this page is linked to internally on our website 2423 times, so it's next to impossible to find the code that is incorrect. We have many other pages with this same basic problem. We know we have duplicates, but it's next to impossible to isolate them. So my question is this: When viewing the Moz Crawl data is there any way to see where a specific duplicate page link is located on our website? Thanks for any and all help!
Moz Bar | | expmission0 -
What is Considered Duplicate Content by Crawlers?
I am asking this because I have a couple of site audit tools that I use every week to crawl a site I work on, and they are showing duplicate content issues (which I know there are a lot of on this site), but some of what is flagged as duplicate content makes no sense. For example, the following URLs were grouped together as duplicate content: https://www.firefold.com/contact-us, https://www.firefold.com/gabe, and https://www.firefold.com/sale. How are these pages duplicate content? I am confused about what site audit tools consider duplicate content. Just FYI, this is data from Moz crawl diagnostics, but SEMrush site auditor is giving me the same type of data. Any help would be greatly appreciated. Ryan
Moz Bar | | RyanRhodes0 -
Many more 404 being reported in GWT than MA
Hi, I have been submitting MA crawl reports to a client's developers after going live with a new site migration, instructing them to set up 301s for any 404s still being reported, which they have now done (despite my instructing them not to go live until all old URLs were mapped and 301'd to the new replacement page, or to the HP if there was no replacement). 1) When I look at GWT crawl errors, there has been a spike since going live, with 660 'page not found' errors reported compared to 11 404s in MA. Could there be a lag in GWT reporting, so that they have already been dealt with but GWT has not yet updated this report, and the MA report is more accurate? Should I wait and see, mark them as fixed and see if they return tomorrow, or tell the dev to investigate immediately? I have checked some sample links and they are going to 404-type pages, so I presume they are still broken and an urgent issue the dev must fix immediately? 2) Approximately how long does it take for page authority to be transferred via a 301 redirect to a new page? I see that some category pages that had good PA and have been 301'd to new category URLs are now showing a PA of just 1! Cheers, Dan
Moz Bar | | Dan-Lawrence0 -
Crawl errors: duplicate content even with canonical links
Hi, I am getting some duplicate content errors in my crawl report for some of our products: www.....com/brand/productname1.html www.....com/section/productname1.html www.....com/productname1.html. We have a canonical in the header for all three pages: <link rel="canonical" href="www....com productname1.html">
Moz Bar | | phes0 -
Dupe content report showing in 'Errors' section when surely it should be in 'Warnings' section?
Why is the dupe content info showing in errors and not warnings? If dupe content can get your site penalised (as per Panda) or, worse, banned, surely it should be in that section of reports? Cheers, Dan
Moz Bar | | Dan-Lawrence0 -
Moz Dupe content crawl anomaly
Hi, Moz has completed a crawl for a site I'm working on, which also has a development area (hence lots of dupe content) on a subdomain (and this dev area hasn't been hidden from crawlers via password, robots, GWT, etc.). The Moz dupe content report is not showing any of these URLs, though, even though my campaign setting is on 'root' domain, so I would have thought the report should list the subdomain URLs as dupe content (because they are dupe content). Any ideas? Cheers, Dan
Moz Bar | | Dan-Lawrence0