Crawl Diagnostics Report
-
I'm a bit concerned about the results I'm getting from the Crawl Diagnostics Report.
I've added canonical URLs across the site to remove duplicate content, and when I check the pages they all show the right values. However, the report, which has just finished crawling, still lists a lot of pages as duplicate content.
Simple example:
Both of them are in the duplicate content section although both have canonical url set as:
Does each crawl check the entire site from the beginning or just the pages it didn't have a chance to crawl the last time?
This is just one of 333 pages flagged as duplicate content, all of which have a canonical URL pointing to the right page.
Can someone please explain?
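One way to confirm, independently of the report, which canonical URL a page actually declares is to parse its HTML directly. The following is only a minimal sketch using Python's standard library; the `CanonicalParser` and `find_canonicals` names and the example.com URL are made up for illustration and have nothing to do with how the Moz crawler works.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, so split before comparing.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return all canonical URLs declared in the given HTML string."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals


# Hypothetical page source; in practice this would be the fetched HTML of a flagged URL.
sample = '<html><head><link rel="canonical" href="https://example.com/page"></head></html>'
print(find_canonicals(sample))
```

If two flagged pages both return the same single canonical URL here, the tags themselves are in place and the question is only how the report presents them.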
-
Yep!
That's why I really like the CSV files: you can sort and filter them down to exactly what you want to see.
-
Hi Kenny,
Thanks for getting back to me.
So is it just the way it's reported on the page, and not an actual duplicate content problem?
-
Hi Sebastian,
Sorry for the confusion. Our software currently reports those URLs as having both duplicate content and canonical tags. I find that the best way to view this information is to export your Crawl Diagnostics CSV file. You can find the export option in the upper right of the Crawl Diagnostics page.
Kenny
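For anyone who prefers a script to a spreadsheet, the filtering described above could look roughly like this. It is a sketch only: the column names (`URL`, `Duplicate Page Content`, `Rel Canonical`) and the sample rows are assumptions for illustration, so check them against the header row of your own export before relying on it.

```python
import csv
import io

# Assumed column names; adjust to match the header row of the actual export.
URL_COL = "URL"
DUPLICATE_COL = "Duplicate Page Content"
CANONICAL_COL = "Rel Canonical"


def duplicates_with_canonical(csv_file):
    """Return (url, canonical) pairs for rows flagged as duplicate that also have a canonical."""
    rows = []
    for row in csv.DictReader(csv_file):
        flag = (row.get(DUPLICATE_COL) or "").strip().lower()
        canonical = (row.get(CANONICAL_COL) or "").strip()
        if flag in {"true", "yes", "1"} and canonical:
            rows.append((row[URL_COL], canonical))
    # Sort by canonical target so pages pointing at the same URL are grouped together.
    return sorted(rows, key=lambda pair: (pair[1], pair[0]))


# Hypothetical data standing in for an exported file.
sample = io.StringIO(
    "URL,Duplicate Page Content,Rel Canonical\n"
    "https://example.com/a?x=1,TRUE,https://example.com/a\n"
    "https://example.com/a,TRUE,https://example.com/a\n"
    "https://example.com/b,FALSE,https://example.com/b\n"
    "https://example.com/c,TRUE,\n"
)
for url, canonical in duplicates_with_canonical(sample):
    print(url, "->", canonical)
```

With a real export, pass an open file instead of the `io.StringIO` sample, for example `open("crawl.csv", newline="", encoding="utf-8")`, where the file name is again just a placeholder.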
Related Questions
-
Website cannot be crawled
I have received the following message from Moz on a few of our websites now: "Our crawler was not able to access the robots.txt file on your site. This often occurs because of a server error from the robots.txt. Although this may have been caused by a temporary outage, we recommend making sure your robots.txt file is accessible and that your network and server are working correctly. Typically errors like this should be investigated and fixed by the site webmaster." I have spoken with our webmaster and they have advised the below: The robots.txt file is definitely there on all pages and Google is able to crawl for these files. Moz however is having some difficulty with finding the files when there is a particular redirect in place. For example, the page currently redirects from threecounties.co.uk/ to https://www.threecounties.co.uk/ and when this happens, the Moz crawler cannot find the robots.txt on the first URL and this generates the reports you have been receiving. From what I understand, this is a flaw with the Moz software and not something that we could fix from our end. Going forward, something we could do is remove these rewrite rules to www., but these are useful redirects and removing them would likely have SEO implications. Has anyone else had this issue, and is there anything we can do to rectify it, or should we leave it as is?
Moz Pro | | threecounties0 -
The crawl report shows a lot of 404 errors
They are inactive products, and I can't find any active links to these product pages. How can I tell where the crawler found the links?
Moz Pro | | shopwcs0 -
Link report that is broken down by C Block?
I've tried to do this in the advanced reports area of Moz, but to no avail. I just want to be able to see all the links (and anchor text would be nice too) for each C-block.
Moz Pro | | DeluxeCorp0 -
Crawl Report Warnings
How much notice should be paid to the warnings on the SEOmoz crawl reports? We manage a fairly large property site and a lot of the errors on the crawl reports relate to automated responses. As a matter of priority, which of the items below will have negative effects with the search engines? Temporary Redirect; Too Many On-Page Links; Overly-Dynamic URL; Title Element Too Long (> 70 Characters); Title Missing or Empty; Duplicate Page Content; Duplicate Page Title; Missing Meta Description Tag.
Moz Pro | | SoundinTheory0 -
Crawl Diagnostics - unexpected results
I received my first Crawl Diagnostics report last night on my dynamic ecommerce site. It showed errors on generated URLs which simply are not produced anywhere when running on my live site, only when running on my local development server. It appears that the crawler doesn't think that it's running on the live site. For example, http://www.nordichouse.co.uk/candlestick-centrepiece-p-1140.html will go to a Product Not Found page, and therefore Duplicate Content errors are produced. Running http://www.nhlocal.co.uk/candlestick-centrepiece-p-1140.html produces the correct product page and not a Product Not Found page. Any thoughts?
Moz Pro | | nordichouse0 -
SEOmoz Report: Competitive Domain Analysis
Hi there, According to the SEOmoz Dashboard's Competitive Domain Analysis, the 'Total Links' for our domain shot up (more than quadrupled) in the last few weeks. We haven't really done much link building in over 6 months, so I'm not sure how to explain this. Did SEOmoz make updates to their bots? Could it be that the bots are finding all these links only now? We've been using SEOmoz reporting for almost a year now, and I've never seen this before. Some insight / explanation for this would be great. Thanks in advance,
Moz Pro | | RBA
Gemma0 -
How long is a full crawl?
It has now been over 3 days that the dashboard for one of our campaigns has shown "Next Crawl in Progress!". I am not complaining about the length... but I have to agree that SEOmoz is quite addictive, and it's quite frustrating to see that every day 🙂 Thanks
Moz Pro | | jgenesto0 -
Why does my crawl report show just one page result?
I just ran a crawl report on my site: http://dozoco.com. The resulting report shows results for just one page, the home page, and no other pages. The report doesn't indicate any errors or "do not follows", so I'm unclear on the issue, although I suspect user error - mine.
Moz Pro | | b1lyon0