Crawl Diagnostics Report Lacks Information
-
When I look at the crawl diagnostics, SEOMoz tells me there are 404 errors.
This is understandable, because some pages were removed.
What this report doesn't tell me is how those pages were discovered.
This is a very important piece of information, because it would tell me there are links pointing to those pages, either internal or external. I believe the internal links have been removed.
If the report told me how it found the link, I would be able to take immediate action. Without that information, I have to do a lot of investigation. And when you have a million pages, that isn't easy.
Some possibilities:
- The crawler remembered the page from the previous crawl.
- There was a link from an index page - i.e. it is in the database still
- There was an individual link from another story - so now there are broken links
- Ditto, but it is on a static index page
- The link was from an external source - I need to make a redirect
Am I missing something, or is this a feature the SEOMoz crawler doesn't have yet?
What can I do (other than check all my pages) to discover this?
-
OK thank you, Ralph
I can work on that.
-
I think it's the SEOMoz crawler, but what I have found is that the error reports here are limited, whereas the GWT report is much bigger and shows the links leading to each error. My guess is that SEOMoz limits the number of crawl errors it shows because of limits set on its crawler: while its crawl is comprehensive, it's not going to capture everything Google does.
-
Thank you Ralph.
Yes, had it for years. So is this a GWT report? I thought it was SEOMoz!
No, not IIS, Linux.
-
If you download the CSV file for the crawl, you can sort it by HTTP status to group all of the 404 errors together. A specific column in that file contains the referrer, which is the information you are after.
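If the export is too big to sort comfortably in a spreadsheet, the same filtering can be scripted. Here is a rough Python sketch. The column names ("URL", "HTTP Status Code", "Referrer") are assumptions, not the confirmed headers of the export, so check the first row of your own file and adjust them. The sample data is made up purely for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names: check the header row of your own export.
STATUS_COL = "HTTP Status Code"
URL_COL = "URL"
REFERRER_COL = "Referrer"


def referrers_for_404s(csv_file):
    """Map each URL that returned 404 to the set of referrers listed for it."""
    broken = defaultdict(set)
    for row in csv.DictReader(csv_file):
        if (row.get(STATUS_COL) or "").strip() == "404":
            broken[row[URL_COL]].add((row.get(REFERRER_COL) or "").strip())
    return dict(broken)


# Tiny made-up sample standing in for the downloaded crawl file.
sample = io.StringIO(
    "URL,HTTP Status Code,Referrer\n"
    "http://example.com/old-story,404,http://example.com/index\n"
    "http://example.com/home,200,\n"
    "http://example.com/old-story,404,http://example.com/archive\n"
)

result = referrers_for_404s(sample)
for url, refs in result.items():
    print(url, "<-", sorted(refs))
```

For a real file, replace the sample with something like open("crawl.csv", newline="", encoding="utf-8") (the file name is a placeholder). Grouping by broken URL rather than just sorting makes it easier to see whether a dead page is linked from one index page or from many individual stories.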
-
This may be a silly question, but have you got Google Webmaster Tools set up? That will show you the source of the errors.
If your site is on IIS then you should also use the awesome IIS SEO toolkit provided by Microsoft for free.