Odd crawl test issues
-
Hi all, first post, be gentle...
Just signed up for Moz in the hope that it, and the learning, will help me improve my web traffic. I have managed to hit a bit of woe already with one of the sites we have added to the tool: I cannot get the crawl test to do any actual crawling. I've tried to add the domain three times now, but the initial crawl of a few pages (the automatic one when you add a domain to Pro) will not work for me.
Instead of getting a list of problems with the site, I have a list of 18 pages where it says 'Error Code 902: Network Errors Prevented Crawler from Contacting Server'. Being a little puzzled by this, I checked the site myself... no problems. I asked several people in different locations (and countries) to have a go, and there were no problems for them either. I ran the same site through the Raven Tools site auditor and got some results; it crawled a few thousand pages. I ran the site through Screaming Frog with the Googlebot user agent, and again no issues. I just tried Fetch as Googlebot in WMT and all was fine there.
I'm very puzzled then as to why Moz is having issues with the site when everything else is happy with it. I know the homepage takes 7 seconds to load (caching is off at the moment while we tweak the design), but all the other pages take an average of 0.72 seconds to load, according to SF.
The site is a Magento one, so we have a lengthy robots.txt, but that is not causing problems for any of the other services. The robots.txt is below.
# Google Image Crawler Setup
User-agent: Googlebot-Image
Disallow:
# Crawlers Setup
User-agent: *
# Directories
Disallow: /ajax/
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
#Disallow: /js/
#Disallow: /lib/
Disallow: /magento/
#Disallow: /media/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /scripts/
Disallow: /shell/
Disallow: /skin/
Disallow: /stats/
Disallow: /var/
Disallow: /catalog/product
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
#Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /catalog/product/gallery/
# Files
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt
# Paths (no clean URLs)
#Disallow: /.js$
#Disallow: /.css$
Disallow: /.php$
Disallow: /?SID=
# Pagination
Disallow: /?dir=
Disallow: /&dir=
Disallow: /?mode=
Disallow: /&mode=
Disallow: /?order=
Disallow: /&order=
Disallow: /?p=
Disallow: /&p=
If anyone has any suggestions, I would welcome them, whether about the tool or my robots.txt. As a side note, I'm aware that we are blocking the individual product pages. There are too many products on the site at the moment (250k plus) with manufacturer default descriptions, so we have blocked them and are working on getting the category pages and guides listed. In time we will rewrite the most popular products and unblock them as we go.
Many thanks
Carl
-
Thanks for the hints re the robots, will tidy that up.
-
Network errors can occur anywhere between us and your site, not necessarily at your server itself. The best bet would be to check with your ISP for any connectivity issues to your server. Since this is the first time these errors have been reported, the next crawl may be more successful.
One thing, though: you will want to keep your user-agent directives in a single block without blank lines between them.
So:
# Crawlers Setup
User-agent: *

# Directories
Disallow: /ajax/
Disallow: /404/
Disallow: /app/
would need to look like:
# Crawlers Setup
User-agent: *
# Directories
Disallow: /ajax/
Disallow: /404/
Disallow: /app/
-
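The point in the reply above about keeping the user-agent block free of blank lines can be tried locally. The following is a minimal sketch using Python's standard-library `urllib.robotparser`, which treats a blank line as the end of a record. It only shows how that one parser behaves; Moz's crawler may parse the file differently. The `is_allowed` helper, the `rogerbot` user-agent token as used here, and the `www.example.com` URL are illustrative assumptions, not taken from the thread.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A blank line separates "User-agent" from the rules that should belong to it.
WITH_BLANK_LINE = """\
# Crawlers Setup
User-agent: *

# Directories
Disallow: /ajax/
"""

# Same directives kept together in a single block.
SINGLE_BLOCK = """\
# Crawlers Setup
User-agent: *
# Directories
Disallow: /ajax/
"""

# Placeholder URL (assumption); substitute a real path from the site.
URL = "http://www.example.com/ajax/"

print(is_allowed(WITH_BLANK_LINE, "rogerbot", URL))
print(is_allowed(SINGLE_BLOCK, "rogerbot", URL))
```

With this parser, the blank line detaches the `Disallow` rule from its `User-agent` line, so the first call reports the path as allowed and the second reports it as blocked. Other crawlers may be more forgiving, which could explain why one tool reads a file differently from another.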
Many thanks for the reply. The server we use is a dedicated server which we set up ourselves, including the OS and control panel. It just seems very odd that every other tool is working fine but Moz won't. I cannot see why it would need anything different from, say, Raven's site crawler.
I will check out those other threads, though, to see if I missed anything. Thanks for the links.
Just checked port 80 using http:// www.yougetsignal. com/tools/open-ports/ (not sure if links allowed) and no problems there.
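For anyone wanting to repeat the port 80 check without a third-party site, here is a minimal sketch using Python's standard `socket` module. The `check_port` helper and the hostname in the usage comment are hypothetical names for illustration. Note that it only tests the route from the machine it runs on, so a success here does not rule out a network problem between Moz's crawler and the server.

```python
import socket
import time


def check_port(host: str, port: int = 80, timeout: float = 5.0) -> tuple[bool, float]:
    """Attempt a TCP connection and return (reachable, seconds_taken)."""
    start = time.monotonic()
    try:
        # create_connection resolves the hostname and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        reachable = False
    return reachable, time.monotonic() - start


# Usage (hostname is a placeholder):
#   ok, seconds = check_port("www.example.com", 80)
#   print(f"port 80 reachable: {ok} ({seconds:.2f}s)")
```

Running the same check from a few different networks gives a rough idea of whether the problem depends on where the request comes from.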