Our crawler was not able to access the robots.txt file on your site
-
Hello Mozzers!
I've received an error message saying the site can't be crawled because Moz is unable to access the robots.txt. I've spoken to the webmaster and he can't understand why the robots.txt can't be accessed by Moz.
The file is at https://www.thefurnshop.co.uk/robots.txt, and Google isn't flagging anything up to us.
Does anyone know how to solve this problem?
Thanks
-
@LoganRay This was our issue. We didn't know Moz tries to retrieve the HTTP robots.txt first. Our HTTPS redirect was failing for static files only, so the HTTP path to the robots.txt was failing. We did not notice it because the HSTS policy was forcing the browser to redirect.
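For anyone wanting to reproduce this outside the browser: request the plain-HTTP URL with a client that neither follows redirects nor applies HSTS. Below is a minimal sketch using only Python's standard library. The function names and the `robots-check` user agent are made up for illustration, and the idea that the crawler starts from http:// comes from the post above, not from Moz documentation.

```python
import http.client
from urllib.parse import urlsplit


def plain_http_target(url):
    """Return (host, path) for the plain-HTTP version of a URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.hostname, path


def describe(status, location):
    """Summarise one response the way a non-browser client sees it."""
    if 300 <= status < 400:
        return f"redirect to {location}"
    if status == 200:
        return "served directly"
    return f"failed with HTTP {status}"


def check_http_robots(url, timeout=10):
    """Fetch the http:// version once on port 80, without following redirects."""
    host, path = plain_http_target(url)
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        # Hypothetical user agent; a real crawler sends its own.
        conn.request("GET", path, headers={"User-Agent": "robots-check"})
        resp = conn.getresponse()
        return describe(resp.status, resp.getheader("Location"))
    finally:
        conn.close()

# Example (performs a real network request):
# print(check_http_robots("https://www.thefurnshop.co.uk/robots.txt"))
```

A browser that has already seen the HSTS header rewrites http:// to https:// before sending anything, which is why the failure stays invisible there. `http.client` has no such behaviour, so it shows what the server actually returns on port 80.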
-
Wanted to jump back in on this topic as I've just confirmed my initial suspicion.
I just added a new client to our Moz account and had the exact same issue: the crawler was unable to access the robots.txt file. It's a secure site and was configured in Moz without the HTTPS. When I go to the robots.txt file without https://www, it redirects the same way yours does, with the / between the TLD and page path removed.
Reconfigure your site and it should begin to work.
-
There are 2 parts of your robots.txt that could be causing this, and it all depends on how each bot reads the wildcard patterns in your robots.txt:
First, your Disallow: /? can be read as disallow all paths starting with "/", with 0 to infinity characters "*" and one character "?". Try replacing this part with Disallow: /*? to stop it crawling anything with a query string (which is what I believe you were going for).
Second, you have an open Disallow followed by the User-agent: rogerbot, and while this should not be read this way, once again it all depends on how each bot reads the commands. To fix this, you should change the Disallow following your Googlebot-Image to Disallow: /
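To see how the two rules differ, here is a rough sketch of a matcher under the common wildcard convention, where "*" matches any run of characters, a trailing "$" anchors the end, and everything else is literal. Whether a particular crawler follows exactly this convention is an assumption, and the helper names and sample paths are made up for illustration.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regular expression.

    Assumes '*' is a wildcard, a trailing '$' anchors the end of the path,
    and every other character (including '?') is taken literally.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_disallowed(rule, path):
    """True if `path` falls under a `Disallow: <rule>` line."""
    if not rule:  # an empty Disallow blocks nothing
        return False
    # re.match anchors at the start, so rules behave as path prefixes.
    return rule_to_regex(rule).match(path) is not None


# Hypothetical paths, for illustration only.
for rule in ("/?", "/*?", "", "/"):
    for path in ("/?sort=price", "/sofas?sort=price", "/sofas"):
        print(repr(rule), path, is_disallowed(rule, path))
```

Under this reading, a rule of /? only catches query strings on the homepage, while /*? catches a query string on any path. An empty Disallow blocks nothing, and Disallow: / blocks everything for that user agent.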
-
Hi there,
There's something odd going on when I try to access your robots.txt file without the www. The www gets added back on, but when it does, the slash between the TLD and page path gets deleted, see below. I'm guessing your domain in Moz is configured without the www, which means RogerBot is getting redirected to this slash-less version of the file.
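If you want to check a redirect chain like this yourself, a small script can follow it one hop at a time and flag a target that has lost its path. This is only a sketch with the standard library: `lost_path` and `trace` are illustrative names, and the slash-less URL in the comment is a reconstruction from the description above, not a copy of the actual response.

```python
import http.client
from urllib.parse import urlsplit, urljoin


def lost_path(location, expected_path="/robots.txt"):
    """True if a redirect target no longer points at the expected path."""
    return urlsplit(location).path != expected_path


def trace(url, max_hops=5, timeout=10):
    """Follow redirects one hop at a time, returning (status, url) pairs."""
    hops = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                    else http.client.HTTPConnection)
        conn = conn_cls(parts.netloc, timeout=timeout)
        try:
            conn.request("GET", parts.path or "/")
            resp = conn.getresponse()
            status, location = resp.status, resp.getheader("Location")
        except OSError:
            hops.append((None, url))  # e.g. the host name does not resolve
            break
        finally:
            conn.close()
        hops.append((status, url))
        if not (300 <= status < 400 and location):
            break
        url = urljoin(url, location)  # Location may be relative
    return hops

# Example (performs real network requests):
# for status, hop in trace("http://thefurnshop.co.uk/robots.txt"):
#     print(status, hop, "<- path lost" if lost_path(hop) else "")
```

When the slash is dropped, the file name ends up glued to the host (something like www.thefurnshop.co.ukrobots.txt), so the parsed path is empty and the next hop usually fails to resolve at all.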