Our crawler was not able to access the robots.txt file on your site
-
Hello Mozzers!
I've received an error message saying the site can't be crawled because Moz is unable to access the robots.txt. I've spoken to the webmaster and he can't understand why the robots.txt can't be accessed by Moz. The file is here:
https://www.thefurnshop.co.uk/robots.txt
Google isn't flagging anything up to us either.
Does anyone know how to solve this problem?
Thanks
-
@LoganRay This was our issue. Didn't know Moz tries to retrieve the HTTP robots.txt first. Our HTTP-to-HTTPS redirect was failing for static files only, so the plain HTTP request for robots.txt was failing. We did not notice it because the HSTS policy was forcing the browser to switch to HTTPS before the request was ever sent.
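For anyone wanting to reproduce this outside a browser, here is a minimal sketch, assuming Python 3 and only the standard library. `http.client` does not follow redirects and knows nothing about HSTS, so it shows what a crawler asking for the plain HTTP URL would get back. The function names `fetch_plain_http` and `diagnose` and the host `www.example.com` are placeholders for illustration, not anything Moz provides.

```python
import http.client


def fetch_plain_http(host, path="/robots.txt"):
    """Request `path` over plain HTTP and return (status, Location header).

    http.client never follows redirects and ignores HSTS, so the result is
    the raw first response rather than what a browser ends up showing.
    """
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("GET", path, headers={"User-Agent": "robots-check-sketch"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def diagnose(status, location):
    """Summarise the raw HTTP response in one short phrase."""
    if status == 200:
        return "served over HTTP"
    if status in (301, 302, 307, 308):
        if location and location.startswith("https://"):
            return "redirects to HTTPS"
        return "redirects elsewhere"
    return "fails over HTTP"


# Example call (needs network access; replace the placeholder host):
# status, location = fetch_plain_http("www.example.com")
# print(status, location, diagnose(status, location))
```

If the result is "fails over HTTP" while the HTTPS URL loads fine in a browser, that matches the situation described above.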
-
Wanted to jump back in on this topic as I've just confirmed my initial suspicion.
I just added a new client to our Moz account and had the exact same issue: the crawler was unable to access the robots.txt file. It's a secure site and was configured in Moz without the HTTPS. When I go to the robots.txt file without https://www, it redirects to the same thing as yours, where the / between the TLD and page path gets removed.
Reconfigure your site in Moz with the full https://www version of the URL and it should begin to work.
-
There are 2 parts of your robots.txt that could be causing this, and it all just depends on how each bot is reading regular expressions in your robots.txt:
First, your Disallow: /? can be read as "disallow all paths starting with "/", followed by 0 to infinity characters ("*") and one character "?"". Try replacing this part with Disallow: /*? so that nothing with a query string gets crawled (which is what I believe you were going for).
Second, you have an open (empty) Disallow: directly before User-agent: rogerbot. It should not be read as applying to rogerbot, but once again it all depends on how each bot reads the commands. To fix this, change the Disallow under your Googlebot-Image entry to Disallow: /
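To see how the two patterns differ, here is a minimal sketch in Python. It assumes the wildcard convention described in RFC 9309 and Google's robots.txt documentation ("*" matches any run of characters, a trailing "$" anchors the end, everything else is literal, and an empty Disallow blocks nothing). Individual bots may parse things differently, and this says nothing about how rogerbot actually behaves. The function names are made up for the example.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed convention: "*" matches any run of characters, a trailing "$"
    anchors the end of the path, and every other character is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.compile(regex)


def is_blocked(pattern, path):
    """Return True if a Disallow rule with `pattern` would block `path`."""
    if not pattern:
        # An empty Disallow blocks nothing.
        return False
    # re.match anchors at the start of the path, like a prefix match.
    return robots_pattern_to_regex(pattern).match(path) is not None


for rule in ("/?", "/*?"):
    for path in ("/?page=2", "/shop?page=2", "/shop"):
        print(rule, path, is_blocked(rule, path))
```

Under this convention, Disallow: /? only blocks paths that literally begin with "/?", while Disallow: /*? blocks any path that contains a query string.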
-
Hi there,
There's something odd going on when I try to access your robots.txt file without the www. The www gets added back on, but when it does, the slash between the TLD and page path gets deleted. I'm guessing your domain in Moz is configured without the www, which means RogerBot is getting redirected to this slash-less version of the file.
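A small sketch of how a redirect target like that could be spotted automatically, assuming Python 3 and a Location header value you have already captured (for example with curl -I or your browser's network tab). The helper name `redirect_lost_slash` and the example.co.uk URLs are placeholders, not the real redirect from this site.

```python
from urllib.parse import urlsplit


def redirect_lost_slash(location, path="/robots.txt"):
    """Return True if `location` looks like host and path glued together.

    When the slash is dropped, the file name ends up parsed as part of the
    host name and the URL has no path at all.
    """
    parts = urlsplit(location)
    tail = path.lstrip("/")
    return parts.path == "" and parts.netloc.endswith(tail)


print(redirect_lost_slash("https://www.example.co.ukrobots.txt"))
print(redirect_lost_slash("https://www.example.co.uk/robots.txt"))
```

A crawler that follows the first kind of redirect ends up asking for a host name that does not exist, so the robots.txt request fails even though the file itself is fine.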
Related Questions
-
Can't Crawl Site - but deducting crawls.
Why am I being deducted crawls if MOZ keeps telling me that it can't crawl my site?
Getting Started | | BloggyMoms1 -
Moz site crawl doesn't work
The Moz site crawl isn't working for my campaign, but works for the site's on demand crawl. The search should not be disallowed by robots.txt or the headers. I'd like to be able to track the website for the campaign so I can see SEO gains / losses and increases / decreases in indexing.
Getting Started | | DrainKing0 -
Why does the MOZ bar not work on SERPs with site links?
Maybe a dumb question (am a beginner), but MOZ bar shows no information on SERP results with site links? I'm sure there's a reason? What is it?
Getting Started | | RionHaber0 -
Why is Moz unable to crawl my site?
Was hoping someone could advise why Moz is unable to crawl my site at https://www.oceaniacruises.com. The message reads: "Moz was unable to crawl your site on Oct 5, 2017. Our crawler was banned by a page on your site, either through your robots.txt, the X-Robots-Tag HTTP header, or the meta robots tag. Update these tags to allow your page and the rest of your site to be crawled. If this error is found on any page on your site, it prevents our crawler (and some search engines) from crawling the rest of your site. Typically errors like this should be investigated and fixed by the site webmaster." Any help would be appreciated. Thanks!
Getting Started | | jbarinaga0 -
Error Code 612: Error response for robots.txt
Hi, We are getting Error Code 612: Error response for robots.txt in our crawl but everything looks to be ok with the robots file. Can you confirm what is wrong? Thanks
Getting Started | | david.weston0 -
How do i stop Moz from indexing my dev site?
Hey, I am new to MOZ, and I am seeing in my page reports that I have duplicate content, but this duplicate content is mostly my development site. dev.domain.com. I have this blocked in robots.txt for google. How do i stop MOZ including it in its reports? Cheers everyone.
Getting Started | | Tholomew1
Bart0 -
How do get Moz to spider a Development site PRE LAUNCH?
Hi, Does anyone know how we could get Moz to browse a development site before launch? But without Google and other engines indexing it? Thanks
Getting Started | | bjs20100 -
Whenever I try to access campaigns in moz pro I get an error page
I recently signed-up for a new pro account. As I was adding my first subdomain everything was fine until I was asked to link to GA, when I clicked yes I got this error message: 403 Forbidden Now every time I click on set-up campaign I get taken to a page with nothing but the 403 Forbidden text.
Getting Started | | Toptal0