Our crawler was not able to access the robots.txt file on your site
-
Hello Mozzers!
I've received an error message saying the site can't be crawled because Moz is unable to access the robots.txt file. I've spoken to the webmaster, and he can't understand why Moz can't access it. The file is here:
https://www.thefurnshop.co.uk/robots.txt
Google isn't flagging anything up to us.
Does anyone know how to solve this problem?
Thanks
-
@LoganRay This was our issue. We didn't know Moz tries to retrieve the HTTP robots.txt first. Our HTTPS redirect was failing for static files only, so the HTTP request for robots.txt failed. We did not notice because the HSTS policy was forcing the browser onto HTTPS.
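A browser that has already seen the HSTS policy will hide this kind of failure, so it can help to request the file from a script instead. The following is only a minimal sketch using Python's standard library: it asks for robots.txt over plain HTTP and over HTTPS and reports the raw status without following redirects. The function names and the User-Agent string are made up for the example, and it does not claim to reproduce how the Moz crawler actually behaves.

```python
import http.client
from urllib.parse import urlsplit


def fetch_without_redirects(url, timeout=10):
    """Request url once and return (status, Location header or None)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        # "robots-check-sketch" is an arbitrary User-Agent chosen for this example.
        conn.request("GET", parts.path or "/",
                     headers={"User-Agent": "robots-check-sketch"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(status, location):
    """Summarise one response in plain words."""
    if 200 <= status < 300:
        return "served directly"
    if 300 <= status < 400:
        return f"redirects to {location}"
    return f"fails with HTTP {status}"


def check(host, path="/robots.txt"):
    """Print how the file responds over HTTP and HTTPS (makes network requests)."""
    for scheme in ("http", "https"):
        url = f"{scheme}://{host}{path}"
        try:
            status, location = fetch_without_redirects(url)
            print(url, "->", describe(status, location))
        except (OSError, http.client.HTTPException) as exc:
            print(url, "-> connection error:", exc)
```

Calling `check("www.thefurnshop.co.uk")` would print one line per scheme. If the HTTP line shows a failure or an odd redirect while the HTTPS line is fine, that matches the situation described above.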
-
Wanted to jump back in on this topic as I've just confirmed my initial suspicion.
I just added a new client to our Moz account and had the exact same issue: the crawler was unable to access the robots.txt file. It's a secure site and was configured in Moz without the HTTPS. When I go to the robots.txt file without https://www, it redirects the same way yours does, with the / between the TLD and page path removed.
Reconfigure your site in Moz with the full https://www address and it should begin to work.
-
There are two parts of your robots.txt that could be causing this, and it all depends on how each bot interprets the patterns in your robots.txt:
First, your Disallow: /? can be read as "disallow all paths starting with / followed by zero or more characters and one ?". Try replacing this part with Disallow: /*? to stop bots crawling anything with a query string (which is what I believe you were going for).
Second, you have an open Disallow followed by User-agent: rogerbot. While this should not be misread, once again it all depends on how each bot reads the commands. To fix this, change the Disallow following your Googlebot-Image line to Disallow: /
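To see how the two rules differ, here is a minimal sketch that assumes the wildcard semantics described in RFC 9309 and Google's robots.txt documentation: `*` matches any run of characters, a trailing `$` marks the end of the URL, and every other character, `?` included, is literal. Individual crawlers (rogerbot among them) may read patterns differently, so treat it as an approximation. `rule_matches` and the sample paths are invented for the example.

```python
import re


def rule_matches(pattern, path):
    """Return True if a non-empty robots.txt path pattern matches a URL path.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL, everything else (including '?') is literal,
    and the pattern is compared from the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece and let '*' become "match anything".
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    return re.match(regex, path) is not None


# Hypothetical URL paths, not taken from the site in question.
print(rule_matches("/?", "/?page=2"))         # True
print(rule_matches("/?", "/sofas?page=2"))    # False
print(rule_matches("/*?", "/sofas?page=2"))   # True
print(rule_matches("/*?", "/sofas"))          # False
```

Under these assumptions `Disallow: /?` only covers URLs whose path begins with `/?`, while `Disallow: /*?` covers any URL that contains a query string.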
-
Hi there,
There's something odd going on when I try to access your robots.txt file without the www. The www gets added back on, but when it does, the slash between the TLD and page path gets deleted. I'm guessing your domain in Moz is configured without the www, which means RogerBot is getting redirected to this slash-less version of the file.
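A redirect like this can be spotted by inspecting the `Location` header of the response. The sketch below assumes the broken target looks like the host and the path glued together (for example `www.thefurnshop.co.ukrobots.txt`); that exact value is an assumption for illustration, and `slash_dropped` is a made-up helper name.

```python
from urllib.parse import urljoin, urlsplit


def strip_www(hostname):
    """Drop a leading 'www.' so www and non-www hosts compare equal."""
    return hostname[4:] if hostname.startswith("www.") else hostname


def slash_dropped(requested_url, location):
    """Return True if the redirect target has host and path run together."""
    requested = urlsplit(requested_url)
    # Resolve relative Location values against the requested URL.
    target = urlsplit(urljoin(requested_url, location))
    host = strip_www(requested.hostname or "")
    # What the hostname would look like if the "/" went missing.
    glued = host + requested.path.lstrip("/").lower()
    return glued != host and strip_www(target.hostname or "") == glued


# The Location values below are assumed examples, not captured responses.
print(slash_dropped("http://thefurnshop.co.uk/robots.txt",
                    "https://www.thefurnshop.co.ukrobots.txt"))   # True
print(slash_dropped("http://thefurnshop.co.uk/robots.txt",
                    "https://www.thefurnshop.co.uk/robots.txt"))  # False
```

A `True` result means the crawler is being sent to a hostname that does not exist, so the robots.txt request can never succeed until the redirect rule is fixed.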