Crawl error robots.txt
-
Hello, when we try to access Site Crawl to analyze our page, the following error appears:
**Moz was unable to crawl your site on Nov 15, 2017.** Our crawler was banned by a page on your site, either through your robots.txt, the X-Robots-Tag HTTP header, or the meta robots tag. Update these tags to allow your page and the rest of your site to be crawled. If this error is found on any page on your site, it prevents our crawler (and some search engines) from crawling the rest of your site. Typically errors like this should be investigated and fixed by the site webmaster.
Can you help us?
Thanks!
-
@Linda-Vassily yes
-
The page is https://frizzant.com/ and it doesn't have a noindex.
-
Thanks Linda and Tawny! I'll check it.
-
Hey there!
This is a tricky one — the answer to these questions is almost always specific to the site and the Campaign. For this Campaign, it looks like your robots.txt file returned a 403 forbidden response to our crawler: https://www.screencast.com/t/f42TiSKp
Do you use any kind of DDoS protection software? That can give our tools trouble and leave us unable to access the robots.txt file for your site.
I'd recommend checking with your web developer to make sure that your robots.txt file is accessible to our user-agent, rogerbot, and returning a 200 OK status for that user-agent. If you're still having trouble, it'll be easier to assist you if you contact us through [email protected], where we can take a closer look at your account and Campaign directly.
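If it helps, here is a rough Python sketch of the check described above: request robots.txt while sending a crawler-like User-Agent and look at the status code that comes back. The function names are made up for illustration, and the bare string "rogerbot" is an assumption standing in for whatever full User-Agent the Moz crawler actually sends. A firewall may also filter by IP address rather than User-Agent, so a 200 here does not guarantee the crawler gets through.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


def robots_request(site_url: str, user_agent: str = "rogerbot") -> urllib.request.Request:
    """Build a GET request for the site's robots.txt with the given User-Agent.

    "rogerbot" is a simplified placeholder, not the crawler's full header value.
    """
    return urllib.request.Request(
        urljoin(site_url, "/robots.txt"),
        headers={"User-Agent": user_agent},
    )


def robots_status(site_url: str, user_agent: str = "rogerbot", timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for robots.txt."""
    request = robots_request(site_url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses, e.g. a 403 Forbidden.
        return err.code
```

Calling `robots_status("https://frizzant.com/", "rogerbot")` and then `robots_status("https://frizzant.com/", "Mozilla/5.0")` lets you compare the two results. If the first returns 403 and the second returns 200, something on the server is probably filtering on the User-Agent.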
-
I just popped that into ScreamingFrog and I don't see a noindex on that page, but I do see it on some other pages. (Though that shouldn't stop other pages from being crawled.)
Maybe it was just a glitch that happened to occur at the time of the crawl. You could try doing another crawl and see if you get the same error.
-
The page is http://www.yogaenmandiram.com/ and it doesn't have a noindex.
-
Hmm. How about on the page itself? Is there a noindex?
-
Yes, our robots.txt is very simple:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
-
That just says that you are blocking the Moz crawler. Take a look at your robots.txt file and see if you have any exclusions in there that might cause that page not to be crawled. (Try going to yoursite.com/robots.txt or you can learn more about this topic here.)
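One way to look for exclusions without reading the file by eye is Python's standard `urllib.robotparser`. The sketch below is only an illustration: it feeds in the three rules quoted in this thread and asks whether a given user-agent may fetch a URL. The helper name `is_allowed` and the "rogerbot" agent string are assumptions for the example.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The homepage is not matched by any Disallow rule.
print(is_allowed(ROBOTS_TXT, "rogerbot", "http://www.yogaenmandiram.com/"))
# /wp-admin/ is matched by "Disallow: /wp-admin/".
print(is_allowed(ROBOTS_TXT, "rogerbot", "http://www.yogaenmandiram.com/wp-admin/"))
```

With these rules the homepage is reported as allowed and /wp-admin/ as blocked, so the file itself does not exclude the page. Python's parser applies the first matching rule in file order, so its answer for the admin-ajax.php path may differ from crawlers that prefer the most specific rule.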
-
Sorry, the image doesn't appear.
Try now.
-
It looks like the error you are referring to did not come through in your question. Could you try editing it?