Crawler triggering Spam Throttle and creating 4xx errors
-
Hey Folks,
We have a client with an experience I want to ask about.
The Moz crawler is reporting 4xx errors. These happen because the crawler is triggering my client's spam throttling. They could raise the throttle limit from 240 to 480 page loads per minute, but that could open the door to spam as well.
Any thoughts on how to proceed?
Thanks! Kirk
-
Thank you Dave!
-
Hey Kirk! We built our crawler to obey robots.txt crawl-delay directives. In the future, if this is ever an issue, you can use the crawl delay to slow Rogerbot down to a more reasonable speed. However, we don't recommend adding a crawl delay larger than 10, or Rogerbot might not be able to finish crawling your site.
Just add a crawl delay directive to your robots.txt file like this:
User-agent: rogerbot
Crawl-delay: 10
Here's a good article that explains more about this technique: https://mza.seotoolninja.com/learn/seo/robotstxt. I hope this helps. Feel free to reach out if you have any other questions!
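To get a rough feel for how the suggested directive compares with the throttle in the question, here is a minimal sketch using Python's standard urllib.robotparser. It assumes the crawl delay is interpreted as seconds between requests from a single crawler; the function name max_requests_per_minute and the constants are hypothetical, made up for this illustration, and the 240 figure is simply the limit quoted in the question.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# robots.txt content as suggested in the answer above
ROBOTS_TXT = """\
User-agent: rogerbot
Crawl-delay: 10
"""

# Throttle limit quoted in the question (page loads per minute)
THROTTLE_PER_MINUTE = 240


def max_requests_per_minute(robots_txt: str, user_agent: str) -> Optional[float]:
    """Upper bound on requests per minute implied by Crawl-delay.

    Hypothetical helper: assumes the delay is in seconds and that the
    crawler waits the full delay between consecutive requests.
    Returns None when no crawl delay applies to the user agent.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    if not delay:
        return None
    return 60 / delay


if __name__ == "__main__":
    rate = max_requests_per_minute(ROBOTS_TXT, "rogerbot")
    print(f"rogerbot upper bound: {rate} requests/minute")
    print(f"under the throttle: {rate is not None and rate < THROTTLE_PER_MINUTE}")
```

Under these assumptions a delay of 10 caps the crawler at a handful of requests per minute, so even a much smaller delay would likely stay below the throttle without raising the limit. How the client's throttle actually counts page loads (per IP, per session, or site-wide) is not stated in the thread, so treat this as a back-of-the-envelope check only.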