Rogerbot's crawl behaviour vs Google's spiders and other crawlers - disparate results have me confused.
-
I'm curious as to how accurately Rogerbot replicates Google's search bot.
I've currently got a site which is reporting over 200 pages of duplicate titles/content in the Moz tools. The pages in question are all session-ID URLs and were blocked in robots.txt about 3 weeks ago; however, the errors are still appearing.
I've also crawled the site using Screaming Frog SEO Spider. According to Screaming Frog, the offending pages have been blocked and are not being crawled. Webmaster Tools is also reporting no crawl errors.
Is there something I'm missing here? Why would I receive such different results? Which ones should I trust? Does Rogerbot ignore robots.txt? Any suggestions would be appreciated.
-
Thanks for your response. I was beginning to think this question had been left to rot.
I'm not getting any errors in WMT. What concerns me is that Roger is returning almost 300 duplicate-content errors, which is obviously a problem. Screaming Frog is no longer finding the pages (they've been blocked in robots.txt). I guess what I'm trying to ask here is: how can I be sure that my dupe content has been effectively blocked from Google's spider?
Is there any way to check?
Thanks for your help.
-
I've seen similar concerns from others; it seems Rogerbot does ignore certain things that other bots consider.
Don't worry about it. If it's not being flagged in WMT, it shouldn't be an issue.
Take Roger as a guide rather than an iron-fist bot like Googlebot.
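On the "is there any way to check?" question: one rough way to sanity-check the rules themselves is to run them through a robots.txt parser and ask whether a given user agent may fetch a given URL. The sketch below uses Python's standard-library urllib.robotparser. The rules, the /session/ path and the example.com URLs are all hypothetical stand-ins, so substitute the site's real robots.txt and real session-ID URLs. Note that this parser does simple path-prefix matching and may not interpret wildcard patterns the way Googlebot does, and it only tells you what the rules say, not whether a particular crawler actually honours them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; paste in the site's real robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /session/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: one session-ID page that should be blocked, one normal page.
    urls = [
        "https://www.example.com/session/abc123/page.html",
        "https://www.example.com/page.html",
    ]
    for agent in ("Googlebot", "rogerbot"):
        for url in urls:
            verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
            print(f"{agent:10} {verdict:8} {url}")
```

If the session-ID URLs come back as blocked for both user agents, the rules are at least written as intended, and any remaining reports are more likely to come from crawl data gathered before the block was added.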