Moz "Crawl Diagnostics" doesn't respect robots.txt
-
Hello, I've just had a new website crawled by the Moz bot. It's come back with thousands of errors saying things like:
- Duplicate content
- Overly dynamic URLs
- Duplicate Page Titles
The duplicate content & URLs it's found are all blocked in the robots.txt so why am I seeing these errors?
Here's an example of some of the robots.txt that blocks things like dynamic URLs and directories (which Moz bot ignored):
Disallow: /?mode=
Disallow: /?limit=
Disallow: /?dir=
Disallow: /?p=*&
Disallow: /?SID=
Disallow: /reviews/
Disallow: /home/
Many thanks for any info on this issue.
-
Hi Si, has this issue been resolved?
-
Hey Si,
Thanks for writing in. We don't seem to have an overarching issue with our crawler ignoring robots.txt files, so I did some research in Google Webmaster Tools. It looks like most crawlers require an asterisk in the Disallow directive to recognize that all pages of a dynamic URL are being disallowed. The "Pattern Matching" section of this resource should give you more information about setting up the robots.txt with the correct Disallow directives to block those pages: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449
If you add the asterisk to the Disallow directive and you are still seeing these pages crawled, it would help if you emailed your campaign information to our support desk at [email protected] so our engineers can look into this more directly.
I hope this helps.
Chiaryn
-
If you have an "index,(no)follow" meta on those pages I think they will be crawled even though you have them blocked in robots.txt. So by adding "noindex" on those pages it might work as you want it to.
-
Is the / actually in the URL at that spot, or is your link like http://www.example.com/abcd?p=147?
If you give an example full URL that includes one of your blocked dynamic URLs, we can take a better look. If your robots.txt is set up correctly, the crawler shouldn't find that stuff, but give us more info if you're able.
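To illustrate why the position of the / matters, here is a rough Python sketch of Google-style pattern matching, where a rule matches from the start of the path, `*` matches any run of characters, and a trailing `$` anchors the end. The function name `rule_matches` is made up for this example, and whether the Moz crawler applies exactly these rules is an assumption, so treat it as a way to sanity-check rules rather than a description of any particular bot.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern against a URL path (including query).

    Hypothetical helper, assuming Google-style matching: the pattern is
    matched from the start of the path, '*' matches any sequence of
    characters, and a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match only matches at the beginning of the string (prefix match).
    return re.match(regex, path) is not None


if __name__ == "__main__":
    rules = ["/?p=*&", "/*?p=*&", "/reviews/"]
    paths = ["/?p=2&dir=asc", "/abcd?p=147&dir=asc", "/reviews/widget"]
    for rule in rules:
        for path in paths:
            print(f"Disallow: {rule:10} {path:22} blocked={rule_matches(rule, path)}")
```

Under these assumptions, `Disallow: /?p=*&` only covers URLs whose path is exactly `/` followed by the query string, while `Disallow: /*?p=*&` also covers something like `/abcd?p=147&dir=asc`.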