Moz "Crawl Diagnostics" doesn't respect robots.txt
-
Hello, I've just had a new website crawled by the Moz bot. It's come back with thousands of errors saying things like:
- Duplicate content
- Overly dynamic URLs
- Duplicate Page Titles
The duplicate content & URLs it's found are all blocked in the robots.txt so why am I seeing these errors?
Here's an example of some of the robots.txt that blocks things like dynamic URLs and directories (which Moz bot ignored):Disallow: /?mode=
Disallow: /?limit=
Disallow: /?dir=
Disallow: /?p=*&
Disallow: /?SID=
Disallow: /reviews/
Disallow: /home/Many thanks for any info on this issue.
-
Hi Si, has this issue been resolved?
-
Hey Si,
Thanks for writing in. It doesn't seem that we are having an overarching issue with our crawler ignoring robots.txt files so I did some research in Google Webmaster Tools and it looks like most crawlers require an asterisk in the disallow directive to recognize that all pages of a dynamic URL are being disallowed. If you look in the "Pattern Matching" section of this resource here: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449, that should give you more information about setting up the robots.txt with the correct disallow directives to block those pages.
If you add in the astrisk to the disallow directive and you are still seeing these pages crawled, it would help if you sent in an email with your campaign information to our support desk at [email protected] so we can have our engineers look into this more directly.
I hope this helps.
Chiaryn
-
If you have an "index,(no)follow" meta on those pages I think they will be crawled even though you have them blocked in robots.txt. So by adding "noindex" on those pages it might work as you want it to.
-
Is the / actually in the URL at that spot? Or is your link like http://www.example.com/abcd?p=147
If you give an example full URL that includes one of your blocked dynamic URLs we can take a better look. If your robots is setup correctly, it shouldn't find that stuff but give us more info if you're able.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I use json-ld in our site but schema not found in markup tool bar moz
i use json-ld in our site. structure data in test without any error but in test markup moz in our site schema is not found! Is it time for Google to identify the code to verify the schema?
Moz Bar | | shokoufezand0 -
Is the update site crawl feature following robot.txt rules?
I noticed that most of the errors would not be occurring if Moz's tool followed the rules implemented in sites robots.txt. Has anyone else seen this problem and do you know if Moz will fix this?
Moz Bar | | jamestown0 -
Site Crawl - MOz Pro
Hi There, When i look into my site crawl i have thousands of duplicate content issues. Now they are essentially product pages which are in multiple categories - however we have added the canonical tag so im confused as to why all of these are appearing as if there is an error, does the MOZ bot not take canonicals into account? Kind Regards Gemma
Moz Bar | | acsilver0 -
Why isn't the Moz bar data populating for Yahoo sites?
The Moz bar isn't populating information for Yahoo homepage or it's verticals (i.e. homes, autos, finance, etc.), but I can get this data for other portals like AOL or MSN. I'm specifically looking for PA, mR, and DA information, but instead I get a generic "Search Profile" bar with no page/site-specific data.
Moz Bar | | AllieBell
Is there a reason Open Site Explorer data isn't populating for this particular portal?0 -
Moz Crawl Test Trying to Crawl Contact Form Submit Button Location?
Moz Crawl Test for some reason is trying to Crawl a contact form Widget Submit Location. My obvious guess is that obviously the crawl cannot submit to the required fields…..I believe this because they're only kicking back these errors on the pages I have a contact form widget on. http://crawfordspest.com/pest-control/[email protected] 1412553693 404 : Received 404 (Not Found) error response for page. Error attempting to request page; see title for details. 404
Moz Bar | | Funk-Creative-Media
http://crawfordspest.com/tree-services/[email protected] 1412553693 404 : Received 404 (Not Found) error response for page. Error attempting to request page; see title for details. 404
http://crawfordspest.com/lawn-care/[email protected] 1412553693 404 : Received 404 (Not Found) error response for page. Error attempting to request page; see title for details. 404
http://crawfordspest.com/specialty-services/[email protected] 1412553693 404 : Received 404 (Not Found) error response for page. Error attempting to request page; see title for details. 404 Can you shed any insight to this? I'm a bit worried that I'll have to complete gut the contact form which was one of the major requests my client requested. Or in a worse scenario make all fields not required. It would let so much spam in. I have never seem anything like this at all. But I've learned a lot from Moz, and with major errors like 404 damage Domain Authority greatly. I've fixed 404 issues with newly acquired clients existing sites and tracked through Moz and the domain authority flies up once these errors are fixed. Along with fixing what Webmaster Tools through Google reports back. ..... Let me know if you have any expertise on this matter.0 -
Moz and Google Analytics Trouble?
Is anyone else having trouble connecting their MOZ Pro account to Google Analytics?
Moz Bar | | PhiladelphiaWebsiteSolutions0 -
Does Moz Analytics Allow for Goal Tracking?
We're currently tracking goals in GA, but would be interested to learn if that can be accomplished in Moz Analytics.
Moz Bar | | CGalownia0 -
Emails from Moz makes my Outlook unresponsive
Did anybody else notice this? It started a few weeks ago, every time that I receive an email from Moz regarding a Q&.A update and I try to open it, my Outlook becomes unresponsive and I have to restart it.
Moz Bar | | echo10