SEOMOZ crawler is still crawling a subdomain despite disallow
-
This is for a client whose site has a subdomain. We only want to analyze their main website, as that is the one we are optimizing; the subdomain is not optimized, so we know it's bound to have lots of errors.
We added the disallow code when we started, and it was working fine: we only saw errors for the main domain and were able to fix them. However, about a month ago the errors and warnings spiked, and the errors we saw were for the subdomain.
As far as our web guys are concerned, the disallow code is still there and was not touched:
User-agent: rogerbot
Disallow: /
We would like to know if there's anything we might have unintentionally changed or something we need to do so that the SEOMOZ crawler will stop going through the subdomain.
Any help is greatly appreciated!
-
Thanks, Peter, for your assistance.
Hope to hear from the SEOMOZ team soon regarding this issue.
-
John,
Thanks for writing in! I would like to take a look at the project where this is happening, so I will go ahead and start a ticket so we can better answer your questions. You should hear from me soon!
Best,
Peter
Moz Help Team
-
I have heard of this recently; I think the Moz crawler may now ignore the disallow, either always or some of the time, because it is not a usual search engine crawler.
Hopefully one of the staff can provide some insight into this for you.
All the best.
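One thing worth double-checking in the meantime is whether the disallow is actually being served from the subdomain itself: a robots.txt file only applies to the hostname it is served from, so a rule on the main domain will not stop a crawler on the subdomain. Below is a rough sketch of such a check using Python's standard urllib.robotparser; the hostnames are placeholders rather than the client's real domains.

```python
from urllib import robotparser

# Placeholder hostnames standing in for the client's main site and its subdomain.
hosts = ["http://www.example.com", "http://sub.example.com"]

for host in hosts:
    rp = robotparser.RobotFileParser()
    rp.set_url(host + "/robots.txt")
    rp.read()  # fetch and parse that host's own robots.txt
    print(host, "-> rogerbot allowed to crawl:", rp.can_fetch("rogerbot", host + "/"))
```

If the subdomain prints True, its own robots.txt (or the lack of one) is the more likely culprit than any change on the main domain.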
Related Questions
-
Google Search Console Crawl Errors?
We are using Google Search Console to monitor Crawl Errors. It seems Google is listing errors that are not actual errors. For instance, it shows this as "Not found": https://tapgoods.com/products/tapgoods__8_ft_plastic_tables_11_available The page does not exist, but we cannot find any pages linking to it. It has a tab that shows Linked From, but if I look at the source of those pages, the link is not there. In this case, it shows the front page (listed twice, both for http and https). Also, one of the pages it shows as linking to the non-existent page above is itself a non-existent page. We marked all the errors as fixed last week, and then this week they came up again; 2/3 are the same pages we marked as fixed last week. Is this an issue with Google Search Console? Are we getting penalized for a non-existent issue?
Intermediate & Advanced SEO | TapGoods
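One rough way to sanity-check the report (not a Google tool, just a sketch; the source-page list below is a placeholder for whatever the Linked From tab actually shows) is to fetch each reported source page and see whether the broken URL appears in its HTML:

```python
import urllib.request

# URL that Search Console reports as "Not found"
broken_url = "https://tapgoods.com/products/tapgoods__8_ft_plastic_tables_11_available"

# Placeholder list: the pages the "Linked From" tab claims link to it
linked_from = ["https://tapgoods.com/"]

for page in linked_from:
    html = urllib.request.urlopen(page).read().decode("utf-8", errors="ignore")
    print(page, "contains the reported link:", broken_url in html)
```

This only checks the raw HTML, so relative links or links injected by scripts would still need a manual look, but it is a quick first pass before marking the errors as fixed again.
-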
Sites still rank who don't seem like they should. Why?
So you've been MOZing and SEOing for years and we're all convinced of the 10x factor when it comes to content and ranking for certain search terms... right? So what do you do when some older sites that don't even produce content dominate the first page of a very important search term? They're home pages with very little content and have clearly all dabbled in pre-Panda SEO. Surely people are still seeing this and wondering why?
Intermediate & Advanced SEO | wearehappymedia
-
Should I disallow all URL query strings/parameters in Robots.txt?
Webmaster Tools correctly identifies the query strings/parameters used in my URLs, but still reports duplicate title tags and meta descriptions for the original URL and the versions with parameters. For example, Webmaster Tools would report duplicates for the following URLs, despite correctly identifying the "cat_id" and "kw" parameters:
/Mulligan-Practitioner-CD-ROM
/Mulligan-Practitioner-CD-ROM?cat_id=87
/Mulligan-Practitioner-CD-ROM?kw=CROM
Additionally, these pages have self-referential canonical tags, so I would think I'd be covered, but I recently read that another Mozzer saw a great improvement after disallowing all query/parameter URLs, despite Webmaster Tools not reporting any errors. As I see it, I have two options: manually tell Google that these parameters have no effect on page content via the URL Parameters section in Webmaster Tools (in case Google is unable to automatically detect this and I am being penalized as a result), or add "Disallow: *?" to hide all query/parameter URLs from Google. My concern with the second option is that most backlinks include the parameters, and in some cases these parameter URLs outrank the original. Any thoughts?
Intermediate & Advanced SEO | jmorehouse
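As a rough illustration of what the second option would cover (plain Python, not a Google tool, and the matching below is a simplification of Google's wildcard rules): a "Disallow: *?" pattern blocks any URL carrying a query string, including the parameter versions that backlinks point at, while the parameter-free path is the version they all canonicalize to.

```python
from urllib.parse import urlsplit, urlunsplit

urls = [
    "/Mulligan-Practitioner-CD-ROM",
    "/Mulligan-Practitioner-CD-ROM?cat_id=87",
    "/Mulligan-Practitioner-CD-ROM?kw=CROM",
]

for url in urls:
    parts = urlsplit(url)
    # Simplified stand-in for wildcard matching: "Disallow: *?" catches
    # anything that carries a query string.
    blocked = bool(parts.query)
    # The parameter-free path all three versions share (what the canonical tag points at).
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    print(f"{url:45} blocked by 'Disallow: *?': {blocked!s:5} canonical: {canonical}")
```

If the blocked column is True for the URLs that carry most of the backlinks, that is the trade-off being weighed against simply declaring the parameters in the URL Parameters tool.
-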
Significant Google crawl errors
We've got a site that, like clockwork, keeps encountering server errors when Google crawls it. Since the end of last year it will go a week fine, then have two straight weeks of a 70%-100% error rate when Google tries to crawl it. During this time you can still put the URL in and go to the site, but spider simulators return a 404 error. Just this morning we had another error message; I did a fetch and resubmit, and magically now it's back. We changed servers in January to Go Daddy because the previous server (Tronics) kept getting hacked. It's built in HTML, so I'm wondering if it's maybe something in the code? http://www.campteam.com/
Intermediate & Advanced SEO | GregWalt
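When a site loads fine in a browser but "spider simulators" report errors, one quick thing to rule out is the server answering differently depending on the User-Agent header. A small sketch along those lines (the user-agent strings are illustrative, and this is not how Googlebot itself fetches pages):

```python
import urllib.request
import urllib.error

url = "http://www.campteam.com/"
agents = {
    "browser-like": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "googlebot-like": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for name, ua in agents.items():
    req = urllib.request.Request(url, headers={"User-Agent": ua})
    try:
        status = urllib.request.urlopen(req).status  # HTTP status for this user agent
    except urllib.error.HTTPError as err:
        status = err.code
    print(name, "->", status)
```

If the two user agents get different status codes, the host or a security module is treating crawler traffic differently, which would explain errors that never show up in a normal browser.
-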
PDFs and images in Sub folder or subdomain?
What would you recommend as best practice? Our ecommerce site has a lot of PDFs supporting the product pages. Currently they are kept on a subdomain, and so are all images. Would it be better to keep them all in a subfolder? I've read that hosting a blog in a subfolder is better than on a subdomain, but what about PDFs and images? Thoughts?
Intermediate & Advanced SEO | Bio-RadAbs
-
SubDomain vs. SubFolder
I know this subject has been discussed many, many times before, but it is now 2013, and Google continues to tweak and change their algorithm to deliver the best results for users. So the questions are: does Google still treat subdomains as completely separate and unique domains from the root? If so, is it a good SEO strategy to split a website into subdomains, where it fits, with links pointing back to the root or main domain? As a company we have several subdomains for some of our categories. Our main site is www.iboats.com, which has all our boat products, but we set up subdomains several years ago for the following:
boatcovers.iboats.com
boatpropellers.iboats.com
biminitops.iboats.com
And we have our forums as a subdomain: forums.iboats.com. Splitting them out was originally done for SEO reasons, but is now more about better managing our main categories. It appears that Google is treating our subdomains as part of our main root domain anyway, so I don't see the SEO value anymore. If we were to move the subdomains into subfolders of the root, I'm wondering if we might see a boost in SEO value from having more pages within the main website. I'd be interested in everyone's thoughts on this subject.
Intermediate & Advanced SEO | tdawson09
-
How to remove duplicate content, which is still indexed, but not linked to anymore?
Dear community, a bug in the tool we use to create search-engine-friendly URLs (sh404sef) changed our whole URL structure overnight, and we only noticed after Google had already indexed the pages. Now we have a massive duplicate content issue, causing a harsh drop in rankings. Webmaster Tools shows over 1,000 duplicate title tags, so I don't think Google understands what is going on.
Right URL: abc.com/price/sharp-ah-l13-12000-btu.html
Wrong URL: abc.com/item/sharp-l-series-ahl13-12000-btu.html (created by mistake)
After that, we changed all URLs back to the "Right URLs" and, a few days later, set up a 301 redirect for all "Wrong URLs". Now a massive number of pages is still in the index twice. As we no longer link internally to the "Wrong URLs", I am not sure if Google will re-crawl them very soon. What can we do to solve this issue and tell Google that all the "Wrong URLs" now redirect to the "Right URLs"? Best, David
Intermediate & Advanced SEO | rmvw
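While waiting for a re-crawl, it can be reassuring to confirm the redirects are actually returning 301 (and not 302 or 200). Here is a small sketch that requests each old URL without following redirects and prints the status code and Location header; abc.com is the placeholder domain from the question, not a real site.

```python
import http.client
from urllib.parse import urlsplit

# Placeholder mapping of "Wrong URL" -> expected "Right URL"
redirects = {
    "http://abc.com/item/sharp-l-series-ahl13-12000-btu.html":
        "http://abc.com/price/sharp-ah-l13-12000-btu.html",
}

for old, expected in redirects.items():
    parts = urlsplit(old)
    conn = http.client.HTTPConnection(parts.netloc)
    conn.request("GET", parts.path)  # http.client does not follow redirects
    resp = conn.getresponse()
    print(old)
    print("  status:", resp.status, "-> Location:", resp.getheader("Location"))
    print("  expected 301 to:", expected)
    conn.close()
```

If any of these return 302 or 200 instead of 301, that would help explain why the old URLs are lingering in the index.
-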
Google places: 7 weeks and still not verified. Reasons?
I've created Google Places entries for the business's 25 locations in 8 countries. It's been 7 weeks and Google hasn't validated my bulk upload yet. Which of the following would you say might do the trick? URL: I have pointed to http://site.com/country/city for every location. Should I point to http://site? I've created categories in English, but since this is about local, I guess I should do them in the local language of each country... Should I cancel the bulk upload and do one upload per country? My list of countries includes Latin America, Europe, Africa and Asia, so I don't know if manual verification is done by Google centrally and whether that might be a problem. Otherwise the entries are (IMHO) well written, detailed and all, but I'm kind of desperate now because it's been 7 weeks already... Thanks for your help.
Intermediate & Advanced SEO | TIBA