How do I disallow crawl on a directory when it's a prefix to my site's URL?
-
I am trying to stop robots from crawling our media repository (hosted elsewhere, but it appears as a directory on our site). It is not a subdirectory of the site; it's a prefix (a subdomain).
So I need to disallow: mediabank.mywebsite.org
Not: mysite.org/mediabank
What would I need to put in my robots.txt and/or the other host's robots.txt to make this happen?
Thanks!
-
Hey there! Tawny from Moz's Help Team here.
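You'll want to add a robots.txt file for that subdomain, and then add a Disallow directive to that robots.txt file. So, using your example, you'd want a file at mediabank.mywebsite.org/robots.txt with a Disallow directive for any robots you don't want crawling that subdomain. A robots.txt file only applies to the host it is served from, so the rules have to live on the subdomain itself; the robots.txt on your main site doesn't need to change.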
For all user-agents, that would look something like this:
User-agent: *
Disallow: /
That would stop any user-agents from crawling any pages on that subdomain.
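If you'd like to sanity-check those two lines before uploading them, here is a minimal sketch using Python's standard-library urllib.robotparser. The user-agent name, the example URLs, and the empty main-site rule set are assumptions made up for illustration; swap in your own values.

```python
from urllib.robotparser import RobotFileParser

# The two directives from the answer, as they would appear in
# mediabank.mywebsite.org/robots.txt
SUBDOMAIN_RULES = ["User-agent: *", "Disallow: /"]

# Assumption for illustration: the main site's robots.txt has no rules.
MAIN_SITE_RULES = []


def build_parser(lines):
    """Parse robots.txt lines into a RobotFileParser without any network call."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


subdomain_parser = build_parser(SUBDOMAIN_RULES)
main_parser = build_parser(MAIN_SITE_RULES)

# Note: the parser only matches on the URL path. Which host a rule set
# covers is decided by where the robots.txt file is served from.
# "rogerbot" and both URLs are hypothetical example values.
blocked = subdomain_parser.can_fetch(
    "rogerbot", "http://mediabank.mywebsite.org/images/logo.png"
)
allowed = main_parser.can_fetch("rogerbot", "http://mywebsite.org/about")

print("subdomain crawlable:", blocked)
print("main site crawlable:", allowed)
```

Once the file is live, you could instead call set_url() with the subdomain's robots.txt address followed by read() on the parser to test the deployed file. Keep in mind this only tells you what a well-behaved crawler would do; robots.txt is a request, not access control.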
I hope this helps! If you've still got questions, feel free to send us a note at [email protected] and we'll do our best to sort things out for you.
-
Hi,
Please check this old thread on the same topic @ https://mza.bundledseo.com/community/q/block-an-entire-subdomain-with-robots-txt
Thanks