What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
-
Now that Google considers subdomains as part of the TLD I'm a little leery of testing robots.txt with something like:
staging.domain.com
User-agent: *
Disallow: /in fear it might get the www.domain.com blocked as well. Has anyone had any success using robots.txt to block sub-domains? I know I could add a meta robots tag to the staging.domain.com pages but that would require a lot more work.
-
Just make sure that when/if you copy over the staging site to the live domain that you don't copy over the robots.txt, htaccess, or whatever means you use to block that site from being indexed and thus have your shiny new site be blocked.
-
I agree. The name of your subdomain being "staging" didn't register at all with me until Matt brought it up. I was offering a generic response to the subdomain question whereas I believe Matt focused on how to handle a staging site. Interesting viewpoint.
-
Matt/Ryan-
Great discussion, thanks for the input. The staging.domain.com is just one of the domains we don't want indexed. Some of them still need to be accessed by the public, some like staging could be restricted to specific IPs.
I realize after your discussion I probably should have used a different example of a sub-domain. On the other hand it might not have sparked the discussion so maybe it was a good example
-
.htaccess files can be placed at any directory level of a site so you can do it for just the subdomain or even just a directory of a domain.
-
Staging URL's are typically only used for testing so rather than do a deny I would recommend using a specific ALLOW for only the IP addresses that should be allowed access.
I would imagine you don't want it indexed because you don't want the rest of the world knowing about it.
You can also use HTACCESS to use username/passwords. It is simple but you can give that to clients if that is a concern/need.
-
Correct.
-
Toren, I would not recommend that solution. There is nothing to prevent Googlebot from crawling your site via almost any IP. If you found 100 IPs used by the crawler and blocked them all, there is nothing to stop the crawler from using IP #101 next month. Once the subdomain's content is located and indexed, it will be a headache fixing the issue.
The best solution is always going to be a noindex meta tag on the pages you do not wish to be indexed. If that method is too much work or otherwise undesirable, you can use the robots.txt solution. There is no circumstance I can imagine where you would modify your htaccess file to block googlebot.
-
Hi Matt.
Perhaps I misunderstood the question but I believe Toren only wishes to prevent the subdomain from being indexed. If you restrict subdomain access by IP it would prevent visitors from accessing the content which I don't believe is the goal.
-
Interesting, hadn't thought of using htaccess to block Googlebot.Thanks for the suggestion.
-
Thanks Ryan. So you don't see any issues with de-indexing the main site if I created a second robots.txt file, e.g.
http://staging.domin.com/robots.txt
User-agent: *
Disallow: /That was my initial thought but when Google announced they consider sub-domains part of the TLD I was afraid it might affect the htp://www.domain.com versions of the pages. So you're saying the subdomain is basically treated like a folder you block on the primary domain?
-
Use an .htaccess file to only allow from certain ip addresses or ranges.
Here is an article describing how: http://www.kirupa.com/html5/htaccess_tricks.htm
-
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
Place a robots.txt file in the root of the subdomain.
User-agent: *
Disallow: /This method will block the subdomain while leaving your primary domain unaffected.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Help: buy domain from Tradenames.com?
Hello to all, I'm Silvia. I am writing to ask if any of you know this site: tradenames.com. It is a domains broker. They contacted my client and would like to sell the .com business domain (my client currently has the .it). Does anyone know them? Thanks you for your help.
Technical SEO | | advmedialab0 -
Google not indexing /showing my site in search results...
Hi there, I know there are answers all over the web to this type of question (and in Webmaster tools) however, I think I have a specific problem that I can't really find an answer to online. site is: www.lizlinkleter.com Firstly, the site has been live for over 2 weeks... I have done everything from adding analytics, to submitting a sitemap, to adding to webmaster tools, to fetching each individual page as googlebot and then submitting to index via webmaster tools. I've checked my robot files and code elsewhere on the site and the site is not blocking search engines (as far as I can see) There are no security issues in webmaster tools or MOZ. Google says it has indexed 31 pages in the 'Index Status' section, but on the site dashboard it says only 2 URLS are indexed. When I do a site:www.lizlinketer.com search the only results I get are pages that are excluded in the robots file: /xmlrpc.php & /admin-ajax.php. Now, here's where I think the issue stems from - I developed the site myself for my wife and I am new to doing this, so I developed it on the live URL (I now know this was silly) - I did block the content from search engines and have the site passworded, but I think Google must have crawled the site before I did this - the issue with this was that I had pulled in the Wordpress theme's dummy content to make the site easier to build - so lots of nasty dupe content. The site took me a couple of months to construct (working on it on and off) and I eventually pushed it live and submitted to Analytics and webmaster tools (obviously it was all original content at this stage)... But this is where I made another mistake - I submitted an old site map that had quite a few old dummy content URLs in there... I corrected this almost immediately, but it probably did not look good to Google... My guess is that Google is punishing me for having the dummy content on the site when it first went live - fair enough - I was stupid - but how can I get it to index the real site?! My question is, with no tech issues to clear up (I can't resubmit site through webmaster tools) how can I get Google to take notice of the site and have it show up in search results? Your help would be massively appreciated! Regards, Fraser
Technical SEO | | valdarama0 -
Image Height/Width attributes, how important are they and should a best practice site include this as std
Hi How important are the image height/width attributes and would you expect a best practice site to have them included ? I hear not having them can slow down a page load time is that correct ? Any other issues from not having them ? I know some re social sharing (i know bufferapp prefers images with h/w attributes to draw into their selection of image options when you post) Most importantly though would you expect them to be intrinsic to sites that have been designed according to best practice guidelines ? Thanks
Technical SEO | | Dan-Lawrence0 -
Blog.furnacefilterscanada.com/ or furnacefilterscanada.com/blog/
My shopping cart does not allow to instal a WordPress blog on a sub-domain like: furnacefilterscanada.com/blog/ But I can host my blog on another server with a sub-domain like: blog.furnacefilterscanada.com In a SEO point of view is there a difference between the 2? Link juice? Page authority? Thank you, BigBlaze
Technical SEO | | BigBlaze2050 -
How do we ensure our new dynamic site gets indexed?
Just wondering if you can point me in the right direction. We're building a 'dynamically generated' website, so basically, pages don’t technically exist until the visitor types in the URL (or clicks an on page link), the pages are then created on the fly for the visitor. The major concern I’ve got is that Google won’t be able to index the site, as the pages don't exist until they're 'visited', and to top it off, they're rendered in JSPX, which makes things tricky to ensure the bots can view the content We’re going to build/submit a sitemap.xml to signpost the site for Googlebot but are there any other options/resources/best practices Mozzers could recommend for ensuring our new dynamic website gets indexed?
Technical SEO | | Hutch_e0 -
Root domain ranks higher than sub pages
Our website is over 10 years old and has been very successful with google. We rank highly for our keywords, but recently a strange thing has happend. Google has indexed a video from our front page (which no longer exists) and shows this listing instead of the sub pages. This means the listing is irrelevant to the search term. Term is "Junior Cricket Bats"
Technical SEO | | NickKer
https://www.google.co.uk/search?sourceid=navclient&ie=UTF-8&rlz=1T4GGLL_en-GBGB377GB377&q=junior+cricket+bats should some up with this page
http://www.cricketsupplies.com/junior-cricket-bats.asp but shows the root domain
http://www.cricketsupplies.com/ with this video link which does not exist We should be ranking really well , and were , but now not so good , even though SEOmoz loves our pages if anybody can see anything obvious i would be thanksful. regards0 -
Domain Transfer Process / Bulk 301's Using IIS
Hi guys - I am getting ready to do a complete domain transfer from one domain to another completely different domain for a client due to a branding/name change. 2 things - first, I wanted to lay out a summary of my process and see if everyone agrees that its a good approach, and second, my client is using IIS, so I wanted to see if anyone out there knows a bulk tool that can be used to implement 301's on the hundreds of pages that the site contains? I have found the process to redirect each individual page, but over hundreds its a daunting task to look at. The nice thing about the domain transfer is that it is going to be a literal 1:1 transfer, with the only things changing being the logo and the name mentions. Everything else is going to stay exactly the same, for the most part. I will use dummy domain names in the explanation to keep things easy to follow: www.old-domain.com and www.new-domain.com. The client's existing home page has a 5/10 GPR, so of course, transferring Mojo is very important. The process: Clean up existing site 404's, duplicate tags and titles, etc. (good time to clean house). Create identical domain structure tree, changing all URL's (for instance) from www.old-domain.com/freestuff to www.newdomain.com/freestuff. Push several pages to a dev environment to test (dev.new-domain.com). Also, replace all instances of old brand name (images and text) with new brand name. Set up 301 redirects (here is where my IIS question comes in below). Each page will be set up to redirect to the new permanent destination with a 301. TEST a few. Choose lowest traffic time of week (from analytics data) to make the transfer ALL AT ONCE, including pushing new content live to the server for www.new-domain.com and implementing the 301's. As opposed to moving over parts of the site in chunks, moving the site over in one swoop avoids potential duplicate content issues, since the content on the new domain is essentially exactly the same as the old domain. Of course, all of the steps so far would apply to the existing sub-domains as well, IE video.new-domain.com. Check for errors and problems with resolution issues. Check again. Check again. Write to (as many as possible) link partners and inform them of new domain and ask links to be switched (for existing links) and updated (for future links) to the new domain. Even though 301's will redirect link juice, the actual link to the new domain page without the redirect is preferred. Track rank of targeted keywords, overall domain importance and GPR over time to ensure that you re-establish your Mojo quickly. That's it! Ok, so everyone, please give me your feedback on that process!! Secondly, as you can see in the middle of that process, the "implement 301's" section seems easier said than done, especially when you are redirecting each page individually (would take days). So, the question here is, does anyone know of a way to implement bulk 301's for each individual page using IIS? From what I understand, in an Apache environment .htaccess can be used, but I really have not been able to find any info regarding how to do this in bulk using IIS. Any help here would be GREATLY APPRECIATED!!
Technical SEO | | Bandicoot0 -
My website pages (www.ommrudraksha.com) is getting good rank slowly. But no good sales ?
My website has been doing good slowly. I have been using seomoz recommendations. And it is a great help to see that my pages are slowly coming to the first page. I am also running PPC on google. I see there are many visitors to my website. But i do not get good conversion - or not getting customer buying products. My website : www.ommrudraksha.com My target keyword is : rudraksha
Technical SEO | | Ommrudraksha0