Best way to block a sub-domain from being indexed
-
Hello,
The search engines have indexed a sub-domain I did not want indexed its on
old.domain.com and dev.domain.com - I was going to password them but is there a best practice way to block them.
My main domain default robots.txt says :-
Sitemap: http://www.domain.com/sitemap.xml
global
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /trackback/
Disallow: /feed/
Disallow: /comments/
Disallow: /category//
Disallow: */trackback/
Disallow: */feed/
Disallow: /comments/
Disallow: /? -
Hi,
CleverPhD has some interesting ideas with robots.txt and Google Webmaster Tools, but simply password protecting all dev pages should keep pages out of Google's index. There's no best practice here, since a password wall will keep Googlebot out on its own.
To be doubly safe, you can also include a meta noindex tag on dev pages.
Keep in mind that once a page is in Google's index, it's going to take awhile for it to leave (unless you use CleverPhD's method). But, having a blank page in Google's index really isn't all that bad. It's there, but it won't rank for much.
Hope this helps,
Kristina
-
I've never tried a method like this - FreshFireOne, did you?
-
First and foremost when you finish all this - password protect your dev instances. A url will leak out eventually and then this happens. I know it is a PIA, but it is worth it.
To remove subdomains. Go into GWT and register the subdomains as separate websites in GWT. Create a robots.txt for each subdomain (not the one you mention, you need a robots that is specific to that subdomain that disallows all files. If you cant do that, have your subdomains include a noindex meta tag on all pages. You have to be careful with this as you do not want to push out your dev. robots.txt or the noindex meta tags to your production server, but it can be done. Talk to your devs. Then go into GWT and use the URL removal tool. Just leave it blank and it will remove the whole site.
Poof. Gone. You can then watch the GWT accounts. They will show errors for the dev site like "Severe health issues are found on your site - Some important page has been removed by request." This is a good error as it confirms that that subdomain is removed.
We actually used this not on a dev site but on our www1 server that was indexed. We use a load balancer with multiple copies of the site. www1 was completing with www. Using this above did the trick.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
301 Old domain with HTTPS to new domain with HTTPS
I am a bit boggled about https to https we redirected olddomain.com to https://www.newdomain.com, but redirecting https://www.olddomain.com or non-www is not possible. because the certificate does not exist on a level where you are redirecting. only if I setup a new host and add a htaccess file will this work. What should I do? just redirect the rest and hope for the best?
Intermediate & Advanced SEO | | waqid0 -
Best way to implement canonical tags on an ecommerce site with many filter options?
What would be the best way to add canonical tags to an ecommerce site with many filter options, for example, http://teacherexpress.scholastic.com? Should I include a canonical tag for all filter options under a category even though the pages don't have the same content? Thanks for reading!
Intermediate & Advanced SEO | | DA20130 -
Google Indexed my Site then De-indexed a Week After
Hi there, I'm working on getting a large e-commerce website indexed and I am having a lot of trouble.
Intermediate & Advanced SEO | | Travis-W
The site is www.consumerbase.com. We have about 130,000 pages and only 25,000 are getting indexed. I use multiple sitemaps so I can tell which product pages are indexed, and we need our "Mailing List" pages the most - http://www.consumerbase.com/mailing-lists/cigar-smoking-enthusiasts-mailing-list.html I submitted a sitemap a few weeks ago of a particular type of product page and about 40k/43k of the pages were indexed - GREAT! A week ago Google de-indexed almost all of those new pages. Check out this image, it kind of boggles my mind and makes me sad. http://screencast.com/t/GivYGYRrOV While these pages were indexed, we immediately received a ton of traffic to them - making me think Google liked them. I think our breadcrumbs, site structure, and "customers who viewed this product also viewed" links would make the site extremely crawl-able. What gives?
Does it come down to our site not having enough Domain Authority?
My client really needs an answer about how we are going to get these pages indexed.0 -
Does using a sub-domain lessen the effectiveness of your main domain?
For example a website without a blog and is a simple html site with no blogging capabilities. We go out to Blogger or Wordpress and set up the blog portion of the website using something like blog.yourdomain.com. Does this make a difference SEO wise? Is is more effective to be sure that you are using the main domain and not a sub-domain? I have heard both sides before but can't seem to find the concrete answer. Thanks for any advise out there.
Intermediate & Advanced SEO | | d25kart0 -
Best strategy for "product blocks" linking to sister site? Penguin Penalty?
Here is the scenario -- we own several different tennis based websites and want to be able to maximize traffic between them. Ideally we would have them ALL in 1 site/domain but 2 of the 3 are a partnership which we own 50% of and why are they are off as a separate domain. Big question is how do we link the "products" from the 2 different websites without looking spammy? Here is the breakdown of sites: Site1: Tennis Retail website --> about 1200 tennis products Site2: Tennis team and league management site --> about 60k unique visitors/month Site3: Tennis coaching tip website --> about 10k unique visitors/month The interesting thing was right after we launched the retail store website (site1), google was cranking up and sending upwards of 25k search impressions/day within the first 45 days. Orders kept trickling in and doing well overall for first launching. Interesting thing was Google "impressions" peaked at about 60 days post launch and then started trickling down farther and farther and now at about 3k-5k impressions/day. Many keywords phrases were originally on page 1 (position 6-10) and now on page 3-8 instead. Next step was to start putting "product links" (3 products per page) on site2 and site3 -- about 10k pages in total with about 6 links per page off to the product page (1 per product and 1 per category). We actually divided up about 100 different products to be displayed so this would mean about 2k links per product depending on the page. FYI, those original 10k pages from site2 and site3 already rank very well in Google and have been indexed for the past 2+ years in there. Most popular word on the sites is Tennis so very related. Our rationale was "all the websites are tennis related" and figured that the links on the latest and greatest products would be good for our audience. Pre-Penguin, we also figured this strategy would also help us rank for these products as well for when users are searching on them. We are thinking through since traffic and gone down and down and down from the peak of 45 days ago, that Penguin doesn't like all these links -- so what to do now? How to fix it and make the Penguin happy? Here are a couple of my thoughts on fixing it: 1. Remove the "category link" in our "product grouping" which would cut down the link by 1/3rd. 2. Place a "nofollow" on all the links for the other "product links". This would allow us to get the "user clicks" from these while the user is on that page. 3. On our homepage (site2 & site3), place 3 core products that change frequently (weekly) and showcase the latest and greatest products/deals. Thought is to NOT use the "nofollow" on these links since it is the homepage and only about 5 links overall. Heck part of me debated on taking our top 1000 pages (from the 10k page) and put the links ONLY on those and distribute about 500 products on them so this would mean only 2 links per product -- it would mean though about 4k links going there. Still thinking #2 above could be better? Any other thoughts would be great! Thanks, Jeremy
Intermediate & Advanced SEO | | jab10000 -
Why are new pages not being indexed, and old pages (now in robots.txt) remain in the index?
I currently have a site that was recently restructured, causing much of its content to be reposted, creating new URL's for each page. To avoid duplicates, all of the existing pages were added to the robots file. That said, it has now been over a week - I know Google has recrawled the site - and when I search for term X, it is stil the old page that is ranking, with the new one nowhere to be seen. I'm assuming it's a cached version, but why are so many of the old pages still appearing in the index? Furthermore, all "tags" pages (it's a Q&A site, like this one) were also added to the robots a few months ago, yet I think they are all still appearing in the index. Anyone got any ideas about why this is happening, and how I can get my new pages indexed?
Intermediate & Advanced SEO | | corp08030 -
What is the best way to run a blog?
Hi, I was wondering what is the best way to run a blog? The options I thought of are: Completely separate domain with many links to my main site. blog.domain.com www.domain.com/blog Thanks
Intermediate & Advanced SEO | | BeytzNet1 -
Cross Sub Domain Canonical Links
I currently have 1 website, but am planning on dividing it into sub-domains specific to geographic locations such as xxx.co.uk, xxx.it, xxx.es, etc... We are working on creating original content for the sub-sites, however upon launch many will be duplicate pages. Is there a problem with cross sub-domain canonical links? Thanks!
Intermediate & Advanced SEO | | theLotter0