Robots.txt and robots meta
-
I have an odd situation. I have a CMS that serves a global robots.txt containing the generic:
User-Agent: *
Allow: /
I also have one CMS site that must never be indexed. I've read in various places (like http://www.jesterwebster.com/robots-txt-vs-meta-tag-which-has-precedence/22 ) that robots.txt always wins over meta, but I have also read that robots.txt controls crawling whereas the meta tag controls indexing. I just want the site to not be indexed. Can I leave the robots.txt as is and still put NOINDEX in the robots meta tag?
-
I see. Have you considered putting it behind an htpasswd?
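For reference, password-protecting a staging site is usually a small config change. A minimal sketch, assuming an Apache server; the file paths and realm name are placeholders:

```apache
# .htaccess at the root of the staging site
AuthType Basic
AuthName "Staging - authorized users only"
AuthUserFile /etc/apache2/.htpasswd
Require valid-user
```

The password file itself is created once with `htpasswd -c /etc/apache2/.htpasswd someuser`. This keeps spiders out entirely, since they can't fetch any pages to index.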
-
I can control it (it's a custom piece of software) but it's not as easy a fix as adding a meta to the template.
The main problem is we have a junk TLD we use to test new ideas off the live server (it lets clients give us feedback), but it gets spidered and indexed and starts ranking for client sites before they're ready to go live on their own TLD. This means we have to compete against ourselves (even with a 301 in place). There's nothing sensitive or it would live behind a password.
-
Do you need to control access to the site beyond the SERPs? I would not rely on robots.txt to shield any sensitive data.
For a breakdown of robots.txt and robots meta tags, check out http://www.robotstxt.org/robotstxt.html and http://www.searchtools.com/robots/robots-meta.html/, and for a great post on using these standards in SEO, check out http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions
I am also concerned that you are unable to control your robots.txt! If your CMS doesn't let you do that and overwrites it when you change it manually, you have some major control problems on your hands that you should remedy.
-
Blocking it in robots.txt will not guarantee that your site stays out of Google's index; blocked URLs can still be indexed from external links. A meta robots NOINDEX should ensure Google does not show your pages when someone searches for them. For the tag to work, though, the page must remain crawlable: the spider has to fetch the page to read the tag, so do not also block it in robots.txt.
It is important to note that Googlebot and other spiders will continue to visit your pages.
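To put that answer in concrete terms, this is the usual de-indexing tag, a minimal sketch placed in the `<head>` of every page on the site that must stay out of the index:

```html
<!-- Tells compliant crawlers not to index this page or follow its links -->
<meta name="robots" content="noindex, nofollow">
```

The catch, again, is that the crawler has to fetch the page to see the tag, so the robots.txt must keep allowing the site (as the generic Allow: / does). For non-HTML files, the equivalent is the `X-Robots-Tag: noindex` HTTP header.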
Related Questions
-
Robots.txt in subfolders and hreflang issues
A client recently rolled out their UK business to the US. They decided to deploy with 2 WordPress installations:
Technical SEO | | lauralou82
UK site - https://www.clientname.com/uk/ - robots.txt location: https://www.clientname.com/uk/robots.txt
US site - https://www.clientname.com/us/ - robots.txt location: https://www.clientname.com/us/robots.txt
We've had various issues with /us/ pages being indexed in Google UK, and /uk/ pages being indexed in Google US. They have the following hreflang tags across all pages: We changed the x-default page to .com 2 weeks ago (we've tried both /uk/ and /us/ previously). Search Console says there are no hreflang tags at all. Additionally, we have a robots.txt file on each site which links to the corresponding sitemap files, but when viewing the robots.txt tester in Search Console, each property shows the robots.txt file for https://www.clientname.com only, even though when you actually navigate to this URL (https://www.clientname.com/robots.txt) you get redirected to either https://www.clientname.com/uk/robots.txt or https://www.clientname.com/us/robots.txt depending on your location. Any suggestions how we can remove UK listings from Google US and vice versa?
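For anyone comparing notes, a reciprocal hreflang set for this /uk/ and /us/ split would look roughly like the following sketch (illustrative only, using the URLs from the question); every page must list its alternates, and the referenced pages must link back:

```html
<link rel="alternate" hreflang="en-gb" href="https://www.clientname.com/uk/" />
<link rel="alternate" hreflang="en-us" href="https://www.clientname.com/us/" />
<link rel="alternate" hreflang="x-default" href="https://www.clientname.com/" />
```

Note that a geo-redirect on /robots.txt like the one described can keep crawlers from ever seeing the per-site files, which would also explain the Search Console tester behavior.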
My SERP meta description is displaying 315 characters...
Hi Mozzers, We have recently taken the #2 spot for our main keyword in the Google UK SERP. I just checked again and we have dropped to #4 and our meta description is no longer there, as it has been replaced with some homepage content... 315 characters of homepage content right up to the full stop. I'm a little confused. A couple of our competitors' meta descriptions are showing the same: extra-long homepage text instead. Is there something totally normal and harmless causing this, or do I need to be monitoring/changing something? Has Google made an update to allow for longer meta descriptions? Any advice appreciated!
Technical SEO | | SanjidaKazi
Very strange: META descriptions not showing
Hello, Since Panda 4.0 launched, all of my optimized META descriptions have disappeared from Google.
Technical SEO | | MarcelMoz
A while ago, I posted a question about this problem here: http://moz.com/community/q/all-meta-descriptions-gone. I know about Google's own will to decide which META description gets shown, and also about unique content in the descriptions. All pages had an optimized description before Panda 4.0 and there were no troubles at all, which tells me something else is going on. I tested some things:
Rewrote 50 descriptions to very unique ones; only five got indexed. This tells me that duplicate content in the descriptions is not the problem (they have never been 100% duplicate; product type was a variable which was always different for each page).
Removed the cache in GWT and fetched again as Google; didn't help. The pages I tested have all been indexed again without showing the optimized descriptions.
More information:
The first time I changed some META descriptions and fetched the pages again in GWT, Google picked up my new META descriptions and showed them. A few days later, most of them disappeared again (so Google is aware of the description but seems to ignore it).
Some pages show the optimized description when I change my search query (only a few; mostly the optimized description never gets shown).
The technique is OK. The source code shows the right optimized description. META robots isn't blocking anything except NOODP/NOYDIR (it always has blocked those).
Websites using the exact same CMS, website template, and META descriptions (style and build-up) do not have these problems.
I compared elements like the place of the description in the source code, usage of meta robots, og:description, crawl-delay in robots.txt, and special characters in descriptions between websites that show optimized descriptions vs. websites that don't. I can't find any connection.
Something I noticed is a change in my robots.txt file: my webmaster has added the following command:
Crawl-delay: 2
Might this have to do with my problem? I guess it doesn't. I did some research and there are more websites suffering this problem besides mine, which tells me it must be Google (and so Panda 4.0) that is responsible for this change. I really want my optimized descriptions back. Does anybody have an idea what to do?
Thanks in advance. Marcel
Multi-domain content and meta data feed
Hi, I am working with a client whose web developer has offered to build a CMS that auto-feeds meta data and product descriptions (on-page content) to two different websites which have two completely different URLs (primary domain names) associated with them. Please see the screenshots attached for examples. The entire reason this has been offered is to avoid duplicate content issues. The client has two e-commerce websites but only one content management system that can update both simultaneously. The work-around shown in the screenshots is the developer's attempt at ensuring that both sites have unique meta data and on-page content associated with each product. Can anyone advise whether they foresee that this may cause any issues from an SEO perspective? Thanks in advance
Technical SEO | | SteveK64
Robots.txt
www.mywebsite.com/details/home-to-mome-4596
www.mywebsite.com/details/home-moving-4599
www.mywebsite.com/details/1-bedroom-apartment-4601
www.mywebsite.com/details/4-bedroom-apartment-4612
We have so many pages like this, and we do not want Google to crawl these pages, so we added the following code to robots.txt:
User-agent: Googlebot
Disallow: /details/
Is this code correct?
Technical SEO | | iskq
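The rule above can be sanity-checked with Python's standard-library robots.txt parser; a quick sketch using the URLs from the question:

```python
from urllib.robotparser import RobotFileParser

# The rules proposed in the question.
rules = [
    "User-agent: Googlebot",
    "Disallow: /details/",
]

parser = RobotFileParser()
parser.parse(rules)

# Everything under /details/ should be disallowed for Googlebot...
blocked = not parser.can_fetch("Googlebot", "http://www.mywebsite.com/details/home-moving-4599")
# ...while other paths remain crawlable.
allowed = parser.can_fetch("Googlebot", "http://www.mywebsite.com/contact")
print(blocked, allowed)  # → True True
```

So yes, the syntax is correct for keeping Googlebot from crawling those pages. Keep in mind Disallow only stops crawling; the URLs can still end up in the index via external links, so add a meta robots noindex if de-indexing is the actual goal.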
Our homepage currently uses a Meta refresh. Is it worth $1,000 to get it fixed?
Look at http://www.ccisolutions.com After the meta refresh takes place, the homepage URL looks like this: http://www.ccisolutions.com/StoreFront/IAFDispatcher?iafAction=showMain I am trying to convince management that it is worth spending $1,000 with our current provider to get it fixed. It is my understanding that this meta refresh could be preventing the value of our homepage from being passed down to our category pages, etc. Can anyone give me something concrete that I can use to convince management that the fix is worth $1,000? Or is it not worth fixing?
Technical SEO | | danatanseo
Getting home page content at top of what robots see
When I click on the text-only cache of nlpca(dot)com's home page, http://webcache.googleusercontent.com/search?q=cache:UIJER7OJFzYJ:www.nlpca.com/&hl=en&gl=us&strip=1, our H1 and body content are at the very bottom. How do we get the H1 and content to the top of what the robots see? Thanks!
Technical SEO | | BobGW
Robots.txt
Hi there, My question relates to the robots.txt file. Would this statement: /*/trackback block domain.com/trackback and domain.com/fred/trackback? Peter
Technical SEO | | PeterM22
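Python's standard urllib.robotparser doesn't reliably handle wildcard rules, so here is a rough sketch of Googlebot-style pattern matching to reason about it (assumptions: `*` matches any run of characters, `$` anchors the end, and rules otherwise match as path prefixes):

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Googlebot-style robots.txt path matching (sketch):
    '*' matches any sequence of characters, '$' anchors the end,
    and a rule otherwise matches any path it is a prefix of."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in pattern)
    # re.match anchors at the start only, giving prefix semantics.
    return re.match(regex + ("$" if anchored else ""), path) is not None

print(rule_matches("/*/trackback", "/fred/trackback"))       # True: blocked
print(rule_matches("/*/trackback", "/trackback"))            # False: not blocked
print(rule_matches("/*/trackback", "/fred/trackback/page"))  # True: prefix match
```

Under that matching, /*/trackback blocks domain.com/fred/trackback but not domain.com/trackback, because the pattern requires a second slash before "trackback"; a separate Disallow: /trackback rule would be needed to cover both.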