Google Indexing Development Site Despite Robots.txt Block
-
Hi,
A development site that has been set up has the following robots.txt file:
User-agent: *
Disallow: /
This was an attempt to block Google from indexing the site; however, it hasn't worked, and the development site has since been indexed.
Any clues why this is or what I could do to resolve it?
Thanks!
-
Hi, so I'm assuming you're on IIS. I'm no expert on IIS, but I think you will need to configure the web.config. I'm just going to step back now and get my coat, as I only have experience with Apache.
-
Thanks for your help! Much appreciated
-
It's generally best to noindex/nofollow using the meta robots tag in the page's head section. If it's not too much of a stretch for you, you can also password-protect the test site. The ever-so-lovely and charming Google will still display results blocked by robots.txt, though it generally won't cache the content. If you would like, you can hook up the test site with Webmaster Tools and remove the URL(s) from the index.
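To check whether a page on the test site actually carries the noindex/nofollow directives mentioned above, something along these lines might help. It is only a rough sketch: the function and class names are made up for illustration, it looks at the generic `robots` meta tag and the `X-Robots-Tag` response header only, and it ignores crawler-specific variants. Keep in mind that a crawler can only see a noindex directive on a page it is allowed to fetch, so a blanket `Disallow: /` would likely need to be lifted for the tag to have any effect.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def robots_directives(html, headers=None):
    """Return the robots directives found in the HTML and response headers.

    `headers` is a plain dict of response headers (hypothetical input;
    fetch it with whatever HTTP client you prefer).
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    directives = set(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives.update(p.strip().lower() for p in value.split(","))
    directives.discard("")
    return directives
```

For example, passing the page source and headers of a development page to `robots_directives` and checking that `"noindex"` is in the returned set is a quick way to confirm the tag made it into the rendered output.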
-
It's my understanding that .htaccess is PHP-based, and as we code in .NET we don't have an .htaccess file.
Do you know of this happening before? It's not something that I've heard of.
-
You would need to block access via .htaccess rather than the robots file, as robots.txt is only advisory.
If you are using WordPress, I use this simple plugin: JF3 Maintenance Redirect.
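To illustrate what "only advisory" means, here is a small sketch using Python's standard `urllib.robotparser`. It evaluates the rules from the original question the way a well-behaved crawler would. The helper name `may_crawl` and the `dev.example.com` hostname are placeholders, not anything from the actual site.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_lines, url, user_agent="Googlebot"):
    """Return True if the given robots.txt lines permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# The rules from the question: every compliant crawler is asked to stay out.
blocked = may_crawl(
    ["User-agent: *", "Disallow: /"],
    "http://dev.example.com/page.aspx",  # placeholder URL
)
print(blocked)  # a compliant crawler gets "no" here
```

The result only tells a crawler what it is asked to do. Nothing on the server enforces it, and a disallowed URL can still be listed in search results if other pages link to it, which is why access control on the server itself (password protection, .htaccess rules on Apache, or the equivalent in web.config on IIS) is the more reliable route.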