Have you ever seen or experienced a page being indexed even though its website is blocked by robots.txt?
-
Hi all,
We use robots.txt files and meta robots tags to stop bots from crawling a website or individual pages. Usually robots.txt is applied site-wide, with the expectation that none of the pages will get indexed. But there is a catch: a page from a site blocked by robots.txt can still be indexed by Google, because the crawler may find a link to that page somewhere else on the internet, as stated in the last paragraph here. I wonder whether this really happens in practice, i.e. whether blocked pages have actually been indexed.
And if we use meta robots tags at the page level, do we still need to block via robots.txt? Can we use both techniques at the same time?
Thanks
-
Hi vtmoz,
The most reliable way to prevent a page from being indexed is a meta robots tag with the _noindex_ parameter.
Robots.txt, on the other hand, helps conserve your server resources and prevents Google from crawling new pages that don't yet have the meta robots tag. And yes, it's very common to see indexed pages even when the robots.txt file blocks the entire website.
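For reference, this is what a robots.txt that blocks an entire site looks like. The key point is that it only stops crawling, not indexing: Google can still index a blocked URL it discovers through external links, it just can't read the page (or any noindex tag on it):

```
# Blocks all compliant crawlers from fetching any URL on the site.
# It does NOT guarantee the URLs stay out of the index.
User-agent: *
Disallow: /
```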
If what you want is to remove the pages from the index, follow these steps:
- Allow the whole website (or at least the specific pages/sections) to be crawlable in robots.txt
- Add the robots meta tag with the "noindex,follow" parameters
- Wait several weeks; 6 to 8 weeks is a fairly good window, or just follow up on those pages
- Once you have the result you want (all the desired pages de-indexed), re-block those pages with robots.txt
- DO NOT erase the meta robots tag.
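The steps above can be sketched as follows (the /private/ path is a hypothetical example; use your own section):

```html
<!-- Step 2: add to the <head> of each page you want removed from the index -->
<meta name="robots" content="noindex,follow">
```

```
# Step 4: only AFTER the pages have dropped out of the index,
# re-block them in robots.txt
User-agent: *
Disallow: /private/
```

The order matters: if you block the pages in robots.txt first, Google can never crawl them to see the noindex tag.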
Hope it helps.
Best of luck.
GR.