Will a robots.txt 'disallow' of a directory keep Google from seeing 301 redirects for pages/files within the directory?

Voodak

Hi, I have a client who had thousands of dynamic PHP pages indexed by Google that shouldn't have been. He has since blocked these PHP pages via a robots.txt disallow. Unfortunately, many of those PHP pages were linked to multiple times by high-quality sites (instead of the static URLs) before he put up the 'disallow'.

If we create 301 redirects for some of these PHP URLs that are still showing high-value backlinks and send them to the correct static URLs, will Google even see these 301 redirects and pass link value to the proper static URLs? Or will the robots.txt keep Google away, so that we lose all these high-quality backlinks? I guess the same question applies if we use the canonical tag instead of the 301. Will the robots.txt keep Google from seeing the canonical tags on the PHP pages?

Thanks very much,

V

DmitriiK

No problem

Voodak

Hello Dmitrii,

Yes, that clarifies things perfectly. Thanks very much for your explanation. And I missed this particular WBF, so I will give it a close look as well.

Thanks again for your quick help.

DmitriiK

Hello, my friend.

It helps to understand how .htaccess 301 redirects work. They are server-side operations: when a bot requests a page, it waits for the server's response, and in the case of a 301 that response is "Don't go here, go there." The catch is that a well-behaved bot reads robots.txt first, and if the URL is disallowed it does not request the page at all, so it never receives the 301 and has nothing to follow to the static URL. That's also why you sometimes see indexed pages that say "blocked by robots": they are indexed from the links pointing at them, without the bot ever fetching the URL. So if you want Google to see those redirects, remove the disallow for the redirected PHP URLs so they can be crawled.

Now, for canonical tags you are correct: the canonical link is IN the content of the page, so a bot that is blocked by robots.txt can't read it and therefore is never told that there is a canonical page.
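
If it helps to see the order of operations, here is a minimal Python sketch of what a robots.txt-compliant crawler does. It is only an illustration, not how Googlebot is actually implemented: the `example.com` URLs, the `/php/` directory, the `SERVER_RESPONSES` table and the `crawl` helper are all made up for the example. The only real API used is the standard library's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a directory of dynamic PHP pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /php/
"""

# Hypothetical server behaviour: the old PHP URL answers with a 301
# pointing at the static URL, and the static URL answers with a 200.
SERVER_RESPONSES = {
    "https://example.com/php/item.php?id=7": (301, "https://example.com/items/7.html"),
    "https://example.com/items/7.html": (200, None),
}


def crawl(url, robots, user_agent="Googlebot"):
    """Return what a robots.txt-compliant crawler learns about `url`."""
    # The robots.txt check comes before any request to the page itself.
    if not robots.can_fetch(user_agent, url):
        return "blocked: no request sent, redirect never seen"
    # Only an allowed URL is requested (simulated here with a lookup).
    status, location = SERVER_RESPONSES[url]
    if status == 301:
        return f"301 seen, follow to {location}"
    return f"{status} fetched"


robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())

print(crawl("https://example.com/php/item.php?id=7", robots))
print(crawl("https://example.com/items/7.html", robots))
```

With the disallow in place, the first call never reaches the simulated server, so the 301 (or a canonical tag in the page body) is never read. Parse an empty robots.txt instead and the same call reports the redirect.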

There is a recent WBF on this subject - https://mza.seotoolninja.com/blog/controlling-search-engine-crawlers-for-better-indexation-and-rankings-whiteboard-friday

Hope this clarifies some things.
