How does robots.txt affect aliased domains?

michaelj_me

Several of my sites are aliased (hosted in subdirectories off the root domain on a single hosting account, but visible at www.theSubDirectorySite.com) Not ideal, I know, but that's a different issue.

I want to block bots from viewing those files that are accessible in subdirectories on the main hosting account, www.RootDomain.com/SubDirectorySite/, and force the bots to look at www.SubDirectorySite.com instead.

I utilized the canonical meta tag to point bots away from the sub directory site, but I am wondering what will happen if I use robots.txt to block those files from within the root domain.

Will the bots, specifically Google bot, still index the site at its own URL, www.AnotherSite.com even if I've blocked that directory with Disallow: /AnotherSite/ ?

THANK YOU!!!

Dr-Pete

I'm assuming you can't 301-redirect (and that you still need the sub-directory versions to be reachable by humans)? I'm not sure the cross-domain canonical will work completely. I don't have a good example of a sub-folder to root domain canonical implementation. If the "sites" are identical, it should be ok.

Robots.txt is going to depend a bit on how people access those. If there are links to the sub-directory versions, then blocking will cut off that link-juice (and the canonical or a 301 will be better).

Blocking the sub-directory shouldn't automatically block the domain it aliases, too, unless for some reason that sub-directory is the only crawl path Google has to the outside domain. As long as they're crawling the outside domain separately, I think you'll be ok. I'm just not sure if Robots.txt is necessary here.

Sorry, the devil may be in the details on this one. Happy to take a closer look in Private Q&A, if you want to give out some specifics.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

How does robots.txt affect aliased domains?

Browse Questions

Explore more categories

Related Questions

Using one robots.txt for two websites

Robots.txt Disallow: / in Search Console

Parked Domains

"Extremely high number of URLs" warning for robots.txt blocked pages

Which domain should i set up a blog on?

OK to block /js/ folder using robots.txt?

Domain.com and domain.com/ redirect(error)

External Links from own domain