How does robots.txt affect aliased domains?
-
Several of my sites are aliased (hosted in subdirectories off the root domain on a single hosting account, but visible at www.theSubDirectorySite.com). Not ideal, I know, but that's a different issue.
I want to block bots from viewing those files that are accessible in subdirectories on the main hosting account, www.RootDomain.com/SubDirectorySite/, and force the bots to look at www.SubDirectorySite.com instead.
I used the canonical tag to point bots away from the subdirectory site, but I am wondering what will happen if I use robots.txt to block those files from within the root domain.
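For reference, a cross-domain canonical is technically a link element (not a meta tag) in the head of each subdirectory page, pointing at the aliased domain's copy of the same page. A sketch using the placeholder domains from this question:

```html
<!-- In the <head> of www.RootDomain.com/SubDirectorySite/some-page.html -->
<!-- Tells search engines the preferred URL is the aliased domain's copy -->
<link rel="canonical" href="http://www.SubDirectorySite.com/some-page.html" />
```

The href should point at the matching page on the standalone domain, not just the homepage.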
Will the bots, specifically Googlebot, still index the site at its own URL, www.AnotherSite.com, even if I've blocked that directory with Disallow: /AnotherSite/?
THANK YOU!!!
-
I'm assuming you can't 301-redirect (and that you still need the sub-directory versions to be reachable by humans)? I'm not sure the cross-domain canonical will work completely. I don't have a good example of a sub-folder to root domain canonical implementation. If the "sites" are identical, it should be ok.
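If a 301 did become an option later, one way to sketch it (assuming the host runs Apache with mod_rewrite, and using the placeholder names from the question) is an .htaccess rule on the root domain:

```apache
# .htaccess at the root of www.RootDomain.com
RewriteEngine On
# Permanently redirect anything under /SubDirectorySite/ to the aliased domain,
# preserving the rest of the path
RewriteRule ^SubDirectorySite/(.*)$ http://www.SubDirectorySite.com/$1 [R=301,L]
```

This is just a sketch; the exact rule depends on how the aliasing is configured on the hosting account.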
Robots.txt is going to depend a bit on how people access those. If there are links to the sub-directory versions, then blocking will cut off that link-juice (and the canonical or a 301 will be better).
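For completeness, blocking the subdirectory in the root domain's robots.txt would look like this (directory name is the placeholder from the question):

```text
# robots.txt at www.RootDomain.com/robots.txt
User-agent: *
Disallow: /SubDirectorySite/
```

Keep in mind that Disallow prevents crawling, not indexing — a blocked URL can still show up in results if other pages link to it, which is part of why the canonical or a 301 is usually the better tool here.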
Blocking the sub-directory shouldn't automatically block the domain it aliases, too, unless for some reason that sub-directory is the only crawl path Google has to the outside domain. As long as they're crawling the outside domain separately, I think you'll be ok. I'm just not sure robots.txt is necessary here.
Sorry, the devil may be in the details on this one. Happy to take a closer look in Private Q&A, if you want to give out some specifics.
Related Questions
-
Curious case of synonyms affecting our domain queries
Here is a curious case of synonyms affecting search suggestions with our domain name terms as well as search results rankings with our domain in the query. I have posted all the details here: https://productforums.google.com/forum/#!msg/webmasters/ESDluD9Q0-A/4qU4pRPP6OgJ Not sure if this is the right forum to get some tips on how to handle this case. Happy to take it down if this is not the right place. Any suggestions appreciated! Thanks
Technical SEO | madhurk
-
Can I rely on just robots.txt
We have a test version of a client's web site on a separate server before it goes onto the live server. Some code from the test site has somehow managed to get Google to index the test site, which isn't great! Would simply adding a robots.txt file to the root of the test site blocking everything be good enough, or will I have to put the noindex and nofollow meta tags on all pages of the test site as well?
Technical SEO | spiralsites
-
I accidentally blocked Google with Robots.txt. What next?
Last week I uploaded my site and forgot to remove the robots.txt file with this text: User-agent: * Disallow: / I dropped from page 11 on my main keywords to past page 50. I caught it 2-3 days later and have now fixed it. I re-imported my site map with Webmaster Tools and I also did a Fetch as Google through Webmaster Tools. I tweeted out my URL to hopefully get Google to crawl it faster too. Webmaster Tools no longer says that the site is experiencing outages, but when I look at my blocked URLs it still says 249 are blocked. That's actually gone up since I made the fix. In the Google search results, it still no longer has my page title and the description still says "A description for this result is not available because of this site's robots.txt – learn more." How will this affect me long-term? When will I recover my rankings? Is there anything else I can do? Thanks for your input! www.decalsforthewall.com
Technical SEO | Webmaster123
-
No indexing url including query string with Robots txt
Dear all, how can I block URLs/pages with query strings like page.html?dir=asc&order=name with robots.txt? Thanks!
Technical SEO | HMK-NL
-
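For anyone landing on this thread: the major engines (Google, Bing) support wildcard patterns in robots.txt, so one approach, assuming the parameter names from the question, is:

```text
User-agent: *
# Block any URL whose query string starts with dir=
Disallow: /*?dir=
# Block any URL with order= as a subsequent parameter
Disallow: /*&order=
```

Note that wildcards are an extension honored by the big crawlers rather than part of the original robots.txt standard, and Disallow only prevents crawling; a noindex meta tag is the more reliable way to keep such URLs out of the index.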
Domains and subdomains
When I started a campaign for my site, I got the message: "We have detected that the domain www.vamospaella.com and the domain vamospaella.com both respond to web requests and do not redirect. Having two "twin" domains that both resolve forces them to battle for SERP positions, making your SEO efforts less effective. We suggest redirecting one, then entering the other here." I wasn't sure whether I had said it was a subdomain when in fact it was a domain (or the other way round), so I started another campaign for the same website using the other option and the message didn't come up. However, I still don't understand what you meant by this and whether it's an issue. When I search for my website in Google, it shows as vamospaella.com, while other websites come up as www. and then their domain name. If it is a problem, is it to do with my hosting package and how it's set up, or is it to do with my local site on my computer? I did ring my web host, 1&1, but they said they couldn't see a problem. Please can you let me know how I can resolve this, as my ranking is still quite low in Google and I'm not sure why. If it is because of "twin domains", then will Google see my content as duplicated and keep me low in their rankings? I'm new to SEO and not a website novice, so please answer in lay terms! Thanks Melissa
Technical SEO | melissa1
-
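To resolve the twin-domain warning described above, a common fix on Apache-based shared hosting (which 1&1 typically provides) is an .htaccess rule 301-redirecting the bare domain to the www version. A sketch, assuming mod_rewrite is available:

```apache
RewriteEngine On
# If the request came in on vamospaella.com (no www)...
RewriteCond %{HTTP_HOST} ^vamospaella\.com$ [NC]
# ...301-redirect it to the same path on www.vamospaella.com
RewriteRule ^(.*)$ http://www.vamospaella.com/$1 [R=301,L]
```

Which version to keep (www or non-www) matters less than picking one and redirecting the other consistently.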
Any way around buying hosting for an old domain to 301 redirect to a new domain?
Howdy. I have just read this QA thread, so I think I have my answer. But I'm going to ask anyway! Basically DomainA.com is being retired, and DomainB.com is going to be launched. We're going to have to redirect numerous URLs from DomainA.com to DomainB.com. I think the way to go about this is to continue paying for hosting for DomainA.com, serving a .htaccess from that hosting account, and then hosting DomainB.com separately. Anybody know of a way to avoid paying for hosting a .htaccess file on DomainA.com? Thanks!
Technical SEO | SamTurri
-
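Assuming DomainA.com's hosting is kept (as the question suggests is hard to avoid) and it runs Apache, the .htaccess on that account could be as small as this sketch:

```apache
# .htaccess on DomainA.com's hosting account
RewriteEngine On
# Send every request, path included, to the same path on DomainB.com
RewriteRule ^(.*)$ http://DomainB.com/$1 [R=301,L]
```

Some registrars also offer domain-level forwarding that can issue a 301 without any hosting, though path-preserving redirects there vary by provider.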
Subfolder to Root Domain, Should i or not?
Dear All, I am kind of stuck and need your help. I have a major client's site which is located in a subdirectory. For instance: http://www.majorclient.com.cn/cn/ and as you can see the domain name is already localized and they are not planning to make any English version of the site, so there will be no /eng/ folder. The client's whole website is in the /cn/ subdirectory, which is totally unnecessary, and I would like to 301 the whole site to http://www.majorclient.com.cn. But the problem is that this site has been in the /cn/ folder forever, since the beginning, and a lot of trusted links are pointing to www.majorclient.com.cn/cn/. So should I move it, or just configure a 301 from www.majorclient.com.cn to www.majorclient.com.cn/cn/, leave it there, and not bother? Help?
Technical SEO | DigitalJungle
-
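If the move to the root does go ahead, a sketch of the .htaccess rule on www.majorclient.com.cn (assuming Apache with mod_rewrite; untested against that server) would be:

```apache
RewriteEngine On
# Redirect /cn/whatever to /whatever at the domain root, preserving the path
RewriteRule ^cn/(.*)$ /$1 [R=301,L]
```

The existing trusted links to /cn/ would then pass through this 301, and internal links should be updated to the root-level URLs at the same time.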
Robots.txt question
I want to block spiders from a specific part of the website (say the abc folder). In robots.txt, I would write: User-agent: * Disallow: /abc/ Do I have to include the trailing slash, or will this do: User-agent: * Disallow: /abc
Technical SEO | seoug_2005
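For readers comparing the two patterns: the trailing slash does change what is matched, since robots.txt rules are prefix matches.

```text
User-agent: *
# Matches only URLs under the /abc/ directory, e.g. /abc/page.html
Disallow: /abc/

# Matches any path beginning with /abc, including /abc.html and /abcdef/
Disallow: /abc
```

If the goal is to block just the folder's contents, the version with the trailing slash is the safer choice.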