I am trying to block robots from indexing parts of my site..
-
I have a few websites that I mocked up for clients to check out my work and get a feel for the style I produce but I don't want them indexed as they have lore ipsum place holder text and not really optimized... I am in the process of optimizing them but for the time being I would like to block them. Most of my warnings and errors on my seomoz dashboard are from these sites and I was going to upload the folioing to the robot.txt file but I want to make sure this is correct:
User-agent: *
Disallow: /salondemo/
Disallow: /salondemo3/
Disallow: /cafedemo/
Disallow: /portfolio1/
Disallow: /portfolio2/
Disallow: /portfolio3/
Disallow: /salondemo2/
is this all i need to do?
Thanks
Donny
-
Thanks
-
this is the correct approach when using the robots.txt method of blocking. Be aware however, that the only secure way to ensure 100% that such locations are not indexed is to put them behind a password protected gate-way. I always recommend to design agencies that there be a simple single log-in screen between the front end and design folders. This can be as complex as unique UIDs and Passwords for every client, or a single shared login, if all you want to do is bar search engines from seeing the content.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
We are migrating a site and are seeing alot of 301s and 302s already in the old site is it ok to leave those as is?
For the 3xx’s I’m not sure if it’s okay for us to redirect to these so please advise on that
Technical SEO | | lina_digital0 -
Is it problematic for Google when the site of a subdomain is on a different host than the site of the primary domain?
The Website on the subdomain runs on a different server (host) than the site on the main domain.
Technical SEO | | Christian_Campusjaeger0 -
Easy Question: regarding no index meta tag vs robot.txt
This seems like a dumb question, but I'm not sure what the answer is. I have an ecommerce client who has a couple of subdirectories "gallery" and "blog". Neither directory gets a lot of traffic or really turns into much conversions, so I want to remove the pages so they don't drain my page rank from more important pages. Does this sound like a good idea? I was thinking of either disallowing the folders via robot.txt file or add a "no index" tag or 301redirect or delete them. Can you help me determine which is best. **DEINDEX: **As I understand it, the no index meta tag is going to allow the robots to still crawl the pages, but they won't be indexed. The supposed good news is that it still allows link juice to be passed through. This seems like a bad thing to me because I don't want to waste my link juice passing to these pages. The idea is to keep my page rank from being dilluted on these pages. Kind of similar question, if page rank is finite, does google still treat these pages as part of the site even if it's not indexing them? If I do deindex these pages, I think there are quite a few internal links to these pages. Even those these pages are deindexed, they still exist, so it's not as if the site would return a 404 right? ROBOTS.TXT As I understand it, this will keep the robots from crawling the page, so it won't be indexed and the link juice won't pass. I don't want to waste page rank which links to these pages, so is this a bad option? **301 redirect: **What if I just 301 redirect all these pages back to the homepage? Is this an easy answer? Part of the problem with this solution is that I'm not sure if it's permanent, but even more importantly is that currently 80% of the site is made up of blog and gallery pages and I think it would be strange to have the vast majority of the site 301 redirecting to the home page. What do you think? DELETE PAGES: Maybe I could just delete all the pages. This will keep the pages from taking link juice and will deindex, but I think there's quite a few internal links to these pages. How would you find all the internal links that point to these pages. There's hundreds of them.
Technical SEO | | Santaur0 -
Staging & Development areas should be not indexable (i.e. no followed/no index in meta robots etc)
Hi I take it if theres a staging or development area on a subdomain for a site, who's content is hence usually duplicate then this should not be indexable i.e. (no-indexed & nofollowed in metarobots) ? In order to prevent dupe content probs as well as non project related people seeing work in progress or finding accidentally in search engine listings ? Also if theres no such info in meta robots is there any other way it may have been made non-indexable, or at least dupe content prob removed by canonicalising the page to the equivalent page on the live site ? In the case in question i am finding it listed in serps when i search for the staging/dev area url, so i presume this needs urgent attention ? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
I have a site that has both http:// and https:// versions indexed, e.g. https://www.homepage.com/ and http://www.homepage.com/. How do I de-index the https// versions without losing the link juice that is going to the https://homepage.com/ pages?
I can't 301 https// to http:// since there are some form pages that need to be https:// The site has 20,000 + pages so individually 301ing each page would be a nightmare. Any suggestions would be greatly appreciated.
Technical SEO | | fthead90 -
How much of an issue is it if a site is somehow connected to a site that was penalized by Google?
I am working with someone that is about to launch a new site, and one of the sites was affected by the Panda update. Does it matter if the two sites are connected? Share the same hosting provider and same Google Webmaster's account?
Technical SEO | | nicole.healthline0 -
What is consider best practice today for blocking admins from potentially getting indexed
What is consider best practice today for blocking pages, for instance xyz.com/admin pages, from getting indexed by the search engines or easily found. Do you recommend to still disallow it in the robots.txt file or is the robots.txt not the best place to notate your /admin location because of hackers and such? Is it better to hide the /admin with an obscure name, use the noidex tag on the page and don't list in the robots.txt file?
Technical SEO | | david-2179970 -
Why Google did not index our domain?
Hi, We launched tmart 60 days ago and submitted to google, bing, yahoo 20 days later. But google had never indexed our website still when yahoo indexed it in one week. What we have checked or tried: 1. We got 20~50 inlinks in one month and now 81 inlinks via yahoo site explorer. 2. This domain has registered for 13 years and we purchased it from sedo last year. We
Technical SEO | | zt673
did not find any problems from domain archive pages. 3. Page similar: the homepage is 50% similar to one of our competitors when we just launched.
So we adjusted the page structure and modified the content one month later and decreased the similarity to 30% (by tools from webconfs.com) 4. Google Robots: googlebot crawled our website every day after we submitted for indexing.
We opened GWT account for it and added the xml sitemap last week. GWT said nothing
was wrong except the time of page loading. Our questions: Why google did not indexed our website? What should we do? Thanks, wu0