How to prevent a directory from being accessed by search engines?
-
Pretty much as the question says: is there any way to stop search engines from crawling a directory? I am working on a WordPress installation for my site but don't want it to be listed in search engines until it's ready to be shown to the world. I know the simplest way is to password-protect the directory, but I had some issues when I tried to implement that, so I'd like to see if there's a way to do it without passwords. Thanks in advance.
-
But don't forget to remove that Disallow from robots.txt when you go live, if you want those pages to be indexed (and remove the meta robots noindex, nofollow as well).
Otherwise you might be pulling your hair out trying to figure out why none of your pages are getting indexed in the SERPs.
-
You're absolutely right! I left that part out. Thanks
-
The robots.txt file does not guarantee that your pages will not show up in search results! Your best bet after password protection is adding a noindex meta tag to your page headers.
Google have openly said that they obey this tag (Matt Cutts).
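For anyone who wants to confirm the tag is actually being served, here is a minimal Python sketch that scans a page's HTML for a robots meta tag containing noindex. The names `RobotsMetaFinder`, `has_noindex` and `SAMPLE_PAGE` are made up for illustration, and the sample markup is an assumption about what the page head might look like. Keep in mind that a crawler can only see the tag if it is allowed to fetch the page, so as far as I know the tag has no effect on URLs that robots.txt blocks from crawling.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html_text):
    """Return True if the HTML carries a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return "noindex" in finder.directives


# Hypothetical page head, for illustration only.
SAMPLE_PAGE = """<html><head>
<title>Work in progress</title>
<meta name="robots" content="noindex, nofollow">
</head><body>Coming soon</body></html>"""

print(has_noindex(SAMPLE_PAGE))
```

The same function can be pointed at the HTML of the live site after launch to make sure the tag is gone.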
-
Xee,
It always helps, and it is very easy to implement. This feature for showing the path to the sitemap is very good.
-
It's not required to have the ending slash. At least, it works for us without it.
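The slash does change what gets matched, though, because Disallow values are treated as path prefixes. A small Python sketch using the standard library's `urllib.robotparser` shows the difference. The helper `is_allowed` and the path `/directoryname-old/` are made up for illustration, and real crawlers may differ from this parser in edge cases.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules, path, agent="*"):
    """Parse robots.txt text and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


WITHOUT_SLASH = "User-agent: *\nDisallow: /directoryname"
WITH_SLASH = "User-agent: *\nDisallow: /directoryname/"

for label, rules in (("no slash", WITHOUT_SLASH), ("slash", WITH_SLASH)):
    for path in ("/directoryname/page.html", "/directoryname-old/page.html"):
        # Prints whether the path is still crawlable under each rule set.
        print(label, path, is_allowed(rules, path))
```

Without the slash, the rule also catches any other path that merely starts with /directoryname, such as the hypothetical /directoryname-old/. With the slash, only URLs inside the directory are blocked.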
-
As it is, my site is just phpBB3 forums (www.bearsfansonline.com); would a sitemap really help that much?
-
If you don't have a robots.txt file yet, there are a few important things to set up first.
First, do you have a sitemap.xml for your website? If not, it's very important, and you can create one at: http://www.xml-sitemaps.com/
Create a robots.txt file and include the following:
User-agent: *
Allow: /
Disallow: /directoryname
Sitemap: http://www.yousite.com/sitemap.xml
With this you tell all robots where your sitemap is. You should read more about robots.txt in this great post: http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts
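A quick way to sanity-check a file like this before uploading is Python's built-in `urllib.robotparser` (the `site_maps()` call needs Python 3.8 or later). This is only a sketch: the variable names are made up, and the test path `/directoryname/index.php` is an assumption. Note that this particular parser applies the first rule that matches, so the sketch lists Disallow before Allow. Crawlers that pick the longest matching path instead, which is what Google documents, should treat either order the same way.

```python
from urllib.robotparser import RobotFileParser

# Same rules as in the post, with Disallow listed first because
# urllib.robotparser uses the first matching rule in file order.
ROBOTS_TXT = """\
User-agent: *
Disallow: /directoryname
Allow: /

Sitemap: http://www.yousite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

sitemaps = parser.site_maps()  # list of Sitemap URLs, or None if absent
blocked = not parser.can_fetch("*", "/directoryname/index.php")
home_ok = parser.can_fetch("*", "/")

print("Sitemaps:", sitemaps)
print("Directory blocked:", blocked)
print("Home page crawlable:", home_ok)
```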
-
Shouldn't you put a slash at the end of the directory in the robots file?
You can create the robots file through Google Webmaster Tools.
-
I don't have a robots.txt file in my root. Do I just create a text file, put the above lines into it, and upload it to my root after changing the name?
-
I'm assuming you want all search engines blocked from this directory. If so, edit your robots.txt file to state the following. This will tell all bots that honor robots.txt not to crawl that folder/directory on your site:
User-agent: *
Disallow: /directoryname
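If there is no robots.txt yet, it is just a plain text file with exactly that name, placed in the site root. A rough Python sketch of creating it and double-checking the result follows. The helper names, the temporary directory standing in for the web root, and the test paths are all assumptions for illustration. Uploading the file to the real server is left out.

```python
import tempfile
from pathlib import Path
from urllib.robotparser import RobotFileParser

RULES = "User-agent: *\nDisallow: /directoryname\n"


def write_robots_txt(site_root):
    """Write the rules to <site_root>/robots.txt and return the file path."""
    target = Path(site_root) / "robots.txt"
    target.write_text(RULES, encoding="utf-8")
    return target


def blocked_for(robots_path, agent, url_path):
    """Return True if `agent` is not allowed to fetch `url_path`."""
    parser = RobotFileParser()
    parser.parse(robots_path.read_text(encoding="utf-8").splitlines())
    return not parser.can_fetch(agent, url_path)


# A temporary directory stands in for the real web root here.
with tempfile.TemporaryDirectory() as site_root:
    robots_file = write_robots_txt(site_root)
    results = {
        agent: blocked_for(robots_file, agent, "/directoryname/wp-login.php")
        for agent in ("Googlebot", "bingbot")
    }
    open_elsewhere = not blocked_for(robots_file, "Googlebot", "/index.html")

print(results, open_elsewhere)
```

Because the rule sits under User-agent: *, it applies to every bot that has no more specific section of its own, while the rest of the site stays crawlable.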