What are your thoughts on security of placing CMS-related folders in a robots.txt file?
-
So I was just about to add a whole heap of CMS-related folders to my robots.txt file to exclude them from search, and thought "hey, I'm publicly telling people where my admin folders are"...surely that's not right?!
Should I leave them out of the robots.txt file, and hope for the best that they never get indexed? Should I use noindex meta data on every page?
What are people's thoughts?
Thanks,
James
PS. I know this is similar to lots of other discussions around meta noindex vs. robots.txt, but I'm after specific thoughts around the security aspect of listing your admin folders in a robots.txt file...
-
surly your admin folders are secured?, it would not matter if someone knows where they are.
-
As a rule, you want to avoid using robots.txt files whenever possible. It does not consistently protect you from crawlers and when it does block crawlers it kills any PR on those pages.
If you can block those pages with a noindex tag, it would be a preferable solution.
With respect to security for a CMS site, it really needs to be a comprehensive effort. Many site owners take a couple steps and then have a false-sense of security. Here are a few thoughts:
-
try the site address with /administrator after it to access Joomla and other sites
-
try the site address or blog with /wp-admin/ after it to access Joomla sites
-
make up a webpage and try accessing it to view the site's 404 page
-
right-click on a page and choose View Page Source. Often you will see the name of the CMS clearly listed. Other times you will see clear clues such as /wp/ in folder names. Other times you will find unique extensions such as Yoast SEO which will give you an idea of the CMS
Once a bad guy knows which CMS is in use, they know the default folder structure and more. The point is it requires a lot more effort then most people realize to hide the CMS in use. I applaud your effort, but be very thorough about it. There is a lot more involved then simply covering your robots.txt file.
-
-
I found three options for you: http://www.techiecorner.com/106/how-to-disable-directory-browsing-using-htaccess-apache-web-server/
I think if you do it with.htacces that is a folder specific file than nobody will be able to detect where admin contet is located.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Thoughts on Botify?
Has anyone used Botify? Is this type of software necessary for a site with under 5K pages?
Technical SEO | | SoulSurfer80 -
Robots.txt & meta noindex--site still shows up on Google Search
I have set up my robots.txt like this: User-agent: *
Technical SEO | | RoxBrock
Disallow: / and I have this meta tag in my on a Wordpress site, set up with SEO Yoast name="robots" content="noindex,follow"/> I did "Fetch as Google" on my Google Search Console My website is still showing up in the search results and it says this: "A description for this result is not available because of this site's robots.txt" This site has not shown up for years and now it is ranking above my site that I want to rank for this keyword. How do I get Google to ignore this site? This seems really weird and I'm confused how a site with little content, that has not been updated for years can rank higher than a site that is constantly updated and improved.1 -
Should I block Map pages with robots.txt?
Hello, I have a website that was started in 1999. On the website I have map pages for each of the offices listed on my site, for which there are about 120. Each of the 120 maps is in a whole separate html page. There is no content in the page other than the map. I know all of the offices love having the map pages so I don't want to remove the pages. So, my question is would these pages with no real content be hurting the rankings of the other pages on our site? Therefore, should I block the pages with my robots.txt? Would I also have to remove these pages (in webmaster tools?) from Google for blocking by robots.txt to really work? I appreciate your feedback, thanks!
Technical SEO | | imaginex0 -
GWT returning 200 for robots.txt, but it's actually returning a 404?
Hi, Just wondering if anyone has had this problem before. I'm just checking a client's GWT and I'm looking at their robots.txt file. In GWT, it's saying that it's all fine and returns a 200 code, but when I manually visit (or click the link in GWT) the page, it gives me a 404 error. As far as I can tell, the client has made no changes to the robots.txt recently, and we definitely haven't either. Has anyone had this problem before? Thanks!
Technical SEO | | White.net0 -
Google is indexing blocked content in robots.txt
Hi,Google is indexing some URLs that i don't want to be indexed and also is indexing the same URLs with https. This URLs are blocked in the file robots.txt.I've tried to block this URLs through Google WebmasterTools but Google doesn't let me do it because this URL are httpsThe file robots.txt is correct so, what can i do to avoid this content to be indexed?
Technical SEO | | elisainteractive0 -
Help needed please with 301 redirects in htaccess file.
In summary, we're currently having issues with our htaccess file. 301 redirects are going through to the new described URL but in addition the new URL is followed by a ? and the old URL. How can we get rid of the ? and previous URL so they don't appear as an ending. None of the examples we've found re this issue online appear to work. Can anyone please offer some advice? Can we use a RewriteRule to stop this happening? Here's a summary of the htaccess file REDIRECT CODE BEGINS HERE LONG LIST OF REDIRECTS, which appear to be set up perfectly fine. REDIRECT CODE ENDS DirectoryIndex index.php <ifmodule mod_rewrite.c="">RewriteEngine On Options +FollowSymLinks
Technical SEO | | petersommertravels
DirectoryIndex index.php
RewriteEngine On
RewriteCond $1 !^(images|system|themes|pdf|favicon.ico|robots.txt|index.php) [NC]
RewriteRule ^.htaccess$ - [F]
RewriteRule ^favicon.ico - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?/$1 [L]</ifmodule> DirectoryIndex index.php0 -
Restricted by robots.txt does this cause problems?
I have restricted around 1,500 links which are links to retailers website and links that affiliate links accorsing to webmaster tools Is this the right approach as I thought it would affect the link juice? or should I take the no follow out of the restricted by robots.txt file
Technical SEO | | ocelot0 -
Duplicate content on the one domain (related to country targetting)
Hi - We have a client who has a TLD that they wish to target several markets using .com/au .com/us .com/sg Each will use duplicate content with slight variations to cater for the local market (spelling, industry jargon). They seem reluctant to register a TLD for each target market (which was our suggestion) and I am wondering what SEO penalties would apply for having a majority of duplicate content on the same domain – perhaps using subdomains would be better? Thanks!
Technical SEO | | E2E0