SSL and robots.txt question - confused by Google guidelines
-
I noticed "Don’t block your HTTPS site from crawling using robots.txt" here: http://googlewebmastercentral.blogspot.co.uk/2014/08/https-as-ranking-signal.html
Does this mean you can't use robots.txt anywhere on the site - even parts of a site you want to noindex, for example?
-
Hi Luke,
Just make sure that the robots.txt file at https://www.example.com/robots.txt doesn't block search engine spiders from the whole site. Of course, there may be some folders or file types you want to block, but it certainly shouldn't look like the example below, which would block everything:
User-agent: *
Disallow: /
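If you want to sanity-check a robots.txt before publishing it, something like the following rough sketch might help. It uses Python's standard urllib.robotparser module; the helper name is_allowed, the /private/ folder, and the sample URLs are all made up for illustration and are not from the Google post.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper for illustration; it parses the text locally
    instead of downloading the live file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The file from the post above: blocks the entire site.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# Assumed alternative: blocks only one (made-up) folder.
BLOCK_ONE_FOLDER = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(BLOCK_EVERYTHING, "https://www.example.com/"))  # False
print(is_allowed(BLOCK_ONE_FOLDER, "https://www.example.com/"))  # True
print(is_allowed(BLOCK_ONE_FOLDER, "https://www.example.com/private/page.html"))  # False
```

The second file is the kind of thing that should be fine: one section is blocked, but the rest of the HTTPS site stays crawlable.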
Hope that helps
-
No, that's not what they mean. It means Google recommends that you allow the secure (HTTPS) version of your site (where applicable) to be crawled. You can still block certain pages or sections should you choose to do so.
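With regard to noindexing, you could also place a noindex robots meta tag on the actual page as an alternative. Note that a crawler has to be able to fetch the page to see that tag, so a page you noindex this way should not also be disallowed in robots.txt.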
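For anyone who wants to confirm that a page really carries the on-page noindex, here is a minimal sketch using only Python's standard html.parser module. The class name RobotsMetaParser, the function has_noindex, and the sample HTML are assumptions for illustration; it only looks at a meta tag named "robots" and ignores other ways of sending the directive.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # The content value is a comma-separated list such as "noindex, follow".
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Made-up sample page for illustration.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Internal search results</title>
    <meta name="robots" content="noindex, follow">
  </head>
  <body>Not meant for the index.</body>
</html>
"""

print(has_noindex(SAMPLE_PAGE))  # True
print(has_noindex("<html><head><title>Home</title></head></html>"))  # False
```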