Googlebot soon to be executing JavaScript - Should I change my robots.txt?
-
This question came to mind as I was pursuing an unrelated issue and reviewing a site's robots.txt file.
Currently this is a line item in the file:
Disallow: https://*

According to a recent post on the Google Webmaster Central Blog, [Understanding Web Pages Better](http://googlewebmastercentral.blogspot.com/2014/05/understanding-web-pages-better.html), Googlebot is getting much closer to being able to properly render JavaScript. Pardon some ignorance on my part, because I am not a developer, but wouldn't this require that Googlebot be able to execute JavaScript?

If so, I am concerned that disallowing Googlebot from the https:// versions of our pages could interfere with crawling and indexation, because as soon as an end-user clicks the "checkout" button on our view-cart page, everything on the site flips to https://. If this were disallowed, would Googlebot stop crawling at that point and simply leave, because all pages were now https://? Or am I just waaayyyy overthinking it? ...Wouldn't be the first time! Thanks all!
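As an aside on that Disallow line: standard robots.txt rules match URL *paths*, not schemes, and each protocol/host combination serves its own robots.txt. A minimal sketch with Python's stdlib `urllib.robotparser` shows how path-based rules are evaluated (the domain and rules here are hypothetical, not this site's actual file):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. Disallow values are path prefixes;
# a scheme pattern like "https://*" is not standard robots.txt syntax.
rules = """\
User-agent: *
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Paths under /cart/ are blocked; everything else stays crawlable.
print(parser.can_fetch("Googlebot", "http://example.com/cart/checkout"))    # False
print(parser.can_fetch("Googlebot", "http://example.com/products/widget"))  # True
```

Because the https:// origin is fetched with its own robots.txt, rules in the http:// file would not carry over to the https:// pages anyway.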
-
Excellent answer. Thanks so much, Doug. I really appreciate it! Adding a "nofollow" attribute to the checkout button is a good suggestion and should be fairly easy to implement. I realize that internal nofollows are not normally recommended, but in this instance it may not be a bad idea.
-
Hi Dana,
When you click on the checkout button, what's the mechanism for taking people to the https:// site? Is it just that the checkout link uses https:// in its URL? Is there some JavaScript wizardry you're particularly concerned about?
Even though Googlebot follows this one link to the https version of the cart, it will still have all the other (non-https) links from the previous page queued up to follow, so I don't think this will stop the crawl at that point. It would be a nightmare if Googlebot stopped crawling the entire site every time it went down a rabbit hole!
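The crawl-queue point can be sketched abstractly. This is a toy breadth-first crawler over a made-up link graph (not how Googlebot actually works): a disallowed link is simply skipped, while every other queued URL still gets visited.

```python
from collections import deque

# Hypothetical site graph: page -> links found on that page.
links = {
    "/": ["/products", "/view-cart"],
    "/products": ["/products/widget"],
    "/view-cart": ["https://example.com/checkout"],  # flips to https here
    "/products/widget": [],
}

def crawl(start, is_allowed):
    """Breadth-first crawl, skipping links the allow-check rejects."""
    seen, queue, visited = {start}, deque([start]), []
    while queue:
        page = queue.popleft()
        visited.append(page)
        for url in links.get(page, []):
            if url not in seen and is_allowed(url):
                seen.add(url)
                queue.append(url)
    return visited

# Disallow the https checkout URL; the rest of the queue is unaffected.
visited = crawl("/", lambda u: not u.startswith("https://"))
print(visited)  # ['/', '/products', '/view-cart', '/products/widget']
```

The blocked checkout URL never enters the queue, but the crawl of every other page proceeds normally.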
That's not to say that you wouldn't want to consider nofollowing your checkout button. I'm sure neither you nor Google want the innards of the cart pages to be indexed. There are probably other pages you'd rather Googlebot spent its time finding, right?
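If you do add the nofollow, it's easy to sanity-check the markup. A minimal sketch using Python's stdlib HTML parser (the anchor markup here is hypothetical, not this site's actual template):

```python
from html.parser import HTMLParser

class NofollowChecker(HTMLParser):
    """Collects each anchor's href and whether it carries rel="nofollow"."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = dict(attrs)
            rel = (attrs.get("rel") or "").split()
            self.links.append((attrs.get("href"), "nofollow" in rel))

checker = NofollowChecker()
checker.feed('<a href="/products">Shop</a>'
             '<a href="https://example.com/checkout" rel="nofollow">Checkout</a>')
print(checker.links)
# [('/products', False), ('https://example.com/checkout', True)]
```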
My take on the Google blog post about understanding JavaScript is that the aim is to do a better job of discovering content that might be hidden by JavaScript/AJAX. It's a problem for Google when the raw HTML they're crawling doesn't accurately reflect the content that is displayed in front of a real visitor.
Related Questions
-
404s in Google Search Console and javascript
The end of April, we made the switch from http to https and I was prepared for a surge in crawl errors while Google sorted out our site. However, I wasn't prepared for the surge in impossibly incorrect URLs and partial URLs that I've seen since then. I have learned that as Googlebot grows up, he/she is now attempting to read more javascript and will occasionally try to parse out and "read" a URL in a string of javascript code where no URL is actually present. So, I've "marked as fixed" hundreds of bits like /TRo39, category/cig, etc., etc.... But they are also returning hundreds of otherwise correct URLs with a .html extension when our CMS system generates URLs with a .uts extension, like this: https://www.thompsoncigar.com/thumbnail/CIGARS/90-RATED-CIGARS/FULL-CIGARS/9012/c/9007/pc/8335.html when it should be: https://www.thompsoncigar.com/thumbnail/CIGARS/90-RATED-CIGARS/FULL-CIGARS/9012/c/9007/pc/8335.uts Worst of all, when I look at them in GSC and check the "linked from" tab it shows they are linked from themselves, so I can't backtrack and find a common source of the error. Is anyone else experiencing this? Got any suggestions on how to stop it from happening in the future? Last month it was 50 URLs, this month 150, so I can't keep creating redirects and hoping it goes away. Thanks for any and all suggestions!
Algorithm Updates | | LizMicik0 -
Google can't read my robots.txt for the past 10 days
http://awesomescreenshot.com/08d1s6aybc Hi, my robots.txt is http://wallpaperzoo.com/robots.txt. Google says it can't read it and has postponed the crawl. It's been 10 days and no crawl. Please help me in solving this issue. This is the same with http://hdwallpaperzones.com/robots.txt
Algorithm Updates | | toxicpls0 -
Drop in Traffic from Google, However no change in the rankings
I have seen a 20% drop in traffic from Google last week (after April 29th). However, when I try to analyze the rank of the keywords in the Google results that send me traffic, they seem to be the same. Today (6th March) traffic has fallen further again with not much (if any) visible change in the rankings. Any ideas on what the reason for this could be? I have not made any changes to the website recently.
Algorithm Updates | | raghavkapur0 -
Title changed in local pack, unchanged in local plus?!
Google seems to have pulled the title from the homepage and put that as the title in the local pack in the SERP for my targeted keyword. The local plus page title remains unchanged. Any way to influence this back to the way it was? The local plus title looks much better in results (even though it's just the brand name (which is the same as the domain name) and not the city + industry).
Algorithm Updates | | Mozzin1 -
Changing the # of results per page in Google search settings displays totally different results. Why is this?
Curious what's going on here. This is the first time I've seen this before. What's happening is this ... In Google, I search for "mobile apps orange county" and get a standard list of 10 results. I go to Google's search settings in the top right corner of the page (button is grey with a gear) to change the number of results per page from 10 to 50 (also did 100). When I go back to Google and search again for "mobile apps orange county" I get a much larger list but with completely different results. This time around the top 10-12 are dominated by the same website (ocregister.com) What's going on here that Google would now show different results? Why is this one website all of a sudden dominating the first 12 results? Thanks everyone! ByteLaunch
Algorithm Updates | | ByteLaunch0 -
Google.co.uk vs pages from the UK - anyone noticed any changes?
We've started to notice some changes in the rankings of Google UK and Google pages from the UK. Pages from the UK have always typically ranked higher, however it seems like these are slipping, and Google UK pages (pages from the web) are climbing. We've noticed a similar thing happening in the Bing/Yahoo algorithm as well. Just wondered if anyone else has anyone else noticed this? Thanks
Algorithm Updates | | Digirank0 -
Rankings changing based on location within a country... normal?
I recently had a satellite office across the country come to me and say that they couldn't find us on Google, based on a number of keywords they were searching on. I thought that isn't right... I know we rank for those terms. So, I did a search here, and there we were for those very terms, and ranking quite nicely. Sooo, what's going on there? I know there are variations from Google.com to Google.ca in terms of ranking. But within Google.ca I've not seen this before. Can anyone shed some light on that?
Algorithm Updates | | atcosl0 -
Google changing case of URLs in SERPs?
Noticed some strange behavior over the last week or so regarding our SERPs and I haven't been able to find anything on the web about what might be happening. Over the past two weeks, I've been seeing our URLs slowly change from upper case to lower case in the SERPs. Our URLs are usually /Blue-Fuzzy-Widgets.htm but Google has slowly been switching them to /blue-fuzzy-widgets.htm. There has been no change in our actual rankings nor has it happened to anyone else in the space. We're quite dumbfounded as to why Google would choose to serve the lower case URL. To be clear, we do not build links to these lower case URLs, only the upper. Any ideas what might be happening here?
Algorithm Updates | | Natitude0