Does Googlebot Read Session IDs?
-
I did a raw export from Ahrefs yesterday, and one of our sites has 18,000 backlinks coming from the same site. But they're all the same link, just with a different session ID. The structure of the URL is:
[website].com/resources.php?UserID=10031529
And we have 18,000 of these with a different ID.
Does Google read each of these as a unique backlink or does it realize there's just one link and the session ID is throwing it off? I read different opinions when researching this so I'm hoping the Moz community can give some concrete answers.
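Whatever Google does with these, it may help to see how many distinct linking pages are left once the session ID is ignored. Here is a minimal sketch, assuming the export is a plain list of URLs and that `UserID` is the only session parameter; the `example.com` host, the function names, and the sample IDs other than the one quoted above are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: UserID is the only session-style parameter in the export.
SESSION_PARAMS = {"userid"}


def strip_session_params(url: str) -> str:
    """Return the URL with session-style query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def unique_referring_pages(urls):
    """Collapse URLs that differ only by session ID into a set of pages."""
    return {strip_session_params(url) for url in urls}


# Hypothetical sample rows standing in for the raw export.
sample = [
    "https://example.com/resources.php?UserID=10031529",
    "https://example.com/resources.php?UserID=10031530",
    "https://example.com/resources.php?UserID=10031531",
]
print(len(unique_referring_pages(sample)))  # 1
```

This only shows how the export collapses on paper; it says nothing about how Google itself counts the links.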
-
Safest bet: set up canonicals that point to the page minus the parameter, so even if Google does read the session IDs it will understand that they all relate to the canonical URL. Honestly, I'm not 100% sure whether Google reads those session IDs either, and I have seen conflicting information. I know it reads other parameters as separate URLs; I had a few issues with the way one of our sites handled products (sometimes it was ?model=, sometimes ?prod_id=, and some old products also had ?sku=). Adding the canonicals will solve this problem if it exists, and if it doesn't exist, a self-referential canonical sitting in the code won't hurt and is useful in case someone scrapes your site.
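As a rough illustration of "the page minus the parameter", here is a small sketch that builds the canonical tag for a requested URL. It assumes `UserID` is the parameter to drop and uses a placeholder `example.com` host; `canonical_link_tag` and `IGNORED_PARAMS` are hypothetical names, not part of any framework, and the page template would still have to place the tag inside `<head>`.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters never change the page content.
IGNORED_PARAMS = {"userid"}


def canonical_link_tag(request_url: str) -> str:
    """Build a <link rel="canonical"> tag that omits ignored parameters."""
    parts = urlsplit(request_url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # The fragment is dropped because it is never part of a canonical URL.
    canonical = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


# Both the session URL and the clean URL produce the same tag.
print(canonical_link_tag("https://example.com/resources.php?UserID=10031529"))
print(canonical_link_tag("https://example.com/resources.php"))
```

On the clean URL the result is the self-referential canonical mentioned above, so the same code path covers both cases.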
-
You have to keep yourself informed and really watch out for this kind of thing with search engine bots.