Separate Servers for Humans vs. Bots with Same Content Considered Cloaking?
-
Hi,
We are considering using separate servers for when a Bot vs. a Human lands on our site to prevent overloading our servers. Just wondering if this is considered cloaking if the content remains exactly the same to both the Bot & Human, but on different servers.
And if this isn't considered cloaking, will this affect the way our site is crawled? Or hurt rankings?
Thanks
-
The additional massive complexity, expense, upkeep and risk of trying to run a separate server just for bots is nowhere near worth it, in my opinion. (Don't forget, you'd also have to build a system to replicate the content between each server every time content/code is added or edited. That replication process could well use more resources than the bots do!)
I'd say you'd be much better off using all those resources towards a more robust primary server and let it do it's job.
In addition, as Lesley says, you can tune GoogleBot, and can actually schedule Bing's crawl times in their Webmaster Tools. Though for me, I'd want the search engine bots to get in and index my site just as soon as they were willing.
Lastly, it's only a few minutes' work to source a ready-made blacklist of "bad bots" useragents that you can quickly insert into your htaccess file to completely block a significant number of the most wasteful and unnecessary bots. You will want to update such a blacklist every few months as the worst offenders regularly change useragents to avoid just such blacklisting.
Does that make sense as an alternative?
Paul
-
I second what Jonathan says, but I would also like to add a couple of things. One thing I would keep in mind is reserve power on your server. If you are running the server close enough to its maximum traffic limit where a bot would matter, I would upgrade the whole server. All it takes is one nice spike from somewhere like hacker news or reddit to take your site offline, especially if you are running close to the red.
From my understanding you can actually adjust how and when Google will crawl you site also, https://developers.google.com/search-appliance/documentation/50/help_mini/crawl_fullcrawlsched
-
I've never known search engine bots to be particularly troublesome and overload servers. However, there are a few things you could do:
1. Setup Caching
2. Setup something like Cloudflare which would be able to block other threats.
I cannot imagine you are intending to block google, bing etc as I would definitely advise against cloaking the site like that from Google.
Of course it is difficult to make any specific comment as I have no idea to the extent of the problem you are suffering from. But something like caching \ cloudflare security features will help alot.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Would this be duplicate content or bad SEO?
Hi Guys, We have a blog for our e-commerce store. We have a full-time in-house writer producing content. As part of our process, we do content briefs, and as part of the brief we analyze competing pieces of content existing on the web. Most of the time, the sources are large publications (i.e HGTV, elledecor, apartmenttherapy, Housebeautiful, NY Times, etc.). The analysis is basically a summary/breakdown of the article, and is sometimes 2-3 paragraphs long for longer pieces of content. The competing content analysis is used to create an outline of our article, and incorporates most important details/facts from competing pieces, but not all. Most of our articles run 1500-3000 words. Here are the questions: Would it be considered duplicate content, or bad SEO practice, if we list sources/links we used at the bottom of our blog post, with the summary from our content brief? Could this be beneficial as far as SEO? If we do this, should be nofollow the links, or use regular dofollow links? For example: For your convenience, here are some articles we found helpful, along with brief summaries: <summary>I want to use as much of the content that we have spent time on. TIA</summary>
White Hat / Black Hat SEO | | kekepeche1 -
Malicious links on our site indexed by Google but only visible to bots
We've been suffering from some very nasty black hat seo. In Google's index, our pages show external links to various pharmaceutical websites, but our actual live pages don't show them. It seems as though only certain user-agents see the malicious links. Setting up Screaming Frog SEO crawler using the Googlebot user agent also sees the malicious links. Any idea what could have caused this or how this can be stopped? We scanned all files on our webserver and couldn't find any of malicious links. We've changed our FTP and CMS passwords, is there anything else we can do? Thanks in advance!
White Hat / Black Hat SEO | | SEO-Bas0 -
Pages mirrored on unknown websites (not just content, all the HTML)... blackhat I've never seen before.
Someone more expert than me could help... I am not a pro, just doing research on a website... Google Search Console shows many backlinks in pages under unknown domains... this pages are mirroring the pages of the linked website... clicking on a link on the mirror page leads to a spam page with link spam... The homepage of these unknown domain appear just fine... looks like that the domain is partially hijacked... WTF?! Have you ever seen something likes this? Can it be an outcome of a previous blackhat activity?
White Hat / Black Hat SEO | | 2mlab0 -
Noindexing Thin Content Pages: Good or Bad?
If you have massive pages with super thin content (such as pagination pages) and you noindex them, once they are removed from googles index (and if these pages aren't viewable to the user and/or don't get any traffic) is it smart to completely remove them (404?) or is there any valid reason that they should be kept? If you noindex them, should you keep all URLs in the sitemap so that google will recrawl and notice the noindex tag? If you noindex them, and then remove the sitemap, can Google still recrawl and recognize the noindex tag on their own?
White Hat / Black Hat SEO | | WebServiceConsulting.com0 -
URL Structure - forward slashes, hyphen separated, query paramters
I am having difficulty evaluating pros and cons of various URL structures with respect to SEO benefits. So I can have the following 1. /for-sale-in-<city>-<someothertext>-<uniqueid>.php
White Hat / Black Hat SEO | | proptiger
So in this case a term like 'for sale in San Francisco' is directly part of the URL. </uniqueid></someothertext></city> 2. /for-sale/<city>/<someothertext>uniqueId
Here 'for sale in San Francisco' is not so direct in the URL, so I think. Also I 'heard' that forward slash URLs are somehow considered as being 'lower down' in the directory structure. </someothertext></city> 3. /for-sale/<city>/<someothertext>/?pid=uniqueId</someothertext></city> someOtherText contains keywords we are targeting. 1. Is there a preference of one format over the other? 2. Does it even matter? 3. someOtherText - does it makes sense to put keywords in the URL for just SEO purposes? I do not per se need someOtherText for functionality.0 -
XML feeds in regards to Duplicate Content
Hi everyone I hope you can help. I run a property portal in Spain and am looking for an answer to an issue we are having. We are in the process of uploading an XML feed to our site which contains 10,000+ properties relating to our niche. Although this is great for our customers I am aware this content is going to be duplicated from other sites as our clients advertise over a range of portals. My question is, are there any measures I can take to safeguard our site from penalisation from Google? Manually writing up 10,000 + descriptions for properties is out of the question sadly. I really hope somebody can help Thanks Steve
White Hat / Black Hat SEO | | buysellrentspain0 -
I'm worried my client is asking me to post duplicate content, am I just being paranoid?
Hi SEOMozzers, I'm building a website for a client that provides photo galleries for travel destinations. As of right now, the website is basically a collection of photo galleries. My client believes Google might like us a bit more if we had more "text" content. So my client has been sending me content that is provided free by tourism organizations (tourism organizations will often provide free "one-pagers" about their destination for media). My concern is that if this content is free, it seems likely that other people have already posted it somewhere on the web. I'm worried Google could penalize us for posting content that is already existent. I know that conventionally, there are ways around this-- you can tell crawlers that this content shouldn't be crawled-- but in my case, we are specifically trying to produce crawl-able content. Do you think I should advise my client to hire some bloggers to produce the content or am I just being paranoid? Thanks everyone. This is my first post to the Moz community 🙂
White Hat / Black Hat SEO | | steve_benjamins0 -
Links In Blog Posts: 1 Paragraph VS. Full Article
Hey guys, I've been using an article network to post unique articles (not spun). Been posting 1 paragraph articles with 1 text link. Just wondering what the main difference would be if I were to post a full article with 2 or 3 text links vs 1 paragraph with 1 text link, besides the fact that you get more links and save more time writing only 1 paragraph. Will the full article with 3 backlinks improve keyword ranks more or not by much? Cheers!
White Hat / Black Hat SEO | | upick-1623910