Help Blocking Crawlers. Huge Spike in "Direct Visits" with 96% Bounce Rate & Low Pages/Visit.
-
Hello,
I'm hoping one of you search geniuses can help me.
We have a successful client who started seeing a HUGE spike in direct visits as reported by Google Analytics. This traffic now represents approximately 70% of all website traffic. These "direct visits" have a bounce rate of 96%+ and only 1-2 pages/visit. This is skewing our analytics in a big way and rendering them pretty much useless. I suspect this is some sort of crawler activity but we have no access to the server log files to verify this or identify the culprit. The client's site is on a GoDaddy Managed WordPress hosting account.
The way I see it, there are a couple of possibilities.
1.) Our client's competitors are scraping the site on a regular basis to stay on top of site modifications, keyword emphasis, etc. It seems like whenever we make meaningful changes to the site, one of their competitors does a knock-off a few days later. Hmmm.
2.) Our client's competitors have this crawler hitting the site thousands of times a day to raise bounce rates and decrease the average time on site, which could likely have a negative impact on SEO. Correct me if I'm wrong, but I don't believe Google is going to reward sites with 90% bounce rates, 1-2 pages/visit and an 18-second average time on site.
The bottom line is that we need to identify these bogus "direct visits" and find a way to block them. I've seen several WordPress plugins that claim to help with this but I certainly don't want to block valid crawlers, especially Google, from accessing the site.
If someone out there could please weigh in on this and help us resolve the issue, I'd really appreciate it. Heck, I'll even name my third-born after you.
Thanks for your help.
Eric
-
Hi SirMax,
Thanks for your input. I appreciate it. We'll add Wordfence to our WordPress toolbox and see if that addresses the issue.
In response to previous posts, thanks to everyone for your input. We were able to apply some filters to remove the bogus bot traffic from the analytics and normalize the data. However, this did not actually resolve the issue and in my eyes is more of a Band-Aid fix. The evil crawlers are still there; we just can't see them.
Thanks again for all of your input.
Eric
-
Hostname filtering does not work any more. Unfortunately, most of the spammers have adapted and are now using your own website as the hostname.
For WordPress I use the Wordfence plugin (the paid version; I'm not affiliated with them in any shape or form beyond paying for their services). In the advanced blocking settings you can set limits on how fast and how many pages crawlers can request. You can also block by country or IP range. It can also show you live traffic with a lot of detail (a lot more than Google Analytics, more like a server log). It might not be the complete remedy, but it can help.
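To make the "how fast and how many pages" idea concrete, here is a minimal sketch of a per-client sliding-window limit in Python. It is only an illustration of the general technique, not how Wordfence implements it; the class name, the parameters, and the example IP addresses are all made up for this post.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds.

    Hypothetical helper for illustration only; not a Wordfence API.
    """

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # client id (e.g. an IP address) -> timestamps of its recent requests
        self._hits = defaultdict(deque)

    def allow(self, client_id, now):
        """Return True if the request at time `now` (seconds) is within the limit."""
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: block or throttle this request
        hits.append(now)
        return True


# Example: at most 3 page requests per client per 60 seconds.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
for second in range(5):
    print(second, limiter.allow("198.51.100.7", second))
```

A real plugin would also need to let verified search engine crawlers through, which this sketch does not attempt.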
-
I wish I had an answer for how to stop the bots from hitting your site at all - I don't think a good one exists, as any solutions that wouldn't also block real human traffic to your site are going to be easy for spam bots to get around. I think your best bet is just to do everything you can to keep your data as clean as possible.
-
Hi Ruth,
Thanks a bunch for taking the time to respond to my post. Great advice. This is reassuring on a number of levels; however, it doesn't address the underlying issue of how to stop these spam bots in the first place.
We've already started the process of filtering out some of this bogus data. We'll also be integrating some WordPress plugins to see if that helps. That said, if the spam bots are hitting Analytics directly, as opposed to the actual website, WP plugins won't do anything.
Anyway, I appreciate your input and advice. Thanks so much.
Eric
-
Hi Eric,
A few things to reassure you off the bat:
- For what it's worth, there is a huge, HUGE amount of crawler spam happening on the web today. Every site I work on is being hit hard with false referrals and direct visits. I know Google Analytics is working on a solution to better filter these visits out. So I wouldn't be too concerned that it is something a competitor is doing to your site, specifically - it's more likely that your site has been caught up in the general wave of spam crawlers.
- It's important to note that when we talk about Google looking at bounce rate and dwell time as part of ranking your site, those numbers are specifically from clicks through from search - that's data that Google can get without using your private web analytics data as a ranking factor, which they've said repeatedly that they don't and won't do. So a bunch of direct visits with high bounce rates will NOT affect your rankings.
So, it's not dangerous, just annoying. On to how to get that data out of your reports:
- Make sure you're not filtering out spam referrers at a View level - this can cause those visits to incorrectly appear as direct traffic.
- You could set up an Advanced Segment in Google Analytics to filter out direct visits with visit times of, say, under 5 seconds. Some real traffic may get caught in that, but it will get the noise levels down.
- The best way to filter out spam bot traffic, in my opinion, is to set up hostname filtering. Here's a post on Megalytic on how to do that: https://megalytic.com/blog/how-to-filter-out-fake-referrals-and-other-google-analytics-spam. Make sure you've also got an "Unfiltered Data" View so you'll still have historic raw data if you need it.
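If you export session data and want to try these two rules outside of Google Analytics, the logic can be sketched roughly like this in Python. The field names (`hostname`, `source`, `duration_seconds`), the `(direct)` source label, and the sample hostnames are assumptions about how your export looks, not a Google Analytics API; adjust them to your own data.

```python
def keep_session(session, valid_hostnames, min_seconds=5):
    """Return True if a session passes both rules described above.

    `session` is a dict with hypothetical keys: hostname, source, duration_seconds.
    """
    # Rule 1: hostname filtering. Keep only hits recorded against your own hostnames.
    if session["hostname"].lower() not in valid_hostnames:
        return False
    # Rule 2: drop direct visits shorter than the threshold (some real visits may be lost).
    is_direct = session["source"] == "(direct)"
    if is_direct and session["duration_seconds"] < min_seconds:
        return False
    return True


def clean_sessions(sessions, valid_hostnames, min_seconds=5):
    """Filter a list of session dicts, leaving the original list untouched."""
    return [s for s in sessions if keep_session(s, valid_hostnames, min_seconds)]


# Example with made-up rows.
sessions = [
    {"hostname": "www.example.com", "source": "(direct)", "duration_seconds": 2},
    {"hostname": "www.example.com", "source": "(direct)", "duration_seconds": 40},
    {"hostname": "spam.invalid", "source": "google", "duration_seconds": 90},
]
print(clean_sessions(sessions, {"www.example.com"}))
```

Because the function returns a new list, the raw export stays intact, in the same spirit as keeping an "Unfiltered Data" View.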
Hope that helps! Good luck.
-
Check the web server log files, or log the visits yourself (IP address, user agent, __utma, __utmz, possibly a browser fingerprint, etc.).
By analyzing those you can easily find out whether the traffic comes from a scraping bot or from humans.
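As a rough sketch of that kind of analysis, assuming you have already collected hits as records with an IP address, a user agent, and a Unix timestamp (the record layout, function name, and threshold below are all assumptions for illustration), you could flag clients that request far more pages per minute than a human plausibly would:

```python
from collections import Counter


def suspicious_clients(hits, max_hits_per_minute=30):
    """Map (ip, user_agent) to its busiest one-minute hit count, if over the threshold.

    `hits` is an iterable of dicts with hypothetical keys: ip, user_agent, timestamp.
    """
    per_minute = Counter()
    for hit in hits:
        minute = int(hit["timestamp"] // 60)  # bucket hits into whole minutes
        per_minute[(hit["ip"], hit["user_agent"], minute)] += 1

    flagged = {}
    for (ip, user_agent, _minute), count in per_minute.items():
        if count > max_hits_per_minute:
            key = (ip, user_agent)
            # Remember only the worst minute seen for each client.
            flagged[key] = max(flagged.get(key, 0), count)
    return flagged


# Example with made-up data: one client sends 40 hits within a single minute.
hits = [{"ip": "203.0.113.5", "user_agent": "bot", "timestamp": t} for t in range(40)]
hits += [{"ip": "198.51.100.7", "user_agent": "browser", "timestamp": t * 20} for t in range(3)]
print(suspicious_clients(hits))
```

Request rate is only one signal; a slow scraper would pass this check, so it is worth looking at user agents and page patterns as well.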