Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on a regular expression posted in an article, and it eliminates what appears to be most of them.
However, there are other bots I would like to filter but I'm having a hard time determining the regular expressions for them.
How do I determine what the regular expression is for additional bots so I can apply them to the filter?
I read an Analytics "how to" but it's over my head and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word servers in the name. The \ is escaping the . at the end of stumbleupon inc.
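To make the escaping point concrete, here is a minimal sketch using Python's re module. That is an assumption on my part: the Analytics filter uses its own regex engine, though the dot behaves the same way in both. An unescaped . matches any single character, while \. matches only a literal period.

```python
import re

# Unescaped dot: matches ANY single character in that position.
unescaped = r"^stumbleupon inc.$"
# Escaped dot: matches only a literal period.
escaped = r"^stumbleupon inc\.$"

def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found in the text."""
    return re.search(pattern, text) is not None

print(matches(unescaped, "stumbleupon inc."))  # True
print(matches(unescaped, "stumbleupon incx"))  # True (probably not intended)
print(matches(escaped, "stumbleupon inc."))    # True
print(matches(escaped, "stumbleupon incx"))    # False
```

Since "rackspace cloud servers" has no period in it, there is nothing to escape before the closing ).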
-
Does it need the \. before the )?
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
Just added rackspace as another match, it should work if the name is exactly right.
Hope this helps,
Chris
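If you want to check the expression before putting it in the filter, you can try it against a few names. Below is a rough sketch in Python, with the periods escaped. The sample names are made up for illustration, and I lowercase the input on the assumption that the filter is not set to case sensitive. Python's regex engine is not the one Analytics uses, so treat this as a sanity check only.

```python
import re

# ^( ... )$ requires the whole name to be exactly one of the listed ISPs;
# "|gomez" additionally matches any name that contains "gomez" anywhere.
PATTERN = (
    r"^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|"
    r"stumbleupon inc\.|rackspace cloud servers)$|gomez"
)

def is_filtered(service_provider: str) -> bool:
    """Return True if this service provider name would be caught."""
    return re.search(PATTERN, service_provider.lower()) is not None

# Hypothetical sample names, not taken from a real report.
samples = [
    "rackspace cloud servers",      # exact match
    "Google Inc.",                  # exact match after lowercasing
    "rackspace cloud servers inc",  # extra text, so the anchored part misses it
    "gomez networks",               # contains "gomez"
    "some cable company",           # not in the list
]

for name in samples:
    print(f"{name!r}: {is_filtered(name)}")
```

The third sample shows why the name has to be exactly right: anything inside ^( ... )$ must match the whole service provider string.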
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in audience>technology>network and the column shows "service provider."
-
How is it titled in the ISP report exactly?
-
For example,
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page and bounced 100% of the time.
What is the regular expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through this that you want to sort out? If they are able to be filtered through the ISP organization as the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by using a \ before them, or else your RegEx may not match what you intend.
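As a rough illustration of adding names to the list with the escaping handled for you, here is a small Python sketch. The helper names (escape_name, build_pattern) are hypothetical, and the set of "special" characters is my assumption about what needs a \ in most regex flavors. It only generates the text you would paste into the filter.

```python
# Characters that carry special meaning in most regex flavors (assumed set).
SPECIAL = set(r"\.^$|?*+()[]{}")

def escape_name(name: str) -> str:
    """Put a backslash before each special character in an ISP name."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in name)

def build_pattern(exact_names, contains=()):
    """Build ^(name1|name2|...)$ plus optional 'contains' alternatives."""
    exact = "|".join(escape_name(n) for n in exact_names)
    parts = [f"^({exact})$"]
    parts.extend(escape_name(c) for c in contains)
    return "|".join(parts)

isp_names = [
    "microsoft corp",
    "inktomi corporation",
    "yahoo! inc.",
    "google inc.",
    "stumbleupon inc.",
    "rackspace cloud servers",  # newly added bot ISP
]

pattern = build_pattern(isp_names, contains=["gomez"])
print(pattern)
# ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
```

To filter another bot, append its service provider name to isp_names exactly as it appears in the report and regenerate the pattern.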
-
Sure. Here's the post for filtering the bots.
Here's the RegEx posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris