Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on a regular expression posted in an article, and it eliminates what appears to be most of the bots.
However, there are other bots I would like to filter, and I'm having a hard time working out the regular expressions for them.
How do I determine the regular expression for each additional bot so I can add it to the filter?
I read an Analytics "how to," but it's over my head, and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word servers in the name. The \ is escaping the . at the end of stumbleupon inc., so that the . is matched as a literal period rather than as "any character."
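To make the escaping point concrete, here is a small sketch in Python. Python's `re` module is only a stand-in for illustration; the Analytics filter field is not Python, so treat this as a way to see the behaviour rather than as the tool's own engine. The "stumbleupon incx" string is a made-up test value.

```python
import re

# Unescaped: the final "." means "any single character".
unescaped = re.compile(r"^stumbleupon inc.$")
# Escaped: "\." means a literal period only.
escaped = re.compile(r"^stumbleupon inc\.$")

for name in ["stumbleupon inc.", "stumbleupon incx"]:  # second name is made up
    print(name, bool(unescaped.search(name)), bool(escaped.search(name)))
```

Both patterns match the real name, but only the unescaped one also matches the made-up variant. A name with no period in it, such as rackspace cloud servers, has nothing to escape.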
-
Does it need the \. before the )?
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
I just added rackspace as another match; it should work if the name is exactly right.
Hope this helps,
Chris
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in audience>technology>network and the column shows "service provider."
-
How is it titled in the ISP report exactly?
-
For example:
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page, and bounced 100% of the time.
What is the regular expression for rackspace?
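As a rough illustration of how numbers like these can be used to spot a suspicious service provider, here is a minimal Python sketch. The field names, the `looks_like_bot` helper, and the `min_visits` threshold of 100 are all assumptions made for the example, not anything Analytics provides. The first row uses the figures quoted above; the second row is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ProviderRow:
    service_provider: str
    visits: int
    pages_per_visit: float
    avg_time_on_page_s: float
    bounce_rate: float  # fraction from 0.0 to 1.0

def looks_like_bot(row: ProviderRow, min_visits: int = 100) -> bool:
    """Flag a provider only when several signals agree.

    min_visits is an assumed threshold: a handful of zero-second
    visits can easily come from real people.
    """
    return (
        row.visits >= min_visits
        and row.pages_per_visit <= 1.0
        and row.avg_time_on_page_s == 0
        and row.bounce_rate >= 1.0
    )

# Figures from the post above.
rackspace = ProviderRow("rackspace cloud servers", 848, 1.0, 0.0, 1.0)
# Hypothetical low-volume provider, included only for contrast.
small_isp = ProviderRow("example isp", 3, 1.0, 0.0, 1.0)

print(looks_like_bot(rackspace))
print(looks_like_bot(small_isp))
```

Requiring a high visit count alongside the other signals is one way to avoid leaning on time on page by itself.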
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through and that you want to sort out? If they can be filtered by ISP organization, like the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by putting a \ before them, or else your RegEx may not match what you expect.
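If it helps, the "add a name to the list and escape the special characters" step can be sketched in Python. This is only an illustration: `escape_name` and `build_filter` are hypothetical helpers written for this example, and Python's `re` is used as a stand-in for whatever engine the filter field uses. The "/" is escaped because of the advice above, even though Python itself does not need that.

```python
import re

# Characters this sketch treats as special. "/" is included to follow
# the advice in the thread; Python's re does not require escaping it.
SPECIAL = set(r".^$*+?()[]{}|\/")

def escape_name(name: str) -> str:
    """Put a backslash before each special character in a name."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in name)

def build_filter(exact_names, contains=()):
    """Build ^(name1|name2|...)$|fragment from exact names and substrings."""
    exact = "|".join(escape_name(n) for n in exact_names)
    parts = [f"^({exact})$"] + [escape_name(c) for c in contains]
    return "|".join(parts)

isp_names = [
    "microsoft corp",
    "inktomi corporation",
    "yahoo! inc.",
    "google inc.",
    "stumbleupon inc.",
    "rackspace cloud servers",  # the newly added provider
]
pattern = build_filter(isp_names, contains=["gomez"])
print(pattern)
print(bool(re.search(pattern, "rackspace cloud servers")))
```

The names inside the parentheses must match the whole service provider value, while anything listed in `contains` (here "gomez") matches anywhere in it.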
-
Sure. Here's the post for filtering the bots.
Here's the regex posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots, I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris