Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on regular expressions posted in an article, which eliminate what appears to be most of the bots.
However, there are other bots I would like to filter, but I'm having a hard time determining the regular expressions for them.
How do I determine the regular expressions for additional bots so I can add them to the filter?
I read an Analytics "how to" but it's over my head and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx-related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word servers in the name. The \ is escaping the . at the end of stumbleupon inc.
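To see what the escape changes, here is a minimal Python sketch. It only illustrates the general regex rule (an unescaped . matches any single character, while \. matches a literal period). The test strings are made up, and Python's re module stands in for the Analytics filter, which is an assumption.

```python
import re

# Unescaped: the final "." matches ANY single character.
unescaped = re.compile(r"^stumbleupon inc.$")
# Escaped: "\." matches only a literal period.
escaped = re.compile(r"^stumbleupon inc\.$")

# "stumbleupon incX" is a made-up name used only to show the difference.
for name in ["stumbleupon inc.", "stumbleupon incX"]:
    print(name, bool(unescaped.search(name)), bool(escaped.search(name)))
```

An entry such as rackspace cloud servers contains no period, so it needs no \. at all.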
-
Does it need the \. before the )?
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
I just added rackspace as another match; it should work if the name is exactly right.
Hope this helps,
Chris
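Before pasting a pattern into the filter, it can help to try it against a few provider names. The sketch below does that in Python. It assumes the filter ignores case and matches anywhere in the field unless anchored with ^ and $. The "gomez networks" and "comcast cable" names are hypothetical samples, not names from the report.

```python
import re

# Pattern from the post above, with each period escaped as "\." so it
# matches a literal "." only.
PATTERN = (
    r"^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|"
    r"stumbleupon inc\.|rackspace cloud servers)$|gomez"
)

# Assumption: case-insensitive, partial-match behaviour.
bot_filter = re.compile(PATTERN, re.IGNORECASE)


def is_filtered(service_provider: str) -> bool:
    """Return True if the service provider name would be caught by the pattern."""
    return bot_filter.search(service_provider) is not None


samples = [
    "rackspace cloud servers",   # exact name as quoted in this thread
    "rackspace cloud servers.",  # trailing period, so the exact match fails
    "gomez networks",            # hypothetical name containing "gomez"
    "comcast cable",             # hypothetical ordinary ISP
]
for name in samples:
    print(f"{name!r}: {is_filtered(name)}")
```

The second sample shows why the name has to be exactly right: anything between ^( and )$ must match the whole field, while gomez sits outside the anchors and matches anywhere.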
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in Audience > Technology > Network and the column shows "Service Provider."
-
How is it titled in the ISP report exactly?
-
For example,
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page and bounced 100% of the time.
What is the reg expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
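One way to act on this is to treat zero time on page as a single signal that only counts when other signals agree. The Python sketch below is illustrative only: the field names, the min_visits threshold, and the "example isp" row are assumptions, and the operating system and location checks mentioned in this thread are left out because no values are given for them. The rackspace row uses the figures quoted in this thread.

```python
from dataclasses import dataclass


@dataclass
class ProviderRow:
    service_provider: str
    visits: int
    pages_per_visit: float
    avg_time_on_page_s: float
    bounce_rate: float  # 0.0 to 1.0


def looks_like_bot(row: ProviderRow, min_visits: int = 100) -> bool:
    """Flag a provider only when every signal agrees and there is enough volume.

    min_visits is a hypothetical threshold; tune it for your own site.
    """
    signals = [
        row.pages_per_visit <= 1.0,
        row.avg_time_on_page_s == 0,  # unreliable on its own, so never used alone
        row.bounce_rate >= 0.99,
    ]
    return row.visits >= min_visits and all(signals)


rows = [
    # Figures quoted in this thread: 848 visits, 1 page, 0 seconds, 100% bounce.
    ProviderRow("rackspace cloud servers", 848, 1.0, 0.0, 1.0),
    # Made-up row: same pattern but too few visits to judge.
    ProviderRow("example isp", 3, 1.0, 0.0, 1.0),
]
for row in rows:
    print(row.service_provider, looks_like_bot(row))
```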
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through this that you want to sort out? If they can be filtered by ISP organization like the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc\.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by using a \ before them; otherwise an unescaped . matches any character and your RegEx may not behave the way you intend.
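If the list keeps growing, escaping every name by hand gets error-prone. Here is a rough Python sketch that builds the pattern from a plain list of names and adds the \ automatically. The helper names escape_name and build_filter are made up for this example, and the output is meant to be copied into the filter field by hand.

```python
import re

# Characters that act as regex operators, plus "/", which the advice above
# also suggests escaping.
_SPECIAL = re.compile(r"[.^$*+?()\[\]{}|\\/]")


def escape_name(name: str) -> str:
    """Put a backslash in front of each special character in a name."""
    return _SPECIAL.sub(lambda m: "\\" + m.group(0), name)


def build_filter(exact_names, substrings=()):
    """Join exact-match names inside ^( ... )$ and append partial matches after it."""
    exact = "|".join(escape_name(n) for n in exact_names)
    parts = [f"^({exact})$"] + [escape_name(s) for s in substrings]
    return "|".join(parts)


names = [
    "microsoft corp",
    "inktomi corporation",
    "yahoo! inc.",
    "google inc.",
    "stumbleupon inc.",
    "rackspace cloud servers",
]
pattern = build_filter(names, substrings=["gomez"])
print(pattern)
```

New providers go into the names list exactly as they appear in the Service Provider column, and the helper takes care of the periods.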
-
Sure. Here's the post for filtering the bots.
Here's the RegEx posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris