Last week I posted on RegEx basics, talking about the “language” of RegEx and giving a pretty complex filter. This week I want to dial it back even further and make sure we cover some of the more basic uses.
There are some best practices you should follow when setting up Google Analytics, and they use RegEx. Basic stuff, trivial, hardly worth mentioning. Except that people forget to do it time and time again. These include: not counting traffic from your own IP (because you know why YOU’RE visiting), and counting www.site.com and site.com as the same page.
Ignoring your (or your developers’, or anyone else’s) IP address is eeeeaaaaassssyyyy. In your filter manager, just create a custom “exclude” filter that excludes “visitor IP address” and enter the IP you want excluded under “filter pattern”. (You could also just use the predefined “exclude” “traffic from the IP addresses” “that are equal to” filter, but that just seems more complex.) But let’s say you have the IP addresses of 192.168.1.1 to 192.168.1.254 (yes, internal addresses). What then?
Theoretically, each of the following would cover the whole range from .1 to .255 (some would also match numbers beyond that):
^192\.168\.1\.[0-9]{1,3}$
^192\.168\.1\.\d{1,3}$
^192\.168\.1\.[0-9]+
^192\.168\.1\.[0-9][0-9]?[0-9]?$
Or
^192\.168\.1\..+ (of course, this will match EVERYTHING starting with 192.168.1.)
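To get a feel for how these candidates differ, here is a minimal sketch using Python’s re module. That is an assumption on my part: Google Analytics runs its own regex engine, so treat this as a way to reason about the patterns rather than proof of what GA will do. The dictionary keys are just labels I made up, and the dots are escaped with a backslash so they only match a literal dot.

```python
import re

# Candidate patterns with the literal dots escaped. Python's re module stands
# in for Google Analytics' regex engine here, which is an assumption.
CANDIDATES = {
    "ranged": r"^192\.168\.1\.[0-9]{1,3}$",
    "digit_class": r"^192\.168\.1\.\d{1,3}$",
    "one_or_more": r"^192\.168\.1\.[0-9]+",
    "optional_digits": r"^192\.168\.1\.[0-9][0-9]?[0-9]?$",
    "anything": r"^192\.168\.1\..+",
}

def matching_candidates(ip):
    """Return the labels of the candidate patterns that match the given string."""
    return [name for name, pattern in CANDIDATES.items() if re.search(pattern, ip)]

# A valid address, an out-of-range octet, a four-digit octet, and a non-number.
for ip in ["192.168.1.42", "192.168.1.999", "192.168.1.1234", "192.168.1.x"]:
    print(ip, matching_candidates(ip))
```

Notice that none of these candidates rejects 192.168.1.999, and the two without a closing $ accept even more.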
Which do you choose? I’ve found in the past that I get iffy results from using ranged repetitions and \d in Google Analytics (this could have changed since, I suppose). Also, Google recommends a pattern like this:
^192\.168\.1\.([1-9]|[1-9][0-9]|1([0-9][0-9])|2([0-4][0-9]|5[0-5]))$
Let’s take a closer look at that last bit.
([1-9]|[1-9][0-9]|1([0-9][0-9])|2([0-4][0-9]|5[0-5]))
So basically it accepts any single digit from 1-9, OR 10-99, OR 100-199, OR a 2 followed by either 00-49 or 50-55 (that is, 200-255).
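If you want to convince yourself of that reading, a quick brute-force check works. This is a sketch in Python’s re module (again an assumption, since GA’s engine may differ in details), with ^ and $ added around the group so it has to match the whole number. The names LAST_OCTET and accepted are mine.

```python
import re

# The last-octet group from the pattern, anchored so it must match the whole string.
LAST_OCTET = re.compile(r"^([1-9]|[1-9][0-9]|1([0-9][0-9])|2([0-4][0-9]|5[0-5]))$")

# Try every number from 0 to 999 and keep the ones the pattern accepts.
accepted = [n for n in range(1000) if LAST_OCTET.search(str(n))]

# Show the smallest accepted value, the largest, and how many there are.
print(accepted[0], accepted[-1], len(accepted))
```

The accepted values run from 1 through 255 with no gaps, so 0 and anything above 255 are rejected.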
You could also use the bar ( | ), which means OR, to add another address or range in there, for instance if you have multiple offices.
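Here is a rough sketch of what that could look like, once more in Python’s re module as a stand-in for GA. The second office and its address (203.0.113.7, from a range reserved for documentation) are hypothetical, as are the names OFFICE_A, OFFICE_B and is_excluded. The one thing to watch is wrapping both alternatives in a group, so the ^ and $ apply to each of them.

```python
import re

# First office: the range pattern from above, without its anchors.
OFFICE_A = r"192\.168\.1\.([1-9]|[1-9][0-9]|1([0-9][0-9])|2([0-4][0-9]|5[0-5]))"
# Hypothetical second office with a single fixed address (documentation-only IP).
OFFICE_B = r"203\.0\.113\.7"

# Wrap the alternatives in one group so ^ and $ apply to both of them.
EXCLUDE = re.compile(rf"^({OFFICE_A}|{OFFICE_B})$")

def is_excluded(ip):
    """True if the address matches either office's pattern."""
    return EXCLUDE.search(ip) is not None

for ip in ["192.168.1.200", "203.0.113.7", "203.0.113.70"]:
    print(ip, is_excluded(ip))
```

Without the outer group, the ^ would only guard the first alternative and the $ only the last, which is an easy way to end up excluding more than you meant to.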
That’s a very specific filter. I like it. I wonder, though, wouldn’t it be easier just to use .+ at the end? It’s not like our stated IP addresses are ever going to be outside of those confines. However, specificity is always safe.
You know how you have www.yoursite.com and yoursite.com pointing at the same page? That’s two URLs in Google Analytics… unless you set it up right.
Filter Type : Custom filter > Advanced
Field A : Hostname
Extract A : ^(www\.)?(.*)
Field B : Hostname
Extract B : –
Output To : Hostname
Constructor : $A2
So Field A extracts from the hostname a line starting with the group “www.” zero or one times (so with or without the www), then captures everything that follows as a second group. The constructor then outputs that second group ($A2) to the hostname, i.e. the name without the www. Now every hostname will be counted without the www. Simple, right?
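To see the two groups at work outside of Google Analytics, here is a small sketch in Python’s re module (an assumption, since GA applies the filter with its own engine). The function name normalize_hostname is mine; it just mimics what the $A2 constructor does by keeping the second captured group.

```python
import re

# Extract A from the filter: an optional "www." group, then everything else.
EXTRACT_A = re.compile(r"^(www\.)?(.*)")

def normalize_hostname(hostname):
    """Mimic the $A2 constructor: keep only the second captured group."""
    return EXTRACT_A.match(hostname).group(2)

for host in ["www.yoursite.com", "yoursite.com", "blog.yoursite.com"]:
    print(host, "->", normalize_hostname(host))
```

Only a leading www. is dropped; other subdomains such as blog.yoursite.com pass through unchanged.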
Actually, this time, yeah it is.