One question we see regularly on Google Groups, from consultants and end users alike, concerns the identity and structure of Google Analytics cookies.
Some good articles exist (e.g., Cookies Set By Google Analytics and Justin Cutroni’s Google Analytics Short Cuts) and have been helpful in understanding GA cookies. We wanted to contribute something real that relied on that knowledge: a JavaScript class for slicing and dicing Google cookies to extract those juicy bits. We present it in the context of a project.
The problem:
Some sites are eCommerce sites (they have a shopping cart), others are not, and some are in between. And then there are those that earn revenue not through a shopping cart but by displaying ads; these are eCommerce sites on steroids. Each ad displayed is like an item sold, for which the advertiser pays. Shipping is free.
But in this context, what exactly constitutes a transaction, and how are Order IDs assigned?
The solution:
Turns out that what constitutes a transaction is in the eye of the beholder. The requirement was to track revenue by visit and to identify the highest (and lowest!) earning pages and site sections. So:
Transaction | => | Visit |
Product Name | => | Unique Ad Description |
Product SKU | => | Page |
Product Category | => | Site Section |
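As a rough illustration of the mapping above, an ad impression can be described as a single order line. This is only a sketch: the helper name, the shape of the `ad` object, and the `unitPrice` and `quantity` fields are hypothetical and are not taken from gaVKIcookies.js.

```javascript
// Hypothetical helper: describes one displayed ad as an eCommerce order line.
// All property names are illustrative labels for the mapping, not a GA API.
function adImpressionAsOrderLine(visitId, ad) {
  return {
    orderId: visitId,                 // Transaction      => Visit
    productName: ad.description,      // Product Name     => Unique Ad Description
    productSku: ad.pagePath,          // Product SKU      => Page
    productCategory: ad.siteSection,  // Product Category => Site Section
    unitPrice: ad.revenue,            // assumed: revenue earned for this impression
    quantity: 1                       // assumed: one impression per line
  };
}

// Made-up example values.
var line = adImpressionAsOrderLine("visit-123", {
  description: "Leaderboard 728x90, Acme",
  pagePath: "/news/today.html",
  siteSection: "News",
  revenue: 0.004
});
console.log(line.productSku, line.productCategory);
```

The resulting fields would then be handed to whatever eCommerce tracking calls the site already uses.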
To identify a Transaction (Order ID), we needed to identify the visit uniquely across all pageviews without creating our own tracking system.
GA already identifies the session uniquely on the visitor’s machine as the start time (to the nearest second) of the visitor’s current session and stores it in the __utma cookie.
However, different visitors’ sessions could have the same start time. GA also identifies the visitor uniquely with an anonymized id. The two values together would be more than adequately unique.
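To make the two values concrete, here is a minimal sketch that pulls them out of a __utma value. It assumes the commonly documented six-field, dot-separated layout of that cookie (domain hash, visitor id, first visit, previous visit, current visit, session count). The function name and property names are our own labels, and the sample value is made up.

```javascript
// Sketch: split a __utma value into its dot-separated fields.
// Property names are illustrative labels, not official GA identifiers.
function parseUtma(value) {
  var parts = String(value).split(".");
  if (parts.length < 6) {
    return null; // not a recognisable __utma value
  }
  return {
    domainHash: parts[0],
    visitorId: parts[1],                  // anonymized visitor id
    firstVisitStart: Number(parts[2]),    // seconds since the Unix epoch
    previousVisitStart: Number(parts[3]),
    currentVisitStart: Number(parts[4]),  // start time of the current session
    sessionCount: Number(parts[5])
  };
}

// Made-up sample value, for illustration only.
var sample = "173272373.1234567890.1230000000.1240000000.1250000000.7";
var utma = parseUtma(sample);
console.log(utma.visitorId, utma.currentVisitStart);
```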
However, we did not want the visitor id to be recorded in the reports. Although it does not personally identify visitors, it would allow revenue to be accumulated per individual visitor within GA. That would probably go against Google’s Terms of Service and, more importantly, against its core principle “Don’t be evil”.
Dividing the visitor id by the session start time yields a value that is sufficiently unique but changes on each visit, making it impossible to track revenue by individual visitor.
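The division can be sketched in a few lines. The function name and the choice to return the quotient as a string are assumptions made for illustration; the exact formatting used in the original project is not shown here.

```javascript
// Sketch: derive a per-visit id by dividing the visitor id by the
// session start time. Returns null when either input is unusable.
function visitId(visitorId, sessionStart) {
  var v = Number(visitorId);
  var s = Number(sessionStart);
  if (!isFinite(v) || !isFinite(s) || v <= 0 || s <= 0) {
    return null;
  }
  return String(v / s); // changes whenever the session start time changes
}

// Same (made-up) visitor, two sessions an hour apart: the ids differ.
var firstVisit = visitId("1234567890", 1250000000);
var laterVisit = visitId("1234567890", 1250003600);
console.log(firstVisit !== laterVisit);
```

Because the start time differs on every visit, the same visitor gets a different id each time, so the reports cannot be rolled up per visitor.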
The implementation:
There have been other scenarios where we wrote code to extract data from the cookies but it was time to write something more generic and enduring.
I had used Adam Vandenberg’s Querystring class to dissect querystrings. Since both query strings and cookies are stored as a delimited series of name=value pairs, an adaptation of his code, with some GA cookie-specific enhancements, resulted in gaVKIcookies.js in fairly short order.
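The shared structure of query strings and cookies can be shown with one small parser. This is a sketch of the general idea only, not the code of gaVKIcookies.js or of the Querystring class; the function name is hypothetical, the sample strings are made up, and no URL decoding is attempted.

```javascript
// Sketch: parse a delimited series of name=value pairs into an object.
// Each chunk is split at its first "=", so values may themselves contain "=".
function parsePairs(text, delimiter) {
  var result = {};
  var chunks = String(text).split(delimiter);
  for (var i = 0; i < chunks.length; i++) {
    var chunk = chunks[i].replace(/^\s+|\s+$/g, ""); // trim surrounding spaces
    if (chunk === "") {
      continue;
    }
    var eq = chunk.indexOf("=");
    var name = eq === -1 ? chunk : chunk.slice(0, eq);
    var value = eq === -1 ? "" : chunk.slice(eq + 1);
    result[name] = value;
  }
  return result;
}

// The same function handles a query string and a cookie string.
var query = parsePairs("a=1&b=two", "&");
var cookies = parsePairs(
  "__utma=1.2.3.4.5.6; __utmz=1.2.3.4.utmcsr=google|utmcmd=organic", ";");
console.log(query.b, cookies.__utma);
```

In a browser, the second argument would typically be applied to `document.cookie`, which uses the same semicolon-delimited layout.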
Those and other elements of the GA cookies are shown in the following table which is populated in real time in an iframe using gaVKIcookies.js:
The calculation of a unique visit id is as follows:
Run the sample page useGAcookies.html to see gaVKIcookies.js in use. You are free to download and enjoy the script.
Most of the GA cookies and their components, including the __utmz campaign/referrer cookie, are isolated in the code.
The detail of each component will be covered in a later post but, in the meantime, is apparent from gaVKIcookies.js.
The latest version of WASP (v0.73) now displays GA’s cookies:
WASP post and download link
An important note about Google Analytics and privacy:
GA has a strict policy against tracking Personally Identifiable Information (PII). It enforces this, not only in its Terms of Service (Google Analytics Privacy Policy) but also by basing the design of the reporting on that Privacy Policy. GA does not report on visitors.
Any user preventing GA from tracking their use of web sites is doing themselves a disservice – but that’s a different soap box for another post.