Pretty basic one today: referring traffic. We all know what referring traffic is. That’s easy, right? But how does Google Analytics determine referring traffic? What is it at a cookie level? Well you should know the header basics:
Here’s a sample ‘GET’:
GET / HTTP/1.1
Host: www.cardinalpath.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: https://blog.vkistudios.com/
So the browser requests cardinalpath.com and sends blog.vkistudios.com along in the Referer header. Simple enough. Meanwhile, Google Analytics reads document.referrer via JavaScript and records the result.
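To make that JavaScript side concrete, here is a minimal sketch of pulling the referring hostname out of a referrer string. The function name referrerHost is made up for illustration and is not part of ga.js; in a browser you would pass it document.referrer.

```javascript
// Hypothetical helper (not part of ga.js): extract the hostname from a
// referrer string. In a browser, call it as referrerHost(document.referrer).
function referrerHost(referrer) {
  // document.referrer is an empty string when there is no referrer.
  if (!referrer) {
    return null;
  }
  try {
    return new URL(referrer).hostname;
  } catch (e) {
    // Not a parseable absolute URL.
    return null;
  }
}
```

With the request above, referrerHost('https://blog.vkistudios.com/') gives back blog.vkistudios.com.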
But what happens after that? First, the data is written into the __utmz cookie. From a post a couple of weeks ago:
__utmz
The __utmz cookie is often considered the “campaign” cookie.
Here’s a cookie from the Cardinal Path Blog:
113869458.1305319923.101.101.utmcsr=cardinalpath.com|utmccn=(referral)|utmcmd=referral|utmcct=/blog/disposable-printers-is-there-a-better-option
How do you read this? The first number is the domain hash, the second is the timestamp, the third is the session number (101? Jesus) and the fourth is the campaign number. Then, on the end, are some familiar-looking query parameters, eh?
utmcsr is the source of the visit. In this case, cardinalpath.com.
utmccn is the campaign name of the visit. In this case, (referral).
utmcmd is the medium, again referral.
utmctr is the keyword (if there is one; there isn't here).
utmcct is the campaign content. For a referral, that's the path of the referring page.
And sometimes you’ll get a utmgclid, which carries your ad click ID.
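If you want to pick one of these cookies apart yourself, a small parser along these lines does the job. This is a sketch based only on the layout described above; parseUtmz and the property names it returns are my own labels, not anything ga.js exposes.

```javascript
// Hypothetical parser for a __utmz cookie value, following the layout
// described above: hash.timestamp.session.campaign.key=value|key=value...
function parseUtmz(value) {
  var parts = value.split('.');
  // Everything after the fourth dot is the campaign string, which can
  // itself contain dots (e.g. "cardinalpath.com"), so rejoin it.
  var campaign = parts.slice(4).join('.');
  var fields = {};
  campaign.split('|').forEach(function (pair) {
    var eq = pair.indexOf('=');
    if (eq > 0) {
      fields[pair.slice(0, eq)] = pair.slice(eq + 1);
    }
  });
  return {
    domainHash: parts[0],
    timestamp: Number(parts[1]),      // seconds since the Unix epoch
    sessionNumber: Number(parts[2]),
    campaignNumber: Number(parts[3]),
    fields: fields
  };
}

var sample = '113869458.1305319923.101.101.utmcsr=cardinalpath.com|' +
  'utmccn=(referral)|utmcmd=referral|' +
  'utmcct=/blog/disposable-printers-is-there-a-better-option';
var parsed = parseUtmz(sample);
// parsed.fields.utmcsr is "cardinalpath.com"; parsed.fields.utmcmd is "referral"
```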
Then, if there are no campaign variables that would overwrite the referrer data, the referring URL is written into utmr on the __utm.gif request, and bam, you have a referred site. Simple, right?
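The precedence there, campaign variables first and referrer second, can be sketched roughly like this. It is a simplification, not Google's actual logic: classifyVisit is a made-up name, it only looks at the utm_source and utm_medium URL parameters, and it skips search engine detection entirely.

```javascript
// Hypothetical, simplified precedence check (not ga.js code):
// campaign parameters on the landing page URL win over the referrer,
// and a visit with neither counts as direct. Organic search is ignored here.
function classifyVisit(pageUrl, referrer) {
  var params = new URL(pageUrl).searchParams;
  var source = params.get('utm_source');
  if (source) {
    return { source: source, medium: params.get('utm_medium') || '(not set)' };
  }
  if (referrer) {
    // Assumes the referrer is a well-formed absolute URL.
    return { source: new URL(referrer).hostname, medium: 'referral' };
  }
  return { source: '(direct)', medium: '(none)' };
}

var tagged = classifyVisit(
  'http://www.cardinalpath.com/?utm_source=newsletter&utm_medium=email',
  'https://blog.vkistudios.com/'
);
var referred = classifyVisit('http://www.cardinalpath.com/', 'https://blog.vkistudios.com/');
var direct = classifyVisit('http://www.cardinalpath.com/', '');
```

Here tagged comes back as newsletter / email even though a referrer was present, referred comes back as blog.vkistudios.com / referral, and direct falls through to (direct) / (none).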
But there’s more you can do with this, because Google knows that you want to be able to control what counts as referrals, and what counts as direct.
_addIgnoredOrganic()
_addIgnoredOrganic(newIgnoredOrganicKeyword)
Sets the string as ignored term(s) for Keywords reports. Use this to configure Google Analytics to treat certain search terms as direct traffic, such as when users enter your domain name as a search term. When you set keywords using this method, the search terms are still included in your overall page view counts, but not included as elements in the Keywords reports.
_addIgnoredRef()
_addIgnoredRef(newIgnoredReferrer)
Excludes a source as a referring site. Use this option when you want to set certain referring links as direct traffic, rather than as referring sites. For example, your company might own another domain that you want to track as direct traffic so that it does not show up on the “Referring Sites” reports. Requests from excluded referrals are still counted in your overall page view count.
_addOrganic()
_addOrganic(newOrganicEngine, newOrganicKeyword, opt_prepend)
Adds a search engine to be included as a potential search engine traffic source. By default, Google Analytics recognizes a number of common search engines, but you can add additional search engine sources to the list.
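Here is roughly how the three calls look in the asynchronous tracking snippet. Treat it as a sketch: the account ID is a placeholder, example-search.com and its q query parameter are invented for illustration, and the ignored keyword and referrer are just plausible values for this site. The calls go before _trackPageview so they are in effect when the pageview is sent.

```javascript
// Async ga.js command queue; the ga.js loader snippet itself is omitted.
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']); // placeholder account ID

// Treat a (made-up) search engine as organic; 'q' is its keyword parameter.
_gaq.push(['_addOrganic', 'example-search.com', 'q']);

// Count searches for our own domain name as direct rather than as a keyword.
_gaq.push(['_addIgnoredOrganic', 'www.cardinalpath.com']);

// Count visits arriving from our other domain as direct, not as referrals.
_gaq.push(['_addIgnoredRef', 'vkistudios.com']);

_gaq.push(['_trackPageview']);
```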
For more on these functions and their implementation see Tracking Code: Search Engines and Referrers.
So, to summarize: the referrer is read from document.referrer and written into the __utmz cookie. If there are no campaign variables that would overwrite it, the referring URL goes into the utmr parameter on the __utm.gif request and gets sent off to Google.