A lot of people ask about integrating Google Analytics data with other types of data. Most people are interested in some type of CRM integration so that’s what I’m going to write about.
Let me start by saying that there is no API to Google Analytics. There is no formal way to pull data out and integrate it into some other application. Sure, there are some hacks out there to manipulate the report URLs but, in my opinion, this is a pretty big hack and I don’t recommend it to our clients.
When we talk about integrating Google Analytics with a CRM we’re talking about pulling information about the visitor’s originating source and sending it to a CRM system. We’re not going to pull the visitor’s entire history, just where they came from and attach it to other information that they enter into a form.
Why do this?
Pulling a visitor’s source information is very helpful to a sales team. Imagine if a sales rep could identify how a sales lead found the website before picking up the phone and contacting them. Remember, a visitor’s source info describes how the visitor found the site. Did they respond to a specific email with a certain offer? Or did they come from search and, if so, what was the keyword they used? This type of info can help a sales team understand not only the intent of the prospect but also where they are in the buying process. That makes life for the team a little easier.
So how do we connect a visitor’s source to a visitor?
Conceptual Overview
Google Analytics stores a visitor’s source data in a cookie. That cookie is named __utmz. The data can be extracted from the cookie and added to a lead generation form. When the visitor submits the form the source information is connected to the other information that the visitor entered into the form (usually her name and other contact information).
If the contact form is integrated with a CRM application, like SalesForce.com or NetSuite, it may be possible to store the marketing information with the individual’s contact information. Direct CRM integration depends on your CRM system. Some systems allow form fields to be pulled directly into the application. Check with your CRM provider for information about your specific system.
I would like to note that you don’t need to use some fancy CRM to take advantage of this technique. You can use the technique below even if your lead form generates an email or dumps the data into a database. The key is that we’re leveraging the data in the Google Analytics cookies and connecting it with information that the visitor sends you.
Detailed Instructions
Ok, so how do you actually do this? It takes some coding, either client side coding (JavaScript) or server side code (PHP, ColdFusion, .NET, Java, etc.). My example uses JavaScript. Here’s a basic explanation of what the code needs to do:
1. Extract the visitor’s source data from the __utmz cookie
2. Manipulate data as needed
3. Place data in hidden form fields
When the visitor submits the form the source data in the hidden form fields will be sent back to the server where it can be used.
The sample code below extracts the source information from the cookie and places it in some hidden form fields. Then, when the form is submitted the information is passed to the server.
Now, if this form is directly connected to a CRM then the data in the hidden form fields will go directly into the CRM along with other form info. That’s the magic. Again, you can use this method to collect source information even if the form is not connected to a CRM. Any type of lead generation form will be more valuable if you use this technique.
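The three steps above can be sketched roughly as follows. This is a minimal illustration and not the original listing: the helper names readCookie and sourceValue, and the hidden field names (source, medium, term, content, campaign), are assumptions made for the example. The cookie string is passed in as an argument so the logic can be tried outside a browser.

```javascript
// Hypothetical helper: return the value of a named cookie from a
// document.cookie-style string, or "-" if the cookie is absent.
function readCookie(cookieString, name) {
  var parts = cookieString.split(";");
  for (var i = 0; i < parts.length; i++) {
    var part = parts[i].replace(/^\s+/, ""); // trim leading whitespace
    if (part.indexOf(name + "=") === 0) {
      return part.substring(name.length + 1);
    }
  }
  return "-";
}

// Hypothetical helper: pull one value (for example "utmcsr") out of the
// __utmz cookie value. Pairs are separated by "|"; the last pair has no
// trailing pipe, so it is read to the end of the string.
function sourceValue(utmz, key) {
  var start = utmz.indexOf(key + "=");
  if (start === -1) { return "-"; }
  start += key.length + 1;
  var end = utmz.indexOf("|", start);
  return end === -1 ? utmz.substring(start) : utmz.substring(start, end);
}

// Copy the source values into hidden fields. The field names used here
// are assumptions; use whatever names your form or CRM expects.
function populateHiddenFields(form, cookieString) {
  var z = readCookie(cookieString, "__utmz");
  form.elements["source"].value = sourceValue(z, "utmcsr");
  form.elements["medium"].value = sourceValue(z, "utmcmd");
  form.elements["term"].value = sourceValue(z, "utmctr");
  form.elements["content"].value = sourceValue(z, "utmcct");
  form.elements["campaign"].value = sourceValue(z, "utmccn");
  return form;
}

// In a browser this might be called once the page has loaded:
// populateHiddenFields(document.forms[0], document.cookie);
```

Any value that is not present in the cookie comes back as a dash, which makes missing data easy to spot on the server side.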
How The Code Works
When the above page loads in the browser the JavaScript starts to execute and extracts data from the cookies. First, it extracts the value of the __utmz cookie and stores it in a variable named z. Then it parses the z variable and looks for information about the visitor’s source. The __utmz cookie has a number of name-value pairs separated by a pipe (‘|’) character. Each name=value pair holds a different attribute of the visitor’s source. Here’s an example of the __utmz cookie:
12454562.1193706926.14.5.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=google%2Banalytics%2Bshortcut
You can see that the name-value pairs look very similar to the parameters we use for link tagging. Just by looking at the above cookie you can figure out that the visitor performed an organic search on Google for the term ‘google%2Banalytics%2Bshortcut’ or ‘google analytics shortcut’. That’s the type of information that we want to put in hidden form fields and send back to the server. [You can learn more about the __utmz cookie in the reference section below.]
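To make the cookie layout concrete, here is a small sketch that splits a __utmz value like the example above into its name-value pairs. The function name parseUtmz is made up for this illustration, and the code assumes the value always starts with four dot-separated numeric fields before the first pair.

```javascript
// Hypothetical helper: turn a __utmz cookie value into a plain object
// keyed by the pair names (utmcsr, utmccn, utmcmd, utmctr, ...).
function parseUtmz(value) {
  var result = {};
  // Drop the four leading dot-separated fields. Re-join the remainder
  // with "." because a source such as "example.com" contains dots too.
  var pairs = value.split(".").slice(4).join(".").split("|");
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf("=");
    if (eq === -1) { continue; } // skip anything that is not name=value
    result[pairs[i].substring(0, eq)] = pairs[i].substring(eq + 1);
  }
  return result;
}

var example = "12454562.1193706926.14.5.utmcsr=google|utmccn=(organic)|" +
              "utmcmd=organic|utmctr=google%2Banalytics%2Bshortcut";
var parsed = parseUtmz(example);
// parsed.utmcsr is "google" and parsed.utmcmd is "organic"
```

The keyword is still URL-encoded at this point, so it needs decoding before anyone reads it.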
Getting back to the code, we were talking about how the code extracts the information from the z variable. It uses a function named _uGC(), which is found in the urchin.js JavaScript, to do all of the work. _uGC() extracts the value part of a name-value pair in the cookie. We call _uGC() for each name-value pair that exists in the cookie. It parses the cookie and pulls out the information that we want. [If you want to know more about _uGC() please see the reference section at the end of this post.]
Once the information is out of the z variable the populateHiddenFields() function puts the data in a series of hidden form fields. Then, when the form is submitted, the data is sent to the server.
You’ll notice a few things about the above code. I’ve added some logic to deal with AdWords auto-tagging. Auto-tagging populates part of the cookie with a value named gclid. This variable hides some of the info that we need, like source and medium. The logic in the above code populates data that would otherwise be missing. I’ve also added some code that extracts the custom segment value which is stored in the __utmv cookie. I thought it would be useful to send this info back to the server as well.
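The auto-tagging logic and the custom segment extraction might look something like the sketch below. The function names are hypothetical, and the __utmv layout (a numeric domain hash, a dot, then the segment value) is an assumption for the example. Note that the gclid test checks that the value is NOT a dash; testing for equality would label every visitor as google / cpc.

```javascript
// `values` holds raw values taken from __utmz, where "-" means "missing".
// When a gclid is present, fill in the source and medium that AdWords
// auto-tagging leaves out of the cookie.
function applyAutoTagging(values) {
  var out = {
    source: values.source,
    medium: values.medium,
    gclid: values.gclid
  };
  if (out.gclid !== "-") {
    out.source = "google";
    out.medium = "cpc";
  }
  return out;
}

// Assumed __utmv layout: "<domain hash>.<segment value>".
// Returns the segment value, or "-" when the cookie does not match.
function customSegment(utmv) {
  var match = /^\d+\.(.*)$/.exec(utmv);
  return match ? match[1] : "-";
}
```

The regular expression is written as a literal with its slash delimiters, so it should parse cleanly in every browser.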
Conclusion
Pulling visitor source data and connecting it with a visitor is very valuable. While your implementation will almost certainly be different, the concept illustrated above is the foundation for all implementations. Regardless of your implementation the business use for this data is fantastic.
Good luck!
Reference
About the _uGC() Function
_uGC() takes three arguments:
_uGC(string, start-string, end-string)
• A string to search (target string)
• A start string
• An end string
The function will return the string between start string and end string. If the start string is not found then the function will return a dash (-).
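If you want to see that behavior in code, here is a rough stand-in written from the description above. It is not the urchin.js source, and the name getBetween is made up. One assumption worth calling out: when the end string is not found, this sketch reads to the end of the target string, which is what you want for the last name-value pair in the cookie because it has no trailing pipe.

```javascript
// Return the text between `start` and `end` inside `target`.
// Returns "-" when any argument is empty or `start` is not found.
function getBetween(target, start, end) {
  if (!target || !start || !end) { return "-"; }
  var i = target.indexOf(start);
  if (i === -1) { return "-"; }
  var from = i + start.length;
  var to = target.indexOf(end, from);
  if (to === -1) { to = target.length; } // assumed: read to the end
  return target.substring(from, to);
}

var z = "12454562.1193706926.14.5.utmcsr=google|utmccn=(organic)|" +
        "utmcmd=organic|utmctr=google%2Banalytics%2Bshortcut";
var source = getBetween(z, "utmcsr=", "|");  // "google"
var term = getBetween(z, "utmctr=", "|");    // last pair, no trailing pipe
var gclid = getBetween(z, "utmgclid=", "|"); // "-" because it is absent
```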
About The __utmz Cookie
The __utmz cookie is the referral-tracking cookie. It tracks all referral information regardless of the referral medium or source. This means that all organic, CPC, campaign, or plain referral information is stored in the __utmz cookie. By default the cookie expires in six months, but that can be customized by changing the tracking code.
Cookie Format:
domain-hash.ctime.nsessions.nresponses.utmcsr= X(|utmccn=X|utmctr=X|utmcmd=X|utmci
Data about the referrer is stored in a number of name-value pairs, one for each attribute of the referral:
utmcsr
Identifies a search engine, newsletter name, or other source specified in the utm_source query parameter. See the “Marketing Campaign Tracking” section for more information about query parameters.
utmccn
Stores the campaign name or value in the utm_campaign query parameter.
utmctr
Identifies the keywords used in an organic search or the value in the utm_term query parameter.
utmcmd
A campaign medium or value of utm_medium query parameter.
utmcct
Campaign content or the content of a particular ad (used for A/B testing). The value from utm_content query parameter.
utmgclid
A unique identifier used when AdWords auto tagging is enabled. This value is reconciled during data processing with information from AdWords.
Salesforce.com has built in visitor source (lead) tracking technology, an extension of the Google AdWords integration that is available to all salesforce customers. Find more information on the Salesforce for Google AdWords blog at: http://blogs.salesforce.com/adwords/2007/01/new_feature_upd.html
Your Google Analytics integration script presents an interesting method of dissecting the analytics cookie. We will certainly have a look at your code and possibly package a solution for our users to take advantage of.
Thanks,
-Kraig
Hi Kraig,
Thanks for the information. I knew that there was some integration between AdWords and SalesForce, but I did not know the details. It looks like you guys have this integration thing wrapped up! :)
The one great thing about the approach above is that it can be used with almost any type of lead generation form. You don’t need a CRM to collect and store the data. You could use an email or a simple text file.
Thanks again for the information and for reading.
Justin
Thanks again for the comment and thanks for reading.
Man, I just spent 6 hours last week writing this exact same functionality for our CRM!
Great work, as always, Justin. I think your code is cleaner than mine, so I’ll borrow a few lines.
Cheers,
Tyson @ NMG
Hey Tyson,
Sorry I did not get this post out sooner! I’m glad you took the initiative to try this. Let me know how it works out.
Thanks for reading and thanks for the feedback.
Justin
Haven’t had to deal with this one yet, but it’s good to know where to look in case I ever have to! I love the cookie/variable definitions at the end – great!
Thanks Justin!
Hey Justin,
Kudos for such an excellent article for GA power users.
– Ophir
Justin,
In your code, if (gclid == "-") is always evaluating to true, meaning every visitor gets set to google(cpc). I’ve tried with multiple browsers and systems, all with the same result. Should it be != instead?
Also, I prefer to pass this.id as the argument, and add the hidden fields dynamically using the form.appendChild() function. This eliminates having to add the form fields in the HTML, which means less manual edits. Just another option, I guess.
Tyson
Hi Tyson,
Thanks for catching that. I copied an old version of the code that was incorrect. I’ve updated the post with the correct code.
Also, thanks for the suggestion for improvements. I’m really glad that you’ve taken my idea and made it better! :) I am by no means the best programmer out there. My goal is to come up with some ideas, and have smarter people clean them up.
Thanks for the feedback,
Justin
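Tyson’s suggestion of adding the hidden fields dynamically could be sketched like this. The function name appendHiddenFields and the field names are assumptions for the example. The document object is passed in as an argument, which is a choice made here so the function can be exercised outside a browser.

```javascript
// Create one hidden input per entry in `values` and append it to `form`,
// so the fields do not have to be written into the HTML by hand.
function appendHiddenFields(doc, form, values) {
  for (var name in values) {
    if (!Object.prototype.hasOwnProperty.call(values, name)) { continue; }
    var input = doc.createElement("input");
    input.type = "hidden";
    input.name = name;
    input.value = values[name];
    form.appendChild(input);
  }
  return form;
}

// In a browser, with a form whose id is "leadForm" (a made-up id):
// appendHiddenFields(document, document.getElementById("leadForm"),
//                    { source: "google", medium: "organic" });
```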
Ophir,
Thanks! I’ve been reading your blog for a while and am flattered that you like the post.
Thanks,
Justin
Justin,
I can’t thank you enough for writing about this subject! I can’t wait to get this implemented using our contact form.
Also, you mention some good reasons for WHY someone would want this info which are all valid. Our biggest thing is knowing whether a banner ad or some other form of advertising works, so by including the source on the contact form, we’ll be able to track whether that lead converted to a sale. For us, it may only take one or two sales to know that the banner ad, etc. paid off and had a positive return on investment.
Thanks again for the great post.
Leslie
Justin
This is awesome code, we are using it on our site. However we still get quite a lot of visitors where it says “google | cpc | – “.
Any ideas why this might be?
Hello,
Thanks for your post, it will be very useful. In trying to implement your code, I get the following error in Firefox/2.0.0.9:
Error: syntax error
Line: 243, Column: 38
Source Code:
csegment = csegment.match([1-9]*?\.(.*));
I never really came to grips with regexps so would appreciate your help :)
Thanks
Blair
Hi Blair,
There was a missing character that was causing the error in FF 2.0.0.9. Things should run smoothly now.
Thanks for the heads up and thanks for reading the blog.
Justin
Alex,
The ‘-‘ is showing up in the term location, which means that there must be an issue with getting the term. The only thing I can think of is that this is what happens when there is traffic coming from the content network.
Other than that, I would just double check to make sure the code is configured the right way.
Thanks for reading and thanks for sharing your experience.
Justin
We tried it and can’t get the data to print on the form. Can anyone offer suggestions on what we are doing wrong? We are running ColdFusion and we inserted the code Justin provided, but the fields are not getting populated on the emailed form.
Tracking Data ========================================================================
=========
Source :
Medium :
Term :
Content :
Campaign :
Segment :
Hi Justin
Yes this is a great function and we have been using it for over 2 years.
Do you know if this (_uGC) will still work if we switch to the new Google Analytics tracking code, ga.js?
Thanks
Hi Tom,
Unfortunately, the new GA tracking code does not include the _uGC() function. If you switch to the new ga.js the best course of action is to manually add the _uGC function to your web pages.
Hope that helps,
Justin
hello all you smart people :D
this may seem like a daft question but i was wondering if you could help anyway….
having looked at my cookie files i have noticed multiple _utmz cookies, each from a different domain. how would this script react to that? would it read the values in all of the _utmz cookies??
the reason i ask is because i want to read the contents of the _utmz cookie for my domain only.
thanks ;)
Hi Hopeful,
The code should only pull the data from the cookies that are specific to your website. GA uses first party cookies which means that the code will only work on the __utmz cookies that were set by your site.
All of the other __utmz cookies, which probably were set by other sites, will be ignored.
“hopefully” that helps you out!
Thanks for the question.
Justin
Hi Justin –
How do you map the gclid with a particular campaign in your AdWords account? The gclid values seem to be very long alphanumeric strings. In my AdWords account, all my campaigns have nice human-readable names. If I hover over the link to a particular campaign in my AdWords account, I am able to identify a “campaignId” parameter, but this is an 8-digit numeric value which seems to have no correlation to the gclid.
Thanks!
Cailin.
Thanks Justin!
Hi Cailin,
Unfortunately there is no way to map GCLID to an actual campaign or ad group. Google does the mapping when processing the data. The only thing we can discern from GCLID is that the visitor came from a Google cpc ad.
Also, as you point out, the campaignId parameter is not related to GCLID.
Thanks for the question,
Justin
Hi Justin,
Great article. I have a question about the _uGC function. I couldn’t find any official documentation from Google about this function, so I’m afraid to use it in my code in the event that Google decides to rename it or remove it someday. Do you have any idea on how likely that is?
Thanks.
Hi again,
I have another comment. I wonder if there’s a mistake in your code above, specifically the 6 lines which use _uGC. It looks like they assume each key/value pair will be followed by a pipe “|” character, but I don’t think the trailing key/value pair is followed by a pipe.
Thanks.
Hi Timothy,
The _uGC function is in the urchin.js library, you should be able to find it in there. The last I checked it was still there. The pipe characters in the code are the delimiter values used in the utmz cookie to separate the various pieces of referral information.
As always, if you have any concerns about this code working you should test it on your own development server. This post is meant to be a guide. Given all the different websites out there your implementation will almost certainly be different.
Best of luck,
Justin
Hi,
Thanks a lot for the post. I have a small question. We use AdWords a lot, so I want to know the exact search query typed by the user. I am able to get that keyword in utmctr only for organic traffic, not for paid search from any search engine. utmctr does not exist in the case of paid search. What approach should I follow to get the data for paid search?
Thank you
Vipul
Justin,
Your O’Reilly G.A. PDF book is a great resource. I’m glad to see the updated JavaScript here. We want to save Adword data that brought users to our site.
Although it is clear how to get these __utmz cookie parameters, it is still unclear to me what they really mean:
* utmccn (campaign name) – Does not seem to match our Adword Campaigns in our testing (is either not set, organic, direct) What campaign is this?
* utmcmd (campaign medium) – Usually is ‘not set’. What would be an example of data in this parameter?
Thank you for the information you supply,
Nat
Hey Nat,
Because AdWords is integrated with Analytics we don’t get all the usual PPC data in the utmz cookie. That’s why there is no value for campaign, medium and source.
However, we do have a value called gclid. If this value is present in the cookie then we can force some of the other values, specifically source (to google) and medium (to cpc), to a value.
To see some other values for source and medium check out the All Traffic Sources report. It lists all combinations of the source and medium values.
Unfortunately we can’t get the campaign name when we’re collecting data from AdWords, that’s obscured by the auto-tagging feature. GA automatically pulls that value from AdWords during data processing.
Hope that helps.
Justin
Excellent technique, we use it quite a lot. I posted an example of how it can work with the wufoo form building web app at http://www.borism.net/2008/12/24/integrating-google-analytics-and-wufoo/
Rock on,
– Boris
Vipul,
Capturing the exact search term can be challenging as GA passes the bid term. You would need to tweak the script to capture the exact search term. I think this is a great idea and I’ll add it in a newer version of the script.
Thanks for the idea,
Justin
This is great, exactly what I am looking for! We are using the ga.js tracking code. It sounds like we need to add extra code to all the pages for this to work with the _uGC() function? Can you confirm this and what exactly needs to be added and where in the code?
Thanks!
Hi Michelle,
I’ll try to get the new code out ASAP in an updated post.
Justin
Hi Justin,
Hope you are well. Is there a new version of this for the new GA code?
Also, when is there a new version of the book out?
Hey Matt,
Great to hear from you. I hope all is well.
I’m working on a new post and the book. I’m actually late on the book and the post :)
Can you get me more than 24 hrs in a day?!
Justin
Justin,
Very useful post. I’ve been looking at the GA cookie and noticing that the utmccn variable is getting set to organic even when someone comes to my site from an AdWords campaign. I don’t have conversion goals set up in Analytics, but that shouldn’t affect the cookie, right? Any ideas?
@Michelle – You can now do all this with the Google Analytics data export API, but the technique above is still valid.
You could also just inspect urchin.js and rip out the function you need that allows you to read cookie values ;).
This kind of insight is perfect for working out the lifetime value of a customer and getting true ROI for Google AdWords advertising and of course other mediums.
Hey Justin,
Your O’Reilly Shortcut is great!
For the “term” variable, I’ve used the “unescape” function before assigning it to the hidden input tag and I find the output more readable. Do you know of any unexpected errors this might cause?
Thanks Leo! The Short Cut is in dire need of an update, which is coming soon!
Great idea adding an unescape. Off the top of my head I can’t think of any issues that it may cause. That’s a great addition!
Justin
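For anyone who wants to try Leo’s idea, here is one way the decoding step might look. It uses decodeURIComponent in place of the older unescape function, and the name decodeTerm is made up for the example. One thing to watch for: decodeURIComponent throws a URIError on a malformed escape sequence, so the sketch falls back to the raw value in that case. It also assumes a plus sign in the decoded term stands for a space, which would lose a literal plus sign typed by the visitor.

```javascript
// Decode the keyword from the utmctr value into readable text.
function decodeTerm(raw) {
  try {
    // Decode percent-escapes first, then treat "+" as a space.
    return decodeURIComponent(raw).replace(/\+/g, " ");
  } catch (e) {
    return raw; // malformed escape sequence: leave the value untouched
  }
}

var readable = decodeTerm("google%2Banalytics%2Bshortcut");
// readable is "google analytics shortcut"
```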
Does anyone have any insight into server-side scripting for Ruby? Would the client-side JavaScript work as well as a server-side script?
Will additional JavaScript slow down the site?
@Taylor: I’m not quite sure what you’re asking for re: Ruby. If you’re looking for Ruby help there are a number of great resources out there. Re: the JS slowing down the site, it is possible. But it all depends on your site, the server and the JS code. The code in this post is meant to be an example. You should always create code that is compatible with your site.
Best,
Justin
I use this script to send the cookie data from Analytics via a form. It worked perfectly. Since last week it doesn’t work anymore. Does somebody know if something has changed in the way the cookies are stored?
Rick, there was a change to ga.js earlier this week. But the change was quickly reversed. I’ll update the blog post to include more safeguards soon.
great post, thanks a lot! ive been researching about analytics and crm then I found your post. :)
Hmm, this seems a great idea to get into. Well, on a different note since I hate coding myself, I’ll better be showing this to my web developer and he’ll go implement it. Thank you mate!
Excellent and helpful method/explanation.
I was hoping to also pull the user’s Network (or ‘Service Provider’) and pass through such a form, but I can’t seem to find a Google Analytics cookie that stores this (though the data is collected/displayed in reports).
hey justine!
i wonder why the codes you posted cannot be seen on your article?
i tried to save the whole page, thought that there’s a file attached to your article, but i didn’t find anything.
i can’t find your codes. :(
eirika