Count Me Out: GA.JS Version

A while back I wrote a post called Count Me Out! that explained how to exclude Google Analytics data based on the custom segment value.

My previous post was based on the old, urchin.js tracking code, and a lot of people have been waiting for an update. It’s taken a while, but here it is.

I will mention that my favorite way to exclude traffic from Google Analytics is an IP exclude filter. An IP-based exclude filter is very accurate unless you have a changing IP. The method below works best if you have a dynamic or changing IP address.

Even if you’re not interested in this post, there is a fun ‘group activity’ below. Please try it!

The old version of this method required you to add a new page to your website. That page would set the GA custom segment cookie (named __utmv) on your computer.

This technique works fine, but who wants to add a new page to their site? It can be a pain.

I’ve simplified this technique by removing that page. You can enter the JavaScript directly into your browser.

Step 1: Set Custom Segment Cookie

Go to the site that you are tracking with Google Analytics and view a page.

Copy the code below, paste it into the location bar of your browser, and press Enter on your keyboard.

You should see a message that says, “Custom segment has been set. Time to create a filter.”
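A minimal sketch of what such a location-bar one-liner could look like, assuming the page uses the standard ga.js snippet with a tracker object named pageTracker and sets the segment through its _setVar() method. The helper name buildSegmentBookmarklet is made up for illustration; it only builds the line to paste.

```javascript
// Hypothetical helper: builds a javascript: URL that sets the GA custom
// segment cookie (__utmv) through ga.js. Assumes the page's tracker object
// is named pageTracker, as in the standard ga.js snippet.
function buildSegmentBookmarklet(segmentValue) {
  // JSON.stringify produces a safely quoted JavaScript string literal.
  var value = JSON.stringify(String(segmentValue));
  var message = JSON.stringify('Custom segment has been set. Time to create a filter.');
  return 'javascript:pageTracker._setVar(' + value + ');alert(' + message + ');void(0);';
}

// Paste the printed line into the location bar, or save it as a bookmark.
console.log(buildSegmentBookmarklet('remove-me'));
```

The trailing void(0) keeps the browser from replacing the page with the result of the expression.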

Here’s a tip: you can bookmark this JS to make it easy to reset the cookie in the future. I set the cookie every time I fire up my browser.

Step 2: Create Exclude Filter

Next, create an exclude filter in Google Analytics to exclude the user defined segment (i.e. the cookie) you just created:

This filter will exclude anyone with a custom segment cookie with a value of ‘remove-me’.
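To make the filter logic concrete, here is a small simulation of an exclude filter on the User Defined field. It is only an illustration of the matching behavior, not GA's actual processing code, and the visit objects are invented.

```javascript
// Simulates an exclude filter on the User Defined field.
// An illustration of the matching logic, not GA's actual code.
function applyExcludeFilter(visits, pattern) {
  var re = new RegExp(pattern);
  return visits.filter(function (visit) {
    // A visit is dropped when the pattern matches its user-defined value.
    return !re.test(visit.userDefined || '');
  });
}

var visits = [
  { id: 1, userDefined: 'remove-me' },
  { id: 2, userDefined: '' },
  { id: 3, userDefined: 'customer' }
];
var kept = applyExcludeFilter(visits, 'remove-me');
console.log(kept.map(function (v) { return v.id; })); // [ 2, 3 ]
```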

Remember, cookies are specific to a browser and computer. If you use multiple browsers or multiple computers, you need to set the cookie in every browser on every computer you use.

Having Some Fun With This!

You’ve probably figured out that you can set the custom segment cookie on anyone’s website as long as they’re using GA. This means that you can add data to their User Defined report. Let’s try this on my site!

Navigate to www.cutroni.com/blog and place the following code in the location bar of your browser after the page has loaded. Change FOO to whatever you want and press Enter on your keyboard. That’s it. You’re now in my data.

I’ll post some of the more popular and creative values in Twitter and maybe here at a later date.

Please try to keep it clean. I often review data with my 4 year old son :)

Thoughts on the Old Method

The old version of this technique uses a form to set the custom segment cookie which is pretty handy if you have a lot of people in remote locations that need to be excluded from the data. Just send all your coworkers, contractors, etc. a link to the page and ask them to set the cookie on their computer. It’s a little easier than asking them to paste JS into their browser.

If you’re interested in using this technique here is a new version of the page and the process.

Step 1: Create a new page on your site using the code below.

** Note ** The information below is in an iFrame. If you receive this post via email you may not see the contents.

Step 2: Go to the new page you just added, fill out the form, and click the ‘Create Cookie’ button. Keep track of the value you enter into the form; you’ll need it for step 3.

Step 3: Finally, create an exclude filter in Google Analytics to exclude the value that you entered into the form. Remember, the filter pattern is a regular expression. So if you entered ‘remove-me’ in the form, set the Filter Field to User Defined and enter ‘remove-me’ as the Filter Pattern.

That’s it. Sorry for the lame post, but I’m trying to update a lot of the old code and posts on the site.

UPDATED: Integrating Google Analytics with a CRM

A while back I wrote a post about how you can integrate Google Analytics with a CRM system. That post referenced the old urchin.js code and is in need of an update. Plus, a lot of people have been asking me about different integration options, so I thought it would be a good time to revisit the topic.

So here it is, an update. I’m not going to re-do the entire post, just a few key points.

Why Do This?

This entire configuration is based on one simple idea: to build a more robust view of customers/prospects/leads/whatever-you-call-them. The more we know about the people we sell to, the more we can adjust our marketing. This technique lets us combine where the visitor came from (referral, campaign, direct, search engine) and how often they’ve been to the site with CRM data.

How It Works

We’re going to extract data from the GA tracking cookies and add it to a lead generation form. The GA data will be stored in hidden form elements that the visitor cannot see. When the visitor submits the form, the GA data will be connected to the other information that the visitor entered into the form (usually name, contact information, etc.).

Adding GA data to a CRM

Here’s the data that we’re going to pull from the GA cookies:

* Current referral information (i.e. where the visitor came from)
* Custom segment value (if it exists)
* How many times the visitor has been to the site
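As a rough sketch of the hidden-field idea, the helper below turns those three data points into hidden form inputs. The field names (ga_referral, ga_segment, ga_visits) and the sample values are hypothetical; use whatever names your CRM form expects.

```javascript
// Escapes characters that are unsafe inside an HTML attribute value.
function escapeAttr(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Renders one hidden <input> per data point. Field names are hypothetical.
function renderHiddenFields(gaData) {
  return Object.keys(gaData).map(function (name) {
    return '<input type="hidden" name="' + escapeAttr(name) +
      '" value="' + escapeAttr(gaData[name]) + '">';
  }).join('\n');
}

console.log(renderHiddenFields({
  ga_referral: 'google / organic',
  ga_segment: 'customer',
  ga_visits: 3
}));
```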

You’re probably wondering why we’re not extracting the data via the GA API. More on that in a minute.

The Code

Here it is, the updated code:

The new code differs from the original code in three ways:

First, the inclusion of ga.js. No explanation needed there.

The second change is the addition of the _uGC() function. This function was in the original urchin.js, and because I don’t include that file I need to include the code in the script.

The function is used to parse a string and return a specific part of it. For example, I use it to parse the cookies (which are just a string) and return the value of the campaign cookie (__utmz), the custom segment cookie (__utmv) and the visitor identification cookie (__utma).

Finally, I added some new functionality to include this visitor’s number of visits. This data is pulled from the __utma cookie (which identifies the visitor). The last integer in __utma is actually a visit counter. I think it is interesting to understand behavior based on frequency so I added this nugget of data.
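Here is a minimal sketch of that kind of cookie parsing. It is not the original _uGC() code; the function names are my own, and the sample cookie values are invented to follow the usual dot-separated layout of __utma and the pipe-separated campaign fields in __utmz.

```javascript
// Returns the value of a named cookie from a cookie string such as
// document.cookie, or null when the cookie is absent.
function getCookie(cookieString, name) {
  var parts = cookieString.split(/;\s*/);
  for (var i = 0; i < parts.length; i++) {
    var eq = parts[i].indexOf('=');
    if (eq > -1 && parts[i].slice(0, eq) === name) {
      return parts[i].slice(eq + 1);
    }
  }
  return null;
}

// The last dot-separated integer in __utma is the visit counter.
function getVisitCount(utma) {
  var fields = utma.split('.');
  var count = parseInt(fields[fields.length - 1], 10);
  return isNaN(count) ? null : count;
}

// Pulls one campaign field (e.g. utmcsr, the source) out of __utmz.
function getCampaignField(utmz, key) {
  var match = utmz.match(new RegExp('(?:^|[.|])' + key + '=([^|]*)'));
  return match ? match[1] : null;
}

// Invented sample values; in a browser you would pass document.cookie.
var cookies = '__utma=12345.678.1199145600.1199232000.1199318400.3; ' +
  '__utmz=12345.1199318400.3.2.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=excel; ' +
  '__utmv=12345.customer';

console.log(getVisitCount(getCookie(cookies, '__utma')));              // 3
console.log(getCampaignField(getCookie(cookies, '__utmz'), 'utmcsr')); // google
console.log(getCookie(cookies, '__utmv'));                             // 12345.customer
```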

If you want to see the code in action try this:

In my previous post I included a reference section that documents the format of the campaign cookie and the _uGC() function. Check it out if you’re looking for more technical information.

An Alternate Approach

Remember, all the data we’re pulling is in first party cookies. This means that you could extract the info at the server level rather than the browser level (first party cookies are available to server side code).

You don’t need to do this in JavaScript, as I did. You could use ColdFusion, PHP, .NET or some other server side language.
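For example, a server-side version might start by parsing the raw Cookie request header. The sketch below stays in JavaScript and assumes a Node.js server, where the header is available as req.headers.cookie; the sample header is invented.

```javascript
// Parses a raw Cookie request header into a name -> value map.
function parseCookieHeader(header) {
  var jar = {};
  (header || '').split(';').forEach(function (pair) {
    var eq = pair.indexOf('=');
    if (eq === -1) return;               // skip malformed entries
    var name = pair.slice(0, eq).trim();
    var value = pair.slice(eq + 1).trim();
    if (name) jar[name] = value;
  });
  return jar;
}

// In a Node.js http handler the header would come from req.headers.cookie.
var header = '__utma=12345.678.1199145600.1199232000.1199318400.3; __utmv=12345.customer';
var jar = parseCookieHeader(header);
console.log(jar.__utma.split('.').pop()); // 3 (the visit counter)
console.log(jar.__utmv);                  // 12345.customer
```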

I’m just trying to provide some inspiration here :)

What About the Google Analytics API?

I’m sure there are a lot of people out there wondering about the GA API and why I don’t just use it to pull data out from GA. The reason is the data in GA is all anonymous, aggregated data. There’s no way to reach into GA and say, “tell me where John Doe came from.” That’s why we pull data from the cookie, because we can combine it with a visitor’s info when they supply it.

Sure, it is possible to pull out GA data and place it in a CRM (like traffic from an email campaign, bounce rate from paid search, etc.) but it’s tough to identify individuals in that data.

Is it impossible? No, not from a technical standpoint. You would need to place a unique identifier in GA to link the data.

But we’re getting into a risky area here.

Remember, it is against the GA terms of service to capture data in GA that identifies an individual. This could be an email address or database ID number. Not my rule, Google’s, and they will crack down.

I’d be interested to hear people’s thoughts on this. Feel free to comment!

“Enterprise” Google Analytics

Is Google Analytics an “enterprise” class analytics solution? That’s debatable, and in fact, it has already been debated.

In my opinion, it depends. It depends on your analytics needs.

We’ve worked with plenty of “enterprise” class organizations that were new to web analytics. They had very simple needs and GA met most of them easily. We’ve also told companies that GA is not right for them because it did not fit their core needs.

Your organization may be different. You may need a tool that integrates with ODBC data sources, something that GA doesn’t do very well. If that’s the case then you might need to go with a different tool. But again, it all depends.

Google Analytics Enterprise-ness

But the point of this post is not to debate GA’s “enterprise-y-ness”, but to address some of the common issues that we usually see during an enterprise installation.

Issue #1. Tracking All Sites Logically

Major League Baseball

Large organizations tend to have more sites, and more sites mean more data. Collecting the data in an organized fashion that allows room for growth and appropriate access for users takes time and planning.

During an enterprise implementation we usually create a series of accounts and profiles that segment the data based on business logic and access needs. We create a data hierarchy that provides high level aggregate tracking across the entire online experience (i.e. roll-up reporting) and detailed tracking for each individual property.

Let’s consider the websites for Major League Baseball. Each team has their own site located on a subdomain. There is also an MLB store and different micro sites dedicated to things like the All Star Game and the World Series.

Lots of content on many different sites. While the exact implementation solution will depend on their specific needs, it probably involves collecting all the data in a single profile for roll-up reporting and then creating profiles for each team and micro site for detailed reporting.

Issue #2. Unique Visitors

Tracking lots of domains usually leads to an issue with unique visitor tracking. GA uses a first party cookie to identify each visitor. This means that if a visitor visits 3 different domains they will receive 3 different cookies and appear as three different unique visitors.

Now, I know GA has a cross domain tracking feature. But what happens if an enterprise wants to know the unique visitor count across 50 web properties? Installing cross domain tracking on that scale is a huge task. In fact, it’s a pain in the ass.

Many of the clients that I’ve worked with have compromised and ignored unique visitor tracking.

You may be different. Unique visitors may be the one critical metric that you can’t live without. Could you use GA? Maybe, but you should carefully weigh the implementation needs vs. your analysis needs.

Unique Visitors are Unique!

Issue #3: Page Tagging

When I first started working with GA I never thought that tagging pages would be an issue, but it is. It’s not so much a technical issue as it is an organizational issue. Big companies can have so many sites with so many nooks and crannies. It can take a lot of work to identify every site, find an owner and then get the tags placed in the appropriate place.

And let’s not forget non-HTML pages. Tracking non-HTML content with Google Analytics can be a huge challenge. You can’t slap a JavaScript tag on a PDF. When we work with large organizations we usually help them develop an automated click tracking script. This takes more time and more effort and doesn’t always work (usually due to page rendering delays).
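The core of a click tracking script like that can be sketched as follows. This is one possible approach under stated assumptions: the extension list and the /downloads prefix are invented, and the tracker is any object that exposes ga.js’s _trackPageview() method.

```javascript
// File extensions to treat as non-HTML downloads (an illustrative list).
var DOWNLOAD_EXTENSIONS = /\.(pdf|docx?|xlsx?|pptx?|zip)$/i;

// Returns the virtual pageview path for a download link, or null when the
// link points at a regular page. The /downloads prefix is hypothetical.
function virtualPathForLink(href) {
  var path = href.split(/[?#]/)[0];               // drop query string and fragment
  path = path.replace(/^https?:\/\/[^\/]+/i, ''); // drop scheme and host
  if (!DOWNLOAD_EXTENSIONS.test(path)) return null;
  return '/downloads' + (path.charAt(0) === '/' ? path : '/' + path);
}

// Reports the click through a ga.js-style tracker if the link is a download.
function trackDownloadClick(tracker, href) {
  var virtualPath = virtualPathForLink(href);
  if (virtualPath) tracker._trackPageview(virtualPath);
  return virtualPath;
}
```

In a browser you would attach trackDownloadClick to the click event of each link once the page has rendered, which is where the rendering delays mentioned above come into play.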

Issue #4. URL Structure

URL Structures can be manually created using Google Analytics.

This is probably one of the most difficult challenges we face when working with large sites that have hundreds of thousands of pages. GA will only track 50,000 unique URLs per day. While this is completely adequate for most sites, “enterprise” sites can exceed this limit, especially if the site is content based (think about some of today’s largest community sites; they have forums, blogs, and tons of user generated content).

What happens when you fill GA with 50k unique URLs in a day? You start to see ‘(other)’ in your content reports and you can no longer identify which pages visitors are viewing on your site.

To resolve this issue we usually need to create some type of bucketing strategy to ‘roll up’ pageview data into different content categories. This is normally done by matching requested URL patterns at the server level, and then generating a ‘virtual’ pageview in GA.
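A bucketing rule set of this kind can be sketched in a few lines. The patterns and bucket names below are made up for illustration; the returned path is what you would report as the virtual pageview.

```javascript
// Ordered list of URL patterns and the bucket each one rolls up into.
// Patterns and bucket names are made up for illustration.
var BUCKETS = [
  { pattern: /^\/forums\/thread\/\d+/, bucket: '/forums/thread' },
  { pattern: /^\/blog\/\d{4}\/\d{2}\//, bucket: '/blog/post' },
  { pattern: /^\/members\/[^\/]+/, bucket: '/members/profile' }
];

// Returns the bucketed path to report, falling back to the original path.
function bucketPath(requestPath) {
  for (var i = 0; i < BUCKETS.length; i++) {
    if (BUCKETS[i].pattern.test(requestPath)) return BUCKETS[i].bucket;
  }
  return requestPath;
}

console.log(bucketPath('/forums/thread/48213?page=2')); // /forums/thread
console.log(bucketPath('/blog/2008/01/some-post/'));     // /blog/post
console.log(bucketPath('/about/'));                      // /about/
```

The trade-off is deliberate: thousands of distinct URLs collapse into a handful of rows, so you lose page-level detail in that profile in exchange for staying under the daily limit.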

Sometimes we segment the data into different profiles, thus giving us more ‘buckets’ to store the data.

Again, the exact solution depends on many different factors, but this issue can be mitigated with some effort.

Issue #5. Campaign Tracking

This is a problem for everyone! I find very few clients who are diligent about tracking their marketing campaigns using link tagging. As a general rule of thumb, the bigger the client the more challenging it is to track all online campaigns. Why?

Big organizations have different people running different campaigns. Many times they’re using one or more agencies to help run their campaigns. Getting everyone to use a cohesive link tagging strategy is a lot of work due to the sheer number of people that are involved. This is more of a training/process issue than a technical issue.
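One way to keep tagging consistent is to hand everyone the same small helper instead of a style guide. The sketch below builds a tagged link from the standard GA campaign parameters (utm_source, utm_medium, utm_campaign); lower-casing the values is a convention I chose for the example, and the URL and campaign names are invented.

```javascript
// Appends GA campaign parameters to a landing page URL.
// utm_source, utm_medium and utm_campaign are the standard GA link tags;
// lower-casing the values is a convention choice for this sketch.
function tagCampaignUrl(landingUrl, campaign) {
  var url = new URL(landingUrl);
  ['source', 'medium', 'campaign'].forEach(function (key) {
    if (!campaign[key]) throw new Error('Missing campaign ' + key);
    url.searchParams.set('utm_' + key, String(campaign[key]).trim().toLowerCase());
  });
  return url.toString();
}

console.log(tagCampaignUrl('http://www.example.com/landing', {
  source: 'Newsletter',
  medium: 'Email',
  campaign: 'Spring Sale'
}));
```

Because the helper refuses to build a link when a value is missing, an untagged or half-tagged campaign link fails loudly before it ever goes out.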

Wrapping Up

If you’re an enterprise organization, or consider yourself an enterprise organization, don’t discount GA without taking a hard look at your real analytics experience and your needs. GA might just work for you.

If you do decide to use GA don’t expect to slap the tags on your site and finish the configuration in a week. Like every tool out there, it takes time and planning to get things right.

Do you have experience with GA in a large, “enterprise” environment? Leave a comment and share your thoughts.

Tying clicks & content to conversion in GA

Many site owners spend a lot of time creating content that is supposed to drive conversions. But what’s the best way to measure the performance of this content with Google Analytics? How can we measure a specific piece of content, be it a page or a piece of creative, and its effect on conversions?

Google Analytics has a metric called $Index to help measure the “value” of site pages. But the problem with $Index is that it is an average, and averages can be skewed very easily. $Index is about the performance of a page, not the content on a page. Also, many people want to know how many times a piece of content directly led to a conversion. We just can’t get that with $Index.

We could view this type of analysis as a navigation analysis. Google Analytics has the All Navigation report and the Initial Navigation report, but these reports track things that happen in under 3 clicks and not everyone converts in 3 clicks.

Rather than tackle this problem using navigational analysis, let’s consider it a content challenge. What we want to do is see if a specific piece of content ultimately led to a conversion.

Google Analytics site overlay report.

Given this approach we could use the Site Overlay report, which is supposed to show the performance of each link on a page. But, in my experience, the Site Overlay report is buggy at best (and I’m being nice).

What we need is a way to link a piece of content, i.e. a pageview, to a conversion. There’s a very simple way to do this using the Funnel Visualization report.

The Concept

Each funnel has a ‘required step’ setting. When enabled, this setting requires that the visitor views the first step in the funnel prior to conversion. If the visitor does not see the first step then the Funnel Visualization report will not count a conversion. The conversion will still be recorded in all other reports, but not the Funnel Visualization report.

What few people know is that it does not matter when the ‘required step’ is viewed, as long as it is viewed prior to the conversion.

We’re going to use this setting to associate a conversion with the content we want to evaluate.

Example

Let’s say I want to track how many people view the About Me page on this blog before subscribing to my RSS feed. I can create a goal and funnel that links the About Me pageview to the RSS subscription goal.

Step 1: Set up the goal

The first step is to create the goal. Just set up the goal like any other normal goal. Identify the goal URL, give the goal a name and a value (if you so desire).

Google Analytics goal settings

Step 2: Identify the “required step”

Now let’s turn to the funnel. Remember, step 1 in the funnel, the ‘required step’, is really the piece of content (i.e. the pageview) we want to evaluate in terms of conversions. Simply add the page URL to the Step 1 URL field, give the step a name, and check the “required step” checkbox.

Google Analytics funnel settings.

That’s it! There’s nothing else to do. The funnel visualization report, for this specific funnel, will only show a conversion if the visitor views the About Me page at some point prior to conversion. GA doesn’t care when the visitor sees the page, as long as they see the page prior to conversion.
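To make the counting rule concrete, here is a small simulation of a funnel with a required first step. It only illustrates the logic described above and is not GA’s processing code; the page paths and visits are invented.

```javascript
// Counts goal conversions the way a funnel with a required first step would:
// a visit counts only if the required page was viewed at any point before
// the goal page. A simulation for illustration, not GA's processing code.
function countConversions(visits, goalPath, requiredPath) {
  var total = 0;
  var viaRequiredStep = 0;
  visits.forEach(function (pages) {
    var goalIndex = pages.indexOf(goalPath);
    if (goalIndex === -1) return;              // no conversion in this visit
    total++;
    var requiredIndex = pages.indexOf(requiredPath);
    if (requiredIndex !== -1 && requiredIndex < goalIndex) viaRequiredStep++;
  });
  return { total: total, viaRequiredStep: viaRequiredStep };
}

var visits = [
  ['/', '/about-me', '/posts', '/rss-click'], // counted: About Me came first
  ['/', '/rss-click'],                        // converted, never saw About Me
  ['/about-me', '/rss-click'],                // counted in the funnel
  ['/', '/rss-click', '/about-me'],           // About Me came too late
  ['/', '/about-me']                          // no conversion
];
console.log(countConversions(visits, '/rss-click', '/about-me'));
// { total: 4, viaRequiredStep: 2 }
```

The first visit shows the key point: other pages can sit between the required step and the goal, and the conversion still counts.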

The Data

Here’s a sample Funnel report. We can see that 4 conversions occurred after viewing the About Me page. Remember, it does not matter when the About Me page was viewed, as long as it was viewed prior to conversion.

Google Analytics funnel visualization report.

Now, if we compare the number of conversions in the Funnel Visualization report, to the overall number of conversions for this goal, we notice there is a difference.

Google Analytics goal conversions.

The difference is the number of visits that did not include the About Me page prior to subscribing to the RSS feed. There were 8 total RSS Subscription conversions, but only 4 of those conversions viewed the About Me page prior to converting. Now we know how effective the About Me page was at driving RSS subscriptions.

Taking it Further

What about associating a set of pages with a conversion activity? No problem, just use a regular expression to define your required step. Here’s the same example, but I’ve tweaked it to track visits that include the About Me page or the All Posts page.
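As an illustration, a required-step expression covering two pages might look like the pattern below. The paths /about-me and /all-posts are placeholders rather than the real URLs on this blog, and JavaScript’s RegExp stands in for GA’s matching.

```javascript
// Hypothetical required-step pattern: matches either page, with or without
// a trailing slash. Replace the paths with your own URLs.
var requiredStepPattern = '^/(about-me|all-posts)/?$';
var requiredStep = new RegExp(requiredStepPattern);

['/about-me/', '/all-posts', '/contact/', '/about-me/team/'].forEach(function (path) {
  console.log(path + ' -> ' + requiredStep.test(path));
});
// /about-me/ -> true
// /all-posts -> true
// /contact/ -> false
// /about-me/team/ -> false
```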

GA funnel to associate multiple pages with a goal.

And remember, the pageview you specify for your required step does not need to be a real “pageview.” It can be a virtual pageview, generated with the pageTracker._trackPageview() method. In fact, that’s what I’m doing on the blog. I generate a virtual pageview every time someone clicks on the RSS icon.

This technique is very useful if you want to measure how well a specific piece of content on a page is performing. Generate a pageview when a visitor clicks on the content and use it as step 1 in the funnel.
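A click handler for this could be as small as the sketch below. The tracker argument is any object exposing ga.js’s _trackPageview(path) method (pageTracker in the standard snippet); the /virtual/ prefix and the slug rule are naming conventions invented for the example.

```javascript
// Sends a virtual pageview for a click on a piece of content.
// `tracker` is any object exposing ga.js's _trackPageview(path) method;
// the /virtual/ prefix is a naming convention invented for this sketch.
function trackContentClick(tracker, contentName) {
  var slug = String(contentName)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')   // collapse anything else into hyphens
    .replace(/^-|-$/g, '');        // trim leading/trailing hyphens
  var path = '/virtual/' + slug;
  tracker._trackPageview(path);
  return path;
}

// Stand-in tracker so the sketch runs outside a browser.
var fakeTracker = {
  sent: [],
  _trackPageview: function (path) { this.sent.push(path); }
};
trackContentClick(fakeTracker, 'RSS Icon');
console.log(fakeTracker.sent); // [ '/virtual/rss-icon' ]
```

The path returned here is what you would enter as the Step 1 URL of the funnel.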

Think this is a good idea? Got one that’s better? Leave a comment!

Google Analytics E-Commerce Tracking Pt. 1: How It Works

This post is the first in a series on e-commerce transaction tracking with Google Analytics. Why is e-commerce tracking important? Well, transaction data is a vital piece of information when analyzing online business performance.

Sure, it’s great to measure things like conversion rate, but revenue is much more tangible to many business owners. Having the e-commerce data in your web analytics application makes it easier to perform analysis. Do you need to set up e-commerce tracking? No, but it sure helps. :)

The Big Picture

E-commerce tracking is based on the same principle as standard pageview tracking. JavaScript code sends the data to the Google Analytics servers by requesting an invisible gif file. The big difference is that e-commerce data is sent rather than pageview data.

But how does Google Analytics get the e-commerce data? That’s the tricky part. You, the site owner, must create some type of code that inserts the transaction data into the GA JavaScript. Sounds tricky, huh? Well, it’s not that bad.

Step by Step: How it Works

Let’s break it down and walk through what actually happens.

1. The visitor submits their transaction to your server.

2. Your server receives the transaction data and processes the transaction. This may include a number of steps at the server level, such as sending a confirmation email, checking a credit card number, etc.

3. After processing the transaction the server prepares to send the receipt page back to the visitor. While preparing the receipt page your server must extract some of the transaction data and insert it into the Google Analytics JavaScript. This is the code that you must create.

4. The receipt page is sent to the visitor’s browser.

5. While the receipt page renders in the visitor’s browser the e-commerce data is sent to Google Analytics via special GA JavaScript.

Here’s a basic diagram of the process. Again, the biggest challenge during implementation is adding code to your web server that inserts the transaction data, in the appropriate format, into the receipt page. I’ll cover the setup in part 2 of this series.

What Data can be Tracked?

Google Analytics collects two types of e-commerce data: transaction data and item data. Transaction data describes the overall transaction (transaction ID, total sale, tax, shipping, etc.) while item data describes the items purchased in the transaction (sku, description, category, etc.). All of this data eventually ends up in GA reports. Here’s a complete list of the data:

Transaction Data

  • Transaction ID: your internal transaction ID [required]
  • Affiliate or store name
  • Total
  • Tax
  • Shipping
  • City
  • State or region
  • Country

Item Data

  • Transaction ID: same as in transaction data [required]
  • SKU
  • Product name
  • Product category or product variation
  • Unit price [required]
  • Quantity [required]
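To show how these fields end up on the receipt page, here is a sketch of server-side code that renders the tracking calls from an order. The order object’s shape is hypothetical, and the sketch assumes the tracker is named pageTracker; the argument order follows the two lists above and ga.js’s _addTrans(), _addItem() and _trackTrans() methods.

```javascript
// Quotes each value as a JavaScript string literal for the receipt page.
function jsArgs(values) {
  return values.map(function (value) {
    var text = JSON.stringify(String(value == null ? '' : value));
    return text.replace(/</g, '\\u003c'); // keeps "</script>" out of the page
  }).join(', ');
}

// Renders the ga.js e-commerce calls for one order (hypothetical shape).
function renderEcommerceCalls(order) {
  var lines = ['pageTracker._addTrans(' + jsArgs([
    order.id, order.affiliation, order.total, order.tax, order.shipping,
    order.city, order.state, order.country
  ]) + ');'];
  order.items.forEach(function (item) {
    lines.push('pageTracker._addItem(' + jsArgs([
      order.id, item.sku, item.name, item.category, item.price, item.quantity
    ]) + ');');
  });
  lines.push('pageTracker._trackTrans();');
  return lines.join('\n');
}

var sampleOrder = {
  id: '1234', affiliation: 'Main Store', total: '11.99', tax: '1.29',
  shipping: '5.00', city: '', state: '', country: '',
  items: [{ sku: 'DD44', name: 'T-Shirt', category: 'Green Medium', price: '11.99', quantity: 1 }]
};
console.log(renderEcommerceCalls(sampleOrder));
```

Note that every item repeats the transaction ID; that shared ID is what ties the item rows to the transaction.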

A few notes about the data. First, the geo-location data is no longer used by Google Analytics. The new version of GA tries to identify where the buyer is located using an IP address lookup.

Also, you should avoid using any non-alphanumeric characters in the data, especially in the numeric fields. Do not add a currency identifier (i.e. dollar sign) in the total, tax or shipping fields. This can cause problems with the data.
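A small cleanup step before the values reach the tracking code can enforce this. The sketch below assumes "." is the decimal separator and "," a thousands separator; the function name is my own.

```javascript
// Reduces a formatted amount such as "$1,299.50" to a plain decimal string.
// Assumes "." is the decimal separator and "," a thousands separator.
function cleanAmount(raw) {
  var cleaned = String(raw).replace(/[^0-9.\-]/g, '');
  var value = parseFloat(cleaned);
  if (isNaN(value)) throw new Error('Not a numeric amount: ' + raw);
  return value.toFixed(2);
}

console.log(cleanAmount('$1,299.50')); // 1299.50
console.log(cleanAmount(' 5 '));       // 5.00
```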

Like this post? Check out the rest of the series:

Google Analytics E-Commerce Tracking Pt. 2: Installation & Setup
Google Analytics E-Commerce Tracking Pt. 3: Why EVERYONE Should Use It
Google Analytics E-Commerce Tracking Pt. 4: Tracking Lead Gen Forms

2008 Google Analytics Resolutions

Happy new year! I can’t believe 2007 is over. Continuing with a tradition I began last year, I give you my 2008 GA resolutions.

Before I get into the list, I want to thank everyone who reads and contributes to Analytics Talk. 2007 was an incredible year for me, and I really owe a lot to you guys. Thank you for reading, posting questions and helping me learn so much.

1. I will migrate to the new GA.JS tracking code.

Google announced a new version of the tracking code, ga.js, in October 2007 and launched the new code in December. After some minor launch problems things seem to be running smoothly. While you don’t need to migrate to ga.js, you should start to think about it because Google will no longer add features to urchin.js. In my opinion, you should tackle this problem sooner rather than later.

2. I will contemplate Event Tracking and how I can use it.

The reason Google introduced a new version of the tracking code was to enable a powerful new feature called Event Tracking. While most folks might think of event tracking as a ‘web 2.0’ tracking tool geared towards video players and Ajax, it’s really a flexible framework for data collection. I was skeptical at first, but now I’m a convert. All of us can take advantage of this new feature.

I’ll be writing more about Event Tracking and its uses when Google pushes the feature to everyone. In the meantime, check out this series of posts to learn more:

Event Tracking Pt. 1: Overview & Data Model
Event Tracking Pt. 2: Implementation
Event Tracking Pt. 3: Reporting & Analysis

3. I will get creative with profiles.

This is something I’ve been talking about for a while. Profiles are so much more than website data. They’re a collection of data and business rules. Last year, as part of my 2007 resolutions, I mentioned setting up test profiles as a way to ensure your configuration settings are correct.

For 2008 I suggest setting up profiles for all major marketing campaigns and mediums. Why? So you can segment reports that normally cannot be segmented. Check out Segmenting Visitor Loyalty Reports in GA for more information.

4. I will try some type of ‘advanced’ Google Analytics configuration.

Most of us have a fairly basic implementation of GA. We don’t need to do much more than add the tracking code to our site, set up goals, and configure on site search reporting.

Why not try something new this year? How about using an ‘advanced’ feature like custom segmentation, event tracking or even e-commerce tracking? All of these features can help you learn more about your visitors and what they do. That’s why we use these features and try these hacks: to gain insight and knowledge.

5. I will keep track of website changes and Google Analytics changes.

This is something that I wrote about a long time ago, but it’s still really important. It’s a good idea to keep track of your GA configuration changes so you can better understand the data. Any modifications, like a change to a goal, funnel or filter, should be recorded.

It’s also a good idea to keep track of website changes and online marketing changes. Knowing what’s going to happen with your online business helps drive analysis and you’ll be able to deliver data that will make people happy.

You don’t need anything elaborate; a simple Google Spreadsheet, like this one, will work just fine.

There you have it, a few ideas to spice up your 2008 Google Analytics plans. Got a better idea or think that I missed something? Leave a comment below. And happy new year!

My Regular Expression Tool Box

Love em or hate em, regular expressions are a part of Google Analytics. They provide a lot of flexibility, but at a price. Small mistakes can become magnified and result in poor data quality.

I know there’s a lot of information out there about regular expressions, but I wanted to simplify the topic. In my opinion, here are the most important things to know.

Key Concept: How GA Regular Expressions Work

Let’s start by talking about how regular expressions work in Google Analytics. In general, we apply a regular expression to a piece of data. If the expression matches ANY part of the data then the expression will return TRUE. If the expression returns TRUE then some action will occur.

It doesn’t matter where you use the reg ex. If it’s part of an exclude filter, and the expression matches the data, then the data will be excluded. If it’s part of an include filter then the data will be included. If it’s part of a report filter then the report will only contain info that matches the reg ex. You get the idea.

How Google Analytics Filters Work
[In this image think of the data as the square cube and the red work bench as the regular expression. If the cube is the same shape as the hole in the bench then an action happens; the cube falls through. Get it?]

It’s really important to understand this because it simplifies the expressions we need to create. Let’s say I want to identify all the keywords in a set of data that contain the term excel. Here’s the full list:

word
excel
ms excel
excel 2003
linux
microsoft excel
excel 2007
excel makes pretty graphs
google

Rather than create some fancy regular expression, I can simply use: excel. After the expression is applied to the data we’ll have the following sub-set:

excel
ms excel
excel 2003
microsoft excel
excel 2007
excel makes pretty graphs

This simplifies the creation of your expression because you only need to match part of the data that you’re looking for. With that in mind, let’s move on to some tips that cover the most common uses of regular expressions.
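The same partial-match behavior can be checked with a few lines of JavaScript. JavaScript’s RegExp stands in for GA’s matching here, which is an assumption, but for basic expressions like these the behavior is the same.

```javascript
var keywords = [
  'word', 'excel', 'ms excel', 'excel 2003', 'linux',
  'microsoft excel', 'excel 2007', 'excel makes pretty graphs', 'google'
];

// Keeps every keyword where the expression matches ANY part of the text.
function matching(list, pattern) {
  var re = new RegExp(pattern);
  return list.filter(function (item) { return re.test(item); });
}

console.log(matching(keywords, 'excel'));
// [ 'excel', 'ms excel', 'excel 2003', 'microsoft excel', 'excel 2007',
//   'excel makes pretty graphs' ]
```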

Tip #1: Use Anchors

Anchors are a way to specify if a regular expression should match the beginning of the data or the end of the data. Remember, reg ex works by matching ANY PART of a piece of data. Sometimes we’re looking for data that starts or ends a particular way and that’s why we need anchors. Let’s go back to the excel example.

word
excel
ms excel
excel 2003
linux
microsoft excel
excel 2007
excel makes pretty graphs
google

Suppose I only want to see the items that END with the word excel. Well, if I use the regular expression excel, I’m going to get all the items that contain the word excel no matter where it appears.

I need to create a reg ex that means, “ends with.” That’s done by placing a dollar sign, $, at the end of my reg ex. So the expression to find all of the keywords that END with excel would be: excel$.

It would match the following items from our list:

excel
ms excel
microsoft excel

To find all of the keywords that START with excel use a caret, ^, at the beginning of the regular expression, like this: ^excel. It would match the following items from the list:

excel
excel 2003
excel 2007
excel makes pretty graphs

Now, let’s say I want just the keyword excel. Here’s how that expression would look: ^excel$.
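Here are the three anchored expressions run against the keyword list, again using JavaScript’s RegExp as a stand-in for GA’s matching.

```javascript
var keywordList = [
  'word', 'excel', 'ms excel', 'excel 2003', 'linux',
  'microsoft excel', 'excel 2007', 'excel makes pretty graphs', 'google'
];

// Returns the keywords matched by a regular expression pattern.
function keywordsMatching(pattern) {
  var re = new RegExp(pattern);
  return keywordList.filter(function (item) { return re.test(item); });
}

console.log(keywordsMatching('excel$'));  // ends with excel
// [ 'excel', 'ms excel', 'microsoft excel' ]
console.log(keywordsMatching('^excel'));  // starts with excel
// [ 'excel', 'excel 2003', 'excel 2007', 'excel makes pretty graphs' ]
console.log(keywordsMatching('^excel$')); // exactly excel
// [ 'excel' ]
```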

Anchors, pretty handy.

Tip #2: Find This OR That

Many times in an analysis we’ll want to find multiple items from a set of data. For example, let’s say I want to find all the keywords that contain the name of an MS Office product. The complete list of keywords is:

word 2007
microsoft excel
outlook express
powerpoint
windows 95
mac OSX
linux
google rocks

Again, I’m only interested in the MS Office products, so I need to create an expression that includes the names of all the products. I want to find word OR excel OR outlook OR powerpoint. The pipe character, |, is used to represent OR logic. The following expression will return true if any of the items occur in the data:

word|excel|outlook|powerpoint

And here are the results:

word 2007
microsoft excel
outlook express
powerpoint
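The OR expression can be verified the same way, with JavaScript’s RegExp standing in for GA’s matching.

```javascript
var searchTerms = [
  'word 2007', 'microsoft excel', 'outlook express', 'powerpoint',
  'windows 95', 'mac OSX', 'linux', 'google rocks'
];

// The pipe means OR: a term is kept if any one alternative matches.
var officeProducts = new RegExp('word|excel|outlook|powerpoint');
var officeTerms = searchTerms.filter(function (term) {
  return officeProducts.test(term);
});

console.log(officeTerms);
// [ 'word 2007', 'microsoft excel', 'outlook express', 'powerpoint' ]
```

Keep the partial-match rule in mind: without anchors, the alternative word would also match a keyword such as password.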

Tip #3: If in Doubt, Escape it Out!

The dangerous thing about regular expressions is that we often don’t know what we don’t know. There are a lot of characters that have special meaning in reg ex. The plus sign, the question mark and the period are just a few. Inadvertently using a special character in an expression can lead to big trouble. There is an easy way to protect yourself: escaping.

Escaping a character means that GA will interpret the character as a LITERAL character and not as a regular expression character. To escape any character place a backslash in front of the character. Here’s the great part. It doesn’t matter if you escape a non-special punctuation character. To me, escaping a character is like using a safety net. If you’re unsure if a particular punctuation character is a special character, escape it. It can’t hurt your expression. The one caveat: don’t escape letters or digits, because in many regular expression engines a backslash followed by a letter or digit (such as \d or \s) has its own special meaning.

Time for an example. Let’s say we want to create a goal based on the following URL:

index.php?id=34

I need to turn the above into a regular expression. The question mark and period are special characters so they need to be escaped. But I’m not sure about the equal sign. I’d better escape it just to be safe. So here’s how the resulting reg ex would look: index\.php\?id\=34. By the way, the equal sign is not a special character.
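If you build expressions in code, a small helper can do the escaping for you. This sketch uses JavaScript’s RegExp, and the helper only escapes the characters that are special there, so the equal sign is left alone. It also shows what goes wrong when the URL is used unescaped.

```javascript
// Escapes characters that have special meaning in a regular expression.
function escapeRegex(literal) {
  return String(literal).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var url = 'index.php?id=34';
var escapedPattern = escapeRegex(url);
console.log(escapedPattern); // index\.php\?id=34

var unescaped = new RegExp(url);
var escaped = new RegExp(escapedPattern);
console.log(unescaped.test(url));             // false: "?" makes the second "p" optional
console.log(escaped.test(url));               // true
console.log(escaped.test('indexXphp?id=34')); // false: the period is now literal
```

The unescaped version fails on the very URL it was copied from, which is exactly the kind of silent mistake that ruins a goal or filter.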

So there you have it. My two cents on regular expressions. These tips just scratch the surface of what you can do with reg ex. If you really want to learn about reg ex check out my friend Robbin’s series on the subject.