This is part two of a two-part series on advanced content tracking. This post covers the reporting and analysis of how people interact with content.
As I mentioned in part one, this technique and concept was born from collaboration. There are a number of people that need to be recognized for contributing:
Nick Mihailovski – Developer advocate at Google (and the guy who sits across from me)
Thomas Baekdal – Smart guy and publisher of www.baekdal.com (you should subscribe)
Avinash Kaushik – If you don’t know Avinash…
Joost de Valk – Creator of the Google Analytics for WordPress plugin (you should use it)
Eivind Savio – Blogger and GA consultant (read his stuff)
Let’s look at some data. It’s all from this blog.
This tracking technique uses event tracking to track how people scroll through pages on a site. So let’s start with the event tracking reports.
Here’s the Content > Events > Top Events Report.
All the events are bundled in the Reading category. You’re probably wondering why the Value is so high. Remember, the value is the number of seconds between certain actions, which is why it’s so large. More on this later.
Click on Reading and you can see all of the actions that we created within this category (ArticleLoaded, StartReading, ContentBottom, and PageBottom).
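As a refresher, the event structure behind these actions looks something like this. This is a minimal ga.js sketch, not the exact code from part one; the category and action names match the report above, but the `trackReading` helper is mine:

```javascript
// Async ga.js queue (assumes the standard GA snippet is on the page)
var _gaq = _gaq || [];

// Queue a Reading event: category 'Reading', one of the four action names,
// an empty label, and elapsed seconds as the event value
function trackReading(action, seconds) {
  _gaq.push(['_trackEvent', 'Reading', action, '', seconds]);
}

// e.g. trackReading('ArticleLoaded', 0);
//      trackReading('StartReading', 12);
```

The real code also wires these calls to scroll handlers, but the event shape is what matters for reading the reports.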
Let’s take a look at this data and translate it. Of the articles that load, 82% of visitors actually start to read. I think that’s pretty good. I suspect the 18% that never start reading are visits that open the article in a new tab/window and then time out: they go away for 30+ minutes, which breaks the tracking.
Of those that start reading the article, 63% make it to the bottom of the content. That seems pretty good. I don’t have any other reading benchmarks, so I’m content that more than half make it to the bottom of the article. But this is something that I’m going to trend over time. I’m also going to do some segmentation on this in a minute.
Of those that actually start reading an article, 18% make it to the bottom of the page.
Believe it or not, 18% is very high. On average the number of visits that make it to the bottom of an article is between 3% and 5%. I’m basing this observation on 5 days of data across a couple of sites.
UGH! To me that’s terrible! I thought more people read the comments. I guess I was wrong. Comments don’t seem to interest too many people.
But remember, this is top-line data. We need to get into some segments to understand what content drives these metrics.
Page Level Reading Metrics
Staying with the Top Events report, we can add a secondary dimension of Page to see which articles are loaded most often, which are read and which ones are finished.
NOTE: I’m going to filter the results below to include a single article; it will make the report easier to read. You can use an advanced filter to focus on only the data that you’re interested in.
Here you can see an article, how many times it was loaded, how many times people started reading it, how many times people got to the bottom of the content and how many times people made it to the bottom of the page.
This is pretty cool.
Comparing this data to the average I can see that it basically follows the same trend.
- About 82% that load the article start reading (daily avg = 82%)
- About 55% make it to the bottom of the content (daily avg = 63%)
- About 14% make it to the bottom of the page (daily avg = 18%)
I suggest looking at the article and checking things like how many comments the article has in order to further understand the data.
Let’s take a moment here and talk about event value. The event value measures time in seconds. These time measurements represent the following time intervals:
- Time between page loading and start reading (i.e. scrolling)
- Time between start reading and the bottom of the article
- Time between start reading and the bottom of the page
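One way to produce those three intervals (a sketch, not the exact code from part one): keep two timestamps, and reset the baseline when reading starts.

```javascript
// loadTime is captured when the page loads; startReadingTime is set
// when the visitor first scrolls
var loadTime = new Date().getTime();
var startReadingTime = null;

// Elapsed whole seconds since a given timestamp
function secondsSince(ts) {
  return Math.round((new Date().getTime() - ts) / 1000);
}

function onStartReading() {
  startReadingTime = new Date().getTime();
  // Interval 1: time between page loading and start of reading
  return secondsSince(loadTime);
}

function onContentBottom() {
  // Interval 2: time between start of reading and the bottom of the article
  return secondsSince(startReadingTime);
}

function onPageBottom() {
  // Interval 3: time between start of reading and the bottom of the page
  return secondsSince(startReadingTime);
}
```

Each returned value becomes the event value for the corresponding action, which is why the averages in the report read as time measurements.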
The first thing that jumps out is the amount of time before scrolling. It’s about 5 minutes! (The time is in seconds, so I just divided by 60.) After talking with a few people, we believe this lag is caused by people opening articles in tabs and reading them later.
Once they start reading it takes about 6.5 minutes to get to the bottom of this article and 9.5 minutes to get to the bottom of the page. Again, this depends on a number of things, like comments, article complexity, etc. But you can compare this to the site average for context.
Or, better yet, create an average for each content category, so you’re comparing the time it takes to read a technical article to the average for other technical articles.
Be aware that these metrics can easily be skewed by outliers. A few people opening tabs and then walking away can really throw off the numbers.
Ok, more reports.
Let’s check out the traffic sources report. Remember, I created goals out of these events. Now you can look at the Goals tab and instantly see how much traffic made it to the end of an article or page:
Now you can really measure the value of a traffic source based on how many people actually read your content not just time on site/page or pageviews.
NOTE: GA only counts one conversion per visit. So as soon as someone reaches the bottom of an article or page the goal will be counted. Total events are the raw count and will be higher than the total number of goals. Unique events are the number of visits that included a specific event and should match the goal counts.
Don’t Forget Advanced Segmentation
This is just the beginning. You can do some serious analysis of the data. If you’re a publisher try segmenting by things like:
- Publication date
- Content category
Obviously you need these data points to create these segments. I have this data in custom variables (thanks to the Google Analytics plugin for WordPress) so I can do things like this:
The resulting data is useful, but working with it in the interface can be a bit challenging. Here, take a look.
This would be easier in Excel.
Advanced Data Analysis
While you can get a lot of insights from the reports, this data begs for Excel. It’s actually a lot easier to export the data and filter it in Excel, especially when you’re looking at the actions with a secondary dimension of Page.
In a spreadsheet, I want to see the Dimensions of Event Action and Page URL. I want to see the metrics Total Events, Unique Events, Total Event Value (all of our time measurements) and Avg. Event Value (the average time measurement).
Remember, the event actions are:
- How many times the page loaded
- How many times people started to scroll
- How many times people reached the bottom of the content
- How many times people reached the bottom of the page
The first thing you’ll want to do is convert the time measurements from seconds to minutes:seconds. Then start to build some averages for your site. Build benchmarks based on category, author, and publication date.
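If you’d rather script that conversion than do it in a spreadsheet, it’s a one-liner. Here’s a sketch (in Excel itself, a formula along the lines of =TEXT(A2/86400,"m:ss") does the same thing):

```javascript
// Convert an event value in seconds to a "m:ss" string
function formatSeconds(totalSeconds) {
  var minutes = Math.floor(totalSeconds / 60);
  var seconds = totalSeconds % 60;
  return minutes + ':' + (seconds < 10 ? '0' : '') + seconds;
}

// formatSeconds(390) -> "6:30"
```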
You may want to include even more Dimensions, like category, author, etc. We’re getting into some pretty serious data here :)
If you’re really nerdy and want to pull the data using some code, here’s the API query that I used. You’ll just need to change the date range.
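The query itself was linked rather than inlined, but a Core Reporting API (v3) request using the dimensions and metrics listed above would look roughly like this — ga:XXXXXX is a placeholder profile ID and the dates are arbitrary:

```javascript
// Sketch of a Core Reporting API query for the Reading events.
// Replace ga:XXXXXX with your own profile ID and set your date range.
var query = 'https://www.googleapis.com/analytics/v3/data/ga' +
  '?ids=ga:XXXXXX' +
  '&dimensions=ga:eventAction,ga:pagePath' +
  '&metrics=ga:totalEvents,ga:uniqueEvents,ga:eventValue,ga:avgEventValue' +
  '&filters=ga:eventCategory==Reading' +
  '&start-date=2012-01-01&end-date=2012-01-31';
```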
Other Things to Note
In addition to the above data, there are a few more things to be aware of. First, your bounce rate is going to go down. Way down. The event that tracks scrolling counts as an interaction. So if the visitor does not view any other pages they will NOT be counted as a bounce.
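As an aside, this behavior is controllable: ga.js accepts an opt_noninteraction flag as the fifth _trackEvent argument, and an event fired with it set to true won’t affect bounce rate. For this technique you probably want the default (interaction) behavior, but here’s what opting out looks like:

```javascript
var _gaq = _gaq || [];

// Passing true as the last argument marks the event as non-interaction,
// so a single-page visit that fires it still counts as a bounce
_gaq.push(['_trackEvent', 'Reading', 'StartReading', '', 0, true]);
```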
Second, your time on site and page will go up. I’m not going to get into all of the details here but for visits that only include one pageview, Google will use the last engagement HIT of the session to calculate time. So this will be the time the visitor gets to the bottom of the content or the page.
Your time measurements are going to be MUCH more accurate.
Before and After
To wrap up some analysis I thought it would be interesting to take a look at a few pieces of content and their perceived performance before this tracking technique.
Here’s a piece of content with data compared week over week. This week with the new content tracking vs. last week. Look at the HUGE difference in time on page and bounce rate.
An obvious improvement, right? The content didn’t change; I’m just getting much more accurate numbers now that the code has been changed.
What About Custom Variables?
If you read part one then you’re probably wondering about the custom variables. The code actually sets a custom variable when someone reaches the bottom of the page in more than 60 seconds. If they get to the bottom in less than 60 seconds then they are classified as a scanner.
After seeing how few people actually make it to the bottom of the HTML page I think this code should be moved to the bottom of the Content section. So I moved the code. I also changed the code to be a page-level custom var.
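Based on that description, the classification logic looks roughly like this. A sketch only — the slot number and variable name are my assumptions, not the actual code; scope 3 is page-level in ga.js:

```javascript
var _gaq = _gaq || [];

// Classify the visitor when they reach the bottom of the content.
// Slot 1 and the name 'ReaderType' are assumptions; scope 3 = page-level.
function classifyVisitor(secondsToBottom) {
  var type = secondsToBottom < 60 ? 'Scanner' : 'Reader';
  _gaq.push(['_setCustomVar', 1, 'ReaderType', type, 3]);
  return type;
}
```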
Custom vars will give you much of the same data that you get from events. Analysis for custom vars will have to wait.
That’s the end… for now.
I hope you find this interesting. I certainly do. I think it opens the door to a different, more accurate method of tracking content.
Again, thank you to all the people that have contributed.