Advanced Content Tracking with Google Analytics: Part 2

This is part two of a two part series on advanced content tracking. This post is about the reporting and analysis of how people interact with content.

As I mentioned in part one, this technique and concept was born from collaboration. There are a number of people that need to be recognized for contributing:

Nick Mihailovski – Developer advocate at Google (and the guy who sits across from me)
Thomas Baekdal – Smart guy and publisher of www.baekdal.com (you should subscribe)
Avinash Kaushik – If you don’t know Avinash…
Joost de Valk – Creator of the Google Analytics for WordPress plugin (you should use it)
Eivind Savio – Blogger and GA consultant (read his stuff)

Let’s look at some data. It’s all from this blog.

The Reports

This tracking technique uses event tracking to track how people scroll through pages on a site. So let’s start with the event tracking reports.

Here’s the Content > Events > Top Events Report.

Reading actions in Google Analytics

All the events are bundled in the Reading category. You’re probably wondering why the Value is so high. Remember, the value is the number of seconds between certain actions, so it adds up quickly across many events. More on this later.

Click on Reading and you can see all of the actions that we created within this category (ArticleLoaded, StartReading, ContentBottom and PageBottom).

Reading actions in Google Analytics

Drill into the Reading event to view the specific reading actions.

Let’s take a look at this data and translate it. Of the articles that load, 82% are actually started, meaning the visitor begins to scroll. I think that’s pretty good. My guess is that the other 18% are visits that open the article in a new tab/window and then time out: they go away for 30+ minutes and break the tracking.

Of those that start reading the article, 63% make it to the bottom of the content. That seems pretty good. I don’t have any other reading benchmarks, so I’m content that more than half make it to the bottom of the article. But this is something that I’m going to trend over time. I’m also going to do some segmentation on this in a minute.

Of those that start reading an article, 18% make it to the bottom of the page.

Believe it or not, 18% is very high. On average the number of visits that make it to the bottom of an article is between 3% and 5%. I’m basing this observation on 5 days of data across a couple of sites.

UGH! To me that’s terrible! I thought more people read the comments. I guess I was wrong. Comments don’t seem to interest too many people.

But remember, this is top-line data. We need to get into some segments to understand what content drives these metrics.
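The percentages above are plain ratios of event counts, so they are easy to recompute outside the interface. Here is a minimal Python sketch. The counts are hypothetical numbers chosen only to reproduce the rates quoted above, and the function name is mine, not something GA provides.

```python
def reading_funnel(counts):
    """Turn raw counts per event action into the three rates discussed above."""
    loaded = counts["ArticleLoaded"]
    started = counts["StartReading"]
    if loaded == 0 or started == 0:
        raise ValueError("need at least one load and one start to compute rates")
    return {
        # share of loaded articles where the visitor began to scroll
        "start_rate": started / loaded,
        # share of started articles where the visitor reached the end of the content
        "content_bottom_rate": counts["ContentBottom"] / started,
        # share of started articles where the visitor reached the end of the page
        "page_bottom_rate": counts["PageBottom"] / started,
    }


# Hypothetical counts, picked to match the 82% / 63% / 18% quoted in the post.
example_counts = {
    "ArticleLoaded": 1000,
    "StartReading": 820,
    "ContentBottom": 517,
    "PageBottom": 148,
}

for name, rate in reading_funnel(example_counts).items():
    print(f"{name}: {rate:.0%}")
```

Note that the last two rates use StartReading as the denominator, not ArticleLoaded, which matches how the numbers are described above.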

Page Level Reading Metrics

Staying with the Top Events report, we can add a secondary dimension of Page to see which articles are loaded most often, which are read and which ones are finished.

NOTE: I’m going to filter the results below to include a single article, which makes the report easier to read. You can use an advanced filter to focus on only the data that you’re interested in.

Page level interaction metrics in Google Analytics.

View engagement actions for individual content in Google Analytics.

Here you can see an article, how many times it was loaded, how many times people started reading it, how many times people got to the bottom of the content and how many times people made it to the bottom of the page.

Time to action measurements in Google Analytics

Average amount of time it takes to interact with content.

This is pretty cool.

Comparing this data to the average I can see that it basically follows the same trend.

  • About 82% that load the article start reading (daily avg = 82%)
  • About 55% make it to the bottom of the content (daily avg = 63%)
  • About 14% make it to the bottom of the page (daily avg = 18%)

I suggest looking at the article and checking things like how many comments the article has in order to further understand the data.

Let’s take a moment here and talk about event value. The event value measures time in seconds. These time measurements represent the following time intervals:

  • Time between page loading and start reading (i.e. scrolling)
  • Time between start reading and the bottom of the article
  • Time between start reading and the bottom of the page

The first thing that jumps out is the amount of time before scrolling. It’s about 5 minutes! (Time is in seconds, so I just divided by 60.) After talking with a few people, we believe that this time lag is caused by people opening articles in tabs and reading them later.

Once they start reading it takes about 6.5 minutes to get to the bottom of this article and 9.5 minutes to get to the bottom of the page. Again, this depends on a number of things, like comments, article complexity, etc. But you can compare this to the site average for context.

Or, better yet, create an average for the content category, so you’re comparing the time it takes to read a technical article to the average time it takes to read a technical article.

Be aware, these metrics can easily be skewed due to outliers.

If a few people open tabs and then walk away the results can really throw off these metrics.
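One way to see how much outliers matter is to put a median, or a mean with a cutoff, next to the plain mean. This is a rough sketch that assumes you have a list of individual time measurements in seconds, which the standard reports do not give you directly (they only show totals and averages). The sample values and the 30-minute cutoff are illustrative assumptions.

```python
from statistics import mean, median


def summarize_times(seconds, cutoff=1800):
    """Compare the plain mean with two outlier-resistant summaries.

    cutoff is an assumed limit in seconds (30 minutes here); anything
    above it is treated as an abandoned tab and left out of mean_capped.
    """
    kept = [s for s in seconds if s <= cutoff]
    return {
        "mean_all": mean(seconds),
        "median_all": median(seconds),
        "mean_capped": mean(kept) if kept else None,
    }


# Hypothetical sample: five normal reads and one tab left open for two hours.
sample = [40, 55, 60, 75, 90, 7200]
print(summarize_times(sample))
```

In this made-up sample a single two-hour measurement pulls the mean above 20 minutes, while the median and the capped mean both stay close to one minute.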

Ok, more reports.

Let’s check out the traffic sources report. Remember, I created goals out of these events. Now you can look at the Goals tab and instantly see how much traffic made it to the end of an article or page:

Google Analytics Traffic Sources Report

Measure traffic sources based on actual reading metrics.

Now you can really measure the value of a traffic source based on how many people actually read your content not just time on site/page or pageviews.

NOTE: GA only counts one conversion per goal per visit. So as soon as someone reaches the bottom of an article or page, the goal will be counted. Total events are the raw count and will be higher than the total number of goals. Unique events are the number of visits that included a specific event and should match the goal counts.
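To make the difference between total events and unique events concrete, here is a small Python sketch. It assumes a simplified hit log of (visit id, event action) pairs; the visit ids and the log itself are hypothetical, and real GA processing is more involved than this.

```python
def count_events(hits, action):
    """Return (total_events, unique_events) for one event action.

    hits is a list of (visit_id, event_action) tuples.
    total counts every occurrence; unique counts each visit at most once,
    which is also how a goal based on this event would be counted.
    """
    matching_visits = [visit_id for visit_id, hit_action in hits if hit_action == action]
    return len(matching_visits), len(set(matching_visits))


# Hypothetical log: visit v1 reaches the bottom of the content on two articles.
hit_log = [
    ("v1", "ContentBottom"),
    ("v1", "ContentBottom"),
    ("v2", "ContentBottom"),
    ("v2", "PageBottom"),
    ("v3", "StartReading"),
]

total, unique = count_events(hit_log, "ContentBottom")
print(total, unique)
```

Here ContentBottom fires three times but in only two visits, so a goal built on it would record two conversions.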

Don’t Forget Advanced Segmentation

This is just the beginning. You can do some serious analysis of the data. If you’re a publisher try segmenting by things like:

  • Author
  • Publication date
  • Content category

Obviously you need these data points to create these segments. I have this data in custom variables (thanks to the Google Analytics plugin for WordPress) so I can do things like this:

Creating a Custom Advanced Segment in Google Analytics.

You can segment your reading metrics using a Custom Advanced Segment.

The resulting data is useful, but working with it in the interface can be a bit challenging. Here, take a look.

Segmented data in Google Analytics.

Get some deeper insights by segmenting reading data based on category.

This would be easier in Excel.

Advanced Data Analysis

While you can get a lot of insights from the reports, this data begs for Excel. It’s actually a lot easier to export the data and filter it in Excel especially when you’re looking at the actions with a secondary dimension of Page.

In a spreadsheet, I want to see the Dimensions of Event Action and Page URL. I want to see the metrics Total Events, Unique Events, Total Event Value (all of our time measurements) and Avg. Event Value (which is the average time measurements).

Remember, the event actions tell you:

  • How many times the page loaded
  • How many times people started to scroll
  • How many reached the bottom of the content
  • How many reached the bottom of the page

The first thing you’ll want to do is convert the time measurements from seconds to minutes:seconds. Then start to build some averages for your site. Build benchmarks based on category, author and publication date.

You may want to include even more Dimensions, like category, author, etc. We’re getting into some pretty serious data here :)
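If you would rather script this step than do it in a spreadsheet, the conversion and the benchmarks can be sketched in a few lines of Python. This assumes exported rows of (category, event action, total events, total event value) and that the average is total event value divided by total events. The row layout, the category names and the numbers are all hypothetical.

```python
from collections import defaultdict


def to_min_sec(seconds):
    """Format a number of seconds as minutes:seconds, e.g. 390 -> '6:30'."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"


def benchmarks(rows):
    """Average time per (category, event action), formatted as minutes:seconds."""
    totals = defaultdict(lambda: [0, 0])  # key -> [total events, total seconds]
    for category, action, total_events, total_value in rows:
        totals[(category, action)][0] += total_events
        totals[(category, action)][1] += total_value
    return {
        key: to_min_sec(value / events)
        for key, (events, value) in totals.items()
        if events  # skip keys with no events to avoid dividing by zero
    }


# Hypothetical export: two technical articles and one news article.
export_rows = [
    ("Technical", "ContentBottom", 10, 3900),
    ("Technical", "ContentBottom", 10, 4500),
    ("News", "ContentBottom", 20, 2400),
]

print(benchmarks(export_rows))
```

Summing events and values before dividing weights each article by how often it was read, rather than averaging the per-article averages.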

You can pull data using any GA Excel plugin, like NextAnalytics or GA Data Grabber.

Google Analytics Tracking of Content in Excel

Working with content engagement data in Excel is really powerful.

If you’re really nerdy and want to pull the data using some code, here’s the API query that I used. You’ll just need to change the date range.

http://code.google.com/apis/analytics/docs/gdata/gdataExplorer.html?dimensions=ga%253ApagePath%252Cga%253AeventAction&metrics=ga%253AtotalEvents%252Cga%253AuniqueEvents%252Cga%253AeventValue&filters=ga%253AeventCategory%253D%253DReading&start-date=2012-02-20&end-date=2012-02-20&max-results=10000
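The query parameters in that URL are percent-encoded twice, which makes them hard to read. Here is a short Python sketch, using only the standard library, that decodes them so you can see which dimensions, metrics and filters are requested. The function name is my own.

```python
from urllib.parse import parse_qs, unquote, urlsplit

QUERY_URL = (
    "http://code.google.com/apis/analytics/docs/gdata/gdataExplorer.html"
    "?dimensions=ga%253ApagePath%252Cga%253AeventAction"
    "&metrics=ga%253AtotalEvents%252Cga%253AuniqueEvents%252Cga%253AeventValue"
    "&filters=ga%253AeventCategory%253D%253DReading"
    "&start-date=2012-02-20&end-date=2012-02-20&max-results=10000"
)


def decode_query(url):
    """Return the query parameters with both layers of percent-encoding removed."""
    # parse_qs removes the first layer; unquote removes the second.
    params = parse_qs(urlsplit(url).query)
    return {name: unquote(values[0]) for name, values in params.items()}


for name, value in decode_query(QUERY_URL).items():
    print(f"{name} = {value}")
```

Decoded, the query asks for the page path and event action dimensions, the total events, unique events and event value metrics, filtered to the Reading event category.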

Other Things to Note

In addition to the above data, there are a few more things to be aware of. First, your bounce rate is going to go down. Way down. The event that tracks scrolling counts as an interaction. So if the visitor does not view any other pages they will NOT be counted as a bounce.

Second, your time on site and page will go up. I’m not going to get into all of the details here, but for visits that only include one pageview, Google will use the last engagement hit of the session to calculate time. So this will be the time the visitor gets to the bottom of the content or the page.

If you use this technique expect your bounce rate to drop and your time on site to rise.

Your time measurements are going to be MUCH more accurate.
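Here is a deliberately simplified Python model of why both metrics move. It assumes a visit is a list of hits, each with a timestamp and a flag saying whether it counts as an interaction, and it treats a visit with a single interaction hit as a bounce. The data structure and the timings are hypothetical, and the real GA calculations have more cases than this.

```python
def visit_metrics(hits):
    """Return (is_bounce, time_on_site_seconds) for one visit.

    hits is a list of (seconds_since_first_hit, hit_type, counts_as_interaction).
    Only hits that count as interactions affect either metric in this model.
    """
    interaction_times = [t for t, _hit_type, interactive in hits if interactive]
    is_bounce = len(interaction_times) <= 1
    if interaction_times:
        time_on_site = max(interaction_times) - min(interaction_times)
    else:
        time_on_site = 0
    return is_bounce, time_on_site


# One-page visit before the tracking change: a single pageview.
before = [(0, "pageview", True)]

# Same visit with scroll events: start reading at 5:00, content bottom at 11:30.
after = [(0, "pageview", True), (300, "event", True), (690, "event", True)]

# Same events flagged as non-interaction, so they do not affect either metric.
muted = [(0, "pageview", True), (300, "event", False), (690, "event", False)]

print(visit_metrics(before), visit_metrics(after), visit_metrics(muted))
```

The third case shows what happens if the events are sent as non-interaction events: the visit still counts as a bounce and the time stays at zero, which can be useful if you need your historical trends to stay comparable.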

Before and After

To wrap up some analysis I thought it would be interesting to take a look at a few pieces of content and their perceived performance before this tracking technique.

Here’s a piece of content with data compared week over week. This week with the new content tracking vs. last week. Look at the HUGE difference in time on page and bounce rate.

Comparing content before and after the code change.

There will be a huge change in your metrics after implementing this technique.

An obvious improvement, right? I’m getting much better numbers now that the code has been changed.

What About Custom Variables?

If you read part one then you’re probably wondering about the custom variables. The code actually sets a custom variable when someone reaches the bottom of the page in more than 60 seconds. If they get to the bottom in less than 60 seconds then they are a scanner.
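The rule itself is a single comparison. Here it is sketched in Python rather than in the tracking script. The "Reader" label and the handling of exactly 60 seconds are my assumptions, since the text above only names the scanner case.

```python
READER_THRESHOLD_SECONDS = 60  # threshold described above


def classify_visitor(seconds_to_bottom, threshold=READER_THRESHOLD_SECONDS):
    """Label a visit by how long it took to reach the bottom.

    More than the threshold counts as a reader; anything else is a scanner.
    Treating exactly 60 seconds as a scanner is an assumption.
    """
    return "Reader" if seconds_to_bottom > threshold else "Scanner"


print(classify_visitor(95))   # Reader
print(classify_visitor(20))   # Scanner
```

The threshold is a parameter because the right value depends on your content; a site with shorter articles might reasonably pick a lower number.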

After seeing how few people actually make it to the bottom of the HTML page I think this code should be moved to the bottom of the Content section. So I moved the code. I also changed the code to be a page-level custom var.

Custom vars will give you much of the same data that you get from events. Analysis for custom vars will have to wait.

That’s the end… for now.

I hope you find this interesting. I certainly do. I think it opens the door to a different, more accurate, method to track content.

Again, thank you to all the people that have contributed.

    Comments

    1. says

      Thanks for the analysis Justin – it makes for very interesting reading, especially the information about how this affects average time on page (and therefore average time on site) and bounce rate. The traditional definition of bounce rate needs to be redefined when the data collection approach is changed in such a substantial way.

      Would Google ever consider including this type of reporting in GA, or is it simply a pipedream like so many other customised tracking scripts that GA enthusiasts have been implementing themselves over the years?

      • says

        @Brendan: Never say never. But with that said, we need to change the entire way that people think about content measurement before something like this will get any traction. Perhaps a really smart person like yourself can build a cool tool that uses the GA API to create better content reports!

    2. Dan says

      Nice article, Justin.

      Also, should you be in a situation where you want to implement this without affecting the time on site or bounce rate you can add the non-interaction value in the events. Comes in handy when folks are relying on trending and not necessarily accuracy.

    3. says

      Thanks for sharing part 2 also Justin.

      As I commented in Part 1 I’ve done things a little bit differently using your script.

      I’ve tracked Scrolling Allowed instead of Article Loaded
      I’ve tracked Content Length
      I’ve used 30 seconds as a classification of a Reader

      And as you wrote, this data begs for Excel, so I have done that too. For those who are interested, you’ll find my data in this picture. I have removed the worst “time outliers” since they skew my data too much.

    4. Victor Acquah says

      @Eivind: What is the difference between “allowed scrolling” and “scroll started” ? Is “allowed scrolling” basically the total number of page views? ( i.e same thing as “article loaded” – just a different name? ).

      Does your modified code allow you to track interactions with specific parts of a page ? e.g. read article, read comments, watched video, shared article e.t.c? I’d like to be able to get a much finer detail of info about specific on page events for each user visit. thx.

    5. says

      @Victor
      What is the difference between “allowed scrolling” and “scroll started” ? Is “allowed scrolling” basically the total number of page views? ( i.e same thing as “article loaded” – just a different name? ).

      “Allowed scrolling” is measuring if you have to scroll the article or not to read it based on the height of the article. If the article is so long that you have to scroll it, that will be tracked as “Allowed scrolling”. The difference between “Allowed scrolling” and “Pageviews” on my site is 2 %.

      Does your modified code allow you to track interactions with specific parts of a page ? e.g. read article, read comments, watched video, shared article e.t.c?

      No, my code doesn’t allow me to do that. But if you want to do that, I guess you can modify the Scroll Analytics for Jquery to do that.

    6. says

      @Eivind Not being a JS whiz myself, I’d love it if you posted a copy of your version of the code. The changes you mention seem like they are exactly what I’d like to be using to gather my information, but I’m not sure how to make them.

      thanks in advance.

    7. says

      Those are some awesome GA tools! I have been wondering how far people actually read through the pages I’m tracking. I’m glad that your statistics show what I’ve been assuming in writing content and page design. Thanks!

    8. says

      Good Stuff Justin. I just finished up tracking for a similar app that does close to this. I also used Custom Variables to track steps on a personalization engine. I found using the reports much easier than trying to use Funnel Visualization, Goal Flow (or Events flow)!

      Cheers,

      Jim

    9. says

      I was thinking about the time lag before scrolling and then thought that perhaps there should be time tracking onfocus of the article instead of just the whole page, or at least the whole page (window?). Usually, when we read an article, we have our pointer over it or at least within the browser window so that we can actually scroll. If we load an article in another tab, there’s no focus so the time tracking should not kick in. This would give you more accuracy about when the reader is actually ON the article doing something.

      Aside from that, I think this is awesome – especially for any blog where the whole Bounce rate had almost become irrelevant. You can’t tell when people just bounced or when they read the article and then left -

    10. Ammar Haider says

      @Justin
      Really great tool. I’m trying to implement it for clients we provide content for.

      Frustratingly though, I can’t get this to work in terms of specifying a div and getting the script to trigger an event when the end of the div is reached. The event is triggered as soon as the user moves the scroll bar.

      Have only tried this on the traditional ga.js GA code could this be the issue?

    11. says

      Hello Justin… Wonderful article… I have a two-part question. If I don’t want to use a goal for the PageBottom or ContentBottom events, could I create a segment which includes ActionEvent = PageBottom | ContentBottom? And the second part is: would such a segment mean that those sessions where a PageBottom event was triggered are in the bucket?

      Thanks

    12. James says

      Hi Justin,

      Are you aware of any services similar to http://www.tynt.com/ that help you to track when people copy and paste your content for use elsewhere – either on their own blogs or email / share with contacts etc?

      This would give you an idea of what content people are interested in.

      Thanks,

      James

      • says

        @James: Yes, I do know Tynt. And it is helpful to measure interest. While I have not used it, I can only imagine that there is far less data than from measuring people actually scrolling on-site. But it’s something that I should check out. Thanks for the tip!

    13. Arseniy says

      Great work! Thank you very much!

      I have one question: is it possible to count as a goal total number of events, and not just unique events?

      Thank you in advance!

    14. Jenn says

      So happy I stumbled upon this site. Great article and very easy to follow. I installed it today on our site. Just a quick question to make sure I’m doing this correctly. Is the only thing I had to change in the main code the name of my div tag where the content begins? I’m not seeing Reader under Top Events yet, but I’m not sure how long this takes to calculate.
