Advanced Content Tracking with Google Analytics: Part 1

This is part 1 of a two-part series on advanced content tracking. This post covers why you might want to use this technique and how to implement it. The next post will cover the reporting and analysis.

Do people actually read content?


The default content tracking in Google Analytics is fairly straightforward. Using the standard page tag you can get all sorts of information, like time on page, bounce rate and pageviews.

But sometimes this is not enough. For publishers and minor bloggers (like yours truly) these metrics can be sub-optimal.

I want more detailed information about each article. Do people read the comments or do they just read the post? Do they open a lot of posts in tabs?

What would be better is a way to measure more detailed information about how website visitors interact with each page.

So that’s what this post is all about: measuring how people interact with content using custom tracking.

Some Thanks

Before we begin: this blog post, technique and concept were born from collaboration. A number of people need to be recognized for contributing. You can read more about the genesis of this technique at the bottom of the post. Contributors include:

Nick Mihailovski – Developer advocate at Google (and the guy who sits across from me)
Thomas Baekdal – Smart guy and publisher of
Avinash Kaushik – If you don’t know Avinash…
Joost de Valk – Creator of the Google Analytics for WordPress plugin
Eivind Savio – Blogger and GA consultant

Now, on to the details!

Business Objective

As I mentioned above the objective here is to get a better understanding, on a page by page basis, of the content that visitors engage with. Using some objectives suggested by Thomas Baekdal here’s what we’re going to track:

  • How many people scroll
  • When a person starts to scroll
  • When a person reaches the end of an article (not the end of the page, but the end of the article or post area)
  • When a person reaches the bottom of the page (the bottom of the HTML)
  • Which website visitors are scanning my articles and which are reading my articles

Think about the value here! We will be able to get an accurate measure of which articles are actually read. We can even see which articles are so engaging that visitors continue through the comments to the bottom of the page. Very useful stuff.

Tracking Technique

All of the above can be tracked with Event Tracking. The concept is that we will fire events when certain actions happen. Specifically we’re going to fire events based on visitors scrolling down the page.

Critical to any event tracking implementation is the data model. We need to define the data we want to see in Google Analytics.

All of the reading activities will be grouped together into a category named Reading.

Within this category there will be four main actions:

  • Article load: Measure how many times the article loads in a browser. Basically another count of pageviews. This will provide context to the other events that we track.
  • Start Reading: Track when a visitor starts scrolling down the page. This will be triggered after a visitor scrolls 150 pixels down the page. This value can be customized. I’m also tracking how much time it takes to start scrolling.
  • Content Bottom: Track when a visitor reaches the end of the article content, along with how much time passed between the scroll start and reaching the bottom of the content.
  • Page Bottom: Track when the visitor reaches the bottom of the page and how long it took.
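To make the data model concrete, here is a minimal sketch of how the four actions could be expressed as _trackEvent commands. The buildReadingEvent helper and the READING_ACTIONS list are hypothetical names used only for illustration, and _gaq is stubbed as a plain array so the snippet runs outside a browser. The action strings match the ones used in the code later in this post.

```javascript
// Stand-in for the Google Analytics async queue (on a real page the GA snippet provides it).
var _gaq = [];

// The four actions in the "Reading" category.
var READING_ACTIONS = ['ArticleLoaded', 'StartReading', 'ContentBottom', 'PageBottom'];

// Hypothetical helper: builds a _trackEvent command for the Reading category.
// `seconds` is optional; when given, it is rounded and used as the event value.
function buildReadingEvent(action, seconds) {
    if (READING_ACTIONS.indexOf(action) === -1) {
        throw new Error('Unknown Reading action: ' + action);
    }
    var command = ['_trackEvent', 'Reading', action, ''];
    if (typeof seconds === 'number') {
        command.push(Math.round(seconds));
    }
    return command;
}

_gaq.push(buildReadingEvent('ArticleLoaded'));
_gaq.push(buildReadingEvent('StartReading', 4.2));
console.log(JSON.stringify(_gaq));
```

Keeping the category and action names in one place like this is just a convenience; it makes it harder to end up with two spellings of the same action in your reports.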

Another piece of critical information is the page URL and title. We need this to segment the data and see which articles are most engaging to people. Google Analytics will automatically track the page URL and title so there’s no need to add it to the event.

We’re also going to use a Custom Variable to place this visitor in a bucket. If it took them less than 60 seconds to get to the bottom of the page then I will assume they are just scanning. We’ll put them in the Scanners bucket.

But, if they took longer than 60 seconds to get to the bottom of the page then we’ll put them in the readers bucket.
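The bucketing rule is simple enough to sketch in a few lines. This is only an illustration under a few assumptions: classifyReader, buildReaderTypeVar and the SCOPE map are hypothetical names, the slot check assumes the standard five custom variable slots, and the scope numbers (1 = visitor, 2 = session, 3 = page) are the values _setCustomVar expects.

```javascript
// Threshold in seconds; 60 is the arbitrary starting value used in this post.
var READER_THRESHOLD_SECONDS = 60;

// Hypothetical helper: returns the bucket name for a time-to-bottom in seconds.
// An optional second argument overrides the default threshold.
function classifyReader(secondsToBottom, threshold) {
    var limit = (typeof threshold === 'number') ? threshold : READER_THRESHOLD_SECONDS;
    return secondsToBottom < limit ? 'Scanner' : 'Reader';
}

// _setCustomVar scope values: 1 = visitor, 2 = session, 3 = page.
var SCOPE = { VISITOR: 1, SESSION: 2, PAGE: 3 };

// Hypothetical helper: builds the _setCustomVar command for a given slot.
// Assumes the standard five slots (1 to 5).
function buildReaderTypeVar(slot, secondsToBottom, scope) {
    if (slot < 1 || slot > 5) {
        throw new Error('Custom variable slot must be between 1 and 5');
    }
    return ['_setCustomVar', slot, 'ReaderType', classifyReader(secondsToBottom), scope];
}

console.log(JSON.stringify(buildReaderTypeVar(5, 42, SCOPE.SESSION)));
console.log(JSON.stringify(buildReaderTypeVar(5, 95, SCOPE.SESSION)));
```

Pulling the threshold out into its own variable makes it easy to adjust once you have some real timing data.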

Finally, I can set these events up as goals. I’ll add one goal for those that make it to the bottom of the content and one goal for those that make it to the bottom of the page. This is an easy way to measure what percentage of visits complete these actions.

The Code

First, this code uses something called jQuery. It’s a JavaScript library that makes it easier to program complex tasks. Almost every website is running jQuery these days. But make sure your site includes the library.

Here’s the code, feel free to copy, tweak and share. Just remember all the people that contributed to it!

We start with some simple declarations. These control the flow of the script. But notice there are a couple of values you can change.

    // Debug flag
    var debugMode = true;

    // Default time delay before checking location
    var callBackTime = 100;

    // # px before tracking a reader
    var readerLocation = 150;

    // Set some flags for tracking & execution
    var timer = 0;
    var scroller = false;
    var endContent = false;
    var didComplete = false;

    // Set some time variables to calculate reading time
    var startTime = new Date();
    var beginning = startTime.getTime();
    var totalTime = 0;

You can change the callBackTime variable and the readerLocation variable. callBackTime is the time (in milliseconds) that the browser will wait after a scroll event before checking the scroll location. This buffer keeps the check from running on every scroll event, which avoids lag while scrolling.

readerLocation is the distance, in pixels, that the visitor must scroll before we fire an event and classify them as someone who starts reading.
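To show what callBackTime is doing, here is a rough sketch of the buffering idea on its own, separate from the tracking logic. The makeBufferedHandler factory and its injectable timers argument are hypothetical and exist only so the behaviour can be tried outside a browser; by default it falls back to the normal setTimeout and clearTimeout.

```javascript
// Hypothetical factory: returns a handler that waits for `delay` ms of quiet
// before calling `callback`. Each new call cancels the previously scheduled one.
function makeBufferedHandler(callback, delay, timers) {
    var t = timers || {
        set: function (fn, ms) { return setTimeout(fn, ms); },
        clear: function (id) { clearTimeout(id); }
    };
    var timer = 0;
    return function () {
        if (timer) {
            t.clear(timer); // cancel the check scheduled by the previous scroll event
        }
        timer = t.set(callback, delay);
    };
}

// Usage sketch: three rapid "scroll" events lead to a single location check.
var checks = 0;
var onScroll = makeBufferedHandler(function () { checks += 1; }, 100);
onScroll();
onScroll();
onScroll();
setTimeout(function () { console.log('location checks: ' + checks); }, 300);
```

A larger delay means fewer checks but a slightly later event; a smaller delay does the opposite.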

Now we send off an event to track that the article has loaded:

    // Track the article load (the trailing true marks it as a non-interaction event)
    if (!debugMode) {
        _gaq.push(['_trackEvent', 'Reading', 'ArticleLoaded', '', undefined, true]);
    }

Next comes the code that checks the location. First we gather where the visitor is on the page and how far they have scrolled.

    bottom = $(window).height() + $(window).scrollTop();
    height = $(document).height();

Then we start checking.

First, have they scrolled enough to fire the first event (150 px)?

    // If user starts to scroll send an event
    if (bottom > readerLocation && !scroller) {
        currentTime = new Date();
        scrollStart = currentTime.getTime();
        timeToScroll = Math.round((scrollStart - beginning) / 1000);
        if (!debugMode) {
            _gaq.push(['_trackEvent', 'Reading', 'StartReading', '', timeToScroll]);
        } else {
            alert('started reading ' + timeToScroll);
        }
        // Make sure this event only fires once
        scroller = true;
    }

IMPORTANT: The above event WILL change your bounce rate. As soon as someone starts scrolling I consider them engaged and not a bounce. So this event will drop your bounce rate. Also note that these events WILL change your time on site calculations. You should see time on site increase.
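If you would rather leave bounce rate untouched, _trackEvent accepts an optional fifth parameter, opt_noninteraction. The sketch below shows one way to switch between the two behaviours. The countScrollAsEngagement flag and the buildStartReadingEvent helper are hypothetical names, and _gaq is stubbed as a plain array so the snippet runs on its own.

```javascript
// Stand-in for the Google Analytics async queue.
var _gaq = [];

// Assumption for illustration: a site-level switch for how scrolling is treated.
var countScrollAsEngagement = false;

// Hypothetical helper: builds the StartReading event.
// When `interactive` is false, the trailing `true` sets opt_noninteraction,
// so the hit is not counted as an interaction for bounce rate.
function buildStartReadingEvent(timeToScroll, interactive) {
    var command = ['_trackEvent', 'Reading', 'StartReading', '', timeToScroll];
    if (!interactive) {
        command.push(true);
    }
    return command;
}

_gaq.push(buildStartReadingEvent(7, countScrollAsEngagement));
console.log(JSON.stringify(_gaq[0]));
```

Which setting is right depends on whether you consider a scroll to be engagement. I do, so I leave the event interactive.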

Then, when they reach the bottom of the content area, this event fires marking their progress. I’m basically checking to see if the div that contains the article content has been reached. If so, fire the event.

    // If user has hit the bottom of the content send an event
    if (bottom >= $('.entry-content').scrollTop() + $('.entry-content').innerHeight() && !endContent) {
        currentTime = new Date();
        contentScrollEnd = currentTime.getTime();
        timeToContentEnd = Math.round((contentScrollEnd - scrollStart) / 1000);
        if (!debugMode) {
            _gaq.push(['_trackEvent', 'Reading', 'ContentBottom', '', timeToContentEnd]);
        } else {
            alert('end content section ' + timeToContentEnd);
        }
        // Make sure this event only fires once
        endContent = true;
    }

It’s really important to note that the above code looks for a div specific to my blog. On my site the div is named entry-content. It might be different on yours. Basically you’re looking for the container that holds the blog post or article.
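The check itself is just a comparison of two vertical positions. Here is a minimal sketch with the jQuery lookups replaced by plain numbers so the arithmetic is easy to follow. The function name hasReachedContentBottom and its parameters are hypothetical. In a browser, contentTop would likely come from something like $('.entry-content').offset().top, which measures from the top of the document; that is an assumption here and differs from the scrollTop() call in the snippet above.

```javascript
// All values are in pixels, measured from the top of the document.
// viewportBottom = scrollTop + window height (the `bottom` variable in this post).
function hasReachedContentBottom(viewportBottom, contentTop, contentHeight) {
    return viewportBottom >= contentTop + contentHeight;
}

// Example: an article container that starts 200px down the page and is 1800px tall,
// viewed in a 900px-tall window.
var contentTop = 200;
var contentHeight = 1800;
var windowHeight = 900;

[0, 600, 1100, 1500].forEach(function (scrollTop) {
    var viewportBottom = scrollTop + windowHeight;
    console.log(scrollTop + 'px scrolled -> ' +
        hasReachedContentBottom(viewportBottom, contentTop, contentHeight));
});
```

If the event seems to fire too early or too late on your site, logging these three numbers is a quick way to see which one is off.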

Finally, we track if the visitor got to the bottom of the page. Here we do a few things.

  1. We calculate how long it took them
  2. We send an event
  3. We set a custom variable to bucket our traffic. If the visitor took longer than 60 seconds to reach the bottom then we’ll put them in the Reader segment using a visit-level custom variable. If they took less than 60 seconds I’ll put them in the Scanner bucket.

I’m putting them into custom variable slot 5 because that’s the only slot that I have available. You may need to use a different slot. Don’t know what a slot is? Read more about mastering custom variables.

    // If user has hit the bottom of page send an event
    if (bottom >= height && !didComplete) {
        currentTime = new Date();
        end = currentTime.getTime();
        totalTime = Math.round((end - scrollStart) / 1000);
        if (!debugMode) {
            // Bucket the visit with a session-level (scope 2) custom variable
            if (totalTime < 60) {
                _gaq.push(['_setCustomVar', 5, 'ReaderType', 'Scanner', 2]);
            } else {
                _gaq.push(['_setCustomVar', 5, 'ReaderType', 'Reader', 2]);
            }
            _gaq.push(['_trackEvent', 'Reading', 'PageBottom', '', totalTime]);
        } else {
            alert('bottom of page ' + totalTime);
        }
        // Make sure this event only fires once
        didComplete = true;
    }

Since we’re collecting the time spent on page, I’m going to use this data to adjust the threshold after I collect some data. I chose 60 seconds arbitrarily.

And finally, here’s the code that actually checks if the visitor has scrolled down the page:

    // Track the scrolling and track location
    $(window).scroll(function() {
        if (timer) {
            // Cancel the check scheduled by the previous scroll event
            clearTimeout(timer);
        }
        // Use a buffer so we don't call trackLocation too often.
        timer = setTimeout(trackLocation, callBackTime);
    });

So that’s the code. Feel free to copy it and place it on your site if you want to.

Stay tuned for a post tomorrow about the resulting reports!



    1. says

      The thing that bothers me is that I don’t see what I can do with this data. After a while you know which articles are read and which aren’t, and then what? The articles are already there and I won’t change them. I don’t mind if someone scans an article or reads it, as long as it was useful to my reader/scanner.

      So I’m looking forward to your post tomorrow ;)

      • says

        @Andre: I love your honest feedback, thanks! I think this gives a more accurate measure of content performance. Here’s one thing that I found, almost nobody reads the comments on this blog. Or at least over the last 2 days. Nobody scrolls to the bottom of the page.

        @Xiaoq: Yes, short articles on devices that have a large resolution can be a challenge. But in my early testing this has not been an issue. And, if your blog has shorter posts, you can change the settings to meet your needs.

        And yes, you can make all these hits non-interactive. My feeling is that once someone starts to scroll they are engaging the content and should not be considered a bounce. If you believe the opposite then you can change the hit-types.

        @Nate: Thanks Nate, I appreciate it. You made my day.

        @Adrian: I am no JavaScript guru. A lot of this was cobbled together and tested extensively. To answer your questions:

        1. Yes, there are a number of flags that keep the events from firing multiple times.
        2. Windows and tabs are a challenge. The data shows that people wait a long time before reading. This might be a discussion topic for the next post.
        3. Yes, you are correct about the custom vars. I’ve been going back and forth on the need for custom vars. Honestly, I don’t think we really need them.

    2. says

      Good stuff!

      But I think there are some limitations to this tracking code: screen resolution may pollute the results. For example, if the visitor’s screen is big enough and the article is not very long, the visitor may not need to scroll down to view the whole article.

      And, can we add the non-interactive value in the event tracking so that the bounce rate is not affected? I’m also wondering why the time on site increases.


    3. says

      Thanks for this amazing post. I haven’t implemented this yet or tested it in my own environment. But regardless of how well it might work in my environment I wanted to give you a virtual pat on the back for creating exactly the type of script I’ve been needing for a while.

    4. says

      This is awesome Justin! Coming from an advertising background, it’s very easy for me to understand this through the metaphor of impressions/clicks/conversions. I really hope content websites take this content tracking to heart.

      I have a few questions about the implementation:
      -I’m still learning JavaScript, but it looks like the endContent variable controls for someone scrolling back up and then back down (thus firing multiple ContentBottom actions)?
      -How would one account for a user moving to another window and then coming back to your content window?
      -Concerning the scope you used for the GA custom variable: in a scenario where a user 1) reads one article, 2) scans a second article, and then 3) exits the site, the custom variable would hold that session as “scanned”, correct? I am just curious about why you went with that scope instead of the page level scope :-)

    5. says

      Very good idea Justin, it would be very helpful, especially for weblogs.

      I personally would set the startReading event as a non_interactive, but this is my personal opinion.

      I think we need better standards to track some common things that a lot of us (power users) do every day. MaxScroll, FormTracking, Download Tracking, outbound tracking, … Right now everyone does it their own way, and when a new user asks (on Google Forums for example) how to do it, he’s often given a completely different approach that is not very well optimized, because there are no good published solutions.

      Weblogs are a type of site that needs more love from Web Analysts. To this date I haven’t seen a good way of tracking blog comments or article reading that doesn’t involve custom code. Also, it’s well established that bounce rate for weblogs needs to be tweaked based on interaction or scrolling, since a very engaged user may do so without ever leaving the homepage.

    6. says

      @Justin: hmm, that comment part is indeed interesting. The posts with the most comments tend to load much much longer than the ones without. So if you know almost no one reads them, you should only load the top 10 at first. Great insight and action.

    7. says

      Hey Justin –

      I noticed that those events are only being fired once each. For example, the event to indicate that I reached the bottom of the post is only fired one time, regardless of if I scroll back to the top and then back to the bottom.

      This raises the question of how many people initially scan an article (quickly scrolling all the way to the bottom) then go back to the top and read the entire post. Any ideas on how this behavior could be tracked (without muddying the waters and creating an abundance of these events in our reports)? If you did fire an event every time someone reached the bottom, would it suffice to look at unique events instead of total events?

    8. says

      I’ve been testing this code since I became aware that Justin was experimenting with this method (I tried a different script earlier, but that script wasn’t that good).

      I rewrote some of the code, and because of that I have been tracking things a little bit differently than Justin.
      I’ve used a shorter threshold for scrolling (30 seconds), but not for scrolling the complete page, just the content area. I’ve also tracked everything using non-bounce Events, and I’ve only tracked pages where the content area had to be scrolled. And finally, I’ve tracked the length of the content area in pixels so I could run some analysis against “Scanners” & “Readers” vs. content length.

      Then I extracted everything into Excel (using Next Analytics) so I could get all data about a page (including time on page, pageviews etc. that aren’t available in the Event Report).

      In Excel I’ve run some correlation between content length and “Readers” and “Reading” time. My hypothesis was that content length could explain reading time and readers -> The longer the content is, the longer time it takes to scroll = Longer content will bucket more visitors as “Readers”.

      These are my numbers (correlation coefficients):

      Content Length vs. Scanners               -0.087
      Content Length vs. Readers                 0.165
      Content Length vs. Reading Time            0.192
      Content Length vs. Scanner Time            0.264
      Content Length vs. Page Bottom Time        0.196
      Reading Time vs. Avg. Time on Page         0.943
      Page Bottom Time vs. Avg. Time on Page     0.514

      Although the correlation is slightly positive between Content Length & Readers, in my case Content Length isn’t very correlated to readers. Yes, my data is also limited so the correlation may change. This data is based on data from around 750 “Scroll Allowed pages”. Also, I only measure blog posts and articles, not category pages etc.

    9. Victor Acquah says

      This should be interesting. Great stuff.

      However, I just started testing it on my site – the events seem to fire off a pop up every time each is triggered – “started reading – 2″ or “end content – 8″. Why are these showing up as pop ups? ( Firefox 10 ). thx!

      • says

        @Victor: You’ve got debug mode turned on. Try turning it off. It’s the first variable in the script.

        @Nate: Yes, we could use non-interactive events. You would need to add another argument to the event calls.

        @Eivind: Thanks for sharing some of your data! I completely agree that this is best for content pages, not navigational pages.

        @Jim: You are correct, I only fire the events once. I do not fire multiple events. Based on the numbers I have so far it does not appear that scanning to the bottom of the page is a common behavior. The time-to-bottom of page is fairly long. More on the data tomorrow!

        @Eduardo: Agree, we need better content tracking tools. You’ve been good enough to create quite a few for the community. Hopefully this tracking ends up in other plugins, not just the WordPress plugin.

    10. Luke Hay says


      Great article. I’ve seen similar code before but nothing as comprehensive as this. I’m going to set this up on our site but I’ve got a quick question before I do. You say “All of this awesomeness will be added directly into the Google Analytics for WordPress plugin developed by Joost. Look for it soon.” This is great as I’m using that plugin already. The question is, how soon is ‘soon’? Should I implement the code in my site now or wait a couple of days/weeks/months and do it via the plugin?


      • says

        @Luke: Ah, that depends on Joost and how busy he is. I would say weeks, not months. But probably at least a few. Also, when these things get turned into plugins they usually have fewer options. So if there are any settings that you would really like to change you might be better off adding the code to your site.

    11. Victor Acquah says

      Implemented this yesterday. So far so good.

      Question: I have 2 different types of content pages – one is simply a blog and the other is a photo gallery page, much like how the Boston Globe’s Big Picture is laid out. In the case of the gallery, I’d like to tweak this to be “Gallery Viewers” instead of “Readers”, and “Viewing” instead of “Reading”, but keep it all under “ReaderType” (both editorial readers and gallery viewers). I’m not sure if this still goes into the same slot for ReaderType (I am using slot 1). Is it possible? What mods do I need to make to the code?

      I think this script is a first step towards focusing on what happens on a page (how users are interacting with specific page elements) vs. simply reporting Page Views, which in my opinion don’t mean much these days. What people do on each page, based on what the page is designed for, is what I want to know. If the page has an editorial, slideshow, video and comments, I’d like to be able to tell how each of these content types was separately used on that page. Are people simply watching the video and not reading the editorial? Do people also read the comments? Etc.

      The script in its current form relies on a combination of scrolling, speed of scroll and position on page to make assumptions about 2 specific user behaviors – reading vs. scrolling. I think the next progression of this script should be geared towards a means to detect specific on-page interactions, if it’s possible at all. For instance, if the comment section has its own container, we can apply the same assumption to how fast someone goes through this section alone, as an indication of reading the comments or simply going past them. Add in video view events, slideshow interactions, form submissions, subscriptions etc. and we begin to get a fuller picture of what’s going on on the page. Maybe modifying the script to track multiple containers on a page, with customizable scroll times for each, is the interim solution? Video views and slideshows might be different. Eivind Savio seems to be doing a bit of that with his modifications to the script as noted above.

      This is very cool. Great work. Hope there is more to come.

      • says

        @Victor: Wow, that was fast! Thanks for implementing and I hope you find the data useful. I hope this is a first step to more detailed content measurement. The script could absolutely be changed to include more direct “action” measurement, like leaving a comment. As you mentioned, this is a first step. I think we need to see how this works and then advance. More than anything else I want the data to be actionable.

        Regarding your question, you can absolutely use slot 1. If you want to add another value to the ReaderType, i.e. keep everything under the “ReaderType” custom variable, then just add the following code.

        _gaq.push(['_setCustomVar', 1, 'ReaderType', 'GalleryViewer', 3]);

        I also suggest that you might want to use a page-level custom variable rather than a session-level one, which I use. Because you have different content types you need to “reset” the custom var from page to page.

    12. says

      Very nice work!

      BTW: For those of you asking about read/scroll-rate and why you need it, I can tell you that it is extremely important. As more and more people learn, the number of people who read an article is much lower than what you might think. As a publisher that is critical information because our ‘product’ is the article itself. If people don’t read, the chances of them coming back are practically zero.

      Another thing that is also important is how it affects ranking of your articles. If you just look at the top 50 articles for a given month, in terms of unique visitors, you don’t see the correct picture. But if you see the top 50 articles in terms of read-rate, you suddenly know which articles people spent time reading. That can dramatically change which type of articles you want to write in the future.

      For instance, on my site I found that several of my popular articles actually had a very low read-rate, while some of my less popular articles had a high read rate. That’s something you need to know if you want your content strategy to succeed.

      BTW: I wrote about this in:

      • says

        @Thomas: Thank you, thank you, thank you. You make a fantastic point, much of our current measurement can be easily skewed and mask what is really happening on the site. This provides a much more precise picture of user activity on the site.

      • says

        @Eduardo: The code should fire at least 3 events: 1 when the page loaded, 1 when you started scrolling and 1 when you got to the bottom of the content. And if you made it to the bottom of the page then a final event should have been triggered.

        Let’s hope that’s what actually happened :)

    13. says

      This is indeed a very interesting development and addition to the weaponry of analytics power users.

      I bet you had some major influence in Google’s recent decision to remove the 10/20 events per pageview limitation to enable you to do this type of tracking.

      I was disappointed that Google introduced the limitation in the first place – there would have been plenty of people affected by it who were tracking other types of event-based user interaction.

    14. Matt says

      This is a really cool metric! I wouldn’t call it a bounce rate though. It’s too widely used of a term to be redefined at this point. “page engagement” or something like it may be a useful term.

    15. Victor Acquah says

      @Justin – Something wrong with my code. My custom variables are not working anymore after changing it from a session level to a page level variable: ( not being recorded )

      For the blog page with an editorial, I have this:

      _gaq.push(['_setCustomVar', 1, 'ReaderType', 'Scanner', 3]);
      } else {
      _gaq.push(['_setCustomVar', 1, 'ReaderType', 'Reader', 3]);

      And for the Photo gallery page with the set of pictures, I have:

      _gaq.push(['_setCustomVar', 1, 'ReaderType', 'GalleryScanner', 3]);
      } else {
      _gaq.push(['_setCustomVar', 1, 'ReaderType', 'GalleryViewer', 3]);

      Anything jump at you? thx!

      • says

        @ Gerard: Yes, there needs to be some customization. First, there is a variable at the top of the code that turns debugging on and off. Set that accordingly. Second, the chunk of code that identifies where your content ends needs to be changed. You need to identify the CSS class, or some other part of the page, that designates the bottom of your content.

        Hope that helps!

    16. Victor Acquah says

      Yes, I have an opening IF for the code blocks and there are no js errors ( posted my entire code earlier but seems it was deleted – too long). Not sure why it is not working. All the events are registering correctly with no custom variable showing up at all. Could it be that this code is placed right after the google analytics code itself? ( i.e the _setCustomVar comes after _trackPageview ?)

    17. says

      Hi Justin! I’d suggest the _setSampleRate method for large sites that could get close to GA data collection limits, after implementing this great collection of events and custom vars.
      What do you think about this?
      Greetings from Mexico!

      • says

        @Hebert: Good point, a site owner should evaluate the need for this data and how it impacts their total number of hits. In general, I’d try to convince them to use GA Premium :)

    18. says

      @Justin: isn’t the limitation of 500 “hits” per session controlled by ga.js – meaning that even if you had GA premium you would still encounter the limit? Or is there a different GA JS file that GA premium customers use?

    19. says

      Justin, I just noticed John’s comments in my inbox (thanks John!)… I appreciate you fixing the link, but the link text is still wrong. Not a big deal, just a heads up ;)

    20. Dennis says

      I found that this code didn’t quite work properly in terms of finding the end of the content.
      My fix was this:

      Original code
      if (bottom >= $('.entry-content').scrollTop() + $('.entry-content').innerHeight() && !endContent) {
      New code
      if (bottom >= $('.entry-content').offset().top + $('.entry-content').innerHeight() && !endContent) {

      The original was triggering way too early, before I got to the div.
      Not sure why it was not working, but I do know that it is working now.

      Additionally, I had to change the . in this to # as I had “div ids” rather than “div classes”.

    21. James says

      Enjoyed the post, just a quick question, do you define the end of page as the very bottom of this page (where reader can see footer) or the bottom of the comments (which obviously grow as more people write) . Personally I never usually scroll beyond the ‘trackbacks’

      Perhaps you could measure if more people responded/commented if the number of trackbacks was reduced or just removed completely?

      • says

        @James: I track both. There is an event to mark when people get to the bottom of the content, and there is a second event to mark when people get to the actual bottom of the page HTML.

        Thanks for the comment!

    22. Darrell says

      I made sure jquery is installed. When I paste the code in header.php, it appears on my webpage. What am I doing wrong?

    23. sumathi says

      We are pushing our data (username, title, organization, department, search term) as 5 events using Event Tracking.
      The values of these 5 fields are given in the Event Label.
      We need to have all of the above in a consolidated report that says:

      “sumathi” “executive” from “SMI” (organization) “Marketing” (dept) “test” (used search term)

      All 5 have to be in a single report (either a dashboard, a custom report or a standard report, or at least something we can export to Excel).

      Thanks and Regards

      • says

        @sumathi: Be careful, it looks like some of the data that you are collecting is personally identifiable information. Specifically ‘username’. Remember, you can not track any personally identifiable information in Google Analytics.

        As for your question, if the data is in five different events, then you can not combine it in Google Analytics. My suggestion would be to collect all data fields in the Event Category or Action, then export and manipulate in Excel.

    24. says


      First of all a great post!
      Now that Google Tag Manager has come into play, is it possible to insert the JavaScript into a custom HTML tag and fire it when the visitor lands on the relevant content pages? Or is it better to insert the script in the DOM?

      Thanks in advance.

    25. Attila says

      Nice work! I am testing and running it in debug mode, but the ‘end content section’ alert always fires about 250px early, before div.entry-content. Do you have any idea why? I probably have to add approximately 250px to the measurement function.

    26. says

      Great articles. A year later, as we start evaluating the positives of a one-page website, that is iPad friendly against the negatives related to losing decent GA results, your article has given me hope that applying these scripts to a series of tags used to separate “page” content within a single page may prove to have positive results. Let the scripting begin!

    27. David Suarez says

      Hey Justin, if I use this on a WordPress site and I have one of those caching plugins installed, would that affect the script?

      • says

        @David: Hmmmm…. I don’t think so. The caching is done by the server, and would only affect delivery of the script. It would not change any Google Analytics data collection.

      • says

        @Chris: Sort of. You would need to put this code, along with the standard GA code, in a Custom HTML tag. That should work. I’m also working on a new version that will work with GTM. It should be done by the end of Q1 2014.

