We all know that web analytics data quality sucks and we’re all OK with that. Thanks Avinash! But there comes a point when the data quality is just abysmal. This usually means that there is something wrong with the web analytics application configuration.
That’s why I’m writing this post. I’ve seen too many clients struggle through the analytics process thinking that their Google Analytics data was OK when it was complete garbage. It is extremely important to have a resource in your organization that truly understands Google Analytics. Not only should they understand the product, they should also have a good grasp of your data needs. Why? Because Google Analytics, like every other analytics product, has its own nuances. Configuring GA correctly, so it generates the data that you need to make decisions, takes experience and expertise with the product.
I am very lucky that I get to work with organizations of all sizes. Enterprise level clients, mid-level clients and small businesses. It doesn’t matter how big the company, I’ve seen them all struggle with Google Analytics.
And here comes a blatant plug. If you’re having trouble with Google Analytics, get a Google Analytics Authorized Consultant, like EpikOne, to take a look at your configuration. And remember, it’s not just about getting the filters right. It’s understanding what data your organization needs to make informed decisions.
Justin: You make a great point about knowing your organization and site well to make that judgement. While data quality will always be suboptimal, the least we can do is make sure that what should be collected is being collected as well as it can be.
Now how about spilling the beans and sharing some tips on what we can look for after we install GA to know if data collection and processing is going ok? : )
I’ll get the list started:
1) An obvious one: compare your GA data with your standard web analytics tool (if you have one). This is a great way to get the red flags going.
2) You probably have sources with some data that you can reconcile: email drops (look at the response rates that your email vendor is giving) or PPC campaigns (again your agency, if you use one, will give some data). The goal is to compare and reconcile the data to see if GA is in the ballpark.
3, 4, 5, 6) Justin to add!! :)
Thanks Justin.
Avinash,
You’ve spilled the beans on my next blog post! I’m getting ready to write about the top GA errors that I see when working with clients. I’ve also got a few tips for identifying changes in traffic.
Regarding your comments, there is one thing I would urge caution on. Comparing analytics data between applications and vendors is a tricky proposition. While I agree that the numbers should be in the same ballpark, many people look for 100% accuracy. This is not realistic.
I’ve spent many hours digging through log files trying to reconcile GA data with that of another application. In almost all instances it is an exercise in futility.
As we all know, it is the trending of the data that is most helpful when doing our analysis.
Look for some more tips in the coming posts!
Justin
Look forward to your upcoming posts Justin.
I completely agree with you on the “reconcile” part, quite suboptimal.
My internal gut “benchmark” is around a 10 – 15% delta between tools (provided both are JavaScript based or both are web log based; it is important to compare similar data collection methods). If it falls in that vicinity then that is tentatively OK.
The good news is that at least you are paid a very high hourly rate to do the digging, I on the other hand….. :)
-Avinash.
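For readers who want to apply the 10 – 15% rule of thumb above, here is a minimal Python sketch. The function names and the visit counts are hypothetical, chosen only for illustration, and the 15% default simply takes the upper end of the range mentioned in the comment.

```python
def percent_delta(reference_count, ga_count):
    """Return how far ga_count is from reference_count, as a percentage.

    reference_count is the figure from the other tool (hypothetical input);
    ga_count is the figure reported by Google Analytics for the same period.
    """
    if reference_count <= 0:
        raise ValueError("reference_count must be positive")
    return abs(ga_count - reference_count) * 100.0 / reference_count


def within_benchmark(reference_count, ga_count, max_delta_pct=15.0):
    """True when the two tools agree to within max_delta_pct percent."""
    return percent_delta(reference_count, ga_count) <= max_delta_pct


# Made-up numbers: the other tool reports 10,000 visits, GA reports 8,900.
print(percent_delta(10000, 8900))
print(within_benchmark(10000, 8900))
```

A result inside the range is only "tentatively OK", as the comment says, and the comparison is only meaningful when both tools collect data the same way.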
Wow am I glad that I’ve come across your blog. You share some great tips quite freely.
I’ve posted links to some of your tips on my blog already. (It’s not a well-read blog, but it’s my way of journaling what I learn as an entrepreneur and allowing anyone to try to learn from me.)
Thanks for everything. I look forward to reading all you post.
Hello Justin and readers,
First of all I’d like to thank you for your information, which is very useful to me.
I would like to ask for some help with a problem I am encountering in the Analytics setup for my company’s website. My boss has been pushing me to set up and monitor our Analytics account, and I have been reading up on it for some time, but I still haven’t found out which configuration would be best for our website. The main issue is the following: our website is translated into 5 languages, which are subdomains of the English version: http://www.mywebsite.com = English, http://www.it.mywebsite.com = Italian, http://www.fr.mywebsite.com = French etc.
I would like to create a profile for every language as well as a profile for all the languages together to be able to see the cumulative numbers.
To get this I did the following: I inserted the tracking code (with the same ID on all pages) in the header of every page and added the _udn="mywebsite.com" part to it. After this I created a French profile and applied the following filter: Filter type = include only traffic to a subdirectory, subdirectory = ^/fr/
For the English domain (www.mywebsite.com) it seems to work, but I do not know if this is reliable. I made a filter which excludes all the other languages like this: exclude all traffic from domain, domain: it.mywebsite.com etc.
As I don’t know which way is the correct one, I would like to ask if you could shed some light on this subject, as I can’t find any info on the web.
Thanking you in advance for any help,
regards,
Roger
Hi Roger,
Thanks for sharing your issue. The configuration you speak of is very tricky and I usually tell people to think twice about using it. First, your setup is correct except for one thing, the filter. The filter should be an exclude filter based on the host name.
The main problem with tracking all the sites together using the technique you describe above, is that the referral data for the visitor will be lost. So, when someone goes from the Italian site to the English site the original referral information will be overwritten.
If you’ve only got a few sites, and you need to get cumulative traffic numbers, I recommend just pulling the data and using Excel. If you’re more interested in the total numbers, then your approach will work.
Hope that helps,
Justin
Dear Justin,
First of all I’d like to thank you very much for your reply.
You write: The filter should be an exclude filter based on the host name. If I would like to exclude my French traffic, then do I configure it like this? –> Filter: Exclude, Filter field: Hostname, Filter pattern: fr.mywebsite.com
You probably understand what I am looking for. If this is not possible, or hardly possible, with Analytics, could you recommend another program?
My last question is whether you know of a good forum or IRC channel where these quite specific topics are discussed.
I appreciate your help.
Roger
Roger,
Regarding your filter, remember that all filters use regular expressions. You need to change the filter pattern to:
fr\.mywebsite\.com
I would recommend the Google Analytics Support Forum:
http://groups.google.com/group/analytics-help
There’s lots of great information out there.
Justin
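To see why the dots in the filter pattern above need escaping, here is a minimal Python sketch. It uses Python’s `re` module as a stand-in, which is an assumption: the regular expression engine inside Google Analytics is not the same one, although the meaning of an unescaped `.` (any single character) is common to both. The hostname `www.frxmywebsite.com` and the helper `matching_hosts` are hypothetical, invented only to show the difference.

```python
import re

# Pattern as first proposed: each "." matches ANY single character.
unescaped = re.compile(r"fr.mywebsite.com")
# Corrected pattern: "\." matches only a literal dot.
escaped = re.compile(r"fr\.mywebsite\.com")

# Hostnames to test; the second one is made up for illustration.
hosts = [
    "www.fr.mywebsite.com",
    "www.frxmywebsite.com",
    "www.it.mywebsite.com",
]


def matching_hosts(pattern, hostnames):
    """Return the hostnames in which the pattern is found anywhere."""
    return [h for h in hostnames if pattern.search(h)]


print(matching_hosts(unescaped, hosts))
print(matching_hosts(escaped, hosts))
```

The unescaped pattern also catches the made-up hostname, while the escaped one matches only the French subdomain. In practice the risk of such a collision is small, but escaping the dots makes the filter say exactly what is meant.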