I just read a great article by Bryan Eisenberg about Conversion Rate Basics. He points out some simple things that you should do to ensure your site is converting visitors. This got me thinking about an experience I had earlier this week.
On Wednesday (August 29, 2007) I received a message from NetFlix regarding their video on demand service. I’m a big fan of on demand video so I decided to click on the email and view the offer. Imagine my surprise when I was greeted with the following message on the NetFlix website:
Now the site was not down long and this may have been an unforeseen problem. But the lesson is clear: make sure your site works. If it doesn’t, you’re going to lose business.
Here’s a similar example. We started working with a client who uses their website to generate sales leads. While evaluating the website we found that their main lead generation form, if filled out incorrectly, would display a plain white screen to the visitor. No web page, no error message, no nothing. Just a plain white page.
How can you protect yourself from unexpected downtime? Try a site monitoring service. I’ve never used one but assume they all function the same way. At some given interval, say 2 minutes, the monitoring service makes a request to your website. If the web server returns an abnormal result then an alert is sent to the responsible party. The company usually charges a small monthly fee for this service. Does anyone out there have an experience with a site monitoring service they would like to share?
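To make the idea concrete, here is a minimal sketch of what one such check might look like in Python. It is not how any particular monitoring service works. The names `check_site` and `monitor_once`, the 10-second timeout, and the `alert` callback are all assumptions made for illustration.

```python
import http.client
import urllib.error
import urllib.request


def check_site(url, timeout=10, opener=urllib.request.urlopen):
    """Make one HTTP request to url and return (ok, detail)."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 500 or 503.
        return False, f"HTTP {exc.code}"
    except (OSError, http.client.HTTPException) as exc:
        # DNS failure, refused connection, timeout, broken response, etc.
        return False, f"request failed: {exc}"
    if status != 200:
        return False, f"HTTP {status}"
    return True, "HTTP 200"


def monitor_once(url, alert, check=check_site):
    """Run one check and call alert(message) if the result is abnormal."""
    ok, detail = check(url)
    if not ok:
        alert(f"{url} looks down: {detail}")
    return ok
```

A scheduler (cron, for example) could call `monitor_once` every 2 minutes with an `alert` function that sends an email or a text message to the responsible party.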
But what if the NetFlix issue was not unforeseen? What if the email blast was sent during the website’s scheduled maintenance period? To me, that indicates a lack of process. There should have been some type of process in place that stopped the email blast from going out while the website was down. From a web analytics standpoint, I always want to know when a client is sending out emails so I can ensure that they are tagged for tracking. How about adding a step to the ’email blast process’ to check the website status before sending out the email? I know it doesn’t seem complicated, but unless checking the website maintenance schedule is a documented step in a defined process it could go undone.
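If the email tool can be scripted, that extra step could even be automated. The following is only a sketch under assumed conditions: the maintenance schedule is available as a list of (start, end) times, and `site_is_up` is some function that reports whether the site responds normally. Both names, and the function `safe_to_send`, are hypothetical.

```python
from datetime import datetime


def in_maintenance(now, windows):
    """True if now falls inside any (start, end) maintenance window."""
    return any(start <= now < end for start, end in windows)


def safe_to_send(now, windows, site_is_up):
    """Return (ok, reason). The blast should go out only when ok is True."""
    if in_maintenance(now, windows):
        return False, "site is in a scheduled maintenance window"
    if not site_is_up():
        return False, "site did not respond normally"
    return True, "ok"


# Illustrative schedule: maintenance from 2:00 to 4:00 on a made-up date.
windows = [(datetime(2007, 9, 1, 2, 0), datetime(2007, 9, 1, 4, 0))]
ok, reason = safe_to_send(datetime(2007, 9, 1, 3, 0), windows, lambda: True)
print(ok, reason)
```

The point is less the code than the fact that the check is written down somewhere and runs every time, rather than relying on someone remembering it.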
This could be a really long reply, as this posting of yours, Justin, cuts to the core of some of my major experience and expertise. :-)
In essence, the problem is not “am I going to lose business”, rather “how much business can I afford to lose”.
Given an answer to that question – best formalised in $$$ terms – you can then determine an appropriate level of response. There is no point in having a multi-million dollar, fully redundant, multiple data-centre system if you have sales of $5000 a year.
Silly example but I wish to stress the point.
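A rough way to put numbers on that question, sketched in Python. It assumes sales are spread evenly across the year, which real sites rarely manage, and the availability figures and the $600 yearly cost used below are invented purely for illustration.

```python
def expected_annual_loss(annual_sales, availability):
    """Sales lost per year if the site is up only `availability` of the time.

    Assumes revenue arrives evenly over time (a simplification).
    """
    return annual_sales * (1.0 - availability)


def worth_spending(annual_sales, current, improved, annual_cost):
    """True if the sales recovered by better availability exceed the cost."""
    saved = (expected_annual_loss(annual_sales, current)
             - expected_annual_loss(annual_sales, improved))
    return saved > annual_cost


# $5000 a year in sales, moving from 99% to 99.9% uptime for $600 a year.
print(worth_spending(5000, 0.99, 0.999, 600))
# The same improvement for a site with $5,000,000 a year in sales.
print(worth_spending(5_000_000, 0.99, 0.999, 600))
```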
Monitoring won’t protect you. It helps you manage the risk. Flip side, monitoring should be *proactive*. Fix problems before they become BIG, ie $$$, problems.
Email blasts can’t always be stopped if a site goes down. During scheduled maintenance, sure. But not during unforeseen. Email just doesn’t work that way. You may have already dumped half the messages into the email queues when the web server/farm breaks. Sometimes even trying to stop can actually make things worse!
Then one has all the attendant issues around “others” and their actions. I’ve seen poorly designed systems break spectacularly because of the normal day-to-day operations of another area.
In the most memorable case some… idiot(s)… decided that email was an immediate real time message transport system. Not according to the specs it ain’t, and more emphatically: Not when another client drops several million emails into one’s queues it ain’t. :-)
I came across this saying some years ago. Have it stuck up on the wall in front of me. I try to live up to it. It seems appropriate to share here:
In God We Trust
Everything Else We Monitor
Incidentally, if you have clients who really can’t afford much: with a VPS and one of the Open Source monitoring systems, e.g. Nagios, they can have an enterprise grade monitoring system for practically peanuts. A competent sysadmin should be able to set up a complete, albeit simple, monitoring system in half a day.
Cheers!
Hi Steve,
Thanks for the great response. I think you’ve shown me up with a well thought out and thorough comment!
I completely agree that an organization should understand _how_ much business they can afford to lose and let that drive how they manage redundancy and disaster planning.
I should have done a better job segmenting this issue between enterprise level organizations and small businesses. I think you’ve done a good job pointing out that larger organizations probably have more money and can actually implement solutions ensuring website uptime (i.e. redundant servers).
I believe that smaller organizations that cannot afford a globally load balanced site can utilize a ping service to keep things up and running (thanks for the tip on Nagios).
Thanks again for your insight, it is truly appreciated.
Justin
“show me up”? Rubbish. Haven’t shown you up at all. I’ve merely Value Added to your original posting. :-D
Incidentally, the issue is no different based on the organisation’s size. The resources to *throw* at a solution are different, even the solution itself can be different; but the problem is the same.
Namely: How do I maximise my availability to end users within the resources I have?
If you have $50 a month, you can do X. If you have $500 a month you can do Y. But you *may* find that instead of $500, that merely $50 will give you an adequate solution to your *needs*.
Ahhh. The subtle difference between “Want” and “Need”. :-)
To help put in some context, this year I have been involved in a project at work whereby we are designing possible solutions to handle website/information for a National Pandemic. eg Bird Flu outbreak. So this topic is… urm… topical… for me right now. :-)
FWIW? The Q&D solution we’d implement today, is to strip almost all content and images etc from the servers as they currently stand and just put up simple HTML; and probably pull in my servers via DNS Round-Robin as well. wget mirror and away we go. So we have “A” solution, but not the “Best” solution. Where “Best” has a rather extensive definition.
Last Point. I swear! If by “ping” you mean TCP port 80, virtual server specific plus Content Verification of the text sent back; Ping? Sure. :-)
If you mean network layer ICMP ping? No. That will only tell you that the network is working, or not. Won’t tell you that mysql has crashed and burned and hence that your website is down.
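For what it’s worth, a check with Content Verification can be sketched in a few lines of Python. This is an illustration only, not what Nagios or any other product does internally; `page_contains`, the 10-second timeout, and the expected text are assumptions.

```python
import http.client
import urllib.request


def page_contains(url, expected_text, timeout=10, opener=urllib.request.urlopen):
    """Fetch url over HTTP and confirm the body contains expected_text.

    An ICMP ping would pass as long as the network is up. This check fails
    if the web server is unreachable, answers with an error status, or
    serves a page that lacks the expected text (for example a database
    error page in place of the real content).
    """
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Covers connection failures, timeouts, and HTTP error statuses.
        return False
    return expected_text in body
```

Pick text that only appears when the whole stack is working, such as a product name pulled from the database, and request the URL using the site’s real host name so the right virtual server answers.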
HTH! Cheers!