Last time, I looked at how a reasonable management goal--improve the performance of an automated customer service system--gets translated into a bad metric, like "call containment," defined as the percentage of calls that never reach a customer service representative.
It's easy to take potshots at bad metrics: they're common, often obvious, and companies keep using them anyway. I've had my fun, though, so now it's time to talk about how to build a better metric.
I'll take the business goal as a given: "Improve the performance of an automated customer service system." The problem is translating the often-vague business goal into something measurable without creating a perverse incentive to do the wrong thing.
Step 1: Translate Vague Goals into More Precise Ones
The first thing is to make sure everyone understands what the business goal is really all about. In this instance, the word "performance" could mean any number of things, from speech recognition accuracy to dollars per customer served.
One problem with this particular goal is that "performance" can't really be measured with a single number for an interactive voice response (IVR) system. At a bare minimum, you need to ask two things: how well does it work for customers who can be served through self-service, and how well does it work for customers who must talk to a human? The latter group poses the harder problem, since those are the people with more complex needs: billing disputes, product returns, lost shipments, and the like.
Since every stakeholder is going to have a different idea of what the business goal actually means, it may be best to spend some time brainstorming. You may find that several different measurements are needed. For example, you may decide that the "performance" of an automated customer service system needs to be measured in three ways:
- Of those customers who can be handled in the IVR, as many as possible complete their task inside the self-service system on the first call.
- Of those calls that are sent to agents, the customer spends as little time as possible in the automated system.
- The IVR correctly determines whether to provide live or self-service as often as possible.
Step 2: Draft Operational Definitions
Having settled on the dimensions of IVR performance, the next step is to figure out the nuts and bolts of measuring each parameter. Some parameters are available as statistics straight off the system--for example, the time a caller spends in the IVR before being sent to an agent queue should be easy to calculate from system logs.
Other parameters are more difficult. For example, to decide whether the IVR is correctly providing live or self-service, you need to know what the customer's problem was and whether it can be automated. If the call was sent to an agent, the agent can code the call reason; but if the customer hung up in the automated part of the call, you will have to do a survey to determine whether that call should have been sent to an agent or not.
At this stage, you want to be mathematically precise in defining the metrics. Write an equation, define where the raw data will come from, and maybe even calculate the statistics a few times to see if they make sense.
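As an illustration, a first draft of the three measurements from Step 1 might be computed from call records like this. This is only a sketch: the record fields, the helper name, and the definition of a "correct" routing decision are all assumptions you'd have to pin down against your actual logs and agent call codes.

```python
from dataclasses import dataclass

@dataclass
class CallRecord:
    # Hypothetical fields; real system logs will differ.
    caller_id: str
    automatable: bool        # could this call reason be self-served? (from agent code or survey)
    completed_in_ivr: bool   # task finished inside the self-service system
    first_call: bool         # first call about this issue
    sent_to_agent: bool
    seconds_in_ivr: float    # time in the IVR before transfer (or hang-up)

def draft_metrics(calls: list) -> dict:
    automatable = [c for c in calls if c.automatable]
    transferred = [c for c in calls if c.sent_to_agent]

    # Metric 1: of calls the IVR *could* handle, what fraction finished
    # self-service on the first call?
    self_service_rate = (
        sum(c.completed_in_ivr and c.first_call for c in automatable)
        / len(automatable) if automatable else 0.0
    )

    # Metric 2: for calls sent to an agent, average time spent in the IVR first.
    avg_ivr_seconds = (
        sum(c.seconds_in_ivr for c in transferred) / len(transferred)
        if transferred else 0.0
    )

    # Metric 3: how often did the IVR route the call correctly?
    # "Correct" here means automatable calls stay in self-service
    # and non-automatable calls go to an agent.
    routed_correctly = sum(c.automatable != c.sent_to_agent for c in calls)
    routing_accuracy = routed_correctly / len(calls) if calls else 0.0

    return {
        "first_call_self_service_rate": self_service_rate,
        "avg_seconds_in_ivr_before_agent": avg_ivr_seconds,
        "routing_accuracy": routing_accuracy,
    }
```

Even a toy calculation like this forces you to answer the awkward questions early: where does the `automatable` flag come from, and who decides what counts as "completed"?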
Step 3: Refine Operational Definitions
Now that you've got a working draft of how to calculate the metrics, it's time to add people back into the equation.
Remember that if someone is given an incentive to meet a goal, they will either (a) work to meet the business goal, or (b) manipulate the metric. Which they actually do will depend on which is easier, the character of the individuals, and how worried they are about getting caught cheating.
If it's easy to game the system and the odds of getting caught (or the penalty) are small, then the metric won't work to meet the business goal; it'll just identify the people who are good at cheating.
The question you want to ask is "How else can I improve this metric, other than by meeting the business goal?"
For example, if you're asking agents to code the reason for the call in order to determine whether the call was routed properly, make sure there aren't any perverse incentives for agents to miscode calls. Do customer service reps get a break in their performance metrics if they handle lots of difficult calls? Or is there concern that a successful IVR implementation will lead to layoffs?
Similarly, design decisions in the IVR made to meet the metrics could have bad side-effects. If you're measuring whether the call is contained in the IVR (rather than whether the customer's problem was solved), then the statistics will look better if the system is designed to force people to hang up and call back when they reach a dead end.
Probably the most effective way to refine your metrics is to get several people together and brainstorm how you might manipulate it. Fresh perspectives and creativity are very helpful, since some manipulations are both subtle and easy to do. Consider both how the metric can be revised, and how manipulation can be detected.
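On the detection side, some manipulations leave fingerprints in the logs. For instance, the "force a hang-up and call back" trick from the previous paragraph should show up as the same caller reappearing shortly after an abandoned IVR session. Here's a hypothetical check along those lines; the log tuple format and the ten-minute window are assumptions, not anything your system necessarily produces.

```python
from datetime import datetime, timedelta

def suspicious_callbacks(calls, window_minutes=10):
    """Flag callers who abandoned in the IVR and called back shortly after.

    `calls` is a list of (caller_id, start_time, ended_in_dead_end) tuples --
    a made-up log format for illustration. A spike in flagged calls after an
    IVR redesign suggests "containment" is being inflated by pushing callers
    to hang up and retry.
    """
    window = timedelta(minutes=window_minutes)
    calls = sorted(calls, key=lambda c: c[1])  # process in chronological order
    last_abandon = {}  # caller_id -> time of most recent dead-end hang-up
    flagged = []
    for caller_id, start, dead_end in calls:
        prior = last_abandon.get(caller_id)
        if prior is not None and start - prior <= window:
            flagged.append((caller_id, start))
        if dead_end:
            last_abandon[caller_id] = start
    return flagged
```

A check like this doesn't prove manipulation by itself--some customers legitimately call back--but tracking the rate over time gives you a baseline to compare against.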
The classic example of manipulating a metric is the automated end-of-call survey:
Worst: At the end of the call, the customer service representative manually transfers callers into an automated survey.
If the agents are given an incentive to improve their survey scores, it doesn't take much imagination to see that upset customers will never get transferred to the survey. You simply can't rely on people to come up with a random sample, especially when it's in their best interest to inflate their scores.
Not Much Better: After the agent hangs up, the call is automatically transferred to an automated survey.
This at least takes the burden off the agent to decide which customers to survey. Even so, if the customer hangs up before the agent, then the call doesn't get surveyed. It doesn't take long for clever agents to figure out that if they say "goodbye" and wait a few seconds, the customer will nearly always hang up first.
What you wind up measuring is which agents have figured out how to manipulate the survey, and which ones haven't.
Best: The customer is called back for the survey within a few minutes of the end of the call.
If the survey is done on a second call, there's very little opportunity for the agent to manipulate who takes the survey. This will give the best measurement of which agents are actually providing better service. The downside is that it takes a little more effort to call customers back, and if it's an automated survey the customer may get a negative impression from being called by a machine.