How to Design a Phone Menu
By Peter Leppik
The conventional wisdom for decades has been that the ideal phone menu should have no more than five options at each level. Any more than that, it was thought, would overtax the caller's short-term memory and lead to confusion and poor experiences.
Broad vs. Deep
As far as I know, however, nobody had ever experimentally tested the question of how broad or deep a design should be for a given application.
A recent article in the journal Human Factors, "A Comparison of Broad Versus Deep Auditory Menu Structures" by Patrick Commarford and James Lewis (both of IBM) and colleagues, tackles the question of broad vs. deep menus head-on. It comes to the surprising conclusion that, at least for the application they tested, it works better to give callers a single menu with many options than to give callers fewer options in each menu spread across submenus.
In other words, the "no more than five options" rule of thumb is wrong, at least for the application they tested.
For this experiment, Commarford and Lewis used a speech application for navigating e-mail. They identified 11 different functions (Next, Previous, Delete, Reply, etc.) and built a "Broad" and a "Deep" version of the user interface. The "Broad" version offered all 11 options in a single menu; the "Deep" version offered four top-level choices (Listen to Messages, Respond, Distribution, and Message Details), with the 11 functions distributed among those four submenus.
There were some design oddities about the "Deep" menus: for example, the "Delete" function was under the "Respond" menu (for discussion of how this came about, see "User-Centric Design" in this issue). In the final analysis, though, I think it's fair and reasonable to conclude that for this application a broad design is probably superior to a deep design, even though that leads to more than five choices in a single menu.
What Does It Mean?
Some people may read this result to mean "make the menu as flat as possible," presumably by putting every option into one single menu, but I think what it really means is "there's nothing magical about five menu options." There will always be a design decision about how broad or deep to make a tree, unless the system has few enough choices to put in a single menu. If the IVR needs to route the call among hundreds of different destinations or provide several dozen functions, a single menu will be unwieldy (though one thing to look at in those situations is whether the total number of functions or destinations can be consolidated).
The key to usability is making sure that the menu choices are obvious (which may require including some choices in more than one menu if they don't fit cleanly in one place), and that the caller's time and mental effort are respected. That means striking a balance between broad and deep, and the right balance will be different in every situation.
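As a hypothetical illustration of making choices obvious by listing them in more than one menu (the layout below is my own sketch, not the design from the experiment), a heavily used function such as "Delete" can simply appear under every submenu where a caller might plausibly look for it:

```python
# Hypothetical deep menu in which "Delete" is reachable from two submenus,
# so callers find it whether they think of deleting as part of listening
# or part of responding. This layout is illustrative only.
MENUS = {
    "Listen to Messages": ["Next", "Previous", "Repeat", "Delete"],
    "Respond": ["Reply", "Reply to All", "Forward", "Delete"],
    "Distribution": ["List Recipients", "Add Sender"],
    "Message Details": ["Mark Unread", "Time and Date"],
}

def submenus_containing(function):
    """Every top-level choice from which a given function can be reached."""
    return [name for name, funcs in MENUS.items() if function in funcs]

print(submenus_containing("Delete"))   # -> ['Listen to Messages', 'Respond']
```

The cost of this aliasing is slightly longer submenus, which is exactly the broad-vs.-deep trade-off the research explores.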
The real lesson of this research is that real-world usability testing is likely to show that general rules of thumb break down when applied to a specific design. At the end of the day, usability testing is critical to making sure that the design is right.
User-Centric Design
By Peter Leppik
One thing that stands out in the experiment described in "How to Design a Phone Menu" (this issue) is that the menu design used in the experiment is quirky. The e-mail browsing application had eleven different functions for dealing with each message: Next, Previous, Repeat, Reply, Reply to All, Forward, Delete, List Recipients, Add Sender (to address book), Mark Unread, and Time and Date.
The menu structure the researchers came up with for the "deep" design was:
- Listen to Messages: Next, Previous, Repeat
- Respond: Reply, Reply to All, Forward, Delete
- Distribution: List Recipients, Add Sender
- Message Details: Mark Unread, Time and Date
So if you wanted to go to the next message, you would first have to say, "Listen to Messages" and then choose the "Next" option from that submenu.
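The two designs can be sketched as simple data structures and compared by the number of spoken commands needed to reach a function (the function and submenu names come from the article; the code itself is my illustration):

```python
# "Broad" design: all eleven functions in one menu.
BROAD = ["Next", "Previous", "Repeat", "Reply", "Reply to All", "Forward",
         "Delete", "List Recipients", "Add Sender", "Mark Unread",
         "Time and Date"]

# "Deep" design: four top-level choices, each with a short submenu.
DEEP = {
    "Listen to Messages": ["Next", "Previous", "Repeat"],
    "Respond": ["Reply", "Reply to All", "Forward", "Delete"],
    "Distribution": ["List Recipients", "Add Sender"],
    "Message Details": ["Mark Unread", "Time and Date"],
}

def steps_to(function):
    """Spoken commands needed to reach a function in each design."""
    broad_steps = [function]                  # one utterance in the broad menu
    for submenu, functions in DEEP.items():
        if function in functions:
            deep_steps = [submenu, function]  # submenu first, then the function
            return broad_steps, deep_steps
    raise ValueError(f"unknown function: {function}")

print(steps_to("Next"))
# -> (['Next'], ['Listen to Messages', 'Next'])
```

Every function costs one extra utterance in the deep design, which helps explain why the broad design won out despite its longer menu.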
The most obvious problem with this design is that "Delete" is in a submenu called "Respond." Few callers would ever look for "Delete" under the "Respond" menu, and that's probably one of the most used functions.
This menu structure was built through a process that illustrates the hazard of blindly applying survey data to design. Here's what they did:
Step 1: 26 users were asked to organize the eleven functions into logical groupings of five or fewer functions. Each user's answers were analyzed, yielding an aggregate grouping:
- Delete, Forward, Reply, Reply to All
- Repeat, Next, Previous
- Mark Unread, Time and Date
- List Recipients, Add Sender
So far so good, though arguably some functions should be included in multiple groups.
Step 2: 101 users were asked to label each group. The most common suggestions were:
- Delete, Forward, Reply, Reply to All: "Action" (suggested 22 times), "Respond" (15)
- Repeat, Next, Previous: "Navigate" (22), "Listen to Messages" (15)
- Mark Unread, Time and Date: "Message Details" (10), "Status" (10), "Miscellaneous" (5), "Options" (5)
- List Recipients, Add Sender: "Address Book" (9), "Distribution" (9)
One problem is that the survey asked participants to "label" each group. However, that's the opposite of what a caller needs to do: a caller needs to take a set of labels and guess which label contains a given function. As descriptive terms for the groups, the labels are fine. As guideposts to the functions contained in each group, many labels fall short.
Step 3: 155 users were given the most common labels for each group, and asked to choose the most appropriate label for each group. The most popular label was used in the application:
- Delete, Forward, Reply, Reply to All: "Respond" (66%)
- Repeat, Next, Previous: "Listen to Messages" (83%)
- Mark Unread, Time and Date: "Message Details" (40%)
- List Recipients, Add Sender: "Distribution" (64%)
Again, survey participants were asked to do the wrong thing: they were asked to choose the "most appropriate" label, not which menu they thought a given function belonged under.
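The aggregation in Steps 2 and 3 amounts to a plurality vote over suggested labels. A minimal sketch (the helper name `plurality` is mine; the counts are the ones reported for the first group in Step 2):

```python
from collections import Counter

def plurality(votes):
    """Return the label with the most votes (ties broken arbitrarily)."""
    return Counter(votes).most_common(1)[0][0]

# Step 2 suggestion counts for the Delete/Forward/Reply/Reply-to-All group,
# as reported in the article.
suggestions = {"Action": 22, "Respond": 15}
print(plurality(suggestions))   # -> Action
```

Note that the most common Step 2 suggestion ("Action") is not the label that won the Step 3 vote ("Respond", at 66%), which underscores how much the framing of each survey round shapes the outcome.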
Surveys In Design
While user surveys are critical input to the design process, they cannot replace an experienced designer. Surveys are valuable tools, but require some intelligence and interpretation (and skepticism).
While I applaud the effort to gather data about user preferences, it's important to ask questions like "Does this make sense?" "Did we apply the survey properly?" and "Did we ask the right questions?"