UX benchmarking is the process of evaluating a product or service’s user experience by using metrics to gauge its relative performance against a meaningful standard. These metrics are usually collected using quantitative usability testing, analytics, or surveys.
Consider conducting a benchmarking study if you want to:
- Track the overall progress of a product or service
- Compare your UX against an earlier version, a competitor, an industry benchmark, or a stakeholder-determined goal
- Demonstrate the value of UX efforts and your work
In a related article, we discuss when to benchmark. At a high level, benchmarking is a method to evaluate the overall performance of a product (and as such, is a type of summative evaluation). Thus, benchmarking studies tend to occur at the end of one design cycle, before the next cycle begins.
Benchmarking is often a program rather than a one-time activity: many organizations collect metrics repeatedly, as they go through successive releases of their designs. Benchmarking keeps teams accountable and documents progress in a measurable way.
Process Overview
In this article, we present a high-level, seven-step process for creating a benchmarking program. When first establishing the program, you'll need to do some extra work up front to figure out what to measure and how. However, once you've determined the study structure, the process becomes fairly repeatable and involves much less work.
Step 1: Choose What to Measure
Focus on the key metrics that best reflect the quality of the user experience you’re interested in evaluating. Look for metrics that translate to UX and organizational goals.
That said, before you determine which metrics to collect, you must define the context of your study. In other words, consider:
- What product will you focus on? (website, application, etc.)
- Which user group will you target?
- What tasks or features do you want to measure?
Tasks
Figure out the top tasks that users complete in your product. If your organization doesn't have existing top tasks, you can start by documenting most of the tasks in the product. Then, prioritize the list of tasks and select approximately 5–10 that are most important to your users.
The table below outlines multiple possible product and task scenarios. It includes just one task per product, but in real life you will probably focus on more than one task.
| Product | Possible task |
| --- | --- |
| Smart-speaker app | Setting up a new smart speaker |
| Ecommerce website | Making a purchase with 1-click purchasing |
| Mobile-banking website | Updating contact information |
| B2B-agency website | Submitting a lead form |
| Mobile puzzle game | Solving one puzzle |
Metrics
Now that you've narrowed in on a set of tasks, how can you measure them? Google's HEART framework provides a concise overview of the different types of metrics you may want to collect and track. The following table is an adaptation of the HEART framework:
| Category | Description | Example metrics |
| --- | --- | --- |
| Happiness | Measures of user attitudes or perceptions | Satisfaction rating, ease-of-use rating, Net Promoter Score |
| Engagement | Level of user involvement | Average time on task, feature usage, conversion rate |
| Adoption | Initial uptake of a product, service, or feature | New accounts/visitors, sales, conversion rate |
| Retention | How existing users return and remain active in the product | Returning users, churn, renewal rate |
| Task effectiveness and efficiency | Efficiency, effectiveness, and errors | Error count, success rate, time on task |
Note that as an engagement metric, time on task should be high (e.g., a long time spent reading articles on a newspaper site), whereas as an efficiency metric, time on task should be low (e.g., fast to check out on an ecommerce site). In other words, the same change (say, longer time) could be either good or bad, depending on what type of use is measured.
Pick metrics that will matter for the long haul, since ideally, you’ll be collecting these metrics repeatedly over many years. Aim for 2–4 metrics that focus on different aspects of your UX (e.g., happiness and engagement).
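To keep those choices consistent across measurement cycles, it can help to write the plan down in a structured form. Below is a minimal sketch in Python; the product, task, and metric names are hypothetical, and the final checks simply encode the advice above (2–4 metrics spanning more than one HEART category):

```python
# Hypothetical benchmarking plan; the product, task, and metrics are
# illustrative examples, not prescriptions.
from dataclasses import dataclass, field

@dataclass
class BenchmarkPlan:
    product: str
    task: str
    metrics: dict = field(default_factory=dict)  # metric name -> HEART category

plan = BenchmarkPlan(
    product="Smart-speaker app",
    task="Setting up a new smart speaker",
    metrics={
        "time_on_task": "Task effectiveness and efficiency",
        "success_rate": "Task effectiveness and efficiency",
        "seq_rating": "Happiness",  # Single Ease Question, 1-7 scale
    },
)

# Sanity checks: 2-4 metrics, covering more than one HEART category
assert 2 <= len(plan.metrics) <= 4
assert len(set(plan.metrics.values())) > 1
```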
Here are some possible metrics we may track for the tasks in our previous example.
| Product | Task or feature | Metrics |
| --- | --- | --- |
| Smart-speaker app | Setting up a new smart speaker | Time on task, success rate, Single Ease Question (SEQ) |
| Ecommerce website | Making a purchase with 1-click purchasing | Weekly sales with 1-click, 1-click feature adoption |
| Mobile-banking website | Updating contact information | Completion rate, errors on page, number of support calls on the same task |
| B2B-agency website | Submitting a lead form | Form submissions, abandonment rate |
| Mobile puzzle game | Solving one puzzle | Success rate, returning users |
Benchmarking a user experience isn't just about tracking metrics; it's also about demonstrating value. That's much easier to accomplish when you select metrics that align with your organization's key performance indicators (KPIs). For instance, in a bank where customer-support cost is a KPI, you may be able to show that a redesigned contact form contributed to decreased support costs by tracking the number of support calls before and after the redesign.
Step 2: Decide How to Measure
When determining the methodology for collecting your metrics, consider the time commitment the research method requires, the cost of that method, the skill of the researchers involved, and the research tools available to you. Don't attempt a method if you don't have the right skills, since bad numbers are worse than no numbers. Also, don't specify a measurement plan that will be too expensive to sustain in the long term (because the entire idea of benchmarking is to repeat the measurement again and again).
Before you start planning a new study, see what existing data your organization has around the experience you want to measure. It can be extremely valuable to gain a holistic understanding of the experience and connect your UX metrics to larger organization goals. When requesting data from other sources, be sure to explain why it’s needed and how it will be used.
There are three research methods that work well for UX benchmarking: quantitative usability testing, analytics, and surveys.
- Quantitative usability testing. Participants perform top tasks in a system, and researchers collect metrics (such as time on task, success rate, and satisfaction) that measure the users' performance on those tasks.
- Analytics. System-usage data (such as abandonment rates and feature adoption) is gathered automatically.
- Surveys. Users answer questions to report their behavior, background, or opinions. Task ease, satisfaction ratings, and Net Promoter Score are all metrics commonly collected in surveys.
Ideally, you’ll pair a survey (to get self-reported metrics) with a behavioral, observational method (quantitative usability testing or analytics) to get a holistic view of the user experience.
The table below maps methodologies onto our previous scenarios.
| Product | Task or feature | Metrics | Methodology |
| --- | --- | --- | --- |
| Smart-speaker app | Setting up a new smart speaker | Time on task, success rate, Single Ease Question (SEQ) | Quantitative usability testing with survey |
| Ecommerce website | Making a purchase with 1-click purchasing | Weekly sales with 1-click, 1-click feature adoption, Net Promoter Score | Analytics, survey |
| Mobile-banking website | Updating contact information | Completion rate, errors on page, number of support calls on the same task | Analytics, internal customer-support data |
| B2B-agency website | Submitting a lead form | Form submissions, abandonment rate | Analytics |
| Mobile puzzle game | Solving one puzzle | Success rate, returning users | Analytics |
Step 3: Collect the First Measurement to Establish a Baseline
Now that you’ve determined which metrics to collect and how to collect them, it’s time to gather your baseline metrics. (But not so fast — do a pilot study first to collect an initial sample of data and run a preliminary analysis to make sure your method is sound and that the data can answer your questions. Most likely, the pilot will make you revise your methodology, meaning that the initial data set should be discarded. But this is worth the investment in order to get sound results from your subsequent, bigger data-collection efforts.)
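Because a single round of data collection is noisy, it's good practice to record each baseline metric with a confidence interval rather than as a bare number. Here is a minimal sketch in Python; the success counts are hypothetical, and the adjusted-Wald interval used here is one common choice for small-sample success rates:

```python
# Baseline success rate with a 95% adjusted-Wald confidence interval.
# The counts below are hypothetical baseline data.
import math

successes, n = 14, 20  # 14 of 20 participants completed the task

z = 1.96  # z-value for 95% confidence
n_adj = n + z**2
p_adj = (successes + z**2 / 2) / n_adj
margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)

print(f"Baseline success rate: {successes / n:.0%}")
print(f"95% CI: {max(0, p_adj - margin):.0%} to {min(1, p_adj + margin):.0%}")
```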
As you gather your first set of measurements, consider external factors that may affect your data and, when possible, plan around them. For instance, if you're an ecommerce website using analytics to collect sales metrics for benchmarking, be wary of factors like extensive marketing campaigns or large-scale economic influences that can disrupt your metrics and make it difficult to attribute outcomes to the design change.
One measurement of your site is not likely to be meaningful by itself. Even if you’ve just started your benchmarking program and you don’t have prior data to compare to, you can still make comparisons with competitors, an industry benchmark, or a stakeholder-determined goal. Below we provide examples of each.
- Your competitor. For example, if your product is a smart-speaker app, you could benchmark the experience of setting up your product against setting up a competing product. (To do so, you will likely have to collect data on your product and on competitors' products, so the prior steps will have to take that into account. Note that you could not use analytics as your methodology here, since you won't have access to your competitors' analytics.)
- Industry benchmark. You may have access to external statistics pertaining to your field. For example, if you run a hotel website, you may want to compare your Net Promoter Score (NPS) to the industry's average NPS, which is 13. (NPS is straightforward to compute from survey responses; see the sketch after this list.)
- Stakeholder-determined goal. For instance, your stakeholders say they want the average time to submit a lead form to be under 3 minutes, so you may want to compare your current performance to that threshold.
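For reference, here is a minimal Python sketch of the NPS calculation mentioned above; the survey responses are hypothetical:

```python
# Net Promoter Score from 0-10 "how likely are you to recommend" responses.
# The responses below are hypothetical survey data.
responses = [10, 9, 9, 8, 7, 6, 10, 3, 9, 8, 10, 5, 9, 7, 10]

promoters = sum(1 for r in responses if r >= 9)   # ratings of 9-10
detractors = sum(1 for r in responses if r <= 6)  # ratings of 0-6

# NPS = %promoters - %detractors; ranges from -100 to +100
nps = 100 * (promoters - detractors) / len(responses)
print(f"NPS: {nps:.0f}")
```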
As you’re considering how to interpret the outcome of these comparisons, take into account the recommendations described in step 6.
Step 4: Redesign the Product
The redesign process is outside the scope of this article, though it’s an incredibly important part: without a redesign, you won’t be able to compare multiple versions of your product.
As you redesign your product, keep the 10 usability heuristics for interaction design in mind.
Step 5: Collect Additional Measurement
After your redesign is launched, measure your design again. There is no hard-and-fast rule for how long to wait after a launch before measuring again. If you're tracking analytics, you get the added benefit of continuous measurement. However, for task-based data collection, like quantitative usability testing and surveys, you'll need to determine the right time to collect the data. Users often hate change, so give them a bit of time to adapt to the redesign before measuring it. The amount of time varies depending on how frequently users access your product. For products accessed daily, perhaps 2–3 weeks is enough time. For a product that users access once or twice a week, waiting 4–5 weeks before you measure is better.
As you consider the right time to measure your new design, once again document any potential external influencers that may impact your findings.
Step 6: Interpret Findings
Now that you’ve gathered at least two data points, it’s time to interpret your findings. You shouldn’t take your metrics at face value since the sample used for your study is likely much smaller than the entire population of your users. For that reason, you will need to use statistical methods to see whether any visible differences in your data are real or due to random noise. In our course, How to Interpret UX Numbers: Statistics for UX, we discuss this topic in great detail.
In general, interpreting your metrics is highly contextual to your product and the metrics you've chosen to collect. For example, time on task for an expense-reporting app means something different than time on task for a mobile game. Below, we outline one of the previously discussed scenarios and an interpretation of the findings.
Scenario: Setting Up a Smart Speaker
Assume we used quantitative usability testing paired with a survey to collect time on task, success rate, and SEQ. The following table outlines hypothetical metrics for our initial design and redesign.
| Metric | Initial design | Redesign |
| --- | --- | --- |
| Average time on task (minutes) | 6.28 | 6.32 |
| Average success rate | 70% | 95% |
| Average SEQ (1 = very difficult, 7 = very easy) | 5.4 | 6.2 |
In summary, time on task was nearly the same, while success rate and average SEQ both increased. Let's assume the increases in success rate and SEQ were statistically significant (a sketch of such a test appears below). In the redesign, users were therefore more successful and more satisfied with the setup process. In other words, the redesign was a success!
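As an illustration of the statistical check mentioned above, here is a minimal two-proportion z-test on the success rates, written in Python. The sample sizes (40 participants per round) are hypothetical, since the scenario doesn't state them:

```python
# Two-proportion z-test on success rates (70% of 40 vs. 95% of 40).
# Sample sizes are hypothetical; the scenario does not state them.
import math
from scipy.stats import norm

x1, n1 = 28, 40   # initial design: 70% success
x2, n2 = 38, 40   # redesign: 95% success

p_pool = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (x2 / n2 - x1 / n1) / se
p_value = 2 * norm.sf(abs(z))  # two-tailed

print(f"z = {z:.2f}, p = {p_value:.4f}")  # p < .05 -> significant difference
```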
Step 7: Calculate ROI (Optional)
Benchmarking allows you to track your success and demonstrate the value of your work. One way to demonstrate the value of UX is to connect the UX metrics to the organization's goals and calculate return on investment (ROI). These calculations connect a UX metric to a key performance indicator (KPI) such as profit, cost, employee productivity, or customer satisfaction.
Calculating ROI is extremely beneficial, though not widely practiced by UX professionals (perhaps because relating your UX metric to a KPI is often convincing enough). In any case, if you struggle to prove UX impact, calculating ROI can be persuasive.
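To make the ROI arithmetic concrete, here is a minimal sketch based on the bank contact-form example from step 1. Every number below is made up for illustration:

```python
# Back-of-the-envelope ROI for a hypothetical contact-form redesign.
# All figures are invented for illustration.
calls_before, calls_after = 1200, 800   # monthly support calls on this task
cost_per_call = 8.00                    # fully loaded cost of one call, USD
redesign_cost = 25_000                  # design + development cost, USD

monthly_savings = (calls_before - calls_after) * cost_per_call
annual_benefit = 12 * monthly_savings

# ROI = (benefit - investment) / investment
roi = (annual_benefit - redesign_cost) / redesign_cost
print(f"Annual benefit: ${annual_benefit:,.0f}")  # $38,400
print(f"ROI: {roi:.0%}")                          # 54%
```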
Presenting Benchmarking Findings
As you wrap up your analysis and share your findings with stakeholders, aim to tell a story with the data. Just because some members of your leadership love numbers doesn’t mean you can’t incorporate some qualitative findings or quotes from previous studies that align with your findings — this can be a great way to build empathy for your users among that data-driven audience.
Additionally, when presenting to stakeholders, be sure to have documented all of your assumptions and possible confounding variables of your study. Though you may not have to directly comment on them, having them in an appendix of your presentation shows that you have a holistic understanding of the product environment and allows you to easily reference it, should any questions arise about the validity of your measurements.
Conclusion
Benchmarking is a fantastic tool for aligning UX efforts with overall organizational goals and outcomes. To conduct a benchmarking study, begin by focusing on important tasks or features in your product and determine how you can measure them. Next, select a research method that allows you to collect those metrics, given your time, budget, and skills. Collect your first measurement, redesign your product, and collect those metrics again under the same methodology. Finally, interpret your findings by comparing your collected data points and using your product and organization knowledge to make sense of it all.
Then, do it all again next year! (Or after the next release.) Hopefully, your numbers will be better, and if not, you’ll know where to focus efforts during the subsequent redesign.
References
K. Rodden, H. Hutchinson, and X. Fu. 2010. "Measuring the User Experience on a Large Scale: User-Centered Metrics for Web Applications." Proceedings of CHI 2010. https://research.google/pubs/pub36299/