UX benchmarking is the process of evaluating a product or service’s user experience by using metrics to gauge its relative performance against a meaningful standard. These metrics are usually collected using quantitative usability testing, analytics, or surveys.

Consider conducting a benchmarking study if you want to:

  • Track the overall progress of a product or service
  • Compare your UX against an earlier version, a competitor, an industry benchmark, or a stakeholder-determined goal
  • Demonstrate the value of UX efforts and your work

In a related article, we discuss when to benchmark. At a high level, benchmarking is a method to evaluate the overall performance of a product (and as such, is a type of summative evaluation). Thus, benchmarking studies tend to occur at the end of one design cycle, before the next cycle begins.

Benchmarking is often a program rather than a one-time activity: many organizations collect metrics repeatedly, as they go through successive releases of their designs. Benchmarking keeps teams accountable and documents progress in a measurable way.

Process Overview

In this article, we present a high-level, seven-step process for creating a benchmarking program. When first establishing the program, there will be some extra work to figure out what to measure and how. However, once you've determined the study structure, the process becomes fairly routine and involves much less work.

There are 7 steps to benchmarking your product's UX: (1) choose what to measure, (2) choose how to measure, (3) collect the first measurement, (4) redesign the product, (5) collect more measurements, (6) interpret the findings, and (7) calculate ROI (optional).
To conduct an end-to-end UX benchmarking study, first decide what you’re going to measure and which research method you’ll use to collect those metrics. Next, collect your first measurement, redesign the product, and collect an additional measurement. Then, compare and interpret your findings, and possibly calculate ROI. Once you’ve completed the initial end-to-end process, future iterations of your study (assuming that the context remains the same) can begin at step 4 (redesign the product).

Step 1: Choose What to Measure

Focus on the key metrics that best reflect the quality of the user experience you’re interested in evaluating. Look for metrics that translate to UX and organizational goals.

That said, before you determine which metrics to collect, you must define the context of your study. In other words, consider:

  • What product will you focus on? (website, application, etc.)
  • Which user group will you target?
  • What tasks or features do you want to measure?

Tasks

Figure out the top tasks that users complete in your product. If your organization doesn’t have existing top tasks, you can start by documenting (most) tasks in the product. Then, prioritize the list of tasks and select approximately 5–10 that are most important to your users.

The following list outlines multiple possible product and task scenarios. It includes just one task per product, but in real life you will probably focus on more than one task.

Smart-speaker app

  • Task: setting up a new smart speaker

E-commerce website

  • Task: making a purchase with 1-click purchasing

Mobile-banking website

  • Task: updating contact information

B2B-agency website

  • Task: submitting a lead form

Mobile-puzzle game

  • Task: solving one puzzle

Metrics

Now that you’ve focused on a set of tasks, how can you measure them? Google’s HEART framework provides a concise overview of the different types of metrics you may want to collect and track. The following is an adaptation of the HEART framework:

Happiness: measures of user attitudes or perceptions

  • Metric examples: satisfaction rating, ease-of-use rating, net promoter score

Engagement: level of user involvement

  • Metric examples: average time on task, feature usage, conversion rate

Adoption: initial uptake of a product, service, or feature

  • Metric examples: new accounts/visitors, sales, conversion rate

Retention: how existing users return and remain active in the product

  • Metric examples: returning users, churn, renewal rate

Task effectiveness and efficiency: efficiency, effectiveness, and errors

  • Metric examples: error count, success rate, time on task

Note that as an engagement metric, time on task should be high (e.g., a long time spent reading articles on a newspaper site), whereas as an efficiency metric, time on task should be low (e.g., fast to check out on an ecommerce site). In other words, the same change (say, longer time) could be either good or bad, depending on what type of use is measured.

Pick metrics that will matter for the long haul, since ideally, you’ll be collecting these metrics repeatedly over many years. Aim for 2–4 metrics that focus on different aspects of your UX (e.g., happiness and engagement).

Here are some possible metrics we might track for the tasks in our previous examples.

Smart-speaker app

  • Task: setting up a new smart speaker
  • Metrics: time on task, success rate, Single Ease Question (SEQ)

E-commerce website

  • Task: making a purchase with 1-click purchasing
  • Metrics: weekly sales with 1-click, 1-click feature adoption

Mobile-banking website

  • Task: updating contact information
  • Metrics: completion rate, errors on page, # of support calls on the same task

B2B-agency website

  • Task: submitting a lead form
  • Metrics: form submissions, abandonment rate

Mobile-puzzle game

  • Task: solving one puzzle
  • Metrics: success rate, returning users

Benchmarking a user experience isn’t just about tracking metrics; it’s also about demonstrating value. That’s much easier to accomplish when you select metrics that align with your organization’s key performance indicators (KPIs). For instance, in a bank where customer-support cost is a KPI, you may be able to show that a redesigned contact form contributed to decreased support costs by tracking the number of support calls before and after the redesign.

Step 2: Decide How to Measure

When determining the methodology for collecting your metrics, consider the time commitment the research method requires, its cost, the skill of the researchers involved, and the research tools available to you. Don’t attempt a method if you don’t have the right skills: bad numbers are worse than no numbers. Also, don’t specify a measurement plan that will be too expensive to sustain in the long term (because the entire idea of benchmarking is to repeat the measurement again and again).

Before you start planning a new study, see what existing data your organization has around the experience you want to measure. It can be extremely valuable to gain a holistic understanding of the experience and connect your UX metrics to larger organization goals. When requesting data from other sources, be sure to explain why it’s needed and how it will be used.

There are 3 research methods that work well for UX benchmarking: quantitative usability testing, analytics, and surveys.

  • Quantitative usability testing. Participants perform top tasks in a system, and researchers collect metrics (such as time on task, success rate, and satisfaction) that measure the users’ performance on those tasks.
  • Analytics. System-usage data (such as abandonment rates and feature adoption) is automatically gathered.
  • Surveys. Users answer questions to report their behavior, background, or opinions. Task ease, satisfaction ratings, and net promoter score are all metrics collected in surveys.

Ideally, you’ll pair a survey (to get self-reported metrics) with a behavioral, observational method (quantitative usability testing or analytics) to get a holistic view of the user experience.

Below, we’ve added a methodology to each of our previous scenarios.

Smart-speaker app

  • Task: setting up a new smart speaker
  • Metrics: time on task, success rate, Single Ease Question (SEQ)
  • Methodology: quantitative usability testing with survey

E-commerce website

  • Task: making a purchase with 1-click purchasing
  • Metrics: weekly sales with 1-click, 1-click feature adoption
  • Methodology: analytics, survey

Mobile-banking website

  • Task: updating contact information
  • Metrics: completion rate, errors on page, # of support calls on the same task
  • Methodology: analytics, internal customer-support data

B2B-agency website

  • Task: submitting a lead form
  • Metrics: form submissions, abandonment rate
  • Methodology: analytics

Mobile-puzzle game

  • Task: solving one puzzle
  • Metrics: success rate, returning users
  • Methodology: analytics

Step 3: Collect the First Measurement (Establish a Baseline)

Now that you’ve determined which metrics to collect and how to collect them, it’s time to gather your baseline metrics. (But not so fast — do a pilot study first to collect an initial sample of data and run a preliminary analysis to make sure your method is sound and that the data can answer your questions. Most likely, the pilot will make you revise your methodology, meaning that the initial data set should be discarded. But this is worth the investment in order to get sound results from your subsequent, bigger data-collection efforts.)

As you gather your first set of measurements, consider external factors that may affect your data and, when possible, plan around them. For instance, if you’re benchmarking an ecommerce website and using analytics to collect sales metrics, be wary of factors like extensive marketing campaigns or large-scale economic influences that can disrupt your metrics and make it difficult to attribute changes in outcomes to the design change.

One measurement of your site is not likely to be meaningful by itself. Even if you’ve just started your benchmarking program and you don’t have prior data to compare to, you can still make comparisons with competitors, an industry benchmark, or a stakeholder-determined goal. Below we provide examples of each.

  • Your competitor. For example, if your product is a smart-speaker app, you could benchmark the experience of setting up your product versus setting up a competing product. (To do so, you will likely have to collect data on your product and on competitors’ products, so the prior steps will have to take that into account. That said, you could not use analytics as your methodology, since you won’t have access to your competitor’s analytics.)
  • Industry benchmark. You may have access to external statistics pertaining to your field. For example, if you run a hotel website, you may want to compare your net promoter score (NPS) to the average for this industry, which is 13%.
  • Stakeholder-determined goal. For instance, your stakeholders say they want the average time to submit a lead form to be under 3 minutes, so you may want to compare your current performance to that threshold.

As you’re considering how to interpret the outcome of these comparisons, take into account the recommendations described in step 6.
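To make the stakeholder-goal comparison concrete, here is a rough sketch (with made-up data) of checking a baseline time-on-task measurement against the 3-minute lead-form goal, using a 95% confidence interval rather than the raw mean alone:

```python
import math

# Hypothetical times (minutes) for 8 participants to submit a lead form.
times = [2.1, 3.4, 2.8, 4.0, 2.5, 3.1, 2.9, 3.6]
n = len(times)
mean = sum(times) / n
sd = math.sqrt(sum((t - mean) ** 2 for t in times) / (n - 1))  # sample std dev
margin = 2.365 * sd / math.sqrt(n)  # 2.365 = t critical value for df = 7, 95%
low, high = mean - margin, mean + margin

goal = 3.0                         # stakeholder-determined threshold (minutes)
meets_goal = high < goal           # whole interval is below the goal
inconclusive = low < goal < high   # interval straddles the goal
```

With these particular (hypothetical) numbers the interval straddles the goal, so the baseline would be inconclusive and a larger sample would be needed before claiming the goal is met.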

Step 4: Redesign the Product

The redesign process is outside the scope of this article, though it’s an incredibly important part: without a redesign, you won’t be able to compare multiple versions of your product.

As you redesign your product, keep the 10 usability heuristics for interaction design in mind.

Step 5: Collect Additional Measurement

After your redesign is launched, measure your design again. There is no hard-and-fast rule on how long to wait after a launch before measuring again. If you’re tracking analytics, there’s an added benefit to continuous measurement. However, for task-based data collection, like quantitative usability testing and surveys, you’ll need to determine the right time to collect the data. Users often hate change, so give them a bit of time to adapt to the redesign before measuring it. The amount of time varies depending on how frequently users access your product. For products accessed daily, perhaps 2–3 weeks is enough time. For a product that users access once or twice a week, waiting 4–5 weeks before you measure is better.

As you consider the right time to measure your new design, once again document any potential external influences that may impact your findings.

Step 6: Interpret Findings

Now that you’ve gathered at least two data points, it’s time to interpret your findings. You shouldn’t take your metrics at face value since the sample used for your study is likely much smaller than the entire population of your users. For that reason, you will need to use statistical methods to see whether any visible differences in your data are real or due to random noise. In our course, How to Interpret UX Numbers: Statistics for UX, we discuss this topic in great detail.
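As one example of such a method, here is a minimal sketch of a two-proportion z-test on success rates. The counts below are hypothetical, and in practice a statistics library (or a statistician) is a safer choice than hand-rolled formulas:

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two success rates."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)           # pooled proportion
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))  # standard error
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via math.erf).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical counts: 21 of 30 users succeeded with the initial design (70%),
# 57 of 60 with the redesign (95%); the sample sizes are assumptions.
z, p = two_proportion_z_test(21, 30, 57, 60)
```

If the resulting p-value is below your significance threshold (commonly 0.05), the improvement in success rate is unlikely to be random noise.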

In general, interpreting your metrics is highly contextual to your product and the metrics you’ve chosen to collect. For example, a good time on task for an expense-reporting app is different from a good time on task for a mobile game. In the following, we outline one of the previously discussed scenarios and the interpretation of its findings.

Scenario: Setting up a Smart-speaker

Assume we used quantitative usability testing paired with a survey to collect time on task, success rate, and SEQ. The following hypothetical metrics describe our initial design and redesign:

  • Average time on task (minutes): initial design, 6.28; redesign, 6.32
  • Average success rate: initial design, 70%; redesign, 95%
  • Average SEQ (1 = very difficult, 7 = very easy): initial design, 5.4; redesign, 6.2
In summary, time on task was nearly the same, while success rate and average SEQ both increased. Let’s assume the increases in success rate and SEQ were statistically significant. In the redesign, then, users were more successful and found the setup process easier. In other words, the redesign was a success!

Step 7: Calculate ROI (Optional)

Benchmarking allows you to track your success and demonstrate the value of your work. One way to demonstrate the value of UX is to connect the UX metrics to the organization's goals and calculate return on investment (ROI). These calculations connect a UX metric to a key performance indicator (KPI) such as profit, cost, employee productivity, or customer satisfaction.

Calculating ROI is extremely beneficial, though not widely practiced by UX professionals (perhaps because relating your UX metric to a KPI is often convincing enough). In any case, if you struggle to prove UX impact, calculating ROI can be persuasive.
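As an illustration, a back-of-the-envelope ROI calculation for the banking scenario above might look like the following; every figure here is hypothetical, for illustration only:

```python
# Hypothetical figures, for illustration only.
redesign_cost = 20_000         # one-time cost of the contact-form redesign
calls_saved_per_month = 1_200  # support calls avoided after the redesign
cost_per_call = 12             # average cost of handling one support call

# Annual savings attributable (in this sketch) to the redesign.
annual_savings = calls_saved_per_month * cost_per_call * 12

# ROI: net benefit relative to the investment.
roi = (annual_savings - redesign_cost) / redesign_cost
print(f"First-year ROI: {roi:.0%}")  # prints "First-year ROI: 764%"
```

The hard part is not the arithmetic but defensibly attributing the change in the KPI to the design change, which is why documenting confounding factors (step 3 and step 5) matters.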

Presenting Benchmarking Findings

As you wrap up your analysis and share your findings with stakeholders, aim to tell a story with the data. Just because some members of your leadership love numbers doesn’t mean you can’t incorporate some qualitative findings or quotes from previous studies that align with your findings — this can be a great way to build empathy for your users among that data-driven audience.

Additionally, when presenting to stakeholders, be sure you have documented all of your assumptions and your study’s possible confounding variables. Though you may not have to comment on them directly, having them in an appendix of your presentation shows that you have a holistic understanding of the product environment and allows you to easily reference them, should any questions arise about the validity of your measurements.

Conclusion

Benchmarking is a fantastic tool to align and correlate UX efforts to overall organizational goals and outcomes. To conduct a benchmarking study, begin by focusing on important tasks or features in your product and determine how you can measure them. Next, select a research method that allows you to collect those metrics, given your time, budget, and skills. Collect your first measurement, redesign your product, and collect those metrics again under the same methodology. Finally, interpret your findings by comparing your collected data points and using your product and organization knowledge to make sense of it all.

Then, do it all again next year! (Or after the next release.) Hopefully, your numbers will be better, and if not, you’ll know where to focus efforts during the subsequent redesign.

References

K. Rodden, H. Hutchinson, X. Fu. “Measuring the User Experience on a Large Scale: User-Centered Metrics for Web Applications” (2010). Source: https://research.google/pubs/pub36299/