In 1967, the Russian psychologist Alfred Lukyanovich Yarbus watched people as they looked at the same oil painting with different goals in mind. He noticed that the eye-gaze movements depended on the activity being performed and he concluded that people attended to those areas of the scene that were more likely to contain relevant information for the current task.

Our recent eyetracking studies build on Yarbus’s research, and reinforce the idea that tasks greatly impact user behavior on the web and therefore drastically change the outcome of eyetracking gazeplots and heatmaps. Better get the tasks right, or any eyetracking you do will be more misleading than helpful for driving your design.

Methodology

In usability eyetracking studies, we uncover the most helpful and realistic findings for improving a website when we allow people to click, search, and type as much as they want, since this is how they normally use the web. In other words, the users have to try to actually do something realistic with the website or application, as opposed to being told to “just take a look at this page.” Doing something, of course, almost always requires users to move between different screens where they don’t know in advance which pages will be useful to them.

However, because this study’s goal was to determine how users examine the same webpage as they attempt different tasks on that page, we adjusted our recommended eyetracking methodology: We gave users a task and, instead of allowing them to navigate by themselves to the page of interest, we took them to that page, allowed them to complete the task on the page, then closed the page for them. We examined a few different pages, including 2 pages that we will discuss in this article: a page displaying women’s dresses on Bebe.com, and a page displaying vacation packages on jetBlue.com.

One page in our study included images of dresses on www.bebe.com.

 

Another page in our study included a list of vacation packages offered on www.jetBlue.com.

For each page, we gave users several different tasks. (The Bebe tasks will be described later.) These were the tasks given on the jetBlue page:

  1. Take a look at the page. (Test facilitator closes the page after 8 seconds.)
  2. To which places do these getaways go?
  3. Which place looks the nicest to you?
  4. Which is the least expensive?
  5. Which is the highest rated?
  6. Imagine you were going to get a quiz about this page. Study the page enough that you feel you could pass the quiz.

To be clear: the approach we used in this study is not how we recommend that you test your website. The goal of our research was to capture the effect of the various test tasks, not to improve the websites. If Bebe or jetBlue had hired us for a consulting job, we would have run very different studies.

Gaze Plots

In eyetracking studies, researchers record participants’ eye movements and track the order and duration of their gazes (or fixations). Then they aggregate the number of fixations, fixation durations, or sequence of fixations in graphical representations such as gaze plots and heatmaps. In this article we illustrate our findings using gaze plots.

A gaze plot is an overlay superimposed on a static screenshot that demonstrates where one or more users looked on that page. The elements in a gaze plot include the following:

  • Dots = fixations: In a gaze plot, each dot represents one gaze (or fixation). In other words, a dot says that the user looked at that spot on the screen (or closely around it). Most eyetracking usability studies operate under the eye–mind hypothesis — namely, if a user looked at an item, that item was attended and processed cognitively. That said, an attended item may still not be properly interpreted or remembered. It’s important to understand that the dots show the only parts of the page that the user saw sharply enough to read text or understand the details in images.
  • Larger dots = longer fixations: The size of the dot is (roughly) proportional to the duration of the corresponding fixation. Thus, larger dots represent a longer time spent looking at that location on the page. Long fixations signal that the user spent more time processing the corresponding item, because 1) she is interested in it, or 2) she is confused and has a hard time understanding what it means.
  • Numbers = order: The numbers within the dots represent the order in which the user looked at the items on the page.
  • Lines = saccades between fixations: Each dot (fixation) is connected to the fixation preceding it and to the one following it. Thus, the lines between dots make it easier to follow the eye movements (called saccades) and the sequence of the fixations. (The numbers provide enough information for determining the order, but, without lines, it becomes too challenging to hunt for the next fixation.) Because the eyes move extremely fast between fixations, the person is effectively blind during an eye movement and doesn’t register the visual landscape that the gaze speeds across.
This gaze plot from a study participant doing all the given tasks on jetBlue’s vacation-package page includes more than 440 fixations.
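
The gaze-plot elements listed above map onto a small data structure. The following is a minimal Python sketch, not the format of any particular eyetracking tool: the `Fixation` class, the `gaze_plot_elements` function, the `ms_per_px` scaling factor, and the sample coordinates are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float            # horizontal position on the screenshot, in pixels
    y: float            # vertical position on the screenshot, in pixels
    duration_ms: float  # how long the eye rested at this spot


def gaze_plot_elements(fixations, ms_per_px=10):
    """Turn a time-ordered list of fixations into dots and connecting lines."""
    dots = [
        {
            "order": i,                             # number shown inside the dot
            "center": (f.x, f.y),
            "radius": f.duration_ms / ms_per_px,    # larger dot = longer fixation
        }
        for i, f in enumerate(fixations, start=1)
    ]
    # Each line joins a fixation to the one that follows it (one saccade).
    saccades = [(a["center"], b["center"]) for a, b in zip(dots, dots[1:])]
    return dots, saccades


# Made-up sample data: three fixations in the order they occurred.
sample = [Fixation(120, 80, 200), Fixation(400, 90, 450), Fixation(410, 300, 150)]
dots, saccades = gaze_plot_elements(sample)
```

A plot with N fixations therefore has N numbered dots and N - 1 connecting lines; a drawing layer would render these on top of the page screenshot.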

Interpreting a User’s Fixations on a Page

The gaze plot of the jetblue.com page above includes more than 440 fixations on various page elements, ranging from vacation listings to UI components such as logo and navigation. Where did the user look and why?

We could come up with several reasonable interpretations, such as:

  • The participant looked in the area in the upper left a few times to see the logo and confirm the site she was on.
  • She looked at the navigation within the first few moments on the page to get a sense of what was offered on the site.
  • Many long fixations on each package name, image, rating, price, and short description possibly indicate that she is interested in all of them and trying to decide which to pick.
  • After she looked at everything on the page, she checked the footer to see what else was offered on the site.

Each of these possible interpretations is logical. But, because we know nothing about what the user was trying to achieve, these assumptions are unfounded. Simply zooming in on the top of the gaze plot shows that the fixations there carry high numbers, so they did not happen within the user’s first few moments on the site; the second interpretation is therefore wrong. But that’s about all we can conclude from looking at the static image alone.

Fixations on the menu and logo happened relatively late in the task, after more than 200 fixations on other places on the page.
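
The check described above (did the user look at the navigation early or late?) amounts to finding the order number of the first fixation inside an area of interest. Here is a minimal Python sketch under stated assumptions: the function name, the rectangle format, and all coordinates are hypothetical and do not come from the study data.

```python
def first_fixation_in(area, fixation_points):
    """Return the 1-based order of the first fixation inside the rectangle.

    `area` is (left, top, right, bottom) in pixels; `fixation_points` is a
    time-ordered list of (x, y) positions. Returns None if the area was
    never fixated.
    """
    left, top, right, bottom = area
    for order, (x, y) in enumerate(fixation_points, start=1):
        if left <= x <= right and top <= y <= bottom:
            return order
    return None


# Hypothetical rectangle covering a navigation bar at the top of the page.
NAV_AREA = (0, 0, 1000, 60)

# Made-up fixations: two in the content area, then one on the navigation.
points = [(500, 300), (520, 420), (300, 30)]
first_on_nav = first_fixation_in(NAV_AREA, points)
```

A high value means the element was first looked at late in the visit, which rules out any interpretation that depends on an early glance.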

Watching the user’s full eyetracking session, not just studying gaze plots, is the best way to uncover meaningful insights. Watching slow-motion replays enabled us to create more telling gaze plots that represent time segments on the page (as opposed to the entire visit on the page) related to the task the user was doing at the time. We’ll examine those in this article.
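
Splitting one long recording into per-task gaze plots can be sketched as a simple grouping by time window. This is a minimal Python illustration, not the procedure of any specific eyetracking package: `split_by_task`, the tuple layout, and the timestamps are assumptions (only the 8-second length of the first task comes from the study description).

```python
def split_by_task(fixations, task_windows):
    """Group time-stamped fixations by the task that was active at that time.

    `fixations` is a list of (timestamp_s, x, y) tuples; `task_windows` maps
    a task label to a (start_s, end_s) pair, with the end exclusive.
    Fixations that fall outside every window are dropped.
    """
    segments = {task: [] for task in task_windows}
    for t, x, y in fixations:
        for task, (start, end) in task_windows.items():
            if start <= t < end:
                segments[task].append((t, x, y))
                break
    return segments


# Hypothetical windows; the facilitator notes when each task starts and ends.
windows = {"task 1": (0.0, 8.0), "task 2": (8.0, 20.0)}

# Made-up recording: the last fixation happens after both tasks ended.
recording = [(0.4, 610, 220), (7.9, 90, 40), (8.0, 300, 210), (25.0, 10, 10)]
segments = split_by_task(recording, windows)
```

Each entry in `segments` can then be drawn as its own gaze plot, so that fixations from different goals are not mixed in one image.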

Task and User Motivation Affect Eye Movements

Task 1: Free Examination (“Take a Look at the Page”)

The first task was simply to look at the page. The task did not specify an amount of time, but the facilitator closed the page after 8 seconds. This task tested the concept of “free examination.” Free examination is rare in the real world — people usually visit websites for a reason. In usability testing, free-examination tasks should be reserved for special cases when you want to understand how users behave when they’re simply interested in your company or brand. (Even then, people are usually trying to satisfy an information need — for example, to see what’s new, or what the company does.) The main reasons free-examination tasks are not usually recommended in usability testing are: 1) they are unrealistic; and 2) they cause users to study the page more carefully than they normally would, and, as a result, may bias their behavior in subsequent tasks.

User behavior. In these first few seconds on the page, the user looked at the largest piece of text in the content area (the price of the first vacation), at the corresponding thumbnail, and then she moved her gaze to the upper left corner of the page. That spot is traditionally reserved for the company logo, but on jetBlue’s page the logo was centered. Centered logos are unexpected and harder to locate.

After not finding the logo, the user returned to the content area and began to scan the location names.

The gaze plot for the task “Take a look at the page.” The image is cropped to exclude areas where the user did not fixate.

Task 2: Scan Section Titles (“To Which Places Do These Getaways Go?”)

The second task was to find something more specific: the locations of the vacations. For this task and the ones that follow, we will describe the information on the page that was necessary to complete the task, the user’s behavior, and, if applicable, the design elements that supported the task or hindered it.

Information needed to complete the task:

  • the names of all the destinations
  • the destination thumbnail images, which may have been a secondary source of information

User behavior. Once the user learned the look of the destination names and how much vertical space existed between them, she quickly adapted her scanning to extract the information needed for the task (i.e., destination names) as efficiently as possible, without wasting extra fixations. Although occasionally the user glanced at secondary elements, such as the thumbnail and the description, the majority of her gazes were directed at the destination names. Our participant was able to complete the task in 38 fixations.

The gaze plot that shows her eye movements exhibits the layer-cake pattern of scanning, in which users scan headings and subheadings but don’t read the information below them, usually because the headings contain enough information either to answer their question or to indicate that the text under the heading is not relevant for answering it. The physical scan path resembles the horizontal sheets found in a layer cake. (This and other scanning patterns are described in our How People Read on the Web: The Eyetracking Evidence report.) The scanning pattern in the gaze plot is an instance of efficient scanning: it is highly focused on the current task and ruthlessly ignores content unrelated to the user’s goal.

The gaze plot for the task “To which places do these getaways go?” The image is cropped to exclude areas where the user did not fixate.

Design elements that support the task:

  • consistent presentation of the different vacation packages
  • large, bold text for location names, juxtaposed against the smaller description text
  • vertical white space between vacation packages
  • light, thin, subtle grey lines separating vacation packages
  • consistent vertical spacing between different package names

All these design elements allowed the user to quickly figure out how to find the most important piece of information needed to complete the task: the destination names.
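
One way to put a number on how focused a scan was is to label each fixation with the kind of page element it landed on and compute the share per label. The sketch below is a hedged Python illustration: the labels and counts are made up for the example and are not the participant's data, and `fixation_share` is a hypothetical helper.

```python
from collections import Counter


def fixation_share(labels):
    """Return the fraction of fixations that landed on each element type."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Made-up labels for ten fixations during a "find the destinations" task.
labels = ["destination name"] * 8 + ["thumbnail"] + ["description"]
share = fixation_share(labels)

# A share well above one half for headings is consistent with a layer-cake
# scan; the 0.5 cutoff is an arbitrary choice for this illustration.
looks_like_layer_cake = share["destination name"] > 0.5
```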

Task 3: Scan the Images (“Which Place Looks the Nicest to You?”)

The third task involved gathering impressions about each destination.

Information needed to complete the task:

  • the thumbnail photographs
  • the location name for the nicest place

User behavior. Again, the user quickly limited her scanning only to information that was needed to complete the task. Each thumbnail got 1–6 fixations, and some destination names (presumably of places that looked nice to her) also got fixations. When the user didn’t care for the photo — as with those for Grand Cayman, Charleston, and Fort Lauderdale — she didn’t bother reading the location name. And she scanned nothing else on the page. She was able to complete the task in 37 fixations. (However, had the images been clearer, she likely would have been able to succeed with even fewer fixations.)

The gaze plot for the task “Which place looks the nicest to you?” The image is cropped to exclude areas where the user did not fixate.

Design elements that hindered the task:

  • photographs that were too small given the amount of detail in them
  • inconsistent thumbnail image types, with different subjects, camera angles, times of day, and level of detail

Task 4: Scan the Prices (“Which Is the Least Expensive?”)

The second vacation package, Saint Maarten, cost $349 and was the least expensive.

Information needed to complete the task:

  • price for each location
  • location name for the cheapest destination

User behavior. Our study participant confidently scanned all prices on the page, then she scrolled up and gazed at the name of the second destination in the list to find her answer. That is highly efficient scanning. She completed the task in 28 fixations.

The gaze plot for the task “Which is the least expensive?” The image is cropped to exclude areas where the user did not fixate.

Design elements that supported the task:

  • price size larger than all the text items in the content area
  • whitespace surrounding prices
  • short numbers
  • bold font for prices
  • same price position for each location in the list

All these elements made the prices easy to locate.

Task 5: Scan Details (“Which Is the Highest Rated?”)

Information needed to complete the task:

  • star ratings
  • location name for the highest-rated destination

User behavior. Our participant optimized her scanning procedure along the way, as she learned about the structure of the page: she started by looking at all the stars in the ratings appearing early in the list, but, as the scan progressed, she understood that all locations got at least 4 stars, so she fixated on only the stars to the right. Finally, she determined that Charleston, the 7th entry, and Aruba, the last entry, each held a 5-star rating. She completed the task in 36 fixations.

The gaze plot for the task “Which is the highest rated?” The image is cropped to exclude areas where the user did not fixate.

Design elements that supported the task:

  • conventional pattern for ratings (stars)
  • star icons easily distinguishable from the other items in the destination description
  • consistent positioning of items across destinations

All these made the stars easy to locate.

Design elements that hindered the task:

  • small stars
  • little visual difference between dark blue and light blue stars, or between half-full stars and full stars

These issues required the user to spend more time on each star rating to parse the details.

Task 6: Motivated Reading (“Study the Page Enough That You Feel You Could Pass a Quiz About This Page”)

This task required the user to pay attention to all the information presented on the page and attempt to memorize it so that it could be recalled later.

Information needed to complete the task:

  • everything on the page

User behavior. Our participant looked at navigational elements and at the content, and in fact she fixated many of these elements multiple times, presumably trying to memorize them. She used 228 fixations to complete the task.

As we saw in the previous tasks, for the sake of efficiency, users scan web pages, focusing only on the relevant bits of content. But when their motivation or engagement is high (or when all the content on the page is relevant, as in this task), they might read almost everything. This commitment can be simulated in a lab setting by telling people that they will be quizzed about the content of the page.
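
To make the contrast between scanning and motivated reading concrete, the fixation counts reported for tasks 2 through 6 can be compared directly. This short Python sketch only does arithmetic on the numbers quoted in this article; the dictionary keys and variable names are illustrative.

```python
# Fixation counts reported above for one participant on the jetBlue page.
fixations_per_task = {
    "task 2: destinations": 38,
    "task 3: nicest place": 37,
    "task 4: least expensive": 28,
    "task 5: highest rated": 36,
    "task 6: study for a quiz": 228,
}

# Tasks 2-5 are the focused scanning tasks; task 6 is motivated reading.
scanning_counts = [
    n for task, n in fixations_per_task.items() if not task.startswith("task 6")
]
mean_scanning = sum(scanning_counts) / len(scanning_counts)

# How many times more fixations the quiz task took than an average scan.
reading_to_scanning = fixations_per_task["task 6: study for a quiz"] / mean_scanning
```

For this one participant, the scanning tasks averaged just under 35 fixations each, so the quiz task took roughly six and a half times as many. A single participant's counts illustrate the size of the effect; they are not a statistical estimate.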

The gaze plot for the task “Imagine you were going to get a quiz about this page. Study the page enough that you feel you could pass the quiz.” The image is cropped to exclude areas where the user did not fixate.

Test Your Eyetracking-Analysis Assumptions

Our analysis of the gaze plots for the jetBlue tasks probably gave you some insights into how users adjust eye movements to respond to the task. Let’s see if you can apply these to a new set of data.

Below is a list of 3 tasks we asked people to try on www.bebe.com, and 3 gaze plots (labeled A through C) of segments from one user’s visit to the same page.

Your job: Match the task to the corresponding gaze plot.

  1. Which dress is the prettiest?
  2. Estimate the average age of the models.
  3. What is the price range for the dresses?

 

A.

B.

C.

The answers appear at the end of this article.

What Did We Learn from This Study?

The jetBlue examples and the Bebe quiz should have convinced you that Yarbus’s findings hold for web reading: Users fixate on those elements on the page that are relevant for their task. The same page will be processed differently by the same user when her goal changes.

And, more importantly, this behavior is another embodiment of the principle of least effort and minimum interaction cost: in performing an activity, people will try to be as efficient as possible and will avoid spending any unnecessary effort. Like our study participant, they will always try to find an optimum algorithm for getting the information that they need without working unnecessarily hard.

Design for Efficient Scanning

The 6 jetBlue tasks discussed above encapsulate 3 general scanning behaviors:

  1. getting the lay of the land (task 1)
  2. comparing items (tasks 2–5)
  3. motivated reading (task 6)

The first two are most common on the web, and indeed all our Bebe tasks are also comparison tasks. Designers should support users in their attempt to minimize the amount of effort involved in reading and extracting relevant information from a webpage. As we saw in the previous gaze plots, predictable, unambiguous patterns help users arrive at an optimal scanning algorithm quickly and allow them to easily focus on the essential, while skipping those content elements that are unnecessary for their goals. Remember that sometimes too much variation or detail (as with the star ratings and photographs) can slow down the eye, make the task harder, and ultimately increase user frustration.

Here are some design recommendations for supporting efficient scanning of list pages:

  1. Be consistent with the position and layout of items in the list.
  2. Use short, recognizable words and digits when possible.
  3. Use large, bold text and white space surrounding it to attract the eye and showcase the most important information.
  4. Consider using comparison tables to support this behavior.

If displaying photos:

  1. Present consistent types of photos.
  2. Present a level of detail that is recognizable given the size of the photos.

If presenting star ratings:

  1. Consider showing a number with the star ratings.
  2. Make the selected number of stars very simple to glean in one fixation. For example, use a saturated color filling with a thick outline for the selected stars, and white filling and a medium outline for the stars that are not selected. Use high color contrast between selected and not selected stars (and remember color blindness when choosing colors).
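
One possible way to check the color-contrast recommendation above is the contrast-ratio formula from the Web Content Accessibility Guidelines (WCAG). That formula is brought in here as an assumption; it is not a method used in this study. The hex colors below are hypothetical, and the 3:1 threshold is the WCAG minimum for non-text elements rather than a figure from this article.

```python
def _linear(channel_8bit):
    """Convert one 0-255 sRGB channel to linear light (WCAG definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as '#rrggbb'."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


# Hypothetical star colors: a saturated fill for selected stars and a
# white fill for unselected ones.
SELECTED_FILL = "#1a4f9c"
UNSELECTED_FILL = "#ffffff"
stars_distinct_enough = contrast_ratio(SELECTED_FILL, UNSELECTED_FILL) >= 3.0
```

A luminance-based ratio does not cover every form of color blindness, so it complements rather than replaces checking the design with a color-blindness simulator or with real users.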

Tasks in Usability Studies Focus Test Participants on Particular Site Areas

One of the greatest drawbacks of giving people tasks to perform in a usability test is that tasks affect behavior. Even the act of asking people to do an activity usually clues them in that the activity is possible and that the answer exists somewhere on the site. If the users had not been given a task, they might never have discovered certain site features related to that task.

If tasks change how people look at and interact with a design, then why give a task at all in a usability test? Although giving tasks has some drawbacks, it also has many benefits:

  • Most important, if users don’t have a reason to use a website or application, then they would just flail around to no purpose, which is a completely unrealistic way of approaching most designs. It’s better if we present people with a purpose than to waste our time observing a style of use that doesn’t happen in real life and has no business value.
  • If the tasks reflect our audience’s main user needs and goals, then it makes sense to ask study participants to perform those tasks in the lab, so we can optimize the design for those high-priority activities.
  • Tasks allow teams to focus on sections of the design that are new or important for the business. Task-based studies enable us to fix issues before a design goes live, and learn from the research so we can become better at predicting which new designs will work for users and why.

Clearly formulated tasks enable us to figure out what design elements work for which user goals. Still, it’s helpful to employ a variety of research methods in addition to task-based usability tests. Particularly useful is open-ended observation of users in their natural environment (e.g., field studies, contextual inquiry), where they have their own reasons to use the computer or phone. But these research methods are not practical for all projects, budgets, development schedules, or design stages. If we did a field study every time we wanted to see users interact with a design, we could spend days watching people engage in site activities that are irrelevant for our current design project. And these methods don’t work for sketches and prototypes that need to be tested early in the design lifecycle. Also, when researchers observe users in the field, they may not have a clear understanding of the users’ goals and motivations.

When designing a usability study or analyzing user data, remember that tasks may change user behavior: they may tell users that a piece of information or functionality is available in the design, leading them to work harder than if they had not suspected that the task was doable.

(Learn much more about how to design good test tasks in our full-day Usability Testing course.)

Eyetracking Requires Realistic Tasks

While good test tasks are important for traditional usability testing, the tasks are essential for the validity of an eyetracking study. A heatmap of where users looked while performing a nonrealistic task will be misleading. Any design decision made on the basis of such data is more likely to hurt your business than to improve your metrics. Write good tasks for the most useful eyetracking studies.

 

Quiz Answer

These are the answers to the quiz in this article:

  1. Which dress is the prettiest? Image C.

After a few initial fixations needed to locate the dresses on the page, the user looked directly at the dresses. Consistent size and layout of the dress images made it easy for the user to find the dresses. Unlike the jetBlue photos, the dress images are consistent and each shows a model standing in front of a plain light background.

  2. Estimate the average age of the models. Image B.

Most of the fixations were on the faces of the models, which would be most telling of their age. But there were also a few fixations on the body, which can also be expressive of age. Again, the consistent pictures used in the design helped the user quickly locate the faces.

  3. What is the price range for the dresses? Image A.

The first 2 fixations were used to find the way to the prices; then the user scanned only the text below the images, where the prices appear. The consistent positioning of the prices made the task easy. But, unlike the large jetBlue prices, the Bebe prices were small and positioned too close to the dress name, forcing the user to spend more time or more fixations to register the information.