Treemaps are a data-visualization technique for large, hierarchical data sets. They capture two types of information in the data: (1) the value of individual data points; (2) the structure of the hierarchy.
Definition: Treemaps are visualizations for hierarchical data. They are made of a series of nested rectangles of sizes proportional to the corresponding data value. A large rectangle represents a branch of a data tree, and it is subdivided into smaller rectangles that represent the size of each node within that branch.
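To make the definition concrete, here is a minimal sketch of how nested rectangles can be derived from a data tree. It uses the simple slice-and-dice splitting (alternating horizontal and vertical cuts), which is only one of several possible layouts. The `Node` class, the `layout` function, and the sales figures are hypothetical names and data invented for this illustration, and all values are assumed to be positive.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    value: float = 0.0                      # only meaningful for leaves
    children: List["Node"] = field(default_factory=list)

    def total(self) -> float:
        """A branch is as large as the sum of the leaves below it."""
        if not self.children:
            return self.value
        return sum(child.total() for child in self.children)


def layout(node, x, y, w, h, depth=0, out=None):
    """Return {leaf name: (x, y, w, h)} using slice-and-dice splitting."""
    if out is None:
        out = {}
    if not node.children:
        out[node.name] = (x, y, w, h)
        return out
    total = node.total()                    # assumed positive
    offset = 0.0
    for child in node.children:
        share = child.total() / total
        if depth % 2 == 0:                  # even depth: cut along the width
            layout(child, x + offset, y, w * share, h, depth + 1, out)
            offset += w * share
        else:                               # odd depth: cut along the height
            layout(child, x, y + offset, w, h * share, depth + 1, out)
            offset += h * share
    return out


# Hypothetical sales data: two branches, three leaves.
tree = Node("sales", children=[
    Node("hardware", children=[Node("laptops", 30), Node("phones", 30)]),
    Node("software", children=[Node("apps", 40)]),
])
rects = layout(tree, 0, 0, 100, 100)
for name, (rx, ry, rw, rh) in rects.items():
    print(name, "area =", rw * rh)
```

In this sketch, the "hardware" branch occupies 60% of the canvas and is itself divided between its two leaves, so each leaf's area stays proportional to its value.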
Key Uses of Treemaps
Treemaps are commonly found on data dashboards. Designers often choose them to add visual variety on a dense dashboard. However, treemaps are a complex visualization and present many obstacles to quick comprehension (which is the main requirement for any information displayed on a dashboard).
Treemaps are often used for sales data, as they capture relative sizes of data categories, allowing for quick perception of the items that are large contributors to each category. Color can identify items that are underperforming (or overperforming) compared to their siblings from the same category. This is why FinViz’s Map of the Market is an enduring example of treemaps: it allows users to identify companies that are doing better than their industry peers, even though their overall stock value may be quite small.
Treemaps work well when your hierarchical data has 2 main dimensions that you want to visualize:
- A positive quantitative value, which will be expressed as the area of the rectangle. (Because area cannot be negative, you cannot use area to visualize variables such as gain/loss, which can take both positive and negative values.)
- A categorical or second quantitative value, which will be expressed as the color of the individual rectangles. If color expresses a quantitative value, we strongly encourage using only one color (if all the numbers are positive) or two colors (one for negative and one for positive) and varying the intensity of the color to express the magnitude of the value. Because humans don’t perceive colors as having an inherent order, we strongly recommend that you do not use multiple colors to represent a range of numbers.
If color represents a categorical variable, it is okay to use different colors for different possible values, as there’s no need for users to interpret a specific color as being “higher” or “lower” than another. However, as with any use of color in a data visualization, restraint in the number of colors is strongly advised!
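The advice on quantitative color (one hue for positive values, a second hue for negative values, with intensity carrying the magnitude) can be sketched as a small mapping function. The function names and the specific blue and red RGB values below are assumptions chosen for illustration, not a prescribed palette; any pair that remains distinguishable for color-blind users would do.

```python
def blend(base, strength):
    """Blend from white toward `base` (an RGB tuple); strength is in [0, 1]."""
    return tuple(round(255 + (channel - 255) * strength) for channel in base)


def value_to_color(value, max_abs, positive=(0, 90, 181), negative=(220, 50, 32)):
    """Map a number to a hex color: hue shows the sign, intensity the magnitude.

    `positive` and `negative` are illustrative hues (a blue and a red),
    chosen to avoid a red/green pairing.
    """
    if max_abs <= 0:
        raise ValueError("max_abs must be positive")
    strength = min(abs(value) / max_abs, 1.0)   # clamp values beyond max_abs
    hue = positive if value >= 0 else negative
    return "#{:02x}{:02x}{:02x}".format(*blend(hue, strength))


# Hypothetical daily changes, in percent.
for change in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(change, value_to_color(change, max_abs=4.0))
```

Values near zero come out close to white and the largest magnitudes use the full hue, so users only need to judge "stronger or weaker" within a single color rather than decode an arbitrary rainbow.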
Regardless of how you use color in a treemap, make the following accessibility accommodations for color-blind users:
- Avoid using both red and green in the same treemap (especially for values that need to be differentiated quickly).
- Use color palettes that are safe for color-blind people.
- Test your design with a tool that allows you to simulate a color-blind user’s experience.
- Use a secondary signal (such as text within the rectangle or appearing on hover) for the data aspect captured through color.
Here are a few more guidelines for creating usable treemaps:
- Visually distinct borders around higher-level categories help users identify the top-level groupings.
- High-contrast text ensures that people can read the labels inside the treemap rectangles.
- A visually distinctive selected state, reached when users hover (or tap) a rectangle, helps users confirm that they are looking at the right data point.
- Additional detail about a selected rectangle (appearing in an overlay), such as its name and the values of its variables, allows users to drill into the data.
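For the high-contrast text guideline, one common way to pick a readable label color is to compare contrast ratios as defined in the WCAG 2.x accessibility guidelines. The sketch below assumes that definition (relative luminance of sRGB colors, ratio from 1 to 21) and simply chooses black or white text for a given rectangle fill; the function names are hypothetical.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b) in 0..255."""
    def linear(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linear(channel) for channel in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (identical) to 21."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def label_color(fill):
    """Pick black or white text, whichever contrasts more with `fill`."""
    black, white = (0, 0, 0), (255, 255, 255)
    if contrast_ratio(fill, black) >= contrast_ratio(fill, white):
        return black
    return white


print(label_color((255, 255, 0)))   # a light fill
print(label_color((0, 0, 128)))     # a dark fill
```

A check like this matters most when fill color encodes data, because the same label style will otherwise be readable on some rectangles and illegible on others.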
Treemaps’ Downsides
Comparisons Are Difficult
Human brains are able to process certain visual information preattentively: attributes such as length can be grasped quickly and accurately, and values for such attributes can be compared with almost no cognitive effort. Unfortunately, area is not one of these preattentive attributes. Treemaps rely on area (and possibly color) to encode the value of a variable, and therefore, although treemaps can convey overall relationships in a large data set, they are not suited for tasks involving precise comparisons.
Inefficient for Data that Is Not Hierarchical
Treemaps should not be used if your data is not hierarchical: in those situations, they are functionally equivalent to a pie chart, simply showing a parts-to-whole relationship. (Pie charts are not great visualizations either: like treemaps, they rely on attributes that are not preattentive, namely area and angle. They should be used only to communicate that one or two items are much larger than the rest, and not for comparing relative sizes of the pie slices.)
Visually Overwhelming
Treemaps are often used to visualize very large data sets, with hundreds or thousands of items. This quantity of information can visually overwhelm users: the treemap becomes a sea of tiny rectangles, many too small to bear a text label. Furthermore, in complex treemaps, the overall hierarchy can easily become hard to discern. One solution is a cushion treemap, which uses shading to make each rectangle look “raised” in the center, like a cushion, tapering off toward the edges. This visual effect takes advantage of humans’ tendency to interpret this type of shading as a raised surface, making it faster to identify the different rectangles.
Not for Balanced Trees
Treemaps are also poor choices for data sets with items close in size (i.e., balanced trees). In these cases, the main purpose of a treemap (quickly identifying the largest items in a given category) becomes very difficult. Finally, the standard algorithm used to create a treemap attempts to make the rectangles as square as possible, in order to make size comparisons slightly easier and less error-prone. However, in interactive visualizations where change is shown over time, an artifact of this algorithm is that the rectangles may move around as their size changes. As a result, keeping track of a particular item over time becomes very difficult.
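The instability described above can be reproduced with a simplified sketch in the spirit of the "squarified" layout heuristic: items are placed largest first, in strips along the shorter side of the remaining space, and a strip is closed as soon as adding another item would worsen the worst aspect ratio. This is an illustrative, single-level simplification with hypothetical function names and data, not the exact algorithm of any particular library.

```python
def worst_ratio(areas, side):
    """Worst aspect ratio when `areas` share one strip laid along `side`."""
    thickness = sum(areas) / side
    ratios = []
    for area in areas:
        length = area / thickness
        ratios.append(max(thickness / length, length / thickness))
    return max(ratios)


def squarify(values, x, y, w, h):
    """Lay out {name: positive value} inside the rectangle (x, y, w, h)."""
    scale = w * h / sum(values.values())
    # Largest items first: this ordering is why rectangles can jump later.
    queue = sorted(((v * scale, name) for name, v in values.items()), reverse=True)
    rects = {}
    while queue:
        side = min(w, h)                    # strips run along the shorter side
        row = [queue.pop(0)]
        while queue:
            current = [area for area, _ in row]
            if worst_ratio(current + [queue[0][0]], side) > worst_ratio(current, side):
                break                       # the next item would make shapes worse
            row.append(queue.pop(0))
        thickness = sum(area for area, _ in row) / side
        offset = 0.0
        for area, name in row:
            length = area / thickness
            if w >= h:                      # vertical strip on the left edge
                rects[name] = (x, y + offset, thickness, length)
            else:                           # horizontal strip on the top edge
                rects[name] = (x + offset, y, length, thickness)
            offset += length
        if w >= h:
            x, w = x + thickness, w - thickness
        else:
            y, h = y + thickness, h - thickness
    return rects


# Hypothetical data: only C changes between the two snapshots.
before = squarify({"A": 50, "B": 30, "C": 20}, 0, 0, 100, 100)
after = squarify({"A": 50, "B": 30, "C": 60}, 0, 0, 100, 100)
print("A before:", before["A"][:2], "A after:", after["A"][:2])
```

Item A keeps the same value in both snapshots, yet its rectangle moves because C overtakes it in the size ordering. Layouts that preserve a fixed item order trade squarer shapes for this kind of stability.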
Alternatives to Treemaps
In many cases, treemaps can be replaced with bar charts (for data that have one quantitative and one categorical variable) or scatter plots (for data with two quantitative variables) that represent the variables of interest.
This process, however, requires an understanding of your users’ top tasks. For executives attempting to identify the products that have both a high sales volume and a large profit margin (in order to advertise them most aggressively), a 2D scatter plot would be better than a treemap. But if the user cares primarily about the overall sales, a sorted bar chart is a better choice than a treemap. (Sorting is often underappreciated, but it is one of the simplest ways of making it easy to identify the items with the biggest and smallest values.)
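As a small illustration of the sorted-bar-chart alternative, the sketch below sorts a set of categories by value and renders them as text bars. The `sorted_bars` function and the sales figures are hypothetical; a real dashboard would use a charting library, but the sorting step is the same.

```python
def sorted_bars(sales, width=40):
    """Return text bars, largest first, scaled so the top item spans `width`."""
    largest = max(sales.values())
    lines = []
    for name, value in sorted(sales.items(), key=lambda item: item[1], reverse=True):
        bar = "#" * round(width * value / largest)
        lines.append(f"{name:<10} {bar} {value}")
    return lines


# Hypothetical sales figures, deliberately listed out of order.
sales = {"Tablets": 120, "Laptops": 300, "Phones": 210}
lines = sorted_bars(sales)
print("\n".join(lines))
```

Because bar length is compared along a common baseline, the ranking and the rough ratios between categories can be read at a glance, which is the comparison a treemap makes hard.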
Summary
While treemaps can be useful for visualizing certain types of complex, hierarchical data sets, they are often hard to interpret. If using a treemap, visually separate the different high-level categories, avoid using multiple colors to express numeric values, and design with color-blind users in mind. Last, and most important, understand what your users need to do with your data and consider whether other visualizations (such as a bar chart or a scatter plot) could replace or augment the treemap.
References
Ben Shneiderman: “Tree visualization with tree-maps: 2-d space-filling approach,” ACM Transactions on Graphics, 11(1), 92-99 (1992).
Jarke J. van Wijk and Huub van de Wetering: “Cushion Treemaps: Visualization of Hierarchical Information,” IEEE Symposium on Information Visualization (INFOVIS’99), San Francisco (October 25-26, 1999).