Worklytics

Workplace Analytics Benchmark Report

Design system, visualization, and automation for the Worklytics Annual Benchmarks Report.

Context

Client

Worklytics is an industry leader in workplace analytics, helping organizations measure & continuously improve the way they work.

Prompt

How might we visualize workplace benchmarks, with humanity, clarity, and style?

Background

Data becomes “actionable” when it’s misaligned with our expectations. For example, knowing that employees spend 20 hours each week in meetings might be an interesting “fun fact,” but what can you do with it?

Is 20 hours good or bad? If bad, how bad? How urgently does it require your attention?

To answer this, you need to know what’s “normal.”

Benchmarks help you pair “20 hours per week of meeting time” with context like “compared to hundreds of thousands of similar office workers, 20 hours a week is an extreme outlier, higher than XX% of the population.” This makes it clear that 20 hours is a crazy amount of time and that your teams’ meeting habits might need some attention. This realization is the first step toward actions like improving meeting hygiene, adopting better collaboration tooling to encourage asynchronous communication, or instituting practices like “no meeting Wednesdays.”

Worklytics’ benchmarks provide this important extra context and make up one of the world’s leading datasets on “what’s normal?” at work. The Worklytics Benchmarks Report showcases these benchmarks and educates customers on how to use them in their own analysis.



Design



Challenge #1

What does “normal” look like?

When designing the main benchmark visualization, we needed to balance clarity and approachability, while playing nicely with Worklytics’ existing visual language. Fortunately Worklytics makes this easy. They’re not shy about leaning into more expressive charts like jitter plots, which feature prominently in their app and in their reporting.

Two sample slides, showing fake data, illustrating two chart types that are common to Worklytics reporting. On the left is a slide showing a stacked jitter plot, on the right is a slide showing a scatter plot with a regression line.
Worklytics reports already do a great job at “showing the data.” Not only do these more expressive charts create visual interest, they’re effective data design and support better decision making. The Benchmarks report needed to build on this foundation.

Plots like these are powerful for visualizing people analytics data because a) they have an easy, concrete visual metaphor (1 dot = 1 person), b) they promote better business decisions in a variety of scenarios, since they show the full range of outcomes, and c) they’re compact and easy to lay out in dense reporting.
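To make the metaphor concrete, here’s a minimal, self-contained matplotlib sketch of a stacked jitter plot. Everything in it (the teams, the meeting-hours data, the styling) is invented for illustration; it’s not Worklytics’ actual charting code.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Invented data: weekly meeting hours for two fake teams. 1 dot = 1 person.
teams = {
    "Team A": rng.normal(12, 3, size=80),
    "Team B": rng.normal(21, 5, size=80),
}

fig, ax = plt.subplots(figsize=(7, 3))
for row, (name, hours) in enumerate(teams.items()):
    # Random vertical jitter spreads the dots so they don't overplot.
    jitter = rng.uniform(-0.25, 0.25, size=hours.size)
    ax.scatter(hours, np.full(hours.size, row) + jitter, s=12, alpha=0.6)

ax.set_yticks(range(len(teams)))
ax.set_yticklabels(list(teams))
ax.set_xlabel("Meeting hours per week")
plt.show()
```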

Jitter plots have an important drawback though: Because they’re compact, they tend to hide the shape of the underlying data. Since the benchmarks are detailed enough to show the shape, and the shape is an important part of the story, we wanted to show it off. It’s also a great source of visual variety, which goes a long way toward differentiating metrics in a long report.

Insights

Show more data: For the benchmarks, showing the full distributions of data is important.


Wilmer & Kerns 2022 What’s Really Wrong With Bar Graphs of Mean Values: Variable and Inaccurate Communication of Evidence on Three Key Dimensions

Holder & Xiong 2022 Dispersion vs Disparity: Hiding Variability Can Encourage Stereotyping When Visualizing Social Outcomes

Hofman et al 2020 How Visualizing Inferential Uncertainty Can Mislead Readers About Treatment Effects in Scientific Results

While conventional business reporting favors simple-seeming charts (e.g. bar charts of averages), these overly simplistic visualizations are often misleading (Wilmer & Kerns 2022). Hiding outcome variability encourages misjudgments about the causal stories behind the data (Holder & Xiong 2022).

Related biases can also impact a variety of other business decisions, like overpaying for programs that only offer marginal improvements (Hofman et al 2020). More expressive charts like jitter plots or quantile dot plots avoid these issues by showing the full range of data.

An example of the main chart type used in the report, showing fake data. There are three columns. On the left it says Number of Cookies Eaten Per Day, 100% of the population, 1 dot = 1 person (out of 500). In the middle is a quantile dot plot showing the distribution of cookies eaten per day. The plot has 3 annotation lines representing the 25th, 50th, and 75th percentiles. There is also a brief explanatory text that says In a typical week, the median person eats 9.0 cookies per day. For most people this ranges between 7.0 (25th percentile) and 11 cookies eaten (75th percentile).
This shows the benchmark population as if it were 500 people and each dot is a single person (1 dot = 0.2% of the population). The blue vertical bar shows the median value (50th percentile). Darker dots are within the normal range, representing the middle 50% of the population (between the 25th and 75th percentiles). In a typical week, the median person eats 9.0 cookies per day. For most people, this ranges between 7 and 11 cookies eaten (the 25th and 75th percentiles).
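For the curious, the construction behind a chart like this is fairly compact. Below is a rough, self-contained matplotlib sketch of a quantile dot plot with a median bar and a highlighted middle 50%. The gamma-distributed fake data, dot counts, and colors are all assumptions for illustration, not the report’s actual rendering code.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
population = rng.gamma(shape=9.0, scale=1.0, size=100_000)  # fake "cookies/day"

# Summarize the distribution as 500 equally likely "people" (quantiles),
# so each dot stands for 0.2% of the population.
n_dots = 500
dots = np.quantile(population, (np.arange(n_dots) + 0.5) / n_dots)

# Bin the dots, then stack each dot on top of its bin-mates.
edges = np.histogram_bin_edges(dots, bins=60)
bin_idx = np.clip(np.digitize(dots, edges) - 1, 0, len(edges) - 2)
counts = np.bincount(bin_idx, minlength=len(edges) - 1)
x = ((edges[:-1] + edges[1:]) / 2)[bin_idx]          # each dot's bin center
y = np.concatenate([np.arange(c) for c in counts])   # each dot's stack height

p25, p50, p75 = np.quantile(population, [0.25, 0.50, 0.75])

fig, ax = plt.subplots(figsize=(8, 2.5))
inside = (x >= p25) & (x <= p75)                          # the "normal range"
ax.scatter(x[~inside], y[~inside], s=9, color="#c7cdd6")  # lighter tails
ax.scatter(x[inside], y[inside], s=9, color="#3b6fd4")    # darker middle 50%
ax.axvline(p50, color="#1d4ed8", linewidth=3)             # prominent median bar
ax.set_yticks([])
ax.set_xlabel("Cookies eaten per day")
plt.show()
```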


Kale et al 2020 Visual Reasoning Strategies for Effect Size Judgments and Decisions

Lead with the familiar: While averages and medians are only a small part of the story, they’re still the first thing people look for. By overlaying the bright blue median bar (p50=9.0) and giving it the most visual weight, we ensure the charts keep viewers in their comfort zones, meeting their immediate expectations and even minimizing certain types of decision bias (Kale et al 2020). Because the prominent median gets noticed first, the distributional detail is purely additive: it adds context without sacrificing the immediacy of a more familiar plot, like a bar chart. Detail doesn’t have to be distracting.

Holder & Xiong 2023 Polarizing Political Polls: How Visualization Design Choices Can Shape Public Opinion and Increase Political Polarization

Milkman et al 2021 Megastudies improve the impact of applied behavioural science

Allcott & Mullainathan 2010 Behavior and Energy Policy





Scott & Nowlis 2013 The Effect of Goal Specificity on Consumer Goal Reengagement

Benchmark ranges. The charts also highlight the interquartile range of the benchmark distributions, with blue dots and shading from the 25th to the 75th percentiles. Benchmarks aren’t goals, at least not necessarily. However, our research and others’ show that charts like these can be highly influential (Holder & Xiong 2023, Milkman et al 2021, Allcott & Mullainathan 2010): It’s human nature to shift our attitudes and behaviors to align with perceived social norms.

Presenting the benchmarks as a range of outcomes avoids being overly assertive about the importance of any particular point on the spectrum, a judgment best left to customers. At the same time, to the extent that organizations would like to shift toward the benchmarks, targeting a range of acceptable outcomes can better sustain long-term perseverance in behavior change than a goal defined as a single point value (Scott & Nowlis 2013).


Kay et al 2016 When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems

Proven approachability: Quantile dot plots help data-shy audiences understand variability and uncertainty. These charts are well-studied within the VIS community and reliably effective, even for helping random people at a bus stop predict uncertain bus arrivals (Kay et al 2016). This works because the individual dots are concrete and (potentially) countable, which affords a simple but powerful visual metaphor: you can read the chart by imagining each dot is a person, and they’re all lined up according to their outcomes.
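The countable dots also do quantitative work: with a fixed number of equally likely dots, probability estimates reduce to counting. A toy sketch (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.gamma(9.0, 1.0, size=100_000)   # fake benchmark data
dots = np.quantile(population, (np.arange(500) + 0.5) / 500)

# Each dot is 1/500th (0.2%) of the population, so "how likely is more
# than 15 cookies a day?" is just counting the dots past 15.
past_15 = (dots > 15).sum()
print(f"{past_15} of 500 dots -> ~{past_15 / 500:.1%} of people")
```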


Kong et al 2019 Trust and Recall of Information across Varying Degrees of Title-Visualization Misalignment

Good dataviz is good writing. Even with proven charts, data-literacy issues can put insights out of reach for some audiences. For this reason, it’s always good to provide detailed “how to read this chart” explainers. It’s also good to tell the same story in multiple ways, both visually and in writing. This has the added bonus of aiding memorability, as people remember soundbites from chart titles better than the charts themselves (Kong et al 2019).

Challenge #2

Information Overload.

In addition to benchmarking the overall population, Worklytics also provides benchmarks for eight specific subgroups like frontline managers, software engineers, or people who work at huge corporations. While this enables customers to make more “apples to apples” comparisons, it adds quite a bit of density.

Insights

Small multiples. Aligning the plots vertically into small multiples gives viewers enough space to consider each subpopulation individually, while also making it easy to compare across rows to see how metrics differ between groups.
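In matplotlib terms, this kind of vertical alignment mostly falls out of stacked subplots with a shared x-axis. A rough sketch with invented subgroups and data:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Invented subgroups and gamma means, standing in for the real benchmarks.
groups = {
    "Overall Population": 9.0,
    "Individual Contributors": 8.5,
    "Frontline Managers": 10.0,
    "Senior Leaders": 14.0,
}

# sharex puts every row on the same scale, so comparing subgroups
# is a straight vertical eye movement down the page.
fig, axes = plt.subplots(len(groups), 1, sharex=True, figsize=(7, 6))
for ax, (name, mean) in zip(axes, groups.items()):
    values = rng.gamma(mean, 1.0, size=500)
    ax.scatter(values, rng.uniform(0, 1, size=500), s=6, alpha=0.5)  # simple strip
    ax.set_yticks([])
    ax.set_title(name, loc="left", fontsize=9)
axes[-1].set_xlabel("Cookies eaten per day")
plt.show()
```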







Franconeri et al 2021 The Science of Visual Data Communication: What Works

Blue normal range anchor. To further facilitate between-row comparisons, we extended the normal range for the overall population all the way down the page (and onto the second page) as the soft blue band in the background. This makes it easier to compare subpopulations to the overall norm without bouncing your eyes up and down the page (Franconeri et al 2021). As an added bonus, it also serves as a pleasant structural element on each page, guiding your eye toward the most critical content.
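The band itself is simple to reproduce: compute the overall population’s 25th and 75th percentiles once, then paint that same span behind every row. A minimal sketch, again with invented data:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
overall = rng.gamma(9.0, 1.0, size=100_000)      # fake overall population
p25, p75 = np.quantile(overall, [0.25, 0.75])    # computed once, up front

rows = {
    "Overall Population": overall,
    "Individual Contributors": rng.gamma(8.5, 1.0, size=2_000),
    "Senior Leaders": rng.gamma(14.0, 1.0, size=2_000),
}

fig, axes = plt.subplots(len(rows), 1, sharex=True, figsize=(7, 4))
for ax, (name, values) in zip(axes, rows.items()):
    # The same soft blue band behind every row: the overall normal
    # range stays in view as you read down the page.
    ax.axvspan(p25, p75, color="#3b6fd4", alpha=0.12, zorder=0)
    ax.scatter(values[:300], rng.uniform(0, 1, size=300), s=5, alpha=0.4)
    ax.set_yticks([])
    ax.set_title(name, loc="left", fontsize=9)
plt.show()
```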

Full two-page spread for the (fake) metric: # Cookies Eaten Daily. The first row is the Overall Population, representing 100% of people in the benchmark dataset. The large blue-gray band in the background shows the normal range of this overall population (you can see it aligns with the p25 and p75 values for the “Overall Population”). Each following row shows the distribution for a specific sub-population. For example, individual contributors are 80% of the population, so that row shows dots for 400 people (excluding a few outliers who fall outside the plot range). The final two plots (right page, bottom) include a time series for understanding seasonality in cookie-eating, and the table at the bottom serves the reference use case, for viewers needing to cite a specific benchmark value.

Dot counts as differentiation. To reinforce the idea that each row represents a distinct subgroup of the overall population, and to convey how dramatically these subgroups can differ in size, the number of dots on each row is proportional to the size of the subgroup. For example, there are only a handful of dots on the Senior Leaders row, because senior leaders make up only a small proportion of people within a typical organization.
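Concretely, each row’s dot budget can be derived from its subgroup’s population share. A tiny sketch with hypothetical shares (the real proportions come from the benchmark dataset):

```python
# Hypothetical subgroup shares; the overall row uses 500 dots, and each
# subgroup row gets a dot count proportional to its population share.
TOTAL_DOTS = 500
shares = {
    "Individual Contributors": 0.80,
    "Frontline Managers": 0.15,
    "Senior Leaders": 0.05,
}

for name, share in shares.items():
    print(f"{name}: {round(TOTAL_DOTS * share)} dots")
# -> Individual Contributors: 400 dots ... Senior Leaders: 25 dots
```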

Strict Baseline Grid. To minimize visual noise, each chart and all text elements were carefully aligned against a consistent grid, both vertically and against a text baseline. This gave us room to pack in more information while avoiding a cluttered feeling.

A screenshot of the top section of the page with grid guidelines showing.
The bottom row of dots aligns with the top of the plot and with the baselines of each percentile annotation. Matplotlib put up a good fight to prevent this, but we made it happen!
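One way to win that fight is to bypass matplotlib’s automatic layout and pin each axes to explicit figure coordinates snapped to a baseline unit. A simplified sketch; the page size, baseline increment, and positions below are invented, not the report’s actual grid constants:

```python
import matplotlib.pyplot as plt

FIG_W, FIG_H = 8.5, 11.0   # page size in inches (assumed)
BASELINE = 0.18            # baseline grid increment in inches (assumed)

def snapped_axes(fig, left_in, bottom_in, width_in, height_in):
    """Add axes whose vertical position and extent snap to the baseline grid."""
    bottom_in = round(bottom_in / BASELINE) * BASELINE
    height_in = round(height_in / BASELINE) * BASELINE
    # add_axes takes [left, bottom, width, height] as figure fractions.
    return fig.add_axes([left_in / FIG_W, bottom_in / FIG_H,
                         width_in / FIG_W, height_in / FIG_H])

fig = plt.figure(figsize=(FIG_W, FIG_H))
row_1 = snapped_axes(fig, 1.0, 8.0, 6.5, 1.3)  # chart rows land on the grid...
row_2 = snapped_axes(fig, 1.0, 6.0, 6.5, 1.3)  # ...and stay aligned to each other
plt.show()
```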

Challenge #3

Where’s the action?

While the metric pages were designed to minimize clutter and overload, the scope of the report added another challenge: it’s 89 pages long and covers 35 metrics, and each metric includes eight profiles and twelve charts.

Benchmarks make data actionable, but with this much material, how do we guide viewers toward “the action?” How can we use the report to demonstrate the types of comparisons that make this data valuable?

Insights

Always be educating. Introductory material sets up the rest of the report for success. We expect that most viewers will quickly flip past it on their way to the main content, but even skimming, they can build a gut sense of what to expect from the report and pick up some exposure to the visual language. And, as questions pop up, they’ll know exactly where to look first.

A collage of four page designs, showing the introduction, the table of contents, a guide on 'how to use this report', and an infographic explaining the composition of subpopulations.
In addition to the explainer text that shows up on each metric page, introductory material like the “How to use this report” can help viewers get the most out of the report.

Follow the blue path. The blue band through each page represents the benchmark range for the overall population. This element does a lot of work within each page (e.g. it’s a visual anchor, as well as a reference for the charts), but it also works between pages. As viewers navigate from section to section and metric to metric, the blue band shifts positions horizontally, giving each section a unique fingerprint, while indicating transitions between metric sets and previewing their distributions.

Stylized mockups of multiple report pages showing the different placements of the blue anchor ribbon
The blue benchmark range served as both a reference for comparing vertically stacked plots, as well as a source of visual differentiation across each page, giving each metric a subtle visual fingerprint.

Action in the outliers. Benchmarks are actionable because they highlight data that doesn’t match expectations. They reveal outliers in the organization, which represent the biggest opportunities either for improving stale processes (e.g. senior leaders getting first dibs on cookies) or finding exceptional teams worth emulating (e.g. the lower tail of senior leaders who eat a reasonable amount of cookies).

So the best way for analysts to use the benchmark data is to look for places where their organization and the benchmarks are misaligned, then dig deeper to figure out why. Because the overall population acts as a “benchmark of benchmarks,” we’re able to demonstrate this approach in the report itself: showing each group’s normal range in high-contrast blue makes it easy to spot the parts of each curve that are misaligned with the overall population, modeling a technique that Worklytics’ customers can apply in their own analysis.
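As a toy illustration of that workflow, here’s how an analyst might flag teams whose medians fall outside a benchmark’s normal range (all values invented):

```python
import numpy as np

rng = np.random.default_rng(0)
benchmark = rng.gamma(9.0, 1.0, size=100_000)    # fake benchmark population
p25, p75 = np.quantile(benchmark, [0.25, 0.75])

team_medians = {"Team A": 8.7, "Team B": 15.2}   # fake org-internal medians
for team, value in team_medians.items():
    pctile = (benchmark < value).mean() * 100
    status = ("within the normal range" if p25 <= value <= p75
              else "misaligned, worth digging into why")
    print(f"{team}: {value} cookies/day (~{pctile:.0f}th percentile), {status}")
```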

Isolated stack of plots showing distributions for the overall population, individual contributors, frontline managers, and senior leaders. It shows senior leaders eating quite a lot of cookies compared to the overall norm.
The designs invite viewers to look for misaligned outcome distributions and attend to outliers. For example, in the chart above we can see that the median (p50) Senior Leader eats an exceptional number of cookies. But that’s not the case for every Senior Leader: the left tail shows ten of the Senior Leader dots fall within the normal range. They are exceptional for Senior Leaders, but normal for everyone else. Why is that? Is this left tail of senior leaders not eating enough cookies? Or maybe their outcomes suggest that excessive cookie consumption isn’t strictly necessary for effective leadership? Spotting misalignment and outliers enables more incisive snacking analysis, and therefore deeper insights overall.

Results

The report is live here:

The Worklytics Benchmark Report, Version 2.


