Data Visualization
Evan F. Sinar
DDI
Copyright 2018
Society for Human Resource Management and Society for Industrial and Organizational Psychology
accessible, clear, enticing and engaging to business audiences. Yet, despite the recent
approaches. Thus, they miss golden opportunities to expand their influence and
audience reach.
Why Visualization Now?
The historical roots of data visualization extend back centuries: from the early 1700s-
known for creating data-rich geographic representations, most notably his stunning
map of Napoleon’s 1812 Russian Campaign) to the foundations built by (still active) late
20th-century practitioners William Cleveland and Edward Tufte. However, the past
several years have seen an explosion of prominence for data visualization within the
business community.
I see three main trends driving this upswell: big data availability, data-driven
sort through and extract insights from these large-scale data is increasingly
valuable. Data visualization taps into our formidable visual processing skills to
Visualization techniques also pair well with other facets of data science. These
techniques serve as a “face” for the advanced machine learning and artificial
intelligence methods in expanded use in many organizations with the data gathered
mandate—for data to drive strategic and tactical choices. This has hugely increased
the proportion and number of business professionals being asked to make decisions
organizations can involve a broader range of experts to explore and extract value
about big data’s impact has faded, replaced—particularly for employee data—by an
orientation toward data-gathering that guards against bias and unethical data use.
“Black box” analytical models resulting from complex data science procedures are
also facing increased scrutiny because they are largely impenetrable to lay
openness, engagement and interactivity. It also provides a more direct view of the
raw data, counteracting the potential for biased interpretations on the part of the
researchers and giving the audience an opportunity to see the underlying data for
themselves.
strategy—visualization is a vital area of proficiency to fuel success. Yet, it’s not typically
In data gathered in 2017 (DDI, The Conference Board & EY, 2018), we found that
storytelling techniques. This is despite the utility these methods have in linking an
level—for example, where HR professionals are responsible for growing others in these
surfaces and clarifies data trends. Visualized forms of data also guide audiences
to contribute their observations and ideas about why and how patterns exist.
states but rapidly gaining momentum. Three recent research studies provide a
promising foundation for visualization’s advantages over more traditional forms of
business communication.
information was more persuasive and resulted in greater attitude change than identical
information presented using tables. Al-Kassab and his research team (2014) found that
making decisions. And a 2015 study by Kernbach, Eppler and Bresciani demonstrated
higher attention to, agreement with and recall of business strategies when presented
visually.
Foundational Issues
Objectives and Business Questions. Data visualization is most effective when well-
aligned with a business question and sought-after action. That is, what is the goal of the
analysis and resulting visualization? Five common visualization objectives and example
deficiencies.
employee level and turnover risk score category to guide retention efforts.
4. Illustrate change over time—for example, training completions by course and by
What Data are Best to Visualize? While there is no one “right” kind of data to
visualize, certain data characteristics are better for taking advantage of visualization’s
strengths.
populations) are more suited to visualization than low-volume data. Over-time data—
segmented into hierarchical, nested categories such as region, department and job
title. Geographical, map-based data can also be represented visually using techniques
Of course, baseline standards for data quality and veracity must be met for
Audience and Format. Before creating any data visualization, consider your
audience and format(s) for presentation. What’s the audience’s familiarity with
visualized data? What’s worked well (and what hasn’t) for this audience in the past?
Ultimately, what insights, decisions and actions are you looking to drive with your
audience?
Answers to these questions will guide what types of visualizations you use, as
well as the level of detail you share. A group of senior executives may expect a more
may seek additional detail causing you to include more such information in the
their orientation toward data visualization to avoid creating a single view of the data
need to be higher-resolution, with larger text and visual elements, and less detailed
than visualizations shared in a printed report. Slide presentations can have their own
challenges—factors such as room lighting, screen size and projector resolution can have
visualizations in the same room and with the same A/V setup as the final presentation.
enlarging graphics, adding call-out boxes to highlight key data trends) to ensure that
Visualization Techniques
I recommend three overarching techniques to increase the accuracy, interpretability
and persuasiveness of the data visualizations you create: contrast, annotate and
sequence.
Contrast. Use contrast within the visualization to draw attention to the most
notable segments of or patterns within the data. This can be done by using focal colors
important to note that creating contrast also involves deemphasizing less important
data elements. This may be because they aren’t statistically significant, are too small to
be meaningful or simply aren’t the focus of a specific part of the presentation (but may
be given focus at a later point). This is best done by using gray to fade nonfocus
from the program used to create them. Post-creation annotations are often essential
for clarifying your message for the audience. Annotations include labeling the
visualization not merely with a descriptive title (e.g., “Applicant volume over time by
region”). Create a title that drives action (e.g., “Recruitment staff needed in the
know the background for notable visualization features such as a spike in the trend line
in a particular quarter. “Callout” annotations to layer in this context will help the
Annotations can also be used to note data provenance such as the source and
date for the data. This pre-empts basic clarification questions from the audience and
helps build confidence in data quality. When used properly, annotations can improve
faces, and help establish a common frame of reference and understanding for the
discussion time to focus less on basic questions and more on sophisticated discussions
components in a larger narrative that you’re building and sharing with the audience.
storytelling principles. For example, consider Freytag's Pyramid (see Figure 1) —from
early “exposition” use within a presentation to introduce and set up the topic, to next
highlight “rising action” as supporting facts are revealed, to the “climax” revealing the
primary insight, to “falling action” showing how the conflict is addressed, and finally to
Figure 1: Freytag’s Pyramid
Source: https://www.clearvoice.com/blog/freytags-pyramid-using-classic-storytelling-techniques-successful-marketing/
(Bach, Riche, Carpendale, & Pfister, 2017): progressive panel-by-panel (or slide-by-
slide) views shifting from one to an alternate view of the same data, changing the time
scale (one year to the next) or moving from a higher to a lower level of detail. See
Source: http://aviz.fr/~bbach/datacomics/Bach2017datacomics.pdf
(orienting the audience to each step before moving on) is also a counterweight to
pressures based on the overstated assumption that “the audience needs to be able to
understand a visualization in five seconds!” Taken literally, this approach will lead to
simplistic data views (and weaker impressions about the presenter). Rather than
deploying visualization to simplify the complex (often futile or misleading given the
complex.
effects, gridlines, borders and shadows draw focus from the data, can reduce accuracy
truncating the Y-axis for column charts (which can make even small changes appear
massive); using dual Y-axes (which can show a false correlation between two variables).
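To make the Y-axis truncation risk concrete, the short sketch below compares how tall two columns are drawn relative to each other when the axis starts at zero versus at a truncated baseline. The function name `apparent_ratio` and the two scores are hypothetical and chosen only for illustration; they are not taken from any data set discussed in this paper.

```python
def apparent_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn column heights for values a and b
    when the Y-axis starts at `baseline` instead of zero."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)


# Hypothetical values: two scores of 80 and 84 (a 5% difference).
true_ratio = apparent_ratio(80, 84)            # Y-axis starts at 0
truncated_ratio = apparent_ratio(80, 84, 78)   # Y-axis starts at 78

print(f"Zero baseline: second column is {true_ratio:.2f}x the first")
print(f"Axis truncated at 78: second column is {truncated_ratio:.2f}x the first")
```

With these assumed numbers, a 5% difference is drawn as a column three times as tall once the axis is cut off at 78, which is the kind of exaggeration the caution above refers to.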
whenever possible. This is preferable to relying on axes and captions, forcing them to
oscillate back and forth across the visualization. Also, provide common baselines—for
example, aligning bar segments for different groups by their left edge—to aid
Data Visualization Versus Infographics. Infographics are related to data
visualization. Yet, the two forms of information presentation are distinct in several key
aspects.
sometimes at the expense of data accuracy. Data visualizations are typically defined as
objective in their message. Infographics are often made up of several individual data
share many of the same effective design principles (e.g., use of color; graphical
the patterns illustrated by a data visualization, the various graphical features used to
denote data are far from equal and should be prioritized accordingly.
accuracy (Figure 3). Features higher on the list are better suited as visual design
features and lead to higher interpretive accuracy than those lower on the list.
For example, for quantitative interval/ratio data, varying data element positions
will be more accurately interpreted than varying their length, which, in turn, will be
It’s important to understand these distinctions when making design choices for
distinctions within the data. Avoid those lower in the list unless necessary (Figure 3).
recommended over others. For example, because volume differences are interpreted
less accurately than those based on area or length, 3D graphics are rarely appropriate.
Using Color. Color is both a powerful and problematic graphical property for
representing data. Many visualizations used in business settings use color excessively or
One particular risk area is the use of “rainbow” color scales. These fail to
recognize that visual distinctions are not consistent across the color spectrum, leading
2013).
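One way to see the inconsistency in rainbow scales is to compute the perceived brightness of each step. The sketch below is my own illustration, not a procedure from the cited source. It assumes the standard sRGB relative-luminance formula and uses Python's built-in `colorsys` module. The helper names, the nine-step scale length, and the hue, saturation and lightness values are all hypothetical choices.

```python
import colorsys


def relative_luminance(rgb):
    """Relative luminance of an sRGB color with channels in [0, 1]."""
    def linearize(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def rainbow_scale(n):
    """Hue sweep from red toward violet at full saturation."""
    return [colorsys.hsv_to_rgb(0.8 * i / (n - 1), 1.0, 1.0) for i in range(n)]


def single_hue_scale(n, hue=0.6, saturation=0.6):
    """One fixed hue; lightness steps from light (0.9) to dark (0.2)."""
    return [
        colorsys.hls_to_rgb(hue, 0.9 - 0.7 * i / (n - 1), saturation)
        for i in range(n)
    ]


def is_monotonic(values):
    """True if the values strictly rise or strictly fall."""
    pairs = list(zip(values, values[1:]))
    return all(x < y for x, y in pairs) or all(x > y for x, y in pairs)


rainbow_lum = [relative_luminance(c) for c in rainbow_scale(9)]
single_lum = [relative_luminance(c) for c in single_hue_scale(9)]

print("rainbow brightness ordered:", is_monotonic(rainbow_lum))
print("single-hue brightness ordered:", is_monotonic(single_lum))
```

Under these assumptions, brightness rises and falls several times across the rainbow scale, so equal steps in the data do not look like equal steps to the viewer, while the single-hue scale darkens steadily from one end to the other.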
than hue for continuous scales. A second color-related risk area is use of palettes that
don’t account for colorblindness. Online resources exist for checking graphics for
colorblindness suitability and for identifying color schemes that accurately represent
types, which also build on the five main objectives for visualization as listed above.
For each category, I provide a brief description of the key question addressed by
the visualization, identify several representative types and show simplified versions
(without labels or annotations for this purpose). For further “in the wild” examples of
each type—very useful for getting ideas and gauging feasibility for your own data—I
highly recommend searching Google Images for the name of each type.
• Comparing Values Across Groups—data visualizations that aid comparisons
visualizations are designed to illustrate data patterns (e.g., which groups are
larger than others; which groups differ most between two time periods; which
between variables, what values most often co-occur with others, and the
Figure 5: Representative Visualizations for Displaying Connections or Relationships
Between Variables
designed to depict patterns such as how a single data category relates within a
broader grouping (e.g., the scope and depth of an ordered data structure; how a
group breaks down into its proportionate members). Four examples are circle
packing diagrams, tree maps, sunburst diagrams and pie/donut charts (Figure
6).
across a time span. These visualizations show data patterns such as how certain
across years and the trajectory and magnitude of a category’s growth or decline
over time. Four examples are horizon charts, stream graphs, bump charts and
Figure 7: Representative Visualizations for Illustrating Change over Time
the foundation onto which other visual properties (for example, value-sized
shapes or color saturation) are placed. Two example types are choropleth maps
For an expanded, searchable view of over 150 data visualization types, including
descriptions, definitions, data structure guidance, and examples, peruse The Data Viz
Project (http://datavizproject.com/).
Tools and Continuous Learning
Tools. Creating high-quality data visualizations may be more approachable and
easier than you think. Although it’s certainly possible to invest time and expense to
With a wide range of data visualization tools available—and growing every day—
I focus in this paper on three tools that are open-source or widely available, involve a
allows users to enter their own data and to generate many of the visualization types
listed above, and that can be paired with a vector graphics editor to rapidly produce
based data to produce word trees, bubble lines and a dozen other visualizations for
after overriding its default settings, which violate many of the foundational
templates designed specifically for Excel, and the 2016 version of the program
A Small-Scale Example. The data and steps below walk through a small-scale
example of data visualization using the RawGraphs tool described above; this example
http://rawgraphs.io/.
• Click on the “Use It Now!” button to progress to the “Load your data” screen.
• Copy and paste the data above into the entry box.
• Click on the “Click here to stack it” button in the lower right, then select
“Region” as the dimension to stack on (this reorients the data into a vertical
format).
• Scroll down to view the available visualization types and select “Bump Chart.”
(Note: Select “Try our samples” in the “Load your data” section to view and
• Scroll further to configure the visualization: Drag and drop “Region” into the
Group box, “column” into the Date box and “value” into the Size box.
• The visualization below will appear. From the “Download” section, it can be exported
directly into an image (.png) format (picture file with transparent background),
tool.)
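For readers who prefer to prepare their data before loading it into a tool, the "stacking" step in the walkthrough can be approximated in a few lines of Python. This is a rough sketch of the same wide-to-long reshaping, not a description of how RawGraphs implements it. The regions, years and values below are invented for illustration and are not this paper's example data. The output field names "column" and "value" mirror the labels used in the configuration step.

```python
import csv
import io

# Hypothetical wide-format data: one row per region, one column per year.
# These regions, years and values are invented for illustration only.
WIDE_DATA = """Region,2015,2016,2017
North,120,135,150
South,90,110,95
West,60,80,130
"""


def stack(wide_text, dimension="Region"):
    """Reshape wide rows into long rows of (dimension, column, value)."""
    reader = csv.DictReader(io.StringIO(wide_text))
    long_rows = []
    for row in reader:
        for column, value in row.items():
            if column == dimension:
                continue  # the stacking dimension is kept as a label, not a value
            long_rows.append(
                {dimension: row[dimension], "column": column, "value": float(value)}
            )
    return long_rows


rows = stack(WIDE_DATA)
for r in rows[:4]:
    print(r)
```

Each region-year pair becomes its own row, which is the vertical layout that a Group/Date/Size style of configuration expects.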
(X-axis is year; wider streams indicate larger values; higher placement on the Y-
and energetic field, many of the leading data visualization practitioners are prolific
curators and authors on the topic. I recommend the work of the four experts below for
staying current on new data visualization tips, templates and examples to emulate:
• Andy Kirk (http://www.visualisingdata.com/blog/).
Beyond the work of these experts, I advocate tracking the Twitter hashtag
2016-in-19-charts).
Conclusion
HR professionals armed with skills in information visualization will be more effective in
Visual storytelling skills are highly developable, and they are also exceedingly
skill strength for most), they can make their messages more compelling, memorable
and engaging to the full range of employees and business stakeholders they must
influence.
References
Al-Kassab, J., Ouertani, Z. M., Schiuma, G., & Neely, A. (2014). Information visualization to
support management decisions. International Journal of Information Technology &
Decision Making, 13, 407.
Bach, B., Riche, N. H., Carpendale, S., & Pfister, H. (2017). The Emerging Genre of Data Comics.
IEEE Computer Graphics and Applications, 38(3), 6-13.
DDI, The Conference Board, & EY. (2018). Global Leadership Forecast 2018: 25 research insights
to fuel your people strategy. Pittsburgh, PA: DDI.
Kaur, J., & Fink, A. (2017). Trends and Practices in Talent Analytics. Society for Human Resource
Management (SHRM)-Society for Industrial-Organizational Psychology (SIOP) Science
of HR White Paper Series.
Kernbach, S., Eppler, M. J., & Bresciani, S. (2015). The Use of Visualization in the
Communication of Business Strategies: An Experimental Evaluation. International
Journal of Business Communication, 52(2), 164-187.
Kirk, A. (2012). Data visualization: A successful design process. Birmingham, UK: Packt Pub.
Kosara, R. (2013). How the rainbow color map misleads. Retrieved January 7, 2018, from
https://eagereyes.org/basics/rainbow-color-map.
Lundblad, P. (2015). Second Pillar of Mapping Data to Visualizations: Visual Encoding. Retrieved
December 14, 2017, from https://blog.qlik.com/visual-encoding.
Pandey, A. V., Manivannan, A., Nov, O., Satterthwaite, M. L., & Bertini, E. (2014). The
persuasive power of data visualization. New York University Public Law and Legal Theory
Working Papers, Paper 474.
Sinar, E. F. (2015). Data visualization. In S. Tonidandel, E. King, & J. Cortina (Eds.), Big data at
work: The data science revolution and organizational psychology (pp. 115-157). New York,
NY: Taylor & Francis.
Sinar, E. F. (2018). Data Visualization: Get Visual to Drive HR’s Impact and Influence.
Society for Human Resource Management (SHRM)-Society for Industrial-
Organizational Psychology (SIOP) Science of HR White Paper Series.