Developed by Michael Nute
Data Analysis and the Goal of Visualizing Data
• The goal of Data Analysis, generally, is to answer some kind of question or get some kind of significant insight about the outside world based on the data.
– If you say: “The average sales during the month was $500 per day…” that is not analysis.
– If you say “The average sales during the month was $500 per day, which represented a decrease from the prior month and which is alarming because sales increased between the same months one year earlier…” that is analysis.
• Analysis is the act of saying something interesting about the data.
Data Analysis and the Goal of Visualizing Data
• The FUNDAMENTAL UNIT of Data Analysis is the comparison of two numbers.
– It is the building block. The cell.
– In the previous example, we have two key comparisons: 1) sales this month vs. sales the previous month, and 2) this year’s change in sales vs. last year’s change in sales.
• When you create a chart, graph or other visualization, your only goal should be to make the comparisons of the numbers easier for the user. Otherwise you can just give them the numbers; no need for a picture!
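The two comparisons in the sales example can be written out as arithmetic. The sketch below is only an illustration: the sales figures are hypothetical (the slides give only the $500 per day average), and `pct_change` is a helper name introduced here.

```python
def pct_change(new: float, old: float) -> float:
    """Percent change from `old` to `new`."""
    return (new - old) / old * 100.0


# Hypothetical average daily sales, chosen only to illustrate the comparisons.
this_year = {"prior_month": 550.0, "this_month": 500.0}
last_year = {"prior_month": 480.0, "this_month": 520.0}

# Comparison 1: sales this month vs. sales the previous month.
change_now = pct_change(this_year["this_month"], this_year["prior_month"])

# Comparison 2: this year's change vs. last year's change over the same months.
change_then = pct_change(last_year["this_month"], last_year["prior_month"])

print(f"This year:  {change_now:+.1f}% month over month")
print(f"Last year:  {change_then:+.1f}% month over month")
print(f"Difference: {change_now - change_then:+.1f} percentage points")
```

Each printed line is one comparison of two numbers; a chart of this data would only need to make those same comparisons easier to see.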
Data Visualization Basic Manners
DO
• Have scales available wherever needed. Ensure the units of your scale are intuitive and are clearly displayed.
• Label your axes and add a succinct and descriptive title to the chart.
• Label anything that is not obvious. E.g. if data points are in two different colors, there had better be a difference between them and a label saying what it is.
• Note sample size in small print on the margin of the graph, unless it is obvious in context.
• Annotate your graph with any important additional information. For example, on a time-series graph, denoting important concurrent events is key.
• Cite your sources in footnotes where necessary. Also include any explanatory footnotes for methodology that may not be intuitive.
DON’T
• DON’T BE SLOPPY! Don’t have labels or other stuff overlapping each other. Don’t have titles off-center. Don’t let a single outlier dominate the scale.
• Don’t use gridlines unless absolutely necessary. They are a waste of ink and create visual clutter. If there is an important threshold associated with your data (i.e. a point where above/below is important in context) then add a single gridline for that threshold manually.
• If you absolutely must use gridlines, use the lightest shade possible, and use as few lines as possible (the largest unit possible between each line).
• Don’t use obtrusive typeface. Stick to the basics: TNR, Calibri, TeX fonts, etc… Try to keep it consistent with the rest of your document.
• Don’t put the data values on the graph unless it’s truly necessary (more on this later).
Data Visualization Basic Manners (Dos & Don’ts)
DO: Have scales available wherever needed. Ensure the units of your scale are intuitive and clearly displayed.
[Figure: “Global Temperature Measurements,” with a Temp (°C) scale.]
But wait: which one is hot and which is cold? And how much hotter or colder?
It turns out the data was gathered in December, so the Southern Hemisphere is warmer.
Data Visualization Basic Manners (Dos & Don’ts)
DO: Label your axes and add a succinct and descriptive title to the chart.
[Figure: monthly time-series chart, Jan-07 through Oct-11. Title: “Mike’s Expenditures by Month on Alcohol.” Y-axis values: “(Redacted to Protect Mike).”]
BORING!!
Expenditures by whom?
And for what?
And how much did they spend?
Oh, well that’s more interesting…
Data Visualization Basic Manners (Dos & Don’ts)
DO: Annotate your graph with any important additional information. For example, on a time-series graph, denoting important concurrent events is key.
[Figure: monthly time-series chart, Jan-07 through Oct-11. Title: “Mike’s Expenditures by Month on Alcohol.” Y-axis values: “(Redacted to Protect Mike).” Annotations on the chart:]
July 2009: Trip to Nantucket with buddies for the 4th of July
May 2010: Two-week trip to brother’s graduation & a Canadian wedding
Ok Let’s Admit it: Summer is the time for partying.
July 2010 & Beyond: girlfriend moves to Minneapolis; weekend travel curtails partying…
Data Visualization Basic Manners (Dos & Don’ts)
DO: Label anything that is not obvious. E.g. if data points are in two different colors, there had better be a difference between them and a label saying what it is.
DO: Note sample size in small print on the margin of the graph, unless it is obvious in context.
DO: Cite your sources in footnotes where necessary. Also include any explanatory footnotes for methodology that may not be intuitive.
N = 30 Source: Fox News
Data Visualization Basic Manners (Dos & Don’ts)
DON’T: DON’T BE SLOPPY! Don’t have labels or other stuff overlapping each other. Don’t have titles off-center. Don’t let a single outlier dominate the scale.
[Figure: scatter plot, “2004 Baseball MVP Top Vote-Getters: OBP vs. Slugging.” X-axis: On-Base Pct., .000 to .700. Y-axis: Slugging Pct., .000 to .900. Overlapping player labels: Barry Bonds, Adrian Beltre, Albert Pujols, Scott Rolen, Jim Edmonds, J.D. Drew, Lance Berkman, Roger Clemens, Mark Loretta, Aramis Ramirez, Carlos Beltran, Jeff Kent, Moises Alou, Steve Finley, Todd Helton.]
How many things can you find that are wrong with this?
• Title centered over data, not whole picture.
• Labels are a mess.
• Roger Clemens is an outlier because he is a pitcher; he doesn’t belong on the graph.
• Vivid blue is obtrusive.
[Figure: the same scatter plot, “2004 Baseball MVP Top Vote-Getters: OBP vs. Slugging,” with Roger Clemens removed and tighter axes. X-axis: On-Base Pct., .300 to .650. Y-axis: Slugging Pct., .400 to .850.]
Slight Improvement
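The fix for the scale can be sketched in code: drop the points that do not belong on the chart for a stated reason, then fit the axis limits to what remains. The player names, numbers, and the helpers `split_points` and `axis_limits` below are hypothetical stand-ins, not the real 2004 statistics or the method used to draw the slide.

```python
def split_points(points, belongs):
    """Separate points that belong on the chart from those that do not."""
    kept = [p for p in points if belongs(p)]
    dropped = [p for p in points if not belongs(p)]
    return kept, dropped


def axis_limits(values, pad_fraction=0.05):
    """Fit an axis to the data with a little padding on each side."""
    lo, hi = min(values), max(values)
    pad = (hi - lo) * pad_fraction
    return lo - pad, hi + pad


# Hypothetical (name, on-base pct, slugging pct, role) records.
players = [
    ("Hitter A", 0.400, 0.600, "hitter"),
    ("Hitter B", 0.350, 0.500, "hitter"),
    ("Hitter C", 0.450, 0.700, "hitter"),
    ("Pitcher X", 0.100, 0.150, "pitcher"),
]

# Exclude by role, not by value: the pitcher is a different kind of data point.
kept, dropped = split_points(players, lambda p: p[3] == "hitter")

x_limits = axis_limits([p[1] for p in kept])
y_limits = axis_limits([p[2] for p in kept])
print("x-axis:", x_limits, "y-axis:", y_limits)
print("left off the chart:", [p[0] for p in dropped])
```

Printing the dropped points is deliberate: anything left off the chart is a candidate for a footnote, in keeping with the rule about noting methodology.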
Data Visualization Basic Manners (Dos & Don’ts)
DON’T: Don’t use gridlines unless absolutely necessary. They are a waste of ink and create visual clutter. If there is an important threshold associated with your data (i.e. a point where above/below is important in context) then add a single gridline for that threshold manually. If you absolutely must use gridlines, use the lightest shade possible, and use as few lines as possible (the largest unit possible between each line).
DON’T: Don’t use obtrusive typeface. Stick to the basics: TNR, Calibri, TeX fonts, etc… Try to keep it consistent with the rest of your document.
DON’T: Don’t put the data values on the graph unless it’s truly necessary (more on this later).
Basic Tactics Features of Quality Data Visualization
1. Representationally Faithful
– The data scales that your graphic suggests should be accurate, or should be explicitly noted as not to scale.
2. Simple
– Don’t make the user work hard to figure out what the graphic is saying.
3. Comprehensive
– If more context is helpful and you have the real estate to provide it, then do so.
4. Interesting
– Don’t make a graphic that doesn’t have something interesting to say about the data. The Trifecta (at right) should be in harmony.
[Figure: The Data Visualization Trifecta (see note 3). Three linked questions: What is the Question of Interest? What Does the Data Say? What Does the Graphic Say?]
3 This is also from Kaiser Fung at Junk Charts. He also has a book called Numbers Rule Your World. Consider this a plug, though I have not read it.
Features of Quality Data Visualization Representationally Faithful: Bad Example
[Figure: two candidate graphics, each comparing Stocks and Bonds.]
I asked my investment manager how much of my money was in stocks vs. bonds. Which of these should he send me?
Remember that the goal is to help the viewer compare two numbers. Otherwise it’s just fluff.
Features of Quality Data Visualization Simplicity: Bad Example
[Figure: “Head of the Charles Regatta: Projected vs. Actual Fastest & Slowest Boats.” The category axis spells out the full name of every event, from Alumni Eights Men through Youth Fours Women. Time-axis ticks run from 00:00 to 02:53 and from 11:40 to 40:28. Series: StdDev of OfficialTime, Rabbit_Proj, Rabbit_Act, Avg, Caboose_Act, Caboose_Proj.]
Question: Where in the Schedule can we save time by revising our projections of the fastest and slowest boat?
Features of Quality Data Visualization Simplicity: Bad Example
[Figure: the same “Head of the Charles Regatta: Projected vs. Actual Fastest & Slowest Boats” chart as on the previous slide.]
The full name of each event is completely unnecessary; abbreviations are fine. A visual aid for grouping, such as lines or color coding, would be helpful.
What good are the gridlines here?
Standard Deviation adds nothing to the meaning of the graph and should be removed, or at the very least drawn in a light grey without a border so it blends in better.
The goal here is to understand where in the schedule we can save time by revising our projections of the fastest and slowest boat. But here we have to work very hard to do that.
Features of Quality Data Visualization Simplicity: Bad Example Improved
[Figure: “Head of the Charles Regatta: Projected vs. Actual Fastest & Slowest Boats,” redrawn. Time axis: 12:58 to 41:46. Legend: Positive Margin*, Negative Margin, Boats on Course, Avg.]
* Note: For fastest boats, positive margin indicates projected is faster than actual. For slowest boats, positive margin indicates projected is slower than actual.
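The sign convention in the note can be made concrete with a small calculation. This is a sketch under assumptions: the times are hypothetical, and `to_seconds` and `margins` are names introduced here, not part of the original analysis.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'MM:SS' time string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def margins(proj_fast, act_fast, proj_slow, act_slow):
    """Return (fastest-boat margin, slowest-boat margin) in seconds.

    Fastest boat: positive when the projection is faster than the actual time.
    Slowest boat: positive when the projection is slower than the actual time.
    """
    fast_margin = to_seconds(act_fast) - to_seconds(proj_fast)
    slow_margin = to_seconds(proj_slow) - to_seconds(act_slow)
    return fast_margin, slow_margin


# Hypothetical event: the fastest boat finished 40 seconds behind its projection,
# and the slowest boat finished 70 seconds behind its projection.
fast, slow = margins("15:00", "15:40", "24:00", "25:10")
print(f"fastest-boat margin: {fast:+d} s, slowest-boat margin: {slow:+d} s")
```

With both margins defined so that positive means the projected window covers the actual boats, a positive value reads as room in the schedule and a negative value as a risk of running over.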
Features of Quality Data Visualization Comprehensive: Same Example
[Figure: the improved “Head of the Charles Regatta: Projected vs. Actual Fastest & Slowest Boats” chart from the previous slide, with the same legend and margin note.]
This graph shows:
1. Every event in the whole regatta
2. How long each event takes in the schedule
3. Whether we have room for error or we are taking a risk
4. The average time for each event
And all of that in the blink of an eye.
Features of Quality Data Visualization Interesting: Same Example
[Figure: the improved “Head of the Charles Regatta: Projected vs. Actual Fastest & Slowest Boats” chart from the previous slides, with the same legend and margin note.]
Let’s Examine the Data Trifecta:
1. What is the Question? Do we need to improve our projections of the fastest and slowest boats?
2. What does the data say? Yes, for some events, there is as much as 3 minutes of fat. For others, we will most likely run over the scheduled time.
3. What does the graphic say? Some big green lines, and some big red lines, just like the data!
Basic Tactics Ways to Represent Data Visually
There are many different ways to use visual features of a graphic to represent data. A good graphic will use many of these at the same time. Here are some of the most common:
1. Size
– Very common: the larger the item, the larger the value being represented.
– E.g. a bar graph, a pie chart, or a bubble chart.
2. Position
– Relative to a scale, can convey size (e.g. in a scatter plot).
– Relative to other data points, can convey order (e.g. rank order, time order).
– In a group or cluster, can convey similarity.
3. Shape
– Basic shapes can be used for data markers to convey grouping.
– Can be used to convey some feature of physical shape, if applicable.
4. Color
– A color scale is a very effective way of showing the magnitude of a data value; it is easily combined with the other methods shown here.
– Can be used similarly to shape to convey groupings, but with a natural association for the user (e.g. red for loss, black for profit; red for hot, blue for cold; etc.).
5. Connectedness
– Shows that two items are naturally related or “connected” to each other, e.g. in a time series.
– Can be used to convey order when position is used for something else.
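As one illustration of a color scale for magnitude, the sketch below maps a value onto a blue-to-red ramp, matching the "red for hot, blue for cold" association. The function name `value_to_hex`, the endpoint colors, and the temperature range are assumptions made for the example; a straight-line blend like this is only one of many possible scales.

```python
def value_to_hex(value, vmin, vmax, cold=(0, 0, 255), hot=(255, 0, 0)):
    """Blend linearly from the `cold` color at vmin to the `hot` color at vmax."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Position of the value on the scale, clamped to the range [0, 1].
    t = max(0.0, min(1.0, (value - vmin) / (vmax - vmin)))
    rgb = tuple(round(c + (h - c) * t) for c, h in zip(cold, hot))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical temperature readings in degrees Celsius.
for temp in (-40, -5, 30, 45):
    print(temp, value_to_hex(temp, vmin=-40, vmax=30))
```

Values outside the range are clamped to the end colors, which is itself a choice worth stating on the chart's scale.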
Ways to Represent Data Visually Example: Size & Position
Marimekko of U.S. Government Spending (N.Y. Times):
– Size indicates amount. Position within box indicates category.
– Interactive graphic, enables drill down into category detail. Link
Ways to Represent Data Visually Example: Position
XKCD: Movie Narrative Charts
Citation: XKCD Link
Ways to Represent Data Visually Example: Shape
Citation: XKCD Link
Ways to Represent Data Visually Example: Connectedness & Position
Citation: XKCD Link
Ways to Represent Data Visually Example: Color & Position
• This is called a “choropleth” or a “heat map.”
– Name two reasons why this is not a very informative graphic.
1. No scale
2. No historical perspective (when is this data from?)
Source: Flowing Data Link
U.S. Unemployment by County
Ways to Represent Data Visually Example: Size & Position
Bubble Chart of U.S. Insurance Industry Financial Performance
[Figure: bubble chart, “U.S. Property & Casualty Insurance Industry: Financial Valuation vs. Performance.” X-axis: Projected 2013 Return on Equity (%), 0.0 to 20.0. Y-axis: Valuation (Price / Tangible Book Value per Share), 0.0x to 2.5x. Bubble size indicates market cap. Bubbles are labeled by ticker (ACE, TRV, CB, ALL, PGR, XL, and others).]
Is this graphic too cluttered to be useful? Pretty close, but debatable.
Basic Tactics Guiding the Viewer’s Eye
• The viewer’s eye will go first toward the boldest lines and the most vivid colors, so use those to represent the most important items.
– Thicker lines = Bolder
– Bright Red, Blue, Green & Yellow are usually the most vivid; don’t use those unless you want to yell. THINK OF IT LIKE WRITING IN ALL CAPS!!!
– Pastels are calmer and more relaxing. Darker colors are also calm, but can be difficult to tell apart.
• The viewer’s eye will follow lines. If you want the viewer to see points 1, 2, 3 and 4 in that order, then a line connecting them in that order will help.
• The viewer’s eye will usually follow the graphic in this order:
1. Look at the center (most vivid stuff first)
2. Look at the axes & title to figure out what the center means (pause to make sure it makes sense)
3. Look for explanations for anything not immediately clear
4. Form an opinion about the stuff in the center
Principles of Data Visualization The Data/Ink Ratio1
• Rule of Thumb: higher is better
• The idea here is simple: ink draws the eye of the viewer, and you don’t want the viewer’s eye to go where there is no meaningful data. We don’t want to make the viewer work to understand our point.
• Don’t clutter the page with garbage that is loud and strictly decorative (such as a choo-choo-train).
• Arrange your data points in such a way that the comparisons the user will be interested in are easy and don’t require doing mental math.
• Another way of saying this: the KISS principle (Keep It Simple, Stupid)
• Negative Examples:
– A bar chart with 2 bars on it
– Most pie charts
– USA Today Snapshots©
1 This idea comes from Edward Tufte in The Visual Display of Quantitative Information, which is sort of the bible of Data Visualization.
Principles of Data Visualization The Data/Ink Ratio – Compare & Contrast
[Figure (top): “Earnings Per Share Comparison: Google vs. Microsoft” (figures in $/common share), 4Q 09 through 3Q 13, with separate y-axes for Microsoft and Google (one running 0.0 to 1.2, the other 0 to 14). Legend: GOOG, MSFT.]
Question: How does the growth in earnings per share of Microsoft compare to that of Google over the last three years?
• Which company’s earnings are growing faster?
• Which graph lets you draw that conclusion faster?
Other things to note here:
• In the bottom graphic, if I want to compare Microsoft’s earnings quarter over quarter, I have to mentally remove a big red vertical bar to do it. The comparison is not made easy.
• The key comparison facilitated in the bottom graphic is “MSFT vs. GOOG in Quarter X”, but that is a useless comparison here because the scales are different.
• This data is a time series, which means that a line graph is almost always going to be better than a bar graph.
• If we added gridlines here, would they help?
[Figure (bottom): “Earnings Per Share Comparison: Google vs. Microsoft” (figures in $/common share), 4Q 09 through 3Q 13, with separate y-axes for Microsoft and Google (one running 0.0 to 1.2, the other 0 to 14). Legend: GOOG, MSFT.]
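When the question is which series is growing faster and the two series sit on different scales, one common option is to index both to the same starting value. This is not the technique used in the slides, only a sketch of an alternative; the EPS figures and the helper `index_to_base` are hypothetical.

```python
def index_to_base(series, base=100.0):
    """Rescale a series so its first value equals `base`."""
    first = series[0]
    if first == 0:
        raise ValueError("cannot index a series that starts at zero")
    return [value / first * base for value in series]


# Hypothetical quarterly earnings per share on very different scales.
company_a = [5.00, 6.00, 7.50]
company_b = [0.50, 0.55, 0.60]

indexed_a = index_to_base(company_a)
indexed_b = index_to_base(company_b)

# After indexing, growth can be compared on a single axis.
for quarter, (a, b) in enumerate(zip(indexed_a, indexed_b), start=1):
    print(f"Q{quarter}: A = {a:.0f}, B = {b:.0f}")
```

Both indexed series start at 100, so the gap between the lines at any later quarter is the difference in cumulative growth, which is the comparison the question asks for.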
Principles of Data Visualization The Disappearing Baseline
• Be wary of making a graph with an axis that doesn’t start from zero.
– If starting from zero obscures the relevant comparisons, then it may be acceptable.
– If there’s not a good reason, it will be seen as purposely misleading.
True Story: Mike’s friend Voon was always a good student, and John wasn’t. So when John went back to business school and had a good first semester, he sent this:
[Figure: chart comparing Voon’s GPA and John’s GPA on an axis that runs from 3.95 to 4.00.]
*This is also classic Tufte
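How much a truncated axis exaggerates a comparison can be estimated by comparing the ratio the picture suggests with the ratio in the data. The GPA values below are hypothetical (the slide does not state them), and `apparent_ratio` is a name introduced for this sketch.

```python
def apparent_ratio(a: float, b: float, baseline: float) -> float:
    """Ratio of drawn bar heights when the axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)


# Hypothetical GPAs plotted on an axis that starts at 3.95.
john_gpa, voon_gpa, axis_start = 3.99, 3.96, 3.95

true_ratio = john_gpa / voon_gpa
drawn_ratio = apparent_ratio(john_gpa, voon_gpa, axis_start)

print(f"ratio in the data:    {true_ratio:.3f}")
print(f"ratio in the picture: {drawn_ratio:.1f}")
```

With a zero baseline the two ratios are equal; the further the axis start moves toward the data, the further they drift apart.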
Principles of Data Visualization The Self-Sufficiency Principle2
• The basic idea: your graphic should be valuable on its own without having to have the data behind it shown at the same time.
– If I have to refer back to the values in order to make sense of the graphic, then what is the point of putting up the graphic? Remember, it’s supposed to help me compare and contrast data points.
• This is not a hard rule, just something to bear in mind.
– It’s always helpful to be asking yourself “Do I really need to be doing this?” If you’re not sure if or why the answer is yes, then maybe you can be doing something more useful.
• Sometimes the values themselves are helpful because they provide additional context beyond the comparisons of interest in the graphic.
– Example: a graph showing growth in profits over time for a business may be effective in showing the trend for the business, but for an investor considering purchasing shares in the company, it may help them to see exactly how much profit that represents.
– One often useful option is to show not the values themselves but some transformation of the values. For example, instead of showing the value of profits over time, the graph could show the values for return on investment represented by the profits. This number is directly interesting to the investor, regardless of trend.
2This idea comes from Kaiser Fung at Junk Charts (http://junkcharts.typepad.com)
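The transformation idea above (showing return on investment instead of raw profit) amounts to dividing each value by a relevant base. The figures and the helper `return_on_investment` below are hypothetical, and "invested capital" is an assumed denominator; the slides do not say how the return would be defined.

```python
def return_on_investment(profits, invested):
    """Profit as a percentage of invested capital, period by period."""
    if len(profits) != len(invested):
        raise ValueError("profits and invested must have the same length")
    return [p / i * 100.0 for p, i in zip(profits, invested)]


# Hypothetical yearly figures, in $ millions.
profits = [10.0, 12.0, 15.0]
invested = [100.0, 150.0, 250.0]

roi = return_on_investment(profits, invested)
for year, (p, r) in enumerate(zip(profits, roi), start=1):
    print(f"Year {year}: profit {p:.0f}, return on investment {r:.1f}%")
```

In this made-up case profits rise every year while the return falls, which is exactly the kind of context the raw trend line would hide from an investor.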
Principles of Data Visualization The Self-Sufficiency Principle – Compare & Contrast
[Figure: bar chart, “Top 15 States by Unemployment – July 2013.” Y-axis: Unemployment Rate (%), 7.0 to 10.0. Each bar is labeled with its value, from 9.5 down to 8.1.]
Look at the bar chart. Do the values add anything here?
• Actually, they do a little. Without them all we have are ten bars that say the same thing in the middle, with a couple on each end that are a little different. This is not a very informative graphic.
Do we even need the graph? What about a slightly annotated table?
• No, we really don’t. Because we’ve sorted the data, there are only a few comparisons of interest, namely the “break” points between IL/NC and IN/CT. We can convey the same information in a table with a lot less real estate.
Top 15 States by Unemployment Rate – July 2013
State             Rate (%)
Nevada            9.5
Illinois          9.2
North Carolina    8.9
Rhode Island      8.9
Georgia           8.8
Michigan          8.8
California        8.7
D.C.              8.6
New Jersey        8.6
Kentucky          8.5
Mississippi       8.5
Tennessee         8.5
Indiana           8.4
Connecticut       8.1
South Carolina    8.1
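The "break" points in a sorted table can be found mechanically by looking at the gap between neighboring rows. The sketch below uses the rates from the table; the helper `find_breaks` and the 0.3-point threshold are assumptions chosen for illustration. With that threshold it flags the IL/NC and IN/CT breaks and also the NV/IL gap, which is the same size.

```python
# Unemployment rates (%) from the table, already sorted from high to low.
rates = [
    ("Nevada", 9.5), ("Illinois", 9.2), ("North Carolina", 8.9),
    ("Rhode Island", 8.9), ("Georgia", 8.8), ("Michigan", 8.8),
    ("California", 8.7), ("D.C.", 8.6), ("New Jersey", 8.6),
    ("Kentucky", 8.5), ("Mississippi", 8.5), ("Tennessee", 8.5),
    ("Indiana", 8.4), ("Connecticut", 8.1), ("South Carolina", 8.1),
]


def find_breaks(rows, min_gap_tenths=3):
    """Return (upper row, lower row, gap) wherever neighbors differ by the threshold."""
    breaks = []
    for (name_a, rate_a), (name_b, rate_b) in zip(rows, rows[1:]):
        # Work in whole tenths of a point to avoid floating-point noise.
        gap_tenths = round((rate_a - rate_b) * 10)
        if gap_tenths >= min_gap_tenths:
            breaks.append((name_a, name_b, gap_tenths / 10))
    return breaks


for upper, lower, gap in find_breaks(rates):
    print(f"break of {gap:.1f} points between {upper} and {lower}")
```

An annotated table would then need nothing more than a rule or a blank line at each reported break.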
Example Histogram
Sometimes the relevant comparison for the data is to something more abstract like a Normal Distribution:
[Figure: “Histogram of Adjusted Passing Yards/Attempt, Based on Quarterback-Seasons, 2002-2012.” X-axis: bins of width 0.5, from 0.0 – 0.5 through 13.5 – 14.0. Y-axis: # of Records, 0 to 450.]
Note: Passing averages of 0 are omitted, and averages greater than 14 are not shown.
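The binning behind a histogram like this one can be sketched in a few lines. The function `bin_counts` and the sample values are hypothetical; the bin width of 0.5 and the 0-to-14 range come from the axis, and dropping negative values along with zeros is an assumption, since the note only mentions averages of 0.

```python
def bin_counts(values, width=0.5, upper=14.0):
    """Count values in bins of `width` between 0 and `upper`.

    Values of 0 or below and values above `upper` are left out.
    """
    n_bins = int(upper / width)
    counts = [0] * n_bins
    for v in values:
        if v <= 0 or v > upper:
            continue
        # A value exactly equal to `upper` goes into the last bin.
        index = min(int(v / width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical adjusted passing yards per attempt.
sample = [0.0, 0.25, 0.5, 6.9, 7.0, 14.0, 15.2]
counts = bin_counts(sample)

for i, count in enumerate(counts):
    if count:
        print(f"{i * 0.5:.1f} – {(i + 1) * 0.5:.1f}: {count}")
```

Whatever gets filtered out before binning belongs in a note on the chart, as the histogram's footnote does.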
Examples Table with Color
This graphic shows which baseball teams made the playoffs each year, how they did in the playoffs and where they ranked in player salary budget.
The conclusion from this table is quick and easy: playoff teams tend to come from the top of the ranks in budget.
Source: Kaiser Fung (www.junkcharts.typepad.com) Link
Examples Line Chart with Color & Non-Uniform Time-Scale
This graphic shows the number of seats available on United flights from O’Hare to Boston on a specific day, graphed over time as the departure date approached.
[Figure: line chart, “Sunday 7/22: ORD-BOS Flights.” X-axis: As Of date, 6/13 to 7/18. Y-axis: # of Available Seats, by Departure, -20 to 140. One line per departure: 6:55 AM, 8:00 AM, 9:21 AM, 11:14 AM, 1:12 PM, 2:37 PM, 4:36 PM, 5:27 PM, 7:08 PM, 9:09 PM. Annotations on the chart:]
Scheduled capacity can change suddenly if the scheduled aircraft changes
Early flights are often more popular on Sundays, as they were on this day
Note that this graphic conveys four different data points at the same time:
1) The number of seats available (y-axis)
2) The as-of date for the capacity (x-axis)
3) Which departures had the most capacity (grouping by line)
4) What time of day each departure was (color)
Also note the annotation for two odd features of the chart.
Examples Multiple Overlapping and Stacked Time Series (Yahoo! Finance)
Note the use of the blue here to make the major stock of interest (THG) the most vivid. The other three are there for comparison only.
This graphic compares the share price percent gain for a single stock (THG) to three major indices. At the bottom is the THG trading volume for each trading day shown.
This construction of pairing a volume measure with a time series of interest is a common way to provide important context. Often when the metric is a ratio, the “volume” is the denominator.
Note that the gridlines, such as they are, are light-toned. The vertical gridlines also convey how many trading days were in the period.
Example Basic Table
AAPL
Fiscal Quarter  3Q2013 2Q2013 1Q2013 FY2012 4Q2012 3Q2012 2Q2012 1Q2012
Calendar Quarter  2Q2013 1Q2013 4Q2012 TFQ 3Q2012 2Q2012 1Q2012 4Q2011
Income Statement
Net sales  36,551 43,603 54,512 156,508 35,966 35,023 39,186 46,333
Cost of sales  23,210 27,254 33,452 87,846 21,565 20,029 20,622 25,630
Gross margin  13,341 16,349 21,060 68,662 14,401 14,994 18,564 20,703
Operating expenses:
Research and development  1,343 1,119 1,010 3,381 906 876 841 758
Selling, general and administrative  2,672 2,672 2,840 10,040 2,551 2,545 2,339 2,605
Total operating expenses  4,015 3,791 3,850 13,421 3,457 3,421 3,180 3,363
Operating income  9,326 12,558 17,210 55,241 10,944 11,573 15,384 17,340
Other income and expense  347 347 462 522 (51) 288 148 137
Income before provision for income taxes  8,979 12,905 17,672 55,763 10,893 11,861 15,532 17,477
Provision for income taxes  2,424 3,358 4,594 14,030 2,670 3,037 3,910 4,413
Net income  6,555 9,547 13,078 41,733 8,223 8,824 11,622 13,064
Earnings per common share:
Basic  10.16 13.93$ 44.64 8.77 9.42$ 12.45 14.03
Diluted  7.12$ 10.09 13.81$ 44.15 8.68 9.32$ 12.30 13.87
Shares used in computing earnings per share:
Basic  939,629 938,916 934,818 938,053 936,596 933,582 931,041
Diluted  921,035 946,035 947,217 945,355 947,896 947,059 944,893 941,572
Shares Added During Period  (1,182) (679) 837 2,166 3,321 1,900
Cash Dividend Declared Per Share  3.05 2.65 2.65 2.65
Select Balance Sheet Items  7,575 15,861 4,030 7,045 12,575 16,031
Cash on hand  138,433 144,687 137,112 121,251 121,251 117,221 110,176 97,601
Cash per share  150.30$ 152.94$ 144.75$ 127.92$ 127.92$ 123.77$ 116.60$ 103.66$
This table is a summary income statement for AAPL for the most recent 7 quarters. Projected results are in blue, and important summary lines are in peach.
The table layout allows the viewer to follow the math behind each calculation, essentially a different use of position.
Note that the viewer’s eye is guided toward the most important lines via the summary underlines, the bold type, and the peach color.
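"Following the math" down one column of the table can be checked directly. The sketch below takes the fiscal 2Q2013 column from the table and recomputes each summary line; the dictionary keys and variable names are introduced here for illustration.

```python
# Fiscal 2Q2013 column of the AAPL table (units as given in the table).
q2_2013 = {
    "net_sales": 43_603,
    "cost_of_sales": 27_254,
    "research_and_development": 1_119,
    "selling_general_admin": 2_672,
    "other_income_and_expense": 347,
    "provision_for_income_taxes": 3_358,
}

# Each summary line is computed from the lines directly above it.
gross_margin = q2_2013["net_sales"] - q2_2013["cost_of_sales"]
total_operating_expenses = (
    q2_2013["research_and_development"] + q2_2013["selling_general_admin"]
)
operating_income = gross_margin - total_operating_expenses
income_before_taxes = operating_income + q2_2013["other_income_and_expense"]
net_income = income_before_taxes - q2_2013["provision_for_income_taxes"]

print("Gross margin:            ", gross_margin)
print("Total operating expenses:", total_operating_expenses)
print("Operating income:        ", operating_income)
print("Income before taxes:     ", income_before_taxes)
print("Net income:              ", net_income)
```

Each computed value matches the corresponding summary line in that column, which is what lets a reader audit the table by reading straight down it.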
Exercise What Works or Doesn’t Work About This Graphic?
Graphic from a previous Career Center in-class presentation
Exercise What Works or Doesn’t Work About This Graphic?
Graphic pulled from here via Junk Charts.
- Data Visualization: Theory & Best Practices