Data Visualization Theory & Best Practices

Developed by Michael Nute

Data Analysis and the Goal of Visualizing Data

• The goal of Data Analysis, generally, is to answer some kind of question or get some kind of significant insight about the outside world based on the data.

– If you say: “The average sales during the month was $500 per day…” that is not analysis.

– If you say “The average sales during the month was $500 per day, which represented a decrease from the prior month and which is alarming because sales increased between the same months one year earlier…” that is analysis.

• Analysis is the act of saying something interesting about the data.

Data Analysis and the Goal of Visualizing Data

• The FUNDAMENTAL UNIT of Data Analysis is the comparison of two numbers.

– It is the building block. The cell.

– In the previous example, we have two key comparisons: 1) sales this month vs. sales the previous month, and 2) this year’s change in sales vs. last year’s change in sales.

• When you create a chart, graph or other visualization, your only goal should be to make the comparisons of the numbers easier for the user. Otherwise you can just give them the numbers; there is no need for a picture!
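
The two comparisons from the sales example can be written out as plain arithmetic. This is only a sketch: the $500 figure comes from the slide, while the other three sales figures are invented so that the numbers line up with the story (a decrease this year, an increase a year earlier).

```python
# Average daily sales in dollars. Only the 500 comes from the slide;
# the other three figures are made up for illustration.
sales = {
    ("this_year", "this_month"): 500.0,
    ("this_year", "prior_month"): 540.0,
    ("last_year", "this_month"): 480.0,
    ("last_year", "prior_month"): 450.0,
}


def month_over_month(year):
    """Comparison 1: this month's sales minus the prior month's."""
    return sales[(year, "this_month")] - sales[(year, "prior_month")]


change_this_year = month_over_month("this_year")
change_last_year = month_over_month("last_year")

# Comparison 2: this year's change vs. last year's change.
difference_in_changes = change_this_year - change_last_year

print(f"Change this year: {change_this_year:+.0f} per day")
print(f"Change last year: {change_last_year:+.0f} per day")
print(f"Difference between the two changes: {difference_in_changes:+.0f} per day")
```

Each printed line is one comparison of two numbers, which is exactly what a chart of this data would need to make easy to see.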

Data Visualization Basic Manners

DO

• Have scales available wherever needed. Ensure the units of your scale are intuitive and are clearly displayed.

• Label your axes and add a succinct and descriptive title to the chart.

• Label anything that is not obvious. E.g. if data points are in two different colors, there had better be a difference between them and a label saying what it is.

• Note sample size in small print on the margin of the graph, unless it is obvious in context.

• Annotate your graph with any important additional information. For example, on a time-series graph, denoting important concurrent events is key.

• Cite your sources in footnotes where necessary. Also include any explanatory footnotes for methodology that may not be intuitive.

DON’T

• DON’T BE SLOPPY! Don’t have labels or other stuff overlapping each other. Don’t have titles off-center. Don’t let a single outlier dominate the scale.

• Don’t use gridlines unless absolutely necessary. They are a waste of ink and create visual clutter. If there is an important threshold associated with your data (i.e. a point where above/below is important in context) then add a single gridline for that threshold manually.

• If you absolutely must use gridlines, use the lightest shade possible, and use as few lines as possible (the largest unit possible between each line).

• Don’t use an obtrusive typeface. Stick to the basics: Times New Roman, Calibri, TeX fonts, etc. Try to keep it consistent with the rest of your document.

• Don’t put the data values on the graph unless it’s truly necessary (more on this later).

Data Visualization Basic Manners (Dos & Don’ts)

DO: Have scales available wherever needed. Ensure the units of your scale are intuitive and clearly displayed.

[Figure: Global Temperature Measurements, shown with a color scale labeled Temp (°C)]

But wait: which one is hot and which is cold? And how much hotter or colder?

It turns out the data was gathered in December, so the Southern Hemisphere is warmer.

Data Visualization Basic Manners (Dos & Don’ts)

DO: Label your axes and add a succinct and descriptive title to the chart.

[Figure: chart by month; x-axis ticks run quarterly from Jan-07 to Oct-11]

Mike’s Expenditures by Month on Alcohol

BORING!!

Expenditures by Who?

And for what?

And how much did they Spend?

Oh, well that’s more interesting…

(Redacted to Protect Mike)

Data Visualization Basic Manners (Dos & Don’ts)

DO: Annotate your graph with any important additional information. For example, on a time-series graph, denoting important concurrent events is key.

[Figure: chart by month; x-axis ticks run quarterly from Jan-07 to Oct-11]

Mike’s Expenditures by Month on Alcohol

(Redacted to Protect Mike)

July 2009: Trip to Nantucket with buddies for the 4th of July

May 2010: Two-week trip to brother’s graduation & a Canadian wedding

OK, let’s admit it: summer is the time for partying.

July 2010 & Beyond: girlfriend moves to Minneapolis; weekend travel curtails partying…

Data Visualization Basic Manners (Dos & Don’ts)

DO: Label anything that is not obvious. E.g. if data points are in two different colors, there had better be a difference between them and a label saying what it is.

DO: Note sample size in small print on the margin of the graph, unless it is obvious in context.

DO: Cite your sources in footnotes where necessary. Also include any explanatory footnotes for methodology that may not be intuitive.

N = 30

Source: Fox News

Data Visualization Basic Manners (Dos & Don’ts)

DON’T: DON’T BE SLOPPY! Don’t have labels or other stuff overlapping each other. Don’t have titles off-center. Don’t let a single outlier dominate the scale.

[Figure: 2004 Baseball MVP Top Vote-Getters, OBP vs. Slugging. Scatter plot of On-Base Pct. (x-axis, .000 to .700) against Slugging Pct. (y-axis, .000 to .900). Labeled points: Barry Bonds, Adrian Beltre, Albert Pujols, Scott Rolen, Jim Edmonds, J.D. Drew, Lance Berkman, Roger Clemens, Mark Loretta, Aramis Ramirez, Carlos Beltran, Jeff Kent, Moises Alou, Steve Finley, Todd Helton. Several labels run into each other.]

How many things can you find that are wrong with this?

• Title centered over data, not whole picture.

• Labels are a mess.

• Roger Clemens is an outlier because he is a pitcher; he doesn’t belong on the graph.

• Vivid blue is obtrusive.

[Figure: 2004 Baseball MVP Top Vote-Getters, OBP vs. Slugging, redrawn without Roger Clemens. On-Base Pct. (x-axis) now runs from .300 to .650 and Slugging Pct. (y-axis) from .400 to .850.]

Slight Improvement
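
One way to keep a single outlier from dominating the scale is to flag it before choosing the axis range. The sketch below uses the common 1.5 × IQR fences, which is a choice made here and not something the slides prescribe. The values are invented (slugging-style numbers in thousandths, not the players’ real statistics), and a flagged point is only a prompt to ask whether it belongs on the graph at all.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return the values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def axis_limits(values, step):
    """Round the data range outward to the nearest multiple of `step`."""
    lower = (min(values) // step) * step
    upper = -(-max(values) // step) * step  # ceiling division on integers
    return lower, upper


# Hypothetical values in thousandths (e.g. 610 means .610). One point sits far
# below the rest, the way a pitcher would among hitters.
slugging = [610, 570, 590, 540, 620, 560, 530, 120, 500, 580, 550, 520, 600, 510, 700]

flagged = flag_outliers(slugging)
kept = [v for v in slugging if v not in flagged]

print("Flagged:", flagged)
print("Axis with everything:", axis_limits(slugging, 50))
print("Axis without flagged points:", axis_limits(kept, 50))
```

With the flagged point removed, the axis covers a much narrower range, so the remaining points spread out and their labels have room.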

Data Visualization Basic Manners (Dos & Don’ts)

DON’T: Don’t use gridlines unless absolutely necessary. They are a waste of ink and create visual clutter. If there is an important threshold associated with your data (i.e. a point where above/below is important in context) then add a single gridline for that threshold manually. If you absolutely must use gridlines, use the lightest shade possible, and use as few lines as possible (the largest unit possible between each line).

DON’T: Don’t use an obtrusive typeface. Stick to the basics: Times New Roman, Calibri, TeX fonts, etc. Try to keep it consistent with the rest of your document.

DON’T: Don’t put the data values on the graph unless it’s truly necessary (more on this later).

Basic Tactics Features of Quality Data Visualization

1. Representationally Faithful – The data scales that your graphic suggests should be accurate, or should be explicitly noted as not to scale.

2. Simple – Don’t make the user work hard to figure out what the graphic is saying.

3. Comprehensive – If more context is helpful and you have the real estate to provide it, then do so.

4. Interesting – Don’t make a graphic that doesn’t have something interesting to say about the data.

The Trifecta (at right) should be in harmony.

The Data Visualization Trifecta³

• What is the Question of Interest?

• What Does the Data Say?

• What Does the Graphic Say?

³ This is also from Kaiser Fung at Junk Charts. He also has a book called Numbers Rule Your World. Consider this a plug, though I have not read it.

Features of Quality Data Visualization Representationally Faithful: Bad Example

[Figure: two alternative Stocks vs. Bonds charts]

I asked my investment manager how much of my money was in stocks vs. bonds. Which of these should he send me?

Remember that the goal is to help the viewer compare two numbers. Otherwise it’s just fluff.

Features of Quality Data Visualization Simplicity: Bad Example

[Figure: Head of the Charles Regatta, Projected vs. Actual Fastest & Slowest Boats. The x-axis lists every event by its full name, from Alumni Eights Men to Youth Fours Women. Time-axis tick labels run from 00:00 to 02:53 and from 11:40 to 40:28. Series: StdDev of OfficialTime, Rabbit_Proj, Rabbit_Act, Avg, Caboose_Act, Caboose_Proj.]

Question: Where in the Schedule can we save time by revising our projections of the fastest and slowest boat?

Features of Quality Data Visualization Simplicity: Bad Example

[Figure: the same Head of the Charles Regatta chart as on the previous slide (series: StdDev of OfficialTime, Rabbit_Proj, Rabbit_Act, Avg, Caboose_Act, Caboose_Proj), shown here with the critique that follows.]

The full name of each event is completely unnecessary; abbreviations are fine. A visual aid for grouping, such as lines or color coding, would be helpful.

What good are the gridlines here?

The standard deviation adds nothing to the meaning of the graph. It should be removed, or at the very least drawn in a light grey without a border so that it blends in better.

The goal here is to understand where in the schedule we can save time by revising our projections of the fastest and slowest boat. But here we have to work very hard to do that.

Features of Quality Data Visualization Simplicity: Bad Example Improved

[Figure: Head of the Charles Regatta, Projected vs. Actual Fastest & Slowest Boats, redrawn. Time axis from 12:58 to 41:46. Legend: Positive Margin*, Negative Margin, Boats on Course, Avg.]

* Note: For fastest boats, positive margin indicates projected is faster than actual. For slowest boats, positive margin indicates projected is slower than actual.
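
The sign convention in the note can be sketched as a small function. The function names, the "mm:ss" input format and the sample times are assumptions made for illustration; only the rule for which direction counts as positive comes from the note.

```python
def to_seconds(mmss):
    """Convert a 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def margin_seconds(kind, projected, actual):
    """Signed schedule margin in seconds, following the chart's note.

    kind is 'fastest' or 'slowest'. A positive result means the projection
    left room to spare; a negative one means the schedule is at risk.
    """
    p, a = to_seconds(projected), to_seconds(actual)
    if kind == "fastest":
        return a - p  # positive when projected is faster (a smaller time) than actual
    if kind == "slowest":
        return p - a  # positive when projected is slower (a larger time) than actual
    raise ValueError("kind must be 'fastest' or 'slowest'")


# Hypothetical times, not taken from the regatta data.
print(margin_seconds("fastest", "15:50", "16:20"))
print(margin_seconds("slowest", "21:36", "22:10"))
```

Coloring each bar by the sign of this number is what lets the redrawn chart show room for error versus risk at a glance.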

Features of Quality Data Visualization Comprehensive: Same Example

[Figure: the improved Head of the Charles Regatta chart from the previous slide, repeated with the same legend and note.]

This graph shows:

1. Every event in the whole regatta

2. How long each event takes in the schedule

3. Whether we have room for error or we are taking a risk

4. The average time for each event

And all of that in the blink of an eye.

Features of Quality Data Visualization Interesting: Same Example

[Figure: the improved Head of the Charles Regatta chart again, with the same legend and note.]

Let’s Examine the Data Trifecta:

1. What is the Question? Do we need to improve our projections of the fastest and slowest boats?

2. What does the data say? Yes. For some events, there is as much as 3 minutes of fat. For others, we will most likely run over the scheduled time.

3. What does the graphic say? Some big green lines, and some big red lines, just like the data!

Basic Tactics Ways to Represent Data Visually

There are many different ways to use visual features of a graphic to represent data. A good graphic will use many of these at the same time. Here are some of the most common:

1. Size

– Very common: the larger the item, the larger the value being represented.

– E.g. a bar graph or pie chart, or in a bubble chart.

2. Position

– Relative to a scale, can convey size (e.g. in a scatter plot).

– Relative to other data points, can convey order (e.g. rank order, time order).

– In a group or cluster, can convey similarity.

3. Shape

– Basic shapes can be used for data markers to convey grouping.

– Can be used to convey some feature of physical shape if applicable.

4. Color

– Color scale is a very effective way of showing magnitude of a data value; easily combined with other methods shown here.

– Can be used similar to shape to convey groupings, but with a natural association for the user (e.g. red for loss, black for profit; red for hot, blue for cold, etc.).

5. Connectedness

– Shows that two items are naturally related or “connected” to each other, e.g. for a time series.

– Can be used to convey order when position is used for something else.

Ways to Represent Data Visually Example: Size & Position

Marimekko of U.S. Government Spending (N.Y. Times):

– Size indicates amount. Position within box indicates category.

– Interactive graphic, enables drill down into category detail.

Link: http://www.nytimes.com/packages/html/newsgraphics/2011/0119-budget/

Ways to Represent Data Visually Example: Position

XKCD: Movie Narrative Charts

Citation: XKCD

Link: http://xkcd.com/657/

Ways to Represent Data Visually Example: Shape

Citation: XKCD

Link: http://xkcd.com/388/

Ways to Represent Data Visually Example: Connectedness & Position

Citation: XKCD

Link: http://xkcd.com/1056/

Ways to Represent Data Visually Example: Color & Position

[Figure: U.S. Unemployment by County (map)]

• This is called a “Choropleth” or a “Heat Map”.

– Name two reasons why this is not a very informative graphic.

1. No scale

2. No historical perspective (when is this data from?)

Source: Flowing Data

Link: http://flowingdata.com/2009/11/12/how-to-make-a-us-county-thematic-map-using-free-tools/

Ways to Represent Data Visually Example: Size & Position

Bubble Chart of U.S. Insurance Industry Financial Performance

[Figure: U.S. Property & Casualty Insurance Industry, Financial Valuation vs. Performance. Bubble chart with Projected 2013 Return on Equity (%) on the x-axis (0.0 to 20.0) and Valuation (Price / Tangible Book Value per Share) on the y-axis (0.0x to 2.5x). Each bubble is labeled with a ticker symbol, and bubble size indicates market cap.]

Is this graphic too cluttered to be useful? Pretty close, but debatable.

Basic Tactics Guiding the Viewer’s Eye

• The viewer’s eye will go first toward the boldest lines and the most vivid colors, so use those to represent the most important items.

– Thicker lines = Bolder

– Bright Red, Blue, Green & Yellow are usually the most vivid; don’t use those unless you want to yell. THINK OF IT LIKE WRITING IN ALL CAPS!!!

– Pastels are calmer and more relaxing. Darker colors are also calm, but can be difficult to tell apart.

• The viewer’s eye will follow lines. If you want the viewer to see points 1, 2, 3 and 4 in that order, then a line connecting them in that order will help.

• The viewer’s eye will usually follow the graphic in this order:

1. Look at the center (most vivid stuff first)

2. Look at the axes & title to figure out what the center means

(Pause to make sure it makes sense)

3. Look for explanations for anything not immediately clear

4. Form an opinion about the stuff in the center

Principles of Data Visualization The Data/Ink Ratio¹

• Rule of Thumb: higher is better

• The idea here is simple: ink draws the eye of the viewer, and you don’t want the viewer’s eye to go where there is no meaningful data. We don’t want to make the viewer work to understand our point.

• Don’t clutter the page with garbage that is loud and strictly decorative (such as a choo-choo-train).

• Arrange your data points in such a way that the comparisons the user will be interested in are easy and don’t require doing mental math.

• Another way of saying this: the KISS principle (Keep It Simple, Stupid)

• Negative Examples:

– A bar chart with 2 bars on it

– Most pie charts

– USA Today Snapshots©

¹ This idea comes from Edward Tufte in The Visual Display of Quantitative Information, which is sort of the bible of Data Visualization.

Principles of Data Visualization The Data/Ink Ratio – Compare & Contrast

[Figure (top): Earnings Per Share Comparison: Google vs. Microsoft (figures in $/common share), 4Q 09 through 3Q 13. GOOG and MSFT are plotted against separate axes: Microsoft from 0.0 to 1.2 and Google from 0 to 14.]

Question: How does the growth in earnings per share of Microsoft compare to that of Google over the last three years?

• Which company’s earnings are growing faster?

• Which graph lets you draw that conclusion faster?

Other things to note here:

• In the bottom graphic, if I want to compare Microsoft’s earnings quarter over quarter, I have to mentally remove a big red vertical bar to do it. The comparison is not made easy.

• The key comparison facilitated in the bottom graphic is “MSFT vs. GOOG in Quarter X”, but that is a useless comparison here because the scales are different.

• This data is a time series, which means that almost always a line graph is going to be better than a bar graph.

• If we added gridlines here, would they help?

[Figure (bottom): bar chart of the same data, Earnings Per Share Comparison: Google vs. Microsoft (figures in $/common share), 4Q 09 through 3Q 13, again with separate axes for Microsoft (0.0 to 1.2) and Google (0 to 14).]

Principles of Data Visualization The Disappearing Baseline

• Be wary of making a graph with an axis that doesn’t start from zero.

– If starting from zero obscures the relevant comparisons, then it may be acceptable.

– If there’s not a good reason, it will be seen as purposely misleading.

True Story: Mike’s friend Voon was always a good student, and John wasn’t. So when John went back to business school and had a good first semester, he sent this:

[Figure: chart comparing Voon’s GPA and John’s GPA on a vertical axis that runs only from 3.95 to 4.00]

*This is also classic Tufte
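
To see how much a truncated baseline exaggerates a difference, compare the ratio of the bars as drawn with the ratio of the underlying values. This is a rough sketch: the two GPA values are hypothetical (the slide only shows an axis running from 3.95 to 4.00), and `drawn_ratio` is a name made up for this example.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of two bar heights as drawn when the axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)


# Hypothetical GPAs; the slide does not give the actual values.
john_gpa, voon_gpa = 3.99, 3.96

honest = drawn_ratio(john_gpa, voon_gpa)             # axis starts at zero
truncated = drawn_ratio(john_gpa, voon_gpa, 3.95)    # axis starts at 3.95

print(f"Zero baseline: John's bar is {honest:.2f}x the height of Voon's")
print(f"3.95 baseline: John's bar is {truncated:.2f}x the height of Voon's")
```

Under these assumed values the data differ by less than one percent, yet the truncated axis draws one bar about four times as tall as the other.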

Principles of Data Visualization The Self-Sufficiency Principle²

• The basic idea: your graphic should be valuable on its own without having to have the data behind it shown at the same time.

– If I have to refer back to the values in order to make sense of the graphic, then what is the point of putting up the graphic? Remember, it’s supposed to help me compare and contrast data points.

• This is not a hard rule, just something to bear in mind.

– It’s always helpful to be asking yourself “Do I really need to be doing this?” If you’re not sure if or why the answer is yes, then maybe you can be doing something more useful.

• Sometimes the values themselves are helpful because they provide additional context beyond the comparisons of interest in the graphic.

– Example: a graph showing growth in profits over time for a business may be effective in showing the trend for the business, but for an investor considering purchasing shares in the company, it may help them to see exactly how much profit that represents.

– One often useful option is to show not the values themselves but some transformation of the values. For example, instead of showing the value of profits over time, the graph could show the values for return on investment represented by the profits. This number is directly interesting to the investor, regardless of trend.

² This idea comes from Kaiser Fung at Junk Charts (http://junkcharts.typepad.com)

Principles of Data Visualization The Self-Sufficiency Principle – Compare & Contrast

Look at the graphic below. Do the values add anything here?

• Actually, they do a little. Without them all we have are ten bars that say the same thing in the middle, with a couple on each end that are a little different. This is not a very informative graphic.

Do we even need the graph? What about a slightly annotated table?

• No, we really don’t. Because we’ve sorted the data, there are only a few comparisons of interest, namely the “break” points between IL/NC and IN/CT. We can convey the same information in a table with a lot less real estate.

[Figure: Top 15 States by Unemployment – July 2013. Bar chart with each bar labeled with its value (9.5, 9.2, 8.9, 8.9, 8.8, 8.8, 8.7, 8.6, 8.6, 8.5, 8.5, 8.5, 8.4, 8.1, 8.1). y-axis: Unemployment Rate (%), from 7.0 to 10.0.]

Top 15 States by Unemployment Rate – July 2013

State           Rate (%)
Nevada          9.5
Illinois        9.2
North Carolina  8.9
Rhode Island    8.9
Georgia         8.8
Michigan        8.8
California      8.7
D.C.            8.6
New Jersey      8.6
Kentucky        8.5
Mississippi     8.5
Tennessee       8.5
Indiana         8.4
Connecticut     8.1
South Carolina  8.1
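
The “break” points in a sorted table can also be found mechanically by looking at the gap between each pair of neighbors. The sketch below uses the rates from the table, stored in tenths of a percent so the subtraction is exact; the function name and the 0.2-point threshold are choices made for this example.

```python
# Rates from the table, in tenths of a percent (95 means 9.5%).
rates = [
    ("Nevada", 95), ("Illinois", 92), ("North Carolina", 89),
    ("Rhode Island", 89), ("Georgia", 88), ("Michigan", 88),
    ("California", 87), ("D.C.", 86), ("New Jersey", 86),
    ("Kentucky", 85), ("Mississippi", 85), ("Tennessee", 85),
    ("Indiana", 84), ("Connecticut", 81), ("South Carolina", 81),
]


def break_points(rows, min_gap):
    """Return (higher, lower, gap) for neighbors at least `min_gap` apart."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return [
        (hi_name, lo_name, hi_rate - lo_rate)
        for (hi_name, hi_rate), (lo_name, lo_rate) in zip(ordered, ordered[1:])
        if hi_rate - lo_rate >= min_gap
    ]


for higher, lower, gap in break_points(rates, min_gap=2):
    print(f"{higher} / {lower}: gap of {gap / 10:.1f} points")
```

Besides the IL/NC and IN/CT breaks named on the slide, this also reports the equally large gap between Nevada and Illinois at the top of the list.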

Example Histogram

Sometimes the relevant comparison for the data is to something more abstract like a Normal Distribution:

[Figure: Histogram of Adjusted Passing Yards/Attempt, Based on Quarterback-Seasons, 2002-2012. x-axis: bins of width 0.5, from 0.0 – 0.5 up to 13.5 – 14.0. y-axis: # of Records, from 0 to 450.]

Note: Passing averages of 0 are omitted, and averages greater than 14 are not shown.
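
The binning behind a histogram like this one can be sketched in a few lines. The bin width of 0.5 and the cutoff at 14 come from the chart; everything else is an assumption: the sample values are invented, each bin is treated as including its lower edge, a value of exactly 14.0 is placed in the last bin, and negative averages are dropped along with zeros.

```python
from collections import Counter

BIN_WIDTH = 0.5
UPPER_LIMIT = 14.0
LAST_BIN = int(UPPER_LIMIT / BIN_WIDTH) - 1  # index of the 13.5 – 14.0 bin


def bin_counts(values):
    """Count values per 0.5-wide bin, returning (label, count) for non-empty bins."""
    counts = Counter()
    for value in values:
        # Zeros are omitted and anything above 14 is not shown, as in the note.
        # Dropping negative values as well is an assumption of this sketch.
        if value <= 0 or value > UPPER_LIMIT:
            continue
        index = min(int(value / BIN_WIDTH), LAST_BIN)
        counts[index] += 1
    return [
        (f"{i * BIN_WIDTH:.1f} – {(i + 1) * BIN_WIDTH:.1f}", counts[i])
        for i in sorted(counts)
    ]


# Hypothetical adjusted yards/attempt values, not real quarterback data.
sample = [0.0, 6.2, 6.4, 7.1, 5.9, 14.8, 6.0, 13.9, 14.0]
for label, count in bin_counts(sample):
    print(label, count)
```

The labels match the style of the chart’s x-axis, so the counts can be compared bin by bin against the shape of a Normal Distribution.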

Examples Table with Color

This graphic shows which baseball teams made the playoffs each year, how they did in the playoffs and where they ranked in player salary budget.

The conclusion from this table is quick and easy: playoff teams tend to come from the top of the ranks in budget.

Source: Kaiser Fung (www.junkcharts.typepad.com)

Link: http://junkcharts.typepad.com/junk_charts/2011/07/the-meaning-of-pretty-pictures-and-the-case-of-15-scales.html

Examples Line Chart with Color & Non-Uniform Time-Scale

This graphic shows the number of seats available on United flights from O’Hare to Boston on a specific day, graphed over time as the date approached the departure date.

[Figure: Sunday 7/22: ORD-BOS Flights. Line chart with one colored line per departure (6:55 AM, 8:00 AM, 9:21 AM, 11:14 AM, 1:12 PM, 2:37 PM, 4:36 PM, 5:27 PM, 7:08 PM, 9:09 PM). x-axis: As Of date, from 6/13 to 7/18. y-axis: # of Available Seats, by Departure, from -20 to 140.]

Scheduled capacity can change suddenly if the scheduled aircraft changes

Early flights are often more popular on Sundays, as they were on this day

Note that this graphic conveys four different data points at the same time:

1) The number of seats available (y-axis)

2) The as-of date for the capacity (x-axis)

3) Which departures had the most capacity (grouping by line)

4) What time of day each departure was (color)

Also note the annotation for two odd features of the chart.

Examples Multiple Overlapping and Stacked Time Series (Yahoo! Finance)

Note the use of the blue here to make the major stock of interest (THG) the most vivid. The other three are there for comparison only.

This graphic compares the share price percent gain for a single stock (THG) to three major indices. At the bottom is the THG trading volume for each trading day shown.

This construction of pairing a volume measure with a time series of interest is a common way to provide important context. Often when the metric is a ratio, the “volume” is the denominator.

Note that the gridlines, such as they are, are light-toned. The vertical gridlines also convey how many trading days were in the period.
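
Plotting percent gain instead of raw price is what lets a single stock share an axis with indices that trade at very different levels. A minimal sketch of that transformation follows; the prices and the name INDEX_A are invented for illustration, and only the ticker THG comes from the slide.

```python
def percent_gain(prices):
    """Percent change of each price relative to the first price in the series."""
    base = prices[0]
    return [100.0 * (price - base) / base for price in prices]


# Hypothetical closing prices; the index level is far higher than the stock price.
series = {
    "THG": [40.0, 42.0, 44.0],
    "INDEX_A": [1600.0, 1616.0, 1632.0],
}

for name, prices in series.items():
    gains = ", ".join(f"{gain:+.1f}%" for gain in percent_gain(prices))
    print(f"{name}: {gains}")
```

Both series start at 0%, so the comparison the viewer cares about (which one gained more) can be read directly off the vertical position of the lines.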

Example Basic Table

AAPL

Fiscal Quarter                               3Q2013   2Q2013   1Q2013   FY2012   4Q2012   3Q2012   2Q2012   1Q2012
Calendar Quarter                             2Q2013   1Q2013   4Q2012      TFQ   3Q2012   2Q2012   1Q2012   4Q2011

Income Statement
Net sales                                    36,551   43,603   54,512  156,508   35,966   35,023   39,186   46,333
Cost of sales                                23,210   27,254   33,452   87,846   21,565   20,029   20,622   25,630
Gross margin                                 13,341   16,349   21,060   68,662   14,401   14,994   18,564   20,703

Operating expenses:
  Research and development                    1,343    1,119    1,010    3,381      906      876      841      758
  Selling, general and administrative         2,672    2,672    2,840   10,040    2,551    2,545    2,339    2,605
Total operating expenses                      4,015    3,791    3,850   13,421    3,457    3,421    3,180    3,363

Operating income                              9,326   12,558   17,210   55,241   10,944   11,573   15,384   17,340
Other income and expense                        347      347      462      522     (51)      288      148      137
Income before provision for income taxes      8,979   12,905   17,672   55,763   10,893   11,861   15,532   17,477
Provision for income taxes                    2,424    3,358    4,594   14,030    2,670    3,037    3,910    4,413
Net income                                    6,555    9,547   13,078   41,733    8,223    8,824   11,622   13,064

Earnings per common share ($):
  Basic                                                10.16    13.93    44.64     8.77     9.42    12.45    14.03
  Diluted                                      7.12    10.09    13.81    44.15     8.68     9.32    12.30    13.87

Shares used in computing earnings per share:
  Basic                                              939,629  938,916  934,818  938,053  936,596  933,582  931,041
  Diluted                                   921,035  946,035  947,217  945,355  947,896  947,059  944,893  941,572
Shares Added During Period                           (1,182)    (679)               837    2,166    3,321    1,900

Cash Dividend Declared Per Share: 3.05, 2.65, 2.65, 2.65 (column positions not recoverable from the source)

Select Balance Sheet Items
(row label missing in source)                          7,575   15,861             4,030    7,045   12,575   16,031
Cash on hand                                138,433  144,687  137,112  121,251  121,251  117,221  110,176   97,601
Cash per share ($)                           150.30   152.94   144.75   127.92   127.92   123.77   116.60   103.66

This table is a summary income statement for AAPL for the most recent 7 quarters. Projected results are in blue, and important summary lines are in peach.

The table layout allows the viewer to follow the math behind each calculation, essentially a different use of position.

Note that the viewer’s eye is guided toward the most important lines via the summary underlines, the bold type, and the peach color.
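
The math that the table layout lets the viewer follow can be checked line by line. The sketch below uses the fiscal 2Q2013 column of the AAPL table above; the variable names are chosen for this example, and the direction of each step (subtract costs, add other income, subtract taxes) is inferred from how the figures in that column add up.

```python
# Fiscal 2Q2013 column of the AAPL table above.
net_sales = 43_603
cost_of_sales = 27_254
research_and_development = 1_119
selling_general_admin = 2_672
other_income_and_expense = 347
provision_for_income_taxes = 3_358

# Each summary line is computed from the lines directly above it.
gross_margin = net_sales - cost_of_sales
total_operating_expenses = research_and_development + selling_general_admin
operating_income = gross_margin - total_operating_expenses
income_before_taxes = operating_income + other_income_and_expense
net_income = income_before_taxes - provision_for_income_taxes

for label, value in [
    ("Gross margin", gross_margin),
    ("Total operating expenses", total_operating_expenses),
    ("Operating income", operating_income),
    ("Income before provision for income taxes", income_before_taxes),
    ("Net income", net_income),
]:
    print(f"{label:<42}{value:>9,}")
```

Every computed line reproduces the corresponding summary row of that column, which is the sense in which position in the table carries the calculation.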

Exercise What Works or Doesn’t Work About This Graphic?

Graphic from a previous Career Center in-class presentation

Exercise What Works or Doesn’t Work About This Graphic?

Graphic pulled from the page linked below, via Junk Charts.

http://dish.andrewsullivan.com/2013/08/29/households-shrink-houses-dont/