WEBVTT

00:00:04.500 --> 00:00:06.890
The good, the bad, and the ugly.

00:00:06.890 --> 00:00:11.540
This is a recitation for
the visualization week.

00:00:11.540 --> 00:00:14.570
With great power comes
great responsibility.

00:00:14.570 --> 00:00:17.580
There are many ways to
visualize the same data.

00:00:17.580 --> 00:00:21.210
You have just seen how to make
quite attractive visualizations

00:00:21.210 --> 00:00:24.590
with ggplot2, which has
good default settings,

00:00:24.590 --> 00:00:27.230
but judgment is still
required from the user.

00:00:27.230 --> 00:00:30.860
For example, do I decide
to vary the size of a point

00:00:30.860 --> 00:00:33.810
or do I vary the
color of a point?

00:00:33.810 --> 00:00:35.960
It is worth noting at
this point that Excel

00:00:35.960 --> 00:00:37.670
and other similar
programs can also

00:00:37.670 --> 00:00:40.810
be used to make perfectly
acceptable visualizations,

00:00:40.810 --> 00:00:42.560
or terrible ones.

00:00:42.560 --> 00:00:45.150
The tool can help, but it's
ultimately up to the user

00:00:45.150 --> 00:00:48.040
to make decisions.

00:00:48.040 --> 00:00:50.290
So what is the difference
between a good visualization

00:00:50.290 --> 00:00:53.020
and a bad visualization then?

00:00:53.020 --> 00:00:55.430
I would argue that
a good visualization

00:00:55.430 --> 00:01:00.570
clearly and accurately conveys
the key messages in the data.

00:01:00.570 --> 00:01:03.740
A bad visualization will
obfuscate the data either

00:01:03.740 --> 00:01:07.020
through ignorance or malice.

00:01:07.020 --> 00:01:09.480
So what does this mean?

00:01:09.480 --> 00:01:11.850
Visualizations can
be used by an analyst

00:01:11.850 --> 00:01:16.180
for their own consumption to
gain insights into the data.

00:01:16.180 --> 00:01:19.370
Visualizations can also be
used to provide information

00:01:19.370 --> 00:01:24.150
to a decision maker and/or to
convince someone of something.

00:01:24.150 --> 00:01:26.060
Now, a bad
visualization can hide

00:01:26.060 --> 00:01:30.760
patterns that could give insight
or mislead decision makers.

00:01:30.760 --> 00:01:32.390
This is where the
malice part comes in.

00:01:35.310 --> 00:01:39.640
So today, we will look at a
few examples of visualizations

00:01:39.640 --> 00:01:42.200
taken from a variety of sources.

00:01:42.200 --> 00:01:45.530
We'll discuss what is good
and what is bad about them.

00:01:45.530 --> 00:01:48.490
Then we will switch into R to
build better versions of them

00:01:48.490 --> 00:01:50.539
for ourselves.

00:01:50.539 --> 00:01:53.430
But I want you to think for
yourself in this presentation.

00:01:53.430 --> 00:01:55.350
You might not agree with
all the points I make

00:01:55.350 --> 00:01:57.990
or my opinions about
these visualizations.

00:01:57.990 --> 00:02:00.890
Visualization is
inherently subjective

00:02:00.890 --> 00:02:04.750
and the right visualization
will depend on the situation.

00:02:04.750 --> 00:02:06.930
So use your own
judgment and think

00:02:06.930 --> 00:02:09.669
about what I talked about
before with a good visualization

00:02:09.669 --> 00:02:11.920
and a bad visualization.