15.071 | Spring 2017 | Graduate

The Analytics Edge

7 Visualization

7.2 Visualizing the World: An Introduction to Visualization

Quick Question

Normally, a scatterplot only allows us to visualize two dimensions: one on the x-axis and one on the y-axis. In the previous video, we were able to show a third dimension on the scatterplot using what attribute?

Explanation: On slide 3, we show the scatterplot from slide 2, but with the number of cylinders shown by the color of the points. This allows us to visualize a third dimension of our data.
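As a rough sketch of the same idea in ggplot, the example below maps a third variable to the color of the points. It uses R's built-in mtcars data set as a stand-in, which is an assumption: the slides may be based on different data. The columns wt, mpg, and cyl belong to mtcars, not to the lecture files.

```r
library(ggplot2)

# mtcars ships with R; it is only a stand-in for the data behind the slides.
# factor(cyl) treats the cylinder count as a category, so each count
# (4, 6, or 8) gets its own color instead of a continuous gradient.
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()

print(p)
```

Weight and miles per gallon take the two axes, and color carries the third dimension.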

Quick Question

Why is it particularly helpful for WHO to provide data visualizations? Select all that apply.

Explanation: Although the data behind many visualizations could be displayed in other ways (such as tables), visualizations communicate data to the public more effectively and can easily be reused by others in presentations.

Quick Question

In this quick question, we’ll be asking you about the following three plots, which we saw in Video 1. We’ll refer to them as the “Scatterplot”, the “Histogram”, and the “US Map”.

The Scatterplot:

Scatterplot showing the miles per gallon of a car as a function of the car's weight.

The Histogram:

Histograms of different categories using Hubway data.

The US Map:

U.S. state map indicating unemployment by state by varying shades of red.

In the Scatterplot, what are the geometric objects?

In the Histogram, what are the geometric objects?

In the US Map, what are the geometric objects?

All three of these plots defined a particular aesthetic property. What is it?

Explanation: The geometric objects are points for the Scatterplot, bars for the Histogram, and polygons (the states) for the US Map. The aesthetic property that all three plots define is color.

Quick Question

Create the fertility rate versus population under 15 plot again:

ggplot(WHO, aes(x = FertilityRate, y = Under15)) + geom_point()

Now, color the points by the Region variable.

Note: If you are having a hard time distinguishing the colors, you can add scale_color_brewer(palette="Dark2") to your plotting command right after geom_point(). This color palette is often easier to read if you are colorblind.

One region in particular has a lot of countries with a very low fertility rate and a very low percentage of the population under 15. Which region is it?

Explanation: You can color the points by region if you adjust the command to the following:

ggplot(WHO, aes(x = FertilityRate, y = Under15, color = Region)) + geom_point()

Most of the countries in Europe have a very low fertility rate and a very low percentage of the population under 15.
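The following is a minimal, self-contained sketch of coloring by Region with the optional Dark2 palette. Because the WHO CSV is not reproduced here, it builds a small stand-in data frame called who_toy (a hypothetical name). The column names match the lecture, but every value is invented for illustration and should not be read as real WHO figures.

```r
library(ggplot2)

# Toy stand-in for the WHO data: same column names as in the lecture,
# but all values below are made up purely for illustration.
who_toy <- data.frame(
  Region        = c("Europe", "Europe", "Africa", "Africa", "Americas", "Americas"),
  FertilityRate = c(1.4, 1.6, 5.2, 4.8, 2.1, 2.5),
  Under15       = c(15, 16, 44, 42, 25, 28)
)

# Map Region to color, then add the Dark2 palette right after geom_point().
p <- ggplot(who_toy, aes(x = FertilityRate, y = Under15, color = Region)) +
  geom_point() +
  scale_color_brewer(palette = "Dark2")

print(p)
```

With the real data loaded as WHO, replacing who_toy with WHO should give the plot described in the explanation.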

Video 4: Basic Scatterplots Using ggplot

In the rest of this lecture, we’ll be using the data file WHO (CSV). Please download this file to your computer, and save it to a location that you will remember. This data comes from the Global Health Observatory Data Repository.

An R script file with all of the commands used in this lecture can be downloaded here: Resource Unit7_WHO (R).

Colors and shapes in R

If you want to see all of the available colors in R, type in your R console:

colors()

All of the available shapes are described in the following image:

Variety of colors and shapes available in R using ggplot.

This image comes from Cookbook for R. License: CC BY-SA. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.

The number 0 corresponds to an empty square, the number 6 corresponds to an upside-down triangle, and so on.
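As a small sketch of how these color names and shape numbers can be used, the example below lists the first few color names and draws one point for each of the two shape codes mentioned above. The data frame shape_demo, the point size, and the choice of "darkred" are illustrative assumptions, not part of the lecture script.

```r
library(ggplot2)

# colors() returns the names of R's built-in colors as a character vector.
all_colors <- colors()
head(all_colors)

# One point per shape code: 0 (empty square) and 6 (upside-down triangle).
shape_demo <- data.frame(x = c(1, 2), y = c(1, 1), code = c(0, 6))

# scale_shape_identity() tells ggplot to use the numbers in `code`
# directly as shape codes instead of mapping them to a default set.
p <- ggplot(shape_demo, aes(x = x, y = y, shape = code)) +
  geom_point(size = 5, color = "darkred") +
  scale_shape_identity()

print(p)
```

To draw every point with a single shape, you can instead pass the number directly, for example geom_point(shape = 6).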
