15.071 | Spring 2017 | Graduate

The Analytics Edge

7 Visualization

7.3 The Analytical Policeman: Visualization for Law and Order

Quick Question

The Los Angeles Police Department sees the benefits of predictive policing as which of the following? Select all that apply.

Explanation According to the Los Angeles Police Department, predictive policing does not eliminate the need for police officers or increase the rate at which they catch criminals. It does, however, allow more intelligent officer deployment, prevent crime, and help the department use resources more effectively.


Quick Question

For which of the following situations would a heat map be an appropriate visualization choice? Select all that apply.

Explanation A heatmap would be useful for the middle two options, because they are trying to visualize crime counts relative to two variables. For the first option, you could use a basic scatterplot with time on the x-axis and amount of crime on the y-axis. For the last option, you could use a bar plot with a bar for each month and the height being the average amount of crime in that month.
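A heat map needs one count for every combination of two variables. As a minimal sketch with made-up data (the `crimes` data frame and its values are hypothetical, not the course's mvt dataset), base R's table() counts each weekday-hour pair and as.data.frame() converts the result to the long format a tile plot expects:

```r
# Hypothetical toy data: one row per recorded crime.
crimes <- data.frame(
  Weekday = c("Mon", "Mon", "Tue", "Tue", "Tue", "Wed"),
  Hour    = c(0, 0, 13, 13, 22, 0)
)

# Count crimes for every (Weekday, Hour) combination,
# including combinations that never occur (Freq = 0).
counts <- as.data.frame(table(crimes$Weekday, crimes$Hour))
names(counts) <- c("Weekday", "Hour", "Freq")

# table() stores Hour as a factor; convert it back to a number.
counts$Hour <- as.numeric(as.character(counts$Hour))

print(counts)
```

With only one variable (for example, total crime per month), this table collapses to a single column of counts, which is why a bar plot or scatterplot is the better fit in that case.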

Quick Question

Create a new line plot, like the one in Video 3, but add the argument “linetype=2”. So the geom_line part of the plotting command should look like:

geom_line(aes(group=1), linetype=2)

What does this do?

Explanation The linetype parameter makes the line dashed. The plot can be generated with the following command:

ggplot(WeekdayCounts, aes(x = Var1, y = Freq)) + geom_line(aes(group=1), linetype=2) + xlab("Day of the Week") + ylab("Total Motor Vehicle Thefts")

Now, change the alpha parameter to 0.3 by replacing “linetype=2” with “alpha=0.3” in the plot command. What does this do?

Explanation The alpha parameter makes the line lighter in color, or more transparent. The plot can be generated with the following command:

ggplot(WeekdayCounts, aes(x = Var1, y = Freq)) + geom_line(aes(group=1), alpha=0.3) + xlab("Day of the Week") + ylab("Total Motor Vehicle Thefts")
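To try both variants without the course data, the following sketch uses a small stand-in for WeekdayCounts. The frequencies are invented for illustration, and the plotting code assumes the ggplot2 package is installed (it is skipped otherwise):

```r
# Hypothetical stand-in for WeekdayCounts; the Freq values are made up.
days <- c("Sunday", "Monday", "Tuesday", "Wednesday",
          "Thursday", "Friday", "Saturday")
WeekdayCounts <- data.frame(
  Var1 = factor(days, levels = days, ordered = TRUE),
  Freq = c(26300, 27400, 26800, 27400, 27300, 29300, 27100)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Shared plot skeleton: data, axis mappings, and axis labels.
  base <- ggplot(WeekdayCounts, aes(x = Var1, y = Freq)) +
    xlab("Day of the Week") + ylab("Total Motor Vehicle Thefts")

  # linetype = 2 draws a dashed line.
  dashed <- base + geom_line(aes(group = 1), linetype = 2)

  # alpha = 0.3 draws a solid but mostly transparent line.
  faded <- base + geom_line(aes(group = 1), alpha = 0.3)

  print(dashed)
  print(faded)
}
```

Both arguments sit outside aes() because they are fixed settings for the whole line rather than mappings from a column of the data.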

Quick Question

In this quick question, we’ll ask you questions about the following plots. Plot (1) is the heat map we generated at the end of Video 4. Plot (2) and Plot (3) were generated by changing argument values of the command used to generate Plot (1).

Plot (1)

Heatmap of total motor vehicle thefts according to time and day in shades of red.

Plot (2)

Alternate heatmap of total motor vehicle thefts according to time and day in shades of red.

Plot (3)

Heatmap of total motor vehicle thefts according to time and day in shades of grey.

Which argument(s) did we change to get Plot (2)? Select all that apply.

Explanation To get Plot (2), we changed the arguments "x" and "y" (we flipped them). Plot (2) can be generated with the following code:

ggplot(DayHourCounts, aes(x = Var1, y = Hour)) + geom_tile(aes(fill=Freq)) + scale_fill_gradient(name="Total MV Thefts", low="white", high="red") + theme(axis.title.y=element_blank())

Which argument(s) did we change to get Plot (3)? Select all that apply.

Explanation To get Plot (3), we changed the argument "high" to "black". Plot (3) can be generated with the following code:

ggplot(DayHourCounts, aes(x = Hour, y = Var1)) + geom_tile(aes(fill=Freq)) + scale_fill_gradient(name="Total MV Thefts", low="white", high="black") + theme(axis.title.y=element_blank())
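The three plots differ only in the axis mapping and the high end of the color gradient. The sketch below puts the variants side by side on a tiny made-up DayHourCounts (three days, four hours, invented counts), assuming ggplot2 is installed:

```r
# Hypothetical stand-in for DayHourCounts: every (day, hour) pair once.
DayHourCounts <- expand.grid(
  Var1 = c("Mon", "Tue", "Wed"),
  Hour = 0:3
)
DayHourCounts$Freq <- c(5, 8, 2, 1, 0, 3, 7, 9, 4, 10, 12, 6)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # One tile per row, shaded by the count.
  tiles <- geom_tile(aes(fill = Freq))

  # Plot (1) style: hour on x, day on y, white-to-red gradient.
  p1 <- ggplot(DayHourCounts, aes(x = Hour, y = Var1)) + tiles +
    scale_fill_gradient(name = "Total MV Thefts", low = "white", high = "red")

  # Plot (2) style: identical except that x and y are swapped.
  p2 <- ggplot(DayHourCounts, aes(x = Var1, y = Hour)) + tiles +
    scale_fill_gradient(name = "Total MV Thefts", low = "white", high = "red")

  # Plot (3) style: same axes as Plot (1), but high = "black".
  p3 <- ggplot(DayHourCounts, aes(x = Hour, y = Var1)) + tiles +
    scale_fill_gradient(name = "Total MV Thefts", low = "white", high = "black")

  print(p1); print(p2); print(p3)
}
```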

Quick Question

In the previous video, our heatmap was plotting squares out in the water, which seems a little strange. We can fix this by removing the observations from our data frame that have Freq = 0.

Take a subset of LatLonCounts, only keeping the observations for which Freq > 0, and call it LatLonCounts2.

Redo the heatmap from the end of Video 5, using LatLonCounts2 instead of LatLonCounts. You should no longer see any squares out in the water, or in any areas where there were no motor vehicle thefts.

How many observations did we remove?

Explanation

You can take a subset of LatLonCounts, only keeping the observations for which Freq > 0 with the following command:

LatLonCounts2 = subset(LatLonCounts, Freq > 0)

Then, you can generate the new heatmap with the following command:

ggmap(chicago) + geom_tile(data=LatLonCounts2, aes(x = Long, y = Lat, alpha=Freq), fill="red")

LatLonCounts has 1638 observations and LatLonCounts2 has 686, so we removed 1638 - 686 = 952 observations. These counts can be found by using nrow or str.
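The same subset-and-count pattern can be checked on a small example. The data frame below is a made-up five-row stand-in for LatLonCounts, not the real Chicago grid:

```r
# Hypothetical stand-in for LatLonCounts; coordinates and counts are invented.
LatLonCounts <- data.frame(
  Long = c(-87.9, -87.8, -87.7, -87.6, -87.5),
  Lat  = c(41.7, 41.8, 41.9, 42.0, 41.8),
  Freq = c(0, 12, 0, 3, 0)
)

# Keep only grid cells with at least one theft.
LatLonCounts2 <- subset(LatLonCounts, Freq > 0)

# The number of removed observations is the difference in row counts.
removed <- nrow(LatLonCounts) - nrow(LatLonCounts2)
print(removed)
```

Here three of the five rows have Freq equal to 0, so removed is 3.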


Quick Question

Redo the map from Video 6, but this time fill each state with the variable GunOwnership. This shows the percentage of people in each state who own a gun.

Which of the following states has the highest gun ownership rate? To see the state labels, take a look at the World Atlas map.

Explanation You can generate the gun ownership plot using the following command:

ggplot(murderMap, aes(x = long, y = lat, group=group, fill = GunOwnership)) + geom_polygon(color="black") + scale_fill_gradient(low = "black", high = "red", guide="legend")

Of these five states, the one that is the most red is Montana.
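The fill mechanics can be seen on a toy map as well. In this sketch, `toyMap`, its two square "states", and their ownership values are all hypothetical; only the column layout (long, lat, group, and a per-region value repeated on every vertex) mirrors murderMap. The plotting step assumes ggplot2 is installed:

```r
# Hypothetical two-region map: each region is a unit square of 4 vertices.
toyMap <- data.frame(
  region       = rep(c("a-state", "b-state"), each = 4),
  long         = c(0, 1, 1, 0,  1, 2, 2, 1),
  lat          = c(0, 0, 1, 1,  0, 0, 1, 1),
  group        = rep(c(1, 2), each = 4),
  GunOwnership = rep(c(0.25, 0.55), each = 4)
)

# The region that will be drawn most red is the one with the largest value.
highest <- as.character(toyMap$region[which.max(toyMap$GunOwnership)])
print(highest)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # group keeps each region's vertices together; fill shades by ownership.
  p <- ggplot(toyMap, aes(x = long, y = lat, group = group,
                          fill = GunOwnership)) +
    geom_polygon(color = "black") +
    scale_fill_gradient(low = "black", high = "red", guide = "legend")
  print(p)
}
```

Reading the answer off the colors and computing it with which.max should agree; the numeric check is useful when two regions have similar shades.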

Video 3: A Line Plot

In the next few videos, we’ll be using the dataset mvt (CSV - 7.0MB). Please download this dataset before starting this video. This data comes from the Chicago Police Department.

An R script file with all of the commands used in this lecture can be downloaded here: Resource Unit7_Crime (R)
