Before working through this activity, it is helpful to have some familiarity with ggplot2 and making maps with ggplot2 [[links needed]].
This activity will require the following packages:
library(maptools) # creates maps and works with spatial files
library(broom) # assists with tidy data
library(dplyr) # joining data frames
library(ggmap) # spatial visualization with ggplot
In this activity we create maps using data collected from the 2015 records of the New York Police Department. This information is restricted to police stops that resulted in an actual arrest. We will be combining this data with information from the 2010 census on population and unemployment rates¹ as well as information from schools in the area.
Precincts.sp <- readShapeSpatial("shapefiles/precincts/precincts")
Schools.sp <- readShapeSpatial("shapefiles/schools/schools")
The objects Precincts.sp and Schools.sp are spatial objects. As described in Making Maps with Shapefiles [[Reference Needed]], packages such as ggmap and ggplot2 cannot read shapefiles directly. As before, we use the tidy function from the broom package to convert the spatial objects to a data frame.
Precincts <- tidy(Precincts.sp)
# Join tidied spatial data to the descriptive precinct data
Precincts.sp$id <- rownames(Precincts.sp@data)
Precincts <- full_join(Precincts, Precincts.sp@data, by="id")
# Use dplyr to create longitude and latitude columns
Schools <- Schools.sp@data
Schools <- mutate(Schools, longitude = coords_x1, latitude = coords_x2)
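To see why the id column matters in the join above, here is a minimal sketch on toy data. The two data frames and all of their values are made up for illustration, and the real tidied Precincts object has more columns. Base R's merge() with all = TRUE stands in for full_join() here so that the sketch runs without any packages.

```r
# Toy stand-in for the output of tidy(): one row per polygon vertex.
# All values are hypothetical.
toy_vertices <- data.frame(
  long  = c(-74.0, -73.9, -73.9, -73.8, -73.7, -73.7),
  lat   = c(40.7, 40.7, 40.8, 40.6, 40.6, 40.7),
  order = 1:6,
  id    = c("0", "0", "0", "1", "1", "1"),
  stringsAsFactors = FALSE
)

# Toy stand-in for Precincts.sp@data: one row per polygon.
toy_attributes <- data.frame(
  Precinct = c(1, 5),
  TotalArr = c(120, 340),
  row.names = c("0", "1")
)

# The row names of the attribute table identify each polygon,
# so copying them into an id column gives both tables a shared key.
toy_attributes$id <- rownames(toy_attributes)

# Keep every row from both tables, matching on id
# (the base R analogue of full_join(..., by = "id")).
toy_joined <- merge(toy_vertices, toy_attributes, by = "id", all = TRUE)

# merge() may reorder rows, so restore the vertex drawing order.
toy_joined <- toy_joined[order(toy_joined$order), ]

# Each vertex row now carries the attributes of its polygon.
toy_joined
```

After the join, every vertex of a polygon has the same TotalArr value, which is what lets geom_polygon fill a whole precinct with one color.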
Each of the variables in the Precincts object is described in the NYPD2015 Codebook.
We can create a map of precinct data by using the ggplot2 package. The code below colors each police precinct by the total number of arrests in each area.
g <- ggplot() +
geom_polygon(data = Precincts, alpha = 0.7,
aes(x = long, y = lat, group = group, fill = TotalArr))
g
g <- g + geom_path(data = Precincts, size = 0.3,
aes(x = long, y = lat, group = group)) +
scale_fill_continuous(name="Total Arrests", low = "white",
high = "darkgreen") +
ggtitle( "Total Arrests by Precinct" ) +
coord_cartesian(xlim = c(-74.3, -73.6), ylim = c(40.48, 40.94))
g
The ggmap package allows us to easily add landmarks and geographic images to our maps by integrating information from Google Maps, OpenStreetMap, Stamen Maps, or CloudMade Maps.
The code below creates a background map of New York. Notice that the code for the graphic is identical to the one above except that ggplot() is replaced by ggmap(NewYork). The get_map function below can be slow, and it may take two or three tries to get the map to download. If it does not work, the rest of this lab can be completed using ggplot() instead of ggmap(NewYork).
# It might take a while for R to download the map, if it succeeds at all.
NewYork <- get_map(location = "New York", force = FALSE)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=New+York&zoom=10&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=New%20York&sensor=false
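Since the download may need more than one attempt, one option is to wrap the call in a small retry helper. The sketch below is only an illustration: retry and flaky_download are hypothetical names, not part of ggmap, and flaky_download simply simulates a download that fails twice before succeeding.

```r
# Call f() up to max_tries times and return the first successful result.
# Returns NULL if every attempt raises an error (or if f() itself returns NULL).
retry <- function(f, max_tries = 3) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) NULL)
    if (!is.null(result)) {
      return(result)
    }
    message("Attempt ", attempt, " failed")
  }
  NULL
}

# Hypothetical stand-in for a download that fails twice, then succeeds.
calls <- 0
flaky_download <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("download failed")
  "map data"
}

result <- retry(flaky_download)

# In this lab the same helper could be applied to the real call, e.g.
#   NewYork <- retry(function() get_map(location = "New York", force = FALSE))
# and if NewYork is NULL, continue with ggplot() as the base layer.
```

Returning NULL instead of stopping lets the rest of a script decide whether to fall back to a plain ggplot() map.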
p <- ggmap(NewYork) +
geom_polygon(data = Precincts, alpha = 0.7,
aes(x = long, y = lat, group = group, fill = TotalArr))
p
p <- p + geom_path(data = Precincts, size = 0.3,
aes(x = long, y = lat, group = group)) +
scale_fill_continuous(name="Total Arrests", low = "white",
high = "darkgreen") +
ggtitle( "Total Arrests by Precinct" ) +
coord_cartesian(xlim = c(-74.3, -73.6), ylim = c(40.48, 40.94))
p
Questions
Use either ggplot() or ggmap(NewYork) to finish the following questions.
Use fill = TotalArr/TotalPop to create a map of New York City police precincts, where each precinct is colored by the percentage of arrests in that precinct.
Use dplyr or other means to remove Precinct 22 and then recreate the map in Question 2. How does the scale change?
The Schools data frame contains multiple variables that are described in the NYPD2015 Codebook. To plot schools as points on the map, we only need to add a geom_point layer. Note that to properly visualize this large number of points, you will need to use the Zoom option when viewing the graph within RStudio.
# Notice that this code will work for either g (ggplot) or p (ggmap)
g <- g + geom_point(data = Schools,
aes(x = longitude, y = latitude, color = MAT_Mean,
size = TotalStdn)) +
scale_color_gradient(low = "yellow", high = "red",
trans = "sqrt") +
ggtitle( "NYPD Precinct and School Information" ) +
coord_cartesian(xlim = c(-74.3, -73.6), ylim = c(40.48, 40.94))
g
## Warning: Removed 49 rows containing missing values (geom_point).
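The warning above means some rows could not be drawn because of missing values. As a rough sketch of how to inspect missing values in a data frame, the example below uses a toy data frame with made-up numbers (only the column names MAT_Mean and TotalStdn come from the lab) and base R's is.na().

```r
# Hypothetical school records; only the column names come from the lab.
toy_schools <- data.frame(
  MAT_Mean  = c(2.8, NA, 3.1, NA, 2.5),
  TotalStdn = c(410, 380, NA, 520, 300)
)

# is.na() marks each missing entry; colSums() counts them per column.
missing_per_column <- colSums(is.na(toy_schools))
missing_per_column

# Rows that are missing the variable mapped to point size.
rows_missing_size <- toy_schools[is.na(toy_schools$TotalStdn), ]
nrow(rows_missing_size)
```

The same pattern applies to any column of a real data frame, so it can be used to compare the number of missing entries against the count reported in a plotting warning.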
Questions
MAT_Mean values are missing. The schools with missing TotalStdn have automatically been removed from the plot. Determine how many missing values exist in the MAT_Mean column and how many missing values exist in the TotalStdn column.
MAT_Mean values).
Option 1: There has been some controversy around New York City’s Stop-and-Frisk policies, which gave police officers the right to stop, search, or arrest any suspicious person with reasonable grounds for action. The article Police stop more than 1 million people on the street states that “Civil liberties groups say the practice is racist and fails to deter crime. Police departments maintain it is a necessary tool that turns up illegal weapons and drugs and prevents more serious crime.” Page 10 of the New York City Bar Association Report on the NYPD’s Stop-and-Frisk Policy states that “The NYPD has defended the policy on the grounds that most stops are conducted in high-crime neighborhoods with high concentrations of people of color.”
Create a graphic showing the percentage of people of color in each precinct. Create another graphic representing arrests in each precinct. Finally, create a one-page report, including graphics, that addresses the issues stated in the NBC article above. The article was written in 2009. Does the Precincts data, which only has 2015 arrest data, provide support for either the civil liberties groups or the NYPD?
Option 2: In July of 2015, the radio show This American Life presented an episode that discussed how public education is critically related to crime rates.
Create graphics and a one-page report to evaluate any relationship between quality of education (such as the school’s average math or English scores) and crime rates.
ggmap Cheat Sheet: https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/ggmap/ggmapCheatsheet.pdf
ggmap: Spatial visualization with ggplot: https://journal.r-project.org/archive/2013-1/kahle-wickham.pdf
1 It's important to recognize that census tracts and school districts do not directly align with police precincts. The census data in this lab has been converted to the precinct level, but the information is based upon estimates, not exact values.