# load required packages
library(maptools) # creates maps and work with spatial files
library(broom) # assists with tidy data
library(ggplot2) # graphics package
library(leaflet) # interactive graphics (output does not show in RMD files)
library(dplyr) # joining data frames
library(readr) # quickly reads files into R
Before working through this activity it is helpful to have some familiarity with ggplot2 and making maps with ggplot2[[links needed]]
.
In the previous Introduction to Creating Maps with ggplot2[[link needed]]
, we used a simplistic .CSV file that contained latitude, longitude and group information. However, in most cases, geographic information for maps is stored in a more complex format, commonly referred to as a shapefile. A shapefile consists of several component files. Not all components are needed, but each shapefile must have at least the following three component files:
Shapefiles allow you to easily draw different polygons (i.e. countries, administrative divisions), lines (i.e. roads, rivers), and points (i.e. fire departments, mountains, battles).
Data: We will use a dataset based upon the Global Terrorism Database which consists of all terrorist incidents from 1970-2013. This is a large file and may take a while to load. We will also use a shapefile of the world’s state boundaries that was downloaded from the [Natural Earth website] (http//www.naturalearthdata.com/download/50m/cultural/ne_50m_admin_0_countries.zip).
# Reads the shapefile into the R workspace.
TerrorismData <- read_csv("data/terrorismData.csv")
Worldshapes <- readShapeSpatial("data/ne_50m_admin_0_countries")
str(Worldshapes, max.level = 2)
## Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
## ..@ data :'data.frame': 241 obs. of 63 variables:
## .. ..- attr(*, "data_types")= chr [1:63] "N" "C" "N" "C" ...
## ..@ polygons :List of 241
## .. .. [list output truncated]
## ..@ plotOrder : int [1:241] 12 184 39 227 42 33 16 87 113 9 ...
## ..@ bbox : num [1:2, 1:2] -180 -90 180 83.6
## .. ..- attr(*, "dimnames")=List of 2
## ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
Remarks
readShapeSpatial
from the maptools
package allows us to load all component files simultaneously.str
command allows us to see that the Worldshapes
object is of the class SpatialPolygonsDataFrame
. This means that R is representing this shapefile as a special object consisting of 5 slots of geographic data. The first slot, (and the most relevant to us) is the data slot, which contains a data frame representing the actual data adjoined to the geometries. Similar to how we access a column in a data frame with the $
infix, we can also access a slot in a shapefile with the @
infix.max.level=2
limits the information that is printed from the str
command.names(Worldshapes@data)
## [1] "scalerank" "featurecla" "labelrank" "sovereignt" "sov_a3"
## [6] "adm0_dif" "level" "type" "admin" "adm0_a3"
## [11] "geou_dif" "geounit" "gu_a3" "su_dif" "subunit"
## [16] "su_a3" "brk_diff" "name" "name_long" "brk_a3"
## [21] "brk_name" "brk_group" "abbrev" "postal" "formal_en"
## [26] "formal_fr" "note_adm0" "note_brk" "name_sort" "name_alt"
## [31] "mapcolor7" "mapcolor8" "mapcolor9" "mapcolor13" "pop_est"
## [36] "gdp_md_est" "pop_year" "lastcensus" "gdp_year" "economy"
## [41] "income_grp" "wikipedia" "fips_10" "iso_a2" "iso_a3"
## [46] "iso_n3" "un_a3" "wb_a2" "wb_a3" "woe_id"
## [51] "adm0_a3_is" "adm0_a3_us" "adm0_a3_un" "adm0_a3_wb" "continent"
## [56] "region_un" "subregion" "region_wb" "name_len" "long_len"
## [61] "abbrev_len" "tiny" "homepart"
dim(Worldshapes@data)
## [1] 241 63
The worldshapes@data
file containes 241 rows and 63 columns describing aspects of each country. Each row represents a country.
ggplot()
We can use ggplot2
to draw maps using shapefiles. However, ggplot2
on its own cannot read shapefiles directly – the shapefile must first be converted to a data frame. We can convert a spatial object to a data frame with the tidy
function from the broom
package.
Worldshapes_tidied <- tidy(Worldshapes)
g <- ggplot() +
geom_polygon(data = Worldshapes_tidied,
aes(x = long, y = lat, group = group),
fill = "lightblue", color = "black")
g
Remarks
Worldshapes_tidied
now contains the latitude, longitude and group information that allows use to create a base map.# Subset the terrorism database to only include events from the year 1984.
IncidentsIn1984 <- filter(TerrorismData, iyear == 1984)
g <- g + geom_point(data = IncidentsIn1984,
aes(x = longitude, y = latitude, size = severity, color = severity))
g
Remark
size
and color
, were added to show the relative severity1 of each incident.Other useful ggplot2
capabilities can be added onto the map. For example, in the following graph we can fill in our countries based upon a variable in our dataset, the gross domestic product.
# Create a new column called "id"
Worldshapes1 <- mutate(Worldshapes@data, id=as.character(0:240))
# Join the tidied shapefile with the data from our original shapefile.
Worldshapes2 <- left_join(Worldshapes_tidied, Worldshapes1,
by = "id")
# Plots map, coloring each country by the log(Gross Domestic Product)
ggplot() +
geom_polygon(data = Worldshapes2,
aes(x = long, y = lat, group = group, fill = sqrt(gdp_md_est)),
color = "black") +
scale_fill_continuous(name="Population(millions)",limits = c(0,4000),
breaks=c(1000,2000,3000),
low = "white", high = "darkblue")
## Warning in sqrt(gdp_md_est): NaNs produced
geom_point(data = IncidentsIn1984,
aes(x = longitude, y = latitude),
size = 1, color = "red")
## mapping: x = longitude, y = latitude
## geom_point: na.rm = FALSE
## stat_identity: na.rm = FALSE
## position_identity
Remarks
size = 1, color = "red"
are outside the aes()
command. This is because the size and color are fixed values and are not dependent upon the data. However, in the previous graph severity is a variable within the data frame, so the code, size = severity, color = severity
, was within the aes()
.NaNs produced
. This data uses -99
to represent missing values and so provides a warning when a few of the countries GPD values are not able to be calculated. By viewing the data and the graph, we see there is very little impact on the overall graph, so we will not go back to change the -99
values.This graph does not show temporal information, or how terrorism changes through time. One way to implement this is to have multiple maps plotted in the same window by making use of the facet
command. In the graph below we look at the number of terrorism incidents over time in the Republic of Ireland and the United Kingdom.
# create a new terrorism data frame that includes only four years
Incidents2 <- filter(TerrorismData, iyear == 1975 | iyear == 1985 | iyear == 1995 |iyear == 2005)
p <- ggplot() + geom_point(data = Incidents2,
aes(x = longitude, y = latitude, size = severity),
size = 1, color = "red", alpha = .5) +
facet_wrap(~iyear) +
coord_cartesian(xlim = c(-11, 3), ylim = c(51, 59)) +
geom_polygon(data = Worldshapes2,
aes(x = long, y = lat, group = group),
fill = "lightblue", color = "black", alpha = .3)
p
Questions
Create a graph that shows the terrorism incidents that occured in the United States during 2001. Have the size and color of the incident be determined by severity
.
Suppose you want to look the effects of terrorism before, during, and after the United States invasion of Iraq in 2003. Create three maps of the area, displayed side-by-side. Hint: You might also want to center the map on Iraq using xlim = c(35,50)
and ylim = c(28,38)
.
Create a world map colored by the square root of the estimated population sqrt(pop_est)
from the Worldshapes@data
. Does it appear that population is highly correlated with the number of incidents that occur?
leaflet
packageIf you are using electronic handouts or websites, you may want to be able to zoom in, click on objects, and have our maps update according to user inputs. This can be made possible with the use of HTML widgets. In this exercise, we will be using the leaflet
package, a great tool for rendering dynamic maps with various interactive elements.
Below is some code which renders terrorist incidents in Europe using the leaflet
package. Note that the code does run within Rstudio, but a graphic cannot be posted to a knitted HTML or word document.
library(leaflet)
# Subset terrorism database to only contain events in Europe in 1984
Europe1984Incidents <- filter(TerrorismData, iyear == 1984,
region2 == "Europe & Central Asia")
# addTiles() Add background map
# setView( Set where the map should originally zoom to
leaflet() %>%
addTiles() %>%
setView(lng = 10, lat = 50, zoom = 4) %>%
addCircleMarkers(data = Europe1984Incidents,
lat = ~latitude, lng = ~longitude,
radius = ~severity, popup = ~info,
color = "red", fillColor = "yellow")
Maps created with leaflet
are embedded with even more powerful capabilities. In this map, for example, you can click on an incident marker to see a popup of detailed information concerning the specific event you clicked on.
Questions
leaflet
map created with the above code, identify the location of the indicent that resulted in the most deaths in Great Britain. Who was the intended target during this incident?leaflet
package to create an interactive map of the United States using years 2000 through 2005.Both Worldshapes and TerrorismData files have a column that defines a region as Europe and Central Asia
(see Worldshapes@data$region_wb
and TerrorismData$region2
).
Create a map of all incidents that occured in Europe and Central Asia
during 2013. What countries appear to have the most incidents? Give a short (i.e. one paragraph) description of the graph. This description should include an identification of the countries with the most incidents.
A wide variety of global shapefiles can be found at Natural Earth: http://www.naturalearthdata.com.
A large repository of lower-level administrative divisions can be accessed at the GADM database: http://www.gadm.org/country.
More shapefiles specific to the geographic information you want to show can usually be found via a quick Google search.
1 Severity is defined in our dataset as the natural log of a weighted sum of the number of people killed and wounded: \(2 \cdot \log{(4 \cdot killed + wounded)}\).