Is the Parkland Event Different?

Using R to Investigate the Persistence of Public Interest in Mass Shooting Events

Ben Marwick

05 April 2018


Immediately reproducible

Binder builds a Docker image from my GitHub repo
Low friction interaction with a specified computational environment
Follow along at http://bit.ly/rstats-public-interest


Data from the Mother Jones article "US Mass Shootings, 1982-2018: Data From Mother Jones’ Investigation"

[Figure: number of mass shooting events per year, by year]
[Figure: number of fatalities per mass shooting event, by date]

Side note: interactive plots

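library(ggplot2)
library(plotly)
# bar chart of the number of events per year
p0 <-
  ggplot(us_mass_shootings,
         aes(Year)) +
  geom_bar() +
  ylab("Number of mass shooting events per year") +
  theme_minimal(base_size = 14)
# convert the static ggplot into an interactive plotly widget
ggplotly(p0)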

98 mass shooting events in the data. School shootings seem clustered.

[Figure: mass shooting events by date and venue (Airport, Military, Other, Religious, School, Workplace)]

Google search trends

gtrendsR by Philippe Massicotte and Dirk Eddelbuettel to perform and display Google Trends queries
Easy to get data from the API
Issues with temporal resolution (monthly or daily) and scale (each query is scaled 0-100 on its own)
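
The scale issue can be illustrated with a simplified model, assuming each query's series is rescaled so that its own peak equals 100 (Google's actual normalization is more involved). The helper rescale_0_100() and the raw counts below are hypothetical, not part of gtrendsR or the data used in this talk:

```r
# Hypothetical helper: rescale a series so its own maximum becomes 100,
# a simplified stand-in for per-query Google Trends scaling.
rescale_0_100 <- function(x) {
  round(100 * x / max(x))
}

# Made-up raw search counts for two separate queries (illustration only)
raw_a <- c(10, 50, 200, 40)
raw_b <- c(1, 5, 20, 4)

scaled_a <- rescale_0_100(raw_a)
scaled_b <- rescale_0_100(raw_b)

# Both series peak at 100 even though raw_a is ten times larger,
# so values from separate queries are not directly comparable.
identical(scaled_a, scaled_b)
```

Under this assumption, comparing magnitudes across events requires a shared reference within one query, not separate queries per event.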


Side note: text halos in ggplot

library(ggplot2)
g <- ggplot(...)
# get a halo on the text https://stackoverflow.com/a/10691826/1036500
# draw 16 white copies of each label, offset around a small circle
theta <- seq(pi/8, 2*pi, length.out = 16)
xo <- diff(range(my_data_frame$x)) / 200
yo <- diff(range(my_data_frame$y)) / 200
for (i in theta) {
  g <- g + geom_text(data = my_data_frame,
                     aes_q(x = bquote(x + .(cos(i) * xo)),
                           y = bquote(y + .(sin(i) * yo)),
                           label = ~label),
                     size = 4, colour = 'white', hjust = 1)
}
# draw the black label on top of the white copies
g + geom_text(..., colour = "black")

Small multiples to show the decay in Google search activity


Side note: iterate by row

library(gtrendsR)
library(glue)
library(purrrlyr)
# run one Google Trends query per row of search_intervals
search_results_over_time_with_cases <-
  by_row(search_intervals,
         ~gtrends(c("gun control"),
                  gprop = "web",
                  time = glue('{.x$Query_start_date} {.x$Query_end_date}'))
  )

Interactive overlay to compare decay in Google search activity

[Figure: Google search volume for 'gun control' by days relative to event (red vertical line is the day of the event), one line per case: Aurora ..., Bingham..., Fort Ho..., Las Veg..., Orlando..., San Ber..., Sandy H..., Stonema..., Texas F..., Washing...]
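
The shared x-axis in these plots, days relative to the event, can be derived from the dates returned by each query. A minimal sketch in base R, assuming a data frame with one row per case and day; the column names, the helper add_relative_day(), and the hits values are placeholders, not the data used in this talk:

```r
# Hypothetical helper: add a column giving days relative to each event
add_relative_day <- function(df) {
  df$relative_day <- as.integer(df$date - df$event_date)
  df
}

# Placeholder data: two cases, three days each; hits values are made up
trend <- data.frame(
  case       = rep(c("Sandy Hook", "Stoneman Douglas"), each = 3),
  event_date = rep(as.Date(c("2012-12-14", "2018-02-14")), each = 3),
  date       = as.Date(c("2012-12-13", "2012-12-14", "2012-12-15",
                         "2018-02-13", "2018-02-14", "2018-02-15")),
  hits       = c(5, 40, 100, 8, 35, 100)
)

trend <- add_relative_day(trend)
trend$relative_day
```

With every case on the same relative-day axis, the series can be overlaid in one panel or split into small multiples by case.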

Wikipedia page views

pageviews by Oliver Keyes to search and download article page view counts of Wikipedia and its sister projects
One of the most popular sites on the Web to satisfy information needs
Current API limited to 2015 onward


Side note: reading large text files

# https://gist.github.com/benmarwick/20eac969ce9199756dc074801f5b531d
library(chunked)
library(tidyverse)
my_file <- 'pagecounts-2012-12-14/pagecounts-2012-12-14' # 3.5 GB
# to find where the content starts, vary the skip value
read.table(my_file, nrows = 10, skip = 25)
# work on chunks of the file, keeping only rows that mention Gun_control
df <-
  read_chunkwise(my_file,
                 chunk_size = 5000,
                 skip = 30,
                 format = "table",
                 header = TRUE) %>%
  filter(stringr::str_detect(De.mw.De.5.J3M1O1, "Gun_control"))

[Figure: page views for Wikipedia articles related to gun control by days relative to event (red vertical line is the day of the event), by case: Las Vegas Strip massacre (2017), Orlando nightclub massacre (2016), San Bernardino mass shooting (2015), Stoneman Douglas High School shooting (2018), Texas First Baptist Church massacre (2017)]

Side note: non-equi joins

# https://stackoverflow.com/q/41132081/1036500
library(tidyverse)
library(fuzzyjoin)
elements <- c(0.1, 0.2, 0.5, 0.9, 1.1, 1.9, 2.1)
# tribble() replaces the deprecated frame_data()
intervals <- tribble(~phase, ~start, ~end,
                     "a",    0,      0.5,
                     "b",    1,      1.9,
                     "c",    2,      2.5)
# match each element to the interval where start <= element <= end
fuzzy_left_join(data.frame(elements),
                intervals,
                by = c("elements" = "start",
                       "elements" = "end"),
                match_fun = list(`>=`, `<=`)) %>%
  distinct()
##   elements phase start end
## 1      0.1     a     0 0.5
## 2      0.2     a     0 0.5
## 3      0.5     a     0 0.5
## 4      0.9  <NA>    NA  NA
## 5      1.1     b     1 1.9
## 6      1.9     b     1 1.9
## 7      2.1     c     2 2.5

Do mass shootings with more fatalities result in more page views?

[Figure: page views for Wikipedia articles related to gun control plotted against fatalities]

Broadcast television news coverage

newsflash by Bob Rudis for tools to work with the Internet Archive and GDELT Television Explorer
Keyword search the closed captioning streams, daily data going back to 2009 for MSNBC, CNN and FOX
Measurement variable is % airtime (15 second blocks)
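
As a rough illustration of the measurement variable, the share of airtime can be thought of as the number of 15-second blocks whose captions match the keyword, divided by all blocks monitored. This is a sketch under that assumption, not the GDELT implementation; pct_airtime() and the counts are hypothetical:

```r
# Number of 15-second blocks in 24 hours of broadcast
blocks_per_day <- 24 * 60 * 60 / 15

# Hypothetical helper: percentage of blocks that mention the keyword
pct_airtime <- function(matching_blocks, total_blocks) {
  100 * matching_blocks / total_blocks
}

# Made-up example: 144 matching blocks in one full day on one station
pct_airtime(144, blocks_per_day)
```

On this reading, a value of a few percent already corresponds to well over an hour of keyword mentions per station per day.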

[Figure: % airtime (15 second blocks) by days relative to event (red vertical line is the day of the event), one line per case: Aurora ..., Fort Ho..., Las Veg..., Orlando..., San Ber..., Sandy H..., Stonema..., Texas F..., Washing...]

Summary

☚ī¸ There are too many mass shootings
👉 Stoneman Douglas resembles previous mass shooting events in generating and sustaining public interest in gun control, as measured by Google search volume and TV broadcast news content.
👉 The most striking difference between Stoneman Douglas and previous mass shootings is the high number of page views for Wikipedia articles related to gun control
If Sandy Hook didn't change anything, probably no single event will


Colophon

Presentation written in R Markdown using xaringan

Compiled into HTML5 using RStudio & knitr

Source code hosting: https://github.com/benmarwick/Seattle-UseR-Group-April-2018

ORCID: http://orcid.org/0000-0001-7879-4531

Licensing:
