class: center, middle, inverse, title-slide

# Is the Parkland Event Different?
## Using R to Investigate the Persistence of Public Interest in Mass Shooting Events
### Ben Marwick
### 05 April 2018

---
class: normal, center

# Motivation

<img src="images/nyt-guns-pop-countries.png" alt="" style="width: 550px;"/>

https://www.nytimes.com/2017/11/07/world/americas/mass-shootings-us-international.html

---
class: normal, center

<table>
<tr>
<td valign="top"><img src="images/nate-silver-tweet.png" style="width: 250px;"/></td>
<td valign="top"><img src="images/alex-agadjanian-tweet.png" style="width: 250px;"/></td>
</tr>
<tr>
<td valign="bottom"><img src="images/vox-screenshot.png" style="width: 250px;"/></td>
<td valign="bottom"><img src="images/mashable-screenshot.png" style="width: 250px;"/></td>
</tr>
</table>

.small[https://www.vox.com/policy-and-politics/2018/2/21/17033308/florida-shooting-media-gun-control, https://mashable.com/2018/02/19/parkland-students-gun-control-debate-google-trends/]

---
class: normal, center

<table>
<tr>
<td valign="top"><img src="images/vox-1.jpg" style="width: 250px;"/></td>
<td valign="top"><img src="images/vox-2.jpg" style="width: 250px;"/></td>
</tr>
<tr>
<td valign="top"><img src="images/vox-3.jpg" style="width: 250px;"/></td>
<td valign="top"><img src="images/mashable-screenshot-2.png" style="width: 250px;"/></td>
</tr>
</table>

---
class: normal

# How can R help us understand when this situation might change?
.large[<i class="fa fa-wrench"></i> [Make the code immediately reproducible by anyone, anywhere](https://mybinder.org/)]

.large[<i class="fa fa-wrench"></i> [Google search trends](https://github.com/PMassicotte/gtrendsR)]

.large[<i class="fa fa-wrench"></i> [Wikipedia page views](https://github.com/petermeissner/wikipediatrend)]

.large[<i class="fa fa-wrench"></i> [Broadcast news airtime](https://github.com/hrbrmstr/newsflash)]

---
class: normal

# Immediately reproducible

.large[<i class="fa fa-flask"></i> [Binder](https://mybinder.org/) builds a Docker image from my GitHub repo]

.large[<i class="fa fa-thumbs-up"></i> Low friction interaction with a specified computational environment]

.large[<i class="fa fa-link"></i> Follow along at http://bit.ly/rstats-public-interest]

---
class: normal

Data from the Mother Jones article [US Mass Shootings, 1982-2018: Data From Mother Jones’ Investigation](https://www.motherjones.com/politics/2012/12/mass-shootings-mother-jones-full-data/)
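
A minimal sketch of preparing such a table for plotting, assuming the downloaded data has a `date` column in month/day/year form. The column names and the three rows below are made up for illustration; only the idea of deriving a `Year` column is the point.

```r
# Made-up rows standing in for the downloaded data; the column names
# `case` and `date` and the m/d/Y date layout are assumptions
raw <- data.frame(
  case = c("Example A", "Example B", "Example C"),
  date = c("1/15/2001", "6/30/2001", "3/9/2002"),
  stringsAsFactors = FALSE
)

# Parse the date strings, then pull out the calendar year as an integer
us_mass_shootings <- raw
us_mass_shootings$Date <- as.Date(raw$date, format = "%m/%d/%Y")
us_mass_shootings$Year <- as.integer(format(us_mass_shootings$Date, "%Y"))

# Number of events per year
table(us_mass_shootings$Year)
```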
---
class: normal

# <i class="fa fa-music"></i> Side note: interactive plots

```r
library(ggplot2)
library(plotly)

p0 <- ggplot(us_mass_shootings, aes(Year)) +
  geom_bar() +
  ylab("Number of mass shooting events per year") +
  theme_minimal(base_size = 14)

ggplotly(p0)
```

---
class: normal

98 mass shooting events in the data. School shootings seem clustered.
---
class: normal

# Google search trends

.mediumish[<i class="fa fa-archive"></i> [gtrendsR](https://github.com/PMassicotte/gtrendsR) by Philippe Massicotte and Dirk Eddelbuettel to perform and display Google Trends queries]

.mediumish[<i class="fa fa-thumbs-up"></i> Easy to get data from the API]

.mediumish[<i class="fa fa-thumbs-down"></i> Issues with resolution (month-day) and scale (0-100 per query only)]

---
class: normal, center

![](Seattle-UseR-Group-April-2018_files/figure-html/unnamed-chunk-6-1.png)<!-- -->

---
class: normal

# <i class="fa fa-music"></i> Side note: text halos in ggplot

```r
library(ggplot2)
g <- ggplot(...)

# get a halo on the text https://stackoverflow.com/a/10691826/1036500
theta <- seq(pi/8, 2*pi, length.out=16)
xo <- diff(range(my_data_frame$x))/200
yo <- diff(range(my_data_frame$y))/200

for(i in theta) {
  g <- g + geom_text(
    data = my_data_frame,
    aes_q(x = bquote(x+.(cos(i)*xo)),
          y = bquote(y+.(sin(i)*yo)),
          label = ~label),
    size=4, colour='white', hjust = 1)
}

g + geom_text(..., colour = "black")
```

---
class: normal

# Small multiples to show the decay in Google search activity

![](Seattle-UseR-Group-April-2018_files/figure-html/unnamed-chunk-9-1.png)<!-- -->

---
class: normal

# <i class="fa fa-music"></i> Side note: iterate by row

```r
library(gtrendsR)
library(glue)
library(purrrlyr)

search_results_over_time_with_cases <-
  by_row(search_intervals,
         ~gtrends(c("gun control"),
                  gprop = "web",
                  time = glue('{.x$Query_start_date} {.x$Query_end_date}'))
  )
```

---
class: normal

# Interactive overlay to compare decay in Google search activity
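
One way to make events comparable, sketched here with made-up numbers rather than real Google Trends output: express each series as days since its event and rescale it to its own peak. Because each query is already scaled 0-100 on its own, this compares the shape of the decay, not absolute search volume. The helper `align_to_event()` and the column names `date` and `hits` are assumptions for illustration.

```r
# Hypothetical helper: re-express one event's search series as days since the
# event, with interest rescaled so that the series' own peak equals 100
align_to_event <- function(df, event_date, label) {
  data.frame(
    event = label,
    days_since_event = as.integer(df$date - event_date),
    relative_hits = 100 * df$hits / max(df$hits),
    stringsAsFactors = FALSE
  )
}

# Made-up daily values for two events (illustration only)
event_a <- data.frame(date = as.Date("2001-01-10") + 0:3,
                      hits = c(50, 25, 10, 5))
event_b <- data.frame(date = as.Date("2002-06-01") + 0:3,
                      hits = c(80, 60, 40, 20))

overlay <- rbind(
  align_to_event(event_a, as.Date("2001-01-10"), "A"),
  align_to_event(event_b, as.Date("2002-06-01"), "B")
)

# Both series now share an x-axis (days since event) and a 0-100 y-axis,
# so they can be drawn on one plot, one line per `event`
overlay
```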
---
class: normal

# Wikipedia page views

.mediumer[<i class="fa fa-archive"></i> [pageviews](https://github.com/Ironholds/pageviews/) by Oliver Keyes to search and download article page view counts of Wikipedia and its sister projects]

.mediumer[<i class="fa fa-wikipedia-w"></i> One of the most popular sites on the Web to satisfy information needs]

.mediumer[<i class="fa fa-thumbs-down"></i> Current API limited to 2015 onward]

---
class: normal

# <i class="fa fa-music"></i> Side note: reading large text files

```r
# https://gist.github.com/benmarwick/20eac969ce9199756dc074801f5b531d
library(chunked)
library(tidyverse)

my_file <- 'pagecounts-2012-12-14/pagecounts-2012-12-14' # 3.5 GB

# to find where the content starts, vary the skip value,
read.table(my_file, nrows = 10, skip = 25)

# work on chunks of the file
df <-
  read_chunkwise(my_file,
                 chunk_size=5000,
                 skip = 30,
                 format = "table",
                 header = TRUE) %>%
  filter(stringr::str_detect(De.mw.De.5.J3M1O1, "Gun_control"))
```

---
class: normal, center

![](Seattle-UseR-Group-April-2018_files/figure-html/unnamed-chunk-14-1.png)<!-- -->

---
class: normal, center
---
class: normal

# <i class="fa fa-music"></i> Side note: non-equi joins

```r
# https://stackoverflow.com/q/41132081/1036500
library(tidyverse)
library(fuzzyjoin)

elements <- c(0.1, 0.2, 0.5, 0.9, 1.1, 1.9, 2.1)

# tribble() replaces the deprecated frame_data()
intervals <- tribble(~phase, ~start, ~end,
                     "a",     0,      0.5,
                     "b",     1,      1.9,
                     "c",     2,      2.5)

# keep each element, matched to the interval where start <= element <= end
fuzzy_left_join(data.frame(elements), intervals,
                by = c("elements" = "start", "elements" = "end"),
                match_fun = list(`>=`, `<=`)) %>%
  distinct()

##   elements phase start end
## 1      0.1     a     0 0.5
## 2      0.2     a     0 0.5
## 3      0.5     a     0 0.5
## 4      0.9  <NA>    NA  NA
## 5      1.1     b     1 1.9
## 6      1.9     b     1 1.9
## 7      2.1     c     2 2.5
```

---
class: normal

# Do mass shootings with more fatalities result in more page views?
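
One way to approach this question, sketched with made-up numbers (not the data behind this talk): a Spearman rank correlation between fatalities and page views. Rank correlation is a suggestion here, chosen because page view counts can be heavily skewed; the vectors `fatalities` and `page_views` are hypothetical.

```r
# Made-up values for five events (illustration only, not real counts)
fatalities <- c(5, 9, 12, 26, 49)
page_views <- c(1200, 900, 5000, 20000, 61000)

# Spearman's rho works on ranks, so one extreme event cannot dominate the result
rho <- cor(fatalities, page_views, method = "spearman")
rho
```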
---
class: normal

# Broadcast television news coverage

.mediumer[<i class="fa fa-archive"></i> [newsflash](https://github.com/hrbrmstr/newsflash) by Bob Rudis for tools to work with the Internet Archive and GDELT Television Explorer]

.mediumer[<i class="fa fa-tv"></i> Keyword search the closed captioning streams, daily data going back to 2009 for MSNBC, CNN and FOX]

.mediumer[<i class="fa fa-question-circle"></i> Measurement variable is % airtime (15 second blocks)]

---
class: normal, center
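---
class: normal

# <i class="fa fa-music"></i> Side note: percent airtime from 15-second blocks

A minimal sketch of how a percent-airtime figure could be derived, assuming it is the share of a day's 15-second blocks whose captions mention the keyword. That definition, the function `airtime_percent()` and its arguments are assumptions for illustration, not part of the newsflash or GDELT API.

```r
# Hypothetical helper: share of total airtime taken up by matching blocks
airtime_percent <- function(n_matching_blocks,
                            block_seconds = 15,
                            total_seconds = 24 * 60 * 60) {
  100 * n_matching_blocks * block_seconds / total_seconds
}

# 576 matching blocks of 15 seconds are 8,640 seconds, or 10% of a day
airtime_percent(576)
```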
---
class: normal

# Summary

.largerer[☹️ There are too many mass shootings]

.largerer[👉 Stoneman Douglas resembles previous mass shooting events in generating and sustaining public interest in gun control, as measured by Google search volume and TV broadcast news content.]

.largerer[👉 Most striking difference between Stoneman Douglas and previous mass shootings is the high number of page views to Wikipedia articles]

.largerer[<i class="fa fa-puzzle-piece"></i> If Sandy Hook didn't change anything, probably no single event will]

---
class: normal, center

# What to do?

<table>
<tr>
<td valign="top"><img src="images/nyt-what-to-do.png" style="width: 320px;"/></td>
<td valign="top"><img src="images/nyt-what-to-do-mass-shootings.png" style="width: 320px;"/></td>
</tr>
</table>

https://www.nytimes.com/interactive/2017/01/10/upshot/How-to-Prevent-Gun-Deaths-The-Views-of-Experts-and-the-Public.html, http://lawcenter.giffords.org/

---
class: normal

# Colophon

.larger[
Presentation written in [R Markdown using xaringan](https://github.com/yihui/xaringan)

Compiled into HTML5 using [RStudio](http://www.rstudio.com/ide/) & [knitr](http://yihui.name/knitr)

Source code hosting: https://github.com/benmarwick/Seattle-UseR-Group-April-2018

ORCID: http://orcid.org/0000-0001-7879-4531

Licensing:

* Presentation: [CC-BY-3.0](http://creativecommons.org/licenses/by/3.0/us/)

* Source code: [MIT](http://opensource.org/licenses/MIT)
]