This is a lesson on tidying data. Specifically, what to do when a conceptual variable is spread out over 2 or more variables in a data frame.
Data used: words spoken by characters of different races and genders in the Lord of the Rings movie trilogy.
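To make the problem concrete, here is a minimal sketch of one conceptual variable (word count) spread over two columns, and of gathering it back into a single column with `tidyr::gather()`. The data frame, its column names (`Race`, `Female`, `Male`), and the counts are made up for illustration and are not the lesson's actual data.

```r
library(tidyr)

# Toy data with made-up counts (NOT the real Lord of the Rings figures).
# The conceptual variable "words spoken" is split across Female and Male.
untidy <- data.frame(
  Race   = c("Elf", "Hobbit"),
  Female = c(10, 20),
  Male   = c(30, 40),
  stringsAsFactors = FALSE
)

# Gather the Female and Male columns into key-value pairs:
# the old column names go into Gender, the cell values into Words.
tidy <- gather(untidy, key = "Gender", value = "Words", Female, Male)

print(tidy)
```

After gathering, each row holds one observation (one race and gender combination) and word counts live in a single `Words` column, which is the shape that plotting and summarising functions generally expect.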
The lesson shows how to tidy data using `gather()` from the `tidyr` package. Includes references, resources, and exercises.

Learner-facing dependencies:
- `tidy-data` sub-directory of the Data Carpentry `data` directory
- `tidyr` package (only true dependency)
- `ggplot2` is used for illustration but is not mission critical
- `dplyr` and `reshape2` are used in the bonus content

Instructor dependencies:
- `curl`, if you execute the code to grab the Lord of the Rings data used in examples from GitHub. Note that the files are also included in the `datacarpentry/data/tidy-data` directory, so data download is avoidable.
- `rmarkdown`, `knitr`, and `xtable`, if you want to compile the `Rmd` to `md` and `html`
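
The dependencies listed above can be checked before a workshop with a short script. This is only a sketch: the package names come from the lists above, while the variable names (`learner_pkgs`, `instructor_pkgs`) are illustrative. It reports what is missing and installs nothing.

```r
# Package names taken from the dependency lists above.
learner_pkgs    <- c("tidyr", "ggplot2", "dplyr", "reshape2")
instructor_pkgs <- c("curl", "rmarkdown", "knitr", "xtable")
pkgs <- c(learner_pkgs, instructor_pkgs)

# requireNamespace() returns TRUE if the package can be loaded,
# without attaching it to the search path.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  message("Install them with install.packages().")
} else {
  message("All listed packages are available.")
}
```

Since only `tidyr` is a true dependency for learners, a missing `ggplot2`, `dplyr`, or `reshape2` affects only the illustrations and bonus content.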