Setup

Copy the code below to install and load the packages needed for the exercise.

install.packages("ggplot2")
install.packages("tidyverse")
install.packages("devtools")

library(devtools)
library(tidyverse)
library(ggplot2)

install_github("azizka/speciesgeocodeR") # latest version of speciesgeocodeR from GitHub
library(speciesgeocodeR)
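
If some of these packages are already installed, reinstalling them is unnecessary. The following is a minimal sketch of how to check which packages are still missing before installing anything. The helper name missing_packages is hypothetical and not part of any package.

```r
# Return the subset of `pkgs` that cannot be loaded on this machine.
# `missing_packages` is a hypothetical helper written for this exercise.
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE if the package is installed and loadable
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Check the CRAN packages used in this exercise
to_install <- missing_packages(c("ggplot2", "tidyverse", "devtools"))
print(to_install)
```

Any names printed here can then be passed to install.packages(); an empty result means all three packages are already available.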

Exercise

This exercise aims to make you aware of common issues with data from public databases and gives you basic tools to address them. Potentially useful R functions for each question are listed in parentheses. Get help on any function with ?FUNCTIONNAME.

  1. Copy the code chunk above to install and load all packages needed for the exercise. Load the lion example data (“lions_gbif.csv”) and select potentially interesting columns (read_delim, select).
  2. Visualize the occurrence records on a world map (borders, ggplot, geom_point).
  3. Remove records without coordinates, as well as records based on fossils or of unknown source, then take a look at the coordinate uncertainty (filter, group_by, summarize, arrange, geom_histogram).
  4. Run an automated coordinate cleaning using speciesgeocodeR (CleanCoordinates).
  5. Visualize the results of the cleaning and exclude flagged records (plot, filter).
  6. Visualize the cleaned data again. Do the data look better now? If not, what could be the problem? (filter, group_by, summarize, geom_histogram).
  7. Take a look at the CoordinateCleaner graphical user interface (https://azizka.shinyapps.io/CoordinateCleaner/) and the speciesgeocodeR wiki (https://github.com/azizka/speciesgeocodeR/wiki) for further documentation.
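
As a starting point for steps 2 and 3, the sketch below runs the filtering, summarizing and plotting functions on a small made-up data frame rather than on the real download. The column names (decimallongitude, decimallatitude, basisofrecord, coordinateuncertaintyinmeters) and the example values are assumptions modelled on typical GBIF exports; check the actual column names in “lions_gbif.csv” before adapting it. The automated cleaning in step 4 is deliberately left out.

```r
library(dplyr)
library(ggplot2)

# Made-up stand-in for the GBIF download (values are illustrative only)
lions <- data.frame(
  species = "Panthera leo",
  decimallongitude = c(34.8, 21.5, NA, 0, 18.4, 31.0),
  decimallatitude = c(-2.3, -19.0, -1.5, 0, 59.3, -24.0),
  basisofrecord = c("HUMAN_OBSERVATION", "PRESERVED_SPECIMEN",
                    "HUMAN_OBSERVATION", "PRESERVED_SPECIMEN",
                    "FOSSIL_SPECIMEN", "UNKNOWN"),
  coordinateuncertaintyinmeters = c(100, 5000, NA, 250000, 30, 1000)
)

# Step 2: records on a world map. borders() needs the 'maps' package,
# so the map is only built if that package is installed.
if (requireNamespace("maps", quietly = TRUE)) {
  world_map <- ggplot(lions, aes(x = decimallongitude, y = decimallatitude)) +
    borders("world", colour = "grey60", fill = "grey90") +
    geom_point(colour = "red") +
    coord_fixed()
  # print(world_map) draws the map; records with missing coordinates are dropped
}

# Step 3: drop records without coordinates, fossils and unknown sources
lions_clean <- lions %>%
  filter(!is.na(decimallongitude), !is.na(decimallatitude)) %>%
  filter(!basisofrecord %in% c("FOSSIL_SPECIMEN", "UNKNOWN"))

# Number of records and median coordinate uncertainty per record type
record_counts <- lions_clean %>%
  group_by(basisofrecord) %>%
  summarize(
    n_records = n(),
    median_uncertainty = median(coordinateuncertaintyinmeters, na.rm = TRUE)
  ) %>%
  arrange(desc(n_records))

# Distribution of the coordinate uncertainty (in meters)
uncertainty_hist <- ggplot(lions_clean, aes(x = coordinateuncertaintyinmeters)) +
  geom_histogram(bins = 10)
# print(uncertainty_hist) draws the histogram
```

Note that the made-up record at longitude 0, latitude 0 passes these simple filters even though such coordinates are usually data-entry errors. Catching problems of this kind is the purpose of the automated cleaning in step 4.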