January 24, 2017

R in Biodiversity Analysis: rOpenSci for all
Stockholm, Sweden, 24th-25th Jan 2016

Jan 24th 09:10 - 09:40 

    "R for scientific computation and biodiversity analysis"

Keyboard shortcuts for presentation viewing mode:

'f' enable fullscreen mode 
'w' toggle widescreen mode
'o' enable overview mode
'h' enable code highlight mode
'p' show presenter notes

Collaborating using R - "Units"

  • Scripts - describe steps in your workflow, can re-run and reproducing results, by calling Functions - which encapsulates re-usable ways to process datasets

  • R Packages - encapsulates a larger unit of work, including:
    • Datasets
    • Documentation of datasets
    • Functions
    • Documentation of functions
    • Tutorials showing workflows involving data using the functions
    • Web UIs
    • License

Collaboration using R - "Processes"

  • Use git and GitHub or equivalent public online respository for publishing source code
  • Publish data to the Internet - enable others to use your datasets
  • Write books together - collaborate and use markdown, see bookdown

Biodiversity Analysis

  • More than just computations or showing visuals - it is about the whole narrative and explaining the results
  • Involves several reproducible steps:
    • processing data from different sources
    • modelling / computations
    • presenting results, while enabling peer-review to build trust in results
    • tutorials/documentation - explaining to peers how to reproduce results, enabling peer-review
  • ROpenSci offers many excellent packages for biodiversity analysis (but also for many other disciplines)

Scientific Computations

Functions may involve scientific calculations - models etc…

  • FP vs OOP - Functional Programming versus Object-Oriented Programming - focus on VERBs not NOUNs, less of side-effects and state contrasting with Object Oriented paradigm where focus is on NOUNs not VERBs and the data and algorithms often carries a lot of state and side-effects of one method to other objects in the system are often plenty and difficult to control and understand
  • Designing R functions for calculations - well situated for scaling and doing computations in parallel at scale
  • Look at OpenCPU

Presenting results from computations

  • Visualization
  • Communicate results clearly
  • Create a Shiny web app as part of your R package

Use Data Science Open Source Tools

  • Docker - for running anything anywhere
  • ROpenSci-packages - for Biodiversity Analysis tools
  • OpenCPU - for exposing data as a service
  • Shiny - for providing a web based interactive user interface
  • Various R packages - create your own or use existing:
    • rgbif
    • bookdown
    • figshare
    • dryad

Modus Operandum for Software Carpentry

Learning resources for R work