Introduction and Principles

Mark Dunning

Last modified: 05 Dec 2016

Why is this course important?

Copyright by Sidney Harris

Reproducible Research

Five selfish reasons

A famous example

But tools like R have been around for years; why is this still an issue?

Key concepts

The key areas that we will address in this course are:

  1. Making sure the analysis can be automated
  2. Knowing exactly what files were used as input and output
  3. Making your data available to others
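As one way of "knowing exactly what files were used", an analysis script can record a checksum for each input and output file. The following is a minimal sketch in Python using only the standard library; the function names `sha256_of` and `write_manifest` and the manifest layout are illustrative assumptions, not part of the course material.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(paths, manifest_path):
    """Write one 'checksum  filename' line per file and return the lines.

    Hypothetical helper: the manifest format is an assumption made for
    this example.
    """
    lines = [f"{sha256_of(p)}  {Path(p).name}" for p in sorted(map(str, paths))]
    Path(manifest_path).write_text("\n".join(lines) + "\n")
    return lines
```

Keeping the manifest next to the results lets you check later whether an input file has changed since the analysis was run.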

Barriers to learning R, Python etc

Are spreadsheet programs like Excel evil?

Bottom line

Meta-data

What about other people’s metadata?

When do we have to worry about such things?

Plan for the workshop

Some general principles and themes

Principle 1

  1. Make sure your files are amenable for analysis

We will discuss this point in more detail in the next section.

Principle 2

  1. Never work directly on the raw data

http://www.inquisitr.com/309687/jesus-painting-restoration-goes-wrong-well-intentioned-old-lady-destroys-100-year-old-fresco/
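One possible way to follow this principle is to copy each raw file into a separate working directory and then remove write permission from the original. The sketch below assumes Python's standard library; the function name `make_working_copy` and the directory arguments are hypothetical.

```python
import shutil
import stat
from pathlib import Path


def make_working_copy(raw_file, work_dir):
    """Copy raw_file into work_dir (if not already there) and mark the
    original read-only. Returns the path of the working copy."""
    raw_file = Path(raw_file)
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    working_copy = work_dir / raw_file.name
    # Do not overwrite an existing working copy: it may contain edits.
    if not working_copy.exists():
        # copyfile copies contents only, so the copy stays writable.
        shutil.copyfile(raw_file, working_copy)
    # Remove write permission from the raw file for user, group and others.
    raw_file.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return working_copy
```

All further cleaning and analysis would then be done on the returned working copy, so that a mistake can always be undone by going back to the untouched raw file.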

Principle 3

  1. Ensure you have a secure backup strategy

Should be self-explanatory!

Cambridge News; Jan 30th 2016

Principle 4

  1. Embrace version control

Principle 5

  1. Organise your files / directories
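A consistent directory layout can be created by a short script so that every project starts the same way. The sketch below is only an illustration: the folder names in `SUBDIRS` and the function name `create_project` are assumptions, and you may prefer a different structure.

```python
from pathlib import Path

# Hypothetical layout; rename the folders to suit your own project.
SUBDIRS = ["data/raw", "data/processed", "scripts", "results", "docs"]


def create_project(root):
    """Create the standard sub-directories under root and return their paths."""
    root = Path(root)
    created = []
    for sub in SUBDIRS:
        path = root / sub
        # exist_ok=True makes the script safe to re-run on an existing project.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Separating `data/raw` from `data/processed` also supports the principle of never working directly on the raw data.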

Principle 6

  1. Make yourself visible

whyopenresearch.org/

Erin and John McKiernan

References