Medical Research Council, Clinical Sciences Centre, London
thomas.carroll@csc.mrc.ac.uk
Nuno
Instituto de Medicina Molecular, Lisboa
nmorais@medicina.ulisboa.pt
Admin
About the Course
We will tell you about the ‘best practice’ tools that we use in our daily work as bioinformaticians
You will (probably) not come away being an expert
We cannot teach you everything about NGS data
plus, it is a fast-moving field
RNA and ChIP only
much of the initial processing is the same for other assays
However, we hope that you will
Understand how your data are processed
Be able to explore your data - no programming required
Increase confidence with R and Bioconductor
Be able to explore new technologies, methods, tools as they come out
Further disclaimer
“To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.” R.A. Fisher, 1938
If you haven’t designed your experiment properly, then all the Bioinformatics we teach you won’t help. Consult with your local statistician - preferably not the day before your grant is due!
Experimental Design; despite this fancy new technology, if we don’t design the experiments properly we won’t get meaningful conclusions
Quality assessment; Yes, NGS experiments can still go wrong!
Normalisation; NGS data come with their own set of biases and errors that need to be accounted for
Stats; testing for RNA-seq is built upon the knowledge from microarrays
Plenty of tools and workflows have been established.
Don’t forget about arrays; the data are all out there somewhere waiting to be discovered and explored
Reproducibility is key
Two Biostatisticians (later termed ‘Forensic Bioinformaticians’) from M.D. Anderson used R extensively during their re-analysis and investigation of a Clinical Prognostication paper from Duke. The subsequent scandal put Reproducible Research on the map.
Keith Baggerly’s talk from Cambridge in 2010 is highly recommended.
Advantages of R
The R programming language is now recognised beyond the academic community as an effective solution for data analysis and visualisation. Notable users of R include Facebook, Google, Microsoft (who recently invested in a commercial provider of R), and the New York Times.
Key features
Open-source
Cross-platform
Access to existing visualisation / statistical tools
Flexibility
Visualisation and interactivity
Add-ons for many fields of research
Facilitating Reproducible Research
Crash-course in R
Support for R
Online forums such as Stack Overflow regularly feature R