Introducing the IGV Browser

Arguably the most important tool you will learn about in this course is IGV (the Integrative Genomics Viewer). Whilst tools like R are very powerful and allow you to perform statistical analyses and test hypotheses, there is no substitute for looking at the data. A trained eye can quickly get a sense of the data quality before any computational analyses have been run. Furthermore, as the person requesting the sequencing, you probably know a lot about the biological context of the samples and what to expect.

Many of the exercises in the course will use IGV, so you will have plenty of time to practice.

Introduction

A quick tour of IGV

Full set of slides from MRC Clinical Sciences Centre

  1. Sample information panel
    • Information about samples you have loaded
    • e.g. Sample ID, Gender, Age, Tumour / Normal
  2. Genome Navigation panel
    • Jump to a genomic region in Chr:Start-End format
    • Jump to a gene symbol of interest
  3. Data panel
    • Your sequencing reads will be displayed here
    • Or whatever data you have loaded
  4. Attribute panel
    • Gene locations
    • Genome sequence (if zoomed-in at appropriate level)
    • Proteins
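
To make the Chr:Start-End format used by the navigation panel concrete, here is a minimal Python sketch that splits such a string into its parts. The helper name parse_locus is hypothetical and is not part of IGV. The sketch assumes coordinates may contain thousands separators and that the start should not exceed the end.

```python
import re

# Hypothetical helper (not part of IGV): split "Chr:Start-End" into its parts.
# Commas are tolerated as thousands separators, e.g. "1:10,003-10,070".
LOCUS_PATTERN = re.compile(
    r"^(?P<chrom>[^:\s]+):(?P<start>\d[\d,]*)-(?P<end>\d[\d,]*)$"
)


def parse_locus(locus):
    """Return (chromosome, start, end) for a Chr:Start-End string."""
    match = LOCUS_PATTERN.match(locus.strip())
    if match is None:
        raise ValueError(f"Not in Chr:Start-End format: {locus!r}")
    start = int(match.group("start").replace(",", ""))
    end = int(match.group("end").replace(",", ""))
    if start > end:
        raise ValueError(f"Start is greater than end in {locus!r}")
    return match.group("chrom"), start, end


print(parse_locus("1:10,003-10,070"))
```

A string such as "1:10003-10070" therefore names chromosome 1, from position 10003 to position 10070.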

Example

Go to File -> Load from file and select /home/participant/Course_Materials/paired.bam. Note that the index file paired.bam.bai needs to be present in the same directory; however, you only need to select the .bam file itself.
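
If you want to confirm that the index is in place before opening IGV, a check along the following lines could be used. This is only a sketch: the function name find_bam_index is hypothetical, and it assumes the index is named either paired.bam.bai (as above) or paired.bai.

```python
from pathlib import Path


def find_bam_index(bam_path):
    """Return the path of the index next to a BAM file, or None if absent.

    Hypothetical helper: it only checks two common index file names.
    """
    bam = Path(bam_path)
    candidates = [
        bam.with_name(bam.name + ".bai"),  # e.g. paired.bam.bai
        bam.with_suffix(".bai"),           # e.g. paired.bai
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


bam_file = "/home/participant/Course_Materials/paired.bam"
index = find_bam_index(bam_file)
if index is None:
    # samtools can create the index: samtools index paired.bam
    print(f"No index found next to {bam_file}")
else:
    print(f"Found index: {index}")
```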

SRR081708.237649    163 1   10003   6   1S67M   =   10041   105 GACCCTGACCCTAACCCTGACCCTGACCCTAACCCTGACCCTGACCCTAACCCTGACCCTAACCCTAA    S=<====<<>=><?=?=?>==@??;?>@@@=??@@????@??@?>?@@<@>@'@=?=??=<=>?>?=Q    ZA:Z:<&;0;0;;308;68M;68><@;0;0;;27;;>MD:Z:5A11A5A11A5A11A13 RG:Z:SRR081708  NM:i:6  OQ:Z:GEGFFFEGGGDGDGGGDGA?DCDD:GGGDGDCFGFDDFFFCCCBEBFDABDD-D:EEEE=D=DDDDC:
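
The record above is a single alignment in SAM format. As a rough illustration of how to read it, the Python sketch below decodes the flag (163) and the CIGAR string (1S67M) from its first nine fields. The function names and the flag descriptions are our own wording, but the bit values follow the SAM specification.

```python
import re

# Bit values from the SAM specification; the descriptions are our own wording.
SAM_FLAGS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read on reverse strand"),
    (0x20, "mate on reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "secondary alignment"),
    (0x200, "fails quality checks"),
    (0x400, "PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def decode_flag(flag):
    """List the properties that are set in a SAM flag."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


# The first nine fields of the record shown above.
fields = "SRR081708.237649 163 1 10003 6 1S67M = 10041 105".split()
qname, flag, rname, pos, mapq, cigar, rnext, pnext, tlen = fields

print(decode_flag(int(flag)))
print(parse_cigar(cigar))
```

Here 163 = 1 + 2 + 32 + 128, so the read is paired, mapped in a proper pair, has its mate on the reverse strand, and is the second read of the pair. The CIGAR 1S67M means one soft-clipped base followed by 67 aligned bases.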

The view in IGV is not static: we can scroll along the genome by holding down the left mouse button in the data panel and dragging left or right.

Things to practice

Viewing preferences

IGV allows us to configure many aspects of the data display.

Menu:- View -> Preferences -> Alignments

It’s worth noting that, in order to conserve memory, the display settings may be showing fewer reads than you actually have (downsampling). Also, reads flagged as QC failures or PCR duplicates may be filtered out.
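
The QC-fail and duplicate filters rely on bits recorded in each read's SAM flag. The following Python sketch shows the idea only; it is not IGV's implementation, and the function name and its parameters are hypothetical. The bit values 0x200 (QC fail) and 0x400 (duplicate) come from the SAM specification.

```python
QC_FAIL = 0x200    # read fails platform/vendor quality checks
DUPLICATE = 0x400  # read is a PCR or optical duplicate


def is_hidden_by_filters(flag, filter_qc_fail=True, filter_duplicates=True):
    """Hypothetical check: would a read with this flag be filtered from view?"""
    if filter_qc_fail and flag & QC_FAIL:
        return True
    if filter_duplicates and flag & DUPLICATE:
        return True
    return False


# 163 is an ordinary properly-paired read; adding 1024 marks it as a
# duplicate, and adding 512 marks it as a QC failure.
for flag in (163, 163 + 1024, 163 + 512):
    print(flag, is_hidden_by_filters(flag))
```

Switching either filter off, for example is_hidden_by_filters(1187, filter_duplicates=False), mimics unticking the corresponding box in the preferences.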

We also have some options for how to display the reads themselves, which we can access by right-clicking on the bam track.

Sorting alignments by:-

The reads themselves can also be coloured according to

We will revisit these options later when we come to examine particular variant calls.