# Quantitative Big Imaging

ETHZ: 227-0966-00L


# Course Outline

• 25th February - Introduction and Workflows
• 3rd March - Image Enhancement (A. Kaestner)
• 10th March - Basic Segmentation, Discrete Binary Structures
• 17th March - Advanced Segmentation
• 24th March - Analyzing Single Objects
• 7th April - Analyzing Complex Objects
• 14th April - Many Objects and Distributions
• 21st April - Statistics and Reproducibility
• 28th April - Dynamic Experiments
• 12th May - Scaling Up / Big Data
• 19th May - Guest Lecture - High Content Screening
• 26th May - Guest Lecture - Machine Learning / Deep Learning and More Advanced Approaches
• 2nd June - Project Presentations

# Previously on QBI …

• Image Enhancement
• Highlighting the contrast of interest in images
• Minimizing Noise
• Understanding image histograms
• Automatic Methods
• Component Labeling
• Single Shape Analysis
• Complicated Shapes
• Distribution Analysis

# Quantitative “Big” Imaging

The course has covered plenty of imaging, and we have introduced a few quantitative metrics, but “big” has not really entered the picture yet.

What does big mean?

• Not just (or even) large
• It means being ready for big data
• Volume, velocity, variety (the 3 V’s)
• Scalable, fast, easy to customize

So what is “big” imaging?

#### doing analyses in a disciplined manner

• fixed steps
• easy to regenerate results
• no magic

#### having everything automated

• 100 samples is as easy as 1 sample

#### being able to adapt and reuse analyses

• one well-tested script whose parameters can be modified
• different types of cells
• different regions

# Objectives

1. Scientific studies all try to get to a single number
• Make sure this number describes the structure well (what we have covered before)
• Make sure the number is meaningful (today!)
2. How do we compare the number from different samples and groups?
• Within a sample or the same type of samples
• Between samples
3. How do we compare different processing steps like filter choice, minimum volume, resolution, etc.?
4. How do we evaluate our parameter selection?
5. How can we ensure our techniques do what they are supposed to do?
6. How can we visualize so much data? Are there rules?

# Outline

• Motivation (Why and How?)
• Scientific Goals
• Reproducibility
• Statistical metrics and results
• Parameterization
• Parameter sweep
• Sensitivity analysis
• Unit Testing
• Validation
• Visualization

Going back to our original cell image

1. We have been able to get rid of the noise in the image and find all the cells (lecture 2-4)
2. We have analyzed the shape of the cells using the shape tensor (lecture 5)
3. We even separated cells joined together using Watershed (lecture 6)
4. We have created even more metrics characterizing the distribution (lecture 7)

We have at least a few samples (or different regions), a large number of metrics, and an almost as large number of parameters to tune.

# Correlation and Causation

One of the most repeated criticisms of scientific work is that correlation and causation are confused.

1. Correlation
• means a statistical relationship
• very easy to show (single calculation)
2. Causation
• implies there is a mechanism between A and B
• very difficult to show (impossible to prove)
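To illustrate how easy correlation is to show, here is a minimal sketch in Python (NumPy) using made-up data inspired by the cake example below; the variable names and the built-in trend are purely illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical data: cooking time with a built-in linear effect on bubble size
cooking_time = rng.uniform(20, 60, size=100)          # minutes
bubble_size = 0.5 * cooking_time + rng.normal(0, 5, size=100)

# Pearson correlation: a single calculation
r = np.corrcoef(cooking_time, bubble_size)[0, 1]
print(f"correlation: {r:.2f}")
```

The one-line calculation shows a statistical relationship; nothing in it says anything about the mechanism connecting the two variables.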

# Controlled and Observational

There are two broad classes of data and scientific studies.

### Observational

• Exploring large datasets looking for trends
• Population is random
• Not always hypothesis driven

We examined 100 people and the ones with blue eyes were on average 10cm taller

In 100 cake samples, we found a 0.9 correlation between cooking time and bubble size

### Controlled

• Most scientific studies fall into this category
• Specifics of the groups are controlled

We examined 50 mice with gene XYZ off and 50 with gene XYZ on, and saw the foot size increase by 10%

We increased the temperature and the number of pores in the metal increased by 10%

# Simple Model: Magic / Weighted Coin


Since most experiments in science are specific, noisy, and often very complicated, they do not usually make good teaching examples, so we use a simple model instead:

• Magic / Biased Coin
• You buy a magic coin at a shop
• How many times do you need to flip it to prove it is not fair?
• If I flip it 10 times and another person flips it 10 times, is that the same as 20 flips?
• If I flip it 10 times and then multiply the results by 10, is that the same as 100 flips?
• If I buy 10 coins and want to know which ones are fair what do I do?

# Simple Model: Magic / Weighted Coin

1. Each coin represents a stochastic variable $$\mathcal{X}$$ and each flip represents an observation $$\mathcal{X}_i$$.
2. The act of performing a coin flip $$\mathcal{F}$$ is an observation $$\mathcal{X}_i = \mathcal{F}(\mathcal{X})$$

We normally assume

1. A fair coin has an expected value of $$E(\mathcal{X})=0.5 \rightarrow$$ 50% Heads, 50% Tails
2. An unbiased flipper means
• each flip is independent of the others: $$P(\mathcal{F}_1(\mathcal{X}) \cap \mathcal{F}_2(\mathcal{X})) = P(\mathcal{F}_1(\mathcal{X}))\,P(\mathcal{F}_2(\mathcal{X}))$$
• the expected value of each flip is the same as that of the coin: $$E(\mathcal{F}_i(\mathcal{X})) = E(\mathcal{X})$$
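The model above can be sketched in a few lines of Python: each flip is an independent observation $\mathcal{X}_i$, and the sample mean estimates $E(\mathcal{X})$. The true probability `p_true` is of course unknown in the magic-shop scenario; it is set explicitly here only so we can check the estimate.

```python
import numpy as np

rng = np.random.default_rng(42)
p_true = 0.5  # a fair coin; try e.g. 0.6 for a biased one

# each flip is an independent observation X_i (True = heads)
flips = rng.random(10_000) < p_true

# the sample mean of the observations estimates E(X)
print(f"estimated E(X) = {flips.mean():.3f}")
```

With only 10 flips the estimate scatters widely around 0.5; with 10,000 it settles close to the true value, which previews the sample-size questions raised above.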

# Simple Model to Reality

### Coin Flip

1. Each flip gives us a small piece of information about the flipper and the coin
• Random / Stochastic variations in coin and flipper cancel out
• Systematic variations accumulate

### Real Experiment

1. Each measurement tells us about our sample, our instrument, and our analysis
• Random / Stochastic variations in sample, instrument, and analysis cancel out
• Normally the analysis has very little to no stochastic variation
• Systematic variations accumulate

# Reproducibility

A very broad topic with plenty of sub-areas and deeper meanings. We mean two things by reproducibility

### Analysis

The process of going from images to numbers is detailed in a clear manner that anyone, anywhere could follow and get the exact (within some tolerance) same numbers from your samples

• No platform dependence
• No proprietary or “in house” algorithms
• No manual clicking, tweaking, or copying
• One script to go from image to result

### Measurement

Everything for analysis, plus: taking a measurement several times (noise and exact alignment vary each time) does not change the statistics significantly

• No sensitivity to mounting or rotation
• No sensitivity to noise
• No dependence on exact illumination

# Reproducible Analysis

The basis for reproducible analysis is scripts and macros, since we will need to perform the same analysis many times to understand how reproducible it is.

```bash
IMAGEFILE=$1
THRESHOLD=130
matlab -r "inImage=$IMAGEFILE; threshImage=inImage>$THRESHOLD; analysisScript;"
```

• or `java -jar ij.jar -macro TestMacro.ijm blobs.tif`
• or `Rscript -e "library(plyr);..."`

# Comparing Groups: Intraclass Correlation Coefficient

The intraclass correlation coefficient basically looks at how similar objects within a group are compared to objects between groups

# Intraclass Correlation Coefficient Definition

$$ICC = \frac{S_A^2}{S_A^2+S_W^2}$$

where

• $$S_A^2$$ is the variance among groups or classes
  • Estimate with the standard deviation of the mean values for each group
• $$S_W^2$$ is the variance within groups or classes
  • Estimate with the average of the standard deviations for each group

• 1 means 100% of the variance is between classes
• 0 means 0% of the variance is between classes
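The definition above translates directly into a short Python function; this is a simplified estimate following the slide's recipe (variance of the group means over that plus the mean within-group variance), with made-up group data for illustration.

```python
import numpy as np

def icc(groups):
    """Simplified ICC: S_A^2 = variance of the group means,
    S_W^2 = mean of the within-group variances."""
    means = [np.mean(g) for g in groups]
    s_a2 = np.var(means, ddof=1)
    s_w2 = np.mean([np.var(g, ddof=1) for g in groups])
    return s_a2 / (s_a2 + s_w2)

rng = np.random.default_rng(0)
# two well-separated groups -> ICC near 1 (variance is between classes)
far = [rng.normal(0, 1, 50), rng.normal(10, 1, 50)]
# two heavily overlapping groups -> ICC near 0
near = [rng.normal(0, 1, 50), rng.normal(0.1, 1, 50)]
print(f"separated: {icc(far):.2f}, overlapping: {icc(near):.2f}")
```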

# Intraclass Correlation Coefficient: Values for Coin-Flips

We have one biased coin and try to figure out how many flips we need for the ICC to tell the difference from a normal coin

With many thousands of flips we eventually see a very strong difference, but unless the coin is very strongly biased, ICC is a poor indicator of the differences

# Comparing Groups: Tests

Once the reproducibility has been measured, it is possible to compare groups. The idea is to make a test to assess the likelihood that two groups are the same given the data

1. List assumptions
2. Establish a null hypothesis
• Usually both groups are the same
3. Calculate the probability of the observations given the truth of the null hypothesis
• Requires knowledge of the probability distribution of the data
• Modeling can be exceptionally complicated

We have 1 coin from a magic shop

• our assumptions are
  • we flip and observe flips of coins accurately and independently
  • the coin is invariant and always has the same expected value
• our null hypothesis is the coin is unbiased $$E(\mathcal{X})=0.5$$
• we can calculate the likelihood of a given observation given the number of flips (p-value)

| Number of Flips | Probability of All Heads Given Null Hypothesis (p-value) |
|---|---|
| 1 | 50 % |
| 5 | 3.1 % |
| 10 | 0.1 % |
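Under the null hypothesis of a fair coin, each flip is heads with probability 0.5 and flips are independent, so the p-value for all-heads is simply $0.5^n$; a two-line check reproduces the table:

```python
# p-value for observing n heads in n flips of a fair coin: 0.5 ** n
for n in (1, 5, 10):
    p = 0.5 ** n
    print(f"{n:2d} flips: {100 * p:.1f} %")  # 50 %, 3.1 %, 0.1 %
```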

How good is good enough?

# Comparing Groups: Student’s T Distribution

Since we do not usually know our distribution very well, or have enough samples to create a sufficient probability model, we turn to the Student’s t-distribution.

### Student T Distribution

We assume the distribution of our stochastic variable is normal (Gaussian) and the t-distribution provides an estimate for the mean of the underlying distribution based on few observations.

• We estimate the likelihood of our observed values assuming they are coming from random observations of a normal process

### Student T-Test

Incorporates this distribution and provides an easy method for assessing the likelihood that two given sets of observations come from the same underlying process (null hypothesis)

• Assume unbiased observations
• Assume normal distribution
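In practice the t-test is a one-liner; here is a sketch with SciPy on hypothetical cell-size measurements for two groups (the group sizes, means, and spreads are invented for illustration).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# hypothetical measurements: two groups of cells with different mean size
group_a = rng.normal(10.0, 2.0, size=50)
group_b = rng.normal(12.0, 2.0, size=50)

# two-sample t-test: likelihood both sets come from the same normal process
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Note the assumptions from the bullets above are baked in: the test only makes sense for unbiased observations from approximately normal distributions.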

# Multiple Testing Bias

Back to the magic coin, let’s assume we are trying to publish a paper, we heard a p-value of < 0.05 (5%) was good enough. That means if we get 5 heads we are good!

| Number of Flips | Probability of All Heads Given Null Hypothesis (p-value) |
|---|---|
| 1 | 50 % |
| 4 | 6.2 % |
| 5 | 3.1 % |

| Number of Friends Flipping | Probability Someone Flips 5 Heads |
|---|---|
| 1 | 3.1 % |
| 10 | 27.2 % |
| 20 | 47 % |
| 40 | 71.9 % |
| 80 | 92.1 % |
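The probability that at least one friend flips 5 heads follows from independence: it is one minus the probability that nobody does, $1 - (1 - p)^n$ with $p = 0.5^5$. A quick check reproduces the numbers:

```python
# chance that at least one of n independent friends flips 5 heads in a row
p_single = 0.5 ** 5  # 3.125 % per friend
for n in (1, 10, 20, 40, 80):
    p_any = 1 - (1 - p_single) ** n
    print(f"{n:2d} friends: {100 * p_any:.1f} %")
```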

Clearly this is not the case, otherwise we could keep flipping coins or ask all of our friends to flip until we got 5 heads and publish

The p-value is only meaningful when the experiment matches what we did.

• We didn’t say the chance of getting 5 heads ever was < 5%
• We said if we have exactly 5 observations and all of them are heads, the likelihood that a fair coin produced that result is < 5%

Many methods exist to correct for this; most just involve scaling $$p$$ by the number of tests. The likelihood of a sequence of 5 heads in a row somewhere within 10 flips is roughly 5x higher.
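A Bonferroni-style correction is a minimal sketch of this scaling: multiply the raw p-value by the number of tests performed, capped at 1. The choice of 10 tests here is an illustrative assumption, not the exact window count for runs within a flip sequence.

```python
# Bonferroni-style correction: scale the raw p-value by the number of tests
p_raw = 0.5 ** 5          # 5 heads in 5 flips: ~3.1 %
n_tests = 10              # assumed number of independent chances to "succeed"
p_adjusted = min(1.0, n_tests * p_raw)
print(f"raw p = {p_raw:.4f}, adjusted p = {p_adjusted:.4f}")
```

The adjusted value is what must fall below the significance threshold, which is why 5 heads somewhere in a long session is not a publishable result.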

# Multiple Testing Bias: Experiments

This is very bad news for us. We have the ability to quantify all sorts of interesting metrics:

• cell distance to other cells
• cell oblateness
• cell distribution oblateness

So let’s throw them all into a magical statistics algorithm and push the publish button

With our p-value of less than 0.05 and a study with 10 samples in each group, how does increasing the number of variables affect our result?
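A small simulation makes the danger concrete: with 10 samples per group and many metrics that have no real group difference at all, roughly 5% of them will still come out "significant" at p < 0.05 purely by chance. The counts and distributions here are invented to mirror the stated study size.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_samples, n_vars = 10, 100

# no real effect: both groups drawn from the same distribution for every metric
group_a = rng.normal(0, 1, size=(n_vars, n_samples))
group_b = rng.normal(0, 1, size=(n_vars, n_samples))

# one t-test per metric (row); expect ~5 % false positives at alpha = 0.05
p_values = stats.ttest_ind(group_a, group_b, axis=1).pvalue
n_sig = int((p_values < 0.05).sum())
print(f"{n_sig} of {n_vars} metrics look 'significant' despite no real effect")
```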