Petr Keil
March 2017, iDiv
It would be wonderful if, after the course, you would:
DAY 1
DAY 2
DAY 3
Statistical models are stories about how the data came to be.
Parametric statistical modeling means describing a caricature of the “machine” that plausibly could have produced the numbers we observe.
Kéry 2010
x y
1 -1.6902124 -2.8312840
2 -1.5927444 -2.1346018
3 -1.3144798 -3.5481984
4 -1.2741388 -0.6909243
5 -1.1868903 -3.0635968
6 -0.8540381 -1.5809843
7 -0.7117748 -0.5379842
8 -0.6501826 2.0892109
9 -0.3334035 2.9319640
10 -0.2988843 0.6457664
11 0.1374639 2.8685802
12 0.3842709 3.7274582
13 0.5925691 3.1164421
14 0.6984226 6.9814234
15 0.9002922 6.6296795
16 1.0339445 3.8036975
17 1.0944699 5.4047010
18 1.4270767 6.1245379
19 1.9464882 8.0623618
20 2.2952422 8.1494960
\( y_i \sim Normal(\mu_i, \sigma) \)
\( \mu_i = a + b \times x_i \)
Can you separate the deterministic and the stochastic part?
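The model above can be run "forwards" to generate data. A minimal R sketch, where the parameter values a, b and sigma are illustrative assumptions (not the values behind the data shown earlier), with the deterministic and stochastic parts on separate lines:

```r
set.seed(1)
N     <- 20
a     <- 1      # intercept (assumed for illustration)
b     <- 3      # slope (assumed for illustration)
sigma <- 1.5    # residual standard deviation (assumed for illustration)

x  <- runif(N, min = -2, max = 2.5)

mu <- a + b * x                         # deterministic part
y  <- rnorm(N, mean = mu, sd = sigma)   # stochastic part

plot(x, y)
```

Everything up to `mu` is fully determined by `a`, `b` and `x`; only the `rnorm()` call injects randomness.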
\( x_i \sim Normal(\mu, \sigma) \)
Let's use \( y \) for data, and \( \theta \) for parameters.
\( p(\theta | y, model) \) or \( p(y | \theta, model) \)
The model is always given (assumed), and usually omitted:
\( p(y|\theta) \) … “likelihood-based” or “frequentist” statistics
\( p(\theta|y) \) … Bayesian statistics
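In R, \( p(y|\theta) \) for the normal linear model can be evaluated directly with `dnorm()`. A sketch with toy data and an arbitrary candidate \( \theta = (a, b, \sigma) \) (all values are illustrative assumptions):

```r
# toy data
x <- c(-1.7, -0.3, 0.6, 1.4, 2.3)
y <- c(-2.8,  2.9, 3.1, 6.1, 8.1)

# one candidate value of theta (assumed for illustration)
a <- 1; b <- 3; sigma <- 1.5

mu <- a + b * x   # deterministic part

# likelihood: product of the normal densities of each observation
likelihood <- prod(dnorm(y, mean = mu, sd = sigma))

# the log-likelihood is numerically safer for larger datasets
log.likelihood <- sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
```

Frequentist methods maximize this quantity over \( \theta \); Bayesian methods combine it with a prior to get \( p(\theta|y) \).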
x <- c(2.3, 4.7, 2.1, 1.8, 0.2)
x
[1] 2.3 4.7 2.1 1.8 0.2
x[3]
[1] 2.1
X <- matrix(c(2.3, 4.7, 2.1, 1.8),
nrow=2, ncol=2)
X
[,1] [,2]
[1,] 2.3 2.1
[2,] 4.7 1.8
X[2,1]
[1] 4.7
x <- c(2.3, 4.7, 2.1, 1.8, 0.2)
N <- 5
data <- list(x=x, N=N)
data
$x
[1] 2.3 4.7 2.1 1.8 0.2
$N
[1] 5
data$x # indexing by name
[1] 2.3 4.7 2.1 1.8 0.2
for (i in 1:5)
{
statement <- paste("Iteration", i)
print(statement)
}
[1] "Iteration 1"
[1] "Iteration 2"
[1] "Iteration 3"
[1] "Iteration 4"
[1] "Iteration 5"