The aim of this lesson is to (1) leave the participants to come up with their code for simple one-way ANOVA, and (2) to experiment with random effects ANOVA.
We will use modified data from the example from Marc Kery’s Introduction to WinBUGS for Ecologists, page 119 (Chapter 9 - ANOVA). The data describe snout-vent lengths in 5 populations of Smooth snake (Coronella austriaca) (Uzovka hladka in CZ).
Loading the data from the web:
snakes <- read.csv("http://www.petrkeil.com/wp-content/uploads/2014/02/snakes.csv")
# we will artificially delete 9 data points in the first population
snakes <- snakes[-(1:9),]
summary(snakes)
## population snout.vent
## Min. :1.00 Min. :36.6
## 1st Qu.:2.00 1st Qu.:43.0
## Median :3.00 Median :49.2
## Mean :3.44 Mean :50.1
## 3rd Qu.:4.00 3rd Qu.:57.6
## Max. :5.00 Max. :61.4
Plotting the data:
par(mfrow=c(1,2))
plot(snout.vent ~ population, data=snakes,
ylab="Snout-vent length [cm]")
boxplot(snout.vent ~ population, data=snakes,
ylab="Snout-vent length [cm]",
xlab="population",
col="grey")
For a given snake \(i\) in population \(j\) the model can be written as:
\(y_{ij} \sim Normal(\alpha_j, \sigma)\)
Here is how we prepare the data:
snake.data <- list(y=snakes$snout.vent,
x=snakes$population,
N=nrow(snakes),
N.pop=5)
Loading the library that communicates with JAGS
library(R2jags)
JAGS Model definition:
cat("
model
{
# priors
sigma ~ dunif(0,100)
tau <- 1/(sigma*sigma)
for(j in 1:N.pop)
{
alpha[j] ~ dnorm(0, 0.001)
}
# likelihood
for(i in 1:N)
{
y[i] ~ dnorm(alpha[x[i]], tau)
}
}
", file="fixed_anova.txt")
And we will fit the model:
model.fit.fix <- jags(data=snake.data,
model.file="fixed_anova.txt",
parameters.to.save=c("alpha"),
n.chains=3,
n.iter=2000,
n.burnin=1000,
DIC=FALSE)
## module glm loaded
## module dic loaded
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 96
##
## Initializing model
plot(as.mcmc(model.fit.fix))
model.fit.fix
## Inference for Bugs model at "fixed_anova.txt", fit using jags,
## 3 chains, each with 2000 iterations (first 1000 discarded)
## n.sims = 3000 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
## alpha[1] 45.18 3.272 38.79 43.03 45.15 47.35 51.57 1.001 3000
## alpha[2] 41.26 1.010 39.29 40.58 41.24 41.92 43.25 1.001 3000
## alpha[3] 45.85 1.015 43.91 45.19 45.85 46.52 47.98 1.001 3000
## alpha[4] 54.38 1.027 52.34 53.69 54.36 55.05 56.46 1.002 1700
## alpha[5] 59.01 1.001 56.99 58.36 59.02 59.69 60.96 1.001 3000
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
For a given snake \(i\) in population \(j\) the model can be written in a similar way as for the fixed-effects ANOVA above:
\(y_{ij} \sim Normal(\alpha_j, \sigma)\)
But now we will also add a random effect:
\(\alpha_j \sim Normal(\mu, \sigma)\)
In short, a random effect means that the parameters itself come from (are outcomes of) a given distribution, here it is the Normal.
The data stay the same as in the fixed-effect example above.
Loading the library that communicates with JAGS
library(R2jags)
JAGS Model definition:
cat("
model
{
# priors
grand.mean ~ dnorm(0, 0.001)
grand.sigma ~ dunif(0,100)
grand.tau <- 1/(grand.sigma*grand.sigma)
group.sigma ~ dunif(0, 100)
group.tau <- 1/(group.sigma*group.sigma)
for(j in 1:N.pop)
{
alpha[j] ~ dnorm(grand.mean, grand.tau)
}
# likelihood
for(i in 1:N)
{
y[i] ~ dnorm(alpha[x[i]], group.tau)
}
}
", file="random_anova.txt")
And we will fit the model:
model.fit.rnd <- jags(data=snake.data,
model.file="random_anova.txt",
parameters.to.save=c("alpha"),
n.chains=3,
n.iter=2000,
n.burnin=1000,
DIC=FALSE)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 100
##
## Initializing model
plot(as.mcmc(model.fit.rnd))
model.fit.rnd
## Inference for Bugs model at "random_anova.txt", fit using jags,
## 3 chains, each with 2000 iterations (first 1000 discarded)
## n.sims = 3000 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
## alpha[1] 46.05 3.121 39.95 43.95 46.02 48.21 52.05 1.001 3000
## alpha[2] 41.40 1.018 39.50 40.69 41.39 42.06 43.45 1.001 3000
## alpha[3] 45.97 1.040 43.91 45.26 45.99 46.64 48.01 1.001 2400
## alpha[4] 54.40 1.035 52.37 53.69 54.38 55.08 56.43 1.001 3000
## alpha[5] 58.92 1.023 56.90 58.24 58.91 59.58 60.90 1.001 3000
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
Let’s extract the medians posterior distributions of the expected values of \(\alpha_j\) and their 95% credible intervals:
rnd.alphas <- model.fit.rnd$BUGSoutput$summary
fix.alphas <- model.fit.fix$BUGSoutput$summary
plot(snout.vent ~ population, data=snakes,
ylab="Snout-vent length [cm]", col="grey", pch=19)
points(rnd.alphas[,'2.5%'], col="red", pch="-", cex=1.5)
points(fix.alphas[,'2.5%'], col="blue", pch="-", cex=1.5)
points(rnd.alphas[,'97.5%'], col="red", pch="-", cex=1.5)
points(fix.alphas[,'97.5%'], col="blue", pch="-", cex=1.5)
points(rnd.alphas[,'50%'], col="red", pch="+", cex=1.5)
points(fix.alphas[,'50%'], col="blue", pch="+", cex=1.5)
abline(h=mean(snakes$snout.vent), col="grey")
legend("bottomright", pch=c(19,19), col=c("blue","red"),
legend=c("classical","random effects"))
Note the shrinkage effect!
Also, how would you plot the grand.mean
estimated in the random effects model? How would you extract the between- and within- group variances?