This is a brain dump.

Fitting (spatially or temporally) correlated data is an important use case for mixed models, especially (for example) for longitudinal data. While there may be other solutions (e.g. additive models, cf. recent Bates papers?), autocorrelated error structures seem like a simple, basic tool that should be available to people fitting mixed models in R.

We’re looking at the standard GLMM formulation:

\[ \begin{split} Y_i & \sim \textrm{Dist}(\eta_i) \\ \eta & = {\mathbf X}{\boldsymbol \beta}+ {\mathbf Z}{\mathbf b}\\ {\mathbf b}& \sim \textrm{MVN}(0,\Sigma) \end{split} \] The tricky part is how we define \({\mathbf b}\) and \(\Sigma\). We want this to be a combination of (at least) random intercepts (among-group variation) and autocorrelated residuals within groups. Writing \({\mathbf b}\) out in a different way: \[ \begin{split} b_{ij} & = {\mathbf b}_{0,{ij}} + \epsilon_{ij} \\ \epsilon_i & \sim \Sigma_{\textrm{AR}} \end{split} \] where \(i\) is the group index, \(j\) is the index; \({\mathbf b}_0\) refers to the “other” random effects, and \(\Sigma_{\textrm{AR}}\) is the standard autoregressive covariance matrix, \(\sigma^2 \rho^{j_1-j_2}\)

- If the conditional distributions are Gaussian (i.e. a linear rather than a generalized linear mixed model, then
`lme`

from the recommended`nlme`

will fit a variety of correlation structures, via the`correlation`

argument. - Implementing autocorrelation in Gaussian models with
`lme4`

is surprisingly tricky, because all of the*easy*ways to extend`lme4`

in terms of using the modular structures (see`?modular`

or the`lme4ord`

package) are “G-side” (based on grouping factors) rather than “R-side” (based on structuring the residual error term). You can do a lot by defining an observation-level random effect (and overriding errors about having the same number of random effects as observations), but because it’s impossible to shut off the residual error term or change its value (it is simply estimated from the penalized weighted residual sum of squares that’s left over when the model has been fitted), it means that there will always (??) be a “nugget” effect left over, i.e. correlation functions will be of the form \(g \cdot \phi(t)\), where \(\phi(0)=1\) and \(0<g<1\).`lme4`

handles LMM and GLMMs completely differently; in principle we could fit Gaussian models either via LMM or via`glmer(.,family=gaussian)`

, but the latter is largely untested (at present calling`glmer`

with`family=gaussian`

and an identity link redirects to`lmer`

).

In principle, we simply define some kind of correlation structure on the random-effects variance-covariance matrix of the latent variables; there is not a particularly strong distinction between a correlation structure on the observation-level random effects and one on some other grouping structure (e.g., if there were a random effect of year (with multiple measurements within each year) it might make sense to give the conditional modes a correlation structure).

- Flexibility is nice; in particular, we would like to be able to handle the full range of grouping structures along with a range of temporal correlation structures (AR1, exponential/continuous AR1, ARMA, …)

*advantages*: well documented (Pinheiro and Bates 2000), utility/plotting methods (`ACF`

and`plot.ACF`

), large variety of correlation structures (`nlme`

,`ape`

,`ramps`

packages).

In principle we should be able to re-use correlation structures coded as `corStruct`

s (e.g. see example below, `lme4ord`

), although it is slightly more convenient for our purposes to have the Cholesky factor rather than the inverse-Cholesky factor returned by the `corFactor`

method. (The inverse-Cholesky factor is much sparser for many correlation structures; it may be worth figuring out if we can use that to our advantage, although the blocks of the \(\Lambda\) matrix we’re constructing usually isn’t very big anyway.) - *disadvantages*: `lme`

is slower than `lme4`

, doesn’t handle crossed random effects as easily or as quickly. Can’t handle repeated samples at the same location.

Basically what needs to be done is to use correlation parameters (e.g. just \(\phi\) for something like an autoregression structure) to generate the values for the Cholesky factor. See example below …

- properties relatively poorly understood; PQL is known to be least accurate for small effective sample sizes per cluster (i.e. binary data, few observations per cluster)

This is a branch of `lme4`

that is very out of date, but could conceivably be brought up to date; see here and here for some of the relevant code; `ar1d`

, compound-symmetric, and diagonal variance structures were implemented.

This is a (still-experimental) package built atop Template Model Builder. There is code for the AR1 case, but I’m not sure how complete/tested it is.

Steve Walker’s experimental package; on github.

Uses hierarchical GLMs; on CRAN; Rousset and Ferdy 2014; web page. (Can these do temporal models? I’m sure they could be adapted to do so, but it’s not obvious.)

Uses integrated nested Laplace approximation. At least in principle `f(.,model="ar1")`

works.

Built on Stan; has autocorrelation capabilities (AR, MA, ARMA) via an `autocorr`

argument.

```
library(nlme)
library(lme4)
library(lme4ord)
library(glmmTMB)
library(brms)
library(INLA)
## convenience/manipulation
library(plyr)
library(dplyr)
library(reshape2)
library(ggplot2); theme_set(theme_bw())
library(ggstance)
```

Simulate some data (function definition not shown):

`d <- simCor1(phi=0.8,sdgrp=2,sdres=1,seed=101)`

`lme`

works pretty well:

`(lme_simple_fit <- lme(y~1,random=~1|f,data=d,correlation=corAR1()))`

```
## Linear mixed-effects model fit by REML
## Data: d
## Log-restricted-likelihood: -390.2915
## Fixed: y ~ 1
## (Intercept)
## -0.7070222
##
## Random effects:
## Formula: ~1 | f
## (Intercept) Residual
## StdDev: 2.172383 0.9768581
##
## Correlation Structure: AR(1)
## Formula: ~1 | f
## Parameter estimate(s):
## Phi
## 0.8011786
## Number of Observations: 400
## Number of Groups: 20
```

`glmmTMB`

can do this (see notes below about the need for `tt-1`

in the `ar1()`

term:

`glmmTMB_simple_fit <- glmmTMB(y~1 + (1|f) + ar1(tt-1|f), data=d,family=gaussian)`

For `lme4`

, we start along the lines described in `?modular`

:

```
## process formula (override some sanity checks)
lmod1 <- lFormula(y ~ (1|f) + (tt-1|f),
data = d,
control=lmerControl(check.nobs.vs.nRE="ignore"))
## construct deviance function
devfun <- do.call(mkLmerDevfun,lmod1)
```

Now we need a function to convert \(\{\phi,\sigma\}\) into the appropriate Cholesky factor (\(\theta\)) values (note in this case \(\sigma\) is the *ratio* of the variance of this term to the residual error … shown in gory detail due to possible interest, should (??) generalize to any `corStruct`

object, although things could get a little trickier when blocks differ significantly in their structure …

```
getTheta <- function(phi,sigma,nmax) {
## make corStruct: fake data sequence within a single block
cc <- nlme::Initialize(nlme::corAR1(phi),data=data.frame(t=seq(nmax)))
## get inverse Cholesky factor
mm <- matrix(nlme::corFactor(cc),nrow=nmax) ##
## check backsolve() idiom: all.equal(solve(mm),backsolve(mm,diag(nmax),upper.tri=FALSE))
mm2 <- backsolve(mm,diag(nmax),upper.tri=FALSE) ## invert ...
return(sigma*mm2[lower.tri(mm2,diag=TRUE)]) ## take lower tri & scale by SD
}
```

Now we set up a wrapper function to take a `theta`

vector consisting of \(\{\sigma^2_g,\rho,\sigma^2_r\}\), convert it to a full `theta`

vector, and pass it to our original deviance function:

```
devfun2 <- function(theta,nmax) {
new_theta <- getTheta(phi=theta[2],sigma=theta[3],nmax)
devfun(c(theta[1],new_theta))
}
devfun2(c(1,0.5,1),nmax=20) ## test
```

Lower/upper limits ((-1,1) for `theta[2]`

, the correlation parameter (actually (-0.999,0.999); limits need to be strictly (-1,1) or `corAR1`

complains …), `(0,Inf)`

for the standard deviation parameters. We use `nloptwrap`

for convenience (it’s good, and it produces output that are particularly convenient to pass to `mkMerMod`

). (Not quite sure why we need `nmax`

explicitly … ?)

```
opt1 <- lme4::nloptwrap(c(1,0,1),devfun2,
lower=c(0,-0.999,0),upper=c(Inf,0.999,Inf),nmax=20)
```

We can make the result into a regular `merMod`

object, but the variance-covariance matrix will be printed out in full.

```
lme4_simple_fit <- mkMerMod(rho=environment(devfun),
opt=opt1,
reTrms=lmod1$reTrm,
fr=lmod1$fr)
```

The log-likelihoods for `glmmTMB_simple_fit`

and `lme4_simple_fit`

are similar (`lme4_simple_fit`

is confused about the number of parameters …)

`## [1] -390.4787 -390.2681`

Results match:

The results are all pretty close in terms of parameter values:

```
## Source: local data frame [3 x 3]
##
## platform phi nugget
## (fctr) (dbl) (dbl)
## 1 glmmTMB 0.8089502 0.006484699
## 2 lme4 0.8120291 0.003819590
## 3 lme 0.8011786 0.000000000
```

`glmmPQL`

results look essentially identical to `lme`

…

```
MASS::glmmPQL(y~1,random=~1|f,data=d,
family=gaussian,
correlation=corAR1(),verbose=FALSE)
```

INLA …

```
d$f2 <- d$f ## hack for INLA; needs unique names
inla_simple_fit <- inla(y~1+f(f,model="iid")+f(tt,model="ar1",replicate=f),data=d,
family="gaussian")
```

I *think* this is right (based on docs here), but I can’t figure out how to extract the parameters …

`brms`

…

`m3 <- brm(y~1+(1|f),data=d,autocor=cor_ar(formula = ~1, p = 1, cov = FALSE))`

This gives an estimate for \(\phi\) of 0.826 (with a standard error of 0.038 …

How does `lme4ord`

do in this case?

```
corObj <- nlme:::Initialize(nlme:::corAR1(0, form = ~ 1|f), d)
form1 <- y ~ 1 + (1|f) + nlmeCorStruct(1|f, corObj)
form2 <- y ~ 1 + (1|f) + nlmeCorStruct(1|f, corObj=corObj)
## next one is copied/adapted from lme4ord vignette ...
form3 <- y ~ 1 + (1|f) + nlmeCorStruct(1, corObj = corObj, sig = 1)
form4 <- y ~ 1 + (1|f) + nlmeCorStruct(1, corObj=corObj)
lme4ord_simple_fit <- strucGlmer(form4, family=gaussian, data=d)
## form1: no applicable method for 'corFactor' applied to an object of class "NULL"
## form2: Cholmod error 'A and B inner dimensions must match' ...
## form4: hangs ...
```

Can’t quite figure out what I’m doing here – taking stabs in the dark. The one that works might not be correct - it gives \(\phi=0.588\) - maybe it’s ignoring the grouping structure?

Add a Poisson-distributed sampling layer atop the existing structure:

```
dp <- simCor1(phi=0.8,sdgrp=2,sdres=1,seed=101,
linkinv=exp,simfun=function(x) rpois(length(x),lambda=x))
```

```
MASS::glmmPQL(y~1,random=~1|f,data=dp,
family=poisson,
correlation=corAR1(),verbose=FALSE)
```

```
## Linear mixed-effects model fit by maximum likelihood
## Data: dp
## Log-likelihood: NA
## Fixed: y ~ 1
## (Intercept)
## 0.2219637
##
## Random effects:
## Formula: ~1 | f
## (Intercept) Residual
## StdDev: 1.675137 2.236704
##
## Correlation Structure: AR(1)
## Formula: ~1 | f
## Parameter estimate(s):
## Phi
## 0.4566912
## Variance function:
## Structure: fixed weights
## Formula: ~invwt
## Number of Observations: 400
## Number of Groups: 20
```

In this case `glmmPQL`

doesn’t do anything insane, but: \(\phi\) (autocorr parameter) doesn’t match what we started with, but maybe this is just an issue of definitions (marginal vs conditional autocorrelation) ?

`lme4ord`

works reasonably well:

```
corObj <- nlme:::Initialize(nlme:::corAR1(0, form = ~ 1|f), d)
lme4ord_pois_fit <- strucGlmer(y ~ 1 + (1|f)+nlmeCorStruct(1, corObj = corObj, sig = 1),
family=poisson,
data=dp)
```

I’m not actually sure whether I’m specifying the model correctly here, but the \(\phi\) estimate (0.768) is reasonable, and the output is reasonably pretty (although getting the \(\phi\) parameter out programmatically is a hassle, and the `summary`

method needs work to include all the information given in the `print`

method …):

`print(lme4ord_pois_fit)`

```
## Structured GLMM fit by maximum likelihood (Laplace Approx) ['strucGlmer']
## Family: poisson ( log )
## Formula: y ~ 1 + (1 | f) + nlmeCorStruct(1, corObj = corObj, sig = 1)
## Data: dp
## AIC BIC logLik deviance df.resid
## 1227.3122 1243.2780 -609.6561 206.7885 396
## Random effects term (class: unstruc):
## covariance parameter: 2.294027
## variance-correlation:
## Groups Name Std.Dev.
## f (Intercept) 2.294
## Random effects term (class: nlmeCorStruct):
## Correlation structure of class corAR1 representing
## Phi
## 0.768392
## Standard deviation multiplier: 0.9994793
## Fixed Effects:
## (Intercept)
## -0.8262
```

A hacked-up version of `glmer`

(code hidden - I needed a few dozen lines of code, which seems harder than it should be) gets the same answer.

`glmer_pois_fit <- ar1_glmer(y ~ (1|f) + (tt-1|f),data=dp,family=poisson)`

\(\phi=0.768\)

Seems hard to get confidence intervals on the profile (`lme`

provides back-transformed Wald intervals from the unconstrained scale) for either of these examples. This is a definite hole, but in the meantime I’ll just get point estimates for an ensemble of simulations (i.e., are we getting estimates that are reasonably unbiased and reasonably precise relative to the true values?)

```
glmmTMB_pois_fit <- glmmTMB(y~1 + (1|f) + ar1(tt-1|f), data=dp,
family=poisson)
```

\(\phi\) is approximately 0.768 (matches the others).

Trying `brms::brm()`

gives “Error: ARMA effects for family poisson are not yet implemented”

- Need to remember to put in the
`(1|f)`

(group/IID) term as well as the autoregressive term (with AR only, this should match the fit of`gls(y~1,correlation=corAR1(~1|f))`

but does*not*match the way we simulated the data … - If we use
`ar1(tt|f)`

, with`glmmTMB`

we get a warning message (“AR1 not meaningful with intercept”).*This is important*; it made me aware of a similar mistake I was making previously with my`lmer`

hack below. Since`lme4`

uses unstructured (i.e. general positive-definite) variance-covariance matrices by default, it normally doesn’t matter how you parameterize the contrasts for a categorical variable – the model fit/predications are invariant to linear transformations. This is no longer true when we use structured variance-covariance matrices (!), so we need`(tt-1|f)`

rather than`(tt|f)`

… - We could gain slightly more efficiency out of the
`lme4`

hack by

- manipulating
`Lind`

to account for the small number of*unique*values in the variance-covariance matrix; (2) writing code to get the unique values of the Cholesky factor of the AR1 matrix directly, rather than having to invert the inverse-Cholesky factor.

- Could also consider fitting the correlation parameters on the logit scale … although this is inconsistent with the way we fit the other variance-cov parameters (i.e. on simply bounded, not transformed, scale)

- finish simple Poisson GLMM example
- see how far we can get through P&B
`Ovary`

data-set example (below) - other examples … ?

**Everything below here is more or less junk at the moment, but I’m keeping it in for now**

Using one of the examples from Pinheiro and Bates (`?Ovary`

):

`plot(Ovary)`