Introduction

This is a brain dump.

Fitting (spatially or temporally) correlated data is an important use case for mixed models, for example with longitudinal data. While there may be other solutions (e.g. additive models; cf. recent Bates papers?), autocorrelated error structures seem like a simple, basic tool that should be available to people fitting mixed models in R.

We’re looking at the standard GLMM formulation:

\[ \begin{split} Y_i & \sim \textrm{Dist}(\eta_i) \\ {\boldsymbol \eta} & = {\mathbf X}{\boldsymbol \beta}+ {\mathbf Z}{\mathbf b}\\ {\mathbf b}& \sim \textrm{MVN}(0,\Sigma) \end{split} \] The tricky part is how we define \({\mathbf b}\) and \(\Sigma\). We want this to be a combination of (at least) random intercepts (among-group variation) and autocorrelated residuals within groups. Writing \({\mathbf b}\) out in a different way: \[ \begin{split} b_{ij} & = b_{0,ij} + \epsilon_{ij} \\ {\boldsymbol \epsilon}_i & \sim \textrm{MVN}(0,\Sigma_{\textrm{AR}}) \end{split} \] where \(i\) is the group index and \(j\) is the within-group (time) index; \({\mathbf b}_0\) refers to the “other” random effects, and \(\Sigma_{\textrm{AR}}\) is the standard autoregressive covariance matrix, with elements \(\sigma^2 \rho^{|j_1-j_2|}\).
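For concreteness, here is a small illustration (my addition, not part of the original derivation) of one within-group block of \(\Sigma_{\textrm{AR}}\):

phi <- 0.8; sigma <- 1; n <- 5   ## autocorrelation, within-group SD, block size
Sigma_AR <- sigma^2 * phi^abs(outer(1:n, 1:n, "-"))
round(Sigma_AR, 3)               ## correlations decay geometrically with lag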

GLMMs

In principle, we simply define some kind of correlation structure on the random-effects variance-covariance matrix of the latent variables; there is not a particularly strong distinction between a correlation structure on the observation-level random effects and one on some other grouping structure (e.g., if there were a random effect of year (with multiple measurements within each year) it might make sense to give the conditional modes a correlation structure).

General points

  • Flexibility is nice; in particular, we would like to be able to handle the full range of grouping structures along with a range of temporal correlation structures (AR1, exponential/continuous AR1, ARMA, …)

Packages

nlme (lme)

  • advantages: well documented (Pinheiro and Bates 2000), utility/plotting methods (ACF and plot.ACF), large variety of correlation structures (nlme, ape, ramps packages).

In principle we should be able to re-use correlation structures coded as corStructs (e.g. see the lme4ord example below), although it is slightly more convenient for our purposes to have the Cholesky factor rather than the inverse-Cholesky factor returned by the corFactor method; a quick check of this point is shown below. (The inverse-Cholesky factor is much sparser for many correlation structures; it may be worth figuring out whether we can use that to our advantage, although the blocks of the \(\Lambda\) matrix we’re constructing usually aren’t very big anyway.)

  • disadvantages: lme is slower than lme4 and doesn’t handle crossed random effects as easily or as quickly; can’t handle repeated samples at the same location.
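The check (my own illustration; it assumes, as the code later in these notes does, that corFactor returns the inverse of the lower-triangular Cholesky factor of the correlation matrix):

cc <- nlme::Initialize(nlme::corAR1(0.5), data=data.frame(t=1:4))
mm <- matrix(nlme::corFactor(cc), nrow=4)       ## inverse-Cholesky factor: sparse (bidiagonal)
LL <- backsolve(mm, diag(4), upper.tri=FALSE)   ## invert to get the (denser) Cholesky factor
all.equal(LL %*% t(LL), 0.5^abs(outer(1:4, 1:4, "-")))  ## recovers the AR1 correlation matrix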

lme4

Basically what needs to be done is to use correlation parameters (e.g. just \(\phi\) for something like an autoregression structure) to generate the values for the Cholesky factor. See example below …

MASS::glmmPQL

  • properties relatively poorly understood; PQL is known to be least accurate for small effective sample sizes per cluster (i.e. binary data, few observations per cluster)

lme4/flexLambda branch

This is a branch of lme4 that is very out of date, but could conceivably be brought up to date; see here and here for some of the relevant code; ar1d, compound-symmetric, and diagonal variance structures were implemented.

glmmTMB

This is a (still-experimental) package built atop Template Model Builder. There is code for the AR1 case, but I’m not sure how complete/tested it is.

lme4ord

Steve Walker’s experimental package; on github.

spaMM

Uses hierarchical GLMs; on CRAN; Rousset and Ferdy 2014; web page. (Can these do temporal models? I’m sure they could be adapted to do so, but it’s not obvious.)

INLA

Uses integrated nested Laplace approximation. At least in principle f(.,model="ar1") works.

brms

Built on Stan; has autocorrelation capabilities (AR, MA, ARMA) via an autocorr argument.

Simplest example

Gaussian

library(nlme)
library(lme4)
library(lme4ord)
library(glmmTMB)
library(brms)
library(INLA)
## convenience/manipulation
library(plyr)
library(dplyr)
library(reshape2)
library(ggplot2); theme_set(theme_bw())
library(ggstance)

Simulate some data (function definition not shown):

d <- simCor1(phi=0.8,sdgrp=2,sdres=1,seed=101)
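The definition of simCor1 isn’t shown; here is a minimal sketch of what it might look like, reconstructed (as an assumption) from the calls here and in the Poisson example below:

simCor1 <- function(phi=0.8, sdgrp=2, sdres=1, npergrp=20, ngrp=20,
                    seed=NULL, linkinv=identity, simfun=identity) {
    if (!is.null(seed)) set.seed(seed)
    ## within-group AR1 covariance matrix, sigma^2 * phi^|j1-j2|
    cmat <- sdres^2 * phi^abs(outer(0:(npergrp-1), 0:(npergrp-1), "-"))
    errs <- MASS::mvrnorm(ngrp, mu=rep(0, npergrp), Sigma=cmat)
    ranef <- rnorm(ngrp, mean=0, sd=sdgrp)                   ## among-group intercepts
    d <- data.frame(f=factor(rep(seq(ngrp), each=npergrp)),  ## group factor
                    tt=factor(rep(seq(npergrp), ngrp)))      ## time factor within group
    eta <- ranef[as.numeric(d$f)] + c(t(errs))               ## linear predictor
    d$y <- simfun(linkinv(eta))                 ## identity/identity for the Gaussian case
    d
}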

lme works pretty well:

(lme_simple_fit <- lme(y~1,random=~1|f,data=d,correlation=corAR1()))
## Linear mixed-effects model fit by REML
##   Data: d 
##   Log-restricted-likelihood: -390.2915
##   Fixed: y ~ 1 
## (Intercept) 
##  -0.7070222 
## 
## Random effects:
##  Formula: ~1 | f
##         (Intercept)  Residual
## StdDev:    2.172383 0.9768581
## 
## Correlation Structure: AR(1)
##  Formula: ~1 | f 
##  Parameter estimate(s):
##       Phi 
## 0.8011786 
## Number of Observations: 400
## Number of Groups: 20

glmmTMB can do this (see notes below about the need for tt-1 in the ar1() term):

glmmTMB_simple_fit <- glmmTMB(y~1 + (1|f) + ar1(tt-1|f), data=d,family=gaussian)

For lme4, we start along the lines described in ?modular:

## process formula (override some sanity checks)
lmod1 <- lFormula(y ~ (1|f) + (tt-1|f),
                  data = d,
                  control=lmerControl(check.nobs.vs.nRE="ignore"))
## construct deviance function
devfun <- do.call(mkLmerDevfun,lmod1)

Now we need a function to convert \(\{\phi,\sigma\}\) into the appropriate Cholesky factor (\(\theta\)) values (note that in this case \(\sigma\) is the standard deviation of this term relative to the residual standard deviation). This is shown in gory detail because of possible interest; it should (??) generalize to any corStruct object, although things could get a little trickier when blocks differ significantly in their structure …

getTheta <- function(phi,sigma,nmax) {
    ## make corStruct: fake data sequence within a single block
    cc <- nlme::Initialize(nlme::corAR1(phi),data=data.frame(t=seq(nmax)))
    ## get inverse Cholesky factor
    mm <- matrix(nlme::corFactor(cc),nrow=nmax) ## 
    ## check backsolve() idiom: all.equal(solve(mm),backsolve(mm,diag(nmax),upper.tri=FALSE))
    mm2 <- backsolve(mm,diag(nmax),upper.tri=FALSE) ## invert ...
    return(sigma*mm2[lower.tri(mm2,diag=TRUE)])     ## take lower tri & scale by SD
}
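As a sanity check (my addition): for \(\phi=0.8\) and nmax=3 the Cholesky factor of the AR1 correlation matrix is known in closed form (\(L_{i1}=\phi^{i-1}\); \(L_{ij}=\phi^{i-j}\sqrt{1-\phi^2}\) for \(j>1\)), so the column-wise lower triangle should be 1, 0.8, 0.64, 0.6, 0.48, 0.6:

round(getTheta(phi=0.8, sigma=1, nmax=3), 3)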

Now we set up a wrapper function to take a reduced parameter vector \(\{\sigma_g,\phi,\sigma_r\}\) (group-level standard deviation, autocorrelation parameter, within-group standard deviation), convert it to a full \(\theta\) vector, and pass it to our original deviance function:

devfun2 <- function(theta,nmax) {
    new_theta <- getTheta(phi=theta[2],sigma=theta[3],nmax)
    devfun(c(theta[1],new_theta))
}
devfun2(c(1,0.5,1),nmax=20) ## test

Lower/upper limits are (0,Inf) for the standard deviation parameters and (-1,1) for theta[2], the correlation parameter (actually (-0.999,0.999): the limits need to be strictly inside (-1,1) or corAR1 complains …). We use nloptwrap for convenience (it’s good, and it produces output that is particularly convenient to pass to mkMerMod). (Not quite sure why we need to pass nmax explicitly … ?)

opt1 <- lme4::nloptwrap(c(1,0,1),devfun2,
         lower=c(0,-0.999,0),upper=c(Inf,0.999,Inf),nmax=20)

We can make the result into a regular merMod object, but the variance-covariance matrix will be printed out in full.

lme4_simple_fit <- mkMerMod(rho=environment(devfun),
               opt=opt1,
               reTrms=lmod1$reTrms,
               fr=lmod1$fr)

The log-likelihoods for glmmTMB_simple_fit and lme4_simple_fit are similar (although lme4_simple_fit is confused about the number of parameters …):
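(Presumably computed with something like the following; the comparison code is hidden, so this line is my reconstruction.)

c(logLik(glmmTMB_simple_fit), logLik(lme4_simple_fit))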

## [1] -390.4787 -390.2681

The results are all pretty close in terms of parameter values:

## Source: local data frame [3 x 3]
## 
##   platform       phi      nugget
##     (fctr)     (dbl)       (dbl)
## 1  glmmTMB 0.8089502 0.006484699
## 2     lme4 0.8120291 0.003819590
## 3      lme 0.8011786 0.000000000

glmmPQL results look essentially identical to those from lme:

MASS::glmmPQL(y~1,random=~1|f,data=d,
              family=gaussian,
              correlation=corAR1(),verbose=FALSE)

INLA …

d$f2 <- d$f ## hack for INLA; needs unique names
inla_simple_fit <- inla(y~1+f(f,model="iid")+f(tt,model="ar1",replicate=f),data=d,
                        family="gaussian")

I think this is right (based on docs here), but I can’t figure out how to extract the parameters …
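For what it’s worth, I would expect the hyperparameter estimates to be in the summary.hyperpar component of the fitted object (an untested guess on my part; INLA reports precisions rather than standard deviations, so some back-transformation would still be needed):

inla_simple_fit$summary.hyperpar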

brms

m3 <- brm(y~1+(1|f),data=d,autocor=cor_ar(formula = ~1, p = 1, cov = FALSE))

This gives an estimate for \(\phi\) of 0.826 (with a standard error of 0.038) …

How does lme4ord do in this case?

corObj <- nlme:::Initialize(nlme:::corAR1(0, form = ~ 1|f), d)
form1 <- y ~ 1 + (1|f) + nlmeCorStruct(1|f, corObj)
form2 <- y ~ 1 + (1|f) + nlmeCorStruct(1|f, corObj=corObj)
## next one is copied/adapted from lme4ord vignette ...
form3 <- y ~ 1 + (1|f) + nlmeCorStruct(1, corObj = corObj, sig = 1)
form4 <- y ~ 1 + (1|f) + nlmeCorStruct(1, corObj=corObj)
## form1: no applicable method for 'corFactor' applied to an object of class "NULL"
## form2: Cholmod error 'A and B inner dimensions must match' ...
## form4: hangs ...
lme4ord_simple_fit <- strucGlmer(form3, family=gaussian, data=d)

Can’t quite figure out what I’m doing here; I’m taking stabs in the dark. The one that works (form3) might not be correct: it gives \(\phi=0.588\), so maybe it’s ignoring the grouping structure?

GLMM example?

Add a Poisson-distributed sampling layer atop the existing structure:

dp <- simCor1(phi=0.8,sdgrp=2,sdres=1,seed=101,
              linkinv=exp,simfun=function(x) rpois(length(x),lambda=x))
MASS::glmmPQL(y~1,random=~1|f,data=dp,
              family=poisson,
              correlation=corAR1(),verbose=FALSE)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dp 
##   Log-likelihood: NA
##   Fixed: y ~ 1 
## (Intercept) 
##   0.2219637 
## 
## Random effects:
##  Formula: ~1 | f
##         (Intercept) Residual
## StdDev:    1.675137 2.236704
## 
## Correlation Structure: AR(1)
##  Formula: ~1 | f 
##  Parameter estimate(s):
##       Phi 
## 0.4566912 
## Variance function:
##  Structure: fixed weights
##  Formula: ~invwt 
## Number of Observations: 400
## Number of Groups: 20

In this case glmmPQL doesn’t do anything insane, but the estimate of \(\phi\) (the autocorrelation parameter) doesn’t match what we started with; maybe this is just an issue of definitions (marginal vs. conditional autocorrelation)?

lme4ord works reasonably well:

corObj <- nlme:::Initialize(nlme:::corAR1(0, form = ~ 1|f), dp)
lme4ord_pois_fit <- strucGlmer(y ~ 1 + (1|f)+nlmeCorStruct(1, corObj = corObj, sig = 1),
           family=poisson,
           data=dp)

I’m not actually sure whether I’m specifying the model correctly here, but the \(\phi\) estimate (0.768) is reasonable, and the output is reasonably pretty (although getting the \(\phi\) parameter out programmatically is a hassle, and the summary method needs work to include all the information given in the print method …):

print(lme4ord_pois_fit)
## Structured GLMM fit by maximum likelihood (Laplace Approx)   ['strucGlmer']
##  Family: poisson  ( log )
## Formula: y ~ 1 + (1 | f) + nlmeCorStruct(1, corObj = corObj, sig = 1)
##    Data: dp
##       AIC       BIC    logLik  deviance  df.resid 
## 1227.3122 1243.2780 -609.6561  206.7885       396 
## Random effects term (class: unstruc): 
##   covariance parameter:   2.294027 
##   variance-correlation:   
##  Groups Name        Std.Dev.
##  f      (Intercept) 2.294   
## Random effects term (class: nlmeCorStruct): 
## Correlation structure of class corAR1 representing
##      Phi 
## 0.768392 
##   Standard deviation multiplier:  0.9994793 
## Fixed Effects:
## (Intercept)  
##     -0.8262

A hacked-up version of glmer (code hidden; I needed a few dozen lines of code, which seems harder than it should be) gets the same answer:

glmer_pois_fit <- ar1_glmer(y ~ (1|f) + (tt-1|f),data=dp,family=poisson)

\(\phi=0.768\)

It seems hard to get profile confidence intervals (lme provides back-transformed Wald intervals from the unconstrained scale) for either of these examples. This is a definite hole; in the meantime I’ll just get point estimates for an ensemble of simulations (i.e., are we getting estimates that are reasonably unbiased and reasonably precise relative to the true values?).

glmmTMB_pois_fit <- glmmTMB(y~1 + (1|f) + ar1(tt-1|f), data=dp,
                            family=poisson)

\(\phi\) is approximately 0.768 (matches the others).

Trying brms::brm() gives “Error: ARMA effects for family poisson are not yet implemented”.

Technical notes

We could gain slightly more efficiency by:

  1. manipulating Lind to account for the small number of unique values in the variance-covariance matrix;
  2. writing code to get the unique values of the Cholesky factor of the AR1 matrix directly, rather than having to invert the inverse-Cholesky factor.
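Point (1) is real but modest: for an AR1 block of size \(n\), the Cholesky factor contains only \(2n-1\) distinct values among its \(n(n+1)/2\) nonzero entries. A quick check using getTheta from above (my addition):

th <- getTheta(phi=0.8, sigma=1, nmax=20)
length(th)                     ## 210 lower-triangle entries
length(unique(round(th, 12)))  ## should be only 2*20-1 = 39 distinct values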

To do

Everything below here is more or less junk at the moment, but I’m keeping it in for now.

Harder example

Using one of the examples from Pinheiro and Bates (?Ovary):

plot(Ovary)

So far these lme and lmer fits are identical (no correlations yet) …

fm1Ovar.lme <- lme(follicles ~ sin(2*pi*Time) + cos(2*pi*Time),
                   data = Ovary, random = pdDiag(~sin(2*pi*Time)))
fm1Ovar.lmer <- lmer(follicles ~ sin(2*pi*Time) + cos(2*pi*Time)+
                        (1+sin(2*pi*Time)||Mare),
                    data = Ovary)
all.equal(fixef(fm1Ovar.lme),fixef(fm1Ovar.lmer))
## [1] TRUE
all.equal(logLik(fm1Ovar.lme),logLik(fm1Ovar.lmer))
## [1] "Attributes: < Component \"nobs\": Mean relative difference: 0.009836066 >"
all.equal(unname(as.numeric(VarCorr(fm1Ovar.lme)[,"StdDev"])),
          unname(as.data.frame(VarCorr(fm1Ovar.lmer))[,"sdcor"]),
          tolerance=1e-7)
## [1] TRUE

What is the ACF?

plot(ACF(fm1Ovar.lme),alpha=0.05)

Fitting with correlation in lme is easy:

fm2Ovar.lme <- update(fm1Ovar.lme, correlation = corAR1())
plot(ACF(fm2Ovar.lme,resType="normalized"),alpha=0.05)

How do we do this by hand in lme4?

Add time-within-Mare factor tt to data:

library(plyr)
Ovary2 <- ddply(Ovary,"Mare",
      mutate,
      tt=factor(seq_along(Time)))

Use the stuff from ?modular (have to override identifiability check):

lmod <- lFormula(follicles ~ sin(2*pi*Time) + cos(2*pi*Time)+
                        (0+sin(2*pi*Time)|Mare)+(0+tt|Mare),
               data = Ovary2,
               control=lmerControl(check.nobs.vs.nRE="ignore"))
## check RE structure ...
lmod$reTrm$cnms
## $Mare
## [1] "sin(2 * pi * Time)"
## 
## $Mare
##  [1] "tt1"  "tt2"  "tt3"  "tt4"  "tt5"  "tt6"  "tt7"  "tt8"  "tt9"  "tt10"
## [11] "tt11" "tt12" "tt13" "tt14" "tt15" "tt16" "tt17" "tt18" "tt19" "tt20"
## [21] "tt21" "tt22" "tt23" "tt24" "tt25" "tt26" "tt27" "tt28" "tt29" "tt30"
## [31] "tt31"
devfun <- do.call(mkLmerDevfun,lmod)

Now we need a function to convert \(\{\phi,\sigma\}\) into the appropriate Cholesky factor (\(\theta\)) values (note that in this case \(\sigma\) is the standard deviation of this term relative to the residual standard deviation):

getTheta <- function(phi,sigma,nmax=31) {
    cc <- Initialize(corAR1(phi),data=data.frame(t=seq(nmax)))
    ## get inverse Cholesky factor
    mm <- matrix(corFactor(cc),nrow=nmax) ## 
    ## all.equal(solve(mm),backsolve(mm,diag(nmax),upper.tri=FALSE))
    ## invert ...
    mm2 <- backsolve(mm,diag(nmax),upper.tri=FALSE)
    return(sigma*mm2[lower.tri(mm2,diag=TRUE)])
}
devfun2 <- function(theta) {
    new_theta <- c(theta[1],getTheta(theta[2],theta[3]))
    devfun(new_theta)
}
devfun2(c(1,0.5,1))
## [1] 1663.836

It’s not important at this point, but we could gain slightly more efficiency by (1) manipulating Lind to account for the small number of unique values in the variance-covariance matrix; (2) writing code to get the unique values of the Cholesky factor of the AR1 matrix directly, rather than having to invert the inverse-Cholesky factor.

Lower/upper limits: (0,Inf) for theta[1], (-1,1) for theta[2] (the correlation parameter; actually (-0.999,0.999), since the limits need to be strictly inside (-1,1) or corAR1 complains …), and (0,Inf) for theta[3].

opt1 <- minqa::bobyqa(c(1,0,1),devfun2,lower=c(0,-0.999,0),upper=c(Inf,0.999,Inf))
opt2 <- lme4::nloptwrap(c(1,0,1),devfun2,lower=c(0,-0.999,0),upper=c(Inf,0.999,Inf))
all.equal(opt1$fval,opt2$fval)
## [1] TRUE

We can make the result into a regular merMod object, but the variance-covariance matrix will be printed out in full. (It turns out that opt2 is in a slightly more convenient format for mkMerMod …)

res <- mkMerMod(rho=environment(devfun),
               opt=opt2,
               reTrms=lmod$reTrms,
               fr=lmod$fr)
res
## Linear mixed model fit by REML ['lmerMod']
## REML criterion at convergence: 1546.474
## Random effects:
##  Groups   Name               Std.Dev. Corr                              
##  Mare     sin(2 * pi * Time) 0.8462                                     
##  Mare.1   tt1                4.1434                                     
##           tt2                4.1434   0.90                              
##           tt3                4.1434   0.80 0.90                         
## ... [rows skipped] ...
##           tt30               4.1434   0.04 0.05 0. ...
##           tt31               4.1434   0.04 0.04 0. ...
##  Residual                    1.8957                                     
##                                              
## Number of obs: 308, groups:  Mare, 11
## Fixed Effects:
##        (Intercept)  sin(2 * pi * Time)  cos(2 * pi * Time)  
##             12.049              -2.906              -0.798

Retrieve the autocorrelation function and compare with the lme fit. (We could get the correlation parameter directly via rho <- res@optinfo$val[2].)

sigma0 <- sigma(res)            ## resid var
acovf <- VarCorr(res)[[2]][,1]  ## first row of var-cov matrix
acfvec1 <- acovf/(acovf[1]+sigma0^2)  ## scale by *total* resid variance
cs1 <- fm2Ovar.lme$modelStruct$corStruct
acfvec2 <- corMatrix(cs1)[[1]][1,]  ## first row of corr matrix
aa <- ACF(fm2Ovar.lme)
par(las=1,bty="l")
plot(ACF~lag,data=aa,log='y')
## Warning in xy.coords(x, y, xlabel, ylabel, log): 5 y values <= 0 omitted
## from logarithmic plot
matlines(0:14,cbind(acfvec1[1:length(acfvec2)],acfvec2)[1:15,])

Hmm: lme solution looks better on the face of it, but the log-likelihood of the lme4 solution is higher (which would make sense since the model has an additional nugget parameter). I wonder what I screwed up? Did I miss a variance term somewhere?

lme4ord

library(lme4ord)
corObj <- nlme:::Initialize(nlme:::corAR1(0, form = ~ 1|Mare), Ovary)
## lme4ord can't handle as much within formulas? or formula processing
## screws things up?
Ovary3 <- transform(Ovary,sin2pit=sin(2*pi*Time))
try(strucGlmer(follicles~ sin(2*pi*Time) + cos(2*pi*Time)+
               (1+sin2pit||Mare)+
               nlmeCorStruct(1, corObj = corObj, sig = 1),
           data=Ovary3))
## Warning in Ops.factor(0, sin2pit): '+' not meaningful for factors
## Warning in Ops.ordered(0 + sin2pit, Mare): '|' is not meaningful for
## ordered factors

kronecker games

We want to know the relationship between chol(kron(m1,m2)) and chol(m1), chol(m2) … “three minutes’ thought would suffice to find this out, but thought is irksome and three minutes is a long time”. So let’s check numerically:

set.seed(101)
## make some cholesky factors
c2 <- c1 <- matrix(0,3,3)
c1[lower.tri(c1,diag=TRUE)] <- rnorm(6)
c2[lower.tri(c2,diag=TRUE)] <- rnorm(6)
## multiply them up to pos-def matrices
m1 <- tcrossprod(c1)
m2 <- tcrossprod(c2)
## ignore signs completely for now ...
all.equal(abs(c1),abs(t(chol(m1))))
## [1] TRUE
## check what we want to know 
all.equal(chol(kronecker(m1,m2)),kronecker(chol(m1),chol(m2)))
## [1] TRUE
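Why this works: by the mixed-product property, \((R_1 \otimes R_2)^\top (R_1 \otimes R_2) = (R_1^\top R_1) \otimes (R_2^\top R_2) = m_1 \otimes m_2\), and the Kronecker product of two upper-triangular matrices with positive diagonals is itself upper triangular with a positive diagonal, so \(R_1 \otimes R_2\) is the (unique) Cholesky factor of \(m_1 \otimes m_2\).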