Introduction

This is a brain dump.

Fitting (spatially or temporally) correlated data is an important use case for mixed models, for example with longitudinal data. While there may be other solutions (e.g. additive models; cf. recent Bates papers?), autocorrelated error structures seem like a simple, basic tool that should be available to people fitting mixed models in R.

We’re looking at the standard GLMM formulation:

\[ \begin{split} Y_i & \sim \textrm{Dist}(\eta_i) \\ {\boldsymbol \eta} & = {\mathbf X}{\boldsymbol \beta}+ {\mathbf Z}{\mathbf b}\\ {\mathbf b}& \sim \textrm{MVN}(0,\Sigma) \end{split} \] The tricky part is how we define \({\mathbf b}\) and \(\Sigma\). We want this to be a combination of (at least) random intercepts (among-group variation) and autocorrelated residuals within groups. Writing \({\mathbf b}\) out in a different way: \[ \begin{split} b_{ij} & = b_{0,ij} + \epsilon_{ij} \\ {\boldsymbol \epsilon}_i & \sim \textrm{MVN}(0,\Sigma_{\textrm{AR}}) \end{split} \] where \(i\) is the group index and \(j\) is the within-group (time) index; \({\mathbf b}_0\) refers to the “other” random effects, and \(\Sigma_{\textrm{AR}}\) is the standard autoregressive covariance matrix, with elements \(\sigma^2 \rho^{|j_1-j_2|}\).
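For concreteness, here is a small illustration (my addition, not part of the original derivation) of one within-group block of \(\Sigma_{\textrm{AR}}\):

phi <- 0.8; sigma <- 1; n <- 5   ## autocorrelation, within-group SD, block size
Sigma_AR <- sigma^2 * phi^abs(outer(1:n, 1:n, "-"))
round(Sigma_AR, 3)               ## correlations decay geometrically with lag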

GLMMs

In principle, we simply define some kind of correlation structure on the random-effects variance-covariance matrix of the latent variables; there is not a particularly strong distinction between a correlation structure on the observation-level random effects and one on some other grouping structure (e.g., if there were a random effect of year (with multiple measurements within each year) it might make sense to give the conditional modes a correlation structure).

General points

  • Flexibility is nice; in particular, we would like to be able to handle the full range of grouping structures along with a range of temporal correlation structures (AR1, exponential/continuous AR1, ARMA, …)

Packages

nlme (lme)

  • advantages: well documented (Pinheiro and Bates 2000), utility/plotting methods (ACF and plot.ACF), large variety of correlation structures (nlme, ape, ramps packages).

In principle we should be able to re-use correlation structures coded as corStructs (e.g. see the lme4ord example below), although it is slightly more convenient for our purposes to have the Cholesky factor rather than the inverse-Cholesky factor returned by the corFactor method; a quick check of this point is shown below. (The inverse-Cholesky factor is much sparser for many correlation structures; it may be worth figuring out whether we can use that to our advantage, although the blocks of the \(\Lambda\) matrix we’re constructing usually aren’t very big anyway.)

  • disadvantages: lme is slower than lme4 and doesn’t handle crossed random effects as easily or as quickly; can’t handle repeated samples at the same location.
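The check (my own illustration; it assumes, as the code later in these notes does, that corFactor returns the inverse of the lower-triangular Cholesky factor of the correlation matrix):

cc <- nlme::Initialize(nlme::corAR1(0.5), data=data.frame(t=1:4))
mm <- matrix(nlme::corFactor(cc), nrow=4)       ## inverse-Cholesky factor: sparse (bidiagonal)
LL <- backsolve(mm, diag(4), upper.tri=FALSE)   ## invert to get the (denser) Cholesky factor
all.equal(LL %*% t(LL), 0.5^abs(outer(1:4, 1:4, "-")))  ## recovers the AR1 correlation matrix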

lme4

Basically what needs to be done is to use correlation parameters (e.g. just \(\phi\) for something like an autoregression structure) to generate the values for the Cholesky factor. See example below …

MASS::glmmPQL

  • properties relatively poorly understood; PQL is known to be least accurate for small effective sample sizes per cluster (i.e. binary data, few observations per cluster)

lme4/flexLambda branch

This is a branch of lme4 that is very out of date, but could conceivably be brought up to date; see here and here for some of the relevant code; ar1d, compound-symmetric, and diagonal variance structures were implemented.

glmmTMB

This is a (still-experimental) package built atop Template Model Builder. There is code for the AR1 case, but I’m not sure how complete/tested it is.

lme4ord

Steve Walker’s experimental package; on github.

spaMM

Uses hierarchical GLMs; on CRAN; Rousset and Ferdy 2014; web page. (Can these do temporal models? I’m sure they could be adapted to do so, but it’s not obvious.)

INLA

Uses integrated nested Laplace approximation. At least in principle f(.,model="ar1") works.

brms

Built on Stan; has autocorrelation capabilities (AR, MA, ARMA) via an autocorr argument.

Simplest example

Gaussian

library(nlme)
library(lme4)
library(lme4ord)
library(glmmTMB)
library(brms)
library(INLA)
## convenience/manipulation
library(plyr)
library(dplyr)
library(reshape2)
library(ggplot2); theme_set(theme_bw())
library(ggstance)

Simulate some data (function definition not shown):

d <- simCor1(phi=0.8,sdgrp=2,sdres=1,seed=101)
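The definition of simCor1 isn’t shown; here is a minimal sketch of what it might look like, reconstructed (as an assumption) from the calls here and in the Poisson example below:

simCor1 <- function(phi=0.8, sdgrp=2, sdres=1, npergrp=20, ngrp=20,
                    seed=NULL, linkinv=identity, simfun=identity) {
    if (!is.null(seed)) set.seed(seed)
    ## within-group AR1 covariance matrix, sigma^2 * phi^|j1-j2|
    cmat <- sdres^2 * phi^abs(outer(0:(npergrp-1), 0:(npergrp-1), "-"))
    errs <- MASS::mvrnorm(ngrp, mu=rep(0, npergrp), Sigma=cmat)
    ranef <- rnorm(ngrp, mean=0, sd=sdgrp)                   ## among-group intercepts
    d <- data.frame(f=factor(rep(seq(ngrp), each=npergrp)),  ## group factor
                    tt=factor(rep(seq(npergrp), ngrp)))      ## time factor within group
    eta <- ranef[as.numeric(d$f)] + c(t(errs))               ## linear predictor
    d$y <- simfun(linkinv(eta))                 ## identity/identity for the Gaussian case
    d
}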

lme works pretty well:

(lme_simple_fit <- lme(y~1,random=~1|f,data=d,correlation=corAR1()))
## Linear mixed-effects model fit by REML
##   Data: d 
##   Log-restricted-likelihood: -390.2915
##   Fixed: y ~ 1 
## (Intercept) 
##  -0.7070222 
## 
## Random effects:
##  Formula: ~1 | f
##         (Intercept)  Residual
## StdDev:    2.172383 0.9768581
## 
## Correlation Structure: AR(1)
##  Formula: ~1 | f 
##  Parameter estimate(s):
##       Phi 
## 0.8011786 
## Number of Observations: 400
## Number of Groups: 20

glmmTMB can do this (see notes below about the need for tt-1 in the ar1() term):

glmmTMB_simple_fit <- glmmTMB(y~1 + (1|f) + ar1(tt-1|f), data=d,family=gaussian)

For lme4, we start along the lines described in ?modular:

## process formula (override some sanity checks)
lmod1 <- lFormula(y ~ (1|f) + (tt-1|f),
                  data = d,
                  control=lmerControl(check.nobs.vs.nRE="ignore"))
## construct deviance function
devfun <- do.call(mkLmerDevfun,lmod1)

Now we need a function to convert \(\{\phi,\sigma\}\) into the appropriate Cholesky factor (\(\theta\)) values (note that in this case \(\sigma\) is the standard deviation of this term relative to the residual standard deviation). This is shown in gory detail because of possible interest; it should (??) generalize to any corStruct object, although things could get a little trickier when blocks differ significantly in their structure …

getTheta <- function(phi,sigma,nmax) {
    ## make corStruct: fake data sequence within a single block
    cc <- nlme::Initialize(nlme::corAR1(phi),data=data.frame(t=seq(nmax)))
    ## get inverse Cholesky factor
    mm <- matrix(nlme::corFactor(cc),nrow=nmax) ## 
    ## check backsolve() idiom: all.equal(solve(mm),backsolve(mm,diag(nmax),upper.tri=FALSE))
    mm2 <- backsolve(mm,diag(nmax),upper.tri=FALSE) ## invert ...
    return(sigma*mm2[lower.tri(mm2,diag=TRUE)])     ## take lower tri & scale by SD
}
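As a sanity check (my addition): for \(\phi=0.8\) and nmax=3 the Cholesky factor of the AR1 correlation matrix is known in closed form (\(L_{i1}=\phi^{i-1}\); \(L_{ij}=\phi^{i-j}\sqrt{1-\phi^2}\) for \(j>1\)), so the column-wise lower triangle should be 1, 0.8, 0.64, 0.6, 0.48, 0.6:

round(getTheta(phi=0.8, sigma=1, nmax=3), 3)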

Now we set up a wrapper function to take a reduced parameter vector \(\{\sigma_g,\phi,\sigma_r\}\) (group-level standard deviation, autocorrelation parameter, within-group standard deviation), convert it to a full \(\theta\) vector, and pass it to our original deviance function:

devfun2 <- function(theta,nmax) {
    new_theta <- getTheta(phi=theta[2],sigma=theta[3],nmax)
    devfun(c(theta[1],new_theta))
}
devfun2(c(1,0.5,1),nmax=20) ## test

Lower/upper limits are (0,Inf) for the standard deviation parameters and (-1,1) for theta[2], the correlation parameter (actually (-0.999,0.999): the limits need to be strictly inside (-1,1) or corAR1 complains …). We use nloptwrap for convenience (it’s good, and it produces output that is particularly convenient to pass to mkMerMod). (Not quite sure why we need to pass nmax explicitly … ?)

opt1 <- lme4::nloptwrap(c(1,0,1),devfun2,
         lower=c(0,-0.999,0),upper=c(Inf,0.999,Inf),nmax=20)

We can make the result into a regular merMod object, but the variance-covariance matrix will be printed out in full.

lme4_simple_fit <- mkMerMod(rho=environment(devfun),
               opt=opt1,
               reTrms=lmod1$reTrms,
               fr=lmod1$fr)

The log-likelihoods for glmmTMB_simple_fit and lme4_simple_fit are similar (although lme4_simple_fit is confused about the number of parameters …):
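(Presumably computed with something like the following; the comparison code is hidden, so this line is my reconstruction.)

c(logLik(glmmTMB_simple_fit), logLik(lme4_simple_fit))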

## [1] -390.4787 -390.2681

The results are all pretty close in terms of parameter values:

## Source: local data frame [3 x 3]
## 
##   platform       phi      nugget
##     (fctr)     (dbl)       (dbl)
## 1  glmmTMB 0.8089502 0.006484699
## 2     lme4 0.8120291 0.003819590
## 3      lme 0.8011786 0.000000000

glmmPQL results look essentially identical to those from lme:

MASS::glmmPQL(y~1,random=~1|f,data=d,
              family=gaussian,
              correlation=corAR1(),verbose=FALSE)

INLA …

d$f2 <- d$f ## hack for INLA; needs unique names
inla_simple_fit <- inla(y~1+f(f,model="iid")+f(tt,model="ar1",replicate=f),data=d,
                        family="gaussian")

I think this is right (based on docs here), but I can’t figure out how to extract the parameters …
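For what it’s worth, I would expect the hyperparameter estimates to be in the summary.hyperpar component of the fitted object (an untested guess on my part; INLA reports precisions rather than standard deviations, so some back-transformation would still be needed):

inla_simple_fit$summary.hyperpar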

brms

m3 <- brm(y~1+(1|f),data=d,autocor=cor_ar(formula = ~1, p = 1, cov = FALSE))

This gives an estimate for \(\phi\) of 0.826 (with a standard error of 0.038) …

How does lme4ord do in this case?

corObj <- nlme:::Initialize(nlme:::corAR1(0, form = ~ 1|f), d)
form1 <- y ~ 1 + (1|f) + nlmeCorStruct(1|f, corObj)
form2 <- y ~ 1 + (1|f) + nlmeCorStruct(1|f, corObj=corObj)
## next one is copied/adapted from lme4ord vignette ...
form3 <- y ~ 1 + (1|f) + nlmeCorStruct(1, corObj = corObj, sig = 1)
form4 <- y ~ 1 + (1|f) + nlmeCorStruct(1, corObj=corObj)
## form1: no applicable method for 'corFactor' applied to an object of class "NULL"
## form2: Cholmod error 'A and B inner dimensions must match' ...
## form4: hangs ...
lme4ord_simple_fit <- strucGlmer(form3, family=gaussian, data=d)

Can’t quite figure out what I’m doing here; I’m taking stabs in the dark. The one that works (form3) might not be correct: it gives \(\phi=0.588\), so maybe it’s ignoring the grouping structure?

GLMM example?

Add a Poisson-distributed sampling layer atop the existing structure:

dp <- simCor1(phi=0.8,sdgrp=2,sdres=1,seed=101,
              linkinv=exp,simfun=function(x) rpois(length(x),lambda=x))
MASS::glmmPQL(y~1,random=~1|f,data=dp,
              family=poisson,
              correlation=corAR1(),verbose=FALSE)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dp 
##   Log-likelihood: NA
##   Fixed: y ~ 1 
## (Intercept) 
##   0.2219637 
## 
## Random effects:
##  Formula: ~1 | f
##         (Intercept) Residual
## StdDev:    1.675137 2.236704
## 
## Correlation Structure: AR(1)
##  Formula: ~1 | f 
##  Parameter estimate(s):
##       Phi 
## 0.4566912 
## Variance function:
##  Structure: fixed weights
##  Formula: ~invwt 
## Number of Observations: 400
## Number of Groups: 20

In this case glmmPQL doesn’t do anything insane, but the estimate of \(\phi\) (the autocorrelation parameter) doesn’t match what we started with; maybe this is just an issue of definitions (marginal vs. conditional autocorrelation)?

lme4ord works reasonably well:

corObj <- nlme:::Initialize(nlme:::corAR1(0, form = ~ 1|f), dp)
lme4ord_pois_fit <- strucGlmer(y ~ 1 + (1|f)+nlmeCorStruct(1, corObj = corObj, sig = 1),
           family=poisson,
           data=dp)

I’m not actually sure whether I’m specifying the model correctly here, but the \(\phi\) estimate (0.768) is reasonable, and the output is reasonably pretty (although getting the \(\phi\) parameter out programmatically is a hassle, and the summary method needs work to include all the information given in the print method …):

print(lme4ord_pois_fit)
## Structured GLMM fit by maximum likelihood (Laplace Approx)   ['strucGlmer']
##  Family: poisson  ( log )
## Formula: y ~ 1 + (1 | f) + nlmeCorStruct(1, corObj = corObj, sig = 1)
##    Data: dp
##       AIC       BIC    logLik  deviance  df.resid 
## 1227.3122 1243.2780 -609.6561  206.7885       396 
## Random effects term (class: unstruc): 
##   covariance parameter:   2.294027 
##   variance-correlation:   
##  Groups Name        Std.Dev.
##  f      (Intercept) 2.294   
## Random effects term (class: nlmeCorStruct): 
## Correlation structure of class corAR1 representing
##      Phi 
## 0.768392 
##   Standard deviation multiplier:  0.9994793 
## Fixed Effects:
## (Intercept)  
##     -0.8262

A hacked-up version of glmer (code hidden; I needed a few dozen lines of code, which seems harder than it should be) gets the same answer:

glmer_pois_fit <- ar1_glmer(y ~ (1|f) + (tt-1|f),data=dp,family=poisson)

\(\phi=0.768\)

It seems hard to get profile confidence intervals (lme provides back-transformed Wald intervals from the unconstrained scale) for either of these examples. This is a definite hole; in the meantime I’ll just get point estimates for an ensemble of simulations (i.e., are we getting estimates that are reasonably unbiased and reasonably precise relative to the true values?).

glmmTMB_pois_fit <- glmmTMB(y~1 + (1|f) + ar1(tt-1|f), data=dp,
                            family=poisson)

\(\phi\) is approximately 0.768 (matches the others).

Trying brms::brm() gives “Error: ARMA effects for family poisson are not yet implemented”.

Technical notes

We could gain slightly more efficiency by:

  1. manipulating Lind to account for the small number of unique values in the variance-covariance matrix;
  2. writing code to get the unique values of the Cholesky factor of the AR1 matrix directly, rather than having to invert the inverse-Cholesky factor.
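Point (1) is real but modest: for an AR1 block of size \(n\), the Cholesky factor contains only \(2n-1\) distinct values among its \(n(n+1)/2\) nonzero entries. A quick check using getTheta from above (my addition):

th <- getTheta(phi=0.8, sigma=1, nmax=20)
length(th)                     ## 210 lower-triangle entries
length(unique(round(th, 12)))  ## should be only 2*20-1 = 39 distinct values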

To do

Everything below here is more or less junk at the moment, but I’m keeping it in for now.

Harder example

Using one of the examples from Pinheiro and Bates (?Ovary):

plot(Ovary)

So far these lme and lmer fits are identical (no correlations yet) …

fm1Ovar.lme <- lme(follicles ~ sin(2*pi*Time) + cos(2*pi*Time),
                   data = Ovary, random = pdDiag(~sin(2*pi*Time)))
fm1Ovar.lmer <- lmer(follicles ~ sin(2*pi*Time) + cos(2*pi*Time)+
                        (1+sin(2*pi*Time)||Mare),
                    data = Ovary)
all.equal(fixef(fm1Ovar.lme),fixef(fm1Ovar.lmer))
## [1] TRUE
all.equal(logLik(fm1Ovar.lme),logLik(fm1Ovar.lmer))
## [1] "Attributes: < Component \"nobs\": Mean relative difference: 0.009836066 >"
all.equal(unname(as.numeric(VarCorr(fm1Ovar.lme)[,"StdDev"])),
          unname(as.data.frame(VarCorr(fm1Ovar.lmer))[,"sdcor"]),
          tolerance=1e-7)
## [1] TRUE

What is the ACF?

plot(ACF(fm1Ovar.lme),alpha=0.05)

Fitting with correlation in lme is easy:

fm2Ovar.lme <- update(fm1Ovar.lme, correlation = corAR1())
plot(ACF(fm2Ovar.lme,resType="normalized"),alpha=0.05)

How do we do this by hand in lme4?

Add time-within-Mare factor tt to data:

library(plyr)
Ovary2 <- ddply(Ovary,"Mare",
      mutate,
      tt=factor(seq_along(Time)))

Use the stuff from ?modular (have to override identifiability check):

lmod <- lFormula(follicles ~ sin(2*pi*Time) + cos(2*pi*Time)+
                        (0+sin(2*pi*Time)|Mare)+(0+tt|Mare),
               data = Ovary2,
               control=lmerControl(check.nobs.vs.nRE="ignore"))
## check RE structure ...
lmod$reTrm$cnms
## $Mare
## [1] "sin(2 * pi * Time)"
## 
## $Mare
##  [1] "tt1"  "tt2"  "tt3"  "tt4"  "tt5"  "tt6"  "tt7"  "tt8"  "tt9"  "tt10"
## [11] "tt11" "tt12" "tt13" "tt14" "tt15" "tt16" "tt17" "tt18" "tt19" "tt20"
## [21] "tt21" "tt22" "tt23" "tt24" "tt25" "tt26" "tt27" "tt28" "tt29" "tt30"
## [31] "tt31"
devfun <- do.call(mkLmerDevfun,lmod)

Now we need a function to convert \(\{\phi,\sigma\}\) into the appropriate Cholesky factor (\(\theta\)) values (note that in this case \(\sigma\) is the standard deviation of this term relative to the residual standard deviation):

getTheta <- function(phi,sigma,nmax=31) {
    cc <- Initialize(corAR1(phi),data=data.frame(t=seq(nmax)))
    ## get inverse Cholesky factor
    mm <- matrix(corFactor(cc),nrow=nmax) ## 
    ## all.equal(solve(mm),backsolve(mm,diag(nmax),upper.tri=FALSE))
    ## invert ...
    mm2 <- backsolve(mm,diag(nmax),upper.tri=FALSE)
    return(sigma*mm2[lower.tri(mm2,diag=TRUE)])
}
devfun2 <- function(theta) {
    new_theta <- c(theta[1],getTheta(theta[2],theta[3]))
    devfun(new_theta)
}
devfun2(c(1,0.5,1))
## [1] 1663.836

It’s not important at this point, but we could gain slightly more efficiency by (1) manipulating Lind to account for the small number of unique values in the variance-covariance matrix; (2) writing code to get the unique values of the Cholesky factor of the AR1 matrix directly, rather than having to invert the inverse-Cholesky factor.

Lower/upper limits: (0,Inf) for theta[1], (-1,1) for theta[2] (the correlation parameter; actually (-0.999,0.999), since the limits need to be strictly inside (-1,1) or corAR1 complains …), and (0,Inf) for theta[3].

opt1 <- minqa::bobyqa(c(1,0,1),devfun2,lower=c(0,-0.999,0),upper=c(Inf,0.999,Inf))
opt2 <- lme4::nloptwrap(c(1,0,1),devfun2,lower=c(0,-0.999,0),upper=c(Inf,0.999,Inf))
all.equal(opt1$fval,opt2$fval)
## [1] TRUE

We can make the result into a regular merMod object, but the variance-covariance matrix will be printed out in full. (It turns out that opt2 is in a slightly more convenient format for mkMerMod …)

res <- mkMerMod(rho=environment(devfun),
               opt=opt2,
               reTrms=lmod$reTrms,
               fr=lmod$fr)
res
## Linear mixed model fit by REML ['lmerMod']
## REML criterion at convergence: 1546.474
## Random effects:
##  Groups   Name               Std.Dev. Corr                              
##  Mare     sin(2 * pi * Time) 0.8462                                     
##  Mare.1   tt1                4.1434                                     
##           tt2                4.1434   0.90                              
##           tt3                4.1434   0.80 0.90                         
## ... [rows skipped] ...
##           tt30               4.1434   0.04 0.05 0. ...
##           tt31               4.1434   0.04 0.04 0. ...
##  Residual                    1.8957                                     
##                                              
## Number of obs: 308, groups:  Mare, 11
## Fixed Effects:
##        (Intercept)  sin(2 * pi * Time)  cos(2 * pi * Time)  
##             12.049              -2.906              -0.798

Retrieve the autocorrelation function and compare with the lme fit. (We could get the correlation parameter directly via rho <- res@optinfo$val[2].)

sigma0 <- sigma(res)            ## resid var
acovf <- VarCorr(res)[[2]][,1]  ## first row of var-cov matrix
acfvec1 <- acovf/(acovf[1]+sigma0^2)  ## scale by *total* resid variance
cs1 <- fm2Ovar.lme$modelStruct$corStruct
acfvec2 <- corMatrix(cs1)[[1]][1,]  ## first row of corr matrix
aa <- ACF(fm2Ovar.lme)
par(las=1,bty="l")
plot(ACF~lag,data=aa,log='y')
## Warning in xy.coords(x, y, xlabel, ylabel, log): 5 y values <= 0 omitted
## from logarithmic plot
matlines(0:14,cbind(acfvec1[1:length(acfvec2)],acfvec2)[1:15,])

Hmm: lme solution looks better on the face of it, but the log-likelihood of the lme4 solution is higher (which would make sense since the model has an additional nugget parameter). I wonder what I screwed up? Did I miss a variance term somewhere?

lme4ord

library(lme4ord)
corObj <- nlme:::Initialize(nlme:::corAR1(0, form = ~ 1|Mare), Ovary)
## lme4ord can't handle as much within formulas? or formula processing
## screws things up?
Ovary3 <- transform(Ovary,sin2pit=sin(2*pi*Time))
try(strucGlmer(follicles~ sin(2*pi*Time) + cos(2*pi*Time)+
               (1+sin2pit||Mare)+
               nlmeCorStruct(1, corObj = corObj, sig = 1),
           data=Ovary3))
## Warning in Ops.factor(0, sin2pit): '+' not meaningful for factors
## Warning in Ops.ordered(0 + sin2pit, Mare): '|' is not meaningful for
## ordered factors

kronecker games

We want to know the relationship between chol(kron(m1,m2)) and chol(m1), chol(m2) … “three minutes’ thought would suffice to find this out, but thought is irksome and three minutes is a long time”. So let’s check numerically:

set.seed(101)
## make some cholesky factors
c2 <- c1 <- matrix(0,3,3)
c1[lower.tri(c1,diag=TRUE)] <- rnorm(6)
c2[lower.tri(c2,diag=TRUE)] <- rnorm(6)
## multiply them up to pos-def matrices
m1 <- tcrossprod(c1)
m2 <- tcrossprod(c2)
## ignore signs completely for now ...
all.equal(abs(c1),abs(t(chol(m1))))
## [1] TRUE
## check what we want to know 
all.equal(chol(kronecker(m1,m2)),kronecker(chol(m1),chol(m2)))
## [1] TRUE
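Why this works: by the mixed-product property, \((R_1 \otimes R_2)^\top (R_1 \otimes R_2) = (R_1^\top R_1) \otimes (R_2^\top R_2) = m_1 \otimes m_2\), and the Kronecker product of two upper-triangular matrices with positive diagonals is itself upper triangular with a positive diagonal, so \(R_1 \otimes R_2\) is the (unique) Cholesky factor of \(m_1 \otimes m_2\).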