Learning outcomes

  • Learn the basics of parameter and state estimation for simple state space models
  • Create and fit some basic multivariate time series models
  • Understand the state of the art in time series models, including latent factor models and dynamic parameter models

Relevant JAGS file:

jags_statespace.R
jags_multivariate_AR.R
jags_multivariate_statespace.R

Introduction to state space models

  • State space models are a very general family of models that are used when we have a noisy time series of observations that are stochastically related to a hidden time series which is what we are really interested in
  • For example they are used for palaeoclimate reconstruction when we observe pollen but are really interested in climate
  • All state space models have two parts:

  • The first part is called the state equation which links the observations to a latent stochatic process
  • The second part of the model is called the evoluation equation which determines how the latent stochastic process changes over time

A simple linear state space model

  • We define \(y_t\) in the usual way, but write \(x_t\) for the hidden stochastic process
  • For a simple linear state space model we have a state equation of:\[ y_t = \alpha_y + \beta_y x_t + \epsilon_t,\; \epsilon_t \sim N(0, \sigma_y^2)\]
  • The evolution equation could be a random walk:\[ x_t = x_{t-1} + \gamma_t,\; \gamma_t \sim N(0, \sigma_x^2)\]
  • The usual aim when fitting these models is to either estimate \(x_t\), or the parameters, or to predict future values of \(x_t\)
  • This type of model is sometimes known as the Kalman Filter

JAGS code for a linear state space model

model_code = '
model
{
  # Likelihood
  for (t in 1:T) {
    y[t] ~ dnorm(alpha_y + beta_y * x[t], tau_y)
  }
  x[1] ~ dnorm(0, 0.01)
  for (t in 2:T) {
    x[t] ~ dnorm(x[t-1], tau_x)
  }

  # Priors
  tau_y <- 1/pow(sigma_y, 2)
  sigma_y ~ dunif(0, 100)
  tau_x <- 1/pow(sigma_x, 2)
  sigma_x ~ dunif(0, 100)
  }
'

Priors for state space models

  • You need to be very careful with state space models as it's very easy to create models which are ill-defined and crash
  • For example, in the Kalman filter model before you can switch the sign of \(x_t\) and \(\beta_y\) and still end up with the same model
  • It's advisable to either fix some of the parameters, or use extra data to calibrate the parameters of the state space model
  • You can do better if you have multivariate observations or stricter requirements about the time series applied to \(x_t\)

Example: palaeoclimate reconstruction

palaeo = read.csv('https://raw.githubusercontent.com/andrewcparnell/tsme_course/master/data/palaeo.csv')
par(mfrow=c(2,1))
with(palaeo,plot(year, proxy))
with(palaeo,plot(year, temp)) # Only available for a subset

Palaeoclimate reconstruction results

Introduction to multivariate models

  • Often we have multiple different observations at each time and we want to model them together
  • For example, we might have multiple different climate variables observed, or multiple chemical signatures. If they are correlated then by fitting them separately we will lose precision
  • If the observations are observed at different times for each time series we can use the NA trick, or create a latent state space model on which all measurements are regular

The Vector AR model

  • We can extend most of the models we have met to multivariate scenarios by applying the multivariate normal distribution instead of the univariate version
  • Suppose now that \(y_t\) is a vector of length \(k\) containing all the observations at time \(t\)
  • We can write:\[y_t = A + \Phi y_{t-1} + \epsilon_t,\; \epsilon_t \sim MVN(0, \Sigma)\]or equivalently\[y_t \sim MVN(A + \Phi y_{t-1}, \Sigma)\]
  • where MVN is the multivariate normal distribution
  • Here the parameter vector \(A\) controls the overall mean level for each of the \(k\) series, \(\Phi\) is a \(k \times k\) matrix which controls the influence on the current value of the previous time points of both series
  • \(\Sigma\) here is a \(k \times k\) matrix that controls the variance of the process and the residual correlation between the two series (NB: this is different from the GP formulation we met yesterday)

JAGS code for the VAR model

model_code = '
model
{
  # Likelihood
  for (t in 2:T) {
    y[t, ] ~ dmnorm(mu[t, ], Sigma.Inv)
    mu[t, 1:k] <- A + Phi %*% y[t-1,]
  }
  Sigma.Inv ~ dwish(I, k+1)
  Sigma <- inverse(Sigma.Inv)  

  # Priors
  for(i in 1:k) {
    A[i] ~ dnorm(0, 0.01)
    Phi[i,i] ~ dunif(-1, 1)
    for(j in (i+1):k) {
      Phi[i,j] ~ dunif(-1,1)
      Phi[j,i] ~ dunif(-1,1)
    }
  }
}
'

Example: joint temperature/sea level models

par(mfrow=c(2,1))
with(bivariate_data, plot(Year[-1], diff(Anomaly), type='l'))
with(bivariate_data, plot(Year[-1], diff(sea_level_m), type='l'))

VAR model results:

Mixing up state space models, multivariate time series, Gaussian processes

  • We can extend the simple state space model we met earlier to work for multivariate series
  • We would have a state equation that relates our observations to a multivariate latent time series (possibly of a different dimension)
  • We could change the time series model of the latent state to be an ARIMA model, an O-U process, a Gaussian process, or anything else you can think of!

Dynamic linear models

  • So far in all our models we have forced the time series parameters to be constant over time
  • In a Dynamic Linear Model we have a state space model with :\[ y_t = F_t x_t + \epsilon_t,\; \epsilon_t \sim MVN(0, \Sigma_t)\]\[ x_t = G_t x_{t-1} + \gamma_t,\; \gamma_t \sim N(0, \Psi_t)\]
  • The key difference here is that the transformation matrices \(F_t\) and \(G_t\) can change over time, as can the variance matrices \(\Sigma_t\) and \(\Psi_t\), possibly in an ARCH/GARCH type framework
  • These are very hard models to fit in jags but simple versions can work

Latent factor time series models

  • If we have very many series, a common approach to reduce the dimension is to use Factor Analysis or Principal components
  • In a latent factor model we write:\[y_t = B f_t + \epsilon_t\]where now \(B\) is a \(num series \times num factors\) factor loading matrix which transforms the high dimensional \(y_t\) into a lower dimensional \(f_t\).
  • \(f_t\) can then be run using a set of univariate time series, e.g. random walks
  • The \(B\) matrix is often hard to estimate and might require some tight priors

Summary

  • We have seen how to fit basic Bayesian state space models and observed some of their pitfalls
  • We know how to create some simple multivariate time series models
  • We have seen some of the more advanced ideas in time series models such as DLMs and dynamic factor models
  • You are now an expert in Bayesian time series!