As of December 1st, 2016, the forecasting extension of mlr is under
code review and is a branch of the development version of mlr. You can
download this branch through the githubinstall
package.
library(githubinstall)
gh_install_packages("mlr-org/mlr", ref = "forecasting")
The standard objective in forecasting is, at time period t, to make predictions h
periods into the future. Forecasting tasks are most suitable when past
patterns in the data will continue into the future. While R has many
wonderful packages for forecasting, there is no package on CRAN that
gives users a standardized interface for forecasting models in the way that mlr
and caret
do for machine learning models. The purpose of this package extension is to give users of mlr
the opportunity to safely and productively train, optimize, and deploy forecasting models.
For our purposes we will simulate some data.
set.seed(1234)
dat = arima.sim(model = list(ar = c(.5,.2), ma = c(.4), order = c(2,0,1)), n = 200)
times = (as.POSIXlt("1992-01-14")) + lubridate::days(1:200)
dat = xts::xts(dat,order.by = times)
colnames(dat) = c("arma_test")
dat.train = dat[1:190,]
dat.test = dat[191:200,]
To specify a forecast task we must pass an xts
object containing the data, a target, and the frequency
of the data. The frequency of the data can be thought of as the
seasonality. For example, a frequency of seven on daily data would be a
weekly seasonality. A frequency of fifty-two on weekly data would
indicate a yearly seasonality.
library(mlr)
fcregr.task = makeForecastRegrTask(id = "test", data = dat.train, target = "arma_test",
frequency = 7L)
fcregr.task
## Task: test
## Type: fcregr
## Target: arma_test
## Observations: 190
## Dates:
## Start: 1992-01-15
## End: 1992-07-22
## Frequency: 7
## Features:
## numerics factors ordered
## 0 0 0
## Missings: FALSE
## Has weights: FALSE
## Has blocking: FALSE
Like a regression task, this records the type of the learning problem and basic information about the data set. This task also returns the start and end dates of the time series as well as the frequency.
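As with other mlr tasks, the underlying data can be pulled back out for inspection with getTaskData(), the same accessor used in the resampling code later in this post:
head(getTaskData(fcregr.task))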
One common problem with forecasting is that it is difficult to use additional explanatory variables or to forecast multiple targets that depend on one another. If we are at time t and want to forecast 10 periods into the future, we need to know the values of the explanatory variables over those future periods, which is often not possible. A new set of models treats explanatory variables endogenously instead of exogenously, allowing us to forecast not only our target but also the additional explanatory variables. This is done by treating all the variables as targets, making them endogenous to the model. To use these models, we create a multivariate forecasting task.
data("EuStockMarkets")
EuStockMarkets.time = lubridate::date_decimal(as.numeric(time(EuStockMarkets)))
EuStockMarkets = xts::xts(as.data.frame(EuStockMarkets), order.by = EuStockMarkets.time)
EuStockMarkets.train = EuStockMarkets[1:1850,]
EuStockMarkets.test = EuStockMarkets[1851:1860,]
mfcregr.task = makeMultiForecastRegrTask(id = "bigvar", data = EuStockMarkets.train, target = "all", frequency = 7L)
mfcregr.task
## Task: bigvar
## Type: mfcregr
## Target: DAX SMI CAC FTSE
## Observations: 1850
## Dates:
## Start: 1991-07-01 02:18:27
## End: 1998-08-10 19:23:04
## Frequency: 7
## Features:
## numerics factors ordered
## 0 0 0
## Missings: FALSE
## Has weights: FALSE
## Has blocking: FALSE
This task prints the same information as a univariate forecast task.
If we would like to specify a target variable while still forecasting the rest of the data in an endogenous manner, we change our target to one of the variables.
mfcregr.task = makeMultiForecastRegrTask(id = "bigvar", data = EuStockMarkets.train, target = "DAX", frequency = 7L)
mfcregr.task
## Task: bigvar
## Type: mfcregr
## Target: DAX
## Observations: 1850
## Dates:
## Start: 1991-07-01 02:18:27
## End: 1998-08-10 19:23:04
## Frequency: 7
## Features:
## numerics factors ordered
## 3 0 0
## Missings: FALSE
## Has weights: FALSE
## Has blocking: FALSE
Several new models have been included from the forecast
package as well as from rugarch.
These all operate the same as the other models in mlr, with one very important addition: each model has either an h
or an n.ahead
parameter, which is the number of periods you want to forecast into the
future. Note that this should be equal to the horizon you set in your
growing or fixed window resampling strategy.
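For example, the forecast-package learners such as fcregr.Arima use h (the GARCH learner below uses n.ahead); a minimal sketch of keeping the forecast length in line with a 10-period horizon:
# h must match the horizon used later in the resampling description
arima.mod = makeLearner("fcregr.Arima", order = c(2L, 0L, 1L), h = 10L)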
We create a GARCH model from the rugarch
package using makeLearner()
with the learner class fcregr.garch.
The n.ahead
parameter specifies that we are forecasting 10 periods
into the future. We can view all of the parameters available for
tuning with getLearnerParamSet():
getLearnerParamSet("fcregr.garch")
## Type len Def
## model discrete - sGARCH
## garchOrder integervector 2 1,1
## submodel discrete - -
## external.regressors untyped - <NULL>
## variance.targeting logical - FALSE
## armaOrder integervector 2 1,1
## include.mean logical - TRUE
## archm logical - FALSE
## archpow discrete - 1
## arfima logical - FALSE
## archex logical - FALSE
## distribution.model discrete - norm
## solver discrete - solnp
## solver.control untyped - <list>
## numderiv.control untyped - <list>
## stationarity discrete - 1
## fixed.se discrete - 0
## scale discrete - 0
## rec.init numeric - -
## n.ahead integer - 10
## n.roll integer - 0
## probs untyped - 0.05,0.95
## Constr Req Tunable
## model sGARCH,fGARCH,eGARCH,gjrGARCH,apARCH,... - TRUE
## garchOrder 1 to Inf - TRUE
## submodel GARCH,TGARCH,AVGARCH,NGARCH,NAGARCH,A... Y TRUE
## external.regressors - - TRUE
## variance.targeting - - TRUE
## armaOrder 1 to Inf - TRUE
## include.mean - - TRUE
## archm - - TRUE
## archpow 1,2 - TRUE
## arfima - - TRUE
## archex - - TRUE
## distribution.model norm,snorm,std,sstd,ged,sged,nig,ghyp... - TRUE
## solver nlminb,solnp,lbfgs,gosolnp,nloptr,hybrid - TRUE
## solver.control - - TRUE
## numderiv.control - - TRUE
## stationarity 0,1 - TRUE
## fixed.se 0,1 - TRUE
## scale 0,1 - TRUE
## rec.init 1e-10 to Inf - TRUE
## n.ahead 1 to Inf - TRUE
## n.roll 0 to Inf - TRUE
## probs - - TRUE
## Trafo
## model -
## garchOrder -
## submodel -
## external.regressors -
## variance.targeting -
## armaOrder -
## include.mean -
## archm -
## archpow -
## arfima -
## archex -
## distribution.model -
## solver -
## solver.control -
## numderiv.control -
## stationarity -
## fixed.se -
## scale -
## rec.init -
## n.ahead -
## n.roll -
## probs -
garch.mod = makeLearner("fcregr.garch",
model = "sGARCH", garchOrder = c(2,1),
n.ahead = 10L, include.mean = FALSE)
garch.mod
## Learner fcregr.garch from package rugarch
## Type: fcregr
## Name: Generalized AutoRegressive Conditional Heteroskedasticity; Short name: garch
## Class: fcregr.garch
## Properties: numerics,quantile
## Predict-Type: response
## Hyperparameters: model=sGARCH,garchOrder=2,1,n.ahead=10,include.mean=FALSE
Once we have set up our task and model, train(), predict(), and performance()
can be called to build and evaluate the model.
garch.train = train(garch.mod, fcregr.task)
garch.pred = predict(garch.train, newdata = dat.test)
performance(garch.pred, measures = mase, task = fcregr.task)
## mase
## 2.054674
This standard evaluation method is user friendly, and in addition we can now use mlr's
built-in resampling and tuning methods to tune a GARCH model for the data.
mlr now has two new cross-validation resampling strategies, GrowingCV
and FixedCV.
They are both rolling forecasting origin techniques established in
Hyndman and Athanasopoulos (2013) and first made widely available for machine
learning in R by the caret
package's createTimeSlices()
function. We specify:
resamp.desc = makeResampleDesc("GrowingCV", horizon = 10L,
initial.window = .90,
size = nrow(getTaskData(fcregr.task)), skip = 0)
resamp.desc
## Window description:
## growing with 10 iterations:
## 171 observations in initial window and 10 horizon.
## Predict: test
## Stratification: FALSE
Note that we should not specify stratification, as it does not really
make sense to stratify our data in the context of time series (unless a
future implementation can somehow use this for panel data). The
wonderful graphic posted below comes from the caret
package's website and gives an intuitive idea of the sliding windows for both the growing and fixed options.
The top two graphs correspond to a fixed window, while the bottom two graphs are for a growing window.
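For comparison, a fixed window resampling description could be set up the same way; this is a sketch that assumes FixedCV accepts the same arguments as GrowingCV:
# A fixed (sliding) window of constant size instead of a growing one
resamp.fixed = makeResampleDesc("FixedCV", horizon = 10L,
                                initial.window = .90,
                                size = nrow(getTaskData(fcregr.task)), skip = 0)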
Using the growing window strategy specified above, we can run a windowed cross-validation with our previous GARCH learner to evaluate its general performance.
garch.resample = resample(learner = garch.mod, task = fcregr.task,
resampling = resamp.desc, measures = mase)
## [Resample] growing iter 1:
## mase.test.mean=1.54
## [Resample] growing iter 2:
## mase.test.mean=1.64
## [Resample] growing iter 3:
## mase.test.mean=1.97
## [Resample] growing iter 4:
## mase.test.mean=2.63
## [Resample] growing iter 5:
## mase.test.mean=2.26
## [Resample] growing iter 6:
## mase.test.mean=1.86
## [Resample] growing iter 7:
## mase.test.mean=2.03
## [Resample] growing iter 8:
## mase.test.mean= 1.6
## [Resample] growing iter 9:
## mase.test.mean=1.57
## [Resample] growing iter 10:
## mase.test.mean=1.21
## [Resample] Aggr. Result: mase.test.mean=1.83
garch.resample
## Resample Result
## Task: test
## Learner: fcregr.garch
## Aggr perf: mase.test.mean=1.83
## Runtime: 2.16807
The forecasting features fully integrate into mlr, allowing us to
also make a parameter set to tune over. Here we tune a GARCH model using
iterated F-racing to select our hyperparameters. In addition, we can tune the
model in parallel using parallelMap.
par_set = makeParamSet(
makeDiscreteParam(id = "model", values = c("sGARCH", "fGARCH")),
makeDiscreteParam(id = "submodel", values = c("GARCH", "NGARCH"),
requires = quote(model == "fGARCH")),
makeIntegerVectorParam(id = "garchOrder", len = 2L, lower = 1, upper = 3))
# Tune with iterated F-racing (irace)
ctrl = makeTuneControlIrace(maxExperiments = 180L)
library(parallelMap)
parallelStartSocket(3)
configureMlr(on.learner.error = "warn")
res = tuneParams(garch.mod, task = fcregr.task,
resampling = resamp.desc, par.set = par_set,
control = ctrl, measures = mase)
parallelStop()
res
## Tune result:
## Op. pars: model=fGARCH; submodel=GARCH; garchOrder=1,2
## mase.test.mean=1.83
Once we have tuned the model, we set the hyperparameters of the best model with setHyperPars()
and then train the model like any other.
garch.hyp = setHyperPars(makeLearner("fcregr.garch", n.ahead = 10L),
par.vals = res$x)
garch.best = train(garch.hyp, fcregr.task)
garch.pred = predict(garch.best, newdata = dat.test)
performance(garch.pred, measures = mase, task = fcregr.task)
## mase
## 2.1131
A new function updateModel()
has been implemented that
updates the model given new data. This function is currently only
implemented for ets, Arima, auto.arima, bats, tbats, and nnetar.
fcregr.task = makeForecastRegrTask(id = "test", data = dat.train,
target = "arma_test", frequency = 7L)
arm = makeLearner("fcregr.Arima", order = c(2L,0L,1L),
h = 10L, include.mean = FALSE)
arm
## Learner fcregr.Arima from package forecast
## Type: fcregr
## Name: AutoRegressive Integrated Moving Average; Short name: Arima
## Class: fcregr.Arima
## Properties: numerics,quantile
## Predict-Type: response
## Hyperparameters: order=2,0,1,h=10,include.mean=FALSE
armMod = train(arm, fcregr.task)
updateArmMod = updateModel(armMod, fcregr.task, newdata = dat.test)
updateArmMod
## Model for learner.id=fcregr.Arima; learner.class=fcregr.Arima
## Trained on: task.id = test; obs = 10; features = 0
## Hyperparameters: order=2,0,1,h=10,include.mean=FALSE
This works by making a call to updateLearner.fcregr.Arima()
and updating the model and task data with newdata. predict()
now forecasts the next 10 periods.
predict(updateArmMod, task = fcregr.task)
## Prediction: 10 observations
## predict.type: response
## threshold:
## time: 0.01
## response
## 1992-07-22 23:59:40 1.3631309
## 1992-07-23 23:59:21 1.0226897
## 1992-07-24 23:59:02 0.7818986
## 1992-07-25 23:58:43 0.5996956
## 1992-07-26 23:58:24 0.4601914
## 1992-07-27 23:58:05 0.3531699
## ... (10 rows, 1 cols)
The function createLagDiffFeatures()
allows users to create arbitrary lags and differences of the data, which makes
it possible to use autoregressive-style machine learning models for forecasting.
This method requires passing a data frame whose row names are POSIXct
compatible.
dat.train.reg = as.data.frame(dat.train, row.names = index(dat.train))
regr.task = makeRegrTask(data = dat.train.reg, target = "arma_test")
regr.task.lag = createLagDiffFeatures(regr.task,lag = 2L:4L, difference = 1L,
seasonal.lag = 1L:2L, frequency = 5)
regr.task.lag
## Supervised task: dat.train.reg
## Type: regr
## Target: arma_test
## Observations: 190
## Features:
## numerics factors ordered
## 5 0 0
## Missings: TRUE
## Has weights: FALSE
## Has blocking: FALSE
This can be used with any task type as long as the row names of the task data can be converted to POSIXct
format.
As an example, we will build and train a gradient boosting machine using lagged data.
regrGbm <- makeLearner("regr.gbm", par.vals = list(n.trees = 100))
gbmMod = train(regrGbm, regr.task.lag)
To forecast with a regression model whose task is manipulated by createLagDiffFeatures(),
use the function forecast().
gbm.fore = forecast(gbmMod, h = 10, newdata = dat.test)
## Warning in forecast.WrappedModel(gbmMod, h = 10, newdata = dat.test):
## Provided data for prediction is not a pure data.frame but from class xts,
## hence it will be converted.
performance(gbm.fore, measures = mase, task = regr.task.lag)
## mase
## 2.036969
For developing trading strategies, we normally have a discrete set of choices such as buying, selling, or holding onto a stock. Using forecasting in mlr we can now train classification models that forecast these choices. Here we try to predict whether a 'stock' will go up 5%, down 5%, or neither. Below we simulate data from an ARIMA process, take the percentage change, and discretize the data into 'Buy', 'Sell', or 'Hold'.
library(xts)
library(lubridate)
set.seed(1234)
# Generate an ARIMA based data set
dat = arima.sim(model = list(ar = c(.5,.2,.1), ma = c(.5,.3), order = c(3,0,2)), n = 500)
dat = dat/lag(dat,-1) - 1
jump = data.frame(jump = ifelse(diff(dat) > .5, "Buy",ifelse(diff(dat) < -.5, "Sell","Hold")))
times = (as.POSIXct("1992-01-14")) + lubridate::days(1:498)
rownames(jump) = times
jump.train = jump[1:488,,drop = FALSE]
jump.test = jump[489:498,,drop = FALSE]
# Make the classif task
classif.task = makeClassifTask(data = jump.train,target = "jump")
classif.task.lag = createLagDiffFeatures(classif.task, lag = 1L:15L,
na.pad = FALSE)
classif.learn = makeLearner("classif.boosting")
classif.train = train(classif.learn,classif.task.lag)
classif.fore = forecast(classif.train, h = 10, newdata = jump.test)
performance(classif.fore)
## mmce
## 0.8
So our model does pretty badly, but that is because it was not tuned. Given the infrastructure in mlr, parallel computing, and time, it would be possible to use this type of model to build forecasting classifiers.
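As a rough sketch of what such tuning might look like (assuming classif.boosting exposes the mfinal parameter, and accepting plain cross-validation on the lagged task purely for illustration):
ps.boost = makeParamSet(
  makeIntegerParam("mfinal", lower = 50L, upper = 200L)
)
res.boost = tuneParams(classif.learn, classif.task.lag,
                       resampling = makeResampleDesc("CV", iters = 3L),
                       par.set = ps.boost,
                       control = makeTuneControlRandom(maxit = 10L),
                       measures = mmce)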
A new preprocessing wrapper makePreprocWrapperLambert()
has been added. This function uses the LambertW
package’s Gaussianize()
function to help remove skewness and kurtosis from the data.
lrn = makePreprocWrapperLambert("classif.lda", type = "h")
print(lrn)
## Learner classif.lda.preproc from package MASS
## Type: classif
## Name: ; Short name:
## Class: PreprocWrapperLambert
## Properties: numerics,factors,prob,twoclass,multiclass
## Predict-Type: response
## Hyperparameters: type=h,methods=IGMM,verbose=FALSE
train(lrn, iris.task)
## Model for learner.id=classif.lda.preproc; learner.class=PreprocWrapperLambert
## Trained on: task.id = iris-example; obs = 150; features = 4
## Hyperparameters: type=h,methods=IGMM,verbose=FALSE
The largest benefit of this preprocessing scheme is that it, as the function name says, "Gaussianizes" the data in a similar manner to Box-Cox transforms. Many machine learning and forecasting models lean on the assumption that our data is normally distributed, and this preprocessing feature allows us to transform the data toward that assumption.
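To see what the wrapper does under the hood, Gaussianize() can also be applied directly to a numeric vector; a minimal sketch on the simulated training series from earlier:
library(LambertW)
# Apply the same heavy-tail ("h") transformation directly to the raw series
dat.gauss = Gaussianize(as.numeric(dat.train), type = "h")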
It’s known that ensembles of forecasts tend to outperform standard forecasting techniques. Here we use mlr’s stacked modeling functionality to ensemble multiple forecast techniques.
# Create some data
library(xts)
library(lubridate)
dat = arima.sim(model = list(ar = c(.5,.2), ma = c(.4), order = c(2,0,1)), n = 500)
times = (as.POSIXlt("1992-01-14")) + lubridate::days(1:500)
dat = xts(dat,order.by = times)
colnames(dat) = c("arma_test")
# separate into test and train
dat.train = dat[1:490,]
dat.test = dat[491:500,]
fcregr.task = makeForecastRegrTask(id = "test", data = dat.train, target = "arma_test",
frequency = 7L)
resamp.sub = makeResampleDesc("GrowingCV",
horizon = 10L,
initial.window = .90,
size = nrow(getTaskData(fcregr.task)),
skip = .01
)
resamp.super = makeResampleDesc("CV", iters = 5)
lrns = makeLearners(c("fcregr.tbats","fcregr.garch"))
stack.forecast = makeStackedLearner(base.learners = lrns,
predict.type = "response",
method = "average")
# Simple param set for tuning sub learners
ps = makeParamSet(
makeDiscreteParam("fcregr.tbats.h", values = 10),
makeDiscreteParam("fcregr.garch.n.ahead", values = 10)
)
## tuning
fore.tune = tuneParams(stack.forecast, fcregr.task, resampling = resamp.sub,
par.set = ps, control = makeTuneControlGrid(),
measures = mase, show.info = FALSE)
fore.tune
## Tune result:
## Op. pars: fcregr.tbats.h=10; fcregr.garch.n.ahead=10
## mase.test.mean=1.85
# get hyper params
stack.forecast.tune = setHyperPars2(stack.forecast,fore.tune$x)
stack.forecast.tune
## Learner stack from package forecast,rugarch
## Type: fcregr
## Name: ; Short name:
## Class: StackedLearner
## Properties: numerics,quantile
## Predict-Type: response
## Hyperparameters: fcregr.tbats.h=10,fcregr.garch.n.ahead=10
# Train the final best models and predict
stack.forecast.mod = train(stack.forecast.tune,fcregr.task)
stack.forecast.pred = predict(stack.forecast.mod, newdata = dat.test)
stack.forecast.pred
## Prediction: 10 observations
## predict.type: response
## threshold:
## time: 0.00
## truth response
## 1993-05-19 2.57561697 2.4802203
## 1993-05-20 1.64089860 1.8383873
## 1993-05-21 1.19015411 1.4608244
## 1993-05-22 2.31480312 1.3295939
## 1993-05-23 1.69129242 1.0864590
## 1993-05-24 -0.09426478 0.9885076
## ... (10 rows, 2 cols)
One difficulty in the integration of machine learning and forecasting is that most machine learning models rely on a large amount of exogenous data to derive good answers. As stated before, since we cannot forecast these exogenous variables it becomes difficult to fully use the power of machine learning models. One workaround is to use mlr's ensembling method: a multivariate forecaster (here mfcregr.BigVAR) serves as the base learner, targeting a specific variable, and a machine learning model is trained as the super learner on the forecasts of all the variables. The final model you use for prediction will first forecast your extra variables and then use those forecasts to predict the final values for the target variable.
EuStockMarkets.train = EuStockMarkets[1:1855,]
EuStockMarkets.test = EuStockMarkets[1856:1860,]
multfore.task = makeMultiForecastRegrTask(id = "bigvar", data = EuStockMarkets.train, target = "FTSE")
resamp.sub = makeResampleDesc("GrowingCV",
horizon = 5L,
initial.window = .97,
size = nrow(getTaskData(multfore.task)),
skip = .01
)
resamp.super = makeResampleDesc("CV", iters = 3)
base = c("mfcregr.BigVAR")
lrns = lapply(base, makeLearner)
lrns = lapply(lrns, setPredictType, "response")
lrns[[1]]$par.vals$verbose = FALSE
stack.forecast = makeStackedLearner(base.learners = lrns,
predict.type = "response",
super.learner = makeLearner("regr.earth", penalty = 2),
method = "growing.cv",
resampling = resamp.sub)
ps = makeParamSet(
makeDiscreteParam("mfcregr.BigVAR.p", values = 5),
makeDiscreteParam("mfcregr.BigVAR.struct", values = "Basic"),
makeNumericVectorParam("mfcregr.BigVAR.gran", len = 2L, lower = 25, upper = 26),
makeDiscreteParam("mfcregr.BigVAR.h", values = 5),
makeDiscreteParam("mfcregr.BigVAR.n.ahead", values = 5)
)
## tuning
multfore.tune = tuneParams(stack.forecast, multfore.task, resampling = resamp.sub,
par.set = ps, control = makeTuneControlGrid(),
measures = multivar.mase, show.info = FALSE)
multfore.tune
## Tune result:
## Op. pars: mfcregr.BigVAR.p=5; mfcregr.BigVAR.struct=Basic; mfcregr.BigVAR.gran=26,25; mfcregr.BigVAR.h=5; mfcregr.BigVAR.n.ahead=5
## multivar.mase.test.mean= 21
stack.forecast.f = setHyperPars2(stack.forecast,multfore.tune$x)
multfore.train = train(stack.forecast.f,multfore.task)
multfore.train
## Model for learner.id=stack; learner.class=StackedLearner
## Trained on: task.id = bigvar; obs = 1855; features = 3
## Hyperparameters: mfcregr.BigVAR.verbose=FALSE,mfcregr.BigVAR.p=5,mfcregr.BigVAR.struct=Basic,mfcregr.BigVAR.gran=26,25,mfcregr.BigVAR.h=5,mfcregr.BigVAR.n.ahead=5
multfore.pred = predict(multfore.train, newdata = EuStockMarkets.test)
multfore.pred
## Prediction: 5 observations
## predict.type: response
## threshold:
## time: 0.00
## truth response
## 1998-08-19 05:32:18 5587.6 5251.369
## 1998-08-20 15:13:50 5432.8 5531.898
## 1998-08-22 00:55:23 5462.2 5763.618
## 1998-08-23 10:36:55 5399.5 5929.220
## 1998-08-24 20:18:27 5455.0 6061.172
performance(multfore.pred, mase, task = multfore.task)
## mase
## 17.42952