Advanced Tuning

Iterated F-Racing for mixed spaces and dependencies

The package supports a large number of tuning algorithms, all of which can be looked up and selected via TuneControl. One of the more interesting ones is iterated F-racing from the irace package (see the irace documentation for a technical description). It works not only for arbitrary parameter types (numeric, integer, discrete, logical) but also for so-called dependent or hierarchical parameters:

ps = makeParamSet(
  makeNumericParam("C", lower = -12, upper = 12, trafo = function(x) 2^x),
  makeDiscreteParam("kernel", values = c("vanilladot", "polydot", "rbfdot")),
  makeNumericParam("sigma", lower = -12, upper = 12, trafo = function(x) 2^x,
    requires = quote(kernel == "rbfdot")),
  makeIntegerParam("degree", lower = 2L, upper = 5L,
    requires = quote(kernel == "polydot"))
)
ctrl = makeTuneControlIrace(maxExperiments = 200L)
rdesc = makeResampleDesc("Holdout")
res = tuneParams("classif.ksvm", iris.task, rdesc, par.set = ps, control = ctrl, show.info = FALSE)
print(head(as.data.frame(res$opt.path)))
#>            C     kernel     sigma degree mmce.test.mean dob eol
#> 1 -2.8894525    polydot        NA      4           0.06   1  NA
#> 2 -2.6793542 vanilladot        NA     NA           0.04   1  NA
#> 3 -0.7855061     rbfdot 11.049783     NA           0.68   1  NA
#> 4  1.7678978    polydot        NA      5           0.14   1  NA
#> 5  9.7729840 vanilladot        NA     NA           0.04   1  NA
#> 6 -2.7930352     rbfdot -7.198476     NA           0.36   1  NA
#>   error.message exec.time
#> 1          <NA>     0.033
#> 2          <NA>     0.027
#> 3          <NA>     0.032
#> 4          <NA>     0.030
#> 5          <NA>     0.031
#> 6          <NA>     0.034

Note how the kernel-specific parameters sigma and degree were made dependent on the kernel parameter through the requires argument. This approach allows you to tune the parameters of multiple kernels at once while the search concentrates on the ones that work best for your data set.
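To make the role of requires more concrete, here is a minimal base R sketch of how such quoted conditions can be evaluated against a chosen kernel. The helper active_params is hypothetical and only illustrates the idea; it is not a claim about how the package evaluates dependencies internally.

```r
# The two conditions, written exactly as in the requires arguments above.
conditions = list(
  sigma  = quote(kernel == "rbfdot"),
  degree = quote(kernel == "polydot")
)

# Hypothetical helper: names of the dependent parameters that are active
# for a given kernel choice.
active_params = function(kernel) {
  env = list(kernel = kernel)
  is_active = vapply(conditions, function(cond) isTRUE(eval(cond, env)), logical(1))
  names(conditions)[is_active]
}

print(active_params("rbfdot"))
print(active_params("polydot"))
print(active_params("vanilladot"))
```

A parameter whose condition is not met is simply not part of the configuration, which is why the corresponding cells in the optimization path above are NA.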

Tuning across whole model spaces with ModelMultiplexer

We can now take the previous example one step further. With the ModelMultiplexer we can tune over different model classes at once, just as we did with the SVM kernels above.

base.learners = list(
  makeLearner("classif.ksvm"),
  makeLearner("classif.randomForest")
)
lrn = makeModelMultiplexer(base.learners)

The function makeModelMultiplexerParamSet offers a simple way to construct a parameter set for tuning: the parameter names are prefixed automatically with the learner name, and the requires element is set so that every parameter is subordinate to selected.learner.

ps = makeModelMultiplexerParamSet(lrn,
  makeNumericParam("sigma", lower = -12, upper = 12, trafo = function(x) 2^x),
  makeIntegerParam("ntree", lower = 1L, upper = 500L)
)
print(ps)
#>                                Type len Def
#> selected.learner           discrete   -   -
#> classif.ksvm.sigma          numeric   -   -
#> classif.randomForest.ntree  integer   -   -
#>                                                       Constr Req Tunable
#> selected.learner           classif.ksvm,classif.randomForest   -    TRUE
#> classif.ksvm.sigma                                 -12 to 12   Y    TRUE
#> classif.randomForest.ntree                          1 to 500   Y    TRUE
#>                            Trafo
#> selected.learner               -
#> classif.ksvm.sigma             Y
#> classif.randomForest.ntree     -
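For comparison, a parameter set with the same names, bounds, and dependencies can be written by hand with makeParamSet. This is a sketch that assumes the prefixing and requires conventions visible in the printed output above; the name ps.manual is introduced here for illustration only.

```r
library(mlr)

# Hand-written counterpart of the parameter set printed above (illustrative).
ps.manual = makeParamSet(
  makeDiscreteParam("selected.learner",
    values = c("classif.ksvm", "classif.randomForest")),
  makeNumericParam("classif.ksvm.sigma", lower = -12, upper = 12,
    trafo = function(x) 2^x,
    requires = quote(selected.learner == "classif.ksvm")),
  makeIntegerParam("classif.randomForest.ntree", lower = 1L, upper = 500L,
    requires = quote(selected.learner == "classif.randomForest"))
)
print(ps.manual)
```

Writing the set by hand quickly becomes tedious as more base learners are added, which is the main reason to prefer the helper function.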

rdesc = makeResampleDesc("CV", iters = 2L)
ctrl = makeTuneControlIrace(maxExperiments = 200L)
res = tuneParams(lrn, iris.task, rdesc, par.set = ps, control = ctrl, show.info = FALSE)
print(head(as.data.frame(res$opt.path)))
#>       selected.learner classif.ksvm.sigma classif.randomForest.ntree
#> 1 classif.randomForest                 NA                        273
#> 2         classif.ksvm           10.53605                         NA
#> 3         classif.ksvm          -11.79057                         NA
#> 4         classif.ksvm           10.42478                         NA
#> 5 classif.randomForest                 NA                        394
#> 6         classif.ksvm           11.02356                         NA
#>   mmce.test.mean dob eol error.message exec.time
#> 1     0.04666667   1  NA          <NA>     0.059
#> 2     0.68000000   1  NA          <NA>     0.046
#> 3     0.52666667   1  NA          <NA>     0.045
#> 4     0.68000000   1  NA          <NA>     0.047
#> 5     0.04666667   1  NA          <NA>     0.063
#> 6     0.68000000   1  NA          <NA>     0.050
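Once tuning has finished, the selected configuration can be read from the result and used to train a final model. The following self-contained sketch repeats the setup above but swaps irace for a very small random search (makeTuneControlRandom with maxit = 5L), purely to keep the run short; the seed and the name tuned.lrn are choices made for this illustration, and the kernlab and randomForest packages are assumed to be installed.

```r
library(mlr)

set.seed(1)
lrn = makeModelMultiplexer(list(
  makeLearner("classif.ksvm"),
  makeLearner("classif.randomForest")
))
ps = makeModelMultiplexerParamSet(lrn,
  makeNumericParam("sigma", lower = -12, upper = 12, trafo = function(x) 2^x),
  makeIntegerParam("ntree", lower = 1L, upper = 500L)
)

# Random search with a tiny budget, used here only to keep the sketch fast.
ctrl = makeTuneControlRandom(maxit = 5L)
rdesc = makeResampleDesc("CV", iters = 2L)
res = tuneParams(lrn, iris.task, rdesc, par.set = ps, control = ctrl,
  show.info = FALSE)

# Best setting found and its resampled performance.
print(res$x)
print(res$y)

# Apply the best setting to the multiplexer and fit it on the full task.
tuned.lrn = setHyperPars(lrn, par.vals = res$x)
mod = train(tuned.lrn, iris.task)
```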

Multi-criteria evaluation and optimization

During tuning you might want to optimize multiple, potentially conflicting, performance measures simultaneously.

In the following example we aim to minimize both the false positive and the false negative rate (fpr and fnr). We again tune the hyperparameters of an SVM (function ksvm) with a radial basis kernel and use the sonar classification task for illustration. As the search strategy we choose random search.

For all available multi-criteria tuning algorithms see TuneMultiCritControl.

ps = makeParamSet(
  makeNumericParam("C", lower = -12, upper = 12, trafo = function(x) 2^x),
  makeNumericParam("sigma", lower = -12, upper = 12, trafo = function(x) 2^x)
)
ctrl = makeTuneMultiCritControlRandom(maxit = 30L)
rdesc = makeResampleDesc("Holdout")
res = tuneParamsMultiCrit("classif.ksvm", task = sonar.task, resampling = rdesc, par.set = ps,
  measures = list(fpr, fnr), control = ctrl, show.info = FALSE)
res
#> Tune multicrit result:
#> Points on front: 2

head(as.data.frame(trafoOptPath(res$opt.path)))
#>              C        sigma fpr.test.mean fnr.test.mean dob eol
#> 1 2.837139e-02  0.004605846          1.00    0.00000000   1  NA
#> 2 8.161350e+00 10.073402485          1.00    0.00000000   2  NA
#> 3 2.947371e+03  0.023696559          0.15    0.03333333   3  NA
#> 4 5.020557e-01  0.279973960          1.00    0.00000000   4  NA
#> 5 8.642356e+01 47.600399172          1.00    0.00000000   5  NA
#> 6 3.661447e-04  0.715765529          1.00    0.00000000   6  NA
#>   error.message exec.time
#> 1          <NA>     0.054
#> 2          <NA>     0.061
#> 3          <NA>     0.050
#> 4          <NA>     0.064
#> 5          <NA>     0.054
#> 6          <NA>     0.055

The results can be visualized with the function plotTuneMultiCritResult. The plot shows the false positive and false negative rates for all parameter settings evaluated during tuning. Points on the Pareto front are drawn slightly larger.

plotTuneMultiCritResult(res)
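Which points belong to the Pareto front can be illustrated with a small base R sketch. The (fpr, fnr) values below are hypothetical and not taken from the run above, and the helper is_dominated is written for this illustration only; it is not the routine the package uses. A point is dominated if some other point is at least as good in both measures and strictly better in at least one.

```r
# Hypothetical performance values; both measures are to be minimized.
perf = data.frame(
  fpr = c(1.00, 0.15, 0.40, 0.20, 0.60),
  fnr = c(0.00, 0.03, 0.10, 0.03, 0.02)
)

# TRUE if some other row is no worse in both measures and strictly better in one.
is_dominated = function(i, y) {
  others = y[-i, , drop = FALSE]
  no_worse = others$fpr <= y$fpr[i] & others$fnr <= y$fnr[i]
  strictly_better = others$fpr < y$fpr[i] | others$fnr < y$fnr[i]
  any(no_worse & strictly_better)
}

dominated = vapply(seq_len(nrow(perf)), is_dominated, logical(1), y = perf)

# The non-dominated rows form the Pareto front.
pareto.front = perf[!dominated, ]
print(pareto.front)
```

In this toy data set rows 1, 2, and 5 remain: each of them trades a lower value in one measure for a higher value in the other, so none can be discarded without a preference between fpr and fnr.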
