This document describes the utsf package, which offers a meta engine for applying different regression models to univariate time series forecasting using an autoregressive approach. One of its main features is its extensibility, which allows you to use machine learning models not directly supported by the package, such as neural networks, or your own models.
A univariate time series forecasting method is one in which the future values of a series are predicted using only information from the series itself, for example, using its historical mean as the forecast. An advantage of this type of prediction is that, apart from the series being forecast, there is no need to collect any further information in order to train the forecasting model.
An autoregressive model is a kind of univariate time series forecasting model in which a value of a time series is expressed as a function of some of its past values. That is, an autoregressive model is a regression model in which the independent variables are lagged values (previous values) of the response variable. For example, given a time series with the following historical values: \(t = \{1, 3, 6, 7, 9, 11, 16\}\), suppose that we want to develop an autoregressive model in which a target “is explained” by its first, second and fourth past values (in this context, a previous value is also called a lag, so lag 1 is the value immediately preceding a given value in the series). Given the series \(t\) and lags (1, 2 and 4), the training set would be:
| Lag 4 | Lag 2 | Lag 1 | Target |
|---|---|---|---|
| 1 | 6 | 7 | 9 |
| 3 | 7 | 9 | 11 |
| 6 | 9 | 11 | 16 |
Given a model trained with the previous dataset, the next future value of the series is predicted as \(f(Lag4, Lag2, Lag1)\), where \(f\) is the regression function and \(Lag4\), \(Lag2\) and \(Lag1\) are the fourth, second and first lagged values of the next future value. So, the next future value of series \(t\) is predicted as \(f(7, 11, 16)\), producing a value that will be called \(F1\).
Suppose that the forecast horizon (the number of future values to be predicted) is greater than 1. If the regression function only predicts the next future value of the series, a recursive approach can be applied to forecast all the values in the forecast horizon: the regression function is applied repeatedly until all the steps-ahead values are forecast. For instance, following the previous example, suppose that the forecast horizon is 3. As we have explained, to forecast the next future value of the series (one step ahead) the regression function is fed with the vector \([7, 11, 16]\), producing \(F1\). To forecast the two-steps-ahead value the regression function is fed with the vector \([9, 16, F1]\); the forecast for the one-step-ahead value, \(F1\), is used as the first lag because the actual value is unknown. Finally, to predict the three-steps-ahead value the regression function is fed with the vector \([11, F1, F2]\). This example of recursive forecasting is summarized in the following table:
| Steps ahead | Autoregressive values | Forecast |
|---|---|---|
| 1 | 7, 11, 16 | F1 |
| 2 | 9, 16, F1 | F2 |
| 3 | 11, F1, F2 | F3 |
The recursive approach for forecasting several values into the future is applied in classical statistical models such as ARIMA or exponential smoothing.
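As an illustration of the recursive strategy (this sketch is not part of the utsf package), the procedure can be written in a few lines of R using a placeholder regression function:

# Recursive forecasting with an arbitrary one-step-ahead regression function `reg`
recursive_forecast <- function(series, lags, reg, h) {
  values <- as.numeric(series)
  preds <- numeric(h)
  for (step in seq_len(h)) {
    # lagged values of the value being forecast, from the oldest lag to lag 1
    features <- values[length(values) + 1 - rev(lags)]
    preds[step] <- reg(features)
    values <- c(values, preds[step]) # the forecast becomes a "known" value for later steps
  }
  preds
}
# For illustration only: use the mean of the lagged values as the regression function
recursive_forecast(c(1, 3, 6, 7, 9, 11, 16), lags = c(1, 2, 4), reg = mean, h = 3)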
The utsf package makes it easy to use classical regression models for univariate time series forecasting, employing the autoregressive approach and the recursive prediction strategy explained in the previous section. All the supported models are applied through a uniform interface:
- the create_model() function to build the forecasting model, and
- the forecast() function to use the model to predict future values.

Let us see an example in which a regression tree model is used to forecast the next future values of a time series:
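The code chunk for this example is not reproduced above; a minimal sketch consistent with the description (the lags value is mentioned later in the text) would be:

library(utsf)
m <- create_model(AirPassengers, lags = 1:12, method = "rt")
f <- forecast(m, h = 12)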
In this example, an autoregressive tree model (method = "rt", Regression Tree) is trained using the historical values of the AirPassengers time series and a forecast of its next 12 future values (h = 12) is computed.
The create_model() function returns an S3 object of class
utsf with information about the trained model. The
information about the forecast is included in the object returned by the
forecast() function, as a component named pred
of class ts (a time series):
f$pred
#> Jan Feb Mar Apr May Jun Jul Aug
#> 1961 460.2980 428.4915 467.0304 496.9833 499.9819 554.7891 627.5849 628.0503
#> Sep Oct Nov Dec
#> 1961 533.2803 482.4221 448.7926 453.6920
library(ggplot2)
autoplot(f)

The training set used to fit the model is built from the historical
values of the time series using the autoregressive approach explained in
the previous section. The lags parameter of the
create_model() function is used to specify the
autoregressive lags. In the example: lags = 1:12, so a
target is a function of its 12 previous values. Next, we consult the
first targets (and their associated features) with which the regression
model has been trained:
head(m$targets) # first targets
#> [1] -11.6666667 -0.9166667 13.4166667 6.6666667 -3.8333333 19.8333333
head(m$features) # and their associated features
#> Lag12 Lag11 Lag10 Lag9 Lag8 Lag7 Lag6
#> 1 -14.6666667 -8.666667 5.333333 2.333333 -5.666667 8.333333 21.333333
#> 2 -8.9166667 5.083333 2.083333 -5.916667 8.083333 21.083333 21.083333
#> 3 4.4166667 1.416667 -6.583333 7.416667 20.416667 20.416667 8.416667
#> 4 0.6666667 -7.333333 6.666667 19.666667 19.666667 7.666667 -9.333333
#> 5 -7.8333333 6.166667 19.166667 19.166667 7.166667 -9.833333 -24.833333
#> 6 5.8333333 18.833333 18.833333 6.833333 -10.166667 -25.166667 -11.166667
#> Lag5 Lag4 Lag3 Lag2 Lag1
#> 1 21.333333 9.333333 -7.666667 -22.666667 -8.666667
#> 2 9.083333 -7.916667 -22.916667 -8.916667 -11.916667
#> 3 -8.583333 -23.583333 -9.583333 -12.583333 -1.583333
#> 4 -24.333333 -10.333333 -13.333333 -2.333333 12.666667
#> 5 -10.833333 -13.833333 -2.833333 12.166667 6.166667
#> 6 -14.166667 -3.166667 11.833333 5.833333 -4.166667

The curious reader might have noticed that the features and targets are not on the same scale as the original time series. This is because, by default, a transformation is applied to the examples of the training set. This transformation will be explained later.
Let us see the training set associated with the example of the previous section:
t <- ts(c(1, 3, 6, 7, 9, 11, 16))
out <- create_model(t, lags = c(1, 2, 4), trend = "none")
cbind(out$features, Target = out$targets)
#> Lag4 Lag2 Lag1 Target
#> 1 1 6 7 9
#> 2 3 7 9 11
#> 3 6 9 11 16

Here, no transformation has been applied (trend = "none").
Prediction intervals for a forecast can be computed:
m <- create_model(USAccDeaths, method = "mt")
f <- forecast(m, h = 12, PI = TRUE, level = 90)
f
#> Point Forecast Lo 90 Hi 90
#> Jan 1979 8162.724 7612.520 8703.023
#> Feb 1979 7185.788 6597.051 7744.887
#> Mar 1979 7475.744 6927.619 8020.395
#> Apr 1979 7836.936 7217.843 8420.431
#> May 1979 8570.632 8028.427 9133.970
#> Jun 1979 9018.691 8408.565 9583.103
#> Jul 1979 9865.224 9256.995 10451.958
#> Aug 1979 9695.409 9087.325 10267.034
#> Sep 1979 9160.569 8573.240 9767.617
#> Oct 1979 8962.662 8382.609 9518.964
#> Nov 1979 8606.452 7997.011 9175.178
#> Dec 1979 8899.133 8350.682 9465.557
library(ggplot2)
autoplot(f)

Prediction intervals are calculated following the guidelines in (https://otexts.com/fpp3/nnetar.html#prediction-intervals-5).
Random errors are assumed to follow a normal distribution. In the
example, a 90% prediction interval (level = 90) has been
computed.
The create_model() and forecast() functions provide a common interface for applying an autoregressive approach to time series forecasting using different regression models. These models are implemented in several R packages. Currently, the package is mainly focused on regression tree models and supports the following approaches:
- FNN::knn.reg() is used, as the regression function, to recursively predict the future values of the time series.
- stats::lm() and its associated method stats::predict.lm() are applied recursively for the forecasts, i.e., as the regression function.
- rpart::rpart() and its associated method rpart::predict.rpart() are used for the forecasts.
- Cubist::cubist() and its associated method Cubist::predict.cubist() are used for predictions.
- ipred::bagging() and its associated method ipred::predict.regbagg() are used for forecasting.
- ranger::ranger() and its associated method ranger::predict.ranger() are used for predictions.

The S3 object of class utsf returned by the create_model() function contains a component with the trained autoregressive model:
m <- create_model(fdeaths, lags = 1:12, method = "rt")
m$model
#> n= 60
#>
#> node), split, n, deviance, yval
#> * denotes terminal node
#>
#> 1) root 60 1967124.00 -6.465278
#> 2) Lag12< 73 38 212851.90 -124.414500
#> 4) Lag6>=-66.45833 30 57355.07 -153.847200
#> 8) Lag12< -170.7083 10 13293.09 -195.991700 *
#> 9) Lag12>=-170.7083 20 17419.67 -132.775000 *
#> 5) Lag6< -66.45833 8 32051.01 -14.041670 *
#> 3) Lag12>=73 22 312482.00 197.265200
#> 6) Lag5>=-131.7917 7 24738.12 114.500000 *
#> 7) Lag5< -131.7917 15 217416.40 235.888900 *

In this case, the model is the result of training a regression tree
using the function rpart::rpart() with the training set
consisting of the features m$features and targets
m$targets. Once the model is trained, the
rpart::predict.rpart() function can be used recursively to
forecast the future values of the time series using the
forecast() function.
An interesting feature of the utsf package is its extensibility. Apart from the models directly supported by the package, you can use its facilities to do autoregressive time series forecasting with your own regression models. Thus, your models can benefit from the features implemented in the package, such as building the training set, recursive forecasting, preprocessing, forecast accuracy estimation and parameter tuning.
To apply your own regression model you have to use the
method parameter of the create_model()
function, providing as argument a function that is able to train your
model. This function should return an object with the trained regression
model. The function must have three input parameters:
- X: a data frame with the features of the training examples. This data frame is built from the time series taking into account the autoregressive lags, as explained in a previous section. It is the same object as the features component of the object returned by the create_model() function.
- y: a vector with the targets of the training examples. It is built as explained in a previous section. It is the same object as the targets component of the object returned by the create_model() function.
- param: a list with additional arguments for adjusting the behavior of the model. This parameter is not used in this section and will be explained in the next one.

Furthermore, if the function that trains the model (the function
provided in the method parameter) returns a model of class
model_class, a method with the signature
predict.model_class(object, new_value) should be
implemented. This method uses your model to predict a new value, that
is, it is the regression function associated with the model. Let us
explain the parameters of this regression function:
- object: the object of class model_class returned by your function that creates the model.
- new_value: a data frame with the same structure as the X parameter of the function for building the model. The new_value data frame has only one row, containing the features of the example to be predicted.

Let us see several examples of how to use this functionality to apply your own models for autoregressive time series forecasting, taking advantage of the capabilities of the utsf package.
The case of K-nearest neighbors is a bit special because it is a lazy learner and no model is built. You only need to store the training set and all the work is done by the regression function. Let us see a first example in which we rely on the implementation of k-NN in the FNN package:
# Function to train the regression model
# In this case (k-NN) just stores the training set
my_knn_model <- function(X, y, param) {
structure(list(X = X, y = y), class = "my_knn")
}
# Function to predict a new example
predict.my_knn <- function(object, new_value) {
FNN::knn.reg(train = object$X, test = new_value, y = object$y)$pred
}
m <- create_model(AirPassengers, lags = 1:12, method = my_knn_model)
f <- forecast(m, h = 12)
f$pred
#> Jan Feb Mar Apr May Jun Jul Aug
#> 1961 455.9167 434.3264 480.7703 490.1678 506.1262 568.0534 640.6689 640.8636
#> Sep Oct Nov Dec
#> 1961 549.5467 495.4255 441.6554 476.7934
autoplot(f)

The function that trains the model (my_knn_model())
creates a “model” that only stores the training set. The regression
function (predict.my_knn()) takes advantage of the
FNN::knn.reg() function to look for the k-nearest neighbors
and compute their mean response value. The default number of neighbors
(3) of FNN::knn.reg() is used. Later, we will modify this
example so that the user can select the \(k\) value.
The k-nearest neighbors algorithm is so simple that it can be easily implemented without using functionality from any R package:
# Function to train the regression model
my_knn_model2 <- function(X, y, param) {
structure(list(X = X, y = y), class = "my_knn2")
}
# Function to predict a new example
predict.my_knn2 <- function(object, new_value) {
k <- 3 # number of nearest neighbors
distances <- sapply(1:nrow(object$X), function(i) sum((object$X[i, ] - new_value)^2))
k_nearest <- order(distances)[1:k]
mean(object$y[k_nearest])
}
m2 <- create_model(AirPassengers, lags = 1:12, method = my_knn_model2)
forecast(m2, h = 12)$pred
#> Jan Feb Mar Apr May Jun Jul Aug
#> 1961 455.9167 434.3264 480.7703 490.1678 506.1262 568.0534 640.6689 640.8636
#> Sep Oct Nov Dec
#> 1961 549.5467 495.4255 441.6554 476.7934

The random forest algorithm is directly supported in the utsf package, using the implementation of random forests in the ranger package. Here, we are going to create an autoregressive model based on the random forest model implemented in the randomForest package:
library(randomForest)
#> randomForest 4.7-1.2
#> Type rfNews() to see new features/changes/bug fixes.
#>
#> Attaching package: 'randomForest'
#> The following object is masked from 'package:ggplot2':
#>
#> margin
my_model <- function(X, y, param) { randomForest(x = X, y = y) }
m <- create_model(USAccDeaths, method = my_model)
f <- forecast(m, h = 12)
library(ggplot2)
autoplot(f)
print(m$model)
#>
#> Call:
#> randomForest(x = X, y = y)
#> Type of random forest: regression
#> Number of trees: 500
#> No. of variables tried at each split: 4
#>
#> Mean of squared residuals: 191987.2
#> % Var explained: 77.2

The function that creates the model (my_model()) just
uses the randomForest::randomForest() function. This
function returns an S3 object of class randomForest with
the trained model. In this case, we have not implemented the regression
function, because the predict.randomForest() method does
exactly what we want.
As another example, we are going to use a neural network model implemented in the nnet package:
library(nnet)
my_model <- function(X, y, param) {
nnet(x = X, y = y, size = 5, linout = TRUE, trace = FALSE)
}
m <- create_model(USAccDeaths, method = my_model)
f <- forecast(m, h = 12)
library(ggplot2)
autoplot(f)

In this case, the nnet() function returns an S3 object of class nnet. Again, the prediction method associated with this class, predict.nnet(), is what we need as the regression function. Because we are using neural networks, some preprocessing would probably be needed to obtain more accurate predictions.
In this case we are going to apply the extreme gradient boosting
model implemented in the xgboost package. We rely on the
xgboost::xgboost() function to build the model. This
function returns an object of class xgb.Booster with the
trained model. However, we cannot use xgboost's predict.xgb.Booster() directly as the regression function, because the new data passed to this function cannot be a data frame. Let us see how we can get around this problem:
library(xgboost)
my_model <- function(X, y, param) {
m <- xgboost(data = as.matrix(X),
label = y,
nrounds = 100,
verbose = 0
)
structure(m, class = "my_model")
}
predict.my_model <- function(object, new_value) {
class(object) <- "xgb.Booster"
predict(object, as.matrix(new_value))
}
m <- create_model(USAccDeaths, method = my_model)
f <- forecast(m, h = 12)
library(ggplot2)
autoplot(f)

The trick is that we change the class of the object (model) returned by xgboost() to my_model, so that we can provide our own regression function (predict.my_model()). In this function we change the class of the model back so that the original regression function, predict.xgb.Booster(), is used, and we convert new_value from a data frame to a matrix, which is the format expected by predict.xgb.Booster().
Normally, a regression model can be adjusted using different
parameters. By default, the models supported by our package are set
using some specific parameters, usually the default values of the
functions used to train the models (these functions are listed in a
previous section). However, the user can select these parameters with
the param argument of the create_model()
function. The param argument must be a named list with the
names and values of the parameters to be set. Let us see an example:
# A bagging model set with default parameters
m <- create_model(AirPassengers, lags = 1:12, method = "bagging")
length(m$model$mtrees) # number of regression trees (25 by default)
#> [1] 25
# A bagging model set with 3 regression trees
m2 <- create_model(AirPassengers,
lags = 1:12,
method = "bagging",
param = list(nbagg = 3)
)
length(m2$model$mtrees) # number of regression trees
#> [1] 3

In the previous example, two bagging models (using regression trees)
are trained with the create_model() function. In the first
model the number of trees is 25, the default value of the function
ipred::ipredbagg() used to train the model. In the second
model the number of trees is set to 3. Of course, in order to set some
specific parameters the user must consult the parameters of the function
used internally by the utsf package to train the model:
in the example, ipred::ipredbagg(). In this case, the
nbagg parameter of the ipred::ipredbagg()
function is set to 3.
When the regression model is provided by the user, it is also possible to adjust its parameters. As explained previously, if you want to use your own regression model you have to provide a function that creates your model. This function has three parameters:
- X: to receive the training features.
- y: to receive the targets.
- param: to receive the arguments for adjusting the model.

The param parameter receives a named list with the names and values of the arguments of the model to be adjusted. This list is the same list that the user passes to the param parameter of the create_model() function.
Let us see some examples of how to pass arguments to some of the regression models implemented in the previous section.
In this case the model can be adjusted by specifying the \(k\) value. The function provided to train the model can be tuned with an argument named k indicating the number of nearest neighbors:
# Function to train the model
my_knn_model <- function(X, y, param) {
k <- if ("k" %in% names(param)) param$k else 3
structure(list(X = X, y = y, k = k), class = "my_knn")
}
# Regression function for object of class my_knn
predict.my_knn <- function(object, new_value) {
FNN::knn.reg(train = object$X, test = new_value,
y = object$y, k = object$k)$pred
}
# The model is trained with default parameters (k = 3)
m <- create_model(AirPassengers, lags = 1:12, method = my_knn_model)
print(m$model$k)
#> [1] 3
# The model is trained with k = 5
m2 <- create_model(AirPassengers, method = my_knn_model, param = list(k = 5))
print(m2$model$k)
#> [1] 5

The list provided as argument for the param parameter in
the create_model() function is passed as argument to the
param parameter of the function that trains the model
(my_knn_model()). In this case, in the second call to
create_model() the argument of the param
parameter, list(k = 5), is passed as argument for the
param parameter of the my_knn_model()
function.
As another example, suppose that we create a random forest model
using the functionality in the randomForest package. By
default, we will use the default parameters of the
randomForest::randomForest() function to build the model,
save for the number of trees (it will be set to 200). However, the user
can train the model with other values:
library(randomForest)
my_model <- function(X, y, param) {
args <- list(x = X, y = y, ntree = 200) # default parameters
args <- args[!(names(args) %in% names(param))]
args <- c(args, param)
do.call(randomForest::randomForest, args = args)
}
# The random forest is built with our default parameters
m <- create_model(USAccDeaths, method = my_model)
print(m$model$ntree)
#> [1] 200
print(m$model$mtry)
#> [1] 4
# The random forest is built with ntree = 400 and mtry = 6
m <- create_model(USAccDeaths, method = my_model,
param = list(ntree = 400, mtry = 6))
print(m$model$ntree)
#> [1] 400
print(m$model$mtry)
#> [1] 6

This section explains how to estimate the forecast accuracy of a regression model predicting a time series with the utsf package. Let us see an example:
m <- create_model(UKgas, lags = 1:4, method = "knn")
r <- efa(m, h = 4, type = "normal", size = 8)
r$per_horizon
#> Horizon 1 Horizon 2 Horizon 3 Horizon 4
#> MAE 40.956667 37.205000 42.730417 51.424167
#> MAPE 4.118140 5.030128 7.468098 7.434864
#> sMAPE 4.261377 5.208551 7.814417 7.732037
#> RMSE 58.920772 47.660996 49.733955 55.023038
r$global
#> MAE MAPE sMAPE RMSE
#> 43.079063 6.012808 6.254095 52.834690

To assess the forecast accuracy of a model you should use the
efa() function on the model. In this case, the forecasting
method is a K-nearest neighbors algorithm (method = "knn"),
with autoregressive lags 1 to 4, applied on the UKgas time
series. An estimation of its forecast accuracy on the series for a
forecasting horizon of 4 (h = 4) is obtained according to
different forecast accuracy measures. The efa() function
returns a list with two components. The component named
per_horizon contains the forecast accuracy estimations per
forecasting horizon. The component named global is computed
as the means of the rows of the per_horizon component.
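Since the global measures are defined as the row means of the per-horizon measures, they can be reproduced directly from the r object above:

rowMeans(r$per_horizon) # equal to r$global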
Currently, four forecast accuracy measures are computed: MAE, MAPE, sMAPE and RMSE.
Next, we describe how the forecasting accuracy measures are computed for a forecasting horizon \(h\) (\(y_t\) and \(\hat{y}_t\) are the actual future value and its forecast for horizon \(t\) respectively):
\[ MAE = \frac{1}{h}\sum_{t=1}^{h} |y_t-\hat{y}_t| \]
\[ MAPE = \frac{1}{h}\sum_{t=1}^{h} 100\frac{|y_t-\hat{y}_t|}{y_t} \] \[ sMAPE = \frac{1}{h}\sum_{t=1}^{h} 200\frac{\left|y_{t}-\hat{y}_{t}\right|}{|y_t|+|\hat{y}_t|} \]
\[ RMSE = \sqrt{\frac{1}{h}\sum_{t=1}^{h} (y_t-\hat{y}_t)^2} \]
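For reference, these measures can be written directly in R as simple functions (a sketch, not the code used internally by the package), where y is the vector of actual values and y_hat the vector of forecasts:

mae   <- function(y, y_hat) mean(abs(y - y_hat))
mape  <- function(y, y_hat) mean(100 * abs(y - y_hat) / y)
smape <- function(y, y_hat) mean(200 * abs(y - y_hat) / (abs(y) + abs(y_hat)))
rmse  <- function(y, y_hat) sqrt(mean((y - y_hat)^2))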
The estimation of forecast accuracy is done with the well-known rolling origin evaluation. This technique builds several training and test sets from the series, as shown below for a forecasting horizon of 4:
In this case, the last nine observations of the series are used to build the test sets. Six models are trained using the different training sets and their forecasts are compared to their corresponding test sets. Thus, the estimation of the forecast error for every forecasting horizon is based on six errors (it is computed as the mean).
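As an illustration (this is not the package's internal code), the indices of those test sets can be derived as follows for the UKgas series, a horizon of 4 and the last 9 observations:

n <- length(UKgas)
h <- 4
size <- 9
# each test set is a window of h consecutive observations taken from the last `size` observations
starts <- (n - size + 1):(n - h + 1)
test_indices <- t(sapply(starts, function(s) s:(s + h - 1)))
nrow(test_indices) # 6 test sets, so each per-horizon error estimate is the mean of 6 errors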
You can specify how many of the last observations of the series are
used to build the test sets with the size and
prop parameters. For example:
m <- create_model(UKgas, lags = 1:4, method = "knn")
# Use the last 9 observations of the series to build test sets
r1 <- efa(m, h = 4, type = "normal", size = 9)
# Use the last 20% observations of the series to build test sets
r2 <- efa(m, h = 4, type = "normal", prop = 0.2)If the series is short, the training sets might be too small to fit a
meaningful model. In that case, a special rolling origin evaluation can
be done, setting parameter type to
"minimum":
m <- create_model(UKgas, lags = 1:4, method = "knn")
r <- efa(m, h = 4, type = "minimum")
r$per_horizon
#> Horizon 1 Horizon 2 Horizon 3 Horizon 4
#> MAE 47.893750 44.934722 44.918229 26.961198
#> MAPE 5.745458 6.757300 8.250563 3.444200
#> sMAPE 5.761843 6.791762 8.329877 3.385892
#> RMSE 56.558584 52.858868 46.912818 26.961198
r$global
#> MAE MAPE sMAPE RMSE
#> 41.176975 6.049380 6.067343 45.822867

When this option is chosen, the last h observations (in the example 4) are used to build the test sets, as shown below for h = 4:
In this case, the estimated forecast error for horizon 1 is based on 4 errors, for horizon 2 is based on 3 errors, for horizon 3 on 2 errors and for horizon 4 on 1 error.
The efa() function also returns the test sets (and the
predictions associated with them) used when assessing the forecast
accuracy:
timeS <- ts(1:25)
m <- create_model(timeS, lags = 1:3, method = "mt")
r <- efa(m, h = 5, size = 7)
r$test_sets
#> h=1 h=2 h=3 h=4 h=5
#> [1,] 19 20 21 22 23
#> [2,] 20 21 22 23 24
#> [3,] 21 22 23 24 25
r$predictions
#> h=1 h=2 h=3 h=4 h=5
#> [1,] 19 20 21 22 23
#> [2,] 20 21 22 23 24
#> [3,] 21 22 23 24 25

Each row in these tables represents a test set or a prediction. Let us see the test sets when the "minimum" evaluation is done:
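A call like the following shows those test sets (output omitted here); with type = "minimum" the test sets are built from the last h observations, so later horizons are covered by fewer actual values:

r_min <- efa(m, h = 5, type = "minimum")
r_min$test_sets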
Another useful feature of the utsf package is
parameter tuning. The tune_grid() function allows you to
estimate the forecast accuracy of a model using different combinations
of parameters. Furthermore, the best combination of parameters is used
to train a model with all the historical values of the series and the
model is applied for forecasting the future values of the series. Let us
see an example:
m <- create_model(UKgas, lags = 1:4, method = "knn")
r <- tune_grid(m, h = 4, tuneGrid = expand.grid(k = 1:7), type = "normal", size = 8)
# To see the estimated forecast accuracy with the different configurations
r$tuneGrid
#> k MAE MAPE sMAPE RMSE
#> 1 1 27.50551 4.195273 4.344649 38.72080
#> 2 2 41.67751 5.782108 5.995799 50.67769
#> 3 3 43.07906 6.012808 6.254095 52.83469
#> 4 4 43.58916 6.206692 6.382526 55.14836
#> 5 5 46.03185 6.332209 6.503322 58.41501
#> 6 6 47.64410 6.441446 6.583134 62.46201
#> 7 7 48.95917 6.841679 6.886563 63.75997

In this example, the tuneGrid parameter is used to
specify (using a data frame) the set of parameters to assess. The
forecast accuracy of the model for the different combinations of
parameters is estimated as explained in the previous section using the
last observations of the time series as validation set. The
size parameter is used to specify the size of the test set.
The tuneGrid component of the list returned by the
tune_grid() function contains the result of the estimation.
In this case, the k-nearest neighbors method using \(k=1\) obtains the best results for all the
forecast accuracy measures. The best combination of parameters according
to RMSE is used to forecast the time series:
r$best
#> $k
#> [1] 1
r$forecast
#> Qtr1 Qtr2 Qtr3 Qtr4
#> 1987 1217.9250 661.4063 388.1828 817.3785

Let us plot the values of \(k\) against their estimated accuracy using RMSE:
plot(r$tuneGrid$k, r$tuneGrid$RMSE, type = "o", pch = 19, xlab = "k (number of nearest neighbors)", ylab = "RMSE", main = "Estimated accuracy")

Let us now see an example in which more than one tuning parameter is used. In this case we use a random forest model and three tuning parameters:
m <- create_model(UKDriverDeaths, lags = 1:12, method = "rf")
r <- tune_grid(m,
h = 12,
tuneGrid = expand.grid(num.trees = c(200, 500), replace = c(TRUE, FALSE), mtry = c(4, 8)),
type = "normal",
size = 12
)
r$tuneGrid
#> num.trees replace mtry MAE MAPE sMAPE RMSE
#> 1 200 TRUE 4 96.26832 6.464129 6.823281 96.26832
#> 2 500 TRUE 4 105.72996 7.096997 7.511665 105.72996
#> 3 200 FALSE 4 106.43686 7.124778 7.552159 106.43686
#> 4 500 FALSE 4 105.40314 7.050568 7.464402 105.40314
#> 5 200 TRUE 8 110.48042 7.462148 7.918964 110.48042
#> 6 500 TRUE 8 106.66487 7.253687 7.673312 106.66487
#> 7 200 FALSE 8 104.21562 7.099506 7.502311 104.21562
#> 8 500 FALSE 8 106.62072 7.248104 7.661996 106.62072

Sometimes, the forecast accuracy of a model can be improved by applying some transformations and/or preprocessings to the time series being forecast. Currently, the utsf package is focused on preprocessings related to forecasting time series with a trend. This is due to the fact that, at present, most of the models incorporated in our package (such as regression trees and k-nearest neighbors) predict averages of the targets in the training set (i.e., an average of historical values of the series). In a trended series these averages are probably outside the range of the future values of the series. Let us see an example in which a trended series (the yearly revenue passenger miles flown by commercial airlines in the United States) is forecast using a random forest model:
m <- create_model(airmiles, lags = 1:4, method = "rf", trend = "none")
f <- forecast(m, h = 4)
autoplot(f)

In this case, no preprocessing is applied for dealing with the trend
(trend = "none"). It can be observed how the forecast does
not capture the trending behavior of the series because regression tree
models predict averaged values of the training targets.
In the next subsections we explain how to deal with trended series using three different transformations.
A common way of dealing with trended series is to transform the original series taking first differences (computing the differences between consecutive observations). Then, the model is trained with the differenced series and the forecasts are back-transformed. Let us see an example:
m <- create_model(airmiles, lags = 1:4, method = "rf", trend = "differences", nfd = 1)
f <- forecast(m, h = 4)
autoplot(f)

In this case, the trend parameter has been used to
specify first-order differences (trend = "differences").
The order of first differences is specified with the nfd
parameter (it will normally be 1). It is also possible to specify the
value -1, in which case the order of differences is estimated using the
ndiffs() function from the forecast
package (in that case, the chosen order of first differences could be 0,
that is, no differences are applied).
# The order of first differences is estimated using the ndiffs function
m <- create_model(airmiles, lags = 1:4, method = "rf", trend = "differences", nfd = -1)
#> Registered S3 method overwritten by 'quantmod':
#> method from
#> as.zoo.data.frame zoo
m$differences
#> Order of first differences (selected automatically): 2

This transformation has been used to deal with trending series in other packages, such as tsfknn or tsfgrnn. The additive transformation works by transforming the targets of the training set as follows: each target is replaced by the difference between its value and the mean of the features (lagged values) of its training example.
It is easy to check how the training examples are modified by the additive transformation using the API of the package. For example, let us see the training examples of a model with no transformation:
timeS <- ts(c(1, 3, 7, 9, 10, 12))
m <- create_model(timeS, lags = 1:2, trend = "none")
cbind(m$features, Targets = m$targets)
#> Lag2 Lag1 Targets
#> 1 1 3 7
#> 2 3 7 9
#> 3 7 9 10
#> 4 9 10 12Now, let us see the effect of the additive transformation in the training examples:
timeS <- ts(c(1, 3, 7, 9, 10, 12))
m <- create_model(timeS, lags = 1:2, trend = "additive", transform_features = FALSE)
cbind(m$features, Targets = m$targets)
#> Lag2 Lag1 Targets
#> 1 1 3 5.0
#> 2 3 7 4.0
#> 3 7 9 2.0
#> 4 9 10 2.5

Apart from transforming the targets, it is also possible to transform the features. A vector of features is transformed by subtracting the mean of the vector. Let us see the effects of this transformation:
timeS <- ts(c(1, 3, 7, 9, 10, 12))
m <- create_model(timeS, lags = 1:2, trend = "additive", transform_features = TRUE)
cbind(m$features, Targets = m$targets)
#> Lag2 Lag1 Targets
#> 1 -1.0 1.0 5.0
#> 2 -2.0 2.0 4.0
#> 3 -1.0 1.0 2.0
#> 4 -0.5 0.5 2.5

The point of transforming the features is to remove the effects of the level of the series in the feature space. Our experience tells us that transforming the features tends to improve forecast accuracy.
Next, we forecast the airmiles series using the additive transformation and a random forest model:
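The original code chunk is not reproduced here; a minimal sketch, mirroring the earlier airmiles call but with the additive transformation (the default), would be:

m <- create_model(airmiles, lags = 1:4, method = "rf", trend = "additive")
f <- forecast(m, h = 4)
autoplot(f)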
By default, the additive transformation is applied to both targets and features.
The multiplicative transformation is similar to the additive transformation, but it is intended for series with an exponential trend (the additive transformation is suited for series with a linear trend). In the multiplicative transformation a training example is transformed in an analogous way, but dividing by the mean of the features of the example instead of subtracting it.
Let us see an example of an artificial time series with a multiplicative trend and its forecast using the additive and the multiplicative transformation:
t <- ts(10 * 1.05^(1:20))
m_m <- create_model(t, lags = 1:3, method = "rf", trend = "multiplicative")
f_m <- forecast(m_m, h = 4)
m_a <- create_model(t, lags = 1:3, method = "rf", trend = "additive")
f_a <- forecast(m_a, h = 4)
library(vctsfr)
plot_predictions(t, predictions = list(Multiplicative = f_m$pred, Additive = f_a$pred))

In this case, the forecast with the multiplicative transformation perfectly captures the exponential trend.
As another example, let us compare the additive and multiplicative
transformation applied to the AirPassengers series. This
monthly series has an exponential trend.
m_m <- create_model(AirPassengers, lags = 1:12, method = "knn", trend = "multiplicative")
f_m <- forecast(m_m, h = 36)
m_a <- create_model(AirPassengers, lags = 1:12, method = "knn", trend = "additive")
f_a <- forecast(m_a, h = 36)
library(vctsfr)
plot_predictions(AirPassengers, predictions = list(Multiplicative = f_m$pred, Additive = f_a$pred))

The create_model() function only has one compulsory
parameter: the time series with which the model is trained. Next, we
discuss the default values for the rest of its parameters:
- lags: an integer vector (in increasing order) with the autoregressive lags. If frequency(ts) == f, where ts is the time series being forecast and \(f > 1\), then the lags used as autoregressive features are 1:f. For example, the lags for quarterly data are 1:4 and for monthly data 1:12. This is useful to capture a possible seasonal behavior of the series. If frequency(ts) == 1, then the lags with significant values in the partial autocorrelation function of the series (stats::pacf()) are selected.
- method: By default, the K-nearest neighbors algorithm is applied.
- param: By default, the model is trained using some sensible parameters, normally the default values of the function used to train the model.
- trend: The additive transformation is applied. This transformation has proved its effectiveness for forecasting trending series. In general, its application seems beneficial on any kind of series.
- transform_features: Its default value is TRUE, so that if a series is transformed with the additive or multiplicative transformation, both its features and targets are transformed.