5.12.8. Time Series (clip0243 action)

<< Click to Display Table of Contents >>

Navigation:  5. Detailed description of the Actions > 5.12. TA - R Predictive >

5.12.8. Time Series (clip0243 action)

Icon:clip1302

 
Function: R_timeSeries
 

Property window:

 

clip1305

 

Short description:

 

Build and compare TimeSeries

on sequential data

 

Long Description:

 

Use this node to compute ETS, STLF, ARIMA and Holt-Winters models.

 

At the very least, you need to have two series: one with dates (YYYY-MM-dd or YYYY/MM/dd or any separators, as long as the order and number of characters is respected).

 

Then, you need to specify the time units between observations (Unit of Time). Anatella will attempt to do it for you, and if it doesn’t match the data, it will recode it. But it’s always better to help the software.

 

Then, specify on which time frame you wish to run the analysis (Aggregate Time Series) to get smoother – and usually better – results. You can specify whether you want the sum or the average computed

 

Sans-titre-1

 

 

Then, you can optionally uncheck the algorithms you don’t want to run (because you know what you are doing... or leave them all checked and let R figure out which one works.
 

clip1322

 

 

clip1309

When you run the algorithms, make sure you force anatella to write cache: right click on the object and set the option

 

 

You can set some advanced parameters for the decomposition, for ARIMA and for Holt Winters.

 

clip1310

 
 

Arima: Information Criterion: By default, you should use BIC as a criterion, as it will perform better if heterogeneity is large (Aho, 2014) , but AIC an AICC are also available and should be checked as other criterion may yield better models. As a reminder, the key difference between AIC and BIC is that BIC penalizes stronger the number of parameters, and this is often desirable.

Consider clip1311 is the max likelihood estimates of the model parameters, clip1312 is the log likelihood of the model, clip1313 is the number of parameters and clip1314 the number of observations.

 
AIC: Akaike Information Criteria = clip1315

 
AICC: Aikake for Small Samples clip1316

 
BIC: Bayesian Information Criteria. clip1317

 

Thus, when you have few observations (as often in timeseries) AICC would be the most conservative (and preferred) test to apply. When you have “a lot” of points (over 100), BIC or AIC would be appropriate.

 

The details of the model are available in the log:

 

Model Information:

Series: timeSerie

ARIMA (1,0,1) (2,1,0) [12]

 

Coefficients:

        arl     mal    sarl    sar2

     0.8020 -0.5961 -0.6803 -0.4174

s.e.  0.1671  0.2171  0.0956  0.0995

 

sigma^2 estimated as 974.6: log likelihood=-526.74

AIC=1063.48   AICc=1064.07   BIC=1076.89

 

Error measures:

                    ME     RMSE      MAE        MPE     MAPE      MASE

Training set  0.6694643 29.06258 21.78323 -0.4048295 6.963703 0.6720913

                  ACF1

Training set -0.0313262

 

Forecasts:

        Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95

Jan 1991       496.5844 456.5766 536.5921 435.3978 557.7710

 

 

Trend Smoothing Periodicity: this parameter will set how many periods are used to smooth the trend line. If you put a small number, you will have a very non-linear trend, if you put a large number you will get something more stable. If you leave it at 0, the default value of R for (t.window)is used:

 
nextodd(ceiling((1.5*period)/(1-(1.5/s.window))))

 
Decomp: Season Degree: (0/1) degree of the function for seasonality. Set to 0 if you don’t think there is seasonality

 
Decomp: Trend Degree: (0/1) degree of the function for trend. Set to 0 if you don’t think there is trend

 
Decomp Robust: Set if you want to use a robust method

 
Holt Winters alpha, Beta and Gamma: Set starting values to help avoid local optima.

 
Outputs
 

If you are lucky (you have a data structure that fits the time series framework) you will get the following plot:

clip1318

 

Here, we can observe the decomposition of trend, seasonality and error of the data, a box plot showing the distribution (error) per time unit, and the results of each algorithm.

 

You can pick the one with the lowest MAE, of look at the diagnostics for more KPIs:

 

Running... (21/2/22 18:08:13)

[1] "Set Time Unit to Days (minimum frequency), StartUnit== 1"

                     ME             RMSE              MAE                MPE

ETS   -0.225950033418252 27.3589584352165 22.1117188650616 -0.740521333848271

STLF  -0.386518286476155 26.0885511361616 20.9688275434863 -0.918204402968769

ARIMA  0.669464278059237 29.0625764891712 21.7832252493945 -0.404829515401513

HW      3.81109616563921 32.2023638385829 25.2788828886824  0.848436419880703

                  MAPE              MASE                ACF1

ETS    7.05346168010657 0.682226499093432  0.0640085212331007

STLF   6.72201962179413 0.646964168294057  0.0131929824344223

ARIMA   6.9637031816088 0.672091282977547 -0.0313261956920077

HW     8.13514778925323 0.779944963997742 -0.0462278167592328

 

Success! (finished at 21/2/22 18:08:25 after 12.04 seconds - Peak Memory Consumption~ 295 MB)

 

 

The same information is available in the second output pin:

 

clip1323

 

 

The first output pin will give you the computed estimates for each observation, as well as the prediction for all successful algorithms (for those, the original observation will be set to 0)
 

clip1325

 

 

If we are unlucky, the time series will fail to decompose, and /or you may end up with constant estimates (when the algorithms doesn’t simply fail). In this case, only the box plot and timeSeries plots will be returned. You will also see and error message in the log telling you what happened (in R dialect)

 

clip1321

 

[1] “Error. Failed to converge STLF Algorithm.\nThe R library said:\nError in stl(ts(deseas, frequency = msts[i]),

s.window = s.window[i], : series is not periodic or has less than two periods\n"

[1] “Error. Failed to converge Holt-Winters Algorithm.\nThe R library said:\nError in decompose (ts(x[1L:wind], start

= start(x), frequency = f£), seasonal): time series has no or less than 2 periods\n"

[1] "nSeries= 2"

[1] “Warning: Failed to decompose TimeSeries. Decomposition not plotted.”

<simpleError in stl(timeSerie, s.window = "periodic", t.window = idxPeriod): series is not periodic or has less than

two periods>

 

 

In this case, STLF and Holt Winters failed, Arima and ETS failed without crashing, and simply return a constant prediction. The reason here is obvious: there is not enough data to find cycles!