XGBoost time series forecasting with sktime [ep#2]

Patiparn Nualchan
6 min read · Dec 29, 2022

Build, fine-tune, and evaluate an ML model on time series data with sktime

Hello again! In this ep#2 we are going to cover steps 3–5 (listed below). Let's not waste time, let's go!

  1. Time series data (rough concepts for dealing with a single sales series)
  2. Time series analysis (EDA and stationarity testing)
  3. Time series forecasting (modeling and prediction)
  4. Time series cross-validation (temporal cross-validation)
  5. Fine-tuning XGBoost (getting the best parameters)

To briefly recap: from ep#1 we have the data ready as y_train and y_test (via temporal_train_test_split) and the forecasting horizon fh (via ForecastingHorizon). Our goal is to create a model that predicts sales 12 weeks (3 months) ahead.
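For readers who skipped ep#1, here is a minimal sketch of that setup. It assumes the weekly sales series is a pandas Series named y; the toy series below is only a stand-in so the snippet runs on its own.

import pandas as pd
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.model_selection import temporal_train_test_split

# stand-in for the weekly sales series used in ep#1
y = pd.Series(range(104), index=pd.period_range("2020-01-05", periods=104, freq="W"))

# hold out the last 12 weeks (3 months) as the test set
y_train, y_test = temporal_train_test_split(y, test_size=12)

# absolute horizon built from the test index, i.e. the 12 weeks we want to forecast
fh = ForecastingHorizon(y_test.index, is_relative=False)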

3 Time series forecasting: sktime reduces the time sequence to a supervised learning problem ready for a machine learning task, so we can say we are now working on regression. I picked the following regression models.

  • AutoARIMA* (a classic Time series technique)
  • KNeighborsRegressor
  • LinearRegression
  • XgbRegressor

For model performance I used mean_absolute_percentage_error (MAPE) from sktime.performance_metrics.forecasting. [MAPE measures the error as a percentage relative to the test data, taking the absolute value of the percentage forecast error rather than squaring it.] code here.

from sktime.performance_metrics.forecasting import mean_absolute_percentage_error
print('MAPE: %.4f' % mean_absolute_percentage_error(y_test, y_pred, symmetric=False))
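To make the metric concrete, here is a tiny hand-check on made-up numbers (not from our sales data); it should match the sktime function.

import numpy as np
import pandas as pd

y_true = pd.Series([100.0, 200.0, 400.0])
y_hat = pd.Series([110.0, 180.0, 300.0])

# MAPE = mean(|y_true - y_pred| / |y_true|)
print(np.mean(np.abs(y_true - y_hat) / np.abs(y_true)))               # 0.15
print(mean_absolute_percentage_error(y_true, y_hat, symmetric=False))  # 0.15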

AutoARIMA

In AutoARIMA, the model itself searches for the optimal p, d, and q values for the data set, so it can provide better forecasts without manual order selection.

code here,

from sktime.forecasting.arima import AutoARIMA
from sktime.utils.plotting import plot_series

# search ARIMA orders starting from p=8 (capped at p=9)
forecaster = AutoARIMA(start_p=8, max_p=9, suppress_warnings=True)

forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
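If you are curious which order AutoARIMA actually settled on, you can inspect the fitted parameters after fitting. A small sketch; the exact keys in the returned dict depend on your sktime version:

# the dict typically includes the selected (p, d, q) order and
# information criteria such as AIC (exact keys vary by sktime version)
print(forecaster.get_fitted_params())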

As you can see, y_pred cannot capture any up-and-down trend, and the MAPE came out at 0.75 (75% error).

KNeighborsRegressor

The KNN algorithm uses ‘feature similarity’ to predict the values of any new data points. This means that the new point is assigned a value based on how closely it resembles the points in the training set.

code here,

from sklearn.neighbors import KNeighborsRegressor
from sktime.forecasting.compose import make_reduction

regressor = KNeighborsRegressor(n_neighbors=3)

# reduce the forecasting task to regression on lagged windows
# (window_length was defined earlier in the notebook)
forecaster = make_reduction(regressor, strategy="recursive", window_length=window_length)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])

As shown above, KNeighborsRegressor can predict the up-and-down trend, but there is a gap between the actual and predicted values (marked in my yellow hand-writing on the plot), formally reflected by a MAPE of 0.8094. However, since it captures the trend line, I kept it as a candidate model.
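To see roughly what make_reduction with strategy="recursive" is doing under the hood, here is my own illustration (not sktime code): each target value is regressed on the previous window_length observations, and at prediction time each new forecast is fed back in as an input for the next step.

import numpy as np

def make_lag_table(y, window_length):
    """Turn a 1-D series into (X, y) pairs of lagged windows and next values."""
    y = np.asarray(y, dtype=float)
    X, targets = [], []
    for t in range(window_length, len(y)):
        X.append(y[t - window_length:t])  # the last window_length values
        targets.append(y[t])              # the value one step ahead
    return np.array(X), np.array(targets)

# toy example: series 1..10 with a window of 3
X, t = make_lag_table(range(1, 11), window_length=3)
print(X[0], t[0])  # [1. 2. 3.] 4.0 -> predict 4 from [1, 2, 3]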

LinearRegression

Linear regression models the linear relationship between the independent variable (X-axis) and the dependent variable (Y-axis).

code here,

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()

forecaster = make_reduction(regressor, strategy="recursive", window_length=window_length)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])

As shown above, linear regression predicted heavy up-and-down swings. This was an effect of noise in the data, and the MAPE was very bad at 1.87. (A further step would be to remove that noise, re-predict, and see how much it improves.)

XgbRegressor

XGBoost is a decision-tree-based ensemble machine learning algorithm that uses a gradient boosting framework. In prediction problems involving unstructured data (images, text, etc.), artificial neural networks tend to outperform other algorithms, but on tabular data like ours, tree-based methods such as XGBoost are usually a strong choice.

code here,

from xgboost import XGBRegressor
regressor = XGBRegressor(objective='reg:squarederror', random_state=42)

forecaster = make_reduction(regressor, strategy="recursive", window_length=window_length)
forecaster.fit(y=y_train)
y_pred = forecaster.predict(fh=fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])

XgbRegressor can predict the up-and-down trend at an acceptable level, but with almost the same error as KNN (the yellow gap) and a MAPE of 0.96. However, since it captures the trend, I picked it as a candidate model to improve.

4 Time series cross-validation: to make the model more robust and better performing, we have to do cross-validation.

First, we don't use the familiar K-Fold for time series: sequential data can't be shuffled or sampled at random. So cross-validation for time series is special (temporal cross-validation).

SlidingWindowSplitter and ExpandingWindowSplitter are the two main temporal cross-validation splitters.

In this article, I used ExpandingWindowSplitter and kept only the 2 candidate models, KNeighborsRegressor and XgbRegressor.

ExpandingWindowSplitter concept by author
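A quick way to see what ExpandingWindowSplitter does is to enumerate its splits: the training window starts at initial_window observations and grows by step_length each fold, while the test window is always the 12-step horizon. A small sketch on a stand-in weekly index (not the real sales data):

import numpy as np
import pandas as pd
from sktime.forecasting.model_selection import ExpandingWindowSplitter

y_demo = pd.Series(np.arange(104.0),
                   index=pd.period_range("2020-01-05", periods=104, freq="W"))

cv = ExpandingWindowSplitter(step_length=12, fh=np.arange(1, 13), initial_window=52)

for i, (train_idx, test_idx) in enumerate(cv.split(y_demo)):
    print(f"fold {i}: train size = {len(train_idx)}, "
          f"test = positions {test_idx[0]}..{test_idx[-1]}")
# the train window expands (52, 64, 76, ...) while each test window stays 12 weeks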

KNeighborsRegressor: ExpandingWindowSplitter

from sktime.forecasting.model_selection import ExpandingWindowSplitter
from sktime.forecasting.model_evaluation import evaluate

regressor = KNeighborsRegressor(n_neighbors=3)
forecaster = make_reduction(regressor, strategy="recursive", window_length=window_length)

# start with 52 weeks of history, add 12 weeks per fold, forecast 12 weeks each time
cv = ExpandingWindowSplitter(step_length=12, fh=fh, initial_window=52)
results = evaluate(
    forecaster=forecaster, y=y, cv=cv, strategy="refit", return_data=True
)
results.iloc[:, :5].head()

KNeighborsRegressor with ExpandingWindowSplitter: the prediction line did follow the up-and-down trend, but on the last fold the MAPE was higher than on the previous ones. By eye we can see a big gap as well.

XgbRegressor: ExpandingWindowSplitter

regressor = XGBRegressor(objective='reg:squarederror', random_state=42)
forecaster = make_reduction(regressor, strategy="recursive", window_length=window_length)

cv = ExpandingWindowSplitter(step_length=12, fh=fh, initial_window=52)
results = evaluate(
    forecaster=forecaster, y=y, cv=cv, strategy="refit", return_data=True
)
results.iloc[:, :5].head()

As shown above, the XgbRegressor + ExpandingWindowSplitter prediction line got closer to the actual line compared with KNeighborsRegressor, and the summary table shows the MAPE improving as len_train_window grows. >>>>>> Pick XgbRegressor to fine-tune >>>>>>

5 XgbRegressor fine-tuning: hyperparameter tuning is the process of determining the right combination of hyperparameters that maximizes model performance.

code here,

import numpy as np
from sktime.forecasting.model_selection import (ForecastingGridSearchCV,
                                                ExpandingWindowSplitter)

# grid over the XGBoost regressor wrapped inside the reduction forecaster
param_grid = {
    'estimator__max_depth': [3, 6, 10, 15],
    'estimator__learning_rate': [0.01, 0.1, 0.2, 0.3],
    # note: with the default step this is just [0.4]; pass a step (e.g. 0.2) for more values
    'estimator__colsample_bytree': np.arange(0.4, 1.0),
    'estimator__n_estimators': [100, 500, 1000]
}

regressor = XGBRegressor(objective='reg:squarederror', random_state=42)
forecaster = make_reduction(regressor, strategy="recursive")

cv = ExpandingWindowSplitter(step_length=12, fh=fh, initial_window=52)
gscv = ForecastingGridSearchCV(
    forecaster, cv=cv, param_grid=param_grid, strategy="refit"
)
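The block above only constructs the search object; to run the tuning you still have to fit it and read off the winning configuration. A sketch of the remaining steps (best_params_, best_score_ and predict are standard ForecastingGridSearchCV behaviour; the final scoring assumes the same MAPE metric as before):

# run the search: the forecaster is refit on every CV fold for each parameter combo
gscv.fit(y_train)

# winning parameter combination and its cross-validation score
print(gscv.best_params_)
print(gscv.best_score_)

# forecast 12 weeks ahead with the best forecaster and score it on the test set
y_pred = gscv.predict(fh)
print('MAPE: %.4f' % mean_absolute_percentage_error(y_test, y_pred, symmetric=False))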

The MAPE improved from 0.9653 at the beginning (single model), to 0.5319 on the last fold of the ExpandingWindowSplitter validation, to 0.5114 after hyperparameter tuning. Hyperparameter tuning gives a 47% improvement over the single model and 4% over validation.

Conclusion

  • We explored 4 models to predict sales 12 weeks ahead
  • KNeighborsRegressor and XgbRegressor were the 2 candidate models taken forward for more robustness checks
  • The final model is XgbRegressor with hyperparameter tuning

For future work, we can rework the XGBoost model with noise removal, more training data, or more features, up to feature engineering on the date terms to generate more interesting features.
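As one concrete direction for that, here is a minimal sketch of date-based feature engineering. It assumes the sales series has a weekly PeriodIndex (or DatetimeIndex); the function and column names are hypothetical.

import pandas as pd

def add_date_features(y):
    """Expand a weekly sales Series into a frame with calendar feature columns."""
    ts = y.index.to_timestamp() if hasattr(y.index, "to_timestamp") else y.index
    return pd.DataFrame({
        "sales": y.values,
        "week_of_year": ts.isocalendar().week.values,
        "month": ts.month,
        "quarter": ts.quarter,
    }, index=y.index)

Such calendar columns could then be passed to the reduction forecaster as exogenous variables X in fit and predict.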

Thanks for reading to the end. I'm not an expert in time series or the sktime world; I'm just someone who wants to learn and practice by doing and taking notes. I would appreciate any advice from experts on points I got wrong, and any other comments.

You can find the full code on my GitHub here: https://github.com/MossMojito/sktime_Xgboost/blob/main/sktime_xgboost.ipynb
