THE MACHINE LEARNING TECHNIQUE FOR FORECASTING THE SEASONAL TIME SERIES

  • V.V. Alchakov, Sevastopol State University
  • V.A. Kramar, Sevastopol State University
Keywords: Seasonal time series, machine learning, forecasting, SARIMA, Holt-Winters exponential smoothing, Error Trend Seasonal (ETS), Facebook Prophet, XGBoost, Long Short-Term Memory (LSTM)

Abstract

Time series with seasonal variability are widely used to describe processes in various
fields, such as trade, financial market analysis, passenger air traffic forecasting,
and climate change. Recently, such series have also been used to describe
technological processes, which makes it possible to apply predictive models in control systems of
complex technical objects. Machine learning methods can be effectively
used to build predictive models of series of this type. In this case, only historical data accumulated
over several periods of seasonal observation are used as input for constructing the
forecast; knowledge of other parameters is, as a rule, not required. The article considers building
a predictive model of a time series with seasonal variability that describes a technological process,
using the inlet flow of a wastewater treatment plant as the example. The general methodology
of model building, the requirements for input data sets, and the preprocessing algorithms used to
form the training and test samples are described. Classical methods (SARIMA,
Holt-Winters exponential smoothing, ETS) as well as newer algorithms (Facebook Prophet,
XGBoost, Long Short-Term Memory) were used to build the predictive model. The implementation
of the algorithms is done in Python, and recommendations on the use of existing
libraries and functions of this language are given. The accuracy of the
obtained models is compared using a set of statistical metrics.
The computational performance of the methods is also analyzed, since the time needed to build a model
and obtain a forecast plays an important role when the model runs in real production conditions.
The best method for the stated task, for application in real-time control systems, was
chosen based on the aggregate of these assessments. In conclusion, recommendations for improving forecast
accuracy are given, and future research directions are outlined.
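
The evaluation workflow summarized above (a chronological split into training and test samples, a forecast, a set of accuracy metrics, and a timing measurement) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the series is synthetic rather than wastewater inflow data, the 24-step period is assumed, and a seasonal naive forecast stands in for the models named in the abstract. All function names are hypothetical, and the choice of MAE, RMSE, and MAPE is an assumption, since the abstract does not list its metrics.

```python
import math
import time


def make_seasonal_series(n_periods, period):
    """Build a synthetic series: linear trend plus a sine-shaped seasonal cycle."""
    return [
        100.0 + 0.05 * t + 10.0 * math.sin(2.0 * math.pi * (t % period) / period)
        for t in range(n_periods * period)
    ]


def split_by_periods(series, period, test_periods):
    """Chronological split: the last `test_periods` seasons form the test sample."""
    cut = len(series) - test_periods * period
    return series[:cut], series[cut:]


def seasonal_naive_forecast(train, period, horizon):
    """Repeat the last observed season (a stand-in baseline, not the article's models)."""
    last_season = train[-period:]
    return [last_season[h % period] for h in range(horizon)]


def accuracy_metrics(actual, predicted):
    """Return MAE, RMSE and MAPE (in percent) for two equal-length sequences."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape}


period = 24  # assumed: hourly observations with a daily cycle
series = make_seasonal_series(n_periods=14, period=period)
train, test = split_by_periods(series, period, test_periods=2)

# Time the forecast step, since run time matters for real-time use.
start = time.perf_counter()
forecast = seasonal_naive_forecast(train, period, horizon=len(test))
elapsed = time.perf_counter() - start

metrics = accuracy_metrics(test, forecast)
print(metrics, f"forecast time: {elapsed:.6f} s")
```

Any of the methods named in the abstract could replace `seasonal_naive_forecast` in this sketch, as long as it takes the training sample and returns a forecast of the same length as the test sample; the split, the metrics, and the timing would stay the same.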

Published
2023-06-07
Section
SECTION III. INFORMATION PROCESSING ALGORITHMS