
FORECASTING JAKARTA LRT PASSENGERS USING SARIMAX AND XGBOOST WITH CALENDAR EFFECTS

Muhammad Hafiz Fazli  -  Study Program in Statistics and Data Science, Institut Pertanian Bogor, Jl. Meranti Wing 22 Level 4 Kampus IPB Darmaga, Darmaga, Bogor, Jawa Barat 16680, Indonesia
M. Taqy Abiyu Dzakwan  -  Study Program in Statistics and Data Science - School of Data Science, Mathematics, and Informatics, IPB University, Bogor, Indonesia
Nada Ardelia  -  Study Program in Statistics and Data Science - School of Data Science, Mathematics, and Informatics, IPB University, Bogor, Indonesia
Gemala Aleida Fitri  -  Study Program in Statistics and Data Science - School of Data Science, Mathematics, and Informatics, IPB University, Bogor, Indonesia
*Akbar Rizki  -  Study Program in Statistics and Data Science - School of Data Science, Mathematics, and Informatics, IPB University, Bogor, Indonesia
Windi Pangesti  -  Study Program in Statistics and Data Science - School of Data Science, Mathematics, and Informatics, IPB University, Bogor, Indonesia
Open Access Copyright 2026 Jurnal Gaussian under http://creativecommons.org/licenses/by-nc-sa/4.0.

Abstract
The daily passenger volume of Jakarta’s LRT fluctuates significantly due to weekly seasonality and calendar variations, making accurate forecasting important for operational planning and decision-making. This study aims to determine the most effective model for forecasting daily passenger demand by comparing the SARIMAX and XGBoost methods on transportation data characterized by strong seasonal patterns and external influences. SARIMAX was selected because it models seasonal and autoregressive structures alongside exogenous variables, while XGBoost captures nonlinear relationships between temporal features and external factors. The dataset covers the period from 1 January 2024 to 31 August 2025 and includes variables such as weekends, national holidays, and special events. Model evaluation was conducted using walk-forward cross-validation and hyperparameter tuning. The results show that the SARIMAX(1,0,1)(0,1,1)₇ model achieved the best performance, with a validation MAPE of 11.26% and a test MAPE of 8.64%, outperforming XGBoost. SARIMAX also reproduced weekly fluctuation patterns more consistently, indicating that it is more suitable for forecasting transportation demand with strong seasonal characteristics and relatively stable external influences.
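The evaluation procedure named in the abstract (walk-forward cross-validation scored by MAPE) can be illustrated with a minimal sketch. This is not the authors' implementation. The data are synthetic, the function names (`is_weekend`, `mape`, `walk_forward_splits`, `seasonal_naive`) are hypothetical, the fold sizes are assumptions, and a seasonal-naive forecaster with period 7 stands in for SARIMAX and XGBoost so that the example needs only the Python standard library.

```python
from datetime import date, timedelta


def is_weekend(day):
    """Calendar flag: True for Saturday or Sunday."""
    return day.weekday() >= 5


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def walk_forward_splits(n_obs, initial_train, horizon):
    """Yield (train_end, test_end) index pairs for an expanding window."""
    train_end = initial_train
    while train_end + horizon <= n_obs:
        yield train_end, train_end + horizon
        train_end += horizon


def seasonal_naive(history, horizon, period=7):
    """Stand-in model: repeat the value observed one period earlier."""
    last_cycle = history[-period:]
    return [last_cycle[h % period] for h in range(horizon)]


# Synthetic daily series (assumption): 1000 riders on weekdays, 600 on weekends.
days = [date(2024, 1, 1) + timedelta(days=i) for i in range(28)]
ridership = [600.0 if is_weekend(d) else 1000.0 for d in days]

# Expanding-window evaluation: train on everything before each fold,
# forecast the next 7 days, and score the fold with MAPE.
splits = list(walk_forward_splits(len(ridership), initial_train=14, horizon=7))
fold_scores = []
for train_end, test_end in splits:
    forecast = seasonal_naive(ridership[:train_end], test_end - train_end)
    fold_scores.append(mape(ridership[train_end:test_end], forecast))

print(splits)
print(fold_scores)
```

Each fold trains only on observations that precede its test window, which is what distinguishes walk-forward validation from a shuffled k-fold split. Swapping `seasonal_naive` for a fitted model leaves the rest of the loop unchanged.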

Note: This article has supplementary file(s).

Supplementary files:
Data Set: Dataset LRT Ridership, Daily, Jan 2024-Aug 2025 (10KB)
Data Analysis: Flagging exogenous variables: weekends, holidays, contiguous off days, almost contiguous off days (38KB)
Data Analysis: SARIMAX Modelling (479KB)
Data Analysis: XGBoost Modelling (403KB)
Research Instrument: Copyright Transfer Agreement (Perjanjian Pengalihan Hak Cipta) (597KB)
Keywords: LRT Jakarta; passenger demand; SARIMAX; time series forecasting; XGBoost

Article Info
Section: Articles
Language : ID
