APPLICATION OF RANDOMSEARCHCV HYPERPARAMETER TUNING TO ADAPTIVE BOOSTING FOR PREDICTING THE SURVIVAL OF HEART FAILURE PATIENTS

*Tita Aulia Edi Putri  -  Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia
Tatik Widiharih  -  Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia
Rukun Santoso  -  Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia
Open Access Copyright 2022 Jurnal Gaussian under http://creativecommons.org/licenses/by-nc-sa/4.0.

Abstract
Heart failure is among the leading causes of death each year. It is a pathological condition in which abnormal heart function prevents the heart from pumping enough blood to meet the metabolic needs of the tissues. Applying data mining and computational techniques to medical records can be an effective way to predict the survival of patients with heart failure symptoms. Data mining is the process of extracting important information from large data sets, drawing on statistical methods, mathematics, and artificial intelligence. AdaBoost is a supervised learning algorithm widely used in data mining to build classification models. Hyperparameter optimization is the selection of the optimal set of hyperparameters for a learning algorithm. AdaBoost has hyperparameters that must be set before classification, namely the learning rate and n_estimators. RandomSearchCV trains the model on randomly drawn combinations of the selected hyperparameters. This research uses heart failure patient data collected at the Faisalabad Institute of Cardiology and at the Allied Hospital in Faisalabad (Punjab, Pakistan) from April to December 2015. The search uses learning rate: [-2.2] (log scale), n_estimators from 10 to 776, and Kfold=5, and yields the best hyperparameters learning rate=0.01 and n_estimators=443, with an accuracy of 0.85 and an AUC of 0.897.
Keywords: Heart Failure; Tuning Hyperparameter; AdaBoost; RandomSearchCV
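
The tuning procedure summarized in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. It assumes that "RandomSearchCV" refers to scikit-learn's `RandomizedSearchCV`, that the learning rate range is read as exponents from -2 to 2 on a base-10 log scale (that is, 0.01 to 100), and that AUC is the selection metric. The data are synthetic stand-ins generated with `make_classification`, not the Faisalabad patient records, and `n_iter=5` is an arbitrary small budget chosen to keep the run short.

```python
from scipy.stats import loguniform, randint
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in data (NOT the heart failure records); sizes are arbitrary.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumed search space: learning rate drawn log-uniformly from [1e-2, 1e2],
# n_estimators drawn as an integer from 10 to 776 inclusive.
param_distributions = {
    "learning_rate": loguniform(1e-2, 1e2),
    "n_estimators": randint(10, 777),  # upper bound of randint is exclusive
}

# Random search with 5-fold cross-validation; each sampled combination is
# scored by mean cross-validated AUC (an assumed choice of metric).
search = RandomizedSearchCV(
    AdaBoostClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,            # hypothetical budget, kept small for speed
    scoring="roc_auc",
    cv=5,
    random_state=0,
)
search.fit(X_train, y_train)

# The search refits the best combination on the full training split,
# so the fitted search object can be evaluated on held-out data directly.
best = search.best_params_
test_accuracy = accuracy_score(y_test, search.predict(X_test))
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])

print("best hyperparameters:", best)
print("test accuracy:", round(test_accuracy, 3))
print("test AUC:", round(test_auc, 3))
```

Because the data are synthetic, the printed values will not match the accuracy of 0.85 and AUC of 0.897 reported in the abstract; the sketch only shows how the search space, the 5-fold cross-validation, and the two evaluation metrics fit together.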

