HANDLING IMBALANCED CLASS CLASSIFICATION WITH RANDOM OVERSAMPLING IN NAIVE BAYES (Case Study: Status of IUD Family Planning Participants in Kendal Regency)

Reza Dwi Fitriani; Hasbi Yasin; Tarno Tarno

doi:10.14710/j.gauss.10.1.11-20

*Reza Dwi Fitriani - Departemen Statistika, Fakultas Sains dan Matematika, Universitas Diponegoro, Indonesia

Hasbi Yasin - Departemen Statistika, Fakultas Sains dan Matematika, Universitas Diponegoro, Indonesia

Tarno Tarno - Departemen Statistika, Fakultas Sains dan Matematika, Universitas Diponegoro, Indonesia

Abstract

The Family Planning (KB) program launched by the Government of Indonesia to control population growth does not always produce the desired results. In 2017, 7 of the 1,102 new IUD users in Kendal Regency experienced contraceptive failure, so the ratio of failure to success among new IUD users was 0.64% : 99.36%. Because the two outcome classes are this imbalanced, the status is difficult to predict. One way to handle imbalanced data is oversampling, for example Random Oversampling (ROS). Naive Bayes is used for classification because it is a simple and efficient learning model. The data in this study comprise 14 independent variables and 1 dependent variable. The results show that the G-mean of Naive Bayes alone is less than 60%, whereas the G-mean of ROS-Naive Bayes is 96.6%. It can be concluded that, in this study, the ROS-Naive Bayes method is better than the Naive Bayes method at detecting the success status of IUD family planning in Kendal Regency.

Keywords: Naive Bayes, Random Oversampling, G-mean
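
The two ideas named in the abstract, Random Oversampling and the G-mean, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `random_oversample` and `g_mean`, the labels "fail" and "success", and the placeholder feature rows are assumptions made for illustration, and the G-mean is assumed to follow its common definition as the geometric mean of sensitivity and specificity. Only the class counts (7 failures among 1,102 new users) come from the abstract. The Naive Bayes step itself is omitted.

```python
import math
import random


def random_oversample(X, y, minority_label, seed=0):
    """Duplicate randomly chosen minority rows until both classes have equal size."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    if not minority:
        raise ValueError("no rows carry the minority label")
    # Draw minority indices with replacement to close the gap between classes.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def g_mean(y_true, y_pred, positive_label):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive_label:
            if pred == positive_label:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive_label:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Class counts from the abstract: 7 failures among 1,102 new IUD users.
y = ["fail"] * 7 + ["success"] * 1095
X = [[i] for i in range(len(y))]  # placeholder features, not the study's 14 variables

X_ros, y_ros = random_oversample(X, y, minority_label="fail")
print(y_ros.count("fail"), y_ros.count("success"))

# A classifier that always predicts the majority class is about 99.36% accurate,
# yet its G-mean is zero because it never detects a failure.
always_success = ["success"] * len(y)
print(g_mean(y, always_success, positive_label="fail"))
```

The last two lines show why accuracy alone is misleading on data this imbalanced and why a measure such as the G-mean is reported instead. In practice, oversampling is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data.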

Note: This article has supplementary file(s).

Article Info

Section: Articles

Language : ID

In Vol 10, No 1 (2021): Jurnal Gaussian
