
ANALYSIS OF TRENDS IN PUBLIC REPORTS ON “LAPORGUB..!” IN CENTRAL JAVA PROVINCE USING TEXT MINING WITH FUZZY C-MEANS CLUSTERING

*Ratna Kurniasari  -  Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia
Rukun Santoso  -  Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia
Alan Prahutama  -  Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia
Open Access Copyright 2021 Jurnal Gaussian under http://creativecommons.org/licenses/by-nc-sa/4.0.

Abstract

Effective communication between the government and the public is essential to good governance. The government provides a channel for public complaints through an online aspiration and complaint service called “LaporGub..!”. To make incoming reports easier to group, report topics are identified by clustering. Text mining converts the text data into numeric data so that it can be processed further. Clustering methods are classified as hard or soft (fuzzy). Hard clustering assigns each object to exactly one cluster, with no overlapping membership. Soft clustering can assign an object to several clusters, each with a degree of membership. Because objects at the boundary between classes are not forced into a single class but receive a degree of membership in each, fuzzy clustering gives more natural results than hard clustering. Fuzzy c-means has the advantage of placing cluster centers more precisely than other clustering methods, because it refines the centers iteratively. The best number of clusters is the one with the maximum silhouette coefficient. A word cloud, a form of text data visualization, is used to determine the dominant topic in each cluster. The results show that the silhouette coefficient for fuzzy c-means clustering is maximized at three clusters. The first cluster yields a word cloud about road conditions (449 reports), the second a word cloud about COVID assistance (964 reports), and the third a word cloud about farmers' fertilizer (176 reports). COVID assistance is the report topic with the most members.
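The iterative center refinement and the silhouette-based choice of cluster count described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fuzzy_c_means` and `silhouette`, the fuzzifier `m = 2`, the convergence tolerance, and the toy 2-D points (standing in for the numeric document vectors produced by text mining) are all assumptions made for illustration. The silhouette here is computed on hard labels obtained by assigning each point to its highest-membership cluster, which is also an assumption.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for a list of equal-length vectors."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        u.append([v / sum(row) for v in row])
    centers = []
    for _ in range(max_iter):
        # Center update: mean of all points weighted by membership ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            centers.append([sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
                            for d in range(dim)])
        # Membership update from inverse distances to every center.
        new_u = []
        for p in points:
            inv = [max(math.dist(p, ctr), 1e-12) ** (-2.0 / (m - 1.0))
                   for ctr in centers]
            new_u.append([v / sum(inv) for v in inv])
        change = max(abs(a - b) for ra, rb in zip(u, new_u)
                     for a, b in zip(ra, rb))
        u = new_u
        if change < tol:
            break
    return centers, u


def silhouette(points, labels):
    """Mean silhouette of a hard labelling (0.0 if fewer than 2 clusters)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, p in enumerate(points):
        def mean_dist(k):
            d = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] == k and j != i]
            return sum(d) / len(d) if d else None
        a = mean_dist(labels[i])
        if a is None:  # singleton cluster: silhouette taken as 0
            scores.append(0.0)
            continue
        b = min(mean_dist(k) for k in clusters if k != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
scores = {}
for c in (2, 3):
    _, u = fuzzy_c_means(points, c)
    labels = [row.index(max(row)) for row in u]  # harden memberships
    scores[c] = silhouette(points, labels)
best_c = max(scores, key=scores.get)  # cluster count with the highest silhouette
print(scores, best_c)
```

Each row of the returned membership matrix sums to 1, so a report near the boundary between two topics keeps partial membership in both instead of being forced into one.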

 

Keywords: LaporGub, Text Mining, Clustering, Fuzzy C-Means, Silhouette Coefficient, Wordcloud

