K-Harmonic Means Clustering Method with Silhouette Coefficient Validation (Case Study: Four Main Factors Causing Stunting in 34 Provinces of Indonesia in 2018)

Silvy ‘Aina Salsabila; Tatik Widiharih; Sudarno Sudarno

doi:10.14710/j.gauss.v11i1.34003

Silvy ‘Aina Salsabila - Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia

*Tatik Widiharih - Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia

Sudarno Sudarno - Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia

BibTeX Citation Data:

@article{J.Gauss34003,
    author = {Silvy Salsabila and Tatik Widiharih and Sudarno Sudarno},
    title = {METODE K-HARMONIC MEANS CLUSTERING DENGAN VALIDASI SILHOUETTE COEFFICIENT (Studi Kasus :  Empat Faktor Utama Penyebab Stunting 34 Provinsi di Indonesia Tahun 2018)},
    journal = {Jurnal Gaussian},
    volume = {11},
    number = {1},
    year = {2022},
    keywords = {Clustering, K-Harmonic Means, Euclidean distance, Silhouette Coefficient, Stunting},
    issn = {2339-2541},
    pages = {11--20},
    doi = {10.14710/j.gauss.v11i1.34003},
    url = {https://ejournal3.undip.ac.id/index.php/gaussian/article/view/34003}
}

Abstract

K-harmonic means is a center-based clustering method: each cluster is represented by a center point, and the centers are determined from the harmonic average of the distances between each data point and every center. The method assigns each data point through a membership function and a weight function, both computed from a distance measure; the weight function increases the importance of data points that are far from every center. This makes k-harmonic means insensitive to the initialization of the cluster centers and significantly improves clustering quality compared with k-means. Similarity is measured here with the Euclidean distance. Because the choice of distance measure can affect the resulting clusters, the quality of the clustering is validated with an internal criterion, the silhouette coefficient. In this study, k-harmonic means is used to group the provinces of Indonesia by the causes of stunting in 2018. The prevalence of stunting among children under five in Indonesia exceeds the limit set by WHO, and from 2016 to 2017 it rose from 27.5% to 29.6%. Clustering the provinces shows which of the four main factors causing stunting stands out in each province, so that prevention and treatment can be targeted. The method is also chosen because the data on the four factors show a significant rate of change across the 34 provinces, and the harmonic average serves as the measure of central tendency.
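
To make the membership and weight functions described above concrete, the following is a minimal Python sketch of the k-harmonic means update in its commonly cited formulation: the membership of a point in cluster j is proportional to d_j^(-p-2), and the weight of a point is the sum of d_j^(-p-2) divided by the square of the sum of d_j^(-p), where d_j is the Euclidean distance to center j. It is not the authors' implementation. The function names, the exponent p = 3.5, the iteration count, the explicit initial centers, and the toy data are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_harmonic_means(points, init_centers, p=3.5, iters=100, eps=1e-8):
    """Illustrative k-harmonic means (names and defaults are assumptions).

    points       -- list of equal-length numeric tuples
    init_centers -- starting centers, one per cluster
    p            -- distance exponent of the harmonic-average objective
    eps          -- floor on distances to avoid division by zero
    """
    centers = [list(c) for c in init_centers]
    k, dim = len(centers), len(points[0])
    for _ in range(iters):
        num = [[0.0] * dim for _ in range(k)]  # weighted coordinate sums
        den = [0.0] * k                        # sums of the coefficients
        for x in points:
            d = [max(euclidean(x, c), eps) for c in centers]
            inv_p = [dj ** -p for dj in d]
            inv_p2 = [dj ** (-p - 2) for dj in d]
            # Weight function: larger for points far from every center.
            weight = sum(inv_p2) / sum(inv_p) ** 2
            for j in range(k):
                # Soft membership of x in cluster j (sums to 1 over j).
                membership = inv_p2[j] / sum(inv_p2)
                coef = membership * weight
                den[j] += coef
                for t in range(dim):
                    num[j][t] += coef * x[t]
        centers = [[num[j][t] / den[j] for t in range(dim)] for j in range(k)]
    # Hard labels: nearest center, which is also the highest membership.
    labels = [min(range(k), key=lambda j: euclidean(x, centers[j]))
              for x in points]
    return centers, labels


# Toy data (hypothetical, not the stunting data): two separated groups.
toy = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers, labels = k_harmonic_means(toy, init_centers=[(2, 2), (9, 9)])
```

In the study's setting, each point would be a province described by its four factor percentages and the number of initial centers would be 3; the sketch takes the initial centers as an argument so that the run is reproducible.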
The four factors are the percentage of households without access to clean drinking water, the percentage of exclusive breastfeeding, the percentage of low birth weight (LBW, under 2,500 grams) babies born safely, and the percentage of households without proper sanitation facilities. The optimal clustering is obtained at k = 3 with the Euclidean distance, where the silhouette coefficient is 0.3040722675 ≈ 0.3. In cluster one, the most prominent factor is the percentage of exclusive breastfeeding. In cluster two, it is the percentage of LBW babies born safely. In cluster three, the most prominent factors are the percentage of LBW babies born safely and the percentage of households without proper sanitation facilities, which have the highest average centroid values among the clusters.
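
The silhouette coefficient reported above can be computed from the data points and their cluster labels alone. The sketch below uses the standard definition: for each point, a is the mean Euclidean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the point's score is (b - a) / max(a, b); the coefficient is the mean score over all points. The function names and the toy data are assumptions for illustration, and scoring singleton clusters as 0 is a common convention rather than something stated in the abstract.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_coefficient(points, labels):
    """Mean silhouette score over all points (illustrative sketch)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(points[i], points[j])
                for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of another cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two tight pairs far apart.
toy = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_coefficient(toy, [0, 0, 1, 1])  # pairs kept together
bad = silhouette_coefficient(toy, [0, 1, 0, 1])   # pairs split apart
```

Values close to 1 indicate compact, well-separated clusters, and negative values indicate points that sit closer to another cluster than to their own. Comparing the coefficient across candidate values of k is one way to arrive at a choice such as k = 3.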

Keywords: Clustering, K-Harmonic Means, Euclidean distance, Silhouette Coefficient, Stunting

Article Info

Section: Articles

Language : ID

In Vol 11, No 1 (2022): Jurnal Gaussian