
TOPIC MODELING OF NETFLIX APPLICATION REVIEWS ON THE GOOGLE PLAY STORE USING LATENT DIRICHLET ALLOCATION

*Gina Rosalinda  -  Departemen Statistika, Fakultas Sains dan Matematika, Universitas Diponegoro, Indonesia
Rukun Santoso  -  Departemen Statistika, Fakultas Sains dan Matematika, Universitas Diponegoro, Indonesia
Puspita Kartikasari  -  Departemen Statistika, Fakultas Sains dan Matematika, Universitas Diponegoro, Indonesia
Open Access. Copyright 2022 Jurnal Gaussian, licensed under CC BY-NC-SA 4.0 (http://creativecommons.org/licenses/by-nc-sa/4.0).

Abstract

The vast amount of review data available on the Google Play Store can be mined to extract hidden, essential information. These reviews are unstructured, so particular methods are required to collect and analyze them automatically. Topic modeling is a branch of text analysis that can uncover the main themes or trends hidden in large sets of unstructured documents. This study applies topic modeling with the Latent Dirichlet Allocation (LDA) method to Netflix application review data sourced from the Google Play Store. LDA is a generative probabilistic model for textual data that can reveal the hidden semantic themes in the review documents. This research aims to analyze the hidden topics that application users discuss. These hidden topics contain valuable information for both Netflix users and the company: users can draw on it when deciding whether to subscribe to Netflix services, while Netflix can use it to improve the quality of its services. The data were obtained by web scraping Netflix reviews on the Google Play Store from January 2021 to August 2021. The results of topic modeling show that, of the twelve topics generated, the topic most discussed by users is payment methods.
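For readers who want to reproduce the general workflow described in the abstract, the sketch below fits an LDA model to tokenized review texts and scores it with topic coherence using gensim. The paper does not name its tooling, so the library choice, the preprocessing, and the toy documents are assumptions for illustration only, not the authors' actual code.

```python
# Minimal sketch (assumed tooling: gensim) of LDA topic modeling with
# coherence evaluation; the toy documents stand in for preprocessed reviews.
from gensim import corpora
from gensim.models import LdaModel, CoherenceModel

# Placeholder for tokenized, stopword-filtered Netflix reviews.
texts = [
    ["payment", "method", "credit", "card", "failed"],
    ["subscription", "price", "payment", "monthly"],
    ["movie", "series", "quality", "subtitle"],
]

# Map tokens to integer ids and build the bag-of-words corpus.
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

# Fit LDA; the paper reports twelve topics, selected via topic coherence.
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=12,
               passes=10, random_state=42)

# Topic coherence (c_v) can be compared across candidate numbers of topics.
coherence = CoherenceModel(model=lda, texts=texts, dictionary=dictionary,
                           coherence="c_v").get_coherence()

print(lda.print_topics(num_words=5))
print("coherence:", coherence)
```

In practice the number of topics would be chosen by repeating the fit over a range of candidate values and keeping the one with the highest coherence score.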

Keywords: Topic Modeling; Latent Dirichlet Allocation; Topic Coherence; Netflix; Google Play Store.


