Naive Bayes Classifier Optimization on Sentiment Analysis of Hotel Reviews
Abstract
Feature extraction plays an important role in sentiment analysis, especially for text data. The Naive Bayes Classifier (NBC) performs well on low-dimensional feature spaces, but its accuracy is not optimal. To obtain an optimal machine learning model, information gain methods, evolutionary algorithms, and swarm intelligence algorithms are applied. The objective of this study is to determine how well Particle Swarm Optimization (PSO) optimizes the Naive Bayes Classifier. Words are vectorized using TF-IDF. To obtain high PSO performance, the PSO-NBC model is tested with several parameters: the number of particles (k = 3), settings for the number of iterations and the inertia weight, the individual intelligence coefficient (c1 = 1), and the social intelligence coefficient (c2 = 2). The inertia weight is calculated as w = 0.5 + Rand([-1, 1]). In conclusion, PSO is able to handle the problem space of text-based sentiment analysis. PSO optimizes the accuracy of Naive Bayes, with values of 89% to 91.76%. PSO performance is determined by the parameters used, especially the number of particles, the number of iterations, and the inertia weight. A large number of particles combined with an increase in inertia weight can increase accuracy. Optimal accuracy is reached with 20-30 particles.
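The PSO update described in the abstract can be illustrated with a minimal sketch. It uses the stated coefficients (c1 = 1, c2 = 2) and the inertia weight w = 0.5 + Rand([-1, 1]). Everything else is an assumption made for illustration and is not taken from the study: the function names `pso_minimize` and `toy_error`, the search bounds, drawing w once per iteration, clamping positions to the bounds, and the toy objective, which merely stands in for the classification error of a Naive Bayes model trained on TF-IDF features.

```python
import random

def pso_minimize(objective, dim, n_particles=20, n_iters=30,
                 c1=1.0, c2=2.0, bounds=(-5.0, 5.0), seed=0):
    """Minimal PSO sketch; returns (best position, best value, history)."""
    rng = random.Random(seed)
    lo, hi = bounds  # assumed search range, not from the study
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal best positions
    pbest_val = [objective(p) for p in pos]     # personal best values
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(n_iters):
        # Inertia weight from the abstract; redrawn each iteration (assumption).
        w = 0.5 + rng.uniform(-1.0, 1.0)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])   # individual term
                             + c2 * r2 * (gbest[d] - pos[i][d]))     # social term
                # Clamp to the bounds to keep the sketch stable (assumption).
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history

def toy_error(x):
    # Hypothetical stand-in for classifier error (lower is better).
    return sum(v * v for v in x)

best_pos, best_val, history = pso_minimize(toy_error, dim=3)
```

In a setting like the one the abstract describes, `toy_error` would be replaced by a function that trains and evaluates the classifier for a given particle position. The particle count and iteration count are plain arguments, so the effect of varying them can be compared directly.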
Article Details
JPPI provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.
JPPI by MCIT/Kemenkominfo is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://kominfo.go.id/.