Academic Paper
Proposing Appropriate SMS Spam Detection Approaches for Variations of the Vietnamese Language
Document Type
Conference
Author
Source
2023 RIVF International Conference on Computing and Communication Technologies (RIVF), pp. 1-5, Dec. 2023
Subject
Language
ISSN
2473-0130
Abstract
This paper proposes methods for detecting SMS spam in different written forms of Vietnamese. The authors evaluated five algorithms (SVM, Naive Bayes, Random Forests, CNN, and LSTM) on three Vietnamese datasets. The results indicate that LSTM and CNN, together with the pretrained transformer model PhoBERT, outperformed the traditional machine learning models. Specifically, the LSTM model achieved the highest accuracy, 97.77%, on the Vietnamese dataset with full diacritics, while the CNN and PhoBERT models achieved the highest accuracy, 95.56%, on the Vietnamese dataset without diacritics.
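The abstract distinguishes Vietnamese text with full diacritics from text without diacritics. This record does not say how the diacritic-free dataset was built, so the following is only a minimal sketch of one common way to derive such a variant from fully accented text, using Unicode decomposition from the Python standard library. The function name `strip_vietnamese_diacritics` is hypothetical and not taken from the paper.

```python
import unicodedata


def strip_vietnamese_diacritics(text: str) -> str:
    """Return text with Vietnamese tone and vowel marks removed.

    Illustrative assumption only: the paper's actual preprocessing
    is not described in this record.
    """
    # "đ" and "Đ" have no canonical decomposition, so map them by hand.
    text = text.replace("đ", "d").replace("Đ", "D")
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark (tone marks, circumflex, breve, horn).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose so the result is in the usual NFC form.
    return unicodedata.normalize("NFC", stripped)


if __name__ == "__main__":
    print(strip_vietnamese_diacritics("Tiếng Việt"))  # Tieng Viet
```

Applying such a function to every message would yield a diacritic-free copy of a corpus, which could then be fed to the same classifiers for comparison.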