Academic Paper

Proposing Appropriate SMS Spam Detection Approaches for Variations of the Vietnamese Language
Document Type
Conference
Source
2023 RIVF International Conference on Computing and Communication Technologies (RIVF), pp. 1-5, Dec. 2023
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Fields, Waves and Electromagnetics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Support vector machines
Deep learning
Machine learning algorithms
Manuals
Transformers
Data models
Communications technology
SMS Spam
Vietnamese SMS Spam
machine learning
deep learning
transfer learning
PhoBert
Language
ISSN
2473-0130
Abstract
This paper proposes methods for detecting SMS spam in different written forms of Vietnamese. The authors evaluated five algorithms (SVM, Naive Bayes, Random Forests, CNN, and LSTM) on three Vietnamese datasets. The results indicate that LSTM and CNN, complemented by a transfer learning model called PhoBert, outperformed the traditional machine learning models. Specifically, the LSTM model achieved the highest accuracy, 97.77%, on the Vietnamese dataset with full diacritics. Similarly, the CNN model and the PhoBert model achieved the highest accuracy, 95.56%, on the Vietnamese dataset without diacritics.
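
The abstract distinguishes Vietnamese text written with full diacritics from text written without them. As a rough illustration of how a diacritic-free variant of a message could be derived, the following is a minimal sketch using only the Python standard library. The function name `strip_diacritics` and the sample message are hypothetical and are not taken from the paper, which does not describe its preprocessing in this record.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return `text` with Vietnamese tone and vowel marks removed.

    Hypothetical helper for illustration only; not the authors' pipeline.
    """
    # "đ" and "Đ" have no canonical decomposition, so NFD leaves them
    # unchanged; map them to plain "d" and "D" explicitly.
    text = text.replace("đ", "d").replace("Đ", "D")
    # NFD splits each accented letter into a base letter followed by
    # one or more combining marks (Unicode category "Mn").
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Recompose whatever remains into NFC form.
    return unicodedata.normalize("NFC", stripped)


if __name__ == "__main__":
    # Made-up sample message, not drawn from any of the paper's datasets.
    print(strip_diacritics("Chúc mừng bạn đã trúng thưởng"))
```

Applying such a function to every message would yield a parallel corpus without diacritics, on which the same classifiers could then be trained and compared.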