Academic Paper

NLP and ML Synergy: A Novel Approach in Botnet Detection from Sandbox Artifacts
Document Type
Conference
Source
2024 ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS), pp. 1679-1684, Jan. 2024
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Robotics and Control Systems
Botnet
Transforms
Machine learning
Feature extraction
Natural language processing
Computer crime
Sustainable development
cybersecurity
botnets
machine learning
datasets
natural language processing
Abstract
The advent of ubiquitous internet access has led to a proliferation of cyber threats. Among these, botnets represent a significant and growing menace to cybersecurity. Addressing this challenge necessitates the development of potent botnet detection methods. Traditional approaches to botnet detection have predominantly relied on a range of features derived from static or dynamic analysis. This paper presents a novel approach to botnet detection, utilizing Natural Language Processing (NLP), a branch of machine learning (ML), for a more effective analysis. By analyzing behavioral reports through NLP methodologies, including bag-of-words (BoW), BERT, GloVe, and word2vec, we generate rich datasets for ML applications. This combination of NLP and ML techniques transforms behavioral data into valuable detection features. Our application of these techniques, reinforced by the XGBoost classifier, demonstrates exceptional results in botnet detection, achieving an accuracy of 99.17% and a ROC AUC score of 0.9995. These findings highlight the critical role of NLP in enhancing feature extraction and the effectiveness of ML in combating botnet threats.
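
To make the bag-of-words step mentioned in the abstract concrete, the following is a minimal sketch of how a behavioral report might be turned into a count vector. It is not the authors' implementation: the report strings, the whitespace tokenizer, and the function names `tokenize`, `build_vocabulary`, and `bag_of_words` are all assumptions introduced here for illustration, and the abstract does not say how reports were tokenized.

```python
from collections import Counter

# Hypothetical sandbox report snippets, invented for illustration only
# (they are not taken from the paper's dataset).
reports = [
    "CreateFile RegSetValue connect connect send",
    "CreateFile ReadFile CloseHandle",
]


def tokenize(report):
    """Split a report into lowercase, whitespace-delimited tokens (an assumed scheme)."""
    return report.lower().split()


def build_vocabulary(docs):
    """Map each distinct token to a column index, sorted for reproducibility."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def bag_of_words(report, vocabulary):
    """Return a count vector for one report; unknown tokens are ignored."""
    vector = [0] * len(vocabulary)
    for tok, count in Counter(tokenize(report)).items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = count
    return vector


vocabulary = build_vocabulary(reports)
features = [bag_of_words(report, vocabulary) for report in reports]
```

Each row of `features` is a fixed-length numeric vector, which is the kind of input a gradient-boosted classifier such as XGBoost expects. The embedding-based methods named in the abstract (BERT, GloVe, word2vec) would replace the count vector with dense vectors but fill the same role in the pipeline.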