Academic Paper

MalBoT-DRL: Malware Botnet Detection Using Deep Reinforcement Learning in IoT Networks
Document Type
Periodical
Source
IEEE Internet of Things Journal (IEEE Internet Things J.). 11(6):9610-9629, Mar. 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
Internet of Things
Botnet
Malware
Data models
Anomaly detection
Feature extraction
Computational modeling
Bashlite
botnet detection
incremental statistics
intrusion detection
Internet of Things (IoT) botnet
MalBoT-DRL
Mirai
network traffic analysis
reinforcement learning (RL)
Torii
Language
ISSN
2327-4662
2372-2541
Abstract
In the dynamic landscape of cyber threats, multistage malware botnets have emerged as a significant concern. These sophisticated threats can exploit Internet of Things (IoT) devices to carry out an array of cyberattacks, ranging from basic infections to complex operations such as phishing, cryptojacking, and distributed denial-of-service (DDoS) attacks. Existing machine learning solutions are often constrained by their limited generalizability across data sets and by their inability to adapt to the changing patterns of malware attacks in real-world environments, a challenge known as model drift. This limitation highlights the pressing need for adaptive intrusion detection systems (IDSs) capable of adjusting to evolving threat patterns and to new or unseen attacks. This article introduces MalBoT-DRL, a robust malware botnet detector based on deep reinforcement learning (RL). Designed to detect botnets throughout their entire lifecycle, MalBoT-DRL offers improved generalizability and resilience to model drift. The model integrates damped incremental statistics with an attention reward mechanism, a combination that has not been extensively explored in the literature. This integration enables MalBoT-DRL to adapt dynamically to the ever-changing malware patterns in IoT environments. The performance of MalBoT-DRL has been validated via trace-driven experiments on two representative data sets, MedBIoT and N-BaIoT, yielding average detection rates of 99.80% and 99.40% in the early and late detection phases, respectively. To the best of our knowledge, this work is one of the first studies to investigate the efficacy of RL in enhancing the generalizability of IDSs.
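The abstract names damped incremental statistics without defining them. As a rough illustration only, the sketch below shows one common formulation from the network traffic analysis literature: a weight, a linear sum, and a squared sum are each multiplied by a decay factor 2^(-lambda * dt) before every new observation is added, so older traffic gradually counts for less. The class name `DampedStat`, the parameter `decay_rate`, and the choice of this particular formulation are assumptions made for illustration; the record does not state which variant MalBoT-DRL uses, and the attention reward mechanism is not sketched here.

```python
class DampedStat:
    """Damped incremental statistics over a timestamped 1-D stream.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, decay_rate):
        self.decay_rate = decay_rate  # lambda: larger values forget faster
        self.weight = 0.0             # decayed count of observations
        self.linear_sum = 0.0         # decayed sum of values
        self.squared_sum = 0.0        # decayed sum of squared values
        self.last_time = None         # timestamp of the previous update

    def update(self, value, timestamp):
        """Decay the stored sums by the elapsed time, then add the new value."""
        if self.last_time is not None:
            elapsed = timestamp - self.last_time
            factor = 2.0 ** (-self.decay_rate * elapsed)
            self.weight *= factor
            self.linear_sum *= factor
            self.squared_sum *= factor
        self.last_time = timestamp
        self.weight += 1.0
        self.linear_sum += value
        self.squared_sum += value * value

    def mean(self):
        """Decayed mean, or 0.0 before any observation."""
        if self.weight == 0.0:
            return 0.0
        return self.linear_sum / self.weight

    def variance(self):
        """Decayed variance, or 0.0 before any observation."""
        if self.weight == 0.0:
            return 0.0
        mean = self.mean()
        # abs() guards against tiny negative values from rounding.
        return abs(self.squared_sum / self.weight - mean * mean)


# Usage: packet sizes (bytes) observed one second apart.
packet_sizes = DampedStat(decay_rate=1.0)
packet_sizes.update(10.0, timestamp=0.0)
packet_sizes.update(20.0, timestamp=1.0)
print(packet_sizes.mean(), packet_sizes.variance())
```

Because each update touches only three running sums, a feature extractor of this kind can keep per-host or per-channel statistics in constant memory, which is one reason the approach suits streaming IoT traffic.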