Academic Article

Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model
Document Type
Periodical
Source
IEEE Access, vol. 12, pp. 8846-8874, 2024
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Cloud computing
Classification algorithms
Computational modeling
Feature extraction
Training
Telecommunication traffic
Support vector machines
Hybrid metaheuristic approach
GOA-GA-based feature selection
UNSW-NB15
CIC-DDoS2019
CIC Bell DNS EXF 2021
Language
ISSN
2169-3536
Abstract
Cloud computing has become increasingly widespread across various domains. However, its inherent security vulnerabilities pose significant risks to its overall safety. Consequently, intrusion detection systems (IDS) play a pivotal role in identifying malicious activities within a cloud system. The considerable volume of network traffic data may contain redundant and irrelevant features that can degrade classification performance. In addition, processing such a large volume of data increases the complexity and run time of cloud intrusion detection. To enhance the performance of the IDS, this study proposes a hybrid feature selection approach that combines two bio-inspired algorithms, the grasshopper optimization algorithm (GOA) and the genetic algorithm (GA). Combining the two algorithms ensures a more efficient search for optimal solutions. A random forest (RF) classifier is then trained on the selected features. Moreover, the proposal addresses the challenge of imbalanced data with a hybrid approach: over-sampling the minority classes using the adaptive synthetic (ADASYN) algorithm, while applying random under-sampling (RUS) to the majority class as needed. This integrated strategy benefits every class, raising the true positive rate (TPR) while reducing the false positive rate (FPR), and thus improves overall system performance. The proposed approach was evaluated on three datasets: UNSW-NB15, CIC-DDoS2019, and CIC Bell DNS EXF 2021. The recorded accuracies for these datasets were 98%, 99%, and 92%, respectively. The hybrid feature selection-based IDS demonstrated superior performance in multi-class classification and strong results for the individual classes within the datasets.
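The wrapper-style feature selection described in the abstract can be illustrated with a minimal sketch. It shows only a plain genetic algorithm over binary feature masks; the GOA component and the way the paper couples GOA with GA are not reproduced here. The function names, the parameter values, and `toy_fitness` are all hypothetical: `toy_fitness` is a stand-in for what would presumably be a classifier's validation score in a real pipeline.

```python
import random

def toy_fitness(mask, relevant=frozenset({0, 3, 5}), penalty=0.05):
    """Hypothetical stand-in for a classifier's validation score.

    Rewards masks that select the 'relevant' feature indices and charges a
    small penalty per selected feature (score vs. subset-size trade-off).
    """
    hits = sum(1 for i in relevant if mask[i])
    return hits / len(relevant) - penalty * sum(mask)

def tournament(pop, fitness, rng):
    """Return the fitter of two distinct, randomly chosen individuals."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def ga_select(n_features, fitness, pop_size=20, generations=30,
              crossover_rate=0.9, mutation_rate=0.05, seed=0):
    """Plain GA over bit masks (1 = keep feature). Assumes n_features >= 2."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation, applied independently to each position.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

if __name__ == "__main__":
    best_mask = ga_select(n_features=8, fitness=toy_fitness)
    print(best_mask, toy_fitness(best_mask))
```

Because the best mask is copied unchanged into each new generation, the best fitness never decreases. In practice the fitness call dominates the cost, since each evaluation would train and score a classifier on the candidate subset.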
With the random forest classifier, the proposed strategy outperformed other classifiers, including SVM, LR, FLN, LSTM, AlexNet, DNN, DBN, DT, and XGBoost. Its performance also remained consistent when benchmarked against contemporary state-of-the-art methods across multiple evaluation metrics.
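For readers unfamiliar with the per-class metrics named in the abstract, the following minimal sketch shows how TPR and FPR are commonly derived for each class of a multi-class confusion matrix using a one-vs-rest view. The function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
def per_class_tpr_fpr(confusion):
    """Return [(tpr, fpr), ...], one pair per class, one-vs-rest.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    rates = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # class k, missed
        fp = sum(confusion[i][k] for i in range(n)) - tp  # others flagged as k
        tn = total - tp - fn - fp
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        rates.append((tpr, fpr))
    return rates

if __name__ == "__main__":
    # Made-up counts for three classes, for illustration only.
    example = [[8, 2, 0],
               [1, 9, 0],
               [0, 1, 9]]
    for label, (tpr, fpr) in enumerate(per_class_tpr_fpr(example)):
        print(f"class {label}: TPR={tpr:.2f} FPR={fpr:.2f}")
```

Reporting both rates per class matters on imbalanced data, where overall accuracy can stay high even when a rare attack class is mostly missed.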