Academic Paper

An Empirical Framework for Malware Prediction Using Multi-Layer Perceptron
Document Type
Conference
Source
2023 OITS International Conference on Information Technology (OCIT), pp. 485-490, Dec. 2023
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
General Topics for Engineers
Power, Energy and Industry Applications
Signal Processing and Analysis
Training
Adaptation models
Predictive models
Prediction algorithms
Feature extraction
Malware
Data models
Classification Techniques
Data Imbalance Methods
Feature Selection
Malware Classification
Multilayer perceptron
Abstract
In today’s interconnected world, where sensitive data and financial transactions are exchanged and stored online, the growing prevalence of sophisticated, evolving malware poses a significant threat to the security of computer systems and network infrastructure. Accurate malware classification is therefore pivotal to defending against these attacks. Recent malware variants use sophisticated packing and obfuscation techniques to gain unauthorised access to valuable information, which makes improved security an urgent need. In this paper, we conduct an empirical study of the impact of data balancing, feature selection, and dimensionality reduction, together with the application of the proposed MLP (multi-layer perceptron), on malware prediction. The proposed MLP-based prediction model is efficient and effective because it can learn complex patterns across different malware families, allowing it to make accurate predictions about new, unseen malware patterns. The experimental results indicate that prediction models trained on sample data using a single-layer MLP with the Adaptive Moment Estimation (Adam) algorithm and GA-selected features have better predictive ability. The results also confirm that models trained using the 51.44% of features selected by the GA have predictive ability similar to that of models trained on all features.
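As a rough illustration of the kind of model the abstract names, the following is a minimal sketch of an MLP with one hidden layer trained with Adam updates, written in plain Python. It is not the authors' implementation. The toy two-feature dataset, the tanh hidden units, the hidden-layer size, and all hyperparameters (`n_hidden`, `steps`, `lr`, `beta1`, `beta2`, `eps`) are assumptions chosen only for the example, and the data balancing and GA feature selection steps from the study are not shown.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(params, x, n_hidden):
    """Return (hidden activations, output probability) for one sample.

    Flat parameter layout: W1 (n_hidden * n_in), b1 (n_hidden), W2 (n_hidden), b2 (1).
    """
    n_in = len(x)
    o1 = n_in * n_hidden  # start of b1
    o2 = o1 + n_hidden    # start of W2
    h = [math.tanh(sum(params[j * n_in + i] * x[i] for i in range(n_in)) + params[o1 + j])
         for j in range(n_hidden)]
    p = sigmoid(sum(params[o2 + j] * h[j] for j in range(n_hidden)) + params[o2 + n_hidden])
    return h, p


def loss_and_grad(params, X, y, n_hidden):
    """Mean binary cross-entropy and its gradient via backpropagation."""
    n_in = len(X[0])
    o1 = n_in * n_hidden
    o2 = o1 + n_hidden
    grad = [0.0] * len(params)
    loss = 0.0
    for x, t in zip(X, y):
        h, p = forward(params, x, n_hidden)
        p_safe = min(max(p, 1e-12), 1.0 - 1e-12)
        loss -= t * math.log(p_safe) + (1 - t) * math.log(1.0 - p_safe)
        d_out = p - t  # gradient w.r.t. the output pre-activation
        grad[o2 + n_hidden] += d_out
        for j in range(n_hidden):
            grad[o2 + j] += d_out * h[j]
            d_h = d_out * params[o2 + j] * (1.0 - h[j] ** 2)  # tanh derivative
            grad[o1 + j] += d_h
            for i in range(n_in):
                grad[j * n_in + i] += d_h * x[i]
    n = len(X)
    return loss / n, [g / n for g in grad]


def train_adam(X, y, n_hidden=4, steps=300, lr=0.05,
               beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Full-batch training with Adam; returns parameters and loss history."""
    rng = random.Random(seed)
    n_params = len(X[0]) * n_hidden + 2 * n_hidden + 1
    params = [rng.uniform(-0.5, 0.5) for _ in range(n_params)]
    m = [0.0] * n_params  # first-moment (mean) estimates
    v = [0.0] * n_params  # second-moment (uncentred variance) estimates
    history = []
    for t in range(1, steps + 1):
        loss, grad = loss_and_grad(params, X, y, n_hidden)
        history.append(loss)
        for k in range(n_params):
            m[k] = beta1 * m[k] + (1.0 - beta1) * grad[k]
            v[k] = beta2 * v[k] + (1.0 - beta2) * grad[k] ** 2
            m_hat = m[k] / (1.0 - beta1 ** t)  # bias correction
            v_hat = v[k] / (1.0 - beta2 ** t)
            params[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params, history


def make_toy_data(n_per_class=30, seed=1):
    # Hypothetical stand-in for malware features: two Gaussian clusters.
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)])
            y.append(label)
    return X, y


N_HIDDEN = 4
X, y = make_toy_data()
params, history = train_adam(X, y, n_hidden=N_HIDDEN)
predictions = [1 if forward(params, x, N_HIDDEN)[1] >= 0.5 else 0 for x in X]
accuracy = sum(int(p == t) for p, t in zip(predictions, y)) / len(y)
print(f"initial loss {history[0]:.4f}, final loss {history[-1]:.4f}, train accuracy {accuracy:.2f}")
```

In a pipeline like the one the abstract describes, the rows of `X` would presumably be restricted to the selected feature subset before training. Here `X` is synthetic, so the printed accuracy says nothing about the paper's results.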