Academic Paper

Standard Latent Space Dimension for Network Intrusion Detection Systems Datasets

Document Type

Periodical

Author

Moyano, R.F.; Duque, A.; Riofrio, D.; Perez, N.; Benitez, D.; Baldeon-Calisto, M.; Fernandez, D.

Source

IEEE Access. 11:57240-57252, 2023

Subject

Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Machine learning
Dimensionality reduction
Feature extraction
Telecommunication traffic
Artificial intelligence
Network intrusion detection
Communication networks
Standardization
autoencoder
latent space
network security

Language

ISSN

2169-3536

Abstract

Machine learning is a branch of artificial intelligence that gives computers the ability to create or improve algorithms by learning directly from data, without being explicitly programmed. It is widely used for automation and decision-making tasks in fields such as image and speech recognition, sentiment analysis, and self-driving cars. However, its application to communication networks is limited by the lack of appropriate research resources, such as rich training datasets and a standard set of features. In this context, a standard latent space dimension is proposed, obtained through an autoencoder-based dimensionality reduction process. Different network security datasets are projected onto a lower-dimensional space to determine a standard, or convergent, dimension. The convergent dimension is the threshold above which increasing the latent space dimension yields diminishing returns in the autoencoder loss. Experimental validation showed that four machine learning classification models trained on a standard latent space of ten dimensions performed as well, in terms of F1-score and accuracy, as models trained on the non-reduced versions of the datasets. Furthermore, a Wilcoxon statistical test showed that the mean accuracy of all classification models trained with the standard latent space dimension differed by less than 0.0235 from that of the models trained with the original inputs. This negligible difference in accuracy is a significant outcome: researchers can run experiments using only the latent space, with confidence that the performance of machine learning models will not be constrained.
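The diminishing-returns criterion described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact rule, so everything here is an assumption for illustration: the function name `convergent_dimension`, the relative-improvement threshold `rel_tol`, and the loss values, which are made up and are not results from the paper. The sketch takes autoencoder losses already measured at several latent sizes and returns the first size after which the loss barely improves.

```python
from typing import Sequence


def convergent_dimension(
    dims: Sequence[int], losses: Sequence[float], rel_tol: float = 0.05
) -> int:
    """Return the first latent size after which the loss stops improving meaningfully.

    dims must be increasing, and losses[i] is the reconstruction loss
    measured with a latent space of size dims[i].
    rel_tol is a hypothetical threshold, not a value from the paper.
    """
    if len(dims) != len(losses) or len(dims) < 2:
        raise ValueError("need at least two (dimension, loss) pairs of equal length")
    for i in range(len(dims) - 1):
        if losses[i] <= 0:
            # Loss is already zero; a larger latent space cannot help.
            return dims[i]
        # Relative loss reduction gained by moving to the next larger dimension.
        gain = (losses[i] - losses[i + 1]) / losses[i]
        if gain < rel_tol:
            return dims[i]
    # No plateau found within the tested range.
    return dims[-1]


# Synthetic loss curve for illustration only (not data from the paper).
example_dims = [2, 4, 6, 8, 10, 12, 14]
example_losses = [0.200, 0.120, 0.070, 0.040, 0.020, 0.0195, 0.0193]
chosen = convergent_dimension(example_dims, example_losses)
print(chosen)
```

A relative threshold is used here so that the rule does not depend on the scale of the loss. An absolute threshold or a curvature-based elbow detector would be equally plausible readings of the abstract.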

