학술논문
Malware Detection Using Byte Streams of Different File Formats
Document Type
Periodical
Author
Source
IEEE Access Access, IEEE. 10:51041-51047 2022
Subject
Language
ISSN
2169-3536
Abstract
Malware detection is becoming more important task as we face more data on the Internet. Web users are vulnerable to non-executable files such as Word files and Hangul Word Processor files because they usually open such files without paying attention. As new infected non-executables keep appearing, deep-learning models are drawing attention because they are known to be effective and have better generalization power. Especially, the deep-learning models have been used to learn arbitrary patterns from byte streams, and they exhibited successful performance on malware detection task. Although there have been malware detection studies using the deep-learning models, they commonly aimed at a single file format and did not take using different formats into consideration. In this paper, we assume that different file formats may contribute to each other, and deep-learning models will have a better chance to learn more promising patterns for better performance. We demonstrate that this assumption is possible by experimental results with our annotated datasets of two different file formats (e.g., Portable Document Format (PDF) and Hangul Word Processor (HWP)).