Academic Paper

Automated Microsoft Office macro malware detection using machine learning

Document Type

Conference

Author

Source

2017 IEEE International Conference on Big Data (Big Data), pp. 4448-4452, Dec. 2017

Subject

Aerospace
Bioengineering
Computing and Processing
General Topics for Engineers
Geoscience
Signal Processing and Analysis
Transportation
Malware
Feature extraction
Security
Machine learning algorithms
Testing
Classification algorithms
macro
malware
Microsoft Office
machine learning
p-code

Language

Abstract

Macro malware in Microsoft (MS) Office files has long persisted as a cybersecurity threat. Though it ebbed after its initial rampages around the turn of the century, it has reemerged as a threat. Attackers are taking a persuasive approach and using document engineering, aided by improved data mining methods, to make MS Office file malware appear legitimate. Recent attacks have targeted specific corporations with malicious documents containing unusually relevant information. This development undermines the ability of users to distinguish between malicious and legitimate MS Office files and intensifies the need for automating macro malware detection. This study proposes a method of classifying MS Office files containing macros as malicious or benign using the K-Nearest Neighbors machine learning algorithm, feature selection, and TF-IDF, where p-code opcode n-grams (translated VBA macro code) compose the file features. This study achieves a 96.3% file classification accuracy on a sample set of 40 malicious and 118 benign MS Office files containing macros, and it demonstrates the effectiveness of this approach as a potential defense against macro malware. Finally, it discusses the challenges automated macro malware detection faces and possible solutions.
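
The pipeline named in the abstract (opcode n-grams, TF-IDF weighting, K-Nearest Neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the opcode sequences are made-up toy data, the bigram size (n=2), k=3, the smoothed IDF formula, and cosine similarity are all assumptions chosen for illustration, and the feature selection step mentioned in the abstract is omitted.

```python
import math
from collections import Counter

def ngrams(opcodes, n=2):
    """Return the opcode n-grams (as tuples) of one macro's opcode sequence."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]

def fit_idf(docs):
    """Smoothed inverse document frequency for every n-gram seen in training."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each n-gram once per document
    total = len(docs)
    return {g: math.log((1 + total) / (1 + c)) + 1.0 for g, c in df.items()}

def tfidf(doc, idf):
    """Map one document to an L2-normalised sparse TF-IDF vector (a dict)."""
    counts = Counter(g for g in doc if g in idf)  # unseen n-grams are dropped
    vec = {g: c * idf[g] for g, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {g: v / norm for g, v in vec.items()} if norm else {}

def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(v * b.get(g, 0.0) for g, v in a.items())

def knn_predict(query, train_vecs, train_labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy opcode sequences (hypothetical; real p-code dumps are far longer).
TRAIN = [
    (["LitStr", "ArgsLd", "ArgsCall", "LitStr", "ArgsLd", "ArgsCall"], "malicious"),
    (["LitStr", "ArgsLd", "ArgsCall", "St"], "malicious"),
    (["Ld", "LitStr", "ArgsLd", "ArgsCall"], "malicious"),
    (["Ld", "LitDI2", "Add", "St", "Ld", "LitDI2", "Add", "St"], "benign"),
    (["Ld", "LitDI2", "Add", "St"], "benign"),
    (["Ld", "LitDI2", "Mul", "St"], "benign"),
]

train_docs = [ngrams(seq) for seq, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
idf = fit_idf(train_docs)
train_vecs = [tfidf(doc, idf) for doc in train_docs]

def classify(opcodes, k=3):
    """Classify one opcode sequence against the toy training set."""
    return knn_predict(tfidf(ngrams(opcodes), idf), train_vecs, train_labels, k)

print(classify(["LitStr", "ArgsLd", "ArgsCall"]))
print(classify(["Ld", "LitDI2", "Add", "St"]))
```

In this sketch each file is reduced to a sparse vector keyed by opcode bigram, so files that share characteristic opcode patterns end up close together regardless of variable names or string contents in the original VBA source.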
