학술논문

DESEM: Depthwise Separable Convolution-Based Multimodal Deep Learning for In-Game Action Anticipation
Document Type
Periodical
Source
IEEE Access Access, IEEE. 11:46504-46512 2023
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Games
Deep learning
Feature extraction
Convolutional neural networks
Artificial intelligence
Videos
Forecasting
Action anticipation
depthwise separable convolution
game artificial intelligence
multimodal deep learning
weighted loss function
Language
ISSN
2169-3536
Abstract
In real-time strategy (RTS) games, to defeat their opponents, players need to choose and implement the correct sequential actions. Because RTS games like StarCraft II are real-time, players have a very limited time to choose how to develop their strategy. In addition, players can only partially observe the parts of the map that they have explored. Therefore, unlike Chess or Go, players do not know what their opponents are doing. For these reasons, applying generally used artificial intelligence models to forecast sequential actions in RTS games is a challenge. To address this, we propose depthwise separable convolution-based multimodal deep learning (DESEM) for forecasting sequential actions in the game StarCraft II. DESEM performs multimodal learning using high-dimensional frames and action labels simultaneously as inputs. We use a depthwise separable convolution as the backbone network for extracting features from high-dimensional frames. In addition, we propose a weighted loss function to resolve class imbalances. We use 1,978 StarCraft II replays where the Terrans win in a Terran vs. Protoss game. The experimental results show that the proposed depthwise separable convolution is superior to the conventional convolution. Furthermore, we demonstrate that multimodal learning and the weighted loss function contribute significantly to improving forecasting performance.