Academic Paper

Infrared Drone Target Recognition Based on Global-Local Context Aggregation with Self-Attention
Document Type
Conference
Source
2024 5th International Conference on Machine Learning and Computer Application (ICMLCA), pp. 119-126, Oct. 2024
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Context-aware services
Accuracy
Target recognition
Urban areas
Object detection
Autonomous aerial vehicles
Feature extraction
Real-time systems
Security
Drones
Infrared Drone Target Recognition
Contextual Information
Feature Extraction and Fusion
Self-Attention
Language
Abstract
As drone usage increases, the security risks linked to unauthorized flights have become more evident. Infrared imaging detection technology, recognized for its robust anti-interference capability and continuous surveillance, is effective for monitoring drone activities in urban areas. To tackle the issues arising from the weak infrared signatures of drones and their close resemblance to background elements, which lead to low real-time detection and recognition rates, we created a detailed urban infrared drone target dataset and introduced the Infrared Unmanned Aerial Vehicle You Only Look Once (IUAV-YOLO) network model, based on global-local context aggregation with self-attention. Our method starts with a Backbone Feature Extraction Module (BFEM) aimed at preserving semantic information for small targets within deep networks, while a multi-branch convolution broadens the receptive field to improve local perception. We also present a self-attention context-aware module, Spatial Pyramid Pooling Fast with Global Average and Max pooling (SPPF-GAM), to combine multi-scale contextual features and enhance spatial awareness. Finally, the Spatial Context Aware Module (SCAM) and the CSP Bottleneck with Two Convolutions (C2f) module are linked to merge global and local context across both the channel and spatial dimensions, effectively capturing the subtle features of small drones against intricate backgrounds. Experimental findings show that the IUAV-YOLO model improves the mean Average Precision (mAP@0.5) by 4.3% compared to the baseline model, while also decreasing the number of parameters by 433,636. Overall, the proposed network demonstrates significant benefits over other leading real-time detection algorithms, making it an asset for enhancing security in urban settings.
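The abstract does not specify how SPPF-GAM fuses its pooled global descriptors. As an illustrative sketch only, the NumPy snippet below shows one common way to combine global average pooling and global max pooling into a channel-attention gate, in the spirit of the "Global Average Max" naming; the function name, the additive fusion, and the sigmoid gating are assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def global_avg_max_attention(feat):
    """Reweight channels using pooled global context.

    feat: (C, H, W) non-negative feature map; returns a map of the same shape.
    Illustrative sketch only -- the actual SPPF-GAM design is not given here.
    """
    avg = feat.mean(axis=(1, 2))          # global average pooling -> (C,)
    mx = feat.max(axis=(1, 2))            # global max pooling -> (C,)
    weights = sigmoid(avg + mx)           # fuse descriptors into per-channel gates
    return feat * weights[:, None, None]  # rescale each channel map

feat = np.random.rand(8, 16, 16).astype(np.float32)
out = global_avg_max_attention(feat)
print(out.shape)  # (8, 16, 16)
```

In attention modules of this family, the average-pooled branch summarizes overall channel response while the max-pooled branch keeps the strongest local activation, which is why small, dim targets benefit from combining both.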