Academic Paper

Split-and-Shuffle Detector for Real-Time Traffic Object Detection in Aerial Image
Document Type
Periodical
Source
IEEE Internet of Things Journal IEEE Internet Things J. Internet of Things Journal, IEEE. 11(8):13312-13326 Apr, 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
Detectors
Autonomous aerial vehicles
Object detection
Real-time systems
Feature extraction
Internet of Things
Task analysis
Deep learning
real-time object detection
traffic scene perception
unmanned aerial vehicle (UAV) image
Language
ISSN
2327-4662
2372-2541
Abstract
Real-time object detection is an essential part of various Internet of Things (IoT) applications. Unmanned aerial vehicles (UAVs) employ visual sensors to capture high-definition images to detect objects of interest. However, current research on UAV detectors mainly focuses on developing more sophisticated network architectures, with little attention paid to the limitations of UAV computing resources. In this work, we present an end-to-end split-and-shuffle detector, named SCSDet. Unlike mainstream detector designs, which rely heavily on bottleneck structures, our method is based on inexpensive split-and-shuffle operations. It encourages the detector to avoid unnecessary transformation layers for channel down-sampling, thereby minimizing memory and computation cost. This is rarely studied in detector architecture design. Specifically, we first design a lightweight backbone structure (SCSNet) based on split-and-shuffle, which allows frequent interaction between different gradient information to capture more useful nonlinear features for small-scale objects at a low cost. Next, we construct an efficient receptive field module (ERFM) to generate richer multi-receptive-field expressions for the initial feature space. It significantly alleviates the adverse effects of a single receptive field size on the ability of detectors to detect small-scale objects. Finally, we propose a grouped local attention convolution (GLAConv), which utilizes local sliding windows with different coverage rates to adaptively learn channel and spatial attention. This makes the detector focus on the foreground. Experimental results show that our method achieves high accuracy with low complexity in UAV object detection.
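The abstract does not specify how SCSNet arranges its split-and-shuffle operations, so the following is only a minimal sketch of the generic idea as it appears in ShuffleNet-style networks: split the channels into two groups, transform only one group, then interleave the channels so that information mixes across groups. The function names (`channel_split`, `channel_shuffle`, `split_and_shuffle_block`), the 50/50 split ratio, and the use of plain Python lists to stand in for feature-map channels are all assumptions made for illustration, not the authors' implementation.

```python
def channel_split(channels, ratio=0.5):
    """Split a list of channels into two groups (hypothetical helper)."""
    k = int(len(channels) * ratio)
    return channels[:k], channels[k:]


def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to viewing the channels as a (groups, per_group) grid
    and reading that grid column by column.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


def split_and_shuffle_block(channels, transform, groups=2):
    """Transform only one half of the channels, then shuffle (assumed layout)."""
    kept, processed = channel_split(channels)
    # Only the second half pays the cost of the transformation.
    processed = [transform(c) for c in processed]
    # Concatenate and shuffle so both halves mix in the next block.
    return channel_shuffle(kept + processed, groups)


if __name__ == "__main__":
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))
    print(split_and_shuffle_block(["a", "b", "c", "d"], str.upper))
```

Because half of the channels bypass the transformation, a block of this kind needs no extra layer to reduce the channel count before the expensive operation, which is the cost saving the abstract attributes to avoiding bottleneck structures.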