Academic Paper

An 176.3 GOPs Object Detection CNN Accelerator Emulated in a 28nm CMOS Technology
Document Type
Conference
Source
2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), pp. 1-4, Jun. 2021
Subject
Bioengineering
Components, Circuits, Devices and Systems
Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Semiconductor device modeling
Emulation
Object detection
Bandwidth
CMOS technology
Real-time systems
Engines
Object Detection
YOLO
Convolution Neural Network
Hardware Accelerator
System-on-Chip
SoC
Language
Abstract
Object detection methods are an important subject in the implementation of artificial intelligence systems. Recent research presents many attempts to build real-time object detection hardware/software in an SoC. However, due to the demands on both memory bandwidth and parallel computing resources, only a few designs can fulfill the real-time requirement, which is important for applications or payloads such as drones, UAVs, and autonomous vehicles. In this paper, issues in the design of an SoC-based object detection system are discussed, and an ongoing hardware design based on the YOLO algorithm and the ARC platform is presented. With optimization of both the number of processing elements and the memory bandwidth, a 176.3 GOPs CNN accelerator with 30 fps performance at 400 MHz is presented. In addition to the object detection engine, a ZCA image preprocessor and an NMS postprocessor are also proposed to simplify the corresponding CNN model and enhance the real-time performance of object detection. The emulation results and demonstration videos on the ARC platform are presented. Post-layout simulation of the current design verified the targeted real-time performance in a 28nm CMOS technology.
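The abstract names NMS (non-maximum suppression) postprocessing but does not describe how it is implemented. For orientation only, the following is a minimal software sketch of the standard greedy NMS procedure commonly paired with YOLO-style detectors. It is not the paper's hardware design; the function names, the corner-based box format (x1, y1, x2, y2), and the 0.5 IoU threshold are all assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first.

    The threshold value is an assumed default, not taken from the paper.
    """
    # Visit candidates in order of decreasing confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two overlap heavily, the third is separate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed and only the first and third boxes are kept.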