Academic Paper

BOOST: Block Minifloat-Based On-Device CNN Training Accelerator with Transfer Learning
Document Type
Conference
Source
2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), pp. 1-9, Oct. 2023
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Engineering Profession
General Topics for Engineers
Signal Processing and Analysis
Training
Design automation
Transfer learning
Bandwidth
Throughput
System-on-chip
Task analysis
Language
English
ISSN
1558-2434
Abstract
Adapting CNNs to changing problems is challenging on resource-limited edge devices due to intensive computations, high precision requirements, large storage needs, and high bandwidth demands. This paper presents BOOST, a novel block minifloat (BM)-based parallel CNN training accelerator for transfer learning (TL) on memory- and computation-constrained FPGAs. By updating only a small number of layers online, BOOST enables adaptation to changing problems. Our approach utilizes a unified 8-bit BM datatype (bm(2,5)), i.e., with a sign bit, 2 exponent bits, and 5 mantissa bits, and proposes unified Conv and dilated Conv blocks that support non-unit stride and enable task-level parallelism during back-propagation to minimize latency. For ResNet20 and VGG-like training on the CIFAR-10 and SVHN datasets, BOOST achieves near 32-bit floating-point accuracy while reducing latency by 21%-43% and BRAM usage by 63%-66% compared to back-propagation training without TL. Notably, BOOST outperforms prior state-of-the-art works, achieving per-batch throughputs of 131 and 209 GOPs for ResNet20 and VGG-like, respectively.
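To make the bm(2,5) datatype concrete, the sketch below simulates block minifloat quantization in float32: each block of values shares one exponent bias derived from the block's largest magnitude, and every element is then rounded as a minifloat with a sign bit, 2 exponent bits, and 5 mantissa bits. This is an illustrative assumption-laden sketch, not the paper's hardware datapath; the function name, block size, and rounding/saturation details are all hypothetical, and the paper's actual BM format (bias handling, subnormals, hardware rounding) may differ.

```python
import numpy as np

def quantize_bm25(x, block_size=64):
    """Hypothetical simulation of bm(2,5) block-minifloat quantization.

    Each block of `block_size` values shares an exponent bias taken from
    the block's largest magnitude; elements are rounded as minifloats
    with 1 sign bit, 2 exponent bits, and 5 mantissa bits.
    """
    E_BITS, M_BITS = 2, 5
    e_max = 2 ** E_BITS - 1              # largest per-element exponent code (3)
    out = np.empty(x.shape, dtype=np.float32)
    flat = np.asarray(x, dtype=np.float32).ravel()
    res = out.reshape(-1)
    for start in range(0, flat.size, block_size):
        blk = flat[start:start + block_size]
        max_abs = float(np.abs(blk).max())
        if max_abs == 0.0:
            res[start:start + block_size] = 0.0
            continue
        # Shared bias aligns the block's largest value with the top exponent.
        bias = int(np.floor(np.log2(max_abs))) - e_max
        # Per-element exponent relative to the bias, clamped to 2 bits
        # (e == 0 acts as the subnormal range).
        mag = np.maximum(np.abs(blk), 2.0 ** bias)       # avoid log2(0)
        e = np.clip(np.floor(np.log2(mag)) - bias, 0, e_max)
        # Quantization step: 5 mantissa bits below each element's exponent.
        # Saturation of mantissa overflow is omitted for brevity.
        scale = 2.0 ** (e + bias - M_BITS)
        res[start:start + block_size] = np.round(blk / scale) * scale
    return out

# Example: small gradients quantized to bm(2,5) stay close to fp32 values.
g = np.random.randn(4, 64).astype(np.float32) * 1e-2
print(np.max(np.abs(g - quantize_bm25(g))))
```

The shared per-block bias is what lets an 8-bit element cover a wide dynamic range despite only 2 exponent bits, which is consistent with the abstract's claim of near-fp32 training accuracy at 8-bit storage cost.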