Academic Paper

Efficient Inference of Image-Based Neural Network Models in Reconfigurable Systems with Pruning and Quantization
Document Type
Conference
Source
2022 IEEE International Conference on Image Processing (ICIP), pp. 2491-2495, Oct. 2022
Subject
Computing and Processing
Signal Processing and Analysis
Quantization (signal)
Embedded systems
Image coding
Computational modeling
Artificial neural networks
Libraries
Field programmable gate arrays
FPGA
quantization
pruning
inference
Language
English
ISSN
2381-8549
Abstract
Neural networks (NNs) for image processing in embedded systems face two conflicting requirements: increasing computing power needs as models become more complex, and a constrained resource budget. To alleviate this problem, model compression based on quantization and pruning techniques is common. The derived models then need to fit on reconfigurable systems such as FPGAs for the embedded system to work properly. In this paper, we present HLSinf, an open-source framework for the development of custom NN accelerators for FPGAs which provides efficient support for quantized and pruned NN models. With HLSinf, significant inference speedups can be obtained for typical medical image-based applications. In particular, we obtain a speedup factor of up to 90x when we combine quantization/pruning with the flexibility of HLSinf, compared to a CPU.
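The compression pipeline the abstract describes, pruning and quantizing a model before deployment, can be sketched in plain PyTorch. The snippet below is not HLSinf's API and does not target an FPGA; the toy layer sizes, the 50% sparsity level, and the use of dynamic int8 quantization are illustrative assumptions only.

```python
# A minimal sketch of magnitude pruning followed by 8-bit quantization,
# assuming PyTorch; HLSinf's own FPGA flow is not reproduced here.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy stand-in for an image-based NN model (hypothetical sizes).
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Pruning: zero out the 50% of weights with the smallest L1 magnitude,
# then remove the reparameterization so the sparsity becomes permanent.
for module in model:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")

# Quantization: convert Linear layers to dynamic int8, shrinking the
# model and reducing inference cost.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    out = quantized(torch.randn(1, 784))
print(out.shape)  # torch.Size([1, 10])
```

In a flow such as the one the paper targets, a compressed model like this would then be mapped onto a reconfigurable accelerator rather than executed on the CPU as shown here.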