Academic Paper

Quantization for Bayesian Deep Learning: Low-Precision Characterization and Robustness
Document Type
Conference
Source
2023 IEEE International Symposium on Workload Characterization (IISWC), pp. 180-192, Oct. 2023
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Power, Energy and Industry Applications
Deep learning
Uniform resource locators
Quantization (signal)
Uncertainty
Codes
Computational modeling
Neural networks
Language
English
ISSN
2835-2238
Abstract
Bayesian deep learning is an emerging field for building robust and trustworthy AI systems because it can estimate reliable uncertainty in neural networks. The need to model distributions over parameters and to run multiple Monte Carlo forward passes in Bayesian neural networks leads to larger model sizes and a significant increase in inference latency compared to deterministic models, which poses challenges for practical deployment. Quantization can reduce model size and speed up inference through low-precision computation. In this work, we propose and evaluate a quantization framework and workflow for Bayesian deep learning workloads that leverages 8-bit integer (INT8) operations to accelerate inference on the 4th Gen Intel Xeon Scalable processor (formerly codenamed Sapphire Rapids). We demonstrate that our quantization workflow achieves a 6.9x inference throughput speedup on the ImageNet benchmark without sacrificing model accuracy or the quality of uncertainty estimates. Furthermore, we evaluate the effects of quantization on Bayesian neural networks with respect to generalizability, robustness against data drift, and uncertainty estimation on large-scale datasets, including a real-world safety-critical application. Our code has been integrated into an open-source project and is available on GitHub at https://github.com/IntelLabs/bayesian-torch.
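To make the two ingredients the abstract combines concrete, here is a minimal, self-contained sketch of symmetric per-tensor INT8 quantization applied inside Monte Carlo forward passes of a toy one-layer Bayesian "network". This is not the paper's actual workflow (which uses the Bayesian-Torch library and hardware INT8 kernels on Xeon processors); the function names, the Gaussian weight posterior, and the single-layer model are all illustrative assumptions.

```python
import random
import statistics

def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from INT8 values and the scale factor."""
    return [v * scale for v in q]

def mc_predict(weight_mean, weight_std, x, num_samples=50, rng=None):
    """Monte Carlo inference for a toy linear 'network':
    sample weights from a Gaussian posterior (an assumption for this sketch),
    quantize each sample to INT8, and return the predictive mean and
    standard deviation (the uncertainty estimate)."""
    rng = rng or random.Random(0)
    outputs = []
    for _ in range(num_samples):
        w = [rng.gauss(m, s) for m, s in zip(weight_mean, weight_std)]
        q, scale = quantize_int8(w)
        w_deq = dequantize(q, scale)
        outputs.append(sum(wi * xi for wi, xi in zip(w_deq, x)))
    return statistics.mean(outputs), statistics.stdev(outputs)

mean, std = mc_predict([0.5, -0.3], [0.05, 0.05], [1.0, 2.0])
```

The per-tensor quantization error is bounded by half the scale factor per weight, which is why (as the paper reports for the real workloads) the predictive distribution can survive INT8 inference largely intact.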