Academic Paper

Autotuning LSTM for Accelerated Execution on Edge
Document Type
Conference
Source
2021 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-5, May 2021
Subject
Components, Circuits, Devices and Systems
Keywords
Privacy; Runtime; Computational modeling; Search methods; Integrated circuit modeling; Artificial intelligence; Long short term memory; Deep Learning; Edge computing; ASR; NLP; Halide; Compiler
Language
ISSN
2158-1525
Abstract
Deployment of Deep Neural Networks (DNNs) on edge devices is highly desirable to address user privacy concerns and to minimize the turnaround time of AI applications. However, executing DNN models on a battery-operated device requires a highly optimized implementation specific to the target hardware. Moreover, since different layers of a DNN exhibit distinct computation and memory characteristics, it is imperative to optimize each layer separately. This is in contrast to the widely deployed library-based approach, where all configurations of a DNN operation share the same implementation. In this paper, we address this issue by autotuning the implementation of Long Short-Term Memory (LSTM) operations, which are widely used in sequence-based AI applications. To exhaustively search the space of optimizations and their parameters, we develop a high-level autotuning framework based on Halide. We use grid search to find the parameters that lead to minimum runtime, and further present a TPE-based search method that finds a near-optimal runtime in a limited number of trials. We observe a 2.2×–3.1× speedup in execution time for the LSTM layers used in the widely deployed GNMT and DeepSpeech2 models.
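The exhaustive grid search the abstract describes can be sketched in plain Python. The schedule parameters below (tile size, vector width, unroll factor) and the `measure_runtime` cost model are illustrative stand-ins, not the authors' actual framework; a real autotuner would compile a Halide schedule with each candidate configuration and time it on the target edge device.

```python
import itertools

def measure_runtime(tile, vec, unroll):
    # Stand-in cost model for illustration only. In the paper's setting,
    # this step would build the LSTM kernel with the given Halide schedule
    # parameters and measure its wall-clock execution time on the device.
    return abs(tile - 64) * 0.01 + abs(vec - 8) * 0.05 + abs(unroll - 4) * 0.02 + 1.0

def grid_search(tiles, vecs, unrolls):
    # Try every combination of schedule parameters and keep the fastest.
    best_cfg, best_time = None, float("inf")
    for cfg in itertools.product(tiles, vecs, unrolls):
        t = measure_runtime(*cfg)
        if t < best_time:
            best_cfg, best_time = cfg, t
    return best_cfg, best_time

best_cfg, best_time = grid_search([16, 32, 64, 128], [4, 8, 16], [1, 2, 4, 8])
print(best_cfg, best_time)
```

Grid search guarantees the minimum over the enumerated space but grows multiplicatively with each parameter; the TPE-based search mentioned in the abstract replaces the exhaustive loop with a sequential model-based optimizer that proposes promising configurations, reaching near-optimal runtime in far fewer trials.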