Journal Article

An Empirical Study on Adaptive Inference for Pretrained Language Model
Document Type
Periodical
Source
IEEE Transactions on Neural Networks and Learning Systems, 34(8):4321-4331, Aug. 2023
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
General Topics for Engineers
Adaptation models
Bit error rate
Task analysis
Inference mechanisms
Transformers
Computational modeling
Mathematical models
Adaptive inference
bidirectional encoder representations from transformers (BERT)
distillation
FastPLM
pretrained language model (PLM)
Language
English
ISSN
2162-237X (Print)
2162-2388 (Electronic)
Abstract
Adaptive inference has been proven to improve the inference speed of bidirectional encoder representations from transformers (BERT) with minimal loss of accuracy. However, existing work focuses only on the BERT model and does not explore other pretrained language models (PLMs). This article therefore conducts an empirical study of the adaptive inference mechanism across various PLMs, including generative pretraining (GPT), GCNN, ALBERT, and TinyBERT. The mechanism is evaluated on both English and Chinese benchmarks, and the experimental results demonstrate that it achieves speedups ranging from 1 to 10 times depending on the chosen speed threshold. In addition, its application to ALBERT shows that adaptive inference is compatible with parameter sharing, achieving model compression and acceleration simultaneously, while its application to TinyBERT shows that it can further accelerate an already distilled small model. To address the problem that a large label set can render adaptive inference ineffective, this article also proposes a solution, namely label reduction. Finally, this article open-sources an easy-to-use toolkit called FastPLM to help developers adopt pretrained models with adaptive inference capabilities in their applications.
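For context, the adaptive inference mechanism studied here attaches a small classifier to each encoder layer and stops the forward pass as soon as the intermediate prediction is sufficiently certain, so easy inputs exit early while hard inputs use more layers. Below is a minimal PyTorch sketch of this idea; the function name, the normalized-entropy exit criterion, and the exit_classifiers argument are illustrative assumptions, not FastPLM's actual API.

    import torch

    def adaptive_inference(layers, exit_classifiers, hidden, threshold=0.3):
        # Hypothetical sketch: run encoder layers sequentially and exit early
        # once the per-layer classifier's prediction is certain enough
        # (normalized entropy below `threshold`). `layers` maps [B, T, D] to
        # [B, T, D]; each classifier maps the [CLS] state [B, D] to logits.
        probs = None
        for layer, classifier in zip(layers, exit_classifiers):
            hidden = layer(hidden)                  # one transformer layer
            logits = classifier(hidden[:, 0])       # predict from [CLS] token
            probs = torch.softmax(logits, dim=-1)
            # Normalized entropy in [0, 1] as the uncertainty measure.
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
            entropy = entropy / torch.log(torch.tensor(float(probs.size(-1))))
            if entropy.max().item() < threshold:    # confident: stop early
                break
        return probs

Under this scheme the threshold directly controls the speed-accuracy trade-off: a value near 0 demands near-total certainty and so runs all layers, while larger values exit sooner, which is consistent with the 1x-10x speedup range the abstract reports for different speed thresholds.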