Academic paper

Design of Analog-AI Hardware Accelerators for Transformer-based Language Models (Invited)
Document Type
Conference
Source
2023 International Electron Devices Meeting (IEDM), pp. 1-4, Dec. 2023
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Fields, Waves and Electromagnetics
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Training
Semiconductor device modeling
Nonvolatile memory
Computational modeling
Transformers
Throughput
Energy efficiency
In-memory computing
Non-volatile memory
large language models
analog multiply-accumulate for DNN inference
analog AI
deep learning accelerator
system modeling
Language
English
ISSN
2156-017X
Abstract
Analog Non-Volatile Memory-based accelerators offer high-throughput and energy-efficient Multiply-Accumulate operations for the large Fully-Connected layers that dominate Transformer-based Large Language Models. We describe architectural, wafer-scale testing, chip-demo, and hardware-aware training efforts towards such accelerators, and quantify the unique raw-throughput and latency benefits of Fully-(rather than Partially-) Weight-Stationary systems.
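To illustrate the analog multiply-accumulate concept the abstract refers to, the sketch below models a Fully-Connected layer mapped onto a weight-stationary analog NVM crossbar: signed weights are stored as differential conductance pairs, programming noise is added once at mapping time, and the multiply-accumulate is a single matrix-vector product over the stored conductances. The layer sizes, noise level, and differential-pair encoding are illustrative assumptions, not details taken from the paper.

# Minimal sketch (assumed parameters, not the paper's implementation):
# a Fully-Connected layer held stationary in an analog NVM crossbar,
# with weights encoded as conductance pairs (G+ - G-) and the MAC
# performed as one analog read over the whole array.
import numpy as np

rng = np.random.default_rng(0)

def program_crossbar(weights, g_max=1.0, noise_std=0.02):
    # Map signed weights to differential conductance pairs; noise_std
    # models programming error and is an assumed value.
    w = weights / np.max(np.abs(weights))        # normalize into conductance range
    g_pos = np.clip(w, 0, None) * g_max
    g_neg = np.clip(-w, 0, None) * g_max
    g_pos += rng.normal(0.0, noise_std, g_pos.shape)
    g_neg += rng.normal(0.0, noise_std, g_neg.shape)
    return g_pos, g_neg

def analog_mac(g_pos, g_neg, activations):
    # Activations drive the rows; column currents sum (Kirchhoff's law),
    # and the differential pair recovers signed results.
    return activations @ (g_pos - g_neg)

# Toy Fully-Connected layer: 512 inputs -> 256 outputs.
W = rng.standard_normal((512, 256))
x = rng.standard_normal(512)

g_pos, g_neg = program_crossbar(W)
y_analog = analog_mac(g_pos, g_neg, x)
y_ideal = x @ (W / np.max(np.abs(W)))
print("relative MAC error:", np.linalg.norm(y_analog - y_ideal) / np.linalg.norm(y_ideal))

Because the weights stay resident in the array, a fully weight-stationary mapping avoids reloading weights between tokens, which is the source of the raw-throughput and latency benefits the abstract quantifies.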