Academic Article

On Continuing DNN Accelerator Architecture Scaling Using Tightly Coupled Compute-on-Memory 3-D ICs
Document Type
Periodical
Source
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 31(10):1603-1613, Oct. 2023
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Memory management
Computer architecture
Throughput
System-on-chip
Random access memory
Energy efficiency
Physical design
3-D accelerator physical design
3-D bond pitch study for accelerators
high-performance 3-D accelerator
multitier 3-D ML accelerator
Language
English
ISSN
1063-8210
1557-9999
Abstract
This work identifies the architectural and design scaling limits of 2-D flexible interconnect deep neural network (DNN) accelerators and addresses them with 3-D ICs. We demonstrate how scaling up a baseline 2-D accelerator in the X/Y dimension fails and how vertical stacking effectively overcomes the failure. We designed multitier accelerators that are 1.67× faster than the 2-D design. Using our 3-D architecture and circuit codesign methodology, we improve throughput, energy efficiency, and area efficiency by up to 5×, 1.2×, and 3.9×, respectively, over 2-D counterparts. The IR drop in our 3-D designs is within 10.7% of VDD, and the temperature variation is within 12 °C.