Academic Paper

34.8 A 22nm 16Mb Floating-Point ReRAM Compute-in-Memory Macro with 31.2TFLOPS/W for AI Edge Devices
Document Type
Conference
Source
2024 IEEE International Solid-State Circuits Conference (ISSCC), vol. 67, pp. 580-582, Feb. 2024
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Engineered Materials, Dielectrics and Plasmas
Photonics and Electrooptics
Robotics and Control Systems
Phase change materials
Program processors
Nonvolatile memory
Interference
In-memory computing
Batteries
System-on-chip
Language
English
ISSN
2376-8606
Abstract
AI-edge devices demand high-precision computation (e.g., FP16 and BF16) for accurate inference in practical applications, while maintaining high energy efficiency (EF) and low standby power to prolong battery life. Thus, advanced non-volatile AI-edge processors [1, 2] require non-volatile compute-in-memory (nvCIM) [3–5] with a large non-volatile on-chip memory, to store all of the neural network's parameters (weight data) during power-off, and high-precision, high-EF multiply-and-accumulate (MAC) operations during compute, to maximize battery life. Among nvCIMs, ReRAM-nvCIM stands out as a promising candidate due to its lowest cost-per-bit (vs. MRAM, PCM, and eFlash), large on-off ratio, and resilience to magnetic-field interference. However, existing nvCIM macros [3–5] do not support floating-point (FP) computation. Implementing an FP-MAC for nvCIM faces challenges, as shown in Fig. 34.8.1, in (1) balancing the weight pre-alignment bit width tradeoff between accuracy and storage, (2) addressing the long latency and energy consumption of MAC operations caused by the high input bit width of FP formats, and (3) managing the high array current consumed when accessing numerous memory cells (MCs) for FP operations, particularly low-resistance-state (LRS) ReRAM cells.
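The weight pre-alignment tradeoff in challenge (1) can be illustrated with a short numerical sketch. The general idea behind FP-MAC in CIM macros (not the paper's specific circuit, which is not detailed in this abstract) is to decompose FP weights into sign, exponent, and mantissa, shift all mantissas in a group to a shared maximum exponent, and then run the MAC as a pure fixed-point operation inside the array; mantissa bits shifted past the stored bit width are lost, which is exactly the accuracy-versus-storage tension the abstract names. The `mant_bits` parameter and function names below are illustrative assumptions, not from the paper.

```python
import math

def prealign_weights(weights, mant_bits=11):
    """Shift all weight mantissas to the group's max exponent.

    Returns integer mantissas plus the shared exponent, so the MAC
    can run in the fixed-point domain. Bits shifted below the
    mant_bits window are truncated (the storage/accuracy tradeoff).
    """
    # frexp(w) gives (m, e) with w = m * 2**e and |m| in [0.5, 1).
    exps = [math.frexp(w)[1] for w in weights if w != 0.0]
    e_max = max(exps) if exps else 0
    aligned = []
    for w in weights:
        if w == 0.0:
            aligned.append(0)
            continue
        m, e = math.frexp(w)
        # Quantize the mantissa to mant_bits, then right-shift by the
        # exponent gap; the shifted-out bits are the alignment loss.
        aligned.append(int(round(m * (1 << mant_bits))) >> (e_max - e))
    return aligned, e_max

def fp_mac(inputs, weights, mant_bits=11):
    """Dot product using pre-aligned integer weight mantissas."""
    aligned, e_max = prealign_weights(weights, mant_bits)
    acc = sum(x * a for x, a in zip(inputs, aligned))
    # Rescale the fixed-point accumulation back to FP.
    return acc * math.ldexp(1.0, e_max - mant_bits)
```

With an 11-bit mantissa window (FP16-like), `fp_mac([1.0, 2.0, 0.5, 3.0], [0.75, -0.125, 1.5, 0.0])` recovers the exact dot product 1.25; shrinking `mant_bits` saves storage but truncates more of the small-exponent weights, degrading accuracy.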