Academic Paper

Eris: Fault Injection and Tracking Framework for Reliability Analysis of Open-Source Hardware
Document Type
Conference
Source
2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 210-220, May 2022
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Signal Processing and Analysis
Rockets
Target tracking
Fault location
Reliability engineering
Hardware
Registers
Reliability
fault-injection
RISC-V
safety-critical
resilience
fault-tracking
Abstract
As transistors have been scaled over the past decade, modern systems have become increasingly susceptible to faults. Increased transistor densities and lower capacitances make a particle strike more likely to cause an upset. At the same time, complex computer systems are increasingly integrated into safety-critical systems such as autonomous vehicles. These two trends make the study of system reliability and fault tolerance essential for modern systems. To analyze and improve system reliability early in the design process, new tools are needed for RTL fault analysis.

This paper proposes Eris, a novel framework to identify vulnerable components in hardware designs through fault injection and fault-propagation tracking. Eris builds on ESSENT, a fast C/C++ RTL simulation framework, to provide fault injection, fault tracking, and control-flow deviation detection capabilities for RTL designs. To demonstrate Eris’ capabilities, we analyze the reliability of the open-source Rocket Chip SoC by randomly injecting faults during thousands of runs on four microbenchmarks. As part of this analysis, we measure the sensitivity of different hardware structures to faults based on the likelihood of a random fault causing silent data corruption, unrecoverable data errors, program crashes, and program hangs. We detect control-flow deviations and determine whether they are benign. Additionally, using Eris’ novel fault-tracking capabilities, we find 78% more vulnerable components in the same number of simulations than RTL-based fault-injection techniques without these capabilities. We will release Eris as an open-source tool to aid future research into processor reliability and hardening.
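The random fault-injection campaign described in the abstract can be illustrated with a small sketch. This is not Eris, ESSENT, or Rocket Chip code: it assumes a made-up four-register toy machine that sums an array, and every name in it (`run`, `classify`, `campaign`, `vulnerability`, the register names) is hypothetical. It flips one random bit in one random register at one random cycle, compares the outcome against a fault-free "golden" run, and tallies outcomes per register. It does not model the fault-tracking or control-flow deviation detection that the paper contributes, and it omits the unrecoverable-data-error category.

```python
import random
from collections import Counter

WIDTH = 8                          # toy register width in bits (assumption)
MASK = (1 << WIDTH) - 1
DATA = [3, 1, 4, 1, 5, 9, 2, 6]    # toy workload: sum these values
REGS = ("ptr", "end", "step", "acc")
MAX_CYCLES = 64                    # cycle budget; exceeding it counts as a hang
INJECT_WINDOW = len(DATA) + 1      # number of cycles in the fault-free run


def run(fault=None):
    """Simulate the toy machine. fault is (cycle, register, bit) or None."""
    regs = {"ptr": 0, "end": len(DATA), "step": 1, "acc": 0}
    for cycle in range(MAX_CYCLES):
        if fault is not None and fault[0] == cycle:
            regs[fault[1]] ^= 1 << fault[2]      # single-bit upset
        if regs["ptr"] == regs["end"]:
            return "done", regs["acc"]           # normal termination
        if regs["ptr"] >= len(DATA):
            return "crash", None                 # out-of-bounds read
        regs["acc"] = (regs["acc"] + DATA[regs["ptr"]]) & MASK
        regs["ptr"] = (regs["ptr"] + regs["step"]) & MASK
    return "hang", None                          # cycle budget exhausted


def classify(fault, golden):
    """Label one faulty run as masked, sdc, crash, or hang."""
    status, value = run(fault)
    if status != "done":
        return status
    return "masked" if value == golden else "sdc"


def campaign(n_runs, seed=0):
    """Inject n_runs random single-bit faults; tally outcomes per register."""
    rng = random.Random(seed)
    _, golden = run()
    tallies = {reg: Counter() for reg in REGS}
    for _ in range(n_runs):
        fault = (rng.randrange(INJECT_WINDOW), rng.choice(REGS),
                 rng.randrange(WIDTH))
        tallies[fault[1]][classify(fault, golden)] += 1
    return tallies


def vulnerability(counts):
    """Fraction of injected faults in a register that were not masked."""
    total = sum(counts.values())
    return (total - counts["masked"]) / total if total else 0.0


if __name__ == "__main__":
    for reg, counts in campaign(2000).items():
        print(reg, dict(counts), round(vulnerability(counts), 3))
```

In this toy, a flip of bit 0 in `step` before the loop finishes stalls the pointer and is reported as a hang, while a flip in `end` typically sends the pointer past the array and is reported as a crash. A real RTL campaign would draw injection sites from the design's actual state elements rather than from a hand-written register list.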