Academic Paper

GRAFT: Graph-Assisted Reinforcement Learning for Automated SSD Firmware Testing
Document Type
Conference
Source
2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), pp. 1-8, Oct. 2023
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Engineering Profession
General Topics for Engineers
Signal Processing and Analysis
Training
Representation learning
Codes
Solid state drives
Closed box
Reinforcement learning
Flow graphs
Automated testing
Backward compatibility
Firmware
Graph representation learning
Solid-state drive
Test case
Language
ISSN
1558-2434
Abstract
Well-designed test cases (TCs) are crucial for ensuring the quality of Solid-State Drive (SSD) products. Validating SSD firmware code with TCs is indispensable for detecting defects during the SSD development process. Accordingly, efficient and precise validation requires short TCs that cover as much of the firmware code as possible. While various methods are available for generating TCs, existing automated approaches overlook backward compatibility, a key property in the SSD development process. To exploit this property, we introduce a novel deep-learning approach called GRAFT, which combines graph representation learning and reinforcement learning (RL) for automated TC generation in SSDs by leveraging pre-collected data. GRAFT trains a graph neural network to extract the underlying structure of the SSD firmware code from a given SSD simulator. The resulting graph embeddings serve as observations in the RL process. To address the challenge of overestimation in an external domain in the RL process, conservative Q-learning, an offline RL technique, is employed using the pre-collected data. Despite being unable to interact with the SSD simulator during training, we demonstrate that GRAFT successfully trains RL agents that generate TCs. These TCs are not only significantly more efficient, being 3.5x shorter than TCs randomly generated by a black-box fuzzer, but also exhibit coverage and efficiency comparable to those created by human experts with domain knowledge, which took a full three days. Moreover, the TCs achieve maximum coverage more reliably than any other method in the experiments.
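To illustrate how conservative Q-learning counters overestimation when training only on pre-collected data, the following is a minimal sketch of the commonly used discrete-action form of its loss: a standard temporal-difference term plus a penalty that pushes down Q-values across all actions while pushing up the Q-values of actions actually present in the dataset. This record does not state which variant GRAFT uses, so the function names `logsumexp` and `cql_loss`, the weight `alpha`, and the input layout are all assumptions made for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_loss(q_values, actions, td_targets, alpha=1.0):
    """Hypothetical discrete-action conservative Q-learning loss.

    q_values   : list of per-sample lists, Q(s, a) for every action a
    actions    : list of action indices taken in the pre-collected data
    td_targets : list of bootstrapped targets, r + gamma * max_a' Q_target(s', a')
    alpha      : weight of the conservative penalty (an assumed hyperparameter)
    """
    n = len(q_values)
    td_term = 0.0
    conservative_term = 0.0
    for q_row, a, y in zip(q_values, actions, td_targets):
        q_data = q_row[a]
        # Standard squared temporal-difference error on the dataset action.
        td_term += 0.5 * (q_data - y) ** 2
        # Penalty: soft maximum over all actions minus the dataset action's value.
        conservative_term += logsumexp(q_row) - q_data
    return td_term / n + alpha * conservative_term / n
```

In this sketch, raising the Q-value of an action that never appears in the dataset increases the penalty, which is what discourages the agent from overvaluing unseen actions.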