Academic Paper

Poster: Fast GPU Inference with Unstructurally Pruned DNNs for Explainable DOC
Document Type
Conference
Source
2023 IEEE/ACM Symposium on Edge Computing (SEC), pp. 273-275, Dec. 2023
Subject
Communication, Networking and Broadcast Technologies
unstructured pruning at initialization
Synaptic Flow
compiler
SparseRT
TensorRT
GPU
MVTec AD
CutPaste
Language
ISSN
2837-4827
Abstract
We have developed a code compiler to compress unstructurally pruned DNN models and demonstrated inference times below 1 msec with AUC accuracy above 90% on an anomaly detection task, using the MVTec AD dataset and edge Graphics Processing Unit (GPU) devices. A reduced RepVGG convolutional neural network (CNN) architecture is applied to an explainable deep one-class classification (XDOC) algorithm, and such fast inference is obtained without sacrificing accuracy by using a training scheme, CutPaste, which keeps accuracy high even under extremely high pruning rates.
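For readers unfamiliar with the pruning style the abstract refers to: unstructured pruning removes individual weights (rather than whole channels or filters), which is why a specialized compiler such as SparseRT is needed to turn the resulting irregular sparsity into GPU speedups. Below is a minimal, hypothetical NumPy sketch of element-wise pruning by global magnitude; it is not the authors' Synaptic Flow pipeline or compiler, only an illustration of what an unstructured sparsity mask looks like. The function name `unstructured_prune` and the example layer shapes are assumptions for illustration.

```python
import numpy as np

def unstructured_prune(weights, sparsity):
    """Return binary masks that zero out the smallest-magnitude weights
    globally across all layers (element-wise, i.e. unstructured pruning).

    weights  -- list of NumPy weight arrays (one per layer)
    sparsity -- fraction of all weights to remove, in [0, 1)
    """
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    k = int(sparsity * all_mags.size)  # number of weights to remove
    if k == 0:
        return [np.ones_like(w) for w in weights]
    # k-th smallest magnitude becomes the global pruning threshold
    threshold = np.partition(all_mags, k - 1)[k - 1]
    return [(np.abs(w) > threshold).astype(w.dtype) for w in weights]

# Example: two random layers pruned to 90% global sparsity
rng = np.random.default_rng(0)
layers = [rng.standard_normal((4, 4)), rng.standard_normal((8, 2))]
masks = unstructured_prune(layers, 0.9)
kept = sum(int(m.sum()) for m in masks)
print(kept, sum(m.size for m in masks))  # only ~10% of weights survive
```

In an actual pipeline the masks would be applied multiplicatively to the weights (at initialization, in the SynFlow-style setting the keywords suggest), and the surviving nonzeros would then be compiled into sparse GPU kernels.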