Academic Journal Article

Deep Feature Compression Using Spatio-Temporal Arrangement Toward Collaborative Intelligent World
Document Type
Periodical
Source
IEEE Transactions on Circuits and Systems for Video Technology (IEEE Trans. Circuits Syst. Video Technol.). 32(6):3934-3946 Jun, 2022
Subject
Components, Circuits, Devices and Systems
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Image coding
Correlation
Image edge detection
Video compression
Cloud computing
Quantization (signal)
Collaborative intelligence
Deep feature compression
deep neural network
spatio-temporal arrangement
ordering search algorithm
Language
ISSN
1051-8215
1558-2205
Abstract
Collaborative intelligence is a new paradigm that splits a deep neural network (DNN) between the edge and the cloud to deploy a DNN-based image recognition application. In this paradigm, deep features, which are the outputs of the edge DNN, are compressed and transmitted to the cloud DNN. Because many of the responses in the deep features are similar to each other, previous methods arrange the deep features spatially and compress them as an image, exploiting that similarity as spatial correlation. However, if the deep features are arranged not only spatially but also temporally, like the frames of a video, it may be possible to compress them more efficiently by increasing temporal correlation. To explore this possibility, we propose a “spatio-temporal arrangement”. This method arranges the deep features spatially as images and temporally as a video, using a novel ordering search algorithm. It increases the spatial and temporal correlations hidden in the deep features and achieves higher compression efficiency than the previous methods. Experimental results demonstrate that the compression efficiency of our method is better than that of the previous methods (by 1.50% to 4.98% in BD-Rate evaluation in a lossy setting). Our analysis shows that the method increases the correlation effectively when the input is an image with rich edges and textures.
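
The arrangement idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the abstract does not describe the ordering search algorithm, so a simple greedy nearest-neighbour ordering by mean squared error stands in for it here. The function names (`mse`, `greedy_channel_order`, `tile_into_frames`, `arrange`) and the raster-order tiling are assumptions made for illustration only.

```python
def mse(a, b):
    """Mean squared error between two equally sized 2-D channels (lists of rows)."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def greedy_channel_order(channels):
    """Hypothetical stand-in for an ordering search.

    Starts at channel 0 and repeatedly appends the unvisited channel that is
    closest (lowest MSE) to the last chosen one. Ties go to the lower index.
    """
    if not channels:
        return []
    order = [0]
    remaining = set(range(1, len(channels)))
    while remaining:
        last = channels[order[-1]]
        nxt = min(remaining, key=lambda i: (mse(channels[i], last), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def tile_into_frames(channels, rows, cols):
    """Tile channels in raster order into frames of rows x cols channels each.

    Each frame is one 2-D image; the list of frames plays the role of a video.
    """
    if not channels:
        return []
    per_frame = rows * cols
    if len(channels) % per_frame != 0:
        raise ValueError("channel count must be a multiple of rows * cols")
    height = len(channels[0])
    frames = []
    for start in range(0, len(channels), per_frame):
        group = channels[start:start + per_frame]
        frame = []
        for r in range(rows):
            for y in range(height):
                row = []
                for c in range(cols):
                    row.extend(group[r * cols + c][y])
                frame.append(row)
        frames.append(frame)
    return frames


def arrange(channels, rows, cols):
    """Reorder channels by similarity, then tile them into a frame sequence."""
    order = greedy_channel_order(channels)
    return order, tile_into_frames([channels[i] for i in order], rows, cols)


# Four constant 2x2 channels; similar ones (0 and 1, 10 and 11) end up adjacent.
channels = [[[v, v], [v, v]] for v in (0, 10, 1, 11)]
order, frames = arrange(channels, rows=1, cols=2)
print(order)
print(frames[0])
```

In a real pipeline the resulting frames would be quantized and passed to a video codec. The order would also have to be known at the cloud side so the original channel layout can be restored. Both steps are outside this sketch.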