Academic Journal Article

Emotion Knowledge Driven Video Highlight Detection
Document Type
Periodical
Author
Source
IEEE Transactions on Multimedia, vol. 23, pp. 3999-4013, 2021
Subject
Components, Circuits, Devices and Systems
Communication, Networking and Broadcast Technologies
Computing and Processing
General Topics for Engineers
Visualization
Training data
Predictive models
Semantics
Emotion recognition
Computational modeling
Deep ranking
knowledge graph
video highlight detection
Language
English
ISSN
1520-9210
1941-0077
Abstract
This paper addresses video highlight detection, which aims to select a small subset of frames according to users' major or special interests. The performance of conventional methods depends heavily on large-scale manually labeled training data, which are time-consuming and labor-intensive to collect. To deal with this problem, we trace back to the original problem definition and find that whether a user is interested in a specific video segment depends heavily on humans' subjective emotions. Leveraging this insight, we introduce an emotion-knowledge-driven video highlight detection framework for modeling humans' general emotions and inferring highlight strength. First, we obtain a concept-level representation of the video clip with a front-end network. The concepts are used as nodes to build an emotion-related knowledge graph, and their relationships in the graph are modeled via external public knowledge graphs. Then we adopt Siamese GCNs to model the dependencies between nodes in the graph and propagate messages along the edges. Finally, we compute the emotion-aware representation of the video clip from the GCN layers and use it to predict the highlight score. Our framework, including the front-end network, the graph convolutional layers, and the highlight mapping network, can be trained end-to-end under the constraint of a ranking loss. Experiments on two benchmark datasets show that the proposed method performs favorably against state-of-the-art methods.
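The abstract outlines a pipeline of concept extraction, graph convolution over an emotion-related concept graph, pooling into a clip representation, and pairwise ranking. The sketch below is only an illustration of that general idea, not the authors' implementation: the two-layer GCN depth, the feature dimensions, the mean pooling, the margin ranking loss, and all class and variable names (GCNLayer, HighlightScorer, ranking_loss, a_hat) are assumptions introduced for clarity.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    """One graph convolution over concept nodes: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, a_hat):
        # a_hat: normalized adjacency of the emotion-related concept graph
        # (num_concepts x num_concepts); h: node features (num_concepts x in_dim)
        return F.relu(self.linear(a_hat @ h))

class HighlightScorer(nn.Module):
    """Concept features -> GCN message passing -> pooled clip embedding -> highlight score."""
    def __init__(self, concept_dim=300, hidden_dim=128):
        super().__init__()
        self.gcn1 = GCNLayer(concept_dim, hidden_dim)
        self.gcn2 = GCNLayer(hidden_dim, hidden_dim)
        self.head = nn.Linear(hidden_dim, 1)   # stand-in for the highlight mapping network

    def forward(self, concept_feats, a_hat):
        # concept_feats: concept-level representation from a front-end network
        h = self.gcn2(self.gcn1(concept_feats, a_hat), a_hat)
        clip_emb = h.mean(dim=0)               # emotion-aware clip representation (pooled)
        return self.head(clip_emb)             # scalar highlight score

def ranking_loss(score_pos, score_neg, margin=1.0):
    """Pairwise hinge loss: a highlight clip should outscore a non-highlight clip."""
    return F.relu(margin - score_pos + score_neg)

# Siamese-style usage: the same scorer weights are applied to both clips of a pair.
scorer = HighlightScorer()
n = 10                                          # hypothetical number of concepts per clip
a_hat = torch.eye(n)                            # placeholder for the normalized concept graph
pos, neg = torch.randn(n, 300), torch.randn(n, 300)
loss = ranking_loss(scorer(pos, a_hat), scorer(neg, a_hat))
loss.backward()                                 # end-to-end gradient through GCNs and the score head

Sharing one set of weights across the two branches is what makes the setup Siamese: only the relative ordering of clip scores is supervised, which matches the ranking-loss training described in the abstract.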