Academic Paper

Large Scale Video Representation Learning via Relational Graph Clustering

Document Type

Conference

Author

Lee, Hyodong; Lee, Joonseok; Ng, Joe Yue-Hei; Natsev, Paul

Source

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6806-6815, Jun. 2020

Subject

Computing and Processing
Measurement
Training
Task analysis
Google
Marine vehicles
Clustering algorithms
Face recognition

Language

ISSN

2575-7075

Abstract

Representation learning is widely applied to various tasks on multimedia data, e.g., retrieval and search. One approach to learning useful representations is to exploit the relationships or similarities between examples. In this work, we explore two promising scalable representation learning approaches in the video domain. With hierarchical graph clusters built upon video-to-video similarities, we propose: 1) a smart negative sampling strategy that significantly boosts training efficiency with triplet loss, and 2) a pseudo-classification approach using the clusters as pseudo-labels. The embeddings trained with the proposed methods are competitive on multiple video understanding tasks, including related video retrieval and video annotation. Both proposed methods are highly scalable, as verified by experiments on large-scale datasets.
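
The abstract names cluster-guided negative sampling for triplet loss but gives no procedure, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a two-level hierarchy in which each video has a fine and a coarse cluster id, and it assumes that a useful negative shares the anchor's coarse cluster but not its fine cluster. The names `sample_negative`, `fine`, `coarse`, the toy embeddings, and the margin value are all hypothetical.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard hinge-style triplet loss on squared distances."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def sample_negative(anchor_id, fine, coarse, rng):
    """Assumed rule: prefer a video in the same coarse cluster but a
    different fine cluster; otherwise fall back to any video outside
    the anchor's fine cluster."""
    candidates = [
        v for v in fine
        if v != anchor_id
        and coarse[v] == coarse[anchor_id]
        and fine[v] != fine[anchor_id]
    ]
    if not candidates:
        candidates = [v for v in fine if fine[v] != fine[anchor_id]]
    return rng.choice(candidates)


# Hypothetical toy data: four videos, 2-D embeddings, two cluster levels.
embeddings = {"a": [0.0, 0.0], "b": [0.1, 0.0], "c": [1.0, 0.0], "d": [5.0, 5.0]}
fine = {"a": 0, "b": 0, "c": 1, "d": 2}
coarse = {"a": 0, "b": 0, "c": 0, "d": 1}

rng = random.Random(0)
neg = sample_negative("a", fine, coarse, rng)  # only "c" satisfies the assumed rule
loss = triplet_loss(embeddings["a"], embeddings["b"], embeddings[neg])
```

Under the same assumptions, the `fine` mapping could also serve as the pseudo-label target for the pseudo-classification approach the abstract mentions, with each cluster id treated as a class.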
