Academic Paper

Aligning Correlation Information for Domain Adaptation in Action Recognition
Document Type
Periodical
Source
IEEE Transactions on Neural Networks and Learning Systems, 35(5):6767-6778, May 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
General Topics for Engineers
Correlation
Videos
Feature extraction
Task analysis
Spatiotemporal phenomena
Image recognition
Data mining
Action recognition
adversarial
correlation
dark videos
domain adaptation (DA)
Language
English
ISSN
2162-237X (print)
2162-2388 (online)
Abstract
Domain adaptation (DA) approaches address domain shift and enable networks to be applied to different scenarios. Although various image DA approaches have been proposed in recent years, research on video DA remains limited. This is partly due to the complexity of adapting the different modalities of features in videos, which include correlation features extracted as long-range dependencies of pixels across spatiotemporal dimensions. Correlation features are highly associated with action classes and have proven effective for accurate video feature extraction in supervised action recognition. Yet the correlation features of the same action differ across domains because of domain shift. We therefore propose a novel adversarial correlation adaptation network (ACAN) that aligns action videos by aligning pixel correlations. ACAN aims to minimize the discrepancy between the distributions of correlation information, termed the pixel correlation discrepancy (PCD). Video DA research is also limited by the lack of cross-domain video datasets with large domain shifts. We therefore introduce a novel HMDB-ARID dataset whose larger domain shift stems from a larger statistical difference between domains; the dataset is built in an effort to leverage current datasets for dark video classification. Empirical results demonstrate the state-of-the-art performance of our proposed ACAN on both existing and the new video DA datasets.
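To illustrate the idea behind PCD, the sketch below models correlation features as non-local-style pairwise affinities across spatiotemporal positions and measures the cross-domain gap with a simple linear-kernel MMD between mean correlation maps. This is a minimal hypothetical sketch, not the authors' implementation: the function names (pixel_correlation, pcd_loss) are invented for illustration, and ACAN itself aligns correlations adversarially rather than with the MMD surrogate used here.

```python
# Hypothetical sketch of correlation alignment for video DA (not the paper's code).
import torch

def pixel_correlation(feats: torch.Tensor) -> torch.Tensor:
    """feats: (B, C, T, H, W) video features -> (B, N, N) correlation maps,
    where N = T*H*W indexes spatiotemporal positions."""
    b, c, t, h, w = feats.shape
    x = feats.reshape(b, c, t * h * w)            # flatten space-time
    x = torch.nn.functional.normalize(x, dim=1)   # cosine-style correlation
    return torch.bmm(x.transpose(1, 2), x)        # (B, N, N) pairwise affinities

def pcd_loss(src_feats: torch.Tensor, tgt_feats: torch.Tensor) -> torch.Tensor:
    """PCD surrogate: squared distance between the mean correlation maps of
    the source and target batches (a linear-kernel MMD, used here in place
    of the paper's adversarial alignment)."""
    c_src = pixel_correlation(src_feats).mean(dim=0)
    c_tgt = pixel_correlation(tgt_feats).mean(dim=0)
    return (c_src - c_tgt).pow(2).mean()

# Usage: add pcd_loss to the supervised classification loss during adaptation.
src = torch.randn(4, 64, 4, 7, 7)   # source-domain clip features (dummy)
tgt = torch.randn(4, 64, 4, 7, 7)   # target-domain clip features (dummy)
loss = pcd_loss(src, tgt)
```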