Academic Paper

Human Interaction Understanding With Consistency-Aware Learning
Document Type
Periodical
Source
IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Trans. Pattern Anal. Mach. Intell.), 45(10):11898-11914, Oct. 2023
Subject
Computing and Processing
Bioengineering
Cognition
Task analysis
Feature extraction
Predictive models
Labeling
Automobiles
Context modeling
Consistency-aware learning
logical reasoning
human interaction understanding
graph neural network
Language
ISSN
0162-8828
2160-9292
1939-3539
Abstract
Compared with the progress made on human activity classification, much less success has been achieved on human interaction understanding (HIU). Apart from the fact that the latter task is much more challenging, the main cause is that recent approaches learn human interactive relations via shallow graphical representations, which are inadequate for modeling complicated human interactive relations. This paper proposes a deep consistency-aware framework that aims to tackle the grouping and labelling inconsistencies in HIU. The framework consists of three components: a backbone CNN to extract image features, a factor graph network to implicitly learn higher-order consistencies among labelling and grouping variables, and a consistency-aware reasoning module to explicitly enforce consistencies. The last module is inspired by our key observation that the consistency-aware reasoning bias can be embedded into an energy function or a particular loss function, the minimization of which delivers consistent predictions. An efficient mean-field inference algorithm is proposed, such that all modules of our network can be trained in an end-to-end fashion. Experimental results demonstrate that the two proposed consistency-learning modules complement each other, and both make considerable contributions to achieving leading performance on three HIU benchmarks. The effectiveness of the proposed approach is further validated by experiments on detecting human-object interactions.