Academic Paper

Object-Based Temporal Segment Relational Network for Activity Recognition
Document Type
Conference
Source
2018 31st SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 103-109, Oct. 2018
Subject
Computing and Processing
Activity recognition
Videos
Object detection
Task analysis
Detectors
Histograms
Dogs
Action recognition, contextual cues, relational reasoning
ISSN
2377-5416
Abstract
Video understanding is the next frontier of computer vision, and activity recognition plays a major role in it. Despite recent improvements in holistic activity recognition, further research into part-based models, such as those that use context, may help us better understand what matters for recognizing activities and thus improve current models. This work addresses contextual cues obtained from object detections. We posit that an object's relevance to an action is related to its spatial arrangement with respect to the agent. Based on that, we propose the Egocentric Pyramid to encode such spatial relationships. We further extend it with a data-centric approach named Temporal Segment Relational Network (TSRN). Our experiments support the hypothesis that object spatiality provides an important cue for activity recognition. In addition, our data-centric approach shows that, besides such spatial features, other information may further enhance object-based activity recognition, such as co-occurrence, relative size, and temporal information.
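
As a rough illustration of what encoding object positions relative to an agent could look like, the sketch below counts detections per object class in concentric distance bands around the agent's bounding box. This is not the Egocentric Pyramid as defined in the paper, whose details the abstract does not give. The function name `egocentric_histogram`, the box format, the `ring_edges` thresholds, and the normalization by agent height are all assumptions made for this example.

```python
from math import hypot


def box_center(box):
    """Return the (x, y) center of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def egocentric_histogram(agent_box, detections, num_classes, ring_edges=(1.0, 2.0)):
    """Count detected objects per (distance band, class) around an agent.

    Hypothetical helper, not taken from the paper.
    agent_box:   (x_min, y_min, x_max, y_max) of the agent.
    detections:  iterable of (class_id, box) pairs for the detected objects.
    ring_edges:  increasing distance thresholds, in units of agent height
                 (assumed values; they define len(ring_edges) + 1 bands).
    Returns a flat list of length (len(ring_edges) + 1) * num_classes,
    ordered band by band, nearest band first.
    """
    agent_height = agent_box[3] - agent_box[1]
    if agent_height <= 0:
        raise ValueError("agent_box must have positive height")

    ax, ay = box_center(agent_box)
    num_rings = len(ring_edges) + 1
    hist = [[0] * num_classes for _ in range(num_rings)]

    for class_id, box in detections:
        ox, oy = box_center(box)
        # Center-to-center distance, normalized so it is scale independent.
        dist = hypot(ox - ax, oy - ay) / agent_height
        # Band index = number of thresholds the distance has reached.
        ring = sum(dist >= edge for edge in ring_edges)
        hist[ring][class_id] += 1

    # Flatten so the result can be used as a single feature vector.
    return [count for ring in hist for count in ring]


# Example: one agent and three detections from two object classes.
agent = (0, 0, 2, 4)
objects = [(0, (0, 1, 2, 3)), (1, (5, 1, 7, 3)), (1, (20, 1, 22, 3))]
features = egocentric_histogram(agent, objects, num_classes=2)
```

A fixed-length vector like this could feed any standard classifier. By contrast, the abstract describes TSRN as a data-centric approach that may also pick up co-occurrence, relative size, and temporal information, none of which this hand-crafted sketch captures.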