Academic Paper
Temporal 3D Shape Modeling for Video-based Cloth-Changing Person Re-Identification
Document Type
Conference
Source
2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), pp. 173-182, Jan. 2024
ISSN
2690-621X
Abstract
Video-based Cloth-Changing Person Re-ID (VCCRe-ID) refers to a real-world Re-ID problem in which texture information such as appearance or clothing becomes unreliable over the long term, limiting the applicability of traditional Re-ID methods. VCCRe-ID has not been well studied, primarily due to (1) limited public datasets and (2) the challenge of extracting identity-related, clothes-invariant cues from videos. The few existing works have focused heavily on gait-based features, which are severely affected by viewpoint changes and occlusions. In this work, we propose “Temporal 3D ShapE Modeling for VCCRe-ID” (SEMI), a lightweight end-to-end framework that addresses these issues by learning human 3D shape representations. The SEMI framework comprises a Temporal 3D Shape Modeling branch, which extracts discriminative frame-wise 3D shape features using a temporal encoder and an identity-aware 3D regressor. This is followed by a novel Attention-based Shape Aggregation (ASA) module that effectively aggregates frame-wise shape features into a fine-grained video-wise shape embedding. ASA leverages an attention mechanism to amplify the contribution of the most informative frames while reducing redundancy during aggregation. Experiments on two VCCRe-ID datasets demonstrate that our proposed framework outperforms state-of-the-art methods by 10.7% in rank-1 accuracy and 7.4% in mAP in the cloth-changing setting.
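For illustration only, below is a minimal PyTorch sketch of the kind of attention-weighted frame aggregation the abstract describes. It is not the authors' ASA module: the class name `AttentionShapeAggregation`, the feature dimension, and the single-linear scoring head are all assumptions.

```python
import torch
import torch.nn as nn


class AttentionShapeAggregation(nn.Module):
    """Hypothetical attention-weighted pooling of frame-wise features.

    A sketch of attention-based temporal aggregation, not the paper's
    implementation; all module details here are assumed.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # per-frame importance score

    def forward(self, frame_feats: torch.Tensor) -> torch.Tensor:
        # frame_feats: (batch, num_frames, dim) frame-wise shape features
        weights = torch.softmax(self.score(frame_feats), dim=1)  # (B, T, 1)
        # The weighted sum emphasizes informative frames and suppresses
        # redundant ones, yielding one video-wise embedding per clip.
        return (weights * frame_feats).sum(dim=1)  # (batch, dim)


# Example: aggregate 8 frames of 256-dim shape features per video clip.
feats = torch.randn(4, 8, 256)
video_embedding = AttentionShapeAggregation(256)(feats)  # shape: (4, 256)
```

Softmax normalization over the frame axis makes the aggregation a convex combination of frame features, which matches the stated goal of amplifying the most important frames rather than averaging all frames uniformly.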