Academic Paper

LoGoViT: Local-Global Vision Transformer for Object Re-Identification

Document Type

Conference

Author

Phan, Nguyen; Huy, Ta Duc; Duong, Soan T. M.; Hoang, Nguyen Tran; Tran, Sam; Hung, Dao Huu; Nguyen, Chanh D. Tr.; Bui, Trung; Truong, Steven Q. H.

Source

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5, Jun. 2023

Subject

Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Training
Measurement
Visualization
Source coding
Lighting
Signal processing
Benchmark testing
Object re-id
public security
vision transformer
multi-scale
patch modification augmentation

ISSN

2379-190X

Abstract

Object re-identification (ReID) is prone to errors under variations in scale and illumination, complex backgrounds, and object occlusion. To overcome these challenges, attention mechanisms are employed to focus on the object's characteristics, thereby extracting more discriminative features. This paper introduces a local-global vision transformer (LoGoViT) for object re-identification that learns a hierarchical representation ranging from fine-grained (local) to general (global) context features. It comprises two components: (i) shift and shuffle operations that generate robust local features and (ii) a local-global module that aggregates the multi-level hierarchical features of an object. Extensive experiments show that the method achieves state-of-the-art results on the ReID benchmarks. The paper further investigates effective augmentation operations and discusses how patch modifications improve the proposed model's generalization under occlusion. The source code is available at https://github.com/nguyenphan99/LoGoViT.
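
The abstract names shift and shuffle operations on patches but does not specify them. The following is only an illustrative sketch of one common way such an operation is formulated for transformer patch sequences; it is not the authors' implementation, which is in the repository linked above. The function name `shift_and_shuffle` and the parameters `shift` and `groups` are hypothetical and chosen for this example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shift_and_shuffle(patches: Sequence[T], shift: int, groups: int) -> List[List[T]]:
    """Hypothetical helper: rearrange a patch sequence into local groups.

    Assumption: `patches` is a flat sequence of patch tokens (any objects),
    and its length is divisible by `groups`.
    """
    n = len(patches)
    if n == 0 or groups <= 0 or n % groups != 0:
        raise ValueError("patch count must be positive and divisible by groups")

    # Shift: move the first `shift` patches to the end of the sequence.
    shift %= n
    shifted = list(patches[shift:]) + list(patches[:shift])

    # Shuffle: view the sequence as a (groups x per_group) grid and read it
    # column by column, so neighbouring output patches come from distant
    # positions in the input.
    per_group = n // groups
    shuffled = [
        shifted[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]

    # Split the shuffled sequence into `groups` local groups.
    return [shuffled[k * per_group:(k + 1) * per_group] for k in range(groups)]
```

Under these assumptions, each returned group mixes patches from different regions of the image, which is the general intuition behind building local features that do not depend on a single contiguous region.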
