학술논문

TransFace: Calibrating Transformer Training for Face Recognition from a Data-Centric Perspective

Document Type

Conference

Author

Dan, Jun; Liu, Yang; Xie, Haoyu; Deng, Jiankang; Xie, Haoran; Xie, Xuansong; Sun, Baigui

Source

2023 IEEE/CVF International Conference on Computer Vision (ICCV) ICCV Computer Vision (ICCV), 2023 IEEE/CVF International Conference on. :20585-20596 Oct, 2023

Subject

Computing and Processing
Signal Processing and Analysis
Training
Visualization
Three-dimensional displays
Face recognition
Benchmark testing
Data augmentation
Transformers

Language

ISSN

2380-7504

Abstract

Vision Transformers (ViTs) have demonstrated powerful representation ability in various visual tasks thanks to their intrinsic data-hungry nature. However, we unexpectedly find that ViTs perform vulnerably when applied to face recognition (FR) scenarios with extremely large datasets. We investigate the reasons for this phenomenon and discover that the existing data augmentation approach and hard sample mining strategy are incompatible with ViTs-based FR backbone due to the lack of tailored consideration on preserving face structural information and leveraging each local token information. To remedy these problems, this paper proposes a superior FR model called TransFace, which employs a patch-level data augmentation strategy named DPAP and a hard sample mining strategy named EHSM. Specially, DPAP randomly perturbs the amplitude information of dominant patches to expand sample diversity, which effectively alleviates the overfitting problem in ViTs. EHSM utilizes the information entropy in the local tokens to dynamically adjust the importance weight of easy and hard samples during training, leading to a more stable prediction. Experiments on several benchmarks demonstrate the superiority of our TransFace. Code and models are available at https://github.com/DanJun6737/TransFace.

Online Access

Full Text (IEEE) Find it@PNU

이메일

부산대학교 도서관

Online Access

메일 발송