Academic Paper

Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning

Document Type

Working Paper

Author

Li, Siyuan; Tian, Juanxi; Wang, Zedong; Zhang, Luyuan; Liu, Zicheng; Jin, Weiyang; Liu, Yang; Sun, Baigui; Li, Stan Z.

Source

Subject

Computer Science - Computer Vision and Pattern Recognition
Computer Science - Machine Learning

Language

Abstract

This paper delves into the interplay between vision backbones and optimizers, unveiling an inter-dependent phenomenon termed backbone-optimizer coupling bias (BOCB). We observe that canonical CNNs, such as VGG and ResNet, exhibit a marked co-dependency with the SGD family of optimizers, while recent architectures like ViTs and ConvNeXt are tightly coupled with adaptive learning rate optimizers. We further show that BOCB can be introduced by both optimizers and certain backbone designs and may significantly impact the pre-training and downstream fine-tuning of vision models. Through in-depth empirical analysis, we summarize takeaways on recommended optimizers and insights into robust vision backbone architectures. We hope this work can inspire the community to question long-held assumptions on backbones and optimizers, stimulate further explorations, and thereby contribute to more robust vision systems. The source code and models are publicly available at https://bocb-ai.github.io/.
Comment: Preprint V1. Online project at https://bocb-ai.github.io/

Online Access

Open Access (arXiv)
