Journal Article

TFC: Transformer Fused Convolution for Adversarial Domain Adaptation
Document Type
Periodical
Source
IEEE Transactions on Computational Social Systems, 11(1):697-706, Feb. 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
General Topics for Engineers
Keywords
Transformers
Feature extraction
Convolution
Uncertainty
Task analysis
Computer vision
Adaptation models
Convolutional neural networks (CNNs)
transformer fused convolution
unsupervised domain adaptation (UDA)
vision transformer
Language
English
ISSN
2329-924X
2373-7476
Abstract
In unsupervised domain adaptation (UDA), a classifier trained on a labeled source domain is applied to a target domain that has no or only limited labels. Recently, inspired by their ability to capture long-range feature dependencies, vision transformer (ViT)-based methods have been applied to UDA; however, they overlook the fact that ViTs are weak at extracting local feature details. To address this problem, this article demonstrates how to exploit both convolutional operations and transformer mechanisms for adversarial UDA through a hybrid network structure called transformer fused convolution (TFC). TFC integrates local features with global features to boost the representation capacity for UDA, which enhances the discrimination between foreground and background. Moreover, to improve the robustness of TFC, we leverage an uncertainty penalty loss that drives the scores of incorrect classes consistently lower. Extensive experiments validate significant performance gains over conditional adversarial domain adaptation (CDAN) on all five benchmark datasets: DomainNet ($\uparrow$8.5%), VisDA-2017 ($\uparrow$14.9%), Office-Home ($\uparrow$18.9%), Office-31 ($\uparrow$11.5%), and ImageCLEF-DA ($\uparrow$5.5%).
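The abstract describes, but does not detail, the fusion of local convolutional features with global transformer features. As a rough illustration only, the following PyTorch-style sketch shows one plausible form of such a hybrid block; the class name HybridFusionBlock, the depthwise-convolution local branch, the attention settings, and the 1x1-convolution fusion are all assumptions for illustration, not the authors' released architecture.

    # Illustrative sketch (not the paper's code): fusing a local CNN branch
    # with a global self-attention branch, as the abstract describes.
    import torch
    import torch.nn as nn

    class HybridFusionBlock(nn.Module):
        """Combine a convolutional branch (local detail) with a
        transformer branch (long-range dependencies). Layer choices
        here are assumptions for illustration."""
        def __init__(self, dim: int = 256, num_heads: int = 4):
            super().__init__()
            # Local branch: depthwise conv captures fine-grained detail.
            self.local = nn.Sequential(
                nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim),
                nn.BatchNorm2d(dim),
                nn.GELU(),
            )
            # Global branch: self-attention over flattened spatial tokens.
            self.norm = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            # Fusion: 1x1 conv over the concatenated branches.
            self.fuse = nn.Conv2d(2 * dim, dim, kernel_size=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, c, h, w = x.shape
            local = self.local(x)                        # (B, C, H, W)
            tokens = self.norm(x.flatten(2).transpose(1, 2))  # (B, HW, C)
            glob, _ = self.attn(tokens, tokens, tokens)  # (B, HW, C)
            glob = glob.transpose(1, 2).reshape(b, c, h, w)
            return self.fuse(torch.cat([local, glob], dim=1))

A block like this could replace a stage of a CNN backbone, so that each stage sees both fine local detail and long-range context before the adversarial alignment step.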
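Similarly, the abstract names an uncertainty penalty loss that makes incorrect classes score consistently lower, without giving its form. Below is a minimal sketch under one plausible reading; the cross-entropy-plus-penalty decomposition, the function name uncertainty_penalty_loss, and the trade-off weight lam are assumptions, not the paper's exact formulation.

    # Hedged sketch of an "uncertainty penalty" term: penalize probability
    # mass on incorrect classes so their scores stay consistently low.
    # This is an assumed reading of the abstract, not the paper's loss.
    import torch
    import torch.nn.functional as F

    def uncertainty_penalty_loss(logits: torch.Tensor,
                                 targets: torch.Tensor,
                                 lam: float = 0.1) -> torch.Tensor:
        """Cross-entropy plus a penalty on the largest incorrect-class
        probability (lam is an assumed trade-off weight)."""
        ce = F.cross_entropy(logits, targets)
        probs = F.softmax(logits, dim=1)
        # Mask out the ground-truth class, then penalize the strongest
        # remaining (i.e., incorrect) class score.
        mask = F.one_hot(targets, num_classes=logits.size(1)).bool()
        wrong = probs.masked_fill(mask, 0.0)
        penalty = wrong.max(dim=1).values.mean()
        return ce + lam * penalty

Driving incorrect-class scores uniformly down, rather than only maximizing the correct-class score, is one way to make the classifier's confidence more robust under domain shift.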