Academic Paper

Multi-Source to Multi-Target Decentralized Federated Domain Adaptation
Document Type
Periodical
Source
IEEE Transactions on Cognitive Communications and Networking (IEEE Trans. Cogn. Commun. Netw.), 10(3):1011-1025, Jun. 2024
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Data models
Automobiles
Adaptation models
Training
Optimization
Distributed databases
Servers
Federated learning
federated domain adaptation
link formation
decentralized federated learning
network optimization
Language
ISSN
2332-7731
2372-2045
Abstract
Heterogeneity across devices in federated learning (FL) typically refers to statistical (e.g., non-i.i.d. data distributions) and resource (e.g., communication bandwidth) dimensions. In this paper, we focus on another important dimension that has received less attention: varying quantities/distributions of labeled and unlabeled data across devices. To leverage all data, we develop a decentralized federated domain adaptation methodology that transfers ML models from devices with high-quality labeled data (called sources) to devices with low-quality or unlabeled data (called targets). Our methodology, Source-Target Determination and Link Formation (ST-LF), optimizes both (i) the classification of devices into sources and targets and (ii) source-target link formation, in a manner that considers the trade-off between ML model accuracy and communication energy efficiency. To obtain a concrete objective function, we derive a measurable generalization error bound that accounts for estimates of source-target hypothesis deviations and divergences between data distributions. The resulting optimization problem is a mixed-integer signomial program, a class of NP-hard problems, which we solve tractably with an algorithm based on successive convex approximations. Numerical evaluations of ST-LF demonstrate that it improves classification accuracy and energy efficiency over state-of-the-art baselines.
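
The abstract names successive convex approximation as the solution technique but gives no details. As a generic illustration only, and not the ST-LF algorithm, the following minimal Python sketch applies the idea to a hypothetical one-variable signomial, f(x) = x^4 - 2x^2 with x > 0. The function names, the toy objective, and the starting point are all assumptions chosen for illustration. At each step the non-convex term -2x^2 is replaced by its tangent line at the current iterate, which leaves a convex surrogate whose minimizer has a closed form.

```python
def f(x):
    """Toy signomial objective (hypothetical, not from the paper)."""
    return x ** 4 - 2.0 * x ** 2


def sca_minimize(x0, max_iters=100, tol=1e-12):
    """Minimize f over x > 0 by successive convex approximation.

    At iterate x_k the concave term -2*x**2 is linearized, giving the
    convex surrogate g(y) = y**4 - 2*x_k**2 - 4*x_k*(y - x_k).
    Setting g'(y) = 4*y**3 - 4*x_k = 0 yields y = x_k ** (1/3).
    """
    if x0 <= 0:
        raise ValueError("x0 must be positive")
    x = x0
    history = [f(x)]
    for _ in range(max_iters):
        x_next = x ** (1.0 / 3.0)  # closed-form minimizer of the surrogate
        history.append(f(x_next))
        converged = abs(x_next - x) < tol
        x = x_next
        if converged:
            break
    return x, history


x_star, history = sca_minimize(2.0)
print(x_star, f(x_star))
```

Because the surrogate upper-bounds f and touches it at the current iterate, each step cannot increase the objective. In this toy case the iterates approach the stationary point x = 1. The problem in the paper additionally involves integer variables and many coupled terms, which this sketch does not attempt to model.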