Academic Paper

Aspis: Robust Detection for Distributed Learning
Document Type
Conference
Source
2022 IEEE International Symposium on Information Theory (ISIT), pp. 2058-2063, Jun. 2022
Subject
Communication, Networking and Broadcast Technologies
Training
Distance learning
Redundancy
Machine learning
Distance measurement
Behavioral sciences
Servers
ISSN
2157-8117
Abstract
State-of-the-art machine learning models are routinely trained on large-scale distributed clusters. Crucially, such systems can be compromised when some of the computing devices exhibit abnormal (Byzantine) behavior and return arbitrary results to the parameter server (PS). This behavior may be attributed to a plethora of reasons, including system failures and orchestrated attacks. Existing work suggests robust aggregation and/or computational redundancy to alleviate the effect of distorted gradients. However, most of these schemes are ineffective when an adversary knows the task assignment and can choose the attacked workers judiciously to induce maximal damage. Our proposed method Aspis assigns gradient computations to workers using a subset-based assignment which allows for multiple consistency checks on the behavior of a worker. Examination of the calculated gradients and clique-finding in an appropriately constructed graph by the PS allows for efficient detection and exclusion of adversaries from the training. We prove the Byzantine resilience guarantees of Aspis under weak and strong attacks, and extensively evaluate the system on various training scenarios, demonstrating an accuracy improvement of about 30% over many state-of-the-art approaches on the CIFAR-10 dataset, as well as a reduction in the fraction of corrupted gradients ranging from 16% to 99%.
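The detection idea described in the abstract can be illustrated with a toy sketch: assign each gradient task to a subset of workers, build an agreement graph whose edges connect workers that returned identical results on every shared task, and keep the maximum clique (honest workers pairwise agree, so they form one). This is only an illustrative sketch under simplifying assumptions, not the paper's actual algorithm; the worker counts, the `gradient` stand-in, and the brute-force clique search are all hypothetical choices for a small example.

```python
from itertools import combinations

# Hypothetical toy setup: 5 workers, worker 3 is Byzantine.
# Tasks are assigned to every subset of r = 3 workers (subset-style assignment).
num_workers, r = 5, 3
byzantine = {3}

def gradient(worker, task):
    # Stand-in for a gradient computation: honest workers return the
    # correct value for the task; a Byzantine worker returns a distorted one.
    return ("bad", task) if worker in byzantine else ("ok", task)

# Build the agreement graph: an edge (i, j) exists iff workers i and j
# returned identical results on every task they share.
tasks = list(combinations(range(num_workers), r))
agree = {pair: True for pair in combinations(range(num_workers), 2)}
for task in tasks:
    for i, j in combinations(task, 2):
        if gradient(i, task) != gradient(j, task):
            agree[(i, j)] = False

def max_clique():
    # Brute-force maximum clique, fine for small worker counts:
    # try candidate sets from largest to smallest, return the first
    # set whose members all pairwise agree.
    for k in range(num_workers, 0, -1):
        for cand in combinations(range(num_workers), k):
            if all(agree[(i, j)] for i, j in combinations(cand, 2)):
                return cand
    return ()

honest = max_clique()
print(honest)  # → (0, 1, 2, 4): the Byzantine worker 3 is excluded
```

The PS would then aggregate gradients only from the detected clique, excluding the remaining workers from training.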