Byzantine-resilient decentralized stochastic gradient descent
Publication Type
Journal Article
Publication Date
6-2022
Abstract
Decentralized learning has gained great popularity as a way to improve learning efficiency and preserve data privacy. Each computing node makes an equal contribution to collaboratively learning a Deep Learning model. Eliminating centralized Parameter Servers (PS) can effectively address many issues, such as privacy, performance bottlenecks, and single points of failure. However, how to achieve Byzantine Fault Tolerance in decentralized learning systems is rarely explored, although this problem has been extensively studied in centralized systems. In this paper, we present an in-depth study of the Byzantine resilience of decentralized learning systems with two contributions. First, from the adversarial perspective, we theoretically show that Byzantine attacks are more dangerous and more feasible in decentralized learning systems: even one malicious participant can arbitrarily alter the models of other participants by sending carefully crafted updates to its neighbors. Second, from the defense perspective, we propose Ubar, a novel algorithm that enhances decentralized learning with Byzantine Fault Tolerance. Specifically, Ubar provides a Uniform Byzantine-resilient Aggregation Rule that lets benign nodes select the useful parameter updates and filter out the malicious ones in each training iteration. It guarantees that each benign node in a decentralized system can train a correct model under very strong Byzantine attacks with an arbitrary number of faulty nodes. We conduct extensive experiments on standard image classification tasks, and the results indicate that Ubar can effectively defeat both simple and sophisticated Byzantine attacks with higher performance efficiency than existing solutions.
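Illustrative sketch (not from the paper): the abstract does not spell out the attack or the aggregation rule, so the Python below only illustrates the two ideas under stated assumptions. It assumes a node updates its model by plain averaging with its neighbors, and that a malicious neighbor knows the other vectors being averaged. The function names (`craft_update`, `filtered_mean`) and the distance-based filter are hypothetical stand-ins chosen for illustration; they should not be read as Ubar itself.

```python
from typing import List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Coordinate-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def craft_update(own: Vector, benign: List[Vector], target: Vector) -> Vector:
    """Hypothetical attack: pick one update so that the plain average of
    the victim's own vector, the benign updates, and this update equals `target`."""
    total = len(benign) + 2  # victim + benign neighbors + attacker
    known = [own] + benign
    return [total * t - sum(col) for t, col in zip(target, zip(*known))]


def filtered_mean(own: Vector, received: List[Vector], keep: int) -> Vector:
    """Illustrative defense (an assumption, not the paper's rule): average the
    node's own vector with only the `keep` received updates closest to it."""
    closest = sorted(received, key=lambda v: sq_dist(own, v))[:keep]
    return mean([own] + closest)


own = [1.0, 1.0]
benign = [[1.5, 0.5], [0.5, 1.5]]
target = [100.0, -100.0]  # model the attacker wants the victim to adopt

malicious = craft_update(own, benign, target)
print(mean([own] + benign + [malicious]))               # plain averaging lands on target
print(filtered_mean(own, benign + [malicious], keep=2))  # outlier is dropped
```

In this toy setting a single crafted vector moves the plain average to any chosen target, while the distance filter discards it because it lies far from the node's own parameters. A distance filter alone would not stop an attacker who stays close to the benign updates, which is one reason the sketch should be treated as an illustration rather than a defense.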
Keywords
Training, servers, learning systems, distance learning, computer aided instruction, security, fault tolerant systems, decentralized learning, stochastic gradient descent, Byzantine attack, Byzantine fault tolerance
Discipline
Artificial Intelligence and Robotics | Theory and Algorithms
Research Areas
Data Science and Engineering
Publication
IEEE Transactions on Circuits and Systems for Video Technology
Volume
32
Issue
6
First Page
4096
Last Page
4106
ISSN
1051-8215
Identifier
10.1109/TCSVT.2021.3116976
Publisher
Institute of Electrical and Electronics Engineers
Citation
GUO, Shangwei; ZHANG, Tianwei; YU, Han; XIE, Xiaofei; MA, Lei; XIANG, Tao; and LIU, Yang.
Byzantine-resilient decentralized stochastic gradient descent. (2022). IEEE Transactions on Circuits and Systems for Video Technology. 32, (6), 4096-4106.
Available at: https://ink.library.smu.edu.sg/sis_research/7827
Additional URL
http://doi.org/10.1109/TCSVT.2021.3116976