Journal Article

Guest Editorial: Introduction to the Special Section on Communication-Efficient Distributed Machine Learning
Document Type
Periodical
Source
IEEE Transactions on Network Science and Engineering, 9(4):1949-1950, Aug. 2022
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Components, Circuits, Devices and Systems
Signal Processing and Analysis
Special issues and sections
Deep learning
Machine learning
Computational modeling
Natural language processing
Training data
Communication systems
Distributed processing
Language
English
ISSN
2327-4697 (Print)
2334-329X (Electronic)
Abstract
The papers in this special section focus on communication-efficient distributed machine learning. Machine learning, especially deep learning, has been successfully applied in a wealth of practical AI applications in fields such as computer vision, natural language processing, healthcare, finance, and robotics. With the increasing size of machine learning models and training data sets, training deep learning models requires a significant amount of computation and may take days to months on a single GPU or TPU. It has therefore become common practice to exploit distributed machine learning to accelerate training with multiple processors. Distributed machine learning typically requires the processors to exchange information repeatedly throughout the training process. With the fast-growing computing power of AI processors, the data communication among processors gradually becomes the performance bottleneck and, by Amdahl's law, severely limits system scalability. The design of communication-efficient distributed machine learning systems has attracted great attention from both academia and industry.
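As a concrete illustration of the repeated information exchange mentioned in the abstract, the following minimal sketch averages gradients across workers with an all-reduce after each backward pass in data-parallel training. It assumes a PyTorch process group has already been initialized (e.g., via torch.distributed.init_process_group); the helper name average_gradients is hypothetical, not a method from the special section.

    import torch
    import torch.distributed as dist

    def average_gradients(model: torch.nn.Module) -> None:
        # Sum each parameter's gradient across all workers, then divide
        # by the world size so every replica applies the same averaged
        # update. This per-step exchange is the communication cost that
        # the special section targets.
        world_size = dist.get_world_size()
        for param in model.parameters():
            if param.grad is not None:
                dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
                param.grad /= world_size

Because this exchange runs on every training step, its cost grows with model size and worker count, which is why techniques such as gradient compression and communication scheduling are central themes of the section.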
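For reference, Amdahl's law makes the scalability limit precise. If, to a first approximation, a fraction p of each training step is parallelizable across n processors and the remainder (including per-step communication) is treated as serial, the achievable speedup is

    S(n) = 1 / ((1 - p) + p / n)

which approaches 1 / (1 - p) as n grows. Even a small non-parallelizable communication share therefore caps the speedup regardless of how many processors are added.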