학술논문

Reduce Operations: Send Volume Balancing While Minimizing Latency

Document Type

Periodical

Author

Source

IEEE Transactions on Parallel and Distributed Systems IEEE Trans. Parallel Distrib. Syst. Parallel and Distributed Systems, IEEE Transactions on. 31(6):1461-1473 Jun, 2020

Subject

Computing and Processing
Communication, Networking and Broadcast Technologies
Program processors
Task analysis
Computational modeling
Load modeling
Sparse matrices
Measurement
Solid modeling
Communication hypergraph
communication cost
maximum communication volume
communication volume
latency
recursive bipartitioning
hypergraph partitioning
sparse matrix
sparse matrix-vector multiplication

Language

ISSN

1045-9219
1558-2183
2161-9883

Abstract

Communication hypergraph model was proposed in a two-phase setting for encapsulating multiple communication cost metrics (bandwidth and latency), which are proven to be important in parallelizing irregular applications. In the first phase, computational-task-to-processor assignment is performed with the objective of minimizing total volume while maintaining computational load balance. In the second phase, communication-task-to-processor assignment is performed with the objective of minimizing total number of messages while maintaining communication-volume balance. The reduce-communication hypergraph model suffers from failing to correctly encapsulate send-volume balancing. We propose a novel vertex weighting scheme that enables part weights to correctly encode send-volume loads of processors for send-volume balancing. The model also suffers from increasing the total communication volume during partitioning. To decrease this increase, we propose a method that utilizes the recursive bipartitioning framework and refines each bipartition by vertex swaps. For performance evaluation, we consider column-parallel SpMV, which is one of the most widely known applications in which the reduce-task assignment problem arises. Extensive experiments on 313 matrices show that, compared to the existing model, the proposed models achieve considerable improvements in all communication cost metrics. These improvements lead to an average decrease of 30 percent in parallel SpMV time on 512 processors for 70 matrices with high irregularity.

Online Access

Full Text (IEEE) Web of Science JCR 저널정보 Scopus Find it@PNU

이메일

부산대학교 도서관

Online Access

메일 발송