학술논문

Design and implementation of parallel PageRank on multicore platforms
Document Type
Conference
Source
2017 IEEE High Performance Extreme Computing Conference (HPEC) High Performance Extreme Computing Conference (HPEC), 2017 IEEE. :1-6 Sep, 2017
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Random access memory
Layout
Partitioning algorithms
Writing
Multicore processing
Memory management
Optimization
Language
Abstract
PageRank is a fundamental graph algorithm to evaluate the importance of vertices in a graph. In this paper, we present an efficient parallel PageRank design based on an edge-centric scatter-gather model. To overcome the poor locality of PageRank and optimize the memory performance, we develop a fast and efficient partitioning technique. We first partition all the vertices into non-overlapping vertex sets such that the data of each vertex set can fit in the cache; then we sort the outgoing edges of each vertex set based on the destination vertices to minimize random memory writes. The partitioning technique significantly reduces random accesses to main memory and improves the sustained memory bandwidth by 3×. It also enables efficient parallel execution on multicore platforms; we use distinct cores to execute the computations of distinct vertex sets in parallel to achieve speedup. We implement our design on a 16-core Intel Xeon processor and use various large-scale real-life and synthetic datasets for evaluation. Compared with the PageRank Pipeline Benchmark, our design achieves 12× to 19× speedup for all the datasets.