학술논문

Community-based Matrix Reordering for Sparse Linear Algebra Optimization
Document Type
Conference
Source
2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) ISPASS Performance Analysis of Systems and Software (ISPASS), 2023 IEEE International Symposium on. :214-223 Apr, 2023
Subject
Computing and Processing
Rabbits
Graphics processing units
Linear algebra
Hardware
Software
Performance analysis
Sparse matrices
Language
Abstract
Sparse linear algebra kernels achieve sub-optimal performance due to their poor cache locality. Matrix reordering is an effective pre-processing optimization that improves cache locality and performance of these kernels. While many reordering techniques have been proposed, most prior work on matrix reordering suffer from two key limitations: (1) they evaluate their reordering proposal on a small set of arbitrarily-selected inputs and (2) they do not quantify the additional headroom for improvement after reordering is applied. To address these two limitations, we perform a detailed characterization of reordering techniques across a broad set of 50 input matrices where we quantify the ability of matrix reordering techniques to bring sparse linear algebra kernels close to hardware limits. Our analysis reveals that community-based matrix reordering is most effective at optimizing the execution of sparse linear algebra kernels, bringing the cuSPARSE SpMV kernel to within 54% of ideal run time on an NVIDIA A6000 GPU on average. However, community-based reordering is not uniformly effective across all 50 input matrices. We investigate the reasons when community-based reordering falls short and propose an enhanced version of community-based reordering that provides up to 1.57× additional performance improvements for the SpMV kernel.