Academic Journal Article

Achieving high performance and portable parallel GMRES algorithm for compressible flow simulations on unstructured grids.
Document Type
Article
Source
Journal of Supercomputing. Nov2023, Vol. 79 Issue 17, p20116-20140. 25p.
Subject
*FLOW simulations
*PARALLEL algorithms
*COMPUTATIONAL fluid dynamics
*NAVIER-Stokes equations
*HIGH performance computing
*COMPRESSIBLE flow
*PARALLEL kinematic machines
Language
ISSN
0920-8542
Abstract
Improving the effectiveness and scalability of implicit algorithms has long been of interest to scientific computing researchers. The generalized minimal residual (GMRES) method is one of the efficient algorithms employed in computational fluid dynamics (CFD). However, because of its inherently sequential nature, GMRES has difficulty achieving high parallel computing performance. The trend toward diverse HPC architectures also makes algorithm migration challenging. In this work, following the separation-of-concerns principle, a performance-portable parallel GMRES algorithm is proposed to solve the compressible Navier–Stokes equations on unstructured grids efficiently and in parallel on different platforms. First, the Jacobian evaluation for the GMRES algorithm is improved: instead of using the matrix-free method, the algorithm explicitly computes a more accurate, analytically derived Jacobian matrix to enhance convergence. In addition, an explicit matrix makes it convenient to call highly optimized linear algebra libraries to achieve performance and portability: the high-level Jacobian matrix computation is implemented manually, while the rest of the algorithm and the low-level optimization for the target architecture are left to the library. Combined with a fine-grained parallel LU-SGS (lower-upper symmetric Gauss–Seidel) preconditioner, the algorithm runs efficiently on multi-core and many-core architectures such as GPUs. The proposed method has been used to compute several typical compressible flow configurations. Experimental results show that the proposed method has clear advantages over commonly used implicit algorithms such as matrix-free GMRES and LU-SGS in terms of convergence and portability of parallel performance. [ABSTRACT FROM AUTHOR]
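The division of labor described in the abstract (assemble an explicit matrix by hand, then hand the Krylov iteration to a library) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: SciPy stands in for the "highly optimized linear algebra library", a small nonsymmetric tridiagonal matrix stands in for the analytically derived flow Jacobian, and a plain serial symmetric Gauss-Seidel sweep stands in for the paper's fine-grained parallel LU-SGS preconditioner. The helper name `sgs_preconditioner` is hypothetical.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, gmres, spsolve_triangular


def sgs_preconditioner(A):
    """Return the action of M^{-1}, with M = (D + L) D^{-1} (D + U).

    D, L and U are the diagonal, strictly lower and strictly upper parts
    of A. This is a serial symmetric Gauss-Seidel stand-in, not the
    parallel LU-SGS scheme from the paper.
    """
    A = sp.csr_matrix(A)
    lower = sp.tril(A, k=0, format="csr")  # D + L
    upper = sp.triu(A, k=0, format="csr")  # D + U
    diag = A.diagonal()

    def apply(r):
        r = np.asarray(r, dtype=float).ravel()
        # Forward sweep: solve (D + L) y = r.
        y = spsolve_triangular(lower, r, lower=True)
        # Backward sweep: solve (D + U) z = D y.
        return spsolve_triangular(upper, diag * y, lower=False)

    return LinearOperator(A.shape, matvec=apply, dtype=float)


# Toy stand-in for an explicitly assembled Jacobian (assumption):
# a nonsymmetric, diagonally dominant tridiagonal matrix.
n = 200
A = sp.diags([-1.5, 4.0, -0.5], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# The library owns the GMRES iteration; we only supply A and M^{-1}.
x, info = gmres(A, b, M=sgs_preconditioner(A), restart=30)

rel_residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print("converged:", info == 0, "relative residual:", rel_residual)
```

Because the matrix is passed explicitly, swapping the backend (for example, to a GPU sparse library) would in principle only change the solver and triangular-solve calls, which is the portability argument the abstract makes.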