Academic Article

High-Productivity Parallelism With Python Plus Packages (But Without a Cluster)
Document Type
Periodical
Source
Computing in Science & Engineering (Comput. Sci. Eng.). 23(4):38-46 Aug, 2021
Subject
Computing and Processing
Bioengineering
Communication, Networking and Broadcast Technologies
Mathematical model
Graphics processing units
High performance computing
Computational modeling
Tensors
Parallel processing
Language
ISSN
1521-9615
1558-366X
Abstract
We present two computing projects, peridynamics simulation and numerical integration on implicit domains, for which we realized high-performance implementations using Python with appropriate packages. The problems are sufficiently compute-intensive that a straightforward serial implementation is prohibitively slow. While conventional wisdom suggests moving such problems onto a computing cluster, we directly produced high-performance parallel implementations that effectively perform the computing tasks on a single GPU. For the peridynamics application, the only package needed in addition to NumPy is Numba, whose just-in-time compiler allows us to write kernel functions in Python and compile them to run in parallel on a CUDA-enabled GPU. Our approach to numerical integration on implicit domains invokes two additional packages to support interval arithmetic and dynamic parallelism to enable tree-structured recursive refinement. Use of Python (with only kernels requiring dynamic parallelism written in C) enabled rapid development of concise code that successfully achieves significant performance enhancement.
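The idea of combining interval arithmetic with tree-structured recursive refinement can be illustrated with a small serial sketch. It is not the authors' implementation: it runs on the CPU in plain Python, uses no GPU or dynamic parallelism, and the implicit domain (the unit disk, f(x, y) = x^2 + y^2 - 1 <= 0) and the names `square_bounds`, `f_bounds`, and `refine` are assumptions chosen for illustration. Each box is classified by an interval bound on f as fully inside, fully outside, or undecided, and only undecided boxes are split into four children.

```python
import math


def square_bounds(a, b):
    """Interval bound on t*t for t in [a, b]."""
    lo = 0.0 if a <= 0.0 <= b else min(a * a, b * b)
    hi = max(a * a, b * b)
    return lo, hi


def f_bounds(x0, x1, y0, y1):
    """Interval bound on f(x, y) = x^2 + y^2 - 1 over the box [x0, x1] x [y0, y1]."""
    xl, xh = square_bounds(x0, x1)
    yl, yh = square_bounds(y0, y1)
    return xl + yl - 1.0, xh + yh - 1.0


def refine(x0, x1, y0, y1, depth):
    """Return (inside_area, undecided_area) for the domain f <= 0 within the box."""
    lo, hi = f_bounds(x0, x1, y0, y1)
    box_area = (x1 - x0) * (y1 - y0)
    if hi <= 0.0:          # f <= 0 everywhere: box lies inside the domain
        return box_area, 0.0
    if lo > 0.0:           # f > 0 everywhere: box lies outside the domain
        return 0.0, 0.0
    if depth == 0:         # refinement budget exhausted: leave box undecided
        return 0.0, box_area
    xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
    inside = undecided = 0.0
    # Split the undecided box into four children (one level of the tree).
    for a, b in ((x0, xm), (xm, x1)):
        for c, d in ((y0, ym), (ym, y1)):
            i, u = refine(a, b, c, d, depth - 1)
            inside += i
            undecided += u
    return inside, undecided


inside, undecided = refine(-1.0, 1.0, -1.0, 1.0, depth=8)
lower, upper = inside, inside + undecided   # enclosure of the true area
estimate = inside + 0.5 * undecided         # crude midpoint of the enclosure
print(f"area in [{lower:.4f}, {upper:.4f}], estimate {estimate:.4f}")
```

The true area of the unit disk is pi, so the printed enclosure should bracket it. Because the child boxes are independent of one another, this recursion is the kind of workload that a parallel implementation could distribute across threads, which is the role the abstract assigns to dynamic parallelism.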