Academic Paper

Parallel GPU Optimization of the Shooting and Bouncing Ray Tracing Methodology for Propagation Modeling
Document Type
Periodical
Source
IEEE Transactions on Antennas and Propagation (IEEE Trans. Antennas Propagat.), 72(1):174-182, Jan. 2024
Subject
Fields, Waves and Electromagnetics
Aerospace
Transportation
Components, Circuits, Devices and Systems
Graphics processing units
Codes
Parallel processing
Instruction sets
Electric fields
Computational efficiency
Ray tracing
Asymptotic high-frequency techniques
graphics processing units (GPUs)
high-performance computing
parallel/serial speedup
parallelism level
parallelization
ray tracing (RT)
scaling
shooting bouncing rays
wireless propagation modeling
Language
ISSN
0018-926X
1558-2221
Abstract
We propose a novel unified parallelization framework consisting of algorithms, strategies, and data structures to radically enhance the efficiency of the shooting and bouncing rays (SBR) method for ray tracing (RT) electromagnetic propagation modeling. The massively parallel optimization of the SBR code is achieved by integrating the SBR with NVIDIA OptiX Prime programming interfaces on graphics processing units (GPUs); comprehensively parallelizing all components of the SBR algorithm, including electric field computation and postprocessing tasks that are traditionally limited to sequential operation; and addressing memory constraints and optimizing memory usage to further advance the efficiency of the overall method. Numerical results demonstrate that the newly proposed optimized SBR methodology achieves massive parallel versus serial speedups and upward of 99% parallelism under Amdahl’s parallelization scaling law. The strategic use of GPUs and the innovative, meticulous parallel optimization of all computational aspects of the code, explained in detail in the article, yield an SBR RT methodology of unparalleled efficiency, without sacrificing the previously advanced and established accuracy of the method. Rather, the major enhancements in efficiency and the uniquely high level of parallelism are enabled by, and addressed synergistically with, improvements in the accuracy of the SBR computation.
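
For readers unfamiliar with the scaling law named in the abstract, the following is a minimal sketch of Amdahl's law and its inversion, which is one common way to estimate a parallelism level from a measured parallel versus serial speedup. The function names `amdahl_speedup` and `parallel_fraction` are hypothetical, the numbers in the usage lines are illustrative only and are not results from the article, and the article itself may define or measure its parallelism level differently.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Ideal speedup under Amdahl's law.

    p: fraction of the serial runtime that can be parallelized (0..1)
    n: number of parallel processing units (>= 1)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    # The serial part (1 - p) is unaffected; the parallel part shrinks by n.
    return 1.0 / ((1.0 - p) + p / n)


def parallel_fraction(speedup: float, n: int) -> float:
    """Invert Amdahl's law: the parallel fraction implied by a measured speedup.

    Derived from 1/S = 1 - p * (1 - 1/n), so p = (1 - 1/S) / (1 - 1/n).
    """
    if n < 2:
        raise ValueError("n must be at least 2 to infer a parallel fraction")
    if not 1.0 <= speedup <= n:
        raise ValueError("speedup must lie in [1, n] under Amdahl's law")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n)


if __name__ == "__main__":
    # Illustrative values only (assumptions, not taken from the article).
    n_units = 1024
    s = amdahl_speedup(0.99, n_units)
    print(f"speedup for p=0.99 on {n_units} units: {s:.2f}")
    print(f"recovered parallel fraction: {parallel_fraction(s, n_units):.4f}")
```

Under this law, the speedup is bounded above by 1/(1 - p) no matter how many processing units are available, so a 99% parallel fraction caps the ideal speedup at 100. This is why pushing the parallel fraction higher matters for massively parallel hardware such as GPUs.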