Academic Paper

IDYLL: Enhancing Page Translation in Multi-GPUs via Light Weight PTE Invalidations
Document Type
Conference
Source
2023 56th IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 1163-1177, Oct. 2023
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Microarchitecture
Memory management
Graphics processing units
Organizations
Coherence
Software
multi-GPU
page table invalidation
page sharing
Abstract
Multi-GPU systems have emerged as a desirable platform to deliver high computing capabilities and large memory capacity to accommodate large dataset sizes. However, naively employing multi-GPU systems yields non-scalable performance. One major reason is that execution efficiency suffers from expensive address translations in multi-GPU systems. The data-sharing nature of GPU applications requires page migration between GPUs to mitigate non-uniform memory access overheads. Unfortunately, frequent page migration incurs substantial page table invalidation overheads to ensure translation coherence. A comprehensive investigation of multi-GPU address translation efficiency identifies two significant bottlenecks caused by page table invalidation requests: (i) increased latency for demand TLB miss requests and (ii) increased waiting latency for performing page migrations. Based on these observations, we propose IDYLL, which reduces the number of page table invalidations by maintaining an "in-PTE" directory and reduces invalidation latency by batching multiple invalidation requests to exploit spatial locality. We show that IDYLL improves overall performance by 69.9% on average.
CCS Concepts
• Computer systems organization → Single instruction, multiple data; • Software and its engineering → Virtual memory.
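Illustrative sketch
The two ideas named in the abstract, a per-entry directory of sharers and batching of invalidations for nearby pages, can be illustrated with a toy software model. This is only a sketch under stated assumptions and not the paper's actual hardware design: the class `PageTable`, the constants `NUM_GPUS` and `BATCH_SPAN`, the bitmask encoding, and the grouping rule are all hypothetical names and choices introduced here for illustration.

```python
from collections import defaultdict

NUM_GPUS = 4    # assumption: size of the modeled system
BATCH_SPAN = 8  # assumption: adjacent pages covered by one batched message


class PageTable:
    """Toy page table whose entries carry a bitmask of sharer GPUs.

    Hypothetical model: each entry records which GPUs may hold the
    translation, so invalidations go only to those GPUs.
    """

    def __init__(self):
        # vpn -> bitmask; bit g set means GPU g fetched this translation
        self.sharers = defaultdict(int)

    def record_translation(self, vpn, gpu):
        """Note that `gpu` has obtained the translation for page `vpn`."""
        self.sharers[vpn] |= 1 << gpu

    def invalidate(self, vpns):
        """Invalidate `vpns`; return {gpu: [batch, ...]}.

        Each batch is a list of page numbers that fall in the same
        BATCH_SPAN-aligned region, so one message can cover them all.
        GPUs that never fetched a translation receive nothing.
        """
        per_gpu = defaultdict(lambda: defaultdict(list))
        for vpn in sorted(set(vpns)):
            mask = self.sharers.pop(vpn, 0)  # clear the directory entry
            for gpu in range(NUM_GPUS):
                if mask & (1 << gpu):
                    per_gpu[gpu][vpn // BATCH_SPAN].append(vpn)
        return {
            gpu: [pages for _, pages in sorted(groups.items())]
            for gpu, groups in per_gpu.items()
        }


if __name__ == "__main__":
    pt = PageTable()
    pt.record_translation(0, gpu=1)
    pt.record_translation(1, gpu=1)
    pt.record_translation(1, gpu=2)
    pt.record_translation(9, gpu=1)
    print(pt.invalidate([0, 1, 9, 20]))
```

In this model, a broadcast scheme would send four invalidations to each of the four GPUs. With the directory and batching, GPU 1 receives two batched messages, GPU 2 receives one, and GPUs 0 and 3 receive none.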
