학술논문

Priority-Based PCIe Scheduling for Multi-Tenant Multi-GPU Systems
Document Type
Periodical
Source
IEEE Computer Architecture Letters IEEE Comput. Arch. Lett. Computer Architecture Letters. 18(2):157-160 Dec, 2019
Subject
Computing and Processing
Graphics processing units
Task analysis
Bandwidth
Data transfer
Switches
Quality of service
Throughput
Multi-GPU
multi-tenant
PCIe scheduling
Language
ISSN
1556-6056
1556-6064
2473-2575
Abstract
Multi-GPU systems are widely used in data centers to provide significant speedups to compute-intensive workloads such as deep neural network training. However, limited PCIe bandwidth between the CPU and multiple GPUs becomes a major performance bottleneck. We observe that relying on a traditional Round-Robin-based PCIe scheduling policy can result in severe bandwidth competition and stall the execution of multiple GPUs. In this article, we propose a priority-based scheduling policy which aims to overlap the data transfers and GPU execution for different applications to alleviate this bandwidth contention. We also propose a dynamic priority policy for semi-QoS management that can help applications to meet QoS requirements and improve overall multi-GPU system throughput. Experimental results show that the system throughput is improved by 7.6 percent on average using our priority-based PCIe scheduling scheme as compared with a Round-Robin-based PCIe scheduler. Leveraging semi-QoS management can help to meet defined QoS goals, while preserving application throughput.