Academic Paper

Optimizing GPU Deep Learning Operators with Polyhedral Scheduling Constraint Injection
Document Type
Conference
Source
2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pp. 313-324, Apr. 2022
Subject
Computing and Processing
Deep learning
Codes
Costs
Neural networks
Memory management
Graphics processing units
Production
Polyhedral model
scheduling
vectorization
Language
Abstract
Automatic parallel code generation from high-level abstractions, such as those manipulated by artificial intelligence and deep learning (AI/DL) frameworks, relies heavily on compiler techniques for automatic parallelization and optimization. Many recent advances build on the polyhedral framework for this task because of its ability to model and apply a wide range of loop transformations. However, modeling the complexity of the target architecture, and building cost models precise enough to choose the best transformation, is in general out of reach for a framework based on linear/affine constraints. In this work, we propose to decouple the polyhedral framework into linear and non-linear components. We introduce the constraint tree abstraction, which may be generated by a non-linear optimizer and injected into the polyhedral optimization process to build better solutions. We show how to exploit this mechanism to generate efficient GPU code for AI/DL operators. Our constraint injection drives the polyhedral scheduler towards efficient solutions for load/store vectorization, relying on both memory coalescing and vector types. We implemented our scheduler with constraint-injection support, together with our constraint construction system, within a production AI/DL framework. Experiments on well-known neural networks show the efficiency of this approach with respect to state-of-the-art polyhedral scheduling for GPUs.
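To make the decoupling concrete, the following is a minimal sketch (not the paper's implementation) of the idea of injecting an externally derived constraint into a scheduler's search: a hypothetical non-linear optimizer decides that the stride-1 array dimension must be innermost (so adjacent GPU threads issue coalesced, vectorizable loads), and that decision is handed to a toy "linear" scheduler as a predicate that prunes candidate loop orders. All names here (`coalescing_constraint`, `schedule`) are illustrative assumptions, not the paper's API.

```python
from itertools import permutations

def coalescing_constraint(order, stride_one_dim):
    # Injected constraint (assumed form): the stride-1 dimension must be
    # innermost so that consecutive threads access consecutive addresses,
    # enabling memory coalescing and vector (e.g. float4-style) loads.
    return order[-1] == stride_one_dim

def schedule(dims, stride_one_dim, injected_constraints):
    # Toy stand-in for the linear/affine part: enumerate candidate loop
    # permutations (all assumed legal here), then keep only the orders
    # that satisfy every injected constraint.
    return [order for order in permutations(dims)
            if all(c(order, stride_one_dim) for c in injected_constraints)]

# For a 3-deep loop nest (i, j, k) where k is the stride-1 dimension,
# only the orders ending in k survive the injected constraint.
orders = schedule(("i", "j", "k"), "k", [coalescing_constraint])
```

A real polyhedral scheduler would express candidates as affine schedule functions and solve an ILP rather than enumerating permutations; the point of the sketch is only that the non-linear reasoning lives outside the scheduler and reaches it as constraints.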