Academic Paper

Challenges in Deeply Heterogeneous High Performance Systems
Document Type
Conference
Source
2019 22nd Euromicro Conference on Digital System Design (DSD), pp. 428-435, Aug. 2019
Subject
Computing and Processing
Reliability
Computational modeling
Resource management
Runtime
Timing
Thermal management
Programming
HPC
heterogeneous computing
run-time management
Language
Abstract
RECIPE (REliable power and time-ConstraInts-aware Predictive management of heterogeneous Exascale systems) is a recently started project funded within the H2020 FETHPC programme, which is expressly targeted at exploring new High-Performance Computing (HPC) technologies. RECIPE aims to introduce a hierarchical runtime resource management infrastructure that optimizes energy efficiency and minimizes the occurrence of thermal hotspots, while enforcing the time constraints imposed by the applications and ensuring reliability for both time-critical and throughput-oriented computations that run on deeply heterogeneous accelerator-based systems. This paper presents a detailed overview of RECIPE, identifying the fundamental challenges as well as the key innovations addressed by the project, which span run-time management, heterogeneous computing architectures, HPC memory/interconnection infrastructures, thermal modelling, reliability, programming models, and timing analysis. For each of these areas, the paper describes the relevant state of the art as well as the specific actions that the project will take to address the identified technological challenges.