Academic Paper

Automatic custom instruction identification in memory streaming algorithms
Document Type
Conference
Source
2014 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), pp. 1-9, Oct. 2014
Subject
Components, Circuits, Devices and Systems
Registers
Program processors
Arrays
Hardware
Ports (Computers)
Runtime
Reconfigurable architecture
custom instruction generation
load/store merging
streaming memory access
Language
Abstract
Application-specific instruction set processors (ASIPs) extend the instruction set of a general-purpose processor with dedicated custom instructions (CIs). In the last decade, reconfigurable processors advanced this concept towards runtime reconfiguration to increase efficiency and adaptivity. Compiler support for automatic identification and implementation of ASIP CIs exists commercially and on research platforms, but these compilers do not support CIs with memory accesses, as ASIP CIs typically work on register file data. While acceptable for ASIPs, this is a limitation for reconfigurable processors because they achieve their performance by exploiting data-level parallelism. Consequently, we propose a novel approach to CI identification for runtime reconfigurable processors that supports memory operations, in contrast to previous works that explicitly exclude them. Our algorithm extracts memory access patterns, which allows us to abstract from single memory operations and merge accesses to optimally utilize the available memory bandwidth. We implemented our algorithm in a state-of-the-art compiler framework. The largest CI identified in our benchmarks consists of 2071 nodes (average 999 nodes), and a single generated CI can cover a whole computational kernel (up to 99%).
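The idea of merging individual memory operations into wider accesses can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Access` record, the `merge_accesses` function, and the `port_width` parameter are hypothetical names introduced here, and the sketch assumes that each access is described by an array name, a byte offset, and a size, and that alignment constraints can be ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One memory access (hypothetical model, not taken from the paper)."""
    base: str    # name of the array being accessed
    offset: int  # byte offset from the start of the array
    size: int    # number of bytes accessed


def merge_accesses(accesses, port_width):
    """Greedily merge contiguous accesses to the same array.

    A merged access never exceeds port_width bytes (assumed width of one
    memory port). Alignment is ignored for simplicity.
    """
    merged = []
    # Sorting by array and offset places mergeable accesses next to each other.
    for acc in sorted(accesses, key=lambda a: (a.base, a.offset)):
        if merged:
            last = merged[-1]
            contiguous = (last.base == acc.base
                          and last.offset + last.size == acc.offset)
            fits = last.size + acc.size <= port_width
            if contiguous and fits:
                # Widen the previous access instead of issuing a new one.
                merged[-1] = Access(last.base, last.offset,
                                    last.size + acc.size)
                continue
        merged.append(acc)
    return merged


# Four 4-byte loads from consecutive elements of array "a".
loads = [Access("a", 4 * i, 4) for i in range(4)]
wide = merge_accesses(loads, port_width=16)
narrow = merge_accesses(loads, port_width=8)
```

With an assumed 16-byte port, the four loads collapse into a single access; with an 8-byte port, they become two. A real implementation would also have to respect alignment, data dependencies between loads and stores, and the number of available ports, none of which this sketch models.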