Academic Paper

A Multi-Memory Field-Programmable Custom Computing Machine for Accelerating Compute-Intensive Applications

Document Type

Conference

Author

Jadhav, Shrikant S.; Gloster, Clay; Naher, Jannatun; Doss, Christopher; Kim, Youngsoo

Source

2021 IEEE 12th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), pp. 0619-0628, Dec. 2021

Subject

Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Fields, Waves and Electromagnetics
General Topics for Engineers
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Multicore processing
Memory architecture
Logic gates
Mobile communication
Software
Resource management
Hardware acceleration
Reconfigurable Computing
Field-Programmable Gate Arrays
Field-Programmable Custom Computing Machines
High Performance Computing
Multi-Memory Architecture
Multiple Memory Banks

Abstract

In this paper, we present an FPGA-based multi-memory controller for accelerating computationally intensive applications. Our architecture accepts multiple inputs and produces multiple outputs on each clock cycle. The architecture includes processor cores with pipelined functional units tailored to each application. Additionally, we present an approach that achieves a speedup of one to two orders of magnitude over a traditional software implementation executing on a conventional multi-core processor. Even though the clock frequency of the Field-Programmable Custom Computing Machine (FCCM) is an order of magnitude lower than that of a conventional multi-core processor, the FCCM is significantly faster. We used the Power function as an application to demonstrate the merits of our FCCM. In our experiments, we executed the Power function in software and compared the software execution times with the execution time of the FCCM. We also compared the FCCM execution time with that of an OpenMP implementation of the function. Our experiments show that, when executing the Power function, our multi-memory architecture is 57X faster than the software implementation and 17X faster than the OpenMP implementation.
