Academic Paper

An Efficient Hardware/Software Co-Design for FALCON on Low-End Embedded Systems
Document Type
Periodical
Source
IEEE Access. 12:57947-57958, 2024
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Hardware
Task analysis
Costs
NIST
Digital signatures
Hardware design languages
Quantum cryptography
Cryptography
Hardware acceleration
Post quantum cryptography
digital signature algorithm
cryptography
SW/HW co-design
FALCON
accelerator
Language
ISSN
2169-3536
Abstract
We propose EFX, an efficient FALCON accelerator based on a HW/SW co-design. FALCON is a post-quantum cryptographic (PQC) scheme designed as a digital signature algorithm (DSA). Our findings reveal that FALCON has unique characteristics and structures that distinguish it from other PQC-DSAs. A key finding is that, unlike its counterparts, FALCON is not dominated by a single time-consuming task; instead, it processes a variety of tasks with comparable execution times. Consequently, conventional methods that focus on accelerating a few dominant tasks, which are generally effective for other algorithms, prove less efficient for FALCON, especially with respect to minimizing silicon area. To overcome this, we focus on fine-grained optimization of lower-level operations rather than of broader functional segments, aiming to boost performance while conserving hardware area. Moreover, to mitigate the potential degradation caused by limited hardware resources, we implement a pipelined execution strategy for the FALCON functions and refine the sampling function (a critical task that is challenging to accelerate because its algorithm is inherently sequential) so that it runs concurrently on both software and hardware, thus reducing latency. Our hardware design, synthesized at 300 MHz using Samsung's 28 nm and 45 nm process technologies, demonstrates superior performance in generating FALCON signatures, with a 3.58× improvement in clock cycles over an existing hardware accelerator. EFX occupies 38K µm² and 74K µm² for the 28 nm and 45 nm processes, respectively, which is small compared with other PQC accelerators.