Academic Paper

Optimized CNN Architectures Benchmarking in Hardware-Constrained Edge Devices in IoT Environments
Document Type
Periodical
Source
IEEE Internet of Things Journal (IEEE Internet Things J.), 11(11):20357-20366, Jun. 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
Computational modeling
Convolutional neural networks
Benchmark testing
Data models
Transfer learning
Training
Quantization (signal)
Convolutional neural network (CNN) architectures
edge computing
edge devices
model optimization
transfer learning
Language
ISSN
2327-4662
2372-2541
Abstract
The application fields of Internet of Things (IoT) and edge devices have expanded thanks to machine learning (ML) models that classify images into previously known labels while running close to the end user. However, a model can be trained with any of several convolutional neural network (CNN) architectures, and the choice can affect its performance when deployed in hardware-constrained environments such as edge devices. In addition, recent training trends suggest using transfer learning to take a strong feature extractor obtained in one domain and reuse it in a new domain that does not have enough images to train the whole model. In light of these trends, this work benchmarks the most representative CNN architectures on emerging edge devices, some of which have hardware accelerators. The ML models were trained and optimized using transfer learning on a small set of images obtained in IoT environments. Our results show that, depending on the CNN architecture, unfreezing up to the last 20 layers of the model allows it to be fine-tuned correctly to the new set of IoT images. Additionally, quantization is a suitable optimization technique that shrinks the model by $2\times$ or $3\times$, leading to a lighter memory footprint, shorter execution time, and lower battery consumption. Finally, the Coral Dev Board can speed up the inference process by $100\times$, and the EfficientNet architecture keeps the same classification accuracy even when the model is adapted to a hardware-constrained environment.
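The fine-tuning step mentioned in the abstract (unfreezing only the last layers of a pretrained model) can be illustrated with a minimal, framework-free sketch. This is not the authors' code: the `Layer` class, the `unfreeze_last_n` helper, and the 100-layer backbone are hypothetical stand-ins chosen for illustration, and a real framework would expose its own layer objects and trainable flags.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    """Hypothetical stand-in for a framework layer with a trainable flag."""
    name: str
    trainable: bool = True


def unfreeze_last_n(layers: List[Layer], n: int) -> int:
    """Freeze every layer, then mark only the last n as trainable.

    Returns the number of trainable layers afterwards.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # Index of the first layer that stays trainable (0 if n exceeds the depth).
    cutoff = max(len(layers) - n, 0)
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff
    return sum(layer.trainable for layer in layers)


# Hypothetical 100-layer pretrained backbone; the layer count is illustrative.
backbone = [Layer(f"layer_{i}") for i in range(100)]
trainable_count = unfreeze_last_n(backbone, 20)
print(trainable_count)
```

In this sketch the earlier layers keep their pretrained weights as a fixed feature extractor, while only the final 20 layers would be updated on the small IoT image set. The abstract notes that the suitable number of unfrozen layers depends on the CNN architecture, so `n` would be tuned per model.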