Academic Paper

Toward Improving Ensemble-Based Collaborative Inference at the Edge
Document Type
Periodical
Source
IEEE Access, 12:6926-6940, 2024
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Collaboration
Computational modeling
Aggregates
Ensemble learning
Task analysis
Deep learning
Costs
Edge computing
Neural networks
Ensemble
edge computing
collaborative inference
neural networks
cascade
test time augmentation
Language
English
ISSN
2169-3536
Abstract
Ensemble-based collaborative inference systems, called Edge Ensembles, are deep-learning edge inference systems that enhance accuracy by aggregating the predictions of models deployed on individual devices. They offer several advantages, including scalability with task complexity and decentralized operation without dependence on a centralized server. Ensemble methods are generally effective at improving deep-learning accuracy, and prior research has used several model integration techniques for deep-learning ensembles. Some of these existing integration methods are more effective than those used in previous Edge Ensembles. However, it remains uncertain whether these methods can be applied directly to cooperative inference systems involving multiple edge devices. This study investigates the effectiveness of conventional model integration techniques, namely cascade, weighted averaging, and test-time augmentation (TTA), when applied to Edge Ensembles to enhance their performance. Furthermore, we propose enhancements to these techniques tailored for Edge Ensembles. The cascade reduces the number of models required for inference but worsens latency because models are processed sequentially. To address this latency issue, we propose the $m$-parallel cascade, which adjusts the number of models processed simultaneously to $m$. We also propose learning TTA policies and weighted-averaging weights from ensemble prediction labels instead of ground-truth labels. In the experiments, we verified the effectiveness of each technique for Edge Ensembles. The proposed $m$-parallel cascade reduced latency by a factor of 2.8 compared with the conventional cascade, at the cost of a 1.06-fold increase in computation. Additionally, ensemble-label-based learning was comparably effective to learning with ground-truth labels.
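The abstract's central idea, a cascade that queries $m$ ensemble members at a time and stops once the aggregated prediction is confident enough, can be sketched as follows. This is a minimal illustration based only on the abstract's description, not the paper's actual implementation; the function name `m_parallel_cascade`, the confidence threshold, and the use of softmax-probability averaging are all assumptions introduced here for clarity.

```python
# Hypothetical sketch of an m-parallel cascade over an ensemble.
# Models are queried in batches of m (in a real edge deployment the m
# calls in a batch would run in parallel on different devices); after
# each batch the averaged prediction is checked against a confidence
# threshold, and inference stops early once the ensemble is confident.
import numpy as np

def m_parallel_cascade(models, x, m=2, threshold=0.9):
    """Run up to len(models) predictors in batches of m, stopping early.

    models: list of callables mapping input x -> probability vector.
    Returns (averaged probability vector, number of models actually used).
    """
    probs_sum = None
    used = 0
    for start in range(0, len(models), m):
        batch = models[start:start + m]  # one parallel round of m models
        for model in batch:
            p = np.asarray(model(x), dtype=float)
            probs_sum = p if probs_sum is None else probs_sum + p
            used += 1
        avg = probs_sum / used
        if avg.max() >= threshold:  # confident enough: stop the cascade
            break
    return avg, used
```

Setting `m=1` recovers the conventional sequential cascade, while `m=len(models)` degenerates to a plain ensemble average; intermediate values trade the cascade's compute savings against its latency, which is the balance the abstract describes.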