Academic Paper

EcoEdgeInfer: Dynamically Optimizing Latency and Sustainability for Inference on Edge Devices
Document Type
Conference
Source
2024 IEEE/ACM Symposium on Edge Computing (SEC), pp. 191-205, Dec. 2024
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Keywords
Training
Energy consumption
Costs
Software algorithms
Artificial neural networks
Tail
Software
Sustainable development
Low latency communication
Tuning
inference
energy
latency
workload changes
Language
English
ISSN
2837-4827
Abstract
The use of Deep Neural Networks (DNNs) has skyrocketed in recent years. While their applications have brought many benefits and use cases, they also have a significant environmental impact due to the high energy consumption of DNN execution. It has already been acknowledged in the literature that training DNNs is computationally expensive and requires large amounts of energy. However, the energy consumption of DNN inference has not yet received much attention. With the increasing adoption of online tools, the use of inference has grown significantly and will likely continue to grow. Unlike training, inference is user-facing, requires low latency, and is invoked far more frequently. As such, edge devices are being considered for DNN inference due to their low latency and privacy benefits. In this context, inference at the edge is a timely area that requires closer attention to regulate its energy consumption. We present EcoEdgeInfer, a system that balances performance and sustainability for DNN inference on edge devices. The core component of EcoEdgeInfer is an adaptive optimization algorithm, EcoGD, that strategically and quickly sweeps through the hardware and software configuration space to find the jointly optimal configuration that minimizes energy consumption and latency. EcoGD is agile by design, adapting the configuration parameters in response to time-varying and unpredictable inference workloads. We evaluate EcoEdgeInfer on different DNN models using real-world traces and show that EcoGD consistently outperforms existing baselines, lowering energy consumption by 31% and reducing tail latency by 14%, on average.
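To make the idea concrete, the kind of search the abstract describes can be sketched as a greedy local descent over a discrete grid of hardware and software knobs, moving toward lower joint energy-latency cost. Everything below (the knobs, the synthetic cost model, the weights) is an illustrative assumption for the sketch, not the paper's actual EcoGD algorithm.

```python
# Hedged sketch of a configuration-space descent for edge inference.
# FREQS and BATCHES are hypothetical tuning knobs; measure() is a
# synthetic stand-in for profiling a config on real hardware.

FREQS = [600, 900, 1200, 1500]   # hypothetical CPU frequencies (MHz)
BATCHES = [1, 2, 4, 8]           # hypothetical inference batch sizes

def measure(freq, batch):
    """Stand-in profiler: returns (energy in J, latency in ms) for one
    config, using a simple synthetic model so the sketch is runnable."""
    latency = 1000.0 * batch / freq + 5.0          # queueing + fixed cost
    energy = 1e-4 * freq * latency / batch         # per-request energy
    return energy, latency

def cost(freq, batch, w_energy=0.5, w_latency=0.5):
    """Weighted scalarization of the two objectives (weights assumed)."""
    energy, latency = measure(freq, batch)
    return w_energy * energy + w_latency * latency

def eco_descent(freq, batch):
    """Greedy descent: repeatedly move to the best neighboring config
    (one knob stepped up or down) until no neighbor improves the cost."""
    while True:
        best = (cost(freq, batch), freq, batch)
        fi, bi = FREQS.index(freq), BATCHES.index(batch)
        for dfi, dbi in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            nfi, nbi = fi + dfi, bi + dbi
            if 0 <= nfi < len(FREQS) and 0 <= nbi < len(BATCHES):
                c = cost(FREQS[nfi], BATCHES[nbi])
                if c < best[0]:
                    best = (c, FREQS[nfi], BATCHES[nbi])
        if (best[1], best[2]) == (freq, batch):
            return freq, batch, best[0]   # local optimum reached
        freq, batch = best[1], best[2]

print(eco_descent(600, 8))
```

Because a real workload shifts over time, such a tuner would be re-run (or kept running) as request rates change; the descent converging quickly from any starting config is what makes that re-adaptation cheap.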