Academic Paper

EvoGWP: Predicting Long-Term Changes in Cloud Workloads Using Deep Graph-Evolution Learning
Document Type
Periodical
Source
IEEE Transactions on Parallel and Distributed Systems (IEEE Trans. Parallel Distrib. Syst.), 35(3):499-516, Mar. 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
Dynamic scheduling
Heuristic algorithms
Prediction algorithms
Correlation
Predictive models
Interference
Graph neural networks
Workload prediction
cloud computing
graph neural network
resource management
datacenter
Language
ISSN
1045-9219
1558-2183
2161-9883
Abstract
Workload prediction plays a crucial role in the resource management of large-scale cloud datacenters. Although many methods and algorithms have been proposed, long-term changes have not been explicitly identified and considered. Because of shifting user demands, workload relocations, or other reasons, the “resource usage pattern” of a workload, which is usually quite stable in the short term, may change dynamically over the long term. Such long-term dynamic changes can significantly degrade the accuracy of prediction algorithms, and how to handle them remains an open and challenging issue. In this article, we propose Evolution Graph for Workload Prediction (EvoGWP), a novel method that predicts long-term dynamic changes using a carefully designed graph-based evolution learning algorithm. EvoGWP automatically extracts shapelets to explicitly identify the resource usage patterns of workloads at a fine-grained level, and predicts workload changes by considering factors in both the temporal and spatial dimensions. We design a two-level importance-based shapelet extraction mechanism to mine new usage pattern changes in the temporal dimension, and a novel evolution graph model to fuse the interference among the resource usage patterns of different workloads in the spatial dimension. By combining the temporal extraction of shapelets from each single workload with the spatial interference of shapelets among different workloads, we then design a spatio-temporal GNN-based encoder-decoder model to predict the long-term dynamic changes of workloads. Experiments using real trace data from Alibaba, Tencent, and Google show that EvoGWP improves prediction accuracy by up to 58.6% over state-of-the-art prediction methods. Moreover, EvoGWP outperforms state-of-the-art prediction methods in terms of model convergence. To the best of our knowledge, this is the first work that explicitly identifies fine-grained workload resource usage patterns to accurately predict long-term dynamic changes of workloads.
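For readers unfamiliar with shapelets, the sketch below illustrates the generic idea of matching a short pattern against a resource usage series. It is not EvoGWP's extraction mechanism, which the abstract does not specify in detail. It assumes the common definition of shapelet distance as the minimum Euclidean distance between the shapelet and any same-length window of the series. The function name `shapelet_distance` and the sample values are hypothetical.

```python
import math


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between `shapelet` and any
    same-length sliding window of `series` (generic definition,
    assumed here; not taken from the EvoGWP paper)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    # Slide a window of length m over the series and keep the closest match.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


# Hypothetical CPU utilization samples containing a short spike.
cpu_usage = [0.20, 0.22, 0.21, 0.60, 0.90, 0.58, 0.23, 0.20]
# Hypothetical shapelet describing a "spike" usage pattern.
spike = [0.60, 0.90, 0.60]
print(shapelet_distance(cpu_usage, spike))
```

A small distance suggests that the workload exhibits the pattern somewhere in the observed period. Tracking how such distances change over time is one simple way to notice that a usage pattern has shifted.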