Journal Article

Learning-Driven Algorithms for Responsive AR Offloading With Non-Deterministic Rewards in Metaverse-Enabled MEC
Document Type
Periodical
Source
IEEE/ACM Transactions on Networking, 32(2):1556-1572, Apr. 2024
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Task analysis
Heuristic algorithms
Metaverse
Base stations
Delays
Software
Servers
Mobile edge computing
augmented reality
reward maximization
online learning
multi-armed bandits
Language
English
ISSN
1063-6692 (print)
1558-2566 (electronic)
Abstract
In the coming era of the Metaverse, Augmented Reality (AR) has become a key enabler of diverse applications, including healthcare, education, smart cities, and entertainment. To provide users with an interactive and immersive experience, most AR applications require extremely high responsiveness and ultra-low processing latency. Mobile edge computing (MEC) has demonstrated great potential in meeting such stringent latency requirements and resource demands of AR applications by executing AR requests on edge servers in close proximity to users. In this paper, we investigate the reward maximization problem for AR applications with uncertain resource demands in an MEC network, such that the cumulative reward of the services provided for AR applications is maximized while the responsiveness of AR applications is enhanced, subject to network resource capacities. To this end, we formulate an exact solution for small problem instances, and devise an efficient approximation algorithm with a provable approximation ratio for larger ones. We also develop an online learning algorithm with a bounded regret for the dynamic reward maximization problem without knowledge of future AR request arrivals, by adopting the Multi-Armed Bandits (MAB) technique. Since maximizing the reward may defer the implementation of some urgent yet low-reward requests, we further propose a fairness-aware online learning algorithm for the dynamic reward maximization problem, built on a data rate prediction mechanism that adopts a multi-task and multi-timescale Long Short-Term Memory (MT2-LSTM) method. Finally, we evaluate the performance of the proposed algorithms for AR applications on a real testbed. Experimental results show that the proposed algorithms outperform existing approaches, improving the reward by 13%.
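To make the MAB-based offloading idea concrete, the following is a minimal UCB1-style sketch in Python, assuming each candidate edge server is a bandit arm and the (normalized) reward of serving an AR request on a server is observed only after the assignment. The class name EdgeServerBandit, the reward model, and the toy parameters are illustrative assumptions, not the paper's algorithm or notation.

```python
import math
import random

class EdgeServerBandit:
    """UCB1-style selection of an edge server for an AR request.

    Each candidate edge server is an arm; the reward of serving a
    request on a server is revealed only after the assignment
    (non-deterministic rewards). Illustrative sketch only, not the
    paper's algorithm.
    """

    def __init__(self, num_servers: int):
        self.counts = [0] * num_servers   # times each server was chosen
        self.means = [0.0] * num_servers  # empirical mean reward per server
        self.t = 0                        # total number of assignments

    def select(self) -> int:
        self.t += 1
        # Try every server once before applying the UCB index.
        for i, c in enumerate(self.counts):
            if c == 0:
                return i
        # UCB1 index: empirical mean plus an exploration bonus that
        # shrinks as a server accumulates observations.
        ucb = [m + math.sqrt(2 * math.log(self.t) / c)
               for m, c in zip(self.means, self.counts)]
        return max(range(len(ucb)), key=ucb.__getitem__)

    def update(self, server: int, reward: float) -> None:
        # Incremental update of the empirical mean reward.
        self.counts[server] += 1
        n = self.counts[server]
        self.means[server] += (reward - self.means[server]) / n

# Toy usage: three servers with unknown mean rewards in [0, 1].
if __name__ == "__main__":
    true_means = [0.3, 0.6, 0.5]
    bandit = EdgeServerBandit(num_servers=3)
    for _ in range(1000):
        s = bandit.select()
        r = min(1.0, max(0.0, random.gauss(true_means[s], 0.1)))
        bandit.update(s, r)
    print("empirical means:", [round(m, 3) for m in bandit.means])
```

UCB1's exploration bonus yields regret that grows only logarithmically in the number of assignments, which is the kind of bounded-regret guarantee the abstract refers to; the paper's actual algorithm additionally accounts for resource capacities and fairness, which this sketch omits.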