Journal Article

Deep Reinforcement Learning for Improving Resource Utilization Efficiency of URLLC With Imperfect Channel State Information
Document Type
Periodical
Source
IEEE Wireless Communications Letters, 12(10):1796-1800, Oct. 2023
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Keywords
Resource management
Ultra reliable low latency communication
Channel estimation
Channel models
Reliability
Costs
Ultra-reliable and low-latency communications
deep reinforcement learning
resource allocation
Language
English
ISSN
2162-2337
2162-2345
Abstract
This letter optimizes the resource blocks allocated to channel estimation and data transmission in multiple-input single-output (MISO) ultra-reliable low-latency communication (URLLC) systems. The goal is to improve resource utilization efficiency subject to a reliability constraint. Because wireless channels are correlated in the temporal domain and channel estimation is not error-free, the problem is formulated as a partially observable Markov decision process (POMDP), a sequential decision-making problem with partial observations. To solve this problem, we develop a constrained deep reinforcement learning (DRL) algorithm, namely the Cascaded-Action Twin Delayed Deep Deterministic policy gradient (CA-TD3). We train the policy using a primal-domain method and compare it against a primal-dual method and an existing benchmark. Two channel models are considered for evaluation: a first-order autoregressive correlated channel model and the clustered delay line (CDL) channel model. Simulation results show that the proposed primal CA-TD3 method converges faster than the primal-dual method and improves resource utilization efficiency by over 30% compared to the benchmark.
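The first-order autoregressive correlated channel model mentioned in the abstract is commonly written as h[t] = rho * h[t-1] + sqrt(1 - rho^2) * w[t], where w[t] is circularly symmetric complex Gaussian noise and rho is the temporal correlation coefficient. A minimal NumPy sketch of such a channel generator is given below; the function name, parameters, and the unit-variance Rayleigh-fading assumption are illustrative choices, not taken from the letter itself.

```python
import numpy as np


def ar1_channel(n_steps, n_antennas, rho, seed=None):
    """Sample a temporally correlated MISO channel sequence under a
    first-order autoregressive (AR(1)) Rayleigh-fading model:
        h[t] = rho * h[t-1] + sqrt(1 - rho**2) * w[t],
    where w[t] ~ CN(0, I) so that each entry keeps unit average power.

    Hypothetical helper for illustration; not code from the letter.
    """
    rng = np.random.default_rng(seed)

    def cn(size):
        # Circularly symmetric complex Gaussian with unit variance.
        return (rng.standard_normal(size)
                + 1j * rng.standard_normal(size)) / np.sqrt(2.0)

    h = np.empty((n_steps, n_antennas), dtype=complex)
    h[0] = cn(n_antennas)
    for t in range(1, n_steps):
        h[t] = rho * h[t - 1] + np.sqrt(1.0 - rho**2) * cn(n_antennas)
    return h


# Example: 4-antenna channel over 20000 slots with correlation 0.9.
h = ar1_channel(20000, 4, rho=0.9, seed=0)
avg_power = np.mean(np.abs(h) ** 2)          # should stay near 1
lag1 = np.mean(np.real(h[1:] * np.conj(h[:-1]))) / avg_power  # near rho
```

The sqrt(1 - rho^2) scaling keeps the marginal channel power constant over time, so the temporal correlation can be varied without changing the average SNR of the simulated link.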