Journal Article

Multitimescale Control and Communications With Deep Reinforcement Learning—Part I: Communication-Aware Vehicle Control
Document Type
Periodical
Source
IEEE Internet of Things Journal, 11(9):15386-15401, May 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
Delays
Optimization
Vehicle-to-everything
Task analysis
Throughput
Control theory
Packet loss
Deep reinforcement learning (DRL)
multitimescale decision making
platoon control (PC)
Language
English
ISSN
2327-4662
2372-2541
Abstract
An intelligent decision-making system enabled by vehicle-to-everything (V2X) communications is essential to achieve safe and efficient autonomous driving (AD), where two types of decisions have to be made at different timescales, i.e., vehicle control and radio resource allocation (RRA) decisions. The interplay between RRA and vehicle control necessitates their collaborative design. In this two-part paper (Part I and Part II), taking platoon control (PC) as an example use case, we propose a joint optimization framework of multitimescale control and communications (MTCC) based on deep reinforcement learning (DRL). In this article (Part I), we first decompose the problem into a communication-aware DRL-based PC subproblem and a control-aware DRL-based RRA subproblem. Then, we focus on the PC subproblem assuming an RRA policy is given, and propose the MTCC-PC algorithm to learn an efficient PC policy. To improve the PC performance under random observation delay, the PC state space is augmented with the observation delay and PC action history. Moreover, the reward function with respect to the augmented state is defined to construct an augmented state Markov decision process (MDP). It is proved that the optimal policy for the augmented state MDP is optimal for the original PC problem with observation delay. Different from most existing works on communication-aware control, the MTCC-PC algorithm is trained in a delayed environment generated by the fine-grained embedded simulation of cellular vehicle-to-everything communications rather than by a simple stochastic delay model. Finally, experiments are performed to compare the performance of MTCC-PC with those of the baseline DRL algorithms.
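The state augmentation described in the abstract, where the controller's observation is extended with the observation delay and the recent control-action history, can be sketched as follows. This is a minimal illustrative construction, not the paper's implementation; the class name, the fixed history length, and the delay normalization are all assumptions made for the example.

```python
from collections import deque
import numpy as np

class DelayAugmentedState:
    """Illustrative sketch of an augmented state for a delayed-observation MDP:
    z_t = [delayed observation | normalized observation delay | last K actions].
    Names, shapes, and normalization are assumptions for this example only.
    """

    def __init__(self, obs_dim, max_delay):
        self.obs_dim = obs_dim
        self.max_delay = max_delay
        # Action history buffer covering the maximum possible delay,
        # initialized with zero (no-op) actions.
        self.actions = deque([0.0] * max_delay, maxlen=max_delay)

    def push_action(self, a):
        # Record the most recent control action applied by the agent.
        self.actions.append(float(a))

    def build(self, delayed_obs, delay_steps):
        # Concatenate the (stale) observation, the delay normalized to [0, 1],
        # and the action history into one augmented state vector.
        return np.concatenate([
            np.asarray(delayed_obs, dtype=float),
            [delay_steps / self.max_delay],
            np.array(self.actions),
        ])

# Example: a 3-dimensional platoon-control observation, delays up to 4 steps.
aug = DelayAugmentedState(obs_dim=3, max_delay=4)
aug.push_action(0.5)                       # last applied acceleration (assumed)
z = aug.build([1.0, 0.0, -0.2], delay_steps=2)
# z has length obs_dim + 1 + max_delay = 3 + 1 + 4 = 8
```

Feeding `z` (instead of the raw delayed observation) to the DRL policy is what makes the augmented process Markovian despite the random delay: the history of actions taken since the stale observation is exactly the information needed to predict the current true state.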