Academic Paper

Deep Reinforcement Learning-Based Resource Allocation in Cooperative UAV-Assisted Wireless Networks
Document Type
Periodical
Source
IEEE Transactions on Wireless Communications (IEEE Trans. Wireless Commun.). 20(11):7610-7625, Nov. 2021
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Resource management
Optimization
Array signal processing
Trajectory
Wireless networks
Throughput
Unmanned aerial vehicles
Beamforming
limited fronthaul
optimization
reinforcement learning
UAV placement
Language
ISSN
1536-1276
1558-2248
Abstract
We consider the downlink of an unmanned aerial vehicle (UAV)-assisted cellular network in which multiple cooperative UAVs, whose operations are coordinated by a central ground controller over wireless fronthaul links, serve multiple ground user equipments (UEs). The joint design of the UAVs’ positions, transmit beamforming, and UAV-UE association is formulated as a mixed-integer nonlinear programming (MINLP) problem that maximizes the UEs’ achievable sum rate subject to limited fronthaul capacity constraints. Solving this problem is hard owing to its non-convexity and to the unavailability of channel state information (CSI) caused by the movement of the UAVs. To tackle these challenges, we propose a novel algorithm with two distinguishing features: (i) a deep Q-learning approach that handles CSI unavailability when determining the UAVs’ positions, and (ii) a difference of convex algorithm (DCA) that efficiently solves for the UAVs’ transmit beamforming and UAV-UE association. The proposed algorithm iterates until convergence, and each iteration executes two steps. In the first step, the deep Q-learning (DQL) algorithm allows the UAVs to learn the overall network state and to account for the joint movement of all UAVs when adapting their locations. In the second step, given the UAVs’ positions determined by the DQL algorithm, the DCA iteratively solves a convex approximate subproblem of the original non-convex MINLP problem with the updated parameters, where the problem’s variables are the transmit beamforming and the UAV-UE association. Numerical results show that our design outperforms existing algorithms in both algorithmic convergence and network performance, with a gain of up to 70%.
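The abstract names a difference of convex algorithm (DCA) but does not state its subproblem, so the following is only a minimal sketch of the general DCA idea on a toy scalar objective, not the paper's beamforming and association formulation. The objective f(x) = x^4 - 2x^2, its split into g(x) = x^4 and h(x) = 2x^2, and the function names `objective`, `dca_step`, and `dca` are all assumptions chosen for illustration. At each iteration the concave part -h is linearized around the current point, and the resulting convex surrogate is minimized in closed form.

```python
import math


def objective(x: float) -> float:
    """Toy non-convex objective f(x) = g(x) - h(x), g(x) = x**4, h(x) = 2*x**2.

    This function is an illustrative assumption, not the paper's objective.
    """
    return x**4 - 2.0 * x**2


def dca_step(x_k: float) -> float:
    """One DCA iteration: linearize h at x_k and minimize the convex surrogate.

    Surrogate: x**4 - h'(x_k) * x with h'(x_k) = 4 * x_k.
    Its derivative vanishes when 4*x**3 = 4*x_k, i.e. x = cbrt(x_k).
    """
    # Real cube root that preserves the sign of x_k.
    return math.copysign(abs(x_k) ** (1.0 / 3.0), x_k)


def dca(x0: float, tol: float = 1e-10, max_iter: int = 200):
    """Run DCA from x0; return the final point and the objective history."""
    x = x0
    history = [objective(x)]
    for _ in range(max_iter):
        x_next = dca_step(x)
        history.append(objective(x_next))
        converged = abs(x_next - x) < tol
        x = x_next
        if converged:
            break
    return x, history


if __name__ == "__main__":
    x_star, hist = dca(0.5)
    print(f"final x = {x_star:.6f}, f(x) = {hist[-1]:.6f}, iterations = {len(hist) - 1}")
```

In this toy case each surrogate minimization has a closed form, and the objective value never increases from one iteration to the next, which is the property that makes DCA attractive for non-convex subproblems. In a setting like the one the abstract describes, the surrogate would presumably be a constrained convex program handed to a solver, and the loop would be nested inside the outer position-update step.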