Academic Paper

Drone Navigation and Avoidance of Obstacles Through Deep Reinforcement Learning
Document Type
Conference
Source
2019 IEEE/AIAA 38th Digital Avionics Systems Conference (DASC), pp. 1-7, Sep. 2019
Subject
Aerospace
Communication, Networking and Broadcast Technologies
Computing and Processing
Drones
UAV
Deep Reinforcement Learning
Q-Network
DDQN
JNN
Language
English
ISSN
2155-7209
Abstract
Unmanned aerial vehicles (UAVs), specifically drones, have been used for surveillance, shipping and delivery, wildlife monitoring, disaster management, etc. The increase in the number of drones in the airspace worldwide will necessarily lead to fully autonomous drones. Given the expected huge number of drones, if they were operated by human pilots the probability of collisions would be too high. In this paper, a deep reinforcement learning (DRL) architecture is proposed to make drones behave autonomously inside a suburban neighborhood environment. The simulated environment contains plenty of obstacles such as trees, cables, parked cars and houses. In addition, other drones act as moving obstacles inside the environment while the learner drone has a goal to achieve. In this way the drone can be trained to detect both stationary and moving obstacles inside the neighborhood, so that drones can be used safely in public areas in the future. The drone has a front camera that continuously captures depth images, and every depth image forms part of the state used in the DRL architecture. Another part of the state is the distance to the geo-fence (a virtual barrier around the environment), which is added as a scalar value; the agent receives a negative reward when it tries to cross the geo-fence limits. In addition, the angle to the goal and the elevation angle between the goal and the drone are added to the state. These scalar values are expected to improve the DRL performance and the reward obtained. The drone is trained using a Q-Network, and its convergence and final reward are evaluated. The states, containing an image and several scalars, are processed by a neural network that joins the two state parts into a single flow; this network is named the Joint Neural Network (JNN) [1]. The training and test results show that the agent can successfully learn to avoid any obstacle in the environment. The results for three scenarios are very promising: the learner drone reaches the destination with a success rate of 100% in the first two tests and 98% in the last test, which involves a total of three drones.
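
The abstract describes a Q-network that joins an image flow and a scalar flow into a single head. The code below is a minimal, hypothetical PyTorch sketch of that JNN-style structure, not the authors' implementation: the 84x84 depth-image resolution, all layer sizes, the three scalar inputs (distance to geo-fence, angle to goal, elevation angle) and the action count are illustrative assumptions the abstract does not specify.

# Hypothetical sketch of a JNN-style Q-network: a convolutional branch
# for the depth image and a dense branch for the scalar state, joined
# ("Joint Neural Network") into one flow that outputs Q-values per action.
import torch
import torch.nn as nn

class JointQNetwork(nn.Module):
    def __init__(self, num_actions: int = 7, num_scalars: int = 3):
        super().__init__()
        # Convolutional branch for the single-channel depth image.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        conv_out = 64 * 7 * 7  # feature size for an assumed 84x84 input
        # Dense branch for the scalar part of the state.
        self.scalar = nn.Sequential(nn.Linear(num_scalars, 64), nn.ReLU())
        # Joint head: concatenate both flows and map to Q-values.
        self.head = nn.Sequential(
            nn.Linear(conv_out + 64, 256), nn.ReLU(),
            nn.Linear(256, num_actions),
        )

    def forward(self, depth_image: torch.Tensor, scalars: torch.Tensor):
        image_features = self.conv(depth_image)    # (batch, conv_out)
        scalar_features = self.scalar(scalars)     # (batch, 64)
        joint = torch.cat([image_features, scalar_features], dim=1)
        return self.head(joint)                    # (batch, num_actions)

# Usage: Q-values for one 84x84 depth image plus three scalar inputs.
q = JointQNetwork()(torch.zeros(1, 1, 84, 84), torch.zeros(1, 3))

Such a network would plug into a standard Q-Network or DDQN training loop, with the negative geo-fence reward shaping the scalar branch's contribution; the abstract names the components but leaves those training details to the paper itself.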