Journal Article

DeepLS: Local Search for Network Optimization Based on Lightweight Deep Reinforcement Learning
Document Type
Periodical
Source
IEEE Transactions on Network and Service Management, 21(1):108-119, Feb. 2024
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Optimization
Training
Routing
Heuristic algorithms
Reinforcement learning
Search problems
Metaheuristics
Deep reinforcement learning
local search
IP networks
optical networks
NP-Hard optimization
Language
English
ISSN
1932-4537
2373-7379
Abstract
Deep Reinforcement Learning (DRL) is being investigated as a competitive alternative to traditional techniques for solving network optimization problems. A promising research direction lies in enhancing traditional optimization algorithms by offloading low-level decisions to a DRL agent. In this study, we consider how to effectively employ DRL to improve the performance of Local Search algorithms, i.e., algorithms that, starting from a candidate solution, explore the solution space by iteratively applying local changes (i.e., moves) and return the best solution found in the process. We propose a Local Search algorithm based on lightweight Deep Reinforcement Learning (DeepLS) that, given a neighborhood, queries a DRL agent to choose a move, with the goal of achieving the best objective value in the long term. Our DRL agent, based on permutation-equivariant neural networks, is composed of fewer than one hundred parameters, requires at most ten minutes of training, and can evaluate problem instances of arbitrary size, generalizing to networks and traffic distributions unseen during training. We evaluate DeepLS on two illustrative NP-Hard network routing problems, namely OSPF Weight Setting and Routing and Wavelength Assignment, training on a single small network and evaluating on instances 2x-10x larger than those used for training. Experimental results show that DeepLS outperforms existing DRL-based approaches from the literature and attains results competitive with state-of-the-art metaheuristics, with computing times up to 8x lower than those of the strongest algorithmic baselines.
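
The abstract describes the DeepLS loop only at a high level: a local search in which move selection over the current neighborhood is delegated to a tiny, permutation-equivariant scorer. The sketch below is a minimal Python illustration of that idea under assumptions not taken from the paper: the names (MoveScorer, deep_ls), the per-move features, the toy integer weight-setting objective, and the architecture are all hypothetical, and the RL training of the scorer is omitted entirely.

```python
import numpy as np

rng = np.random.default_rng(0)


class MoveScorer:
    """Illustrative permutation-equivariant scorer: the SAME small MLP is
    applied to each candidate move's feature vector, so the ranking does not
    depend on the order or the number of moves in the neighborhood."""

    def __init__(self, n_features: int, hidden: int = 4):
        # Only a handful of parameters, in the spirit of a "lightweight" agent.
        self.w1 = rng.normal(scale=0.5, size=(n_features, hidden))
        self.w2 = rng.normal(scale=0.5, size=(hidden, 1))

    def scores(self, feats: np.ndarray) -> np.ndarray:
        # feats: (num_moves, n_features) -> one score per move.
        return (np.tanh(feats @ self.w1) @ self.w2).ravel()


def deep_ls(weights, objective, scorer, max_iters=200):
    """Toy local search over integer link weights: each move bumps one weight
    by +/-1, and the scorer (rather than a greedy rule) picks the move.
    The best solution seen during the search is returned."""
    current = weights.copy()
    best, best_cost = current.copy(), objective(current)
    for _ in range(max_iters):
        moves, feats = [], []
        for i in range(len(current)):
            for delta in (-1, +1):
                moves.append((i, delta))
                # Cheap, size-independent per-move features (an assumption of
                # this toy), so the same scorer handles any network size.
                feats.append([delta, current[i], current[i] + delta])
        i, delta = moves[int(np.argmax(scorer.scores(np.array(feats, float))))]
        current[i] = max(1, current[i] + delta)  # keep link weights positive
        cost = objective(current)
        if cost < best_cost:
            best, best_cost = current.copy(), cost
    return best, best_cost


# Stand-in for a weight-setting instance: push weights toward a hidden target.
target = rng.integers(1, 10, size=8)
objective = lambda w: float(np.sum((w - target) ** 2))

scorer = MoveScorer(n_features=3)  # untrained here; RL training is out of scope
solution, cost = deep_ls(rng.integers(1, 10, size=8), objective, scorer)
print("best cost found:", cost)
```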