Academic Paper

Markov Decision Process Design for Imitation of Optimal Task Schedulers
Document Type
Conference
Source
2023 IEEE Statistical Signal Processing Workshop (SSP), pp. 56-60, Jul. 2023
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Engineering Profession
Power, Energy and Industry Applications
Signal Processing and Analysis
Process design
Sequential analysis
Processor scheduling
Signal processing algorithms
Training data
Reinforcement learning
Markov processes
Scheduling
imitation learning
Markov decision process
tree search
Language
English
ISSN
2693-3551
Abstract
Due to the generally prohibitive computational requirements of optimal task schedulers, much of the field of task scheduling focuses on designing fast suboptimal algorithms. Since the tree search commonly used by sequencing algorithms such as Branch-and-Bound can naturally be framed as a Markov decision process, designing schedulers using imitation and reinforcement learning is a promising and active area of research. This paper demonstrates how policies can be trained on previously solved scheduling problems and successfully generalize to novel ones. Instead of focusing on policy design, however, this work focuses on designing the Markov decision process observation and reward functions to make learning as effective and efficient as possible. This can be of critical importance when training data is limited or when only simple, fast policies are practical. Various Markov decision process designs are introduced and simulation examples demonstrate the resultant increases in policy performance, even without integration into search algorithms.
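To illustrate the framing the abstract describes, the sketch below casts sequential task selection as a Markov decision process: the state is the current time plus the set of unscheduled tasks, an action picks the next task, the observation exposes per-task features, and the reward penalizes weighted completion time. This is a generic, hypothetical construction for illustration only, not the observation or reward design proposed in the paper; the `Task` fields, feature layout, and the WSPT policy used to drive it are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Task:
    duration: float  # processing time
    release: float   # earliest start time
    weight: float    # cost weight on completion time

class SchedulingMDP:
    """Toy MDP: state = (current time, set of unscheduled task indices)."""

    def __init__(self, tasks):
        self.tasks = tasks

    def reset(self):
        self.time = 0.0
        self.remaining = set(range(len(self.tasks)))
        return self._observe()

    def _observe(self):
        # Observation design choice (illustrative): per-task features
        # (index, duration, time until release, weight) for remaining tasks.
        return [(i, t.duration, max(t.release - self.time, 0.0), t.weight)
                for i, t in enumerate(self.tasks) if i in self.remaining]

    def step(self, action):
        t = self.tasks[action]
        start = max(self.time, t.release)
        self.time = start + t.duration
        self.remaining.discard(action)
        # Reward design choice (illustrative): negative weighted completion
        # time of the task just scheduled, so the return is the negated
        # total weighted completion cost of the full sequence.
        reward = -t.weight * self.time
        done = not self.remaining
        return self._observe(), reward, done

# Roll out a simple heuristic policy (weighted shortest processing time).
tasks = [Task(duration=2.0, release=0.0, weight=1.0),
         Task(duration=1.0, release=0.0, weight=3.0)]
env = SchedulingMDP(tasks)
obs = env.reset()
total = 0.0
while obs:
    a = max(obs, key=lambda feat: feat[3] / feat[1])[0]  # pick best ratio
    obs, r, done = env.step(a)
    total += r
# Scheduling task 1 first yields total weighted cost 6 (return -6.0).
```

Because every scheduling decision is a state transition in this MDP, a learned policy can replace the branching rule of a tree search such as Branch-and-Bound, or, as the abstract notes, be used directly without any search.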