Academic Paper

Real-Time Heterogeneous Road-Agents Trajectory Prediction Using Hierarchical Convolutional Networks and Multi-Task Learning
Document Type
Periodical
Source
IEEE Transactions on Intelligent Vehicles (IEEE Trans. Intell. Veh.), 9(2):4055-4069, Feb. 2024
Subject
Transportation
Robotics and Control Systems
Components, Circuits, Devices and Systems
Trajectory
Predictive models
Feature extraction
Adaptation models
Hidden Markov models
Computational modeling
Vehicle dynamics
Self-driving
trajectory prediction
hierarchical convolutional networks
multi-task learning
Language
ISSN
2379-8858
2379-8904
Abstract
Trajectory prediction of heterogeneous road agents such as vehicles, cyclists, and pedestrians in dense traffic plays an essential role in self-driving. Despite breakthroughs in trajectory prediction technology in recent years, challenges remain in world state representation, social interaction modeling, real-time computing, and road agent heterogeneity. To address these challenges, we propose a new model that employs hierarchical convolutional networks and multi-task learning to predict agents' trajectories. The model first builds an effective, unified representation of agent and scene context by rendering heterogeneous world states into a top-down multi-channel raster map. Based on this representation, we propose hierarchical convolutional networks that extract global interaction features and local features of all agents simultaneously, enabling the model to predict multiple agents' trajectories in real time in a single forward pass. In addition, we design multi-task learning branches with dynamic adaptive anchors to capture differences in the behavioral patterns of heterogeneous agents, allowing a single model to accurately predict multimodal trajectories of multi-class agents. Extensive experiments on the public nuScenes and Lyft datasets demonstrate top model performance. Importantly, our model is 2.2x faster and more computationally stable than state-of-the-art models, making it well-suited for mass-produced self-driving systems that require both performance and computational efficiency.
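To make the raster representation mentioned in the abstract more concrete, the following is a minimal sketch of rendering heterogeneous agents into a top-down multi-channel grid. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the channel layout, map size, or resolution, so the names `Agent`, `CHANNELS`, and `rasterize`, the one-binary-channel-per-class layout, and the axis-aligned footprints are all hypothetical.

```python
import math
from dataclasses import dataclass

# Hypothetical class-to-channel mapping (assumption; not taken from the paper).
CHANNELS = {"vehicle": 0, "cyclist": 1, "pedestrian": 2}


@dataclass
class Agent:
    kind: str      # one of the keys in CHANNELS
    x: float       # metres, ego-centred, +x to the right
    y: float       # metres, ego-centred, +y forward
    length: float  # metres, extent along y
    width: float   # metres, extent along x


def rasterize(agents, size=64, resolution=0.5):
    """Render axis-aligned agent footprints into a [channel][row][col] grid.

    The grid is centred on the ego position; each cell covers
    `resolution` x `resolution` metres. Row 0 is the farthest point ahead.
    """
    grid = [[[0.0] * size for _ in range(size)] for _ in CHANNELS]
    half = size * resolution / 2.0  # half-extent of the map in metres
    for a in agents:
        ch = CHANNELS[a.kind]
        # Convert the footprint's bounding box from metres to cell indices.
        c0 = math.floor((a.x - a.width / 2 + half) / resolution)
        c1 = math.floor((a.x + a.width / 2 + half) / resolution)
        r0 = math.floor((half - (a.y + a.length / 2)) / resolution)
        r1 = math.floor((half - (a.y - a.length / 2)) / resolution)
        # Clip to the grid so agents partly outside the map are still drawn.
        for r in range(max(r0, 0), min(r1, size - 1) + 1):
            for c in range(max(c0, 0), min(c1, size - 1) + 1):
                grid[ch][r][c] = 1.0
    return grid
```

Because every agent class is drawn into the same grid, a convolutional network can consume all agents and their spatial relations in one input tensor, which is what makes single-pass prediction for many agents plausible. A fuller raster would likely add channels for map elements and agent history, which the abstract does not detail.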