Academic article

Learning Sequential Descriptors for Sequence-Based Visual Place Recognition
Document Type
Periodical
Source
IEEE Robotics and Automation Letters, 7(4):10383-10390, Oct. 2022
Subject
Robotics and Control Systems
Computing and Processing
Components, Circuits, Devices and Systems
Visualization
Transformers
Taxonomy
Robots
Databases
Computer architecture
Task analysis
Deep learning for visual perception
localization
representation learning
visual learning
Language
English
ISSN
2377-3766
2377-3774
Abstract
In robotics, visual place recognition (VPR) is a continuous process that receives as input a video stream to produce a hypothesis of the robot's current position within a map of known places. This work proposes a taxonomy of the architectures used to learn sequential descriptors for VPR, highlighting different mechanisms to fuse the information from the individual images. This categorization is supported by a complete benchmark of experimental results that provides evidence of the strengths and weaknesses of these different architectural choices. The analysis is not limited to existing sequential descriptors, but we extend it further to investigate the viability of Transformers instead of CNN backbones. We further propose a new ad-hoc sequence-level aggregator called SeqVLAD, which outperforms prior state of the art on different datasets. The code is available at https://github.com/vandal-vpr/vg-transformers.
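The abstract describes SeqVLAD as a sequence-level aggregator that fuses information from the individual frames of a video sequence. As a rough illustration of the idea, the sketch below pools local descriptors from all frames of a sequence and aggregates them VLAD-style (soft assignment to centroids, summed residuals). This is a minimal NumPy sketch under assumed shapes and a simple distance-based soft assignment, not the paper's actual (learned, trainable) implementation; the function name and signature are hypothetical.

```python
import numpy as np

def seqvlad_aggregate(seq_features, centroids):
    """Hypothetical sketch of sequence-level VLAD aggregation.

    Local descriptors from ALL frames in a sequence are soft-assigned
    to K centroids, and per-centroid residuals are summed into a single
    sequence descriptor (in the real model, assignments and centroids
    are learned end-to-end).

    seq_features: (T, N, D) -- T frames, N local descriptors each, D dims
    centroids:    (K, D)    -- cluster centers
    """
    T, N, D = seq_features.shape
    K = centroids.shape[0]
    X = seq_features.reshape(T * N, D)  # pool descriptors across the sequence

    # Soft assignment: softmax over negative squared distances to centroids.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)  # (T*N, K)
    a = np.exp(-d2)
    a /= a.sum(axis=1, keepdims=True)

    # Accumulate weighted residuals per centroid.
    vlad = np.zeros((K, D))
    for k in range(K):
        residuals = X - centroids[k]                  # (T*N, D)
        vlad[k] = (a[:, k:k + 1] * residuals).sum(axis=0)

    # Intra-normalize each cluster, then L2-normalize the flattened vector.
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12
    v = vlad.reshape(-1)
    return v / (np.linalg.norm(v) + 1e-12)
```

The key difference from frame-by-frame aggregation is that descriptors from every frame share one assignment and one residual pool, so the output dimension (K * D) is independent of sequence length T.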