Academic Paper

From Multi-Source Virtual to Real: Effective Virtual Data Search for Vehicle Re-Identification
Document Type
Periodical
Author
Source
IEEE Transactions on Intelligent Transportation Systems (IEEE Trans. Intell. Transport. Syst.), 25(5):3433-3444, May 2024
Subject
Transportation
Aerospace
Communication, Networking and Broadcast Technologies
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Training
Data models
Redundancy
Solid modeling
Pipelines
Three-dimensional displays
Engines
Virtual-to-real vehicle re-identification
data redundancy
data search
Language
ISSN
1524-9050
1558-0016
Abstract
Because they require no tedious and time-consuming labeling, virtual datasets have recently shown their advantages for vehicle re-identification (re-ID). Existing virtual-to-real vehicle re-ID methods train on only a single virtual dataset and do not jointly exploit datasets from different generative sources. Multi-source virtual datasets offer more data diversity, which can boost model performance. We therefore propose a multi-source virtual-to-real vehicle re-ID pipeline in which multiple source virtual datasets are used during training. However, a multi-source virtual dataset suffers from more data redundancy than a single virtual dataset, which can reduce training efficiency. Intuitively, this redundancy can be mitigated by virtual data search. Unlike in the single-dataset setting, a performance gap exists between the source virtual datasets, indicating that they contribute differently to model learning. Accordingly, we propose to split the multi-source virtual dataset into a main training set and an auxiliary training set and to design a separate sampling strategy for each. For the main training set, the Consistent Attribute Distribution-FEature distance Trade-off (CAD-FET) strategy is designed to search for representative data. For the auxiliary training set, a cluster-based sampling strategy is proposed to search for the most diverse subset. In addition, a simple yet effective two-stage training strategy is proposed to make reasonable use of these subsets. Extensive virtual-to-real vehicle re-ID experiments show that our data sampling method reduces the volume of the multi-source virtual dataset by around 77% and 96% and boosts model performance when tested on VeRi776 and VehicleID, respectively.
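The abstract does not say how the cluster-based sampling for the auxiliary training set is implemented. As a rough illustration only, and not the authors' method, the sketch below shows one common way to pick a diverse subset from feature vectors: run a small k-means and keep the sample nearest to each centroid. The function names `sq_dist` and `cluster_sample`, the farthest-first initialization, and the nearest-to-centroid rule are all assumptions made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_sample(features, k, iters=10):
    """Return sorted indices of up to k samples, one per k-means cluster.

    Hypothetical helper: assumes 1 <= k <= len(features) and that
    `features` is a list of numeric vectors (e.g. re-ID embeddings).
    """
    # Farthest-first initialization (an assumption, chosen for determinism):
    # start from the first sample, then repeatedly add the sample that is
    # farthest from all centroids chosen so far.
    centroids = [list(features[0])]
    while len(centroids) < k:
        far = max(features, key=lambda f: min(sq_dist(f, c) for c in centroids))
        centroids.append(list(far))

    assign = [0] * len(features)
    for _ in range(iters):
        # Assignment step: attach each sample to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: sq_dist(f, centroids[c]))
            for f in features
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, a in zip(features, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    # Keep the member closest to each centroid, so the subset spans clusters.
    picked = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if idx:
            picked.append(min(idx, key=lambda i: sq_dist(features[i], centroids[c])))
    return sorted(picked)
```

On toy data made of three well-separated groups of points, calling `cluster_sample(features, 3)` keeps one index per group, which reduces nine samples to three. A real pipeline would presumably cluster learned image features and might keep more than one sample per cluster; those choices are not specified in the abstract.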