Academic Paper

A Visual and Textual Information Fusion-Based Zero-Shot Framework for Hazardous Material Placard Detection and Recognition
Document Type
Periodical
Source
IEEE Transactions on Artificial Intelligence (IEEE Trans. Artif. Intell.). 5(4):1755-1768, Apr. 2024
Subject
Computing and Processing
Visualization
Feature extraction
Codes
Hazardous materials
Transportation
Training
Object detection
Artificial intelligence in transportation
Intelligent systems
Knowledge transfer
Text processing
Language
ISSN
2691-4581
Abstract
Automatically detecting and recognizing hazardous material placards with computer vision-based methods helps ensure safe operations and proper management of dangerous freight transportation. Deep learning-based object detection methods provide viable and practical solutions for a variety of applications. However, contemporary deep learning-based methods struggle with imbalanced and unseen classes, which are very common in real-life data. This study draws attention to this hitherto neglected challenge in real-world applications and proposes a deep learning-based zero-shot framework to detect and recognize hazardous material placards of both imbalanced and open classes. A logarithmic weighted cross-entropy is proposed to balance the closed classes during training. In addition, a logarithmic weighted confidence fusion strategy is designed to fuse the separately extracted visual and textual information. Experiments on real-world transportation data demonstrate the proposed framework's effectiveness and superiority over other state-of-the-art methods. Notably, the framework outperforms the previous method by a margin of 12.8% in F1 score on the placard dataset. By fusing visual and textual information about each object, this study addresses the imbalanced and open class problem and provides a practical industrial application of the zero-shot learning concept.
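The abstract names a logarithmic weighted cross-entropy but does not give its formula, so the following is only an illustrative sketch of the general idea, not the paper's definition. It assumes a hypothetical weighting w_c = log(1 + N / n_c), where N is the total number of training samples and n_c is the number of samples in class c; the function names `log_class_weights` and `weighted_cross_entropy` are likewise invented for this example.

```python
import math


def log_class_weights(counts):
    """Return one weight per class from per-class sample counts.

    ASSUMPTION: w_c = log(1 + N / n_c). This form is chosen only for
    illustration; the paper's actual weighting is not stated in the abstract.
    Rare classes get larger weights, but the logarithm keeps the weights
    from growing as fast as plain inverse frequency would.
    """
    total = sum(counts)
    return [math.log(1.0 + total / n) for n in counts]


def weighted_cross_entropy(probs, label, weights):
    """Weighted cross-entropy for one sample: -w[label] * log(p[label]).

    `probs` is a predicted probability distribution over the closed classes
    and `label` is the index of the true class.
    """
    return -weights[label] * math.log(probs[label])


if __name__ == "__main__":
    # Hypothetical class counts for three placard classes (not real data).
    counts = [1000, 100, 10]
    weights = log_class_weights(counts)
    print("class weights:", weights)

    # The same predicted confidence (0.5) costs more on the rare class.
    print("loss, common class:", weighted_cross_entropy([0.5, 0.25, 0.25], 0, weights))
    print("loss, rare class:  ", weighted_cross_entropy([0.25, 0.25, 0.5], 2, weights))
```

Under this assumed weighting, a misclassified sample from a rare class contributes more to the training loss than one from a frequent class, which is the balancing effect the abstract describes for the closed classes.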