Academic Paper

Construction of Diverse Image Datasets From Web Collections With Limited Labeling
Document Type
Periodical
Source
IEEE Transactions on Circuits and Systems for Video Technology (IEEE Trans. Circuits Syst. Video Technol.). 30(4):1147-1161, Apr. 2020
Subject
Components, Circuits, Devices and Systems
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Visualization
Labeling
Manuals
Image coding
Training
Google
Search engines
Image dataset
active learning
sparse optimization
joint image-text analysis
Language
ISSN
1051-8215
1558-2205
Abstract
Image datasets play a pivotal role in advancing computer vision and multimedia research. However, most datasets are created through extensive human effort and are extremely expensive to scale up. To address these issues, several automatic and semi-automatic approaches have been proposed for creating datasets by refining web images. However, these approaches either include many redundant images in the dataset or fail to provide a set diverse enough to train a robust classifier. Ideally, a representative subset should be both semantically and visually diverse so as to provide the maximum amount of information under a given budget. Most current approaches are based entirely on the analysis of visual features, which may not correlate well with image semantics; hence, the collected images may not be sufficient to give a detailed understanding of a category. In this paper, we propose a system for creating diverse image datasets from the web with limited manual labeling effort. It is based on a semi-supervised sparse coding framework that employs a joint visual-semantic space to simultaneously utilize both the images and their associated textual information from the web for dataset construction. In addition, the proposed system is online and can continuously collect more discriminative images as new data becomes available, which also makes it suitable for enriching existing datasets. The experiments demonstrate that our system can create and enrich datasets with limited manual labeling, with better cross-dataset generalization capability and diversity than state-of-the-art datasets.
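The idea of picking a subset that is diverse in a joint visual-semantic space under a fixed budget can be illustrated with a small sketch. This is not the paper's semi-supervised sparse coding method: it swaps in a plain greedy farthest-point heuristic, and the function names (`joint_features`, `select_diverse`), the `text_weight` parameter, and the toy vectors are all hypothetical, chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def joint_features(visual, text, text_weight=1.0):
    """Concatenate normalized visual and text vectors into one joint space.

    Assumption: simple concatenation stands in for the paper's joint
    visual-semantic space; text_weight is a hypothetical balancing knob.
    """
    return [
        l2_normalize(v) + [text_weight * x for x in l2_normalize(t)]
        for v, t in zip(visual, text)
    ]


def select_diverse(features, budget, start=0):
    """Greedy farthest-point selection of at most `budget` item indices.

    Each step picks the item whose distance to its nearest already-selected
    item is largest, so near-duplicates are chosen last or not at all.
    """
    if not features or budget <= 0:
        return []
    selected = [start]
    # Distance from every item to its nearest selected item so far.
    min_dist = [math.dist(f, features[start]) for f in features]
    while len(selected) < min(budget, len(features)):
        nxt = max(range(len(features)), key=min_dist.__getitem__)
        if min_dist[nxt] == 0.0:
            break  # only exact duplicates of selected items remain
        selected.append(nxt)
        min_dist = [
            min(d, math.dist(f, features[nxt]))
            for d, f in zip(min_dist, features)
        ]
    return selected


# Toy example: items 0 and 1 are near-duplicates, as are items 2 and 3.
visual = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0], [-1.0, 0.0]]
text = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0], [-1.0, 0.0]]
picked = select_diverse(joint_features(visual, text), budget=3)
print(picked)
```

With a budget of three, the heuristic should skip the near-duplicate of the starting item and spread its picks across the three distinct groups, which is the behavior the abstract describes as avoiding redundant images.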