Academic Paper
Two Novel Approaches for Automatic Labelling in Semi-Supervised Methods
Document Type
Conference
Author
Source
2020 International Joint Conference on Neural Networks (IJCNN), pp. 1-8, Jul. 2020
Subject
Language
ISSN
2161-4407
Abstract
In real-world classification problems, the amount of labelled data is usually limited, since manually labelling instances is hard or expensive. However, a natural limitation of a classification algorithm is that it needs a labelled set of reasonable size in order to achieve reasonable performance. One way to mitigate this problem is semi-supervised learning. Several semi-supervised approaches (e.g. self-training) have been proposed in the literature; they aim to train a classifier on only a few labelled instances and then apply a labelling process in which a large number of unlabelled instances is labelled and added to the labelled set. However, this approach can add unreliable instances to the labelled set, impairing the performance of a semi-supervised method. In other words, both the labelling step and the criterion used to select newly labelled instances for inclusion in the labelled set have an important effect on the performance of a semi-supervised method. In this paper, we propose two new approaches for automatic labelling in semi-supervised methods that use the prediction agreement of a pool of classifiers as the selection criterion. In addition, we compare them against two baselines: the standard self-training method and a variation of it called the Flexible Confidence Classifier. Overall, both proposed methods obtained significantly better predictive results than the two baselines over 40 classification datasets.
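The general labelling loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' exact algorithm; it is a hypothetical implementation of agreement-based selection, where an unlabelled instance is added to the labelled set only when every classifier in a small pool predicts the same class for it. The choice of classifiers, data, and number of rounds here is purely illustrative.

```python
# Sketch of agreement-based automatic labelling (illustrative only,
# not the method proposed in the paper): a pool of classifiers is
# trained on the small labelled set, and an unlabelled instance is
# moved into the labelled set only on unanimous prediction agreement.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Simulate scarce labels: keep only 30 labelled instances.
labelled_idx = rng.choice(len(X), size=30, replace=False)
mask = np.zeros(len(X), dtype=bool)
mask[labelled_idx] = True
X_lab, y_lab = X[mask], y[mask]
X_unl = X[~mask]

pool = [
    LogisticRegression(max_iter=1000),
    GaussianNB(),
    DecisionTreeClassifier(max_depth=3, random_state=0),
]

for _ in range(5):  # a few labelling rounds
    if len(X_unl) == 0:
        break
    # Retrain the pool on the current labelled set, predict the rest.
    preds = np.array([clf.fit(X_lab, y_lab).predict(X_unl) for clf in pool])
    agree = (preds == preds[0]).all(axis=0)  # unanimous agreement
    if not agree.any():
        break
    # Move unanimously labelled instances into the labelled set.
    X_lab = np.vstack([X_lab, X_unl[agree]])
    y_lab = np.concatenate([y_lab, preds[0][agree]])
    X_unl = X_unl[~agree]

print(len(X_lab))  # size of the enlarged labelled set
```

The selection criterion (full unanimity) is the strictest option; a real system might instead require agreement from a majority of the pool, trading labelling coverage against the risk of adding unreliable instances.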