Academic Paper

Tri-Training Based Learning from Positive and Unlabeled Data
Document Type
Conference
Source
2008 International Symposiums on Information Processing (ISIP), pp. 640-644, May 2008
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Semisupervised learning
Iterative algorithms
Bayesian methods
Support vector machines
Support vector machine classification
Educational institutions
Sampling methods
Convergence
Supervised learning
Training data
Semi-supervised Learning
Tri-training
Learning From Positive And Unlabeled Data
Language
Abstract
This paper studies the problem of learning a text classifier from positive and unlabeled examples (the LPU problem) using the tri-training algorithm, which was originally proposed for semi-supervised learning. The key feature of the problem is that there are no negative examples. The paper proposes a new tri-training algorithm for the LPU problem. It combines step 1 of three LPU algorithms to extract a set of reliable negative examples, uses that set to build an initial classifier for tri-training in place of the bootstrap sampling procedure, which is not considered a good method, and then iteratively applies the three SVM classifiers until they converge. Experiments on the popular Reuters-21578 collection show the effectiveness of the proposed technique.
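The flow described in the abstract can be illustrated with a rough sketch. This is not the paper's implementation, and several parts are assumptions made for illustration: nearest-centroid classifiers stand in for the SVMs, the three reliable-negative extractors are hypothetical distance-based heuristics rather than step 1 of any specific LPU algorithm, and the error-rate checks of standard tri-training are omitted. All function and class names are invented here. Points are assumed to be tuples of floats.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def reliable_negatives(positive, unlabeled, dist, fraction=0.3):
    """Hypothetical step-1 heuristic: treat the unlabeled points farthest
    from the positive centroid (under `dist`) as reliable negatives."""
    c = centroid(positive)
    ranked = sorted(unlabeled, key=lambda u: dist(u, c), reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k]

class CentroidClassifier:
    """Nearest-centroid classifier; an assumed stand-in for an SVM."""
    def fit(self, pos, neg):
        self.pos_c = centroid(pos)
        self.neg_c = centroid(neg)
        return self

    def predict(self, x):
        # 1 = positive, 0 = negative
        return 1 if euclidean(x, self.pos_c) <= euclidean(x, self.neg_c) else 0

def tri_train_pu(positive, unlabeled, max_rounds=20):
    """Seed three classifiers from three reliable-negative sets (instead of
    bootstrap samples), then let each pair teach the third until the
    predictions on the unlabeled set stop changing."""
    dists = (euclidean, manhattan, chebyshev)
    neg_sets = [reliable_negatives(positive, unlabeled, d) for d in dists]
    clfs = [CentroidClassifier().fit(positive, n) for n in neg_sets]
    prev = None
    for _ in range(max_rounds):
        preds = [[c.predict(u) for u in unlabeled] for c in clfs]
        if preds == prev:  # no prediction changed: treat as converged
            break
        prev = preds
        new_clfs = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            seed = set(neg_sets[i])
            pos, neg = list(positive), list(neg_sets[i])
            for u, pj, pk in zip(unlabeled, preds[j], preds[k]):
                if pj != pk:
                    continue  # the two peers disagree: skip this point
                if pj == 1:
                    pos.append(u)
                elif u not in seed:  # avoid duplicating seed negatives
                    neg.append(u)
            new_clfs.append(CentroidClassifier().fit(pos, neg))
        clfs = new_clfs
    return clfs

def vote(clfs, x):
    """Majority vote of the three classifiers."""
    return 1 if sum(c.predict(x) for c in clfs) >= 2 else 0
```

Calling `tri_train_pu(positive, unlabeled)` returns the three trained classifiers, and `vote(clfs, x)` combines them by majority. Swapping in a real SVM would only require a class with the same `fit(pos, neg)` and `predict(x)` methods.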