Academic Paper

A Sampling-Based Framework for Transductive Classification in Information Networks
Document Type
Conference
Source
2019 8th Brazilian Conference on Intelligent Systems (BRACIS), pp. 657-662, Oct. 2019
Subject
Computing and Processing
Knowledge acquisition
Feature extraction
Pattern classification
Semisupervised learning
Sampling methods
network sampling
classification
regularization
Language
ISSN
2643-6264
Abstract
Knowledge extraction from large information networks has received increasing attention in recent years. Among existing methods for knowledge extraction, transductive classification is a well-known semi-supervised learning approach in which both labeled and unlabeled vertices are used in the learning process. However, transductive classification becomes impractical in large information networks, and applying network sampling techniques in this setting is not trivial, because transductive learning must classify all vertices of the original network, not only the vertices of the sample. In this paper, we present a framework called TCSN (Transductive Classification for Sampled Networks). TCSN supports a variety of network sampling techniques as well as a variety of transductive classification methods for information networks. We present a variation of the Chernoff bounds method to calculate the minimum size of a sampled network, thereby bounding the sampling error within a pre-specified tolerance level. Moreover, TCSN extends the concept of evidence accumulation to combine the results of several rounds of transductive classification into a final classification. Experimental results on different information networks reveal that TCSN achieved statistically better classification performance than classification performed on the whole original network. These promising results show that TCSN enables transductive classification in large information networks without loss of quality in the knowledge extraction process.
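The abstract does not give the paper's exact bound or combination rule, so the following is only an illustrative sketch of the two ideas it names. It assumes the textbook two-sided Chernoff-Hoeffding bound, n >= ln(2/delta) / (2 * epsilon^2), in place of the authors' variation, and a plain majority vote in place of their evidence accumulation scheme. The function names `min_sample_size` and `accumulate_votes` and the parameters `epsilon`, `delta`, and `rounds` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def min_sample_size(epsilon, delta):
    """Smallest n such that 2 * exp(-2 * n * epsilon**2) <= delta.

    Standard Chernoff-Hoeffding form (an assumption, not the paper's variant):
    epsilon is the tolerated sampling error, delta the allowed failure probability.
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(2 / delta) / (2 * epsilon ** 2))


def accumulate_votes(rounds):
    """Combine several classification rounds by majority vote per vertex.

    rounds is a list of dicts mapping vertex -> predicted label. A vertex
    may be missing from some rounds; it is then voted on only by the
    rounds that classified it.
    """
    votes = {}
    for labels in rounds:
        for vertex, label in labels.items():
            votes.setdefault(vertex, Counter())[label] += 1
    # most_common(1) returns the label with the highest vote count.
    return {vertex: counter.most_common(1)[0][0] for vertex, counter in votes.items()}


if __name__ == "__main__":
    print(min_sample_size(epsilon=0.05, delta=0.05))
    rounds = [
        {"a": "x", "b": "y"},
        {"a": "x", "b": "x"},
        {"a": "y", "b": "y"},
    ]
    print(accumulate_votes(rounds))
```

Under this assumed bound the required sample size depends only on the tolerance and the confidence level, not on the size of the original network, which is one reason sampling is attractive for large networks.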