Academic Paper

Recurring concept detection for spam filtering
Document Type
Conference
Source
17th International Conference on Information Fusion (FUSION), 2014, pp. 1-7, Jul 2014
Subject
Aerospace
Computing and Processing
General Topics for Engineers
Photonics and Electrooptics
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Context
Training
Accuracy
Adaptation models
Unsolicited electronic mail
Context modeling
Prediction algorithms
Language
Abstract
In this work we examine the problem of recurring concept drifts and propose a framework to manage them. Its implementation and evaluation are oriented towards spam detection, a real-world setting in which concepts (spam patterns) may reappear. Detecting recurring drifts makes it possible to reuse previously learnt models, improving the overall learning process in terms of both accuracy and efficiency. To this end, we propose the Meta-Model Drift Detector (MM-DD). The proposed system is able to deal with the underlying context associated with the drifts detected throughout the data stream learning process. To do so, a meta-model is trained in parallel with the learning process. Whenever a drift occurs, the learning process of the base classifier feeds the meta-model with the corresponding context information, so that the latter can predict recurrent situations in the near future. Therefore, when a drift is detected, the meta-model checks whether the current context information matches any context previously handled by the learning process and, if so, provides the most suitable stored model to deal with the concept. Our experimental results support the value of the proposed MM-DD in terms of accuracy when compared with existing approaches.
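The model-reuse step described in the abstract can be illustrated with a minimal sketch. This is not the authors' MM-DD implementation: the names `ConceptStore`, `on_drift`, and `tolerance` are hypothetical, the context is assumed to be a fixed-length, non-empty tuple of numeric features, and a simple distance threshold stands in for the trained meta-model.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Sequence, Tuple


@dataclass
class ConceptStore:
    """Hypothetical store pairing context vectors with previously learnt models."""

    tolerance: float = 0.1  # assumed threshold for treating two contexts as the same
    entries: List[Tuple[Tuple[float, ...], Any]] = field(default_factory=list)

    def remember(self, context: Sequence[float], model: Any) -> None:
        """Record the context observed at a drift together with its model."""
        self.entries.append((tuple(context), model))

    def recall(self, context: Sequence[float]) -> Optional[Any]:
        """Return the stored model whose context is closest, if close enough."""
        best_model, best_dist = None, float("inf")
        for stored, model in self.entries:
            # Largest per-feature difference (Chebyshev distance).
            dist = max(abs(a - b) for a, b in zip(stored, context))
            if dist < best_dist:
                best_model, best_dist = model, dist
        return best_model if best_dist <= self.tolerance else None


def on_drift(
    store: ConceptStore,
    context: Sequence[float],
    train_new_model: Callable[[], Any],
) -> Tuple[Any, bool]:
    """Handle a detected drift: reuse a stored model or train and store a new one.

    Returns the model to use and a flag telling whether it was reused.
    """
    reused = store.recall(context)
    if reused is not None:
        return reused, True
    model = train_new_model()
    store.remember(context, model)
    return model, False
```

In this sketch, a drift whose context lies within `tolerance` of a stored context returns the stored model instead of training a new one, which is the source of the efficiency gain the abstract refers to. A learned meta-model, as in the paper, would replace the fixed threshold in `recall`.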