Academic Paper

Handling New Class in Online Label Shift
Document Type
Conference
Source
2023 IEEE International Conference on Data Mining (ICDM), pp. 1283-1288, Dec. 2023
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Adaptation models
Heuristic algorithms
Estimation
Benchmark testing
Medical diagnosis
Data mining
Diseases
data stream
distribution shift
new class
weakly supervised learning
online label shift
Language
ISSN
2374-8486
Abstract
In many real-world applications, data are continuously accumulated within open environments. For instance, in disease diagnosis, the prevalence of diseases can vary across seasons, and new types of diseases can emerge. This paper investigates the problem of learning from unlabeled data where the label distribution evolves over time while a previously unseen new class appears in the data stream. To handle the new class in online label shift, we first design a novel risk estimator based on unbiased risk rewriting and mixture proportion estimation. We then employ the online ensemble paradigm for model updating to handle unknown distribution shifts. The proposed approach enjoys a theoretical dynamic regret guarantee, ensuring its effectiveness in adapting to the changing label distribution and to the presence of the new class in the stream. Experiments on diverse benchmark datasets and two real-world applications demonstrate the effectiveness of the proposed algorithm.
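The online ensemble paradigm mentioned in the abstract can be illustrated with a generic sketch. This is not the paper's algorithm: it assumes a toy one-dimensional squared loss in place of the paper's risk estimator, and the names `online_ensemble`, `target`, `loss`, and `grad` are hypothetical. Several online gradient descent learners run with different step sizes, and a Hedge-style meta-learner reweights them by their observed losses, so the combination can follow a shift of unknown speed.

```python
import math

def online_ensemble(grad_fn, loss_fn, rounds, step_sizes, meta_lr=1.0):
    """Hedge-style meta-learner over online gradient descent base learners."""
    n = len(step_sizes)
    base = [0.0] * n          # one scalar parameter per base learner
    weights = [1.0 / n] * n   # meta weights start uniform
    outputs = []
    for t in range(rounds):
        # Combined decision: weighted average of the base parameters.
        outputs.append(sum(p * b for p, b in zip(weights, base)))
        # Exponential-weights update using each base learner's loss at round t.
        losses = [loss_fn(b, t) for b in base]
        scaled = [p * math.exp(-meta_lr * l) for p, l in zip(weights, losses)]
        total = sum(scaled)
        weights = [s / total for s in scaled]
        # Each base learner takes a gradient step with its own step size.
        base = [b - eta * grad_fn(b, t) for b, eta in zip(base, step_sizes)]
    return outputs, weights

# Toy stream (an assumption for illustration): the optimum jumps from 0 to 1
# at round 50, standing in for an abrupt distribution shift.
def target(t):
    return 0.0 if t < 50 else 1.0

def loss(b, t):
    return (b - target(t)) ** 2

def grad(b, t):
    return 2.0 * (b - target(t))

outputs, weights = online_ensemble(grad, loss, rounds=100,
                                   step_sizes=[0.01, 0.1, 0.5])
```

Running base learners with several step sizes avoids committing in advance to how fast the environment changes. The meta-learner shifts weight toward whichever step size has incurred the lowest loss so far.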