Academic Paper

Clustering aggregation

Document Type

Conference

Author

Source

21st International Conference on Data Engineering (ICDE'05), Proceedings, pp. 341-352, 2005

Subject

Computing and Processing
Communication, Networking and Broadcast Technologies
Clustering algorithms
Information technology
Computer science
Robustness
Sampling methods
Partitioning algorithms
Data analysis

Language

ISSN

1063-6382
2375-026X

Abstract

We consider the following problem: given a set of clusterings, find a clustering that agrees as much as possible with the given clusterings. This problem, clustering aggregation, appears naturally in various contexts. For example, clustering categorical data is an instance of the problem: each categorical variable can be viewed as a clustering of the input rows. Moreover, clustering aggregation can be used as a meta-clustering method to improve the robustness of clusterings. The problem formulation does not require a-priori information about the number of clusters, and it gives a natural way for handling missing values. We give a formal statement of the clustering-aggregation problem, we discuss related work, and we suggest a number of algorithms. For several of the methods we provide theoretical guarantees on the quality of the solutions. We also show how sampling can be used to scale the algorithms for large data sets. We give an extensive empirical evaluation demonstrating the usefulness of the problem and of the solutions.
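The problem statement in the abstract can be illustrated with a minimal sketch. It assumes that agreement between two clusterings is measured by counting the pairs of objects that one clustering groups together and the other separates; the abstract does not specify the measure, so this is an assumption. The baseline shown simply returns the input clustering with the smallest total disagreement and is not presented as one of the paper's algorithms. The function names and the small categorical table are hypothetical.

```python
from itertools import combinations


def disagreements(a, b):
    """Count object pairs that one labeling groups together and the other splits.

    Assumed distance measure; the abstract does not name one.
    """
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )


def total_disagreement(candidate, clusterings):
    """Sum of disagreements between a candidate and every input clustering."""
    return sum(disagreements(candidate, c) for c in clusterings)


def best_input_clustering(clusterings):
    """Simple baseline: pick the input clustering closest to all the others."""
    return min(clusterings, key=lambda c: total_disagreement(c, clusterings))


# Hypothetical categorical table: each column acts as one clustering of the rows.
rows = [
    ("red", "small", "x"),
    ("red", "small", "x"),
    ("blue", "small", "y"),
    ("blue", "large", "y"),
    ("blue", "large", "y"),
]
clusterings = [list(col) for col in zip(*rows)]
aggregate = best_input_clustering(clusterings)
print(aggregate, total_disagreement(aggregate, clusterings))
```

Each column of the table is treated as a clustering of the five rows, mirroring the abstract's remark that a categorical variable can be viewed as a clustering of the input rows. The number of clusters is never passed in; it follows from whichever clustering is selected.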
