Academic Paper

The effect of data set characteristics on the choice of clustering validity index type

Document Type

Conference

Author

Temizel, Tugba Taskaya; Mizani, Mehrdad A; Inkaya, Tulin; Yucebas, Sait Can

Source

2007 22nd International Symposium on Computer and Information Sciences (ISCIS 2007), pp. 1-6, Nov. 2007

Subject

Computing and Processing
Educational institutions
Clustering algorithms
Cities and towns
Informatics
Self organizing feature maps
Frequency
Partitioning algorithms
Employment
Cleaning
Industrial engineering

Language

Abstract

Clustering techniques are widely used to give insight into the similarities and dissimilarities between items in a data set. Most algorithms require the user to tune parameters such as the number of clusters or the cut-off threshold in a dendrogram, and these parameters affect clustering quality. In a good-quality clustering, intra-cluster similarity should be high and inter-cluster similarity low. Several cluster validity methods have been proposed to determine the optimal number of clusters. However, there is no guideline on which clustering validity methods can be used with which clustering algorithms. In this paper, the Dunn and SD validity indices were applied to Kohonen self-organizing maps, k-means and agglomerative clustering algorithms, and their limitations were shown empirically.
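
To illustrate the compactness and separation criterion mentioned in the abstract, here is a minimal sketch of the Dunn index in its common textbook form: the smallest distance between points in different clusters divided by the largest cluster diameter. The function name `dunn_index`, the use of Euclidean distance, and the toy data are assumptions made for illustration; they are not taken from the paper, which may use a different variant of the index.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    # Group points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")

    # Largest diameter: the farthest pair of points that share a cluster.
    max_diameter = max(
        (dist(p, q)
         for members in clusters.values()
         for p, q in combinations(members, 2)),
        default=0.0,
    )

    # Smallest separation: the closest pair of points from different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters.values(), 2)
        for p in a
        for q in b
    )

    if max_diameter == 0.0:
        return float("inf")  # every cluster is a single point (or duplicates)
    return min_separation / max_diameter


# Hypothetical toy data: two tight pairs of points, five units apart.
points = [(0, 0), (0, 1), (5, 0), (5, 1)]
compact_labels = [0, 0, 1, 1]    # each pair forms its own cluster
stretched_labels = [0, 1, 0, 1]  # each cluster spans both pairs

print(dunn_index(points, compact_labels))    # 5.0
print(dunn_index(points, stretched_labels))  # 0.2
```

A higher value indicates compact, well-separated clusters, so under this sketch the first labelling would be preferred. Comparing the value across candidate cluster counts is one way such an index can be used to pick the number of clusters.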
