Academic Paper

Enhanced SMART framework for gene clustering using successive processing
Document Type
Conference
Source
2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1-6, Sep. 2013
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
General Topics for Engineers
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Indexes
Clustering algorithms
Merging
Length measurement
Cancer
Algorithm design and analysis
Covariance matrices
SMART
Successive processing
Finite mixture models
Gene clustering
Language
ISSN
1551-2541
2378-928X
Abstract
In this paper, we develop an enhanced splitting-merging awareness tactics (E-SMART) framework using successive processing. Instead of selecting the best clustering from the results with a clustering selection criterion, as in the original SMART framework, we introduce a successive processing strategy that subtracts clusters one by one over iterations. In each iteration, the silhouette index is used to evaluate the intermediate clusters and to rank them by index value from high to low. We then subtract the best cluster from the dataset and feed the remaining data back into the splitting-while-merging (SWM) process to start a new iteration. Clustering and subtraction are repeated successively and terminate automatically once no splitting occurs in the SWM process. In this way, all clusters are obtained iteratively. We implement the framework using component-wise expectation maximization (CEM) for finite mixture models (FMM). The E-SMART-FMM implementation is tested on the real NCI-60 cancer dataset. We evaluate the clustering results of the proposed algorithm, together with those of two existing self-splitting algorithms, using two popular validation indices other than the silhouette index. Both validation indices consistently indicate that E-SMART-FMM is superior to the existing algorithms.
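The successive processing loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `toy_swm` is a hypothetical stand-in for the SWM process (the paper uses CEM for finite mixture models, which is not reproduced here), and the names `cluster_silhouettes`, `successive_clustering`, and the `gap` threshold are assumptions made for illustration. The silhouette value of a point is taken in its standard form, s = (b - a) / max(a, b), where a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster. Treating the data left over at termination as the final cluster is also an assumption.

```python
import math

def cluster_silhouettes(points, labels):
    """Mean silhouette value per cluster (requires at least two clusters)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = {}
    for lab, members in clusters.items():
        total = 0.0
        for i in members:
            if len(members) == 1:
                continue  # a singleton contributes a silhouette of 0
            a = sum(math.dist(points[i], points[j])
                    for j in members if j != i) / (len(members) - 1)
            b = min(sum(math.dist(points[i], points[j]) for j in other) / len(other)
                    for other_lab, other in clusters.items() if other_lab != lab)
            denom = max(a, b)
            total += (b - a) / denom if denom > 0 else 0.0
        scores[lab] = total / len(members)
    return scores

def toy_swm(points, gap=3.0):
    """Hypothetical stand-in for SWM: split in two, or report no split."""
    if len(points) < 2:
        return [0] * len(points)
    # Seed with the two most distant points.
    s0, s1 = max(((p, q) for p in points for q in points),
                 key=lambda pq: math.dist(pq[0], pq[1]))
    if math.dist(s0, s1) < gap:
        return [0] * len(points)  # data look like one cluster: no split
    # Assign every point to the nearer seed.
    return [0 if math.dist(p, s0) <= math.dist(p, s1) else 1 for p in points]

def successive_clustering(points, cluster_fn):
    """Repeatedly cluster, subtract the best cluster, and re-cluster the rest."""
    remaining = list(points)
    extracted = []
    while True:
        labels = cluster_fn(remaining)
        if len(set(labels)) < 2:
            # No splitting happened: stop. Assumption: the leftover data
            # are kept as the final cluster.
            if remaining:
                extracted.append(remaining)
            return extracted
        scores = cluster_silhouettes(remaining, labels)
        best = max(scores, key=scores.get)  # highest mean silhouette
        extracted.append([p for p, l in zip(remaining, labels) if l == best])
        remaining = [p for p, l in zip(remaining, labels) if l != best]

# Toy one-dimensional data with three groups (illustrative only).
data = [(0,), (1,), (2,), (10,), (10.5,), (11,), (30,), (31,), (32,)]
clusters = successive_clustering(data, toy_swm)
```

On this toy input the loop subtracts one cluster per iteration and stops when the stand-in reports no split, so `clusters` holds the groups in the order they were removed. Any clustering routine that returns one label per point could be passed as `cluster_fn` in place of `toy_swm`.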