Academic Paper

SA-PSO-GK++: A New Hybrid Clustering Approach for Analyzing Medical Data
Document Type
Periodical
Source
IEEE Access, vol. 12, pp. 12501-12516, 2024
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Clustering algorithms
Particle swarm optimization
Convergence
Estimation
Simulated annealing
Optimization
Robustness
Swarm intelligence
particle swarm optimization (PSO)
K-means
K-means++
simulated annealing
Gaussian estimation
data clustering
big data
cluster convergence
clustering metrics
local optima
Language
English
ISSN
2169-3536
Abstract
Data clustering is an unsupervised learning task that has been studied extensively because of its wide applicability across domains. Traditional algorithms often struggle to balance exploration and exploitation, which leads to sub-optimal solutions. This paper presents a novel hybrid algorithm, SA-PSO-GK++, that combines Particle Swarm Optimization (PSO), K-means++, Simulated Annealing (SA), and Gaussian Estimation of Distribution (GED) to address this issue. SA-PSO-GK++ aims to overcome the drawbacks of existing methods by leveraging the strengths of each component. K-means++ initialization reduces the risk of poor initial centroids, while PSO aids efficient exploration of the search space. GED provides a statistical model of the particle space, enabling the algorithm to generate new candidate solutions that are statistically guided by the current best solutions. In addition, Simulated Annealing allows the algorithm to escape local minima, enhancing its global search capability. We evaluate SA-PSO-GK++ on benchmark datasets from the UCI Machine Learning Repository, including the Iris, Breast Cancer, Heart, and Contraceptive Method Choice datasets. The proposed method outperforms conventional and some state-of-the-art hybrid clustering algorithms in terms of sum of Euclidean distances, normalized index, and error rate. These advantages make SA-PSO-GK++ a compelling option for a wide range of clustering applications, and the results offer promising avenues for future research on optimizing and applying the technique in diverse domains.
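The abstract names four ingredients but does not say how they are wired together, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that each particle encodes k centroids seeded by K-means++, that the fitness is the sum of Euclidean distances from each point to its nearest centroid, that the Gaussian step fits an independent per-coordinate Gaussian to the better half of the personal bests, and that a simulated-annealing rule decides whether the sampled candidate replaces the worst particle. All function names, parameter values, and the synthetic demo data are hypothetical.

```python
import math
import random

def cost(points, centroids):
    """Sum of Euclidean distances from each point to its nearest centroid."""
    return sum(min(math.dist(p, c) for c in centroids) for p in points)

def kmeanspp_init(points, k, rng):
    """K-means++ seeding: later centroids are drawn with probability
    proportional to squared distance from the nearest chosen centroid.
    Assumes the data contain at least k distinct points."""
    centroids = [list(rng.choice(points))]
    while len(centroids) < k:
        weights = [min(math.dist(p, c) for c in centroids) ** 2 for p in points]
        centroids.append(list(rng.choices(points, weights=weights)[0]))
    return sorted(centroids)  # crude alignment of centroid order across particles

def hybrid_cluster(points, k, n_particles=10, n_iter=40, w=0.7, c1=1.5, c2=1.5,
                   t0=1.0, cooling=0.95, seed=0):
    """Hypothetical K-means++ / PSO / Gaussian-sampling / SA hybrid (assumed wiring)."""
    rng = random.Random(seed)
    d, n = len(points[0]), k * len(points[0])
    split = lambda v: [v[i:i + d] for i in range(0, len(v), d)]
    fitness = lambda v: cost(points, split(v))
    # Each particle is a flat vector of k*d centroid coordinates.
    pos = [[x for c in kmeanspp_init(points, k, rng) for x in c]
           for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    fit = [fitness(v) for v in pos]
    pbest, pbest_fit = [v[:] for v in pos], fit[:]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    temp = t0
    for _ in range(n_iter):
        # PSO step: standard inertia + cognitive + social velocity update.
        for i in range(n_particles):
            for j in range(n):
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                pos[i][j] += vel[i][j]
            fit[i] = fitness(pos[i])
            if fit[i] < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit[i]
        # Gaussian step (assumed form): sample one candidate from a
        # per-coordinate Gaussian fitted to the better half of personal bests.
        order = sorted(range(n_particles), key=pbest_fit.__getitem__)
        elite = order[:max(2, n_particles // 2)]
        cand = []
        for j in range(n):
            vals = [pbest[i][j] for i in elite]
            mu = sum(vals) / len(vals)
            sigma = math.sqrt(sum((v - mu) ** 2 for v in vals) / len(vals))
            cand.append(rng.gauss(mu, sigma))
        cand_fit = fitness(cand)
        # SA step (assumed form): the candidate replaces the worst particle if
        # it is better, or with probability exp(-delta / temp) if it is worse.
        worst = max(range(n_particles), key=fit.__getitem__)
        delta = cand_fit - fit[worst]
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            pos[worst], fit[worst] = cand, cand_fit
            if cand_fit < pbest_fit[worst]:
                pbest[worst], pbest_fit[worst] = cand[:], cand_fit
        temp *= cooling
        g = min(range(n_particles), key=pbest_fit.__getitem__)
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    return split(gbest), gbest_fit

# Synthetic demo data (three 2-D blobs), not one of the UCI datasets.
demo_rng = random.Random(1)
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [(cx + demo_rng.gauss(0, 0.5), cy + demo_rng.gauss(0, 0.5))
          for cx, cy in centers for _ in range(30)]
centroids, total = hybrid_cluster(points, k=3)
print(centroids, total)
```

Because the global best is only ever replaced by a strictly better solution, the returned cost in this sketch can never be worse than the best K-means++ seed among the initial particles. Sorting the seeded centroids is a simple way to reduce label switching between particles before their coordinates are averaged; a fuller treatment would match centroids explicitly.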