Academic Paper

Efficient k-means++ with random projection
Document Type
Conference
Source
2017 International Joint Conference on Neural Networks (IJCNN), pp. 94-100, May 2017
Subject
Bioengineering
Components, Circuits, Devices and Systems
Computing and Processing
General Topics for Engineers
Robotics and Control Systems
Signal Processing and Analysis
Language
ISSN
2161-4407
Abstract
We propose three algorithms that approximate k-means++ with complexity only O(nk), independent of the dimensionality D, where n is the number of data points and k the number of clusters. By combining random projection with k-means++, the proposed algorithms still enjoy the performance lower bound established for k-means++, while the errors introduced by the approximation are bounded by the Johnson-Lindenstrauss (JL) lemma and are sufficiently low in the experiments we present. By approximating D(x)² in the k-means++ algorithm, we obtain an approximate algorithm of complexity only O(nk); the approximation to D(x)² is obtained through random projections, which perturb the distances only to a provably small extent. Our experiments on real-world datasets show that, when k is small, the minimized potential deteriorates by only 3%–5% while the execution time is reduced to 50% of the original. With a large k, compared to the results for small values of k, the error introduced by random projection is much reduced and the speedup is greater: in our experiments, the proposed algorithms are up to 20 times faster than k-means++.
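The core idea in the abstract can be sketched as follows: project the data to a low dimension with a random matrix, then run standard k-means++ seeding using squared distances D(x)² computed in the projected space. This is a minimal illustrative sketch of that idea, not the paper's exact algorithms; the function name and the projection dimension d are assumptions for illustration.

```python
import numpy as np

def rp_kmeanspp_seeding(X, k, d=10, rng=None):
    """Sketch: k-means++ seeding with distances approximated via
    random projection (distortion bounded by the JL lemma).

    X : (n, D) data matrix; returns the indices of the k seeds.
    The names and the choice of d are illustrative, not from the paper.
    """
    rng = np.random.default_rng(rng)
    n, D = X.shape
    # Gaussian random projection to d dimensions (scaled per JL lemma)
    R = rng.standard_normal((D, d)) / np.sqrt(d)
    Y = X @ R
    # Standard k-means++ seeding, but D(x)^2 is computed on the
    # projected points Y instead of the original D-dimensional X
    centers = [int(rng.integers(n))]
    d2 = np.sum((Y - Y[centers[0]]) ** 2, axis=1)
    for _ in range(1, k):
        c = int(rng.choice(n, p=d2 / d2.sum()))  # sample ∝ D(x)^2
        centers.append(c)
        d2 = np.minimum(d2, np.sum((Y - Y[c]) ** 2, axis=1))
    return np.array(centers)
```

Each seeding step costs O(nd) in the projected space rather than O(nD), which is where the claimed independence from D comes from; the one-time projection X @ R is amortized over all k steps.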