Abstract
In k-means clustering, we are given a set of n data points in d-dimensional space ℝ^d and an integer k, and the problem is to determine a set of k points in ℝ^d, called centers, that minimizes the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call the enhanced k-means algorithm. The algorithm is easy to implement: it requires only a simple data structure that keeps some information from each iteration for use in the next. Our experimental results demonstrate that the scheme can substantially improve the computational speed of the k-means algorithm, in terms of both the total number of distance calculations and the overall computation time.
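
The abstract does not spell out which information is carried between iterations, so the following is only a minimal sketch of one plausible reading, not the paper's actual method. It assumes that each point caches its cluster label and its squared distance to the assigned center; if the point is no farther from that center after the update, it keeps its label and the distances to the other centers are skipped. The names `kmeans_cached`, `sq_dist`, `mean`, and `n_dist` are hypothetical, and because the shortcut is a heuristic, the result can differ from standard k-means.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_cached(points, init_centers, max_iter=100):
    """k-means with a per-point cache (assumed reading of the abstract).

    Returns (labels, centers, n_dist), where n_dist counts the
    point-to-center distance evaluations that were performed.
    """
    centers = [tuple(float(x) for x in c) for c in init_centers]
    k = len(centers)
    labels = [0] * len(points)
    cached = [float("inf")] * len(points)  # distance to assigned center
    n_dist = 0

    for iteration in range(max_iter):
        for i, p in enumerate(points):
            if iteration > 0:
                # Cheap check: distance to the center assigned last time.
                d = sq_dist(p, centers[labels[i]])
                n_dist += 1
                if d <= cached[i]:
                    cached[i] = d
                    continue  # keep the label, skip the other centers
            # Full search over all k centers.
            dists = [sq_dist(p, c) for c in centers]
            n_dist += k
            best = min(range(k), key=dists.__getitem__)
            labels[i], cached[i] = best, dists[best]

        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous center.
            new_centers.append(mean(members) if members else centers[j])

        if new_centers == centers:
            break  # centers stopped moving
        centers = new_centers

    return labels, centers, n_dist


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centers, n_dist = kmeans_cached(data, [(0, 0), (10, 10)])
    print(labels, centers, n_dist)
```

In this sketch the saving comes from the `continue` branch: a point that passes the cheap check costs one distance evaluation instead of k.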
Original language | English |
---|---|
Pages (from-to) | 1626-1633 |
Number of pages | 8 |
Journal | Journal of Zhejiang University: Science |
Volume | 7 |
Issue number | 10 |
State | Published - Oct 2006 |
Externally published | Yes |
Keywords
- Cluster analysis
- Clustering algorithms
- Data analysis
- k-means algorithm