Clustering Algorithm for Multi-density Datasets

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a density-based clustering method. It discovers clusters of varied shapes and sizes and handles noise, but it fails to discover clusters of varied density. This problem arises from its dependency on global parameters, especially Eps (the neighborhood radius for each point in the dataset). This paper introduces a very simple idea to deal with this problem. The idea stems from density-based methods, especially DENCLUE (DENsity-based CLUstEring) and DBSCAN, and from k-nearest neighbors. The proposed method estimates the local density of each point in the dataset as the sum of distances to its k nearest neighbors, then arranges the points in ascending order of this sum, so that the densest points (smallest sums) come first. The algorithm starts the clustering process from the highest-density point and adds un-clustered points whose density is similar to that of the first point in the cluster. Similar means that there is a small variance in density between the current point and the first point in the cluster. A point is also assigned to the current cluster if the sum of distances to its Minpts nearest neighbors is less than or equal to the density of the first point (the core point condition in DBSCAN). Experimental results show the efficiency of the proposed method in discovering varied-density clusters from data.
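
The procedure outlined in the abstract might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `multi_density_cluster`, the tolerance parameter `tol`, the use of a relative density difference as the "similar density" test, the expansion of a cluster through k-nearest-neighbor lists, and the choice to accept a point when either condition holds are all assumptions made here, since the abstract does not specify them. Noise handling is omitted.

```python
import numpy as np

def multi_density_cluster(X, k=5, minpts=3, tol=0.5):
    """Hypothetical sketch of the abstract's idea. Requires minpts <= k < len(X)."""
    X = np.asarray(X, dtype=float)
    n = len(X)

    # Pairwise Euclidean distances; a point is not its own neighbor.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)

    # Indices of and distances to the k nearest neighbors of every point.
    nbrs = np.argsort(d, axis=1)[:, :k]
    nbr_dist = np.take_along_axis(d, nbrs, axis=1)

    # Local density estimate: sum of distances to the k nearest neighbors.
    # A smaller sum means a denser neighborhood.
    density = nbr_dist.sum(axis=1)
    # Sum of distances to the Minpts nearest neighbors (core point test).
    core_sum = nbr_dist[:, :minpts].sum(axis=1)

    labels = np.full(n, -1)
    cluster = 0
    # Ascending order of the distance sum: the densest point is visited first.
    for seed in np.argsort(density):
        if labels[seed] != -1:
            continue
        labels[seed] = cluster
        frontier = [seed]
        while frontier:
            p = frontier.pop()
            for q in nbrs[p]:  # assumption: grow through k-NN lists
                if labels[q] != -1:
                    continue
                # Assumption: "similar" = relative difference to the seed <= tol.
                similar = abs(density[q] - density[seed]) <= tol * density[seed]
                # Core point condition as described in the abstract.
                core = core_sum[q] <= density[seed]
                if similar or core:
                    labels[q] = cluster
                    frontier.append(q)
        cluster += 1
    return labels
```

Because every cluster is compared against the density of its own first point rather than a single global Eps, a tight group and a loose group can each form a cluster in the same run. The sketch builds a full distance matrix, so it is only suitable for small datasets.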

Original language: English
Pages (from-to): 244-258
Number of pages: 15
Journal: Romanian Journal of Information Science and Technology
Volume: 22
Issue number: 3-4
State: Published - 2019

Keywords

  • Clustering methods
  • Data analysis
  • Data mining
  • Knowledge discovery
  • Un-supervised learning
