![]() |
VOOZH | about |
DBScan (Density-Based Spatial Clustering of Applications with Noise) is a non-linear, unsupervised clustering algorithm that identifies groups (clusters) of densely packed data points without requiring the number of clusters to be specified beforehand. Unlike algorithms like k-means, DBScan is capable of discovering arbitrarily shaped clusters and distinguishing noise or outliers in datasets.
The diagram shows DBSCAN clustering where core points have ≥ 4 neighbors within a 1-unit radius, border points are near core points but not dense enough, and noise points lie outside any dense region.
We implement the DBScan clustering algorithm in R to identify non-linear clusters and detect noise in an unsupervised learning setting.
We install and load the fpc package which provides the DBScan functionality.
We load and view the built-in Iris dataset to understand its structure.
Output:
We remove the label column to prepare the dataset for unsupervised clustering.
We fit the DBScan clustering model on the prepared dataset with specified parameters.
Output:
We extract the cluster assignments and compare them to the original species for evaluation.
Output:
We visualize the clusters to understand the spatial groupings formed by DBScan.
Output:
The output displays a 2D scatter plot of DBSCAN clustering results, where points are colored by cluster labels and noise points are marked separately, helping visualize spatial groupings in the Iris dataset.