URL: https://www.geeksforgeeks.org/machine-learning/ordering-points-to-identify-cluster-structure-optics-using-sklearn/

Ordering Points To Identify Cluster Structure (OPTICS) using Sklearn

Last Updated : 13 Sep, 2025

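OPTICS (Ordering Points To Identify the Clustering Structure) is a density-based clustering algorithm used to find clusters of different shapes and densities in a dataset. It works like DBSCAN, but instead of committing to a single distance threshold it orders the points and records a reachability distance for each one, which gives better results when the data has clusters with varying densities.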

Why use OPTICS instead of DBSCAN?

  • DBSCAN needs a fixed eps which may not work well if some clusters are tight and others are loose.
  • OPTICS doesn’t force you to set a global distance. It gives a reachability plot and clusters can be extracted from it at different levels.
  • OPTICS handles datasets with varying densities better and can identify both dense and sparse clusters in a single run.
  • It provides more detailed cluster structure information making it easier to explore data visually and decide the best cut-off points for clusters.
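
The fixed-eps limitation can be illustrated with a small sketch. The data here (one tight blob and one loose blob), the seed, the two eps values and min_samples=10 are all assumptions chosen for illustration; they are not taken from the article's dataset.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Illustrative data (an assumption): one tight blob and one loose blob
rng = np.random.default_rng(0)
tight = np.array([0.0, 0.0]) + 0.1 * rng.standard_normal((200, 2))
loose = np.array([6.0, 6.0]) + 1.5 * rng.standard_normal((200, 2))
X = np.vstack([tight, loose])

# Run DBSCAN with a small and a large eps and compare the outcome
noise_counts = {}
for eps in (0.2, 1.0):
    labels = DBSCAN(eps=eps, min_samples=10).fit_predict(X)
    n_clusters = len(set(labels) - {-1})  # label -1 marks noise
    noise_counts[eps] = int(np.sum(labels == -1))
    print(f"eps={eps}: {n_clusters} clusters, {noise_counts[eps]} noise points")
```

With the same min_samples, a smaller eps can never mark fewer points as noise than a larger one, so a value tuned for the tight blob tends to discard much of the loose one.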

Step 1: Importing Libraries

We will import the necessary libraries: Matplotlib, NumPy and scikit-learn.
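
A minimal sketch of the imports this walkthrough relies on. The exact import list is an assumption, since the original code block is not reproduced here; gridspec is included only because the final figure uses a grid of subplots.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec  # assumed: used to lay out the subplots
from sklearn.cluster import OPTICS, cluster_optics_dbscan
```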

Step 2: Creating Sample Data

We generate 6 different groups of points (clusters), each in a different location and with a different density. All groups are then combined into a single dataset, X_modified.
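
One way to build such a dataset is sketched below. The centers, spreads, seed and the 250 points per group are illustrative assumptions, not the article's original values; only the name X_modified comes from the text.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed (assumption) for reproducibility
n_points = 250                   # points per group (assumption)

# (center, spread) pairs are illustrative; a larger spread gives a sparser cluster
specs = [
    ((-5.0, -2.0), 0.8),
    ((4.0, -1.0), 0.1),
    ((1.0, -2.0), 0.2),
    ((-2.0, 3.0), 0.3),
    ((3.0, -2.0), 1.6),
    ((5.0, 6.0), 2.0),
]

# Draw Gaussian points around each center and stack them into one array
clusters = [np.array(c) + s * rng.standard_normal((n_points, 2)) for c, s in specs]
X_modified = np.vstack(clusters)
print(X_modified.shape)  # (1500, 2)
```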

Step 3: Apply OPTICS Clustering

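Now we will apply OPTICS clustering with the following parameters:

  • min_samples=40: Number of points required in a neighborhood for a point to count as a core point, i.e. to form a dense region.
  • xi=0.1: Minimum steepness on the reachability plot that marks a cluster boundary; it helps in detecting changes in cluster density.
  • min_cluster_size=0.1: Minimum size of a cluster, given here as a fraction of the dataset (10% of the points).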
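
A minimal sketch of fitting the model with these three parameters. The dataset is regenerated with illustrative centers and spreads (assumptions) so the snippet runs on its own.

```python
import numpy as np
from sklearn.cluster import OPTICS

# Illustrative dataset (assumption): six Gaussian groups with different spreads
rng = np.random.default_rng(42)
centers = np.array([[-5, -2], [4, -1], [1, -2], [-2, 3], [3, -2], [5, 6]], dtype=float)
spreads = [0.8, 0.1, 0.2, 0.3, 1.6, 2.0]
X_modified = np.vstack(
    [c + s * rng.standard_normal((250, 2)) for c, s in zip(centers, spreads)]
)

# Parameters as described in the text
clust = OPTICS(min_samples=40, xi=0.1, min_cluster_size=0.1)
clust.fit(X_modified)

# Fitted attributes used in the later steps
print(clust.labels_.shape)        # one label per point, -1 marks noise
print(clust.reachability_.shape)  # reachability distance per point
print(clust.ordering_[:10])       # first indices in the cluster ordering
```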

Output:

πŸ‘ Optics
OPTICS Model

Step 4: Extract Clusters Using DBSCAN Logic

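From the fitted OPTICS model we can extract DBSCAN-style cluster labels at different eps (distance) thresholds without re-running the clustering. scikit-learn provides cluster_optics_dbscan for this. Here we extract labels at two thresholds: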

  • eps=0.7 finds smaller or tighter groups.
  • eps=1.5 finds larger or broader groups.
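
The extraction can be sketched with scikit-learn's cluster_optics_dbscan, which reuses the reachability and core distances already computed by OPTICS. The dataset and the variable names labels_07 and labels_15 are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import OPTICS, cluster_optics_dbscan

# Illustrative dataset (assumption): six Gaussian groups with different spreads
rng = np.random.default_rng(42)
centers = np.array([[-5, -2], [4, -1], [1, -2], [-2, 3], [3, -2], [5, 6]], dtype=float)
spreads = [0.8, 0.1, 0.2, 0.3, 1.6, 2.0]
X_modified = np.vstack(
    [c + s * rng.standard_normal((250, 2)) for c, s in zip(centers, spreads)]
)

clust = OPTICS(min_samples=40, xi=0.1, min_cluster_size=0.1).fit(X_modified)

# DBSCAN-style labels at a tight threshold
labels_07 = cluster_optics_dbscan(
    reachability=clust.reachability_,
    core_distances=clust.core_distances_,
    ordering=clust.ordering_,
    eps=0.7,
)

# DBSCAN-style labels at a looser threshold
labels_15 = cluster_optics_dbscan(
    reachability=clust.reachability_,
    core_distances=clust.core_distances_,
    ordering=clust.ordering_,
    eps=1.5,
)

for name, lab in (("eps=0.7", labels_07), ("eps=1.5", labels_15)):
    print(name, "clusters:", len(set(lab) - {-1}), "noise:", int(np.sum(lab == -1)))
```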

Step 5: Prepare Values for Plotting

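We arrange the reachability distances and cluster labels in the order in which OPTICS visited the points. These values let us plot how reachable each point is from the points processed before it.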
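
A sketch of this preparation step, assuming the fitted model is called clust. The names space, reachability and labels, as well as the dataset, are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import OPTICS

# Illustrative dataset (assumption): six Gaussian groups with different spreads
rng = np.random.default_rng(42)
centers = np.array([[-5, -2], [4, -1], [1, -2], [-2, 3], [3, -2], [5, 6]], dtype=float)
spreads = [0.8, 0.1, 0.2, 0.3, 1.6, 2.0]
X_modified = np.vstack(
    [c + s * rng.standard_normal((250, 2)) for c, s in zip(centers, spreads)]
)

clust = OPTICS(min_samples=40, xi=0.1, min_cluster_size=0.1).fit(X_modified)

# x-axis positions: 0, 1, ..., n-1 in cluster order
space = np.arange(len(X_modified))

# Reachability distances and labels rearranged into the OPTICS ordering
reachability = clust.reachability_[clust.ordering_]
labels = clust.labels_[clust.ordering_]
```

The first point in the ordering has no predecessor, so its reachability is infinite and it is usually left off the plot.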

Step 6: Plotting the Results

Finally, all the results are visualized in four subplots, with the reachability plot on top and three scatter plots below it:

  • The top plot (reachability plot) visualizes density-based clustering, where valleys indicate clusters and peaks suggest noise or separations between clusters.
  • The bottom-left plot (OPTICS Clustering) shows the clusters detected automatically from density variations.
  • The bottom-middle plot (DBSCAN, eps=0.7) extracts smaller, tighter clusters.
  • The bottom-right plot (DBSCAN, eps=1.5) merges clusters into broader groups.
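
The four-panel figure could be assembled roughly as follows. This is a sketch under assumptions: the dataset, colors, figure size and the helper scatter_by_label are illustrative and not the article's original plotting code.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec
from sklearn.cluster import OPTICS, cluster_optics_dbscan

# Illustrative dataset (assumption): six Gaussian groups with different spreads
rng = np.random.default_rng(42)
centers = np.array([[-5, -2], [4, -1], [1, -2], [-2, 3], [3, -2], [5, 6]], dtype=float)
spreads = [0.8, 0.1, 0.2, 0.3, 1.6, 2.0]
X_modified = np.vstack(
    [c + s * rng.standard_normal((250, 2)) for c, s in zip(centers, spreads)]
)

clust = OPTICS(min_samples=40, xi=0.1, min_cluster_size=0.1).fit(X_modified)
dbscan_labels = {
    eps: cluster_optics_dbscan(
        reachability=clust.reachability_,
        core_distances=clust.core_distances_,
        ordering=clust.ordering_,
        eps=eps,
    )
    for eps in (0.7, 1.5)
}

space = np.arange(len(X_modified))
reachability = clust.reachability_[clust.ordering_]
ordered_labels = clust.labels_[clust.ordering_]


def scatter_by_label(ax, labels, title):
    """Hypothetical helper: draw each cluster in its own color, noise as '+'."""
    for k in np.unique(labels):
        pts = X_modified[labels == k]
        if k == -1:
            ax.plot(pts[:, 0], pts[:, 1], "k+", alpha=0.1)
        else:
            ax.plot(pts[:, 0], pts[:, 1], ".", alpha=0.3)
    ax.set_title(title)


fig = plt.figure(figsize=(10, 7))
grid = gridspec.GridSpec(2, 3)
ax_reach = fig.add_subplot(grid[0, :])   # top row: reachability plot
ax_optics = fig.add_subplot(grid[1, 0])  # bottom-left
ax_small = fig.add_subplot(grid[1, 1])   # bottom-middle
ax_large = fig.add_subplot(grid[1, 2])   # bottom-right

# Reachability plot: skip infinite values, color points by OPTICS label
finite = np.isfinite(reachability)
for k in np.unique(ordered_labels):
    mask = (ordered_labels == k) & finite
    if k == -1:
        ax_reach.plot(space[mask], reachability[mask], "k.", alpha=0.3)
    else:
        ax_reach.plot(space[mask], reachability[mask], ".", alpha=0.3)
ax_reach.axhline(0.7, color="k", linestyle="-.", alpha=0.5)  # DBSCAN cut at 0.7
ax_reach.axhline(1.5, color="k", linestyle="-", alpha=0.5)   # DBSCAN cut at 1.5
ax_reach.set_ylabel("Reachability (epsilon distance)")
ax_reach.set_title("Reachability Plot")

scatter_by_label(ax_optics, clust.labels_, "OPTICS Clustering")
scatter_by_label(ax_small, dbscan_labels[0.7], "DBSCAN, eps=0.7")
scatter_by_label(ax_large, dbscan_labels[1.5], "DBSCAN, eps=1.5")

fig.tight_layout()
```

Call plt.show() afterwards to display the figure interactively, or fig.savefig(...) to write it to a file.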

Output:

πŸ‘ optics-

This comparison highlights OPTICS's ability to detect clusters of varying densities, whereas DBSCAN requires an appropriate epsilon value to segment the data effectively. The visualization gives better insight into the data's structure and helps identify both clusters and sparse regions.
