![]() |
VOOZH | about |
OPTICS (Ordering Points To Identify the Clustering Structure) is a density-based clustering algorithm similar to DBSCAN clustering. Unlike DBSCAN which struggles with varying densities. OPTICS does not directly assign clusters but instead creates a reachability plot which visually represents clusters. The key concepts in OPTICS are:
A reachability plot is a graph that helps visualize clustering structures. It shows the reachability distance of each point in the dataset. It makes it ordered way based on how OPTICS processes them. Here clusters appear as valleys in the plot where lower reachability distances indicate dense regions while peaks represent sparse regions or noise.
,To better understand the concept refer to the image below:
It is more informative than DBSCAN as the reachability plot provides better understanding of clustering structure. Now we will learn about its working.
Below is the Python implementation using scikit-learn to demonstrate OPTICS on a synthetic dataset of varying densities:
Output:
| Feature | OPTICS | DBSCAN |
|---|---|---|
| Handles Varying Densities | Can detect clusters of different densities. | Struggles with varying densities as it requires a single epsilon value for all points. |
| Cluster Identification | Uses a reachability plot for cluster extraction and identifies hierarchical structures. | Directly assigns clusters without hierarchical structure. |
| Hierarchical Structure | Yes it can detect nested clusters. | No it does not support hierarchical clustering. |
| Runtime Complexity | Higher due to sorting and ordering of reachability distances. | Lower as it processes fewer calculations. |
| Memory Cost | Requires more memory as it maintains a priority queue (Min Heap) for reachability. | Lower memory usage. |
| Fewer Parameters | Less sensitive to the epsilon parameter and can work with a large max_eps, but still uses a neighborhood radius internally. | Requires careful tuning of epsilon and minPts parameters for effective clustering. |
| Noise Handling | Does not directly identify noise points but high reachability distances may indicate noise. | Directly identify core points, boundary points and noise points. |
| Cluster Extraction | Produces a reachability distance plot for flexible extraction at different granularities. | Assigns clusters directly based on density criteria without additional plots. |
OPTICS is widely used for clustering algorithm that works well for identifying clusters of varying densities. It provides flexibility through reachability plots which allows dynamic cluster extraction. While computationally more expensive it is useful for complex datasets where density variation is significant.