![]() |
VOOZH | about |
Two sets are called disjoint sets if they don't have any element in common. The disjoint set data structure is used to store such sets. It supports following operations:
Consider a situation with a number of persons and the following tasks to be performed on them:
Examples:
We are given 10 individuals say, a, b, c, d, e, f, g, h, i, j
Following are relationships to be added:
a <-> b
b <-> d
c <-> f
c <-> i
j <-> e
g <-> jGiven queries like whether a is a friend of d or not. We basically need to create following 4 groups and maintain a quickly accessible connection among group items:
G1 = {a, b, d}
G2 = {c, f, i}
G3 = {e, g, j}
G4 = {h}
Partitioning the individuals into different sets according to the groups in which they fall. This method is known as a Disjoint set Union which maintains a collection of Disjoint sets and each set is represented by one of its members.
To answer the above question two key points to be considered are:
Array: An array of integers is called Parent[]. If we are dealing with N items, i'th element of the array represents the i'th item. More precisely, the i'th element of the Parent[] array is the parent of the i'th item. These relationships create one or more virtual trees.
Tree: It is a Disjoint set. If two elements are in the same tree, then they are in the same Disjoint set. The root node (or the topmost node) of each tree is called the representative of the set. There is always a single unique representative of each set. A simple rule to identify a representative is if 'i' is the representative of a set, then Parent[i] = i. If i is not the representative of his set, then it can be found by traveling up the tree until we find the representative.
The task is to find representative of the set of a given element. The representative is always root of the tree. So we implement find() by recursively traversing the parent array until we hit a node that is root (parent of itself).
The task is to combine two sets and make one. It takes two elements as input and finds the representatives of their sets using the Find operation, and finally puts either one of the trees (representing the set) under the root node of the other tree.
Are 1 and 2 in the same set? Yes
The above union() and find() are naive and the worst case time complexity is linear. The trees created to represent subsets can be skewed and can become like a linked list. Following is an example of worst case scenario.
The above operations can be optimized using Union by Rank and Path Compression.