
URL: https://www.geeksforgeeks.org/dsa/job-sequencing-problem-using-disjoint-set/




Job Sequencing Problem using Disjoint Set

Last Updated : 23 Jul, 2025

Given three arrays id[], deadline[], profit[], where each job i is associated with id[i], deadline[i], and profit[i]. Each job takes 1 unit of time to complete, and only one job can be scheduled at a time. You will earn the profit associated with a job only if it is completed by its deadline. The task is to find the maximum profit that can be gained by completing the jobs and the count of jobs completed to earn the maximum profit.

Examples:

Input: id[] = [1, 2, 3, 4]
deadline[] = [4, 1, 1, 1]
profit[] = [20, 10, 40, 30]
Output: 2 60
Explanation: All jobs other than the first job have a deadline of 1, thus only one of these and the first job can be completed, with the total profit gain of 20 + 40 = 60.

Input: id[] = [1, 2, 3, 4, 5]
deadline[] = [2, 1, 2, 1, 1]
profit[] = [100, 19, 27, 25, 15]
Output: 2 127
Explanation: The first and third jobs have a deadline of 2, so both of them can be completed. The other jobs have a deadline of 1, so at most one of them could be completed. The two jobs with a deadline of 2 also have the highest profits, so these two are completed, for a total profit of 100 + 27 = 127.

A greedy solution of time complexity O(n Log n) is already discussed. Below is the simple Greedy Algorithm.

  1. Sort all jobs in decreasing order of profit.
  2. Initialize the result sequence with the first job in the sorted order.
  3. Do the following for the remaining n-1 jobs:
    • If the current job can fit in the current result sequence without missing its deadline, add it to the result. Otherwise, ignore it.

The costly operation in the greedy solution is assigning a free slot to a job. For each job, it scans the time slots one by one and assigns the latest free slot that ends no later than the job's deadline.
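The slot-scanning greedy described above can be sketched as follows. This is a minimal illustration, not the article's original listing; the function name job_sequencing_greedy and the occupied array are assumptions made for this example.

```python
def job_sequencing_greedy(deadline, profit):
    """Return (jobs_done, total_profit) using the slot-scanning greedy."""
    n = len(deadline)
    # Consider jobs from most to least profitable.
    order = sorted(range(n), key=lambda i: profit[i], reverse=True)
    max_deadline = max(deadline, default=0)
    # occupied[t] is True when the slot [t-1, t] is taken; index 0 is unused.
    occupied = [False] * (max_deadline + 1)
    jobs_done = 0
    total_profit = 0
    for i in order:
        # Scan backwards from the deadline for the latest free slot.
        # This inner loop is the costly step the disjoint set replaces.
        for t in range(deadline[i], 0, -1):
            if not occupied[t]:
                occupied[t] = True
                jobs_done += 1
                total_profit += profit[i]
                break
    return jobs_done, total_profit


print(job_sequencing_greedy([4, 1, 1, 1], [20, 10, 40, 30]))          # (2, 60)
print(job_sequencing_greedy([2, 1, 2, 1, 1], [100, 19, 27, 25, 15]))  # (2, 127)
```

In the worst case the inner scan visits every slot up to the deadline for every job, which is what motivates a faster way to locate the latest free slot.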

What does greatest time slot mean?
Suppose that a job J1 has a deadline of t = 5. We assign it the greatest free time slot that ends by the deadline, i.e. 4-5. If another job J2 with a deadline of 5 then comes in, it is allotted the slot 3-4, since 4-5 has already been given to J1.
Why assign the greatest free time slot to a job?
If we assign a slot earlier than the latest available one, some other job with an earlier deadline might miss its deadline.

Example: 
J1 with deadline d1 = 5, profit 40 
J2 with deadline d2 = 1, profit 20 
Suppose that for job J1 we assigned time slot of 0-1. Now job J2 cannot be performed since we will perform Job J1 during that time slot.
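The J1/J2 example can be checked with a small experiment that compares taking the latest free slot against taking the earliest one. The helper schedule and its pick_latest flag are hypothetical names introduced only for this sketch.

```python
def schedule(jobs, pick_latest):
    """jobs: list of (name, deadline, profit), already in processing order.

    Returns the total profit when each job takes either the latest or the
    earliest free slot that still meets its deadline.
    """
    max_deadline = max(d for _, d, _ in jobs)
    occupied = [False] * (max_deadline + 1)  # slot t is [t-1, t]; index 0 unused
    total = 0
    for _, d, p in jobs:
        candidates = range(d, 0, -1) if pick_latest else range(1, d + 1)
        for t in candidates:
            if not occupied[t]:
                occupied[t] = True
                total += p
                break
    return total


jobs = [("J1", 5, 40), ("J2", 1, 20)]
print(schedule(jobs, pick_latest=True))   # 60: J1 takes 4-5, J2 takes 0-1
print(schedule(jobs, pick_latest=False))  # 40: J1 takes 0-1, J2 is dropped
```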

Using Disjoint Set for Job Sequencing
All time slots are individual sets initially. We first find the maximum deadline of all jobs. Let the max deadline be m. We create m+1 individual sets, numbered 0 to m. If a job is assigned a time slot t, where t >= 1, then the job is scheduled during [t-1, t]. So a set with value X represents the time slot [X-1, X]; set 0 does not correspond to a real slot.
We need to keep track of the greatest available time slot that can be allotted to a job with a given deadline. We use the parent array of the Disjoint Set data structure for this purpose. The root of the tree is always the latest available slot. If for a deadline d, there is no slot available, then the root would be 0.

Below are the detailed steps.

  • The idea is to use Disjoint Sets and create an individual set for each available time slot.
  • First find the maximum deadline of all the jobs, let's call it d. Now create a disjoint set with d + 1 nodes, where each set is independent of the others.
  • Sort the jobs in descending order of profit.
  • Start with the first job, and for each job find the available slot which is closest to its deadline. Occupy that slot and merge it with slot-1 by making slot-1 the parent of slot. If the slot value is 0, it means no slot is available, so move to the next job.
  • At last, count the jobs with allocated slots and find the sum of their profits.
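The steps above can be sketched in Python as follows. This is one possible reading of the steps rather than the article's original listing; the names find and job_sequencing are assumptions, and the id[] array is omitted because only the count and the profit are reported.

```python
def find(parent, x):
    """Return the root of x, compressing the path on the way."""
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every node on the path directly at the root.
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def job_sequencing(deadline, profit):
    """Return (jobs_done, total_profit) using a disjoint set over time slots."""
    n = len(deadline)
    max_deadline = max(deadline, default=0)
    # One set per slot 0..max_deadline; slot 0 means "no slot available".
    parent = list(range(max_deadline + 1))
    # Process jobs in descending order of profit.
    order = sorted(range(n), key=lambda i: profit[i], reverse=True)
    jobs_done = 0
    total_profit = 0
    for i in order:
        slot = find(parent, deadline[i])  # latest free slot <= deadline
        if slot == 0:
            continue                      # nothing free, skip this job
        parent[slot] = slot - 1           # occupy slot, merge with slot-1
        jobs_done += 1
        total_profit += profit[i]
    return jobs_done, total_profit


print(*job_sequencing([4, 1, 1, 1], [20, 10, 40, 30]))          # 2 60
print(*job_sequencing([2, 1, 2, 1, 1], [100, 19, 27, 25, 15]))  # 2 127
```

The sketch merges by direct parent assignment instead of union by rank, because the root has to stay the smallest (latest free) slot of the set for find to return the right answer.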

How come find() of disjoint set returns the latest available time slot?
Initially, all time slots are individual slots. So the time slot returned is always maximum. When we assign a time slot ‘t’ to a job, we do union of ‘t’ with ‘t-1’ in a way that ‘t-1’ becomes the parent of ‘t’. To do this we call union(t-1, t). This means that all future queries for time slot t would now return the latest time slot available for set represented by t-1.
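A short trace may make this concrete. The sketch below assumes six sets (slots 0 to 5) and uses a hypothetical helper occupy(t) in place of union(t-1, t); both names are illustrative only.

```python
parent = list(range(6))  # slots 0..5; 0 is the "no slot" sentinel


def find(x):
    """Return the root of x, shortening the path as it goes."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def occupy(t):
    """Mark slot t as used by making t-1 its parent."""
    parent[t] = t - 1


print(find(5))  # 5: nothing assigned yet, so the deadline slot itself is free
occupy(5)
print(find(5))  # 4: slot 5 is taken, the query falls through to 4
occupy(4)
print(find(5))  # 3
print(find(4))  # 3: deadlines 4 and 5 now share the same latest free slot
print(find(2))  # 2: untouched slots still return themselves
```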


Output for the second example above:
2 127

Time Complexity: O(n * log(n)) for sorting the jobs by profit, plus O(n * log(d)) for the disjoint set operations, where d is the maximum deadline of all the jobs.
Auxiliary Space: O(d) 
