VOOZH about

URL: https://www.geeksforgeeks.org/dsa/edit-distance-dp-5/

⇱ Edit Distance - GeeksforGeeks


  • Courses
  • Tutorials
  • Interview Prep

Edit Distance

Last Updated : 4 May, 2026

Given two strings s1 and s2 and below operations that can be performed on s1. The task is to find the minimum number of edits (operations) to convert 's1' into 's2'.  

  • Insert: Insert any character before or after any index of s1
  • Remove: Remove a character of s1
  • Replace: Replace a character at any index of s1 with some other character.

Note: All of the above operations are of equal cost. 

Examples: 

Input: s1 = "geek", s2 = "gesek"
Output: 1
Explanation: We can convert s1 into s2 by inserting an 's' between two consecutive 'e' in s1.

Input: s1 = "gfg", s2 = "gfg"
Output: 0
Explanation: Both strings are same.

Input: s1 = "GEEXSFRGEEKKS" and s2 = "GEEKSFORGEEKS"
Output: 3
Explanation: Operation 1: Replace 'X' to 'K'
Operation 2: Insert 'O' between 'F' and 'R'
Operation 3: Remove second last character i.e. 'K'
Refer the below image for better understanding.

👁 example

[Naive Approach] Using Recursion-O(3^(m+n)) time and space O(m*n)

The idea is to process all characters one by one starting from either from left or right sides of both strings. Let us process from the right end of the strings, there are two possibilities for every pair of characters being traversed, either they match or they don't match.

  • If last characters of both string matches then we simply recursively calculate the answer for rest of part of the strings.
  • When last characters do not match, we perform all three operations to match the last characters, i.e. insert, replace, and remove. And then recursively calculate the result for the remaining part of the string. Upon completion of these operations, we select the minimum answer and add 1 to it.

Below is the recursive tree for this problem considering the case when the last characters never match.

👁 edit_distance_m_n_

When the last characters of strings matches. Make a recursive call editDistance(m-1, n-1) to calculate the answer for remaining part of the strings.

When the last characters of strings don't matches. Make three recursive calls as show below:

  • Insert s2[n-1] at last of s1 : editDistance(m, n-1)
  • Replace s1[m-1] with s2[n-1] : editDistance(m-1, n-1)
  • Remove s1[m-1] : editDistance(m-1, n)

Recurrence Relations

Case 1: When the last character of both the strings are same. editDistance(s1, s2, m, n) = editDistance(s1, s2, m-1, n-1)

Case 2: When the last characters are different
editDistance(s1, s2, m, n) = 1 + Minimum{ editDistance(s1, s2 ,m,n-1), editDistance(s1, s2 ,m-1, n), editDistance(s1, s2 , m-1, n-1)}

Base Case

  • Case 1: When s1 becomes empty i.e. m=0, return n, as it require n insertions to convert an empty string to s2 of size n
  • Case 2: When s2 becomes empty i.e. n=0, return m, as it require m removals to convert s1 of size m to an empty string.

Output
3


[Better Approach 1] Using Top-Down DP (Memoization) - O(m*n) time and O(m*n) space

In the above recursive approach, there are several overlapping subproblems:
editDist(m-1, n-1) is called Three times
editDist(m-1, n-2) is called Two times
editDist(m-2, n-1) is called Two times. And so on...

So, we can use Memoization to store the result of each subproblems to avoid recalculating the result again and again.

Below is the illustration of overlapping subproblems during the recursion.

👁 insert

Output
3

[Better Approach 2] Using Bottom-Up DP (Tabulation)-O(m*n) time and O(m*n) space

Use a tableto store solutions of subproblems to avoiding recalculate the same subproblems multiple times. By doing this, if same subproblems repeated during, we retrieve the solutions from the table itself.

Below are the steps to convert the recursive approach to Bottom up approach:

1. Choosing Dimensions of Table: The state of smaller sub-problems depends on the input parameters m and n because at least one of them will decrease after each recursive call. So we need to construct a 2D table dp[][] to store the solution of the sub-problems.

2. Choosing Correct size of Table: The range of parameters goes from 0 to m and 0 to n. So we choose dimensions as (m + 1)*(n + 1)

3. Filling the table: It consist of two stages, table initialisation and building the solution from the smaller subproblems:

  • Table initialisation: Before building the solution, we need to initialise the table with the known values i.e. base case. Here m = 0 and n = 0 is the situation of the base case, so we initialise first-column dp[i][0] with i and first-row dp[0][j] with j.
  • Building the solution of larger problems from the smaller subproblems: We can easily define the iterative structure by using the recursive structure of the above recursive solution.
  • if (s1[i - 1] == s2[j - 1]) dp[i][j] = dp[i - 1][j - 1];
  • if (s1[i - 1] != s2[j - 1]) dp[i][j] = 1 + min(dp[i][j - 1], dp[i - 1][j], dp[i - 1][j - 1]);

4. Returning final solution: After filling the table iteratively, our final solution gets stored at the bottom right corner of the 2-D table i.e. we return dp[m][n] as an output.

Illustration:



Output
3

[Expected Approach 1] Using Space Optimized DP-O(m*n) time and space O(n)

To fill a row in DP array we require only one row i.e. the upper row. For example, if we are filling the row where i=10 in DP array then we require only values of 9th row. So we create two one dimensional arrays, prev[] and curr[]. The array prev[] stores values of row i-1 and the array curr[] stores values of the current row i. This approach reduces the space complexity from O(m*n) to O(n)


Output
3

[Expected Approach 2] Using Space Optimized DP – O(m*n) Time and O(n) Space

In the previous approach The curr[] array is updated using 3 values only :

Value 1: curr[j] = prev[j-1] when s1[i-1] is equal to s2[j-1]
Value 2: curr[j] = prev[j] when s1[i-1] is not equal to s2[j-1]
Value 3: curr[j] = curr[j-1] when s1[i-1] is not equal to s2[j-1]

By keeping the track of these three values we can achiever our task using only a single 1-D array


Output
3

Real-World Applications of Edit Distance

  • Spell Checking and Auto-Correction
  • DNA Sequence Alignment
  • Plagiarism Detection
  • Natural Language Processing
  • Version Control Systems
  • String Matching

Edit Distance and Longest Common Subsequence

If we do not consider the replace operation, then edit distance problem is same as the Longest Common Subsequence (LCS) problem. With only insert and delete operations allowed, the edit distance between two strings is ( M + N - 2* LCS)

Comment