Given a text t and a pattern p, where t consists only of lowercase English letters and p consists of lowercase English letters plus the special characters '.' and '*', the task is to implement a function that tests whether p matches the entire text t, such that:

- '.' matches any single character.
- '*' matches zero or more of the preceding character.

Note: For each appearance of the character '*', there will be a previous valid character to match.
Examples:
Input: t = "aaa", p = "a"
Output: false
Explanation: "a" does not match the entire string "aaa".

Input: t = "abb", p = "a.*"
Output: true
Explanation: "a" matches the first character, and ".*" means zero or more of any character, so it matches the remaining "bb".

Input: t = "", p = "a*b*"
Output: true
Explanation: Note that '*' can also match 0 occurrences, so both "a*" and "b*" match the empty string.
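The three examples can be cross-checked against an existing regex engine. The sketch below assumes Python's standard `re` module, whose '.' and '*' behave the same way for lowercase text; `reference_match` is a hypothetical helper name, and this is only a sanity check, not the solution the task asks for.

```python
import re


def reference_match(t: str, p: str) -> bool:
    # re.fullmatch succeeds only when the pattern covers the whole text,
    # which is the "match the entire string" requirement of the problem.
    return re.fullmatch(p, t) is not None


examples = [("aaa", "a"), ("abb", "a.*"), ("", "a*b*")]
results = [reference_match(t, p) for t, p in examples]
print(results)  # [False, True, True]
```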
We can match the text and the pattern character by character, starting from one end. Here we start from the right end. The following cases arise:
Case 1 (last characters are the same): We move one character to the left in both the text t and the pattern p.
Case 2 (last character of the pattern is '.'): We move one character to the left in both the text t and the pattern p.
Case 3 (last character of the pattern is '*'): There must be at least two characters in the pattern. If not, we return false. If yes, the following two sub-cases arise.
a) The '*' and the character preceding it match 0 characters of the text. We move two characters to the left in the pattern and do not move in the text.
b) The '*' and the character preceding it match 1 or more characters of the text. We compare the character preceding '*' with the current character of the text. If they match (or the preceding character is '.'), we move one character to the left in the text and do not move in the pattern.
We return false if neither a) nor b) leads to a match.
If none of the three cases applies, the last characters differ and we return false. Once the pattern is exhausted, the result is true only if the text is exhausted too.
Below is an implementation of the idea.
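The cases above can be sketched in Python as follows. This is a minimal sketch rather than the article's original listing: the names `is_match_rec` and `is_match` are assumptions, and n and m denote how many characters of t and p are still unmatched.

```python
def is_match_rec(t: str, p: str, n: int, m: int) -> bool:
    """Return True if the first n characters of t match the first m of p."""
    # Pattern exhausted: it matches only if the text is exhausted too.
    if m == 0:
        return n == 0

    if p[m - 1] == '*':
        # Case 3: '*' needs a preceding character in the pattern.
        if m < 2:
            return False
        # a) '*' and its preceding character match 0 characters of the text.
        if is_match_rec(t, p, n, m - 2):
            return True
        # b) They match 1 or more characters: consume one text character
        #    and keep the pattern position unchanged.
        return (n > 0
                and p[m - 2] in (t[n - 1], '.')
                and is_match_rec(t, p, n - 1, m))

    # Cases 1 and 2: same last character, or '.' in the pattern.
    if n > 0 and p[m - 1] in (t[n - 1], '.'):
        return is_match_rec(t, p, n - 1, m - 1)

    return False


def is_match(t: str, p: str) -> bool:
    return is_match_rec(t, p, len(t), len(p))


print(is_match("aaa", "a"))    # False
print(is_match("abb", "a.*"))  # True
print(is_match("", "a*b*"))    # True
```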
The above recursive solution has exponential time complexity in the worst case, because the '*' case makes two recursive calls. The subproblems clearly overlap: writing n and m for the remaining lengths of the text and the pattern, we make calls for (n-1, m-1), (n, m-2) and/or (n-1, m), so the same pair (n, m) can be reached along many different paths. So we can use Dynamic Programming to solve this problem.
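One way to remove the repeated work is to cache the result of each (n, m) pair. The sketch below assumes Python's `functools.lru_cache` for the cache; `is_match_memo` and `solve` are hypothetical names, and the recursion is the same right-to-left scan described above.

```python
from functools import lru_cache


def is_match_memo(t: str, p: str) -> bool:
    @lru_cache(maxsize=None)
    def solve(n: int, m: int) -> bool:
        # Does t[:n] match p[:m]? Each (n, m) pair is computed only once.
        if m == 0:
            return n == 0
        if p[m - 1] == '*':
            if m < 2:
                return False
            # Zero occurrences, or consume one matching text character.
            return solve(n, m - 2) or (
                n > 0 and p[m - 2] in (t[n - 1], '.') and solve(n - 1, m)
            )
        return n > 0 and p[m - 1] in (t[n - 1], '.') and solve(n - 1, m - 1)

    return solve(len(t), len(p))


# An input on which the plain recursion explores many overlapping branches:
# every "a*" can absorb any number of 'a's before the leading 'b' fails.
print(is_match_memo("a" * 25, "b" + "a*" * 12))  # False
```

With the cache there are at most (n + 1) × (m + 1) distinct calls, which is the same bound as the table-based approach.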
- Create a dp array of size (n + 1) * (m + 1). Please note that the range of values in the recursion goes from 0 to the text length (n) and from 0 to the pattern length (m).
- Initialize dp[0][0] = true, since an empty pattern matches an empty text.
Illustration
Let’s take an example t = "aab" and p = "c*a*b" and create a DP table.
| | | | c | * | a | * | b |
|---|---|---|---|---|---|---|---|
| | | 0 | 1 | 2 | 3 | 4 | 5 |
| | 0 | TRUE | FALSE | TRUE | FALSE | TRUE | FALSE |
| a | 1 | FALSE | FALSE | FALSE | TRUE | TRUE | FALSE |
| a | 2 | FALSE | FALSE | FALSE | FALSE | TRUE | FALSE |
| b | 3 | FALSE | FALSE | FALSE | FALSE | FALSE | TRUE |
- First column (p is empty): an empty pattern matches t only if t is also empty, which we have stored in dp[0][0]. Thus, the remaining values dp[i][0] are false.
- First row (t is empty): p matches an empty t only if p is itself empty or represents an empty string, such as "a*", "x*y*", "l*m*n*" and so on. In the above example, if t = "" and p = "c*", then because of '*', c can occur 0 times, which gives an empty string. Hence, dp[0][2] = true. In general, dp[0][j] = dp[0][j - 2] when p[j - 1] == '*'.
- If t[i - 1] == p[j - 1], the (i - 1)th and (j - 1)th characters are the same, so we only have to check whether the remaining prefixes match. If they match, the current prefixes match too, otherwise they do not, i.e., dp[i][j] = dp[i - 1][j - 1]. We use the (i - 1)th and (j - 1)th characters because row 0 and column 0 of the table are reserved for the empty strings, so table indices start from 1 while string indices start from 0.
- If p[j - 1] == '.', any single character can be matched. Therefore, here too, we only have to check whether the remaining prefixes match. Thus, dp[i][j] = dp[i - 1][j - 1].
- If p[j - 1] == '*', there are two options. Either the '*' and its preceding character represent an empty string (0 characters), which gives dp[i][j - 2]; or, if t[i - 1] == p[j - 2] || p[j - 2] == '.', the current character of the text matches the character preceding '*' in the pattern, which gives dp[i - 1][j]. dp[i][j] is true if either option is true.

Time Complexity: O(m×n)
Auxiliary Space: O(m×n)
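The table-filling rules can be sketched in Python as follows. This is a minimal sketch under the indexing described above (dp[i][j] covers the first i characters of t and the first j characters of p); `build_dp_table` and `is_match_dp` are hypothetical names, not the article's original listing.

```python
from typing import List


def build_dp_table(t: str, p: str) -> List[List[bool]]:
    n, m = len(t), len(p)
    dp = [[False] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = True  # empty pattern matches empty text

    # First row: patterns such as "a*" or "x*y*" can match the empty text.
    for j in range(2, m + 1):
        if p[j - 1] == '*':
            dp[0][j] = dp[0][j - 2]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if p[j - 1] == '*':
                if j >= 2:
                    # Zero occurrences, or one more occurrence of p[j - 2].
                    dp[i][j] = dp[i][j - 2] or (
                        p[j - 2] in (t[i - 1], '.') and dp[i - 1][j]
                    )
            elif p[j - 1] in (t[i - 1], '.'):
                # Same character or '.': depends on the remaining prefixes.
                dp[i][j] = dp[i - 1][j - 1]
    return dp


def is_match_dp(t: str, p: str) -> bool:
    return build_dp_table(t, p)[len(t)][len(p)]


table = build_dp_table("aab", "c*a*b")
for row in table:
    print(" ".join("T" if cell else "F" for cell in row))
# T F T F T F
# F F F T T F
# F F F F T F
# F F F F F T
```

The printed rows agree with the illustration for t = "aab" and p = "c*a*b", and the bottom-right cell is the final answer.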