URL: https://www.geeksforgeeks.org/dsa/prefix-function-and-kmp-algorithm-for-competitive-programming/

Prefix Function and KMP Algorithm for Competitive Programming

Last Updated : 23 Jul, 2025

The prefix function of a string is an array that stores, for each prefix of the string, the length of the longest proper prefix that is also a suffix of that prefix. The Knuth-Morris-Pratt (KMP) algorithm uses the prefix function to find all occurrences of a pattern within a text in linear time.

The prefix function is therefore an integral part of the Knuth-Morris-Pratt algorithm.

The prefix function for a string s of length n is an array pi, where pi[i] is the length of the longest proper prefix of the substring s[0...i] which is also a suffix of this substring. A proper prefix of a string is a prefix that is not equal to the string itself. It follows that pi[0] = 0, because a string of length 1 has no non-empty proper prefix.

Examples:

Example 1: s = "xyzxyzx"

Prefix function for s: pi = [0, 0, 0, 1, 2, 3, 4]

Example 2: s = "xxyxxxy"

Prefix function for s: pi = [0, 1, 0, 1, 2, 2, 3]

Find Prefix Function (O(N^3) approach):

The idea is to do the following: For all i from 0 to N-1, try all possible lengths for the prefix/suffix, with each comparison taking O(N) time.

Below is the implementation of above approach:
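A minimal Python sketch of the brute-force idea follows. The function name `prefix_function_cubic` and the sample string are assumptions made for illustration, not taken from the article.

```python
def prefix_function_cubic(s: str) -> list[int]:
    """Brute-force prefix function: O(N^3) in the worst case."""
    n = len(s)
    pi = [0] * n
    for i in range(n):
        # Try every proper prefix length k of s[0..i], longest first.
        for k in range(i, 0, -1):
            # Each slice comparison costs O(k):
            # prefix s[0..k-1] against suffix s[i-k+1..i].
            if s[:k] == s[i - k + 1 : i + 1]:
                pi[i] = k
                break
    return pi


# Sample input (an assumption chosen for illustration).
print(*prefix_function_cubic("abcabcabc"))
```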


Output
0 0 0 1 2 3 4 5 6 

Find Prefix Function in O(N^2) time:

It can be observed that pi[i+1] <= pi[i] + 1. This can be proved by contradiction. If pi[i+1] > pi[i] + 1, then we can take the suffix of length pi[i+1] ending at index i+1 and remove its last character. This gives a suffix of length pi[i+1] - 1 ending at position i that is also a prefix, and it is longer than pi[i], which is a contradiction. Since the value of the prefix function can increase by at most 1 per step, it can grow at most N times in total, and therefore it can decrease at most N times too. So the total number of string comparisons is at most 2*N, and since each comparison takes O(N) time, the complexity is O(N^2).

Below is the implementation of above approach:
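A minimal Python sketch of this bounded search follows. The function name `prefix_function_quadratic` is an illustrative assumption, and the sample string is the one from Example 1.

```python
def prefix_function_quadratic(s: str) -> list[int]:
    """Prefix function using the bound pi[i] <= pi[i-1] + 1: O(N^2)."""
    n = len(s)
    pi = [0] * n
    for i in range(1, n):
        # Start from the largest length the bound allows and decrease.
        for k in range(pi[i - 1] + 1, 0, -1):
            # One O(k) string comparison per candidate length.
            if s[:k] == s[i - k + 1 : i + 1]:
                pi[i] = k
                break
    return pi


print("Prefix Function:", *prefix_function_quadratic("xyzxyzx"))
```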


Output
Prefix Function: 0 0 0 1 2 3 4

Find Prefix Function (Linear approach):

We can optimize the above approach to O(N). To compute pi[i+1]: if s[i+1] = s[pi[i]], then pi[i+1] = pi[i] + 1. Otherwise, we need to find the largest length j such that j < pi[i] and s[0...j-1] = s[i-j+1...i]. Then we compare the characters at indices j and i+1: if they are equal then pi[i+1] = j + 1, otherwise we find the next shorter j and repeat the procedure. If j becomes 0, then pi[i+1] is 1 if s[i+1] = s[0], else 0.

To compute j efficiently, suppose the current candidate length is j, so that s[0...j-1] = s[i-j+1...i]. We need the largest length k < j for which s[0...k-1] = s[i-k+1...i] also holds; this value is nothing but pi[j-1].

Below is the algorithm for the above approach:

  • Create an array pi, which denotes the prefix function for string s. Initialize pi[0] with 0.
  • Run a loop from i = 1 to n-1:
    • Set j = pi[i-1]. Here j is the largest length for which s[0...j-1] = s[i-j...i-1] holds.
    • Compare s[j] and s[i]. If they are equal, pi[i] becomes j+1; otherwise j becomes pi[j-1].
    • Repeat the same procedure while j is greater than 0. If j becomes 0, s[0] and s[i] are compared: if they are equal pi[i] will be 1, else 0.
  • Return pi.

Below is the implementation of above approach:
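A minimal Python sketch of the steps above follows. The function name `prefix_function` is an illustrative assumption.

```python
def prefix_function(s: str) -> list[int]:
    """Linear-time prefix function, following the steps listed above."""
    n = len(s)
    pi = [0] * n
    for i in range(1, n):
        j = pi[i - 1]
        # Fall back to shorter borders until s[i] extends one or j hits 0.
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


s = "abababab"
print(f'Prefix Function values for string "{s}":', *prefix_function(s))
```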


Output
Prefix Function values for string "abababab": 0 0 1 2 3 4 5 6 

1. Pattern Searching - The Knuth-Morris-Pratt algorithm:

Problem: Given a pattern p and a text t, the task is to find all occurrences of pattern p in text t.

The idea is to choose a character ($ or #) which is not present in either of the strings p and t, and which acts as a separator. Compute the prefix function for the string p + # + t. Let n be the length of p. Notice that the value of the prefix function at any index will not exceed n because of the separator. Now if pi[i] = n at any position i, a match is found, i.e., an occurrence of pattern p ends at position i of the combined string; it starts at index i - 2n of text t. Notice that pi[i] = n can't be true for the first n+1 positions of the prefix function, since they belong to pattern p and the separator #. Hence we take all positions where pi[i] = n.

Below is the implementation of above approach:
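A minimal Python sketch of this search follows. The names `prefix_function` and `kmp_search`, the `sep` parameter and the sample strings are assumptions made for illustration. The sketch also assumes the separator does not occur in either string and raises an error otherwise.

```python
def prefix_function(s: str) -> list[int]:
    """Linear-time prefix function."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


def kmp_search(pattern: str, text: str, sep: str = "#") -> list[int]:
    """Return the 0-based start indices of pattern in text."""
    n = len(pattern)
    if n == 0:
        return []
    # Assumption: the separator occurs in neither string.
    if sep in pattern or sep in text:
        raise ValueError("separator must not occur in pattern or text")
    pi = prefix_function(pattern + sep + text)
    # pi[i] == n means a match ends at combined index i,
    # which corresponds to start index i - 2n in text.
    return [i - 2 * n for i in range(n + 1, len(pi)) if pi[i] == n]


print(kmp_search("aba", "abacababa"))  # → [0, 4, 6]
```

Overlapping occurrences (here at indices 4 and 6) are reported as well, since every position with pi[i] = n is taken.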

2. Count the number of occurrences of each prefix in same string:

Problem: Given a string s, find the number of occurrences of each prefix of string s in the same string i.e., string s.

The idea is to calculate the prefix function for the string s. Now pi[i] denotes the length of the longest prefix that occurs in s and ends at position i. The next shorter prefix that also ends at position i has length pi[j-1], where j = pi[i], and so on. To compute the final answer array ans efficiently, we first count the number of times each value appears in the prefix function. Then we iterate i from n-1 down to 1 and, for each i, increment ans[pi[i-1]] by ans[i]. Finally, each prefix also occurs once as the prefix itself, which the prefix function never counts because it only considers proper prefixes, so we add 1 to every entry of the answer array.

Below is the implementation of above approach:
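A minimal Python sketch of this counting scheme follows. The names `prefix_function` and `count_prefix_occurrences` and the sample string are assumptions made for illustration. In this sketch the occurrence of each prefix at the start of the string is added after the propagation step, so that it is not propagated twice.

```python
def prefix_function(s: str) -> list[int]:
    """Linear-time prefix function."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


def count_prefix_occurrences(s: str) -> list[int]:
    """result[k-1] = number of occurrences in s of its prefix of length k."""
    n = len(s)
    pi = prefix_function(s)
    ans = [0] * (n + 1)
    # Count how often each value appears in the prefix function.
    for value in pi:
        ans[value] += 1
    # A prefix of length i ending somewhere also ends a prefix of length pi[i-1].
    for i in range(n - 1, 0, -1):
        ans[pi[i - 1]] += ans[i]
    # Add the occurrence of each prefix at the start of s; skip length 0.
    return [count + 1 for count in ans[1:]]


print(count_prefix_occurrences("abacaba"))  # → [4, 2, 2, 1, 1, 1, 1]
```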

3. Count the number of occurrences of each prefix in some other string:

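Problem: Given a string s, find the number of occurrences of each prefix of string s in the string t.

The idea is to compute the prefix function of the string s + # + t. Notice that we only care about the values pi[i] with i > n, since we need the number of occurrences of the prefixes of s in t, and t begins at index n+1, where n is the length of string s. We count how many times each value occurs among these positions and then propagate the counts from longer prefixes to shorter ones as in the same-string case, iterating i from n down to 1 and incrementing ans[pi[i-1]] by ans[i]. Unlike the same-string case, the answer array starts at 0, because the copy of each prefix inside s itself is not an occurrence in t.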

Below is the implementation of above approach:
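A minimal Python sketch of this variant follows. The names `prefix_function` and `count_prefix_occurrences_in` and the `sep` parameter are assumptions made for illustration, and the separator is assumed not to occur in s or t.

```python
def prefix_function(s: str) -> list[int]:
    """Linear-time prefix function."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


def count_prefix_occurrences_in(s: str, t: str, sep: str = "#") -> list[int]:
    """result[k-1] = number of occurrences in t of the length-k prefix of s."""
    n = len(s)
    # Assumption: sep occurs in neither s nor t, so every pi value is <= n.
    pi = prefix_function(s + sep + t)
    ans = [0] * (n + 1)
    # Only positions inside t (combined index > n) are counted.
    for i in range(n + 1, len(pi)):
        ans[pi[i]] += 1
    # Propagate from longer prefixes to shorter ones; pi[i-1] lies inside s.
    for i in range(n, 0, -1):
        ans[pi[i - 1]] += ans[i]
    return ans[1:]  # drop the entry for the empty prefix


print(count_prefix_occurrences_in("aba", "abacaba"))  # → [4, 2, 2]
```

For s = "aba" and t = "abacaba", the prefixes "a", "ab" and "aba" occur 4, 2 and 2 times in t respectively.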


Output
Occurrences of prefixes of "aba" in "abacaba": 4 2 2

4. Count the number of occurrences of each prefix in same string that also occurs as the suffix:

Problem: Given a string s, find the number of occurrences of each prefix of string s in the string s which also occurs as a suffix of string s.

The idea is to compute the prefix function of string s. To find the number of occurrences of each prefix in string s we can repeat the same procedure as done above. The lengths of the prefixes which also appear as suffixes of string s are pi[n-1], pi[pi[n-1]-1], ... and so on, until the value becomes 0. The number of times each of these prefixes occurs in string s can be read directly from the precomputed array that stores the number of occurrences of each prefix of string s in string s.

Below is the implementation of above approach:
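A minimal Python sketch combining both steps follows. The names `prefix_function` and `border_occurrences` and the sample string "abacaba" are assumptions made for illustration. Only proper prefixes are reported, so the whole string is not listed.

```python
def prefix_function(s: str) -> list[int]:
    """Linear-time prefix function."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


def border_occurrences(s: str) -> list[tuple[int, int]]:
    """(length, occurrences) for each proper prefix of s that is also a suffix."""
    n = len(s)
    if n == 0:
        return []
    pi = prefix_function(s)
    # occ[k] = occurrences in s of its prefix of length k.
    occ = [0] * (n + 1)
    for value in pi:
        occ[value] += 1
    for i in range(n - 1, 0, -1):
        occ[pi[i - 1]] += occ[i]
    occ = [count + 1 for count in occ]  # the prefix itself at the start
    # Walk the chain pi[n-1], pi[pi[n-1]-1], ... until it reaches 0.
    result = []
    j = pi[n - 1]
    while j > 0:
        result.append((j, occ[j]))
        j = pi[j - 1]
    return result


for length, count in border_occurrences("abacaba"):
    print(f"Length: {length}, Occurrences: {count}")
```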


Output
Length: 3, Occurrences: 2
Length: 1, Occurrences: 4

Practice Problems:

  • Minimum characters to be added at front to make string palindrome
  • Minimum size substring to be removed to make a given string palindromic
  • Maximum number of given operations to remove the entire string

