Rabin-Karp Algorithm for Pattern Searching

Last Updated : 28 Mar, 2026

Given two strings, text and pattern, consisting of lowercase English letters, find all 0-based starting indices where pattern occurs as a substring of text.

Examples:

Input: text = "geeksforgeeks", pattern = "geeks"
Output: [0, 8]
Explanation: The string "geeks" occurs at indices 0 and 8 in text.

Input: text = "aabaacaadaabaaba", pattern = "aaba"
Output: [0, 9, 12]
Explanation: The string "aaba" occurs at indices 0, 9 and 12 in text.

The Naive String Matching algorithm compares every substring of the text that has the pattern's length against the pattern, one by one.
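For reference, the naive approach can be sketched as follows. This is a minimal illustration rather than a definitive implementation; the function name naive_search is an assumption made for this example.

```python
def naive_search(text: str, pattern: str) -> list[int]:
    """Return all 0-based indices where pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    matches = []
    for i in range(n - m + 1):
        # Compare the window text[i:i+m] with the pattern, character by character.
        if text[i:i + m] == pattern:
            matches.append(i)
    return matches


print(naive_search("aabaacaadaabaaba", "aaba"))  # [0, 9, 12]
```

Each window comparison can cost up to m character checks, which is the work Rabin-Karp tries to avoid by comparing hashes first.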

Like the naive algorithm, the Rabin-Karp algorithm also checks every substring. But unlike the naive algorithm, it first compares the hash value of the pattern with the hash value of the current substring of the text, and compares characters only when the two hashes match. The Rabin-Karp algorithm has the following steps:
Compute the hash of the pattern.
Compare the pattern's hash with the hashes of all substrings of the text that have the same length as the pattern. If the hashes match, do a character-by-character check to confirm the match (to avoid errors due to hash collisions).

How is Hash Value calculated in Rabin-Karp?

The hash value is calculated using a rolling hash function, which allows you to update the hash value for a new substring by efficiently removing the contribution of the old character and adding the contribution of the new character. This makes it possible to slide the pattern over the text and calculate the hash value for each substring without recalculating the entire hash from scratch.

For a string s of length m, the simple hash is
hash(s) = (s[0] × d^(m−1) + s[1] × d^(m−2)+ ... + s[m−1] × d⁰) % q
Here,
s[i] represents the ASCII value of the character ('a' = 97, 'b' = 98, ..., 'z' = 122)
d is the size of the input alphabet (commonly 256 for ASCII characters)
q is a prime number used as modulus (commonly 101, 1000000007, etc.)
Taking the result modulo q keeps hash values in a fixed range and prevents integer overflow. Choosing q to be a large prime reduces the chance of hash collisions.
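The formula above can be evaluated left to right without computing large powers (Horner's rule). The sketch below assumes d = 256 and q = 101 purely for illustration; the name poly_hash is hypothetical.

```python
def poly_hash(s: str, d: int = 256, q: int = 101) -> int:
    """Compute (s[0]*d^(m-1) + s[1]*d^(m-2) + ... + s[m-1]*d^0) % q."""
    h = 0
    for ch in s:
        # Multiplying by d shifts every earlier term up by one power of d.
        h = (h * d + ord(ch)) % q
    return h


# Worked example: "ab" -> (97 * 256 + 98) % 101 = 24930 % 101 = 84
print(poly_hash("ab"))  # 84
```

Reducing modulo q after every step keeps the intermediate value below q * d, so it never grows with the length of the string.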

How do we compute rolling hash?

Suppose we know the hash value of the window s[i−1 … j−1], which has length m, and now need the hash value of the next window s[i … j].

Previous value of hash

The hash value of substring s[i−1 … j−1] is: (s[i-1] * d^(m-1) + s[i] * d^(m-2) + ... + s[j-1] * d⁰) mod q

New value of hash

The hash value of the next substring s[i … j] is: (s[i] * d^(m-1) + s[i+1] * d^(m-2)+ ... + s[j] * d⁰ ) mod q

From the above expressions, we can get the new value from the old value in three steps: remove the term s[i-1] * d^(m-1), multiply what remains by d (which raises the power of d in every remaining term by one), and add the term s[j] * d⁰:

hash(i, j) = (d * (hash(i−1, j−1) − s[i−1] * d^(m−1) )+ s[j]) mod q
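As a sanity check, the rolling update can be compared with recomputing each window's hash from scratch. This is a sketch under assumed values d = 256 and q = 101; the helper names direct_hash and roll are made up for this example.

```python
def direct_hash(s: str, d: int = 256, q: int = 101) -> int:
    """Hash of s computed from scratch."""
    h = 0
    for ch in s:
        h = (h * d + ord(ch)) % q
    return h


def roll(prev_hash: int, out_char: str, in_char: str, high: int,
         d: int = 256, q: int = 101) -> int:
    """Slide the window by one: drop out_char on the left, append in_char.

    high must be d^(m-1) % q for window length m.
    """
    # Python's % always returns a value in [0, q-1], even for a negative operand.
    without_left = (prev_hash - ord(out_char) * high) % q
    return (without_left * d + ord(in_char)) % q


text, m, d, q = "geeksforgeeks", 5, 256, 101
high = pow(d, m - 1, q)  # d^(m-1) mod q
h = direct_hash(text[:m], d, q)
for i in range(1, len(text) - m + 1):
    h = roll(h, text[i - 1], text[i + m - 1], high, d, q)
    assert h == direct_hash(text[i:i + m], d, q)
print("rolling hash agrees with direct hash for every window")
```

Each update costs O(1), whereas recomputing the hash of a window from scratch costs O(m).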

Here's how the hash value is typically calculated in Rabin-Karp:

Choose a base d for the hash function (commonly 31 for lowercase characters or 256 for ASCII). Also select a large prime number q as the modulus to prevent overflow and reduce hash collisions.
Precompute d^(m-1) mod q, since this term is needed in every rolling hash update, as explained above.
Calculate the hash of the pattern and of the first window of the text (its first m characters). Let the pattern hash be p and the text hash be t. Both use the formula (s[0] * d^(m-1) + s[1] * d^(m-2) + ... + s[m-1] * d⁰) mod q.
Slide the window by removing the leftmost character and adding the next one using hash(i, j) = (d * (hash(i-1, j-1) - s[i-1] * d^(m-1)) + s[j]) mod q. In languages where the remainder of a negative number can be negative, add q to bring the result back into the range 0 to q−1.
If the current text hash equals the pattern hash, verify character by character. This is needed because different substrings can have the same hash (collision).
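The steps above can be put together as in the following sketch. It is one possible implementation, not necessarily the article's original code; the defaults d = 256 and q = 1000000007 and the function name rabin_karp are assumptions made for this example.

```python
def rabin_karp(text: str, pattern: str, d: int = 256,
               q: int = 1_000_000_007) -> list[int]:
    """Return all 0-based indices where pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    high = pow(d, m - 1, q)  # d^(m-1) mod q, used to remove the leftmost character

    # Hash of the pattern (p) and of the first window of the text (t).
    p = t = 0
    for i in range(m):
        p = (p * d + ord(pattern[i])) % q
        t = (t * d + ord(text[i])) % q

    matches = []
    for i in range(n - m + 1):
        # Equal hashes only suggest a match; verify to rule out a collision.
        if p == t and text[i:i + m] == pattern:
            matches.append(i)
        # Roll the hash forward to the window starting at i + 1.
        if i < n - m:
            t = ((t - ord(text[i]) * high) * d + ord(text[i + m])) % q
    return matches


print(*rabin_karp("geeksforgeeks", "geeks"))
```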

Output

0 8

Time Complexity:

Average case: O(n + m) - Computing the hash of the pattern and the first substring of text takes O(m), and sliding the window over the text with hash comparisons takes O(n) on average. Character-by-character verification is rarely needed.
Worst case: O(n × m) - If many hash collisions occur, each substring may require a full character comparison.
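The worst case is easy to trigger when every window really is a match, for example a text and pattern made of a single repeated letter. The sketch below counts the character comparisons done during verification; the function name and the small modulus q = 101 are assumptions chosen for illustration.

```python
def count_verification_comparisons(text: str, pattern: str,
                                   d: int = 256, q: int = 101) -> int:
    """Run Rabin-Karp and count the character comparisons made while verifying."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0
    high = pow(d, m - 1, q)
    p = t = 0
    for i in range(m):
        p = (p * d + ord(pattern[i])) % q
        t = (t * d + ord(text[i])) % q

    comparisons = 0
    for i in range(n - m + 1):
        if p == t:
            # Hashes agree, so verify character by character.
            for j in range(m):
                comparisons += 1
                if text[i + j] != pattern[j]:
                    break
        if i < n - m:
            t = ((t - ord(text[i]) * high) * d + ord(text[i + m])) % q
    return comparisons


# Every one of the 16 windows matches, so each needs all 5 comparisons.
print(count_verification_comparisons("a" * 20, "a" * 5))  # 80
```

Here the number of comparisons is (n − m + 1) × m, which matches the O(n × m) bound.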

Auxiliary Space: O(1)

Limitations of Rabin-Karp

Hash collisions may occur - different substrings can have the same hash.
Requires a character-by-character check to confirm matches.
Risk of integer overflow in intermediate products if the modular arithmetic is not handled with care.
Performance depends on a good hash function and prime modulus.
Slightly higher constant factors compared to algorithms like KMP.

Types of Problems Rabin-Karp Can Solve

Pattern matching - Find all occurrences of a pattern in a large text
Plagiarism detection - Compare documents by checking for common substrings
Multiple pattern search - Efficiently search for several patterns at once
Substring comparison - Quickly compare substrings using hashes
Palindrome and DP problems - Used for hashing-based optimizations
Detect duplicate substrings - Find repeated sequences in strings
Longest common prefix/suffix - Compare any two substrings in constant time using precomputed prefix hashes, then binary search on the length
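Several of these uses rely on precomputed prefix hashes, which give the hash of any substring in O(1). The sketch below is one way to do it; the class name PrefixHash and the defaults d = 256 and q = 1000000007 are assumptions for this example.

```python
class PrefixHash:
    """Precompute prefix hashes so the hash of any s[l:r] is available in O(1)."""

    def __init__(self, s: str, d: int = 256, q: int = 1_000_000_007):
        self.q = q
        n = len(s)
        self.prefix = [0] * (n + 1)  # prefix[i] = hash of s[:i]
        self.power = [1] * (n + 1)   # power[i] = d^i mod q
        for i, ch in enumerate(s):
            self.prefix[i + 1] = (self.prefix[i] * d + ord(ch)) % q
            self.power[i + 1] = (self.power[i] * d) % q

    def get(self, l: int, r: int) -> int:
        """Hash of s[l:r] (l inclusive, r exclusive)."""
        return (self.prefix[r] - self.prefix[l] * self.power[r - l]) % self.q


ph = PrefixHash("geeksforgeeks")
# Both ranges spell "geeks", so their hashes are equal.
print(ph.get(0, 5) == ph.get(8, 13))  # True
```

Equal hashes only suggest equal substrings, so code that must be exact should still verify the characters or use more than one modulus.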

URL: https://www.geeksforgeeks.org/dsa/rabin-karp-algorithm-for-pattern-searching/