
URL: https://www.geeksforgeeks.org/dsa/suffix-tree-application-3-longest-repeated-substring/


Suffix Tree Application 3 - Longest Repeated Substring

Last Updated : 23 Jul, 2025

Given a text string, find the longest repeated substring (LRS) in it, i.e. the longest substring that occurs at least twice (occurrences may overlap). If there is more than one longest repeated substring, return any one of them.
 

Longest Repeated Substring in GEEKSFORGEEKS is: GEEKS
Longest Repeated Substring in AAAAAAAAAA is: AAAAAAAAA
Longest Repeated Substring in ABCDEFG is: No repeated substring
Longest Repeated Substring in ABABABA is: ABABA
Longest Repeated Substring in ATCGATCGA is: ATCGA
Longest Repeated Substring in banana is: ana
Longest Repeated Substring in abcpqrabpqpq is: ab (pq is another LRS here)
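
To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the suffix tree approach discussed in this article; it only serves as a slow reference for checking the examples above. The function name longest_repeated_substring_bruteforce is a hypothetical helper, and returning the lexicographically smallest candidate on ties is an assumption of this sketch.

```python
def longest_repeated_substring_bruteforce(text: str) -> str:
    """Return a longest substring that occurs at least twice (overlaps allowed).

    Returns "" when no substring repeats. On ties, the lexicographically
    smallest candidate is returned (an assumption of this sketch).
    """
    n = len(text)
    # Try lengths from longest to shortest; the first length with a repeat wins.
    for length in range(n - 1, 0, -1):
        seen = set()
        repeated = set()
        for start in range(n - length + 1):
            piece = text[start:start + length]
            if piece in seen:
                repeated.add(piece)
            seen.add(piece)
        if repeated:
            return min(repeated)
    return ""


if __name__ == "__main__":
    for sample in ["GEEKSFORGEEKS", "ABCDEFG", "banana"]:
        result = longest_repeated_substring_bruteforce(sample)
        print(sample, "->", result or "No repeated substring")
```

This reference is far slower than linear, which is the motivation for the suffix tree approach.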


This problem can be solved by different approaches with varying time and space complexities. Here we discuss the suffix tree approach (the 3rd suffix tree application). Other approaches will be discussed soon.
As a prerequisite, we must know how to build a suffix tree in one way or another.
Here we build the suffix tree using Ukkonen’s Algorithm, already discussed in the following articles:
Ukkonen’s Suffix Tree Construction – Part 1 
Ukkonen’s Suffix Tree Construction – Part 2 
Ukkonen’s Suffix Tree Construction – Part 3 
Ukkonen’s Suffix Tree Construction – Part 4 
Ukkonen’s Suffix Tree Construction – Part 5 
Ukkonen’s Suffix Tree Construction – Part 6
Let's look at the following figure:
 

[Figure: Suffix Tree Application]


This is the suffix tree for the string "ABABABA$".
In this string, the following substrings are repeated:
A, B, AB, BA, ABA, BAB, ABAB, BABA, ABABA
The longest repeated substring is ABABA.
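
The list of repeated substrings above can be double-checked with a small counting sketch. This is only an illustration on a short string, not part of the suffix tree method; repeated_substrings is a hypothetical helper name, and the ordering by length and then alphabetically is chosen here just to match the listing above.

```python
from collections import Counter


def repeated_substrings(text: str) -> list[str]:
    """List every substring that occurs at least twice, shortest first."""
    n = len(text)
    # Count every occurrence of every substring (quadratic number of pieces).
    counts = Counter(text[i:j] for i in range(n) for j in range(i + 1, n + 1))
    repeated = [piece for piece, count in counts.items() if count >= 2]
    return sorted(repeated, key=lambda piece: (len(piece), piece))


if __name__ == "__main__":
    print(", ".join(repeated_substrings("ABABABA")))
```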
In a suffix tree, a node cannot have more than one outgoing edge starting with the same character. So if a substring is repeated in the text, all of its occurrences share the same path from the root, and that path goes through one or more internal nodes further down the tree (at or below the point where the substring ends on that path).
In the above figure, we can see that:
 

  • Path with Substring "A" has three internal nodes down the tree
  • Path with Substring "AB" has two internal nodes down the tree
  • Path with Substring "ABA" has two internal nodes down the tree
  • Path with Substring "ABAB" has one internal node down the tree
  • Path with Substring "ABABA" has one internal node down the tree
  • Path with Substring "B" has two internal nodes down the tree
  • Path with Substring "BA" has two internal nodes down the tree
  • Path with Substring "BAB" has one internal node down the tree
  • Path with Substring "BABA" has one internal node down the tree


All of the above substrings are repeated.
The substrings ABABAB, ABABABA, BABAB and BABABA have no internal node down the tree (at or after the point where the substring ends on the path), so they are not repeated.
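
The internal-node counts listed above can be reproduced with a small sketch. It assumes an uncompressed suffix trie (one character per edge, built naively in quadratic time) instead of the Ukkonen suffix tree; a trie node with two or more children plays the role of an internal node of the suffix tree. The names build_suffix_trie and branching_nodes_below are hypothetical.

```python
def build_suffix_trie(text: str) -> dict:
    """Naive suffix trie as nested dicts: one character per edge."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def branching_nodes_below(text: str, substring: str):
    """Count nodes with >= 2 children at or below the end of `substring`.

    Returns None when `substring` does not occur in `text`.
    """
    node = build_suffix_trie(text)
    for ch in substring:
        if ch not in node:
            return None
        node = node[ch]
    count = 0
    stack = [node]
    while stack:
        current = stack.pop()
        if len(current) >= 2:  # corresponds to an internal suffix tree node
            count += 1
        stack.extend(current.values())
    return count


if __name__ == "__main__":
    for piece in ["A", "AB", "ABA", "ABAB", "ABABA", "ABABAB", "B", "BABAB"]:
        print(piece, branching_nodes_below("ABABABA$", piece))
```

A count of zero means the substring occurs only once, which is the case for ABABAB and BABAB.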
Can you see how to find the longest repeated substring?
We can see in the figure that the longest repeated substring ends at the internal node that is farthest from the root, i.e. the deepest internal node in the tree. Depth here means the length of the path label from the root to that node (number of characters, not number of edges), because that length is the length of the substring.
So finding the longest repeated substring boils down to finding the deepest internal node in the suffix tree and then reading the path label from the root to that node.
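
The idea of finding the deepest internal node can be sketched as follows. This is a simplified illustration, not the Ukkonen-based implementation: it assumes a naive, uncompressed suffix trie that takes quadratic time and space, where any node with two or more children stands in for an internal suffix tree node. The function name, the terminal parameter and the tie-break on the lexicographically smallest label are assumptions of this sketch, and the terminal character must not occur in the text.

```python
def longest_repeated_substring(text: str, terminal: str = "$") -> str:
    """Return the path label of the deepest branching node, or "" if none."""
    text = text + terminal  # unique end marker, assumed absent from the text
    # Build a naive suffix trie as nested dicts (not Ukkonen's algorithm).
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})

    best = ""
    stack = [(root, "")]  # (node, path label from the root)
    while stack:
        node, label = stack.pop()
        is_branching = len(node) >= 2
        if is_branching and (
            len(label) > len(best)
            or (len(label) == len(best) and label < best)
        ):
            best = label
        for ch, child in node.items():
            stack.append((child, label + ch))
    return best


if __name__ == "__main__":
    for sample in ["GEEKSFORGEEKS", "ABABABA", "ABCDEFG"]:
        result = longest_repeated_substring(sample)
        print(sample, "->", result or "No repeated substring")
```

A linear-time version would run the same depth-first search over a compressed suffix tree and track path label lengths instead of building label strings.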
 

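Output of the suffix tree implementation built with Ukkonen’s algorithm (the terminal character $ is appended to each input string):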

Longest Repeated Substring in GEEKSFORGEEKS$ is: GEEKS
Longest Repeated Substring in AAAAAAAAAA$ is: AAAAAAAAA
Longest Repeated Substring in ABCDEFG$ is: No repeated substring
Longest Repeated Substring in ABABABA$ is: ABABA
Longest Repeated Substring in ATCGATCGA$ is: ATCGA
Longest Repeated Substring in banana$ is: ana
Longest Repeated Substring in abcpqrabpqpq$ is: ab
Longest Repeated Substring in pqrpqpqabab$ is: ab

When there are multiple longest repeated substrings (as in the last two test cases), this implementation prints the one that comes first lexicographically.
Ukkonen’s Suffix Tree Construction takes O(N) time and space to build the suffix tree for a string of length N, and finding the deepest internal node afterwards takes another O(N).
So the overall approach is linear in time and space.
Follow-up questions:
 

  1. Find all repeated substrings in given text
  2. Find all unique substrings in given text
  3. Find all repeated substrings of a given length
  4. Find all unique substrings of a given length
  5. In case of multiple LRS in text, find the one which occurs most number of times


All these problems can be solved with a few changes to the implementation, in time linear in the text length (plus the size of the output when the substrings themselves have to be listed).
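
As an illustration of follow-up 5, here is a hedged sketch that picks, among the longest repeated substrings, the one with the most occurrences. It again assumes a naive suffix trie rather than a linear-time suffix tree, so it is quadratic. Each trie node stores how many suffixes pass through it, which equals the number of occurrences of its path label. The names TrieNode and most_frequent_longest_repeat are hypothetical, and ties on both length and count are broken arbitrarily.

```python
class TrieNode:
    """Suffix trie node that remembers how many suffixes pass through it."""

    __slots__ = ("children", "occurrences")

    def __init__(self):
        self.children = {}
        self.occurrences = 0


def most_frequent_longest_repeat(text: str):
    """Return (substring, count) for the most frequent longest repeat.

    Returns ("", 0) when no substring occurs at least twice.
    """
    root = TrieNode()
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.children.setdefault(ch, TrieNode())
            node.occurrences += 1  # one more occurrence of this path label

    best_label, best_count = "", 0
    stack = [(root, "")]
    while stack:
        node, label = stack.pop()
        if label and (len(label), node.occurrences) > (len(best_label), best_count):
            best_label, best_count = label, node.occurrences
        for ch, child in node.children.items():
            # Extensions of a non-repeated substring cannot be repeated.
            if child.occurrences >= 2:
                stack.append((child, label + ch))
    return best_label, best_count


if __name__ == "__main__":
    print(most_frequent_longest_repeat("abcpqrabpqpq"))
```

In a compressed suffix tree the same count is the number of leaves below a node, which can be gathered in one traversal.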