
URL: https://www.geeksforgeeks.org/dsa/longest-valid-word-with-all-prefixes/


Longest Valid Word with All Prefixes

Last Updated : 3 Jul, 2025

Given an array of strings arr[], find the longest string such that all of its prefixes are also present in arr[]. If there are multiple such strings with the same maximum length, return the lexicographically smallest one.

Note: A prefix of a string is any substring starting from the first character, including the string itself.

Examples:

Input: arr[] = ["ab", "a", "abc", "abd"]
Output: abc
Explanation: Both "abc" and "abd" have all of their prefixes in the array ("abc" → prefixes: "a", "ab", "abc"; "abd" → prefixes: "a", "ab", "abd"). Since both satisfy the condition and have equal length, we return the lexicographically smaller one, which is "abc".

Input: arr[] = ["p", "pr", "pro", "probl", "problem", "pros", "process", "processor"]
Output: pros
Explanation: "pros" is the longest word with all of its prefixes present in the array (i.e., "p", "pr", "pro", "pros").

[Naive Approach] Using Sorting and Binary Search

The idea is to sort the words lexicographically, then iterate through each word and check if all of its prefixes (including itself) exist in the list using binary search. The longest valid word is tracked, and in case of a tie, the lexicographically smaller one is chosen.
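
A minimal Python sketch of this idea follows. The function name `longest_valid_word_naive` and the helper `contains` are illustrative, not taken from the article, and the standard `bisect` module stands in for a hand-written binary search.

```python
from bisect import bisect_left


def longest_valid_word_naive(arr):
    # Sorted copy for simplicity; sorting in place (arr.sort()) would avoid
    # the extra list if mutating the input is acceptable.
    words = sorted(arr)

    def contains(s):
        # Binary search for s in the sorted list.
        i = bisect_left(words, s)
        return i < len(words) and words[i] == s

    best = ""
    for word in words:
        # Check every prefix word[:1], word[:2], ..., word itself.
        if all(contains(word[:i]) for i in range(1, len(word) + 1)):
            # Words are visited in lexicographic order, so a strict '>'
            # keeps the lexicographically smallest word on ties.
            if len(word) > len(best):
                best = word
    return best


print(longest_valid_word_naive(["ab", "a", "abc", "abd"]))
```

Because the words are processed in sorted order, no explicit tie-breaking comparison is needed: the first word of a given length to pass the check is already the smallest one.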


Output
abc

Time Complexity: O(n * k^2 * log n). Each of the n words, of length up to k, has up to k prefixes, and each prefix requires a binary search that costs O(k * log n) because every string comparison takes up to O(k).
Auxiliary Space: O(1), as no extra data structures are used apart from a few variables (not counting the space used by the sort itself or by temporary prefix strings).

[Better Approach] Rabin-Karp Based Prefix Validation

This approach uses the Rabin-Karp rolling hash technique with double hashing: each word and each of its prefixes is represented by a pair of hash values, which makes a collision between two different strings very unlikely. The hash pairs of all words are stored in a set. For each word, we compute its prefix hashes incrementally, extending the hash by one character at a time, and check that every prefix hash exists in the set. The longest valid word (or the lexicographically smallest among words of equal length) is returned.

Note: We avoid using only one hash because it increases the risk of hash collisions, which can cause incorrect prefix validation. Double hashing greatly reduces this risk by ensuring two independent hash checks.
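
The following Python sketch illustrates the double-hashing idea under some stated assumptions: words contain only lowercase letters 'a' to 'z', and the bases and moduli (31 and 37, 1,000,000,007 and 1,000,000,009) are illustrative choices rather than values from the article. The names `longest_valid_word_hashing` and `extend` are likewise hypothetical.

```python
def longest_valid_word_hashing(arr):
    # Illustrative constants (assumptions, not from the article).
    BASE1, MOD1 = 31, 1_000_000_007
    BASE2, MOD2 = 37, 1_000_000_009

    def extend(h, ch):
        # Extend a (hash1, hash2) pair by one character.
        # Letters map to 1..26 so that 'a' never contributes 0.
        c = ord(ch) - ord("a") + 1
        return ((h[0] * BASE1 + c) % MOD1, (h[1] * BASE2 + c) % MOD2)

    def full_hash(word):
        h = (0, 0)
        for ch in word:
            h = extend(h, ch)
        return h

    # Hash pair of every complete word.
    seen = {full_hash(w) for w in arr}

    best = ""
    for word in arr:
        h = (0, 0)
        valid = True
        for ch in word:
            h = extend(h, ch)      # hash of the current prefix
            if h not in seen:      # this prefix is not a word in arr
                valid = False
                break
        if valid and (len(word) > len(best)
                      or (len(word) == len(best) and word < best)):
            best = word
    return best


print(longest_valid_word_hashing(["ab", "a", "abc", "abd"]))
```

Python's built-in `set` is hash-based, so lookups here are O(1) on average rather than the O(log n) of an ordered set. As with any hashing scheme, a collision could in principle produce a false match; double hashing only makes that improbable.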


Output
abc

Time Complexity: O(n * k * log n) - for each of the n words, all k prefixes are looked up in the set, at O(log n) per lookup (as with an ordered set).
Auxiliary Space: O(n) - used to store the hash (a pair of integers) of each word.

[Expected Approach] Trie-Based Prefix Validation

The idea is to insert all words into a Trie, marking the end of each word. Then, for each word, we check whether all its prefixes also exist as complete words in the Trie. If a word passes this check, we consider it as a candidate for the result. Among all valid candidates, we select the longest one, preferring lexicographically smaller words in case of a tie.

Algorithm:

  • Define a TrieNode with 26 child pointers and a boolean isEnd.
  • Insert every word from the array into the Trie.
  • While inserting, create missing child nodes and mark isEnd = true at the last character.
  • For each word, verify that every prefix is present in the Trie and has isEnd = true.
  • Keep the longest word that passes this check; on ties choose the lexicographically smallest.
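
The steps above can be sketched in Python as follows. This is a minimal illustration that assumes words contain only lowercase letters 'a' to 'z'; the helper names `insert`, `all_prefixes_present`, and `longest_valid_word_trie` are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = [None] * 26  # one slot per lowercase letter
        self.is_end = False          # True if a word ends at this node


def insert(root, word):
    node = root
    for ch in word:
        idx = ord(ch) - ord("a")
        if node.children[idx] is None:
            node.children[idx] = TrieNode()  # create missing child
        node = node.children[idx]
    node.is_end = True


def all_prefixes_present(root, word):
    # Walk the word's path; every node on it must mark the end of a word.
    node = root
    for ch in word:
        node = node.children[ord(ch) - ord("a")]
        if node is None or not node.is_end:
            return False
    return True


def longest_valid_word_trie(arr):
    root = TrieNode()
    for word in arr:
        insert(root, word)

    best = ""
    for word in arr:
        if all_prefixes_present(root, word):
            # Prefer longer words; on ties prefer the smaller one.
            if len(word) > len(best) or (len(word) == len(best) and word < best):
                best = word
    return best


print(longest_valid_word_trie(["ab", "a", "abc", "abd"]))
```

Checking `is_end` at every node along the path is what distinguishes this from an ordinary Trie search: the path for a word always exists after insertion, so only the end markers tell us whether each prefix is itself a word in the array.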

Output
abc

Time Complexity: O(n*k): one pass for insertion and one for prefix checks, where n is the number of words and k is the average word length.
Auxiliary Space: O(n*k), where n is the number of words and k is the average word length.
