![]() |
VOOZH | about |
Given a book of words and an integer K. Assume you have enough main memory to accommodate all words. Design a dynamic data structure to find the top K most frequent words in a book. The structure should allow new words to be added in main memory.
Examples:
Input: fileData = "Welcome to the world of Geeks. This portal has been created to provide well written well thought and well explained solutions for selected questions If you like Geeks for Geeks and would like to contribute here is your chance You can write article and mail your article to contribute at geeksforgeeks org See your article appearing on the Geeks for Geeks main page and help thousands of other Geeks"
Output:
"your" : 3
"well" : 3
"and" : 4
"to" : 4
"Geeks" : 6
Important Points about Implementations
Time Complexity : O(n + n Log k) where n is the number of words in the file. We assume that every word is of constant length.
The approach leverages a Trie to efficiently store and search words as they are read from the file, while simultaneously keeping track of each word's occurrence count. Each Trie node is enhanced with an additional field, indexMinHeap, which indicates the position of the word in the Min Heap if it is currently among the top k frequent words (or -1 if it is not). In parallel, a Min Heap of fixed size k is maintained to record the k most frequent words encountered so far. Each node in the Min Heap contains the word, its frequency, and a pointer to the corresponding Trie leaf node. As words are processed, the algorithm updates their frequencies in the Trie and then reflects these changes in the Min Heap by either updating an existing entry, inserting a new entry if space is available, or replacing the root of the Min Heap (which represents the least frequent word among the top k) when the new word’s frequency exceeds it.
Step-by-Step Process to Execute the Code
indexMinHeap is not -1), simply update its frequency in the heap and call minHeapify() at the respective index.indexMinHeap, and rebuild the heap.indexMinHeap to -1), and call minHeapify() to restore the heap property.Below is given the implementation:
your : 3 well : 3 and : 4 to : 4 Geeks : 6
The above output is for a file with following content.
Welcome to the world of Geeks . This portal has been created to provide well written well thought and well explained solutions for selected questions If you like Geeks for Geeks and would like to contribute here is your chance You can write article and mail your article to contribute at geeksforgeeks org See your article appearing on the Geeks for Geeks main page and help thousands of other Geeks.