VOOZH about

URL: https://www.geeksforgeeks.org/python/python-extract-elements-with-frequency-greater-than-k/

⇱ Python - Extract elements with Frequency greater than K - GeeksforGeeks


  • Courses
  • Tutorials
  • Interview Prep

Python - Extract elements with Frequency greater than K

Last Updated : 8 May, 2023

Given a List, extract all elements whose frequency is greater than K.

Input : test_list = [4, 6, 4, 3, 3, 4, 3, 4, 3, 8], K = 3 
Output : [4, 3] 
Explanation : Both elements occur 4 times. 

Input : test_list = [4, 6, 4, 3, 3, 4, 3, 4, 6, 6], K = 2 
Output : [4, 3, 6] 
Explanation : Occur 4, 3, and 3 times respectively.

Method #1 : Using count() + loop

In this, we use count() to get the frequency, and a loop is used to iterate for each of the elements of the List. 


Output
The original list : [4, 6, 4, 3, 3, 4, 3, 7, 8, 8]
The required elements : [4, 3]

Time Complexity: O(n)
Auxiliary Space: O(n)

Method #2 : Using list comprehension + Counter()

In this, we perform the task of counting using Counter() and the iteration part is done inside list comprehension.


Output
The original list : [4, 6, 4, 3, 3, 4, 3, 7, 8, 8]
The required elements : [4, 3]

Time complexity: O(n), where n is the length of the test_list. The list comprehension + Counter() takes O(n) time
Auxiliary Space: O(n), extra space of size n is required

Method #3: Using a dictionary

In this we keep count of the elements, if the count of the element id K + 1, we add that element to the output.


Output
The original list : [4, 6, 4, 3, 3, 4, 3, 4, 6, 6]
The required elements : [4, 3, 6]

Time Complexity: O(n), where n is the length test_list. 
Auxiliary Space: O(n), where n is number of elements in output list.

Method #4: Using operator.countOf() method


Output
The original list : [4, 6, 4, 3, 3, 4, 3, 7, 8, 8]
The required elements : [3, 4]

Time Complexity:O(N)
Auxiliary Space:O(N)

Method#5: using NumPy module

Output:

The required elements : [3, 4]

Time Complexity:O(N)
Auxiliary Space:O(N)

Method #6: Using filter() and lambda()

Algorithm:

  1. Import the Counter class from the collections module.
  2. Define the input list test_list and the threshold value K.
  3. Use the Counter class to count the frequency of each element in the test_list.
  4. Use the filter() function with a lambda function to keep only the elements whose frequency is greater than K.
  5. Convert the result into a list and print it

Output
The required elements : [4, 3]

Time Complexity:

The time complexity of this code is O(n), where n is the length of the input list. This is because the Counter() function from the collections module has a time complexity of O(n), where n is the number of elements in the list. The filter() function and the conversion to a list take linear time, which is also O(n).

Auxiliary Space:

The space complexity of this code is also O(n), where n is the length of the input list. This is because the Counter() function creates a dictionary that stores the frequency count of each element, and the size of this dictionary is proportional to the number of elements in the input list. The output list created by filter() can also be at most n elements long, so the space complexity of the final result is also O(n).

Method #7: Using  defaultdict module from the collections library:

Algorithm:

  1. Import the defaultdict module from the collections library.
  2. Initialize an empty dictionary, freq_dict, using defaultdict(int) to store the frequency count of each element in the test_list.
  3. Set the value of K.
  4. Loop through each key-value pair in freq_dict.
  5. If the frequency of an element is greater than K, append that element to the res list.
  6. Print the final result.
     

Output
The original list : [4, 6, 4, 3, 3, 4, 3, 7, 8, 8]
The required elements : [4, 3]


Time Complexity: O(n), where n is the length of the input list. This is because we iterate through the list once to build the frequency dictionary and once again to find elements with frequency greater than K.


Auxiliary Space: O(n), where n is the length of the input list. This is because we store the frequency counts of each element in the test_list in the freq_dict dictionary. The space taken by the res list is proportional to the number of elements with frequency greater than K. However, the worst-case scenario is when all elements have a unique frequency count, resulting in a dictionary with n key-value pairs.
 

Comment