Problem Statement
We have to implement a topKFrequent
function that takes an integer array and a value k
as input and returns the k
most frequent elements in the input array. For example, topKFrequent({1, 1, 2, 2, 3, 5, 2}, 2)
will return the top 2
most frequent elements in the array i.e. {2, 1}
.
If multiple elements have the same frequency, then any of them could be returned in the solution as long as its length is k
.
Brute Force Solution
First, we have to gather the data for the frequency of array elements inside a hashmap. We have to group elements with the same frequency, so we’ll invert the key and values in the hashmap from element -> count
to count -> [elements]
.
We can also extract a list of frequencies from this hashmap which will be sorted in descending order to extract the top k
elements.
Psuedo-code for the Brute Force Solution
|
|
Time Complexity Analysis
Best Case Scenario
The first loop over the array has $O(n)$ time complexity, where $n$ is the size of the input array.
Let’s assume that the second loop also has $O(n)$ time complexity (length(counter) == length(reverseCounter)
). In the best-case scenario, there will be only one key in reverseCounter
i.e. all elements have the same frequency, which decreases the time complexity of this loop to $O(1)$.
The fastest sorting algorithm will sort the freq
array in $O(n \log(n))$ time.
The final loop (over freq
) will execute k
times. On each iteration, it will also loop over the list stored in reverseCounter[freq[index]]
. Since the loop will exit as soon as the length of the result array reaches k
its time complexity could be generalized to $O(k)$.
Thus, the total time complexity of the best-case scenario for brute force solution will be $O(n) + O(1) + O(n \log(n)) + O(k)$, which could be generalized to $O(n \log n)$.
Worst Case Scenario
In the worst-case scenario, all the elements in the input array will have distinct frequencies. Thus, the length of counter
and reverseCounter
will be the same resulting in total time complexity $O(n) + O(n) + O(n \log(n)) + O(k)$, which could also be generalized to $O(n \log n)$.
Space Complexity Analysis
The memory space used by counter
and reverseCounter
will be $O(n)+O(n)$. We are assuming that the freq
array is sorted in place so it won’t require additional memory space. Finally, the space complexity of the topKElements
array will be $O(n)$ if k=length(nums)
. Thus, the total space complexity of the brute-force solution could be simplified to $O(n)$.
Code for the Brute Force Solution
|
|
Optimized Solution
The sorting operation on the frequency array has the highest time complexity ($O(n \log(n))$).
The highest frequency an element can achieve in an array is the length of the array (all the elements are the same) and the lowest frequency for an element is 1
(a single element in the array). Thus, instead of sorting frequencies, we can loop over the known range of frequencies and return the top k
elements.
Psuedo code for the Optimized Solution
|
|
Time Complexity Analysis
Best Case Scenario
In the best-case scenario (all elements have the same frequency), the time complexity of all loops except the one on reverseCounter
will be $O(n)$ which sums up to $O(2n)$ for the complete function (could also be generalized to $O(n)$).
Worst Case Scenario
For the worst-case scenario, where all elements have distinct frequencies, the time complexity of every loop is $O(n)$ resulting in $O(3n)$ (generalized to $O(n)$) time complexity for the complete function.
Space Complexity Analysis
The additional memory required by the optimized solution to store counter
, reverseCounter
, and topKElements
will be $O(n)$.
Code for the Optimized Solution
|
|
Thank you for taking the time to read this blog post! If you found this content valuable and would like to stay updated with my latest posts consider subscribing to my RSS Feed.
Resources
347. Top K Frequent Elements
Top K Frequent Elements - Bucket Sort - Leetcode 347 - Python