Top K Frequent Elements: three solutions and when each one matters

by Duong Tran Quang aka DTDucasJune 13, 2026 · 10 min read#000061 reads#hash-table#arrays#bucket-sort#heap#leetcode

Three genuinely different solutions exist for this problem, and which one you write says something about what you understood. The follow-up — "your algorithm must be better than O(n log n)" — eliminates the obvious approach and forces the choice between a heap and a structure most people haven't thought of yet.

What the constraints actually force

The problem: given nums (up to 10^5 elements, values in [-10^4, 10^4]) and integer k, return the k most frequent elements in any order.

Two constraints matter here. First, k is guaranteed in range [1, count of unique elements] — you never handle an impossible k, and the answer is guaranteed unique (no two elements share frequency exactly at the k-th boundary). Second, and more interesting, the follow-up says time complexity must be better than O(n log n).

That second constraint rules out sorting entirely. What replaces it depends on whether k is small relative to n or you want true linear time.

There's also a hidden constraint worth naming: element values are bounded to [-10^4, 10^4]. That doesn't affect any of these three approaches directly, but it means frequency is bounded by n, not by the value range. That bound on frequency is what makes bucket sort work.

Step zero: count frequencies

All three approaches start the same way. Build a frequency map in O(n).

python

from collections import Counter
freq = Counter(nums)   # {element: count}

In C#:

csharp

var freq = new Dictionary<int, int>();
foreach (int n in nums)
    freq[n] = freq.GetValueOrDefault(n, 0) + 1;

After this, the problem transforms: given value -> frequency, return the k entries with the highest frequency. The difference between approaches is entirely in how you extract those top-k entries.

Approach 1: sort the frequency map — O(n log n)

Sort the unique elements by frequency descending, take the first k. This is the baseline you write to confirm you understood the problem.

python

class Solution:
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        freq = Counter(nums)
        return [x for x, _ in sorted(freq.items(), key=lambda p: p[1], reverse=True)[:k]]

csharp

public class Solution {
    public int[] TopKFrequent(int[] nums, int k) {
        var freq = new Dictionary<int, int>();
        foreach (int n in nums)
            freq[n] = freq.GetValueOrDefault(n, 0) + 1;
 
        var sorted = freq.OrderByDescending(kvp => kvp.Value).Take(k);
        return sorted.Select(kvp => kvp.Key).ToArray();
    }
}

O(n log n) — counting is O(n), sorting the distinct elements costs at most O(n log n). Clean, readable, and explicitly excluded by the follow-up.

By-hand trace with nums = [1,1,1,2,2,3], k = 2:

After counting:  freq = {1: 3, 2: 2, 3: 1}
Sort descending: [(1, 3), (2, 2), (3, 1)]
Take first k=2:  [1, 2]

I still write this first in an interview as the baseline, then say "the follow-up rules this out, let me do better."

Complexity: O(n log n) time, O(n) space. Fails the follow-up.

Approach 2: min-heap of size k — O(n log k)

Keep a min-heap of exactly k entries ordered by frequency. For each new frequency-map entry, push it, then pop if the heap exceeds k. The element that gets popped is always the one with the lowest frequency — which means after processing all entries, the heap holds exactly the top-k.

python

import heapq
from collections import Counter
 
class Solution:
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        freq = Counter(nums)
        # heap entries are (frequency, element) so heapq compares by frequency first
        heap: list[tuple[int, int]] = []
        for val, count in freq.items():
            heapq.heappush(heap, (count, val))
            if len(heap) > k:
                heapq.heappop(heap)   # evict the lowest-frequency element
        return [val for _, val in heap]

csharp

public class Solution {
    public int[] TopKFrequent(int[] nums, int k) {
        var freq = new Dictionary<int, int>();
        foreach (int n in nums)
            freq[n] = freq.GetValueOrDefault(n, 0) + 1;
 
        // PriorityQueue<TElement, TPriority> is a min-heap in .NET 6+
        // lowest priority (= lowest frequency) dequeues first
        var heap = new PriorityQueue<int, int>();
        foreach (var (val, count) in freq) {
            heap.Enqueue(val, count);
            if (heap.Count > k)
                heap.Dequeue();   // evict the least frequent
        }
 
        var result = new int[k];
        for (int i = 0; i < k; i++)
            result[i] = heap.Dequeue();
        return result;
    }
}

By-hand trace with nums = [1,1,1,2,2,3], k = 2:

Heap entries are (frequency, value). A min-heap keeps the smallest frequency at the root.

freq = {1: 3, 2: 2, 3: 1}

Process val=1, count=3:
  push (3, 1)   -> heap: [(3,1)]        size=1, no eviction

Process val=2, count=2:
  push (2, 2)   -> heap: [(2,2), (3,1)] size=2, no eviction
                          ^root (min-freq)

Process val=3, count=1:
  push (1, 3)   -> heap: [(1,3), (3,1), (2,2)]  size=3 > k=2
  pop min (1,3) -> heap: [(2,2), (3,1)]          size=2
                          ^root

Result: elements from heap = [2, 1]  (any order is valid)

The push-then-pop ordering is deliberate: the new element competes against the current weakest before anything is discarded. If you pop before push, you might discard the new element without giving it a fair comparison.

Why a min-heap to find the top? Because you want to evict the weakest element when the heap is full. The minimum sits at the root of a min-heap, making eviction O(log k). With a max-heap, the weakest element would be buried at a leaf — you'd have to linear-scan to find it.

Complexity: O(n log k) time — each of the up-to-n distinct values gets one push and possibly one pop, both O(log k). O(n) space for the frequency map, O(k) for the heap. Since k <= n, this is strictly better than O(n log n) whenever k < n.

Approach 3: bucket sort — O(n)

This one requires noticing the hidden structure: frequency is bounded by array length. A value cannot appear more than n times. So instead of sorting by frequency, you can index by frequency directly.

Allocate n + 1 buckets. buckets[i] holds all values that appear exactly i times. Then scan from buckets[n] down to buckets[1], collecting values until you have k. Total cost: O(n).

python

class Solution:
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        freq = Counter(nums)
        buckets: list[list[int]] = [[] for _ in range(len(nums) + 1)]
        for val, count in freq.items():
            buckets[count].append(val)
 
        result: list[int] = []
        for i in range(len(buckets) - 1, 0, -1):
            for val in buckets[i]:
                result.append(val)
                if len(result) == k:
                    return result
        return result

csharp

public class Solution {
    public int[] TopKFrequent(int[] nums, int k) {
        var freq = new Dictionary<int, int>();
        foreach (int n in nums)
            freq[n] = freq.GetValueOrDefault(n, 0) + 1;
 
        // buckets[i] = list of elements that appear exactly i times
        var buckets = new List<int>[nums.Length + 1];
        for (int i = 0; i <= nums.Length; i++)
            buckets[i] = new List<int>();
 
        foreach (var (val, count) in freq)
            buckets[count].Add(val);
 
        var result = new List<int>();
        for (int i = nums.Length; i >= 1 && result.Count < k; i--)
            result.AddRange(buckets[i]);
 
        return result.Take(k).ToArray();
    }
}

By-hand trace with nums = [1,1,1,2,2,3], k = 2:

freq = {1: 3, 2: 2, 3: 1}   n = 6, allocate buckets[0..6]

Fill buckets:
  val=1, count=3  ->  buckets[3] = [1]
  val=2, count=2  ->  buckets[2] = [2]
  val=3, count=1  ->  buckets[1] = [3]

Scan from i=6 down:
  i=6: []  (empty)
  i=5: []  (empty)
  i=4: []  (empty)
  i=3: [1]  ->  result=[1],  size=1 < k=2
  i=2: [2]  ->  result=[1,2], size=2 == k, return [1, 2]

The structure makes the scan efficient:

The constraint that makes this possible — frequency <= array length — is obvious once stated but easy to miss under interview pressure. It is what allows you to pre-size the bucket array without any hash structure.

Complexity: O(n) time, O(n) space. Both the bucket array and the frequency map are O(n). You give up nothing in space relative to the heap, and you gain the linear time bound.

Complexity comparison

Approach	Time	Space	Meets follow-up?
Sorting	O(n log n)	O(n)	No
Min-heap	O(n log k)	O(n)	Yes
Bucket sort	O(n)	O(n)	Yes

Edge cases and how the code handles them

k = 1: you want only the single most frequent element. The heap never holds more than 1 entry, so it degenerates to a running maximum. Bucket sort stops at the first non-empty bucket from the top.
k = number of unique elements: you want everything. The heap fills to exactly k and never evicts — every element survives. Bucket sort collects all buckets before stopping. Both return all unique elements.
nums = [1], k = 1: single element, frequency 1. The frequency map has one entry; all three approaches return [1] immediately. The only thing to double-check is that buckets[1] exists when n=1 — it does, since you allocate n+1 = 2 buckets.
All elements distinct: nums = [1,2,3,4], k = 2. Every element has frequency 1. Bucket sort drops everything into buckets[1] and returns the first k encountered — order is unspecified, which the problem accepts. The heap holds whichever two elements happen to be processed last (also unspecified order, also fine).
Large k close to n: if k = n - 1, the heap holds nearly everything and degenerates toward O(n log n). Bucket sort remains O(n). This is where the theoretical gap between the two approaches actually shows up in practice.

Which one to reach for

Sorting is the answer you write to confirm you understand the problem. The heap is what you write when you need to prove you can do better. Bucket sort is what you write when you want O(n) and can justify the constraint out loud.

In practice I default to the heap. O(n log k), correct, clean, and uses only structures the interviewer will recognise immediately. Bucket sort is the nicer theoretical answer, but explaining why n + 1 buckets suffice takes a sentence the interviewer should hear before you write the code — not after.

The O(n log k) vs O(n) gap matters at scale. At n = 10^5 and k = 10^4, the heap does roughly 10^5 * 14 ~ 1.4M operations; bucket sort does a flat 10^5. Real difference. At n = 100 with any k, nobody cares. Let the actual problem scale tell you which to use.

This problem in the arrays-hashing series

This is the culminating problem of the frequency-counting thread in the series. Two Sums used a seen-set for O(1) membership; Valid Anagram used a frequency array for character counting; Group Anagrams used a canonical-key map. Here, frequency itself becomes the first-class data, and the question is how to rank by it efficiently.

The underlying mental model — "when values are bounded, index by value rather than sorting" — reappears in radix sort, counting sort, and any problem where the range of a key is O(n). Once you see it here, you'll see it elsewhere.

Sibling problems worth doing next:

215. Kth Largest Element in an Array — same size-k heap pattern, keyed on value not frequency; introduces quickselect as an O(n) alternative
692. Top K Frequent Words — same frequency counting, adds lexicographic tie-breaking which complicates the heap comparison
451. Sort Characters By Frequency — frequency bucket sort but you return all characters sorted, no k cutoff
973. K Closest Points to Origin — heap keyed on Euclidean distance; frequency map replaced by a distance computation

Part of the series: Arrays & Hashing

NEWSLETTER

Occasional notes on what I'm building

Get an email when I publish a new post — engineering write-ups, no spam. Unsubscribe anytime.

Comments

No comments yet. Be the first.

Top K Frequent Elements: three solutions and when each one matters

What the constraints actually force

Step zero: count frequencies

Approach 1: sort the frequency map — O(n log n)

Approach 2: min-heap of size k — O(n log k)

Approach 3: bucket sort — O(n)

Complexity comparison

Edge cases and how the code handles them

Which one to reach for

This problem in the arrays-hashing series

Occasional notes on what I'm building

Comments

Let's build something reliable, end to end