Kth Largest Element: three ways to avoid sorting the whole array
Pick up LeetCode 215 cold and the obvious move is sort(reverse=True), return nums[k-1]. It works. It's also the answer that makes interviewers follow up with "can you do better?" — because you've used O(n log n) time to find one element, discarding all the comparative work you did on the n - k elements you didn't need.
The problem (LeetCode 215): given an integer array nums and an integer k, return the kth largest element in sorted order. Duplicates count — if the array is [5, 5, 4] and k=2, the answer is 5, not 4. The follow-up in the problem statement ("Can you solve it without sorting?") is the hint that two better approaches exist.
What the constraints force
1 <= k <= nums.length <= 10^5. Three things fall out of this:
k is always valid. No need to guard for k > n. The problem guarantees an answer exists.
n can reach 100,000. An O(n^2) solution is too slow — that's 10 billion operations in the worst case. A plain sort at O(n log n) fits comfortably. But an O(n) average solution fits even better.
-10^4 <= nums[i] <= 10^4. The value range is bounded: 20,001 distinct integers. This matters for a bucket sort variant (O(n) worst case by counting occurrences), though the heap and QuickSelect approaches are more general and don't need it.
No overflow concern on element values. But in C#, if you're computing (left + right) / 2 in QuickSelect, the index arithmetic can theoretically overflow if left and right are near Int32.MaxValue. Use left + (right - left) / 2.
Brute force: sort and index
Sorting descending and returning nums[k-1] is the baseline. I'd write it in a contest where I needed to move fast and came back later.
from typing import List
class Solution:
def findKthLargest(self, nums: List[int], k: int) -> int:
nums.sort(reverse=True)
return nums[k - 1]public class Solution {
public int FindKthLargest(int[] nums, int k) {
Array.Sort(nums, (a, b) => b.CompareTo(a));
return nums[k - 1];
}
}Time O(n log n). Space O(1) for in-place sort (Python's Timsort allocates O(n) scratch space internally, but that's an implementation detail). The correctness argument is trivial. The problem is that you do O(n log n) comparisons to determine the order of every element, then throw away everything except index k-1.
For this problem's constraint (n <= 10^5), this passes LeetCode's judge. In a production setting where this query runs millions of times against large arrays, you care.
Min heap: keep exactly k elements
The insight: if I want the kth largest element, I want to track the k largest elements I've seen so far. At any point the smallest of those k elements is my current candidate for the answer. A min heap of size k does exactly this.
- If the heap has fewer than
kelements, push unconditionally. - If the heap already has
kelements and the new element is larger than the heap root (the smallest of the current top-k), pop the root and push the new element. The new element knocked out the smallest of the previous top-kand took its place. - If the new element is smaller than or equal to the root, it cannot be in the top-
k. Skip it.
After processing all n elements, the heap holds exactly the k largest values seen. The root — the smallest among them — is the kth largest.
import heapq
from typing import List
class Solution:
def findKthLargest(self, nums: List[int], k: int) -> int:
min_heap: List[int] = []
for num in nums:
if len(min_heap) < k:
heapq.heappush(min_heap, num)
elif num > min_heap[0]:
heapq.heapreplace(min_heap, num)
return min_heap[0]using System.Collections.Generic;
public class Solution {
public int FindKthLargest(int[] nums, int k) {
// PriorityQueue<TElement, TPriority> ships with .NET 6+.
// Priority equals the element itself for a min heap.
var minHeap = new PriorityQueue<int, int>();
foreach (int num in nums) {
if (minHeap.Count < k) {
minHeap.Enqueue(num, num);
} else if (num > minHeap.Peek()) {
minHeap.Dequeue();
minHeap.Enqueue(num, num);
}
}
return minHeap.Peek();
}
}Tracing through nums = [3,2,1,5,6,4], k = 2
Step 2 is where the min heap property matters. After pushing both 3 and 2, the heap root is 2 (the smaller). When 1 arrives, it loses to 2 and gets dropped. When 5 arrives, it beats 2 and replaces it — now the heap holds [3, 5], the two largest values seen so far. After 6 pushes out 3, the heap is [5, 6]. The 4 that comes last can't beat 5. Final root: 5. Correct.
Time O(n log k). Each of the n elements triggers at most one push and one pop from the heap, and both are O(log k). Space O(k) — only the heap, not the whole input.
When k is small relative to n, this is noticeably faster than sorting. With k=10 and n=100,000, the heap processes each element in O(log 10) ≈ 3.3 operations instead of O(log 100,000) ≈ 17.
QuickSelect: partition toward the answer
QuickSelect is the part of QuickSort you keep when you only want one answer. In QuickSort you partition and recurse on both halves. In QuickSelect you partition and recurse on the one half that contains the kth largest.
The partition works like this: pick a pivot, rearrange the array so everything larger than the pivot is to the pivot's left and everything smaller is to its right, then place the pivot in its final position. After partition, if the pivot landed at index k-1 (0-indexed, treating the array as sorted descending), you're done — that element is the answer. If the pivot is too far right (index > k-1), the answer is in the left portion. Too far left (index < k-1), search the right portion.
Randomizing the pivot selection avoids the O(n^2) worst case that happens when you always pick the min or max (which would occur deterministically on already-sorted input if you always chose the last element as pivot).
import random
from typing import List
class Solution:
def findKthLargest(self, nums: List[int], k: int) -> int:
def quickselect(left: int, right: int) -> int:
pivot_idx = random.randint(left, right)
nums[pivot_idx], nums[right] = nums[right], nums[pivot_idx]
pivot = nums[right]
store = left
for i in range(left, right):
if nums[i] > pivot:
nums[store], nums[i] = nums[i], nums[store]
store += 1
nums[store], nums[right] = nums[right], nums[store]
if store == k - 1:
return nums[store]
elif store > k - 1:
return quickselect(left, store - 1)
else:
return quickselect(store + 1, right)
return quickselect(0, len(nums) - 1)using System;
public class Solution {
private readonly Random _rand = new Random();
public int FindKthLargest(int[] nums, int k) {
return QuickSelect(nums, 0, nums.Length - 1, k);
}
private int QuickSelect(int[] nums, int left, int right, int k) {
int pivotIdx = _rand.Next(left, right + 1);
Swap(nums, pivotIdx, right);
int pivot = nums[right];
int store = left;
for (int i = left; i < right; i++) {
if (nums[i] > pivot) {
Swap(nums, store, i);
store++;
}
}
Swap(nums, store, right);
if (store == k - 1) return nums[store];
if (store > k - 1) return QuickSelect(nums, left, store - 1, k);
return QuickSelect(nums, store + 1, right, k);
}
private static void Swap(int[] nums, int i, int j) {
(nums[i], nums[j]) = (nums[j], nums[i]);
}
}Tracing through nums = [3,2,1,5,6,4], k = 2
Say the random pivot picks index 5 (value 4). The partition puts everything greater than 4 to the left:
After partition: [5, 6, 4, 3, 2, 1] (elements > 4 moved left)
pivot 4 lands at index 2
Index 2 > k-1 = 1, so search left: quickselect(0, 1). Now on [5, 6]. Pivot picks one of them — say 6 (index 1). Nothing in range [0, 0] is greater than 6, so store stays at 0. Pivot lands at index 1. That's > k-1 = 1? No, 1 == 1? No, we need index == k-1 = 1. store is 0 at this point, swap puts 6 at index 1. Wait — let me re-trace this properly.
Subproblem: left=0, right=1, k=2. Pivot is nums[1]=6. store=0. For i=0: nums[0]=5 > 6? No. Loop ends. Swap nums[0] with nums[1]: array becomes [6, 5, ...]. store=0. Is store == k-1 = 1? No — 0 != 1. Is store > k-1? No. So recurse right: quickselect(1, 1, 2). Single element: pivot is 5. After partition store=1. Is store == k-1 = 1? Yes. Return nums[1] = 5. Correct.
The trace shows the key property: each recursive call works on a strictly smaller subarray, so the algorithm terminates.
Time O(n) average, O(n^2) worst case. With random pivot, the expected number of comparisons works out to O(n) — each level of recursion processes n, then n/2, then n/4 ... summing to 2n. Space O(1) for the partition (in-place), but the recursion stack is O(log n) average depth, O(n) worst case.
One thing worth knowing: the worst case is not academic. If you run the non-randomized version (always picking nums[right] as pivot) against an already-sorted descending array with k=1, every partition call places the pivot at position 0 and recurses on n-1 elements. That's exactly O(n^2). The random pivot kills that pathology.
Edge cases that matter
k = 1 (find the maximum). All three approaches return the correct maximum. For the heap, the heap holds one element — the running maximum. For QuickSelect, the answer is the leftmost element after full partition.
k = n (find the minimum). QuickSelect recurses until the rightmost surviving subarray. The heap processes all n elements, keeps n elements (the whole array), and the root is the minimum. Correct.
All elements the same, e.g. [7,7,7,7], k=3. Brute force: nums[2] = 7. Heap: every element equals the root, so num > min_heap[0] is always false after the heap fills — the heap stays as [7,7,7] and returns 7. QuickSelect: the partition loop never swaps because no element strictly exceeds the pivot — store stays at left, pivot lands at left. We recurse right repeatedly. Eventually the subarray reaches the k-th index. Returns 7. All correct.
Already sorted descending, k=1, non-randomized QuickSelect. As described: O(n^2) comparisons. This is the reason the randomized version exists. Randomize the pivot.
Single element, k=1. All three return that element. The heap's len(min_heap) < k branch pushes the single element, loop ends, return min_heap[0].
Complexity comparison
| Approach | Time | Space | Notes |
|---|---|---|---|
| Sort descending | O(n log n) | O(1) | Simplest; overkill for large n or small k |
| Min heap (size k) | O(n log k) | O(k) | Best when k << n; stable; works on streams |
| QuickSelect | O(n) avg, O(n²) worst | O(log n) avg stack | Fastest average; mutates input; not streaming |
What I'd actually ship
For an interview where the question is "find kth largest in a static array," I'd reach for the min heap. The code is clean, the complexity argument is straightforward, and the space usage is controlled. QuickSelect has better average time complexity, but it mutates the input array (sometimes a problem), the randomization is easy to forget, and the worst-case story requires explaining why O(n^2) is acceptable.
The min heap approach also generalizes to streaming inputs — if nums arrived as a data stream rather than a complete array, the heap algorithm works unchanged. QuickSelect fundamentally doesn't; you need the full array to partition. That generalization is exactly what LeetCode 703 (Kth Largest Element in a Stream) tests.
If I knew k was small — say, always less than 100 — the min heap wins on constants too. Log base 2 of 100 is less than 7. Every heap operation is under 7 comparisons. Sorting pays 17 comparisons per element on average for n=100,000.
If someone required in-place mutation and O(n) average time, I'd use QuickSelect with a randomized pivot and be explicit about the worst case. In practice that situation arises less often than interviews suggest.
The underlying pattern: top-K maintenance
The mental model from this problem is the top-K heap pattern: use a min heap of fixed size k to maintain the k largest (or a max heap of size k to maintain the k smallest). The heap root is always your candidate answer. New elements only enter if they can displace the current worst of the top-k.
This pattern appears in a cluster of related problems:
LeetCode 703 — Kth Largest Element in a Stream. Same heap, but elements arrive one at a time. You never have the full array, so QuickSelect is off the table. The heap is the only clean solution.
LeetCode 347 — Top K Frequent Elements. You're not selecting on raw value but on frequency. Build a frequency map first, then apply the same size-k min heap on (frequency, element) tuples. The heap logic is identical; the comparison key changes.
LeetCode 973 — K Closest Points to Origin. Distance to origin replaces raw value. Min heap of size k on (distance, point). Again, identical structure.
LeetCode 692 — Top K Frequent Words. Adds lexicographic tiebreaking when frequencies match. The heap comparison becomes (frequency, word) with word comparison reversed (so a lower word beats a higher-frequency-equal word). The pattern holds; the comparison key is more complex.
In each of those, once you recognize "I'm maintaining the k best by some criterion," the heap drops in immediately. The criterion changes, the structure doesn't.
This is the first problem in the heap-priority-queue series because it's the cleanest possible statement of the top-K pattern. Everything that follows builds on it: more complex comparison keys, heaps used for scheduling or merging, and problems where you pop from the heap mid-loop rather than just at the end.
Part of the series: Heap / Priority Queue
Occasional notes on what I'm building
Get an email when I publish a new post — engineering write-ups, no spam. Unsubscribe anytime.
Comments
No comments yet. Be the first.
