K Closest Points to Origin: sorting, min heap, and when to use a bounded max heap
The distance formula for this problem is sqrt(x² + y²), and the first instinct is to sort by it. That works. But sorting is O(n log n) whether you need 1 result or n-1, and for large n with small k there's a better shape to the solution. This problem exists in the series specifically because the three approaches — sort, min heap, bounded max heap — represent a genuine trade-off ladder, not just stylistic variations.
What the constraints force
The input is 1 <= k <= points.length <= 10^4, with coordinates bounded by [-10^4, 10^4]. A few things follow directly.
The coordinate bound means x² + y² fits comfortably in a 32-bit integer: worst case is 10^4 * 10^4 + 10^4 * 10^4 = 2 * 10^8, well below Int32.MaxValue (about 2.1 * 10^9). So you can use squared Euclidean distance as the comparison key without overflow and without the floating-point noise that Math.Sqrt introduces. I skip the sqrt everywhere below — it doesn't affect the ordering, only the magnitude.
n <= 10^4 means O(n log n) is fast enough. You'd never hit a TLE here with any of the three approaches. The difference between them shows up when n is large and k is small, which is the case the max-heap approach is designed for. At n = 10^4, that difference is imperceptible. At n = 10^7 with k = 10, it's the difference between holding ten million items in memory or ten.
Approach 1: full sort
Sort all points by squared distance, take the first k. Simple, readable, and correct.
class Solution:
def kClosest(self, points: list[list[int]], k: int) -> list[list[int]]:
points.sort(key=lambda p: p[0] * p[0] + p[1] * p[1])
return points[:k]public class Solution {
public int[][] KClosest(int[][] points, int k) {
Array.Sort(points, (p1, p2) => {
int d1 = p1[0] * p1[0] + p1[1] * p1[1];
int d2 = p2[0] * p2[0] + p2[1] * p2[1];
return d1.CompareTo(d2);
});
int[][] result = new int[k][];
for (int i = 0; i < k; i++) result[i] = points[i];
return result;
}
}Time: O(n log n). Space: O(1) if you count the sort as in-place (Python's Timsort uses O(n) auxiliary; Array.Sort in C# uses introsort at O(log n) stack depth for the recursion). Either way, no extra data structure proportional to n.
The problem with this: you sort things you throw away. If n is 10,000 and k is 3, you've established a total order over 10,000 items to extract 3. That's wasted work, and it's exactly the gap the heap approaches close.
Approach 2: min heap (heapify then pop k times)
Push all points into a min heap keyed by squared distance. Pop k times. Each pop extracts the minimum-distance point.
import heapq
class Solution:
def kClosest(self, points: list[list[int]], k: int) -> list[list[int]]:
heap = []
for p in points:
d = p[0] * p[0] + p[1] * p[1]
heapq.heappush(heap, (d, p))
result = []
for _ in range(k):
_, p = heapq.heappop(heap)
result.append(p)
return resultpublic class Solution {
public int[][] KClosest(int[][] points, int k) {
var minHeap = new PriorityQueue<int[], int>();
foreach (var p in points) {
int d = p[0] * p[0] + p[1] * p[1];
minHeap.Enqueue(p, d);
}
int[][] result = new int[k][];
for (int i = 0; i < k; i++) {
result[i] = minHeap.Dequeue();
}
return result;
}
}Time: O(n log n) to build the heap, then O(k log n) for the k pops. Total O(n log n). Space: O(n) — the heap holds all n points.
The asymptotic time is the same as sorting. The memory is worse: O(n) heap vs O(1) sort. There's no advantage over sorting for this problem as stated, unless you're streaming points one by one and can't buffer them all for a sort — then the heap is natural. For a static array, the sort is strictly better here.
I mostly use this approach when explaining heap mechanics to someone, because the structure is transparent: you push everything, then pull the best k. It makes the heap invariant visible.
Approach 3: bounded max heap (optimal when k is small relative to n)
The real improvement: instead of a min heap of size n, keep a max heap of size k. Invariant: the heap always holds the k closest points seen so far, with the farthest of those k sitting at the top.
When you encounter a new point, if its distance is less than the heap's maximum (the farthest of your current k closest), evict the maximum and insert the new point. Otherwise discard the new point. At the end, the heap contains exactly the k closest.
import heapq
class Solution:
def kClosest(self, points: list[list[int]], k: int) -> list[list[int]]:
# Python's heapq is a min heap; negate distance to simulate a max heap
max_heap = []
for p in points:
d = p[0] * p[0] + p[1] * p[1]
if len(max_heap) < k:
heapq.heappush(max_heap, (-d, p))
elif d < -max_heap[0][0]:
heapq.heapreplace(max_heap, (-d, p))
return [p for _, p in max_heap]public class Solution {
public int[][] KClosest(int[][] points, int k) {
// .NET's PriorityQueue is a min heap; negate distance for max-heap behavior
var maxHeap = new PriorityQueue<int[], int>(Comparer<int>.Create((a, b) => b.CompareTo(a)));
foreach (var p in points) {
int d = p[0] * p[0] + p[1] * p[1];
if (maxHeap.Count < k) {
maxHeap.Enqueue(p, d);
} else if (d < maxHeap.Peek().Item2 || (maxHeap.TryPeek(out _, out int topPriority) && d < topPriority)) {
maxHeap.Dequeue();
maxHeap.Enqueue(p, d);
}
}
return maxHeap.UnorderedItems.Select(x => x.Element).ToArray();
}
}The C# version with PriorityQueue requires a workaround because PriorityQueue doesn't expose the current minimum priority cleanly when used as a max heap. A cleaner approach stores the priority explicitly:
public class Solution {
public int[][] KClosest(int[][] points, int k) {
// Store (distance, point) in a list; treat as max heap by negating the compare
var heap = new List<(int dist, int[] point)>(k);
int heapCount = 0;
foreach (var p in points) {
int d = p[0] * p[0] + p[1] * p[1];
if (heapCount < k) {
heap.Add((d, p));
heapCount++;
if (heapCount == k) {
// Build max heap once we have k elements
for (int i = k / 2 - 1; i >= 0; i--) SiftDown(heap, i, k);
}
} else if (d < heap[0].dist) {
heap[0] = (d, p);
SiftDown(heap, 0, k);
}
}
return heap.Select(x => x.point).ToArray();
}
private void SiftDown(List<(int dist, int[] point)> heap, int i, int size) {
while (true) {
int largest = i;
int left = 2 * i + 1, right = 2 * i + 2;
if (left < size && heap[left].dist > heap[largest].dist) largest = left;
if (right < size && heap[right].dist > heap[largest].dist) largest = right;
if (largest == i) break;
(heap[i], heap[largest]) = (heap[largest], heap[i]);
i = largest;
}
}
}Time: O(n log k). Each of the n points triggers at most one heap operation costing O(log k). Space: O(k).
Walking through the example by hand
points = [[1,3],[-2,2]], k = 1.
The heap holds at most 1 element. When we see [-2,2] with distance 8, it beats the current maximum (10), so it replaces. Simple.
For points = [[3,3],[5,-1],[-2,4]], k = 2:
Squared distances: [3,3] -> 18, [5,-1] -> 26, [-2,4] -> 20.
- Push [3,3]: heap = [(-18, [3,3])], size 1 < 2.
- Push [5,-1]: heap = [(-26, [5,-1]), (-18, [3,3])], size 2 = k. Max heap root is -26, so the farthest point tracked is [5,-1] at distance 26.
- See [-2,4]: distance 20. Is 20 < 26? Yes. Replace root: evict [5,-1], push [-2,4]. Heap = [(-20, [-2,4]), (-18, [3,3])].
- Done. Result: [[3,3], [-2,4]].
Edge cases
k = 1. All three approaches work. The max heap holds exactly one item and we compare every subsequent point against it. Nothing unusual.
k = points.length. The max heap is never full enough to evict anything — every point gets pushed. We end up with all n points. Correct.
points contains [0,0]. Squared distance is 0. If k >= 1, the origin is always in the result. The bounded heap will accept it immediately and it will never be evicted (no other point can have a smaller distance). All three approaches handle this without special-casing.
Two points with equal distance. The problem says "the answer is guaranteed to be unique (except for the order that it is in)" — meaning there are no ties at the boundary where swapping one point for another would produce a valid alternative result. So you don't need to worry about a distance-tie at position k.
Comparison
| Approach | Time | Space | When to use |
|---|---|---|---|
| Full sort | O(n log n) | O(1) | Default; k is close to n, or code clarity matters most |
| Min heap (size n) | O(n log n) | O(n) | Streaming input; explaining heap mechanics |
| Bounded max heap (size k) | O(n log k) | O(k) | Large n, small k; memory-sensitive production path |
Which one I'd ship
For a static array at these constraints (n <= 10^4), the full sort is my first choice. It's two lines in Python, relies on a well-tested standard sort, and there are no tricky negation hacks. Anyone reading the code understands it immediately.
If n were 10^7 and k were 50, I'd reach for the bounded max heap. The memory difference — holding 10 million points vs 50 — is real, and O(n log 50) vs O(n log 10^7) is a meaningful constant factor in practice.
The min heap version I use only in interview contexts to demonstrate that I understand heap construction, or when the input is a stream where I can't buffer everything first.
Where this sits in the series
The previous problems in this series — the Kth Largest Element and Top K Frequent Elements — established the heap reflex: when a problem says "k-th" or "top k", reach for a priority queue. This problem adds the geometric dimension (Euclidean distance as the key) and makes the trade-off between the min-heap and bounded-max-heap variants concrete.
The bounded max heap pattern generalises directly to:
- LeetCode 215 — Kth Largest Element in an Array: same bounded max heap, but the comparison is the value itself, not a computed distance. The O(n log k) solution there is the direct analogue.
- LeetCode 347 — Top K Frequent Elements: frequency replaces distance; you build a max heap bounded to size k over (frequency, element) pairs.
- LeetCode 692 — Top K Frequent Words: extends 347 with a lexicographic tiebreak — the comparison key becomes (frequency, -lexicographic_order), which is where the negation trick shows up in a less obvious form.
- LeetCode 378 — Kth Smallest Element in a Sorted Matrix: the matrix structure lets you do better than O(n log k), but a bounded heap is a reasonable starting point before the binary search approach.
The pattern across all of them: you have n items, you want k of them ranked by some key, and you don't need the full sorted order. That shape is the bounded max heap.
Part of the series: Heap / Priority Queue
Occasional notes on what I'm building
Get an email when I publish a new post — engineering write-ups, no spam. Unsubscribe anytime.
Comments
No comments yet. Be the first.
