Merge K Sorted Lists: why O(k²N) breaks and how two O(N log k) strategies fix it
The problem at first glance looks like a modest generalization: you already know how to merge two sorted linked lists, so surely k lists is just doing that a few more times. That intuition is exactly right, but "a few more times" is where the cost hides. If you repeatedly merge list 1 into the result, then list 2, then list 3, the growing result list gets re-walked every time. By the time you finish, you have done O(k²N) comparisons. With k up to 10^4 and total nodes N up to 10^4, that is a hundred million operations in the worst case. The constraints are telling you to think harder.
The problem
You are given an array of k linked lists, each already sorted in ascending order. Merge all of them into one sorted linked list and return it. Full statement: LeetCode 23.
Constraints:
- k == lists.length
- 0 <= k <= 10^4
- 0 <= lists[i].length <= 500
- -10^4 <= lists[i][j] <= 10^4
- lists[i] is sorted in ascending order.
- The sum of lists[i].length will not exceed 10^4.What the constraints force
k == lists.length, 0 <= k <= 10^4, 0 <= lists[i].length <= 500, total nodes <= 10^4, values in [-10^4, 10^4].
Three things stand out. First, k can reach ten thousand. Any algorithm that does O(k) work per node is O(kN) overall, and with both k and N at 10^4 that is 10^8 operations — slow. This is the ceiling that eliminates the naive linear-scan approach in competitive settings.
Second, total nodes N is also bounded at 10^4. So N and k share the same order of magnitude. That makes O(N log k) — roughly 10^4 * 14 — quite comfortable.
Third, the value range [-10^4, 10^4] is signed and includes negatives. You cannot use 0 as a sentinel and you cannot ignore negative values when building your min heap. The comparison logic must be on raw values.
The sorted-within-each-list guarantee is the structural gift the problem gives you. It means you never need to look inside a list beyond the current head — the minimum candidate from each list always sits at the front. Both efficient approaches exploit this directly.
Approach 1: Brute force — linear scan for minimum each step
The most literal reading of "find the minimum and advance": at every step, walk all k list heads and find the smallest one, append it to the result, advance that list. Repeat until all lists are exhausted.
from typing import List, Optional
class ListNode:
def __init__(self, val=0, next=None):
self.val = val
self.next = next
class Solution:
def mergeKLists(self, lists: List[Optional[ListNode]]) -> Optional[ListNode]:
if not lists:
return None
dummy = ListNode(0)
current = dummy
while True:
min_val = float('inf')
min_idx = -1
for i in range(len(lists)):
if lists[i] is not None and lists[i].val < min_val:
min_val = lists[i].val
min_idx = i
if min_idx == -1:
break
current.next = lists[min_idx]
current = current.next
lists[min_idx] = lists[min_idx].next
return dummy.nextpublic class ListNode {
public int val;
public ListNode next;
public ListNode(int val = 0, ListNode next = null) {
this.val = val;
this.next = next;
}
}
public class Solution {
public ListNode MergeKLists(ListNode[] lists) {
if (lists == null || lists.Length == 0) return null;
ListNode dummy = new ListNode(0);
ListNode current = dummy;
while (true) {
int minVal = int.MaxValue;
int minIdx = -1;
for (int i = 0; i < lists.Length; i++) {
if (lists[i] != null && lists[i].val < minVal) {
minVal = lists[i].val;
minIdx = i;
}
}
if (minIdx == -1) break;
current.next = lists[minIdx];
current = current.next;
lists[minIdx] = lists[minIdx].next;
}
return dummy.next;
}
}Every node costs one pass through k list pointers to find the minimum. N nodes total means N passes, giving O(k * N) time. Space is O(1) — no auxiliary structure, just pointer manipulation.
The flaw is clear: each node extraction requires a full scan of all k lists. When k is large, most of those k comparisons are wasted because you are revisiting lists that cannot possibly have the current minimum (their heads are still larger). You are throwing away the sorted order that the problem gives you for free.
Approach 2: Min heap — O(log k) minimum extraction
The structural fix is direct: replace the O(k) linear scan with a min heap that keeps exactly one entry per active list. Extracting the minimum is now O(log k), and after extraction you push the next node from the same list back in. The heap size never exceeds k.
import heapq
from typing import List, Optional
class Solution:
def mergeKLists(self, lists: List[Optional[ListNode]]) -> Optional[ListNode]:
if not lists:
return None
heap = []
for i, node in enumerate(lists):
if node:
heapq.heappush(heap, (node.val, i, node))
dummy = ListNode(0)
current = dummy
while heap:
val, list_idx, node = heapq.heappop(heap)
current.next = node
current = current.next
if node.next:
heapq.heappush(heap, (node.next.val, list_idx, node.next))
return dummy.nextThe tuple (node.val, list_idx, node) carries the list index as a tiebreaker. Python's heap compares tuples element by element, so when two nodes have the same value, the comparison falls to list_idx (an int) rather than attempting to compare ListNode objects — which would raise a TypeError. This is not a micro-optimization; without it, the code crashes on tied values.
using System.Collections.Generic;
public class Solution {
public ListNode MergeKLists(ListNode[] lists) {
if (lists == null || lists.Length == 0) return null;
// PriorityQueue<TElement, TPriority> is available from .NET 6+
var pq = new PriorityQueue<(int listIdx, ListNode node), int>();
for (int i = 0; i < lists.Length; i++) {
if (lists[i] != null) {
pq.Enqueue((i, lists[i]), lists[i].val);
}
}
ListNode dummy = new ListNode(0);
ListNode current = dummy;
while (pq.Count > 0) {
var (listIdx, node) = pq.Dequeue();
current.next = node;
current = current.next;
if (node.next != null) {
pq.Enqueue((listIdx, node.next), node.next.val);
}
}
return dummy.next;
}
}C#'s PriorityQueue (added in .NET 6) takes a separate priority argument, so there is no tiebreaker issue — the priority is just the integer value, and the element is the pair (listIdx, node). Cleaner than the Python workaround.
Time: O(N log k). Each of the N nodes is pushed once and popped once, and each heap operation costs O(log k). Space: O(k) for the heap, which holds at most k entries at any moment.
Step-by-step trace: lists = [[1,4,5],[1,3,4],[2,6]]
Let me walk through this with the concrete example from the problem.
Initial heap (after seeding with list heads):
heap = [(1, idx=0, node→1->4->5), (1, idx=1, node→1->3->4), (2, idx=2, node→2->6)]
result: dummy ->
Step 1: pop (1, 0, node→1)
append node(1) to result
push next node(4) from list 0 -> heap = [(1,1,→1->3->4), (2,2,→2->6), (4,0,→4->5)]
result: dummy -> 1
Step 2: pop (1, 1, node→1)
append node(1) to result
push next node(3) from list 1 -> heap = [(2,2,→2->6), (3,1,→3->4), (4,0,→4->5)]
result: dummy -> 1 -> 1
Step 3: pop (2, 2, node→2)
append node(2) to result
push next node(6) from list 2 -> heap = [(3,1,→3->4), (4,0,→4->5), (6,2,→6)]
result: dummy -> 1 -> 1 -> 2
Step 4: pop (3, 1, node→3)
append node(3) to result
push next node(4) from list 1 -> heap = [(4,0,→4->5), (4,1,→4), (6,2,→6)]
result: dummy -> 1 -> 1 -> 2 -> 3
Step 5: pop (4, 0, node→4)
append node(4) to result
push next node(5) from list 0 -> heap = [(4,1,→4), (5,0,→5), (6,2,→6)]
result: dummy -> 1 -> 1 -> 2 -> 3 -> 4
Step 6: pop (4, 1, node→4)
append node(4), list 1 exhausted (node.next == null)
heap = [(5,0,→5), (6,2,→6)]
result: dummy -> 1 -> 1 -> 2 -> 3 -> 4 -> 4
Step 7: pop (5, 0, node→5)
append node(5), list 0 exhausted
heap = [(6,2,→6)]
result: dummy -> 1 -> 1 -> 2 -> 3 -> 4 -> 4 -> 5
Step 8: pop (6, 2, node→6)
append node(6), list 2 exhausted
heap = []
result: dummy -> 1 -> 1 -> 2 -> 3 -> 4 -> 4 -> 5 -> 6
Heap empty. Return dummy.next.
The heap never grows beyond 3 entries (k = 3). Each extraction guarantees the global minimum among all remaining heads.
Approach 3: Divide and conquer — merge sort applied to lists
Instead of processing one node at a time, you can restructure the problem entirely. Pair up the k lists and merge each pair. You now have k/2 lists. Pair them up again. After log k rounds, you have one merged list.
from typing import List, Optional
class Solution:
def mergeKLists(self, lists: List[Optional[ListNode]]) -> Optional[ListNode]:
if not lists:
return None
return self._divide(lists, 0, len(lists) - 1)
def _divide(self, lists, start, end):
if start == end:
return lists[start]
if start > end:
return None
mid = (start + end) // 2
left = self._divide(lists, start, mid)
right = self._divide(lists, mid + 1, end)
return self._merge_two(left, right)
def _merge_two(self, l1, l2):
dummy = ListNode(0)
cur = dummy
while l1 and l2:
if l1.val <= l2.val:
cur.next = l1
l1 = l1.next
else:
cur.next = l2
l2 = l2.next
cur = cur.next
cur.next = l1 or l2
return dummy.nextpublic class Solution {
public ListNode MergeKLists(ListNode[] lists) {
if (lists == null || lists.Length == 0) return null;
return Divide(lists, 0, lists.Length - 1);
}
private ListNode Divide(ListNode[] lists, int start, int end) {
if (start == end) return lists[start];
if (start > end) return null;
int mid = (start + end) / 2;
ListNode left = Divide(lists, start, mid);
ListNode right = Divide(lists, mid + 1, end);
return MergeTwo(left, right);
}
private ListNode MergeTwo(ListNode l1, ListNode l2) {
ListNode dummy = new ListNode(0);
ListNode cur = dummy;
while (l1 != null && l2 != null) {
if (l1.val <= l2.val) {
cur.next = l1;
l1 = l1.next;
} else {
cur.next = l2;
l2 = l2.next;
}
cur = cur.next;
}
cur.next = l1 ?? l2;
return dummy.next;
}
}The recursion tree looks like this for k = 4:
Each level of the tree processes all N nodes exactly once. There are log k levels. Total work: O(N log k). Stack depth is O(log k) for the recursion frames.
Time: O(N log k), same as the heap. Space: O(log k) for the call stack — strictly better than the heap's O(k) when k is large.
Edge cases that matter
Empty input (lists = []): The if not lists / if lists.Length == 0 guard fires and returns null. Nothing to merge.
Array of empty lists (lists = [[]] or lists = [null, null]): The heap approach seeds no entries — every head is null, so nothing is pushed. The while loop never runs. Returns dummy.next = null. The divide and conquer approach returns lists[0] which is null. Both correct.
Single list (k = 1): Heap seeds one entry, immediately pops it, pushes the rest of the list one node at a time. Returns the list unchanged. Divide and conquer hits start == end immediately and returns lists[0]. One-liner result, no actual merging.
All nodes with the same value: The Python heap tiebreaker (list_idx) prevents TypeError. The C# PriorityQueue handles it naturally since priority is an int. The resulting order among equal-valued nodes from different lists is stable by list index — consistent but the problem does not require stability.
Negative values: The value range includes -10^4. The heap comparison is on raw integer values, negatives included. No special handling needed. The sorted order within each list is still ascending regardless of sign.
Lists of very different lengths: One list might have 500 nodes, another might have 1. The heap handles this transparently — the long list keeps feeding nodes in one by one as they get extracted, the short list contributes its one node and goes silent. Divide and conquer handles it equally well since the merging of two lists of unequal length is the base case.
Comparison table
| Approach | Time | Space | Notes |
|---|---|---|---|
| Linear scan brute force | O(k * N) | O(1) | Simple; fails badly when k is large |
| Min heap | O(N log k) | O(k) | Transparent implementation; heap size bounded by k |
| Divide and conquer | O(N log k) | O(log k) | Lower space; familiar merge-sort structure |
Both heap and divide-and-conquer land at O(N log k). The difference is in space and in how the work is organized. The heap processes nodes one at a time via global minimum extraction. Divide and conquer processes them in bulk by reducing the problem size logarithmically.
Which one I'd ship, and why
For an interview, I reach for the min heap. The logic is transparent, the code is short, and the tiebreaker trick in Python is a genuine signal that you have thought through implementation details rather than just described an algorithm. When the interviewer asks how you handle duplicate values, you can point directly to list_idx in the tuple. That conversation is easy.
For production code where k could be large and stack depth matters — say, a database engine merging sorted runs from parallel workers — I would prefer divide and conquer. The O(log k) stack depth is hard to exhaust, and the structure maps naturally to parallel execution: you can run independent branches on separate cores. The heap cannot parallelize as cleanly because every extraction serializes through the shared heap state.
In practice, the constants favor the heap slightly for small k (the heap management overhead is lower than the recursive call overhead). Once k reaches the hundreds, the divide-and-conquer's lower space footprint starts to matter. Pick based on what your constraints actually look like.
The underlying pattern and where it appears
The name for this is k-way merge. The pattern shows up wherever you have k sorted sequences and need to produce one sorted sequence efficiently. The heap is the canonical implementation: seed with one element per sequence, always extract the minimum, always replenish from the same sequence. The divide-and-conquer is the merge-sort variant: reduce k sequences to 1 by halving the count at each level.
Both patterns transfer directly beyond linked lists. The sequences could be arrays, iterators from disk files, or streams from network sockets. The merge logic stays the same; only the data structure changes.
Where this sits in the linked-list series
This is the hardest merge problem in the linked-list series. The earlier problems established the foundation: Merge Two Sorted Lists (21) is the primitive that every approach here uses as a building block — if that two-list merge is not immediate for you, start there. Reverse Linked List (206) and Reorder List (143) each add a pointer-manipulation step on top of a traversal; they are simpler than this one. Sort List (148) is the natural next step: apply divide-and-conquer merge sort to sort an unsorted linked list, which is essentially this problem run in reverse — building sorted runs first, then merging them.
Four problems worth connecting:
Find K Pairs with Smallest Sums (373) uses a min heap to extract k smallest pairs from two arrays — the same heap-based k-way extraction, applied to a virtual sequence of candidate pairs instead of actual list nodes.
Kth Smallest Element in a Sorted Matrix (378) treats each row as a sorted list and asks for the k-th smallest element overall. The heap approach from this problem adapts directly: seed the heap with the first element of each row.
Smallest Range Covering Elements from K Lists (632) extends this problem by asking not for the merged list but for the narrowest window that contains at least one element from each list. Same heap structure, different question asked at each extraction step.
Merge Sorted Array (88) is the array version of the two-list merge: simpler constraints but the same comparison logic. Worth solving if you want to see how the dummy-node trick disappears when you have pre-allocated output space.
Part of the series: Linked List
Occasional notes on what I'm building
Get an email when I publish a new post — engineering write-ups, no spam. Unsubscribe anytime.
Comments
No comments yet. Be the first.
