Merge Two Sorted Lists: the dummy head trick
There's a version of this problem without the dummy head node. I've written it. It works, but you have to special-case the first node — figure out which list to pull from first, set the result head to that node, and then enter the loop. It's seven or eight lines of bookkeeping before the actual logic starts. The dummy head cuts all of that. You start with a fake node you'll never return, give current something real to point at, and the loop becomes identical for every step including the first one.
That's the whole trick. It sounds small. In practice it's the difference between code that's easy to get wrong and code that's hard to get wrong.
What the problem actually gives you
Two singly linked lists, both sorted in non-decreasing order. Merge them into one sorted list by relinking the existing nodes — no new nodes, just rewired next pointers. Return the head of the merged list.
list1: 1 -> 2 -> 4 -> null
list2: 1 -> 3 -> 4 -> null
result: 1 -> 1 -> 2 -> 3 -> 4 -> 4 -> null
The constraints are worth reading carefully. Up to 50 nodes per list, values in [-100, 100], both lists sorted in non-decreasing order. The small node count rules out worrying about call stack depth for the recursive solution (100 frames max). The value range with negatives means nothing special — the comparison <= works fine on negative integers. The constraint that actually drives everything is the sorted order guarantee.
Without that guarantee, you'd merge and then sort: O((m+n) log(m+n)). With it, you never need to look ahead more than one step. At any point, the smallest unprocessed element sits at the head of one of the two lists. Compare the two heads, take the smaller, advance that pointer. When one list runs dry, attach whatever remains of the other — it's already sorted, so it drops in directly with a single pointer assignment.
That single sorted-order guarantee is what makes O(m+n) time possible, and it's the insight that distinguishes merge from sort-then-merge.
The iterative solution with dummy head
class ListNode:
def __init__(self, val=0, next=None):
self.val = val
self.next = next
class Solution:
def mergeTwoLists(self, list1: ListNode, list2: ListNode) -> ListNode:
dummy = ListNode(0)
current = dummy
while list1 and list2:
if list1.val <= list2.val:
current.next = list1
list1 = list1.next
else:
current.next = list2
list2 = list2.next
current = current.next
current.next = list1 or list2
return dummy.nextpublic class Solution {
public ListNode MergeTwoLists(ListNode list1, ListNode list2) {
var dummy = new ListNode(0);
var current = dummy;
while (list1 != null && list2 != null) {
if (list1.val <= list2.val) {
current.next = list1;
list1 = list1.next;
} else {
current.next = list2;
list2 = list2.next;
}
current = current.next;
}
current.next = list1 ?? list2;
return dummy.next;
}
}The loop body does three things every iteration: pick the smaller head, attach it, advance. The current = current.next at the end is critical — without it, current stays pointing at the same node and every subsequent assignment overwrites current.next without building a chain.
The tail attachment current.next = list1 or list2 works because at most one list is non-null after the loop exits. If list1 is exhausted, list1 is None/null and the expression resolves to list2 (or null if both are exhausted). The entire remaining list — already sorted — connects in O(1).
dummy.next is the real head. dummy itself is discarded.
Step-by-step walkthrough
Tracing list1 = [1,2,4], list2 = [1,3,4] through the pointer moves:
No node was created. Every node in the result was already in list1 or list2 — we just rewired their next pointers.
Why <= and not <
The comparison uses <=. When both heads are equal, I take from list1. The choice is arbitrary for correctness — either order produces a valid sorted list — but <= makes the merge stable: nodes from list1 appear before same-valued nodes from list2, preserving relative order. Stability matters in merge sort, and this is the subroutine merge sort calls, so the habit is worth building.
The recursive solution
The recursive approach expresses the algorithm the way you'd describe it out loud: if list1's head is smaller, the merged list starts with list1's head, and its tail is the result of merging the rest of list1 with list2. Repeat until one list is empty, then the other list drops in as the tail.
class Solution:
def mergeTwoLists(self, list1: ListNode, list2: ListNode) -> ListNode:
if not list1:
return list2
if not list2:
return list1
if list1.val <= list2.val:
list1.next = self.mergeTwoLists(list1.next, list2)
return list1
else:
list2.next = self.mergeTwoLists(list1, list2.next)
return list2public class Solution {
public ListNode MergeTwoLists(ListNode list1, ListNode list2) {
if (list1 == null) return list2;
if (list2 == null) return list1;
if (list1.val <= list2.val) {
list1.next = MergeTwoLists(list1.next, list2);
return list1;
} else {
list2.next = MergeTwoLists(list1, list2.next);
return list2;
}
}
}Each call picks one node and recurses on the reduced problem. The base cases handle the empty-list situation that the dummy head handles for the iterative version. To see how the call stack unwinds on the first two steps:
The call depth reaches m+n in the worst case. With lists up to 50 nodes each, that's at most 100 stack frames — fine for this problem. It would not be fine for Sort List (148), where you're calling this on lists of arbitrary length as part of merge sort. For that context, the iterative version is mandatory.
Complexity
| Approach | Time | Space |
|---|---|---|
| Iterative | O(m + n) | O(1) |
| Recursive | O(m + n) | O(m + n) |
Both touch every node exactly once. The difference is entirely in space: the iterative version uses a fixed number of pointers, while the recursive version accumulates stack frames proportional to the total number of nodes.
O(m + n) time is optimal — you have to look at every node at least once to know where it belongs in the output. There's no smarter algorithm, because you need to read the full output.
Edge cases
Both solutions handle these without any special branching, which is worth understanding explicitly:
Both lists empty. The loop never runs; current.next stays null. dummy.next is null. For the recursive version, the first base case returns immediately.
One list empty. The while condition fails immediately. current.next is set to the non-null list via the tail attachment line. The recursive version returns the non-empty list from the second base case.
One list entirely smaller than the other. Say list1 = [1,2,3] and list2 = [4,5,6]. The loop drains list1 completely, then current.next = list2 attaches all of list2 in one shot. No wasted iteration on the tail.
All values equal. The <= condition consistently pulls from list1, producing a stable interleaving: list1's nodes appear first when values tie.
Negative values. [-100, -50] merges with [-75, 0] exactly as you'd expect — the comparison is just integer comparison, and negative integers compare correctly.
Which version I'd actually ship
For LeetCode 21 in isolation, either solution passes. For anything downstream — merge sort, k-way merge, external merge — I'd reach for the iterative version without thinking about it. The recursive version's stack usage is bounded at 100 frames here, but the moment you embed this function inside Sort List (148), the outer merge sort is already consuming O(log n) stack frames. The recursive merge adds another O(n) on top. That compounds. The iterative version doesn't have this problem.
There's also readability to consider. The iterative version is four pointer operations per loop iteration. The recursive version looks elegant on paper but you have to think through the unwinding to convince yourself the pointers land correctly. I've seen people write the recursive version in interviews and then talk themselves into believing it doesn't work — the unwinding is non-obvious at first glance.
I'd ship iterative. The dummy head is the cost of entry, and it's worth it.
Where this pattern appears
The signal for merge is "two sorted sequences, produce one sorted sequence." Once you see that framing, the pattern activates almost automatically. It shows up embedded inside other problems: merge sort calls this directly, external sort algorithms merge sorted chunks, and some interval problems reduce to merging sorted interval lists.
The underlying mental model is greedy selection from sorted sources: because both inputs are sorted, the globally smallest remaining element is always at the head of one of the two pointers. Greedy is locally optimal, and because the inputs are sorted, locally optimal is also globally optimal. That's why a single pass suffices.
Related problems
Merge Sorted Array (88) is the same merge logic on arrays. The interesting version is in-place, which uses three pointers starting from the back of the arrays to avoid shifting elements.
Merge K Sorted Lists (23) extends this to k lists. Calling this function k - 1 times works but costs O(nk) time — each pass over an already-merged list is wasted work. The efficient version uses a min-heap: always pull the globally smallest head in O(log k) per step, total O(n log k). The merge logic inside is identical to what we wrote here.
Sort List (148) is merge sort on a linked list. Splits at midpoint with slow/fast pointers, recursively sorts each half, and then calls this exact merge to recombine. The iterative version here is the one to use — adding recursive stack on top of merge sort's recursion is unnecessary.
Add Two Numbers (2) traverses two linked lists node-by-node with a carry, which is structurally similar traversal but completely different logic. The sorted-order guarantee that drives the merge pattern doesn't apply there.
Part of the series: Linked List
Occasional notes on what I'm building
Get an email when I publish a new post — engineering write-ups, no spam. Unsubscribe anytime.
Comments
No comments yet. Be the first.
