Single Number: why XOR beats the hash set
You have an integer array where every value appears exactly twice — except one lonely element that shows up only once. Return that element. The catch: linear time, constant space. You cannot sort, you cannot use a set, you cannot count frequencies in a dictionary. The constraint rules out every standard bookkeeping structure before you even write a line.
The source lists two approaches: a hash set solution (valid for time, fails for space) and XOR bit manipulation (nails both). Both are worth understanding. The hash set shows why the naive framing fails. XOR shows why one algebraic property collapses the whole problem.
The hash set reading — and why it falls short
The most natural read of this problem is a frequency problem: see a number for the first time, record it; see it again, mark it as paired. At the end, whoever is unpaired is the answer.
A hash set makes that concrete. If the number is not in the set, add it. If it is already in the set, remove it (that is the pair). Whatever remains is the singleton.
class Solution:
def singleNumber(self, nums: list[int]) -> int:
seen: set[int] = set()
for num in nums:
if num in seen:
seen.remove(num)
else:
seen.add(num)
return seen.pop()using System.Collections.Generic;
public class Solution {
public int SingleNumber(int[] nums) {
var seen = new HashSet<int>();
foreach (int num in nums) {
if (seen.Contains(num)) {
seen.Remove(num);
} else {
seen.Add(num);
}
}
// HashSet<int> doesn't have First() without LINQ; use the enumerator
foreach (int n in seen) return n;
return -1; // unreachable given the problem guarantee
}
}The logic is clean and the time complexity is fine: a single pass, each hash set operation is O(1) amortized, so overall O(n). But the space is O(n) — in the worst case (the singleton is the last element), the set holds (n - 1) / 2 values simultaneously before any removal. With n up to 3 * 10^4, that is roughly 15,000 integers sitting in memory. Not catastrophic in practice, but it does not satisfy the problem's constant-space constraint.
So the hash set is pedagogically useful and could appear in a real interview as an intermediate step — it shows you understand the problem. But it is not the answer.
By-hand trace through the hash set
Take nums = [7, 3, 5, 3, 7]. The singleton is 5.
Start: seen = {}
num = 7: not in seen -> add seen = {7}
num = 3: not in seen -> add seen = {7, 3}
num = 5: not in seen -> add seen = {7, 3, 5}
num = 3: in seen -> remove seen = {7, 5}
num = 7: in seen -> remove seen = {5}
Return seen.pop() = 5The pairs cancel themselves. The singleton has no partner, so it never gets removed. This is exactly what XOR will do algebraically — pairs cancel, singleton survives — but without the memory overhead.
What the constraints actually force
With 1 <= nums.length <= 3 * 10^4 and n guaranteed odd (one singleton plus some pairs), the array is small enough that O(n) time is fine. The binding constraint is space: you need O(1).
O(1) space means one accumulator variable. You cannot store any auxiliary structure that grows with input. The only operations you can perform that maintain state in a single integer are arithmetic operations and bitwise operations. The question becomes: is there a bitwise operation that naturally cancels pairs?
There is, and it is XOR.
XOR cancels pairs — why the math works
XOR (^) has three properties that matter here:
a ^ a = 0(any value XORed with itself is zero)a ^ 0 = a(zero is the identity)- XOR is both commutative and associative — order does not matter
These three together mean that if you XOR every element in the array, every pair cancels to zero, and the singleton XORed against zero is just the singleton.
Concretely: for [a, b, a, c, b, c, d], accumulating XOR gives:
a ^ b ^ a ^ c ^ b ^ c ^ d
= (a ^ a) ^ (b ^ b) ^ (c ^ c) ^ d <- reordering is valid, XOR is commutative
= 0 ^ 0 ^ 0 ^ d
= dThe singleton survives. Everything else is annihilated. This is not a trick — it is the direct consequence of those three algebraic rules.
XOR implementation
class Solution:
def singleNumber(self, nums: list[int]) -> int:
result = 0
for num in nums:
result ^= num
return resultpublic class Solution {
public int SingleNumber(int[] nums) {
int result = 0;
foreach (int num in nums) {
result ^= num;
}
return result;
}
}That is the whole thing. One variable, one pass, two properties of XOR.
By-hand trace through XOR
Same input: nums = [7, 3, 5, 3, 7]. I'll track binary representations alongside the decimal to make the bit cancellation visible.
Start: result = 0 (0b00000)
num = 7: result = 0 ^ 7 = 7 (0b00000 ^ 0b00111 = 0b00111)
num = 3: result = 7 ^ 3 = 4 (0b00111 ^ 0b00011 = 0b00100)
num = 5: result = 4 ^ 5 = 1 (0b00100 ^ 0b00101 = 0b00001)
num = 3: result = 1 ^ 3 = 2 (0b00001 ^ 0b00011 = 0b00010)
num = 7: result = 2 ^ 7 = 5 (0b00010 ^ 0b00111 = 0b00101)
Return 5The intermediate values look chaotic — that is expected. XOR accumulates bit-by-bit, and the final state is determined by which bits appear an odd number of times across all inputs. Only the bits belonging to 5 (which appears once) survive; all others are contributed by even-count values and cancel.
Edge cases
Single-element array (nums = [42]): The loop runs once: result = 0 ^ 42 = 42. Returns 42. Correct — there are no pairs, the sole element is trivially the singleton.
Negative integers (nums = [-5, 2, -5]): XOR operates on the two's-complement bit pattern, and Python/C# both use two's-complement for integers. (-5) ^ (-5) = 0 holds because every bit in -5 is paired with itself. result = 0 ^ 2 = 2. Correct.
Singleton is the first element (nums = [9, 4, 4]): The pairs appear after the singleton. Because XOR is commutative, order does not matter: 9 ^ 4 ^ 4 = 9 ^ (4 ^ 4) = 9 ^ 0 = 9. Correct.
Singleton is the last element (nums = [4, 4, 9]): Same logic in different order: 4 ^ 4 ^ 9 = 0 ^ 9 = 9. Correct.
All duplicates with value zero, singleton nonzero (nums = [0, 0, 13]): 0 ^ 0 ^ 13 = 0 ^ 13 = 13. Correct — zero is the XOR identity and causes no problems.
Complexity
| Approach | Time | Space | Notes |
|---|---|---|---|
| Hash set (toggle) | O(n) | O(n) | Fails the space constraint |
| XOR accumulation | O(n) | O(1) | One variable, one pass, no aux structure |
Both are single-pass linear time. The difference is entirely in space. The XOR approach uses the same memory regardless of input size — just one integer register.
Which one to ship, and why
XOR is the only correct answer given the constraints. I would write it immediately in an interview, but I would explain the hash set intuition first to demonstrate that I understand the problem before reaching for the bit trick. The hash set makes the "pair cancellation" model explicit in a way that the XOR does not — and showing that you can derive the XOR property from the set intuition ("what if we could cancel pairs without storing them?") is stronger than pulling the trick from memory.
In production code, the XOR version is also strictly better: it is shorter, cache-friendly (no hash table allocations), and correct on edge cases without any special handling. There is no reason to prefer the hash set here.
The underlying pattern: bit-parity accumulation
The transferable mental model is this: XOR accumulates the parity of each bit position across all inputs. A bit set an even number of times is zero in the result. A bit set an odd number of times is one in the result. When every value appears exactly twice except one, the result's bit pattern is exactly the bit pattern of the singleton — because only its bits contribute with odd parity.
This "parity accumulation" pattern shows up in several ways in bit manipulation problems. Once you see XOR as a parity detector rather than just a logic gate, a family of problems opens up.
Where this sits in the series, and what comes next
This is the entry point of the bit manipulation series — the problem that introduces XOR as a tool for something non-obvious. If you are comfortable with this one, the sibling problems escalate the constraint:
Single Number II (137) pushes the pairing assumption: now every element appears exactly three times except one. a ^ a ^ a != 0 so XOR alone does not cancel. You need to track bit counts modulo 3, which requires maintaining two accumulators (ones and twos) to represent a 3-state counter per bit. The same XOR-based thinking applies, but the accumulator design is more involved.
Single Number III (260) goes further: two elements each appear once, everything else twice. A single XOR pass gives you the XOR of the two singletons, not either singleton. You then need to isolate a bit that differs between them (the lowest set bit of their XOR), split the array by that bit, and XOR each partition separately. Two passes, two accumulators — and the key insight is still XOR parity.
Find the Difference (389) is the same core trick applied to strings: given a string s and a shuffled version t with one extra character, find the extra character. XOR all characters of both strings together; the pairs cancel and the extra character survives. Almost identical code to this problem.
Missing Number (268) is a cousin: given [0, n] with one number missing, XOR the array with [0, 1, ..., n]. The present numbers cancel in pairs (array value XOR range value), and only the missing number's range value remains unpaired. The trick requires knowing the expected range, which this problem does not — but the algebraic structure is the same.
Part of the series: Bit Manipulation
Occasional notes on what I'm building
Get an email when I publish a new post — engineering write-ups, no spam. Unsubscribe anytime.
Comments
No comments yet. Be the first.
