LRU Cache: why O(1) eviction requires you to hold two data structures at once
The constraint that makes this problem interesting is not the capacity. It is the last line: get and put must each run in O(1) average time. That single requirement rules out every straightforward solution and forces you toward a specific pair of data structures working in concert.
Most cache problems can be solved with a map. The map handles fast lookup in O(1). What a map cannot do is tell you, in O(1), which key was touched least recently. Order information is not something hash maps maintain. So you need a second structure that does maintain order — one where moving an element to "most recently used" is O(1), and reading off the least recently used end is also O(1). A doubly linked list with direct node references satisfies both. The hash map stores key -> node pointer, and the linked list stores the recency order. Together they give you everything.
That insight is the whole problem. The rest is careful pointer manipulation.
The problem
Design a data structure that implements a Least Recently Used (LRU) cache: a get(key) that returns the cached value or -1 if absent, and a put(key, value) that inserts or updates an entry and evicts the least recently used key when the cache exceeds its capacity — both operations required to run in O(1) average time. Full statement: LeetCode 146.
Constraints:
- 1 <= capacity <= 3000
- 0 <= key <= 10^4
- 0 <= value <= 10^5
- At most 2 * 10^5 calls will be made to get and put.What the constraints actually force
1 <= capacity <= 3000. Small enough that the O(n) approach works in tests if you are not careful — but the problem explicitly requires O(1), so capacity size is not the escape hatch.
0 <= key <= 10^4, 0 <= value <= 10^5. All non-negative integers. There are no negative sentinels hiding in the value range, and 0 is a valid value. A get returning -1 genuinely means the key is absent — never mistake it for a stored value, since values are capped at 10^5.
At most 2 * 10^5 calls to get and put. This is the real pressure. At O(n) per operation and capacity up to 3000, you are looking at 6 * 10^8 work units in the worst case. That is too slow. The O(1) requirement is not aspirational — it is load-bearing.
Approach 1: Array with linear search
The straightforward reading of the spec: keep a list of (key, value, timestamp) tuples. Each access increments a global timestamp and stamps the accessed entry. When eviction is needed, scan the list to find the minimum timestamp.
class LRUCache:
def __init__(self, capacity: int):
self.capacity = capacity
self.cache = [] # list of [key, value, timestamp]
self.timestamp = 0
def get(self, key: int) -> int:
for i in range(len(self.cache)):
if self.cache[i][0] == key:
self.timestamp += 1
self.cache[i][2] = self.timestamp
return self.cache[i][1]
return -1
def put(self, key: int, value: int) -> None:
self.timestamp += 1
for i in range(len(self.cache)):
if self.cache[i][0] == key:
self.cache[i][1] = value
self.cache[i][2] = self.timestamp
return
if len(self.cache) < self.capacity:
self.cache.append([key, value, self.timestamp])
else:
lru_idx = min(range(len(self.cache)), key=lambda i: self.cache[i][2])
self.cache[lru_idx] = [key, value, self.timestamp]public class LRUCache {
private int capacity;
private List<(int key, int value, long ts)> cache;
private long timestamp;
public LRUCache(int capacity) {
this.capacity = capacity;
this.cache = new List<(int, int, long)>();
this.timestamp = 0;
}
public int Get(int key) {
for (int i = 0; i < cache.Count; i++) {
if (cache[i].key == key) {
timestamp++;
cache[i] = (key, cache[i].value, timestamp);
return cache[i].value;
}
}
return -1;
}
public void Put(int key, int value) {
timestamp++;
for (int i = 0; i < cache.Count; i++) {
if (cache[i].key == key) {
cache[i] = (key, value, timestamp);
return;
}
}
if (cache.Count < capacity) {
cache.Add((key, value, timestamp));
} else {
int lruIdx = 0;
long minTs = cache[0].ts;
for (int i = 1; i < cache.Count; i++) {
if (cache[i].ts < minTs) { minTs = cache[i].ts; lruIdx = i; }
}
cache[lruIdx] = (key, value, timestamp);
}
}
}Complexity: Both get and put are O(capacity) — you scan the entire list on every call. Space is O(capacity) for the list itself.
This is correct. It fails on performance. With capacity at 3000 and 2 * 10^5 calls, you can hit 6 * 10^8 comparisons — well over any reasonable time limit. The structure also has a subtler problem: every operation scans every slot even when the cache is only half full. The timestamp approach is elegant but the scan is unavoidable.
Approach 2: Hash map + doubly linked list
The core insight: to achieve O(1) on both access and eviction, you need two things simultaneously — (1) find any node in O(1) given its key, and (2) remove the least-recently-used node in O(1). A hash map gives you (1). A doubly linked list with sentinel head and tail gives you (2). Neither structure alone solves both.
The list maintains recency order. The most recently used node sits just after the head sentinel. The least recently used node sits just before the tail sentinel. Every access — read or write — moves the touched node to the head. Eviction always removes the node just before the tail.
The sentinel trick is what makes this clean. Without sentinel nodes, every operation has to check whether the node being removed is at the head or tail. With sentinels, the head and tail are always valid nodes pointing into the list, and "add to front" and "remove from back" are the same four-pointer operations every time.
class Node:
def __init__(self, key: int = 0, value: int = 0):
self.key = key
self.value = value
self.prev: "Node | None" = None
self.next: "Node | None" = None
class LRUCache:
def __init__(self, capacity: int):
self.capacity = capacity
self.cache: dict[int, Node] = {}
# Sentinel nodes — they never hold real data
self.head = Node()
self.tail = Node()
self.head.next = self.tail
self.tail.prev = self.head
def _remove(self, node: Node) -> None:
node.prev.next = node.next
node.next.prev = node.prev
def _add_to_front(self, node: Node) -> None:
node.prev = self.head
node.next = self.head.next
self.head.next.prev = node
self.head.next = node
def get(self, key: int) -> int:
if key not in self.cache:
return -1
node = self.cache[key]
self._remove(node)
self._add_to_front(node)
return node.value
def put(self, key: int, value: int) -> None:
if key in self.cache:
node = self.cache[key]
node.value = value
self._remove(node)
self._add_to_front(node)
else:
node = Node(key, value)
self.cache[key] = node
self._add_to_front(node)
if len(self.cache) > self.capacity:
# Evict the node just before the tail sentinel
lru = self.tail.prev
self._remove(lru)
del self.cache[lru.key]public class Node {
public int Key, Value;
public Node Prev, Next;
public Node(int key = 0, int value = 0) { Key = key; Value = value; }
}
public class LRUCache {
private int capacity;
private Dictionary<int, Node> cache;
private Node head, tail;
public LRUCache(int capacity) {
this.capacity = capacity;
cache = new Dictionary<int, Node>();
head = new Node();
tail = new Node();
head.Next = tail;
tail.Prev = head;
}
private void Remove(Node node) {
node.Prev.Next = node.Next;
node.Next.Prev = node.Prev;
}
private void AddToFront(Node node) {
node.Prev = head;
node.Next = head.Next;
head.Next.Prev = node;
head.Next = node;
}
public int Get(int key) {
if (!cache.ContainsKey(key)) return -1;
Node node = cache[key];
Remove(node);
AddToFront(node);
return node.Value;
}
public void Put(int key, int value) {
if (cache.ContainsKey(key)) {
Node node = cache[key];
node.Value = value;
Remove(node);
AddToFront(node);
} else {
Node node = new Node(key, value);
cache[key] = node;
AddToFront(node);
if (cache.Count > capacity) {
Node lru = tail.Prev;
Remove(lru);
cache.Remove(lru.Key);
}
}
}
}Complexity: Both get and put are O(1). Every hash map operation is O(1) average. Every linked list operation (_remove, _add_to_front) touches exactly four pointers — independent of list size. Eviction reads tail.prev in O(1) and removes it in O(1). Space is O(capacity) for the map plus O(capacity) for the list nodes.
Step-by-step trace: capacity = 2
This is the problem's own example. I want to make the list state visible at each operation.
Initial state:
cache = {}
HEAD <-> TAIL
put(1, 1):
New node. Insert at front.
HEAD <-> [1:1] <-> TAIL
cache = {1: node[1:1]}
put(2, 2):
New node. Insert at front.
HEAD <-> [2:2] <-> [1:1] <-> TAIL
cache = {1: node[1:1], 2: node[2:2]}
get(1):
Hit. Remove [1:1], re-insert at front.
HEAD <-> [1:1] <-> [2:2] <-> TAIL
return 1
put(3, 3):
New node. len(cache) would become 3 > capacity=2.
Evict tail.prev = [2:2]. Remove from list and map.
Insert [3:3] at front.
HEAD <-> [3:3] <-> [1:1] <-> TAIL
cache = {1: node[1:1], 3: node[3:3]}
get(2):
Miss (evicted). return -1
put(4, 4):
New node. len(cache) would become 3 > capacity=2.
Evict tail.prev = [1:1]. Remove from list and map.
Insert [4:4] at front.
HEAD <-> [4:4] <-> [3:3] <-> TAIL
cache = {3: node[3:3], 4: node[4:4]}
get(1): Miss. return -1
get(3): Hit. Move [3:3] to front. return 3
get(4): Hit. Move [4:4] to front. return 4
The trace matches the expected output [null, null, null, 1, null, -1, null, -1, 3, 4] exactly.
One detail worth noting: the eviction after put(3, 3) removes key 2, not key 1. After get(1) moved node 1 to the front, node 2 slid to the tail-adjacent position. The list order is the ground truth for recency — and get updates it, which is why get has to call _remove + _add_to_front even on a read.
Edge cases that require careful thought
capacity = 1. Every put of a new key evicts the current occupant. get on the existing key is fine; it re-seats the node at the front (which is the same position it is already in, with sentinels absorbing the pointer work). The sentinel design makes this indistinguishable from any other case — no special-casing needed.
put on an existing key. This must update the value and refresh recency, but must not increase the cache size. If you forget to move the node to the front on an update, the node's recency position is wrong — a subsequent eviction might evict a key that was just "written." In the code above, both branches of put call _remove and _add_to_front, so this cannot be omitted accidentally.
get on a key that was just evicted. Returns -1 and leaves the cache untouched. The hash map lookup fails immediately — no list access happens. This is correct and fast.
All calls are get on missing keys. The cache stays empty. _remove and _add_to_front are never called. No issue.
Pointer order in _add_to_front. This is the place where off-by-one pointer logic breaks things. The node being inserted must update head.next.prev before head.next is reassigned — otherwise head.next already points to the new node and you lose the reference to the old first real node. The order in the code is: set node.prev = head, set node.next = head.next, then head.next.prev = node, then head.next = node. Follow this order exactly.
Comparison table
| Approach | get | put | Space | Notes |
|---|---|---|---|---|
| Array + linear scan | O(capacity) | O(capacity) | O(capacity) | Correct but too slow for 2*10^5 calls |
| Hash map + doubly linked list | O(1) | O(1) | O(capacity) | Meets the O(1) requirement; standard production approach |
The transferable pattern: when you need two things a single structure cannot give you
The design pattern here is worth naming explicitly. Sometimes you hit a performance requirement that one data structure satisfies on one dimension but not another. Here: fast lookup (hash map) plus fast ordered removal (doubly linked list). The coupling is through the node pointer — the hash map does not store the value, it stores a pointer to the node in the list that stores the value. The list can be restructured in O(1) because you have direct node references.
This "pointer coupling" pattern appears in several other places. You see it in the skip list (random access meets ordered iteration), in adjacency lists for graphs, and whenever an operating system's page replacement algorithm needs to work in O(1). Once you recognise that the LRU problem decomposes into "find" and "order" operations, and that neither the hash map nor the list alone handles both, the combined design becomes obvious.
The thing I find worth emphasising: the sentinel nodes are not a micro-optimisation. They are a correctness simplification. Without them, _remove needs to check whether the node is the head or tail, and _add_to_front needs to check whether the list is empty. With sentinels, those checks vanish. The list is never truly empty (head and tail are always there), and every remove and insert operates on interior nodes by construction. Keep the sentinels.
What I would ship, and why
For any real interview or production context, I write the hash map + doubly linked list solution directly. The O(n) array version is useful as a verbal explanation of what the problem is asking — but I would not spend time coding it unless the interviewer specifically asks to see the naive approach first.
In production C#, I would use LinkedList<T> and Dictionary<K, LinkedListNode<V>> rather than building the node class manually. The standard library's LinkedList exposes AddFirst, Remove, and Last — all O(1) — which maps directly onto the operations here. In Python, collections.OrderedDict (which maintains insertion order and supports move_to_end) gives you LRU in about eight lines, though understanding why it works requires knowing it is built on the same hash map + doubly linked list internals.
Where this fits in the linked-list series
This series has built from single-linked manipulation (Reverse Linked List, Merge Two Sorted Lists) through multi-pointer problems (Reorder List, Remove Nth Node) to cycle detection (Linked List Cycle). LRU Cache is the first design problem in the series. The shift is from "operate on a given list" to "design a structure that uses a list internally." The linked-list knowledge is the same; the framing is inverted.
LFU Cache (460) is the natural next step. Least Frequently Used tracks access count rather than recency. You need a map of frequency -> doubly linked list, plus a map of key -> (value, frequency). The structural moves are the same as LRU, but eviction picks from the minimum-frequency bucket rather than the tail.
All O(one) Data Structure (432) generalises the idea further: increment and decrement counts for string keys, return the key with minimum or maximum count, all in O(1). The solution uses a doubly linked list of frequency buckets — the same "linked list stores order, hash map stores position" coupling, more elaborate.
Insert Delete GetRandom O(1) (380) adds random uniform sampling to a set with O(1) insert and delete. The trick is an array for the random access plus a hash map for O(1) lookup/delete. Same design idiom: two structures covering each other's weaknesses, coupled through indices rather than pointers.
Design Twitter (355) pushes further into design territory: a newsfeed that returns the 10 most recent tweets from a user's followees. The linked list becomes a heap-merge over per-user tweet lists. These are the same building blocks used differently — the skill transfers even when the exact structure changes.
Part of the series: Linked List
Occasional notes on what I'm building
Get an email when I publish a new post — engineering write-ups, no spam. Unsubscribe anytime.
Comments
No comments yet. Be the first.
