Time-Based Key-Value Store: when the constraint hands you the algorithm

by Duong Tran Quang aka DTDucasJune 14, 2026 · 11 min read#000620 reads#binary-search#hash-map#design#leetcode#data-structures

No ratings yetSign in to rate

The problem gives you a key-value store where each key can hold many values, one per timestamp, and you need to retrieve the most recent value at or before a given time. That "most recent value at or before" phrasing is the tell: it is asking for the rightmost position where a condition holds, which is the exact shape of binary search. The constraint that makes it work is one sentence buried in the spec: "All the timestamps of set are strictly increasing." That single line transforms a design problem into a binary search problem.

The problem

Design a time-based key-value store that can hold multiple values per key, each tagged with a timestamp, and retrieve the most recent value at or before a queried timestamp. Full statement: LeetCode 981.

text

Constraints:
- 1 <= key.length, value.length <= 100
- key and value consist of lowercase English letters and digits.
- 1 <= timestamp <= 10^7
- All the timestamps timestamp of set are strictly increasing.
- At most 2 * 10^5 calls will be made to set and get.

What the constraints actually force

Before choosing an approach, read the constraints out loud and think about what each one eliminates.

1 <= timestamp <= 10^7 and at most 2 * 10^5 calls to set and get. Those numbers together mean a linear scan over all stored entries per get could touch 2 * 10^5 elements, and with 2 * 10^5 get calls, a worst-case naive solution runs at O(n^2) total — roughly 4 * 10^10 operations. That is too slow.

key.length, value.length <= 100 means string comparison is cheap, but the timestamps are what we need to search, not the string content.

"Timestamps of set are strictly increasing." This is the constraint that forces the algorithm. Because set calls arrive in increasing timestamp order, the list of (value, timestamp) pairs for any given key is always sorted. You do not need to sort it yourself. You get a sorted array for free. That is the green light for binary search.

Brute force: linear scan with an early exit

The straightforward reading: store everything in a hash map keyed by the string key, with a list of (value, timestamp) pairs per key. When get is called, walk the list forward and keep updating a result variable for every timestamp that does not exceed the target. Stop when you see a timestamp larger than the target, since the list is sorted.

python

class TimeMap:
    def __init__(self):
        self.store = {}
 
    def set(self, key: str, value: str, timestamp: int) -> None:
        if key not in self.store:
            self.store[key] = []
        self.store[key].append((value, timestamp))
 
    def get(self, key: str, timestamp: int) -> str:
        if key not in self.store:
            return ""
        result = ""
        for value, ts in self.store[key]:
            if ts <= timestamp:
                result = value
            else:
                break  # sorted list, no need to look further
        return result

csharp

public class TimeMap {
    private Dictionary<string, List<(string value, int timestamp)>> store;
 
    public TimeMap() {
        store = new Dictionary<string, List<(string, int)>>();
    }
 
    public void Set(string key, string value, int timestamp) {
        if (!store.ContainsKey(key)) {
            store[key] = new List<(string, int)>();
        }
        store[key].Add((value, timestamp));
    }
 
    public string Get(string key, int timestamp) {
        if (!store.ContainsKey(key)) return "";
        string result = "";
        foreach (var (value, ts) in store[key]) {
            if (ts <= timestamp) {
                result = value;
            } else {
                break;
            }
        }
        return result;
    }
}

The early break is not an optimization added on top — it is a direct consequence of the "strictly increasing" constraint. Without it, you would scan the entire list every time.

Set is O(1) (amortized append). Get is O(n) where n is the number of entries stored for that key. Space is O(n) total for all stored pairs.

This is clean and fast enough for light workloads. The problem is the O(n) per get call. If one key accumulates 10^5 entries, each get on that key costs 10^5 operations, and with 10^5 get calls you are back to quadratic.

Binary search: the right tool for a sorted list

Same data structure, completely different search strategy in get. Because the list is sorted by timestamp, we can binary search for the rightmost position where ts <= timestamp.

python

class TimeMap:
    def __init__(self):
        self.store = {}
 
    def set(self, key: str, value: str, timestamp: int) -> None:
        if key not in self.store:
            self.store[key] = []
        self.store[key].append((value, timestamp))
 
    def get(self, key: str, timestamp: int) -> str:
        if key not in self.store:
            return ""
        entries = self.store[key]
        left, right = 0, len(entries) - 1
        result = ""
        while left <= right:
            mid = (left + right) // 2
            if entries[mid][1] <= timestamp:
                result = entries[mid][0]  # candidate, but maybe not the best
                left = mid + 1            # look right for a larger qualifying ts
            else:
                right = mid - 1           # too large, look left
        return result

csharp

public class TimeMap {
    private Dictionary<string, List<(string value, int timestamp)>> store;
 
    public TimeMap() {
        store = new Dictionary<string, List<(string, int)>>();
    }
 
    public void Set(string key, string value, int timestamp) {
        if (!store.ContainsKey(key)) {
            store[key] = new List<(string, int)>();
        }
        store[key].Add((value, timestamp));
    }
 
    public string Get(string key, int timestamp) {
        if (!store.ContainsKey(key)) return "";
        var entries = store[key];
        int left = 0, right = entries.Count - 1;
        string result = "";
        while (left <= right) {
            int mid = (left + right) / 2;
            if (entries[mid].timestamp <= timestamp) {
                result = entries[mid].value;  // candidate
                left = mid + 1;               // look right
            } else {
                right = mid - 1;              // look left
            }
        }
        return result;
    }
}

Set is still O(1). Get is O(log n). Space is O(n).

Hand trace through the example

The call sequence from Example 1:

set("foo", "bar", 1)
get("foo", 1)   -> "bar"
get("foo", 3)   -> "bar"
set("foo", "bar2", 4)
get("foo", 4)   -> "bar2"
get("foo", 5)   -> "bar2"

After both set calls, store["foo"] contains:

index:     0             1
entries:  ("bar", 1)   ("bar2", 4)

Let me step get("foo", 3):

Step by step:

left=0, right=1, mid=0. entries[0].ts = 1 <= 3. Record result = "bar". Move left = 1.
left=1, right=1, mid=1. entries[1].ts = 4 > 3. Move right = 0.
left=1 > right=0. Loop exits. Return "bar".

Now get("foo", 5):

left=0, right=1, mid=0. ts=1 <= 5. Record result="bar". left=1.
left=1, right=1, mid=1. ts=4 <= 5. Record result="bar2". left=2.
left=2 > right=1. Loop exits. Return "bar2".

The pattern is consistent: whenever entries[mid].ts <= timestamp, we save the candidate and push the left pointer right, hunting for an even better (larger qualifying) timestamp.

Edge cases and what actually happens

Key was never set. Both implementations check for key existence before touching the list and return "" immediately. The hash map lookup is O(1).

Timestamp smaller than all stored timestamps. Say store["foo"] = [("bar", 5)] and you call get("foo", 2). The binary search runs: mid=0, ts=5 > 2, right=-1. Loop never records a candidate. result stays "". Correct.

Timestamp larger than all stored timestamps. get("foo", 100) on the same list: mid=0, ts=5 <= 100, result="bar", left=1. Loop exits. Returns "bar". Correct — you return the value at the largest existing timestamp.

Exact timestamp match. get("foo", 5) on [("bar", 5)]: mid=0, ts=5 <= 5, result="bar", left=1. Loop exits. Returns "bar". Correct.

Single entry for a key. The binary search handles a list of length one the same way. left=right=0, one pass, done.

Multiple keys. Each key is independent in the hash map. No interaction between keys.

Why the brute force's early `break` doesn't save it

The brute force's break is real and meaningful — in the best case, a get for a very recent timestamp exits after checking just the last few entries. But the worst case is still linear: get("foo", 10^7) with 10^5 entries for "foo" scans all 10^5 entries before stopping. Binary search never touches more than log2(10^5) ~ 17 entries regardless of the target.

The gap between 17 and 10^5 is what makes this worth doing.

What I'd ship, and why

I'd ship the binary search version without hesitation. The implementation is a standard left <= right loop with a standard "rightmost qualifying position" update — about twenty lines total. That is not meaningfully harder to read than the linear scan. The complexity improvement is two orders of magnitude on a hot path (every get call), and there is no trade-off: set stays O(1), space stays O(n).

The only reason I would start with the brute force is in an interview where I want to demonstrate that I understand what I am optimizing before optimizing it. State the brute force in two minutes, call out the constraint that makes it improvable ("timestamps are strictly increasing, so the list is sorted"), then immediately move to binary search.

One design note worth calling out: the problem says timestamps arrive in strictly increasing order. If that guarantee ever changed — if you could set("foo", "x", 5) after set("foo", "y", 10) — the list would no longer be sorted on arrival, and you would need to sort it (or use a sorted container like a balanced BST or a SortedList in C#). The binary search implementation would remain identical; the insertion would become O(log n) instead of O(1). Recognizing which assumptions the algorithm leans on is how you avoid surprises when requirements change.

Comparison table

Approach	Set	Get	Space	Notes
Brute force	`O(1)`	`O(n)`	`O(n)`	Linear scan; early exit in practice but not worst case
Binary search	`O(1)`	`O(log n)`	`O(n)`	Requires sorted list — guaranteed by problem constraints

Both use the same hash map structure. The only difference is what happens inside get.

The underlying pattern and where it shows up

The name for this pattern is "binary search on a sorted event log." You have a collection of (key, sorted-array-of-events) pairs, and queries ask for the last event at or before some threshold. Binary search finds the answer in O(log n) per query.

This is not only a LeetCode construction. Real systems do exactly this. A version control system looking up the state of a file at a given commit hash. A time-series database retrieving the most recent sensor reading before a cutoff. A database MVCC layer finding which row version was visible at a given transaction ID. The algorithm is the same in all of them.

Where this fits in the binary search series

The earlier problems in this series — Binary Search (704), Search in Rotated Sorted Array (33), Find Minimum in Rotated Sorted Array (153) — applied binary search to arrays given directly to you. Search a 2D Matrix (74) extended it to a two-dimensional structure by treating the matrix as a flat sorted array. This problem adds one more layer: you are building the sorted structure yourself, inside a designed data structure, and the reason it is sorted is a constraint about input order, not an explicit sort call.

The three adjacent problems that share this shape:

Snapshot Array (LeetCode 1146) is the closest sibling. You maintain an array with snapshots, and each get(index, snap_id) retrieves the value at a given index for the most recent snapshot at or before snap_id. The structure is identical: per-index sorted list of (value, snap_id) pairs, binary search on the snap ids. The twist is that you only store a new entry when the value actually changes, which requires a different index strategy.

Design Hit Counter (LeetCode 362) asks you to count hits within a rolling 5-minute window. The underlying structure is a queue of timestamps, and the operation is "drop everything outside the window" rather than "find the rightmost entry within a threshold." The binary search variant for this problem would search for the cutoff point, then subtract counts.

Online Stock Span (LeetCode 901) maintains a monotonic stack of price-day pairs. Each next(price) call asks how many consecutive days the stock was at or below the current price. It is a different flavor of "aggregate over a time window," solved with a stack instead of binary search, but the query structure is similar.

My Calendar I (LeetCode 729) uses binary search on a sorted list of intervals to check for overlaps before inserting each new booking. The "insert and search a dynamically growing sorted list" mechanic is shared with this problem.

If you are comfortable with 981, the natural next step is 1146 (Snapshot Array) — same data structure, more constrained storage that forces you to think carefully about what you actually need to store.

Part of the series: Binary Search

NEWSLETTER

Occasional notes on what I'm building

Get an email when I publish a new post — engineering write-ups, no spam. Unsubscribe anytime.

Comments

No comments yet. Be the first.