Valid Anagram: counting characters instead of comparing positions

by Duong Tran Quang aka DTDucasJune 13, 2026 · 10 min read#000040 reads#hash-table#arrays#leetcode#strings

Two strings are anagrams if they contain the same characters with the same counts — order doesn't matter. "anagram" and "nagaram" are anagrams: both have three as, one n, one g, one r, one m. "rat" and "car" are not: r, a, t versus c, a, r.

The question is: how do you verify equal character frequencies efficiently?

Before any logic, the length check is free and decisive. If len(s) != len(t), you are done immediately — different lengths cannot produce equal character counts. That single early return eliminates a whole class of inputs before touching a single character.

What the constraints actually force

The problem says s and t consist of lowercase English letters only, and 1 <= s.length, t.length <= 5 * 10^4.

Read "lowercase English letters only" literally: the character universe is exactly twenty-six values. That is a fixed bound, not a variable. It means a 26-element integer array is a complete character map with no dynamic allocation and no hash collisions — the whole alphabet fits in a single cache line. The constraint is not incidental. It is telling you the right data structure.

Read n <= 5 * 10^4 as a soft pass for O(n log n) — fifty thousand elements sort in well under a millisecond — but it does not rule out O(n). The bounded alphabet makes O(n) available and it is strictly better, so there is no reason to stop at sorting.

Sorting: correct but not as cheap as it looks

The simplest mental model: sort both strings, compare. Anagrams always sort to the same sequence.

python

class Solution:
    def isAnagram(self, s: str, t: str) -> bool:
        return sorted(s) == sorted(t)

In Python this is a genuine one-liner. It handles the length mismatch implicitly — different-length sorted outputs can never be equal, so False comes back correctly without an explicit length check.

In C# you need to materialize the sorted arrays first:

csharp

public class Solution {
    public bool IsAnagram(string s, string t) {
        if (s.Length != t.Length) return false;
        char[] sArr = s.ToCharArray();
        char[] tArr = t.ToCharArray();
        Array.Sort(sArr);
        Array.Sort(tArr);
        return new string(sArr) == new string(tArr);
    }
}

The cost is O(n log n) time and O(n) space. You are computing a full character ordering when you only need frequency equality. The sort builds information — a globally ordered sequence — that you immediately discard after the comparison. That is the waste. For n = 5 * 10^4 in a toy LeetCode context it barely matters, but in a pipeline processing millions of strings it does.

The frequency array: O(n) time, O(1) space

The constraint told you the character universe has size 26. A 26-element integer array indexed by char - 'a' is a complete frequency table. The approach: increment for every character in s, decrement for every character in t. If the strings are anagrams, every increment from s gets cancelled by a matching decrement from t, and the array ends at all zeros.

python

class Solution:
    def isAnagram(self, s: str, t: str) -> bool:
        if len(s) != len(t):
            return False
        count = [0] * 26
        for a, b in zip(s, t):
            count[ord(a) - ord('a')] += 1
            count[ord(b) - ord('a')] -= 1
        return all(c == 0 for c in count)

zip(s, t) consumes both strings in one pass. ord(a) - ord('a') maps 'a' to 0, 'b' to 1, ... 'z' to 25. The final all(c == 0 ...) scans exactly 26 elements — constant work regardless of input length.

Let's trace s = "anagram", t = "nagaram" step by step. Start: count = [0] * 26.

Every slot returns to zero because every character that s contributed, t also contained. The -1 entries that appear mid-trace — count[13] = -1 after the first pair, count[6] = -1 after the third — represent characters seen in t before their matching character in s has appeared. That is fine. The only thing that matters is the final state.

In C#, character subtraction works directly because char arithmetic is defined:

csharp

public class Solution {
    public bool IsAnagram(string s, string t) {
        if (s.Length != t.Length) return false;
        int[] count = new int[26];
        for (int i = 0; i < s.Length; i++) {
            count[s[i] - 'a']++;
            count[t[i] - 'a']--;
        }
        foreach (int c in count) {
            if (c != 0) return false;
        }
        return true;
    }
}

s[i] - 'a' maps 'a' -> 0, 'b' -> 1, ... 'z' -> 25. Same arithmetic as ord() - ord('a') in Python. The foreach loop returns early on the first non-zero count — in the False case this avoids scanning remaining slots unnecessarily.

The Counter shortcut in Python

python

from collections import Counter
 
class Solution:
    def isAnagram(self, s: str, t: str) -> bool:
        return Counter(s) == Counter(t)

Two lines. Counter builds a frequency dictionary; == compares by content. This is idiomatic Python and handles Unicode automatically because dictionaries have no key range restriction. O(n) time, O(k) space where k is distinct character count — at most 26 for lowercase English, so functionally the same as the array.

I reach for Counter in production Python. In an interview I write the explicit array version first, then mention Counter as the idiomatic shortcut — because showing you understand the frequency counting mechanism matters more than knowing a stdlib class exists.

Complexity across all approaches

Approach	Time	Space	Notes
Sorting	O(n log n)	O(n)	Correct; doesn't exploit the 26-char constraint
Frequency array	O(n)	O(1)	Optimal for fixed character set
Counter (Python)	O(n)	O(k)	Idiomatic; generalizes to Unicode naturally
Dictionary (C#)	O(n)	O(k)	Unicode-safe; slightly more verbose

The Unicode follow-up

The problem explicitly asks: what if inputs can contain Unicode characters?

The 26-element array stops working — you cannot bound character values to 0–25. The frequency dictionary approach generalizes without any structural change:

python

from collections import Counter
 
class Solution:
    def isAnagram(self, s: str, t: str) -> bool:
        return Counter(s) == Counter(t)

Same code. Counter accepts any hashable key, so any Unicode character becomes a dictionary key. Space is now O(k) where k is the number of distinct characters in the actual inputs — unbounded in theory, bounded in practice by what the strings contain.

In C#, the Dictionary-based approach is the same idea as the array approach, just with a hash map replacing the fixed-size index:

csharp

public class Solution {
    public bool IsAnagram(string s, string t) {
        if (s.Length != t.Length) return false;
        var count = new Dictionary<char, int>();
        foreach (char c in s)
            count[c] = count.GetValueOrDefault(c, 0) + 1;
        foreach (char c in t) {
            if (!count.ContainsKey(c)) return false;
            count[c]--;
            if (count[c] == 0) count.Remove(c);
        }
        return count.Count == 0;
    }
}

The Remove when a count hits zero keeps the final .Count == 0 check clean. Without it you would need to verify all remaining values are zero — correct but noisier. Removing zero-count keys means a non-empty dictionary at the end is immediately meaningful: some character appears an unequal number of times.

Edge cases worth knowing

Different lengths — the early return catches this before any character processing. The check itself costs O(1).

Identical strings — s = t = "abc": every character in s increments the count, and the same character at the same index in t decrements it immediately. Every slot returns to zero. The code returns True. This is correct: a string is trivially an anagram of itself.

Single character — "a" vs "a": one increment, one decrement, count[0] = 0, returns True. "a" vs "b": count[0] = 1, count[1] = -1, not all zero, returns False. Both right.

All same character — "aaaa" vs "aaaa": count[0] bounces +1, -1, +1, -1, +1, -1, +1, -1 across the four zip pairs, lands at 0. Fine.

Character in one string but not the other — "abc" vs "abd": after the full pass, count[2] = 1 (extra c from s, never cancelled), count[3] = -1 (decremented by d from t, never incremented). Not all zero, returns False. The negative value here is meaningful — it says t contained a character that s did not.

Pattern and mental model

The core pattern is frequency counting: when you need to verify that two collections contain the same elements with the same counts, maintain a difference table and check it ends at zero. This is the "frequency map" upgrade from the set membership pattern used in Contains Duplicate.

Contains Duplicate asked: have I seen this value at all? A set answers it — one bit per value.

Valid Anagram asks: have I seen this value the same number of times in both sequences? A count map answers it — one integer per value.

That extra dimension — count instead of presence — shows up in several problems immediately after this one. Group Anagrams uses the same frequency information but as a dictionary key to bucket strings together. Find All Anagrams in a String applies it inside a sliding window over a longer string, maintaining the frequency array incrementally as the window moves.

Where this sits in the series

Two Sum used a value -> index hash map. Contains Duplicate used a set for membership. Valid Anagram introduces the frequency map: you no longer care whether you have seen a value, you care how many times.

That extension is exactly what Group Anagrams requires next. Instead of checking whether two strings have equal counts, you use the sorted string or count tuple as a grouping key — same underlying frequency logic, now applied as a bucket identifier rather than a boolean check.

Three sibling problems worth solving after this one:

49. Group Anagrams — frequency tuple or sorted-string key to bucket anagrams. The frequency counting is identical; the new challenge is using it as a dictionary key across many strings.
438. Find All Anagrams in a String — sliding window maintaining a frequency array incrementally. The window advances one character at a time: increment the incoming character, decrement the outgoing one. Matching the window's array against the pattern's array is exactly the Valid Anagram check, but applied at every position.
567. Permutation in String — structurally identical to 438 but asks for any permutation rather than enumerating positions. Same sliding window, same frequency matching.

Part of the series: Arrays & Hashing

NEWSLETTER

Occasional notes on what I'm building

Get an email when I publish a new post — engineering write-ups, no spam. Unsubscribe anytime.

Comments

No comments yet. Be the first.

Valid Anagram: counting characters instead of comparing positions

What the constraints actually force

Sorting: correct but not as cheap as it looks

The frequency array: O(n) time, O(1) space

The Counter shortcut in Python

Complexity across all approaches

The Unicode follow-up

Edge cases worth knowing

Pattern and mental model

Where this sits in the series

Occasional notes on what I'm building

Comments

Let's build something reliable, end to end