242. Valid Anagram
Master three efficient solutions to the Valid Anagram problem using Python: array-based frequency counting, defaultdict, and Counter implementations. Learn the time and space complexity analysis of each approach while working with hash tables in Python.
LeetCode Link: 242. Valid Anagram
Problem Definition
Given two strings s
and t
, return true
if t
is an anagram of s
, and false
otherwise.
Example 1:
Input: s = "anagram", t = "nagaram"
Output: true
Example 2:
Input: s = "rat", t = "car"
Output: false
Constraints:
1 <= s.length, t.length <= 5 * 104
s
andt
consist of lowercase English letters.
Thought Process
Let's dive straight into an efficient solution for this problem. While a brute force approach using nested loops would give us O(n²) time complexity, we can do much better.
When working with strings containing only lowercase letters, we can efficiently determine if two strings are anagrams using an array as a frequency counter, which acts as a lightweight hash table. Why an array? Because the ASCII values of lowercase letters (a
to z
) form a continuous sequence of 26 values, making an array with 26 slots perfect for this task.
If you're new to the relationship between arrays, sets, maps, and hash tables, I've written about this topic in depth in my post:
Example: Strings s = "aee"
and t = "eae"
- The Concept
We’ll use an array,record
, with 26 positions (one for each lowercase letter), initialized to0
. Each position corresponds to a letter of the alphabet:- Index
0
representsa
, index1
representsb
, ..., index25
representsz
.
- Index
- Mapping Characters to Array Indices
To find the index for a character, subtract the ASCII value of'a'
from the ASCII value of the character:- For example,
'a' - 'a' = 0
,'e' - 'a' = 4
.
- For example,
- Algorithm Steps:
- Traverse string
s
and increment the counter for each character. - Traverse string
t
and decrement the counter for each character. - After processing both strings, check if all positions in
record
are0
.- If yes, the strings are anagrams.
- If not, they’re not anagrams.
- Traverse string
Solution 1 - 26-Slot
class Solution:
def isAnagram(self, s: str, t: str) -> bool:
record = [0] * 26
for i in s:
# No need to remember the ASCII value of 'a';
# just calculate a relative index value.
record[ord(i) - ord("a")] += 1
for i in t:
# Subtract the count for each character in string t.
record[ord(i) - ord("a")] -= 1
for i in range(26):
if record[i] != 0:
# If any element in the record array is not zero,
# it means one of the strings has extra or missing characters.
return False
return True
The ord
function in Python returns the Unicode code point (integer value) of a given character. It is short for "ordinal," which refers to the numerical representation of characters in the Unicode standard.
How It Works
- Each character has a corresponding integer value in Unicode. For example:
'a'
→ 97'b'
→ 98'A'
→ 65' ' (space)
→ 32
This function checks if two strings s
and t
are anagrams by comparing their character frequencies:
Use Case in the Code:
In the anagram-checking solutions, ord
is used to calculate an array index for each character by subtracting the Unicode value of 'a'
. For example:
index = ord(char) - ord('a')
This maps:
'a'
to0
'b'
to1
'z'
to25
This is efficient when working with lowercase English letters, as it provides a fixed range of indices (0–25) that can directly correspond to array positions.
Code Explanation
- Frequency Count Using
record
:record
is an array of size 26 (for each lowercase letter).- For every character in
s
, increment the corresponding index inrecord
. - For every character in
t
, decrement the corresponding index.
- Verify Anagram Property:
- After processing both strings, if all values in
record
are0
, the strings are anagrams (they have the same characters with the same frequencies). - If any value in
record
is non-zero, the strings are not anagrams.
- After processing both strings, if all values in
Complexity
- Time Complexity:
Traversing both strings takes O(n), where n is the length of the strings. - Space Complexity:
Therecord
array always has 26 positions, making space complexity O(1).
Solution 2 - collections.defaultdict()
class Solution:
def isAnagram(self, s: str, t: str) -> bool:
from collections import defaultdict # Import defaultdict from collections module
s_dict = defaultdict(int) # Initialize a dictionary for string `s` with default value 0
t_dict = defaultdict(int) # Initialize a dictionary for string `t` with default value 0
for x in s:
# Count the frequency of each character in string `s`
s_dict[x] += 1
for x in t:
# Count the frequency of each character in string `t`
t_dict[x] += 1
# Compare the two dictionaries; if they are identical, the strings are anagrams
return s_dict == t_dict
How defaultdict
Works
defaultdict
is a subclass of Python’s built-indict
that provides a default value for a key that does not yet exist in the dictionary.- When you try to access or modify a key that isn’t present in the dictionary:
defaultdict
automatically creates the key with the default value.- No
KeyError
is raised.
If you use a regular dict
, attempting to increment a key that doesn’t exist would raise a KeyError
. You’d have to explicitly check whether the key is in the dictionary before modifying it:
Code Explanation
- Using
defaultdict
for Frequency Counting:
- A
defaultdict
is used to store the frequency of each character in both strings. - It automatically initializes missing keys with a default value of
0
.
- Building Frequency Dictionaries:
- Loop through
s
to count the frequency of each character ins_dict
. - Loop through
t
to count the frequency of each character int_dict
.
- Compare Frequency Dictionaries:
- If
s_dict
andt_dict
are identical, it meanss
andt
have the same characters in the same frequencies, so they are anagrams. - If the dictionaries differ, they are not anagrams.
Complexity
Time Complexity: O(n)
- First loop through string s: O(n) where n is the length of s
- Second loop through string t: O(m) where m is the length of t
- Dictionary comparison operation (
s_dict == t_dict
): O(k) where k is the number of unique characters- In Python, dictionary comparison checks each key-value pair
- Since we're dealing with lowercase letters only, k is at most 26
Therefore, the total time complexity is O(n + m + k) which simplifies to O(n).
Space Complexity: O(1)
s_dic
t stores at most 26 key-value pairs (one for each lowercase letter)t_dict
also stores at most 26 key-value pairs- Even though we're using dictionaries, their size is bounded by the alphabet size
- Therefore, the space requirement is constant regardless of input size
Advantages of defaultdict
- Simpler Code:
- Using
defaultdict
, you skip theif-else
check entirely. - Missing keys are initialized with the default value (in this case,
0
) automatically.
- Readability:
- The code is more concise and easier to understand, as the frequency counting logic is directly implemented in one line:
s_dict[x] += 1
- Efficiency:
- Avoiding the
if-else
check can make the code slightly faster in large datasets.
Key Takeaway
Using defaultdict
is ideal for counting problems like this one, where all keys need to have a default value (0
) before any operations are performed. While a regular dict
can also work, it requires additional code to handle missing keys explicitly.
Solution 3 - collections.Counter
class Solution(object):
def isAnagram(self, s: str, t: str) -> bool:
from collections import Counter # Import Counter from the collections module
a_count = Counter(s) # Count the frequency of each character in string `s`
b_count = Counter(t) # Count the frequency of each character in string `t`
# Compare the two frequency dictionaries;
# if they are identical, the strings are anagrams
return a_count == b_count
This solution uses Python's Counter
from the collections
module to check if two strings s
and t
are anagrams by comparing their character frequencies.
Code Explanation
- Using
Counter
:
Counter
automatically counts the occurrences of each character in the input string and stores the counts in a dictionary-like structure.
For example:
Counter("aabbcc") # Output: {'a': 2, 'b': 2, 'c': 2}
- Count the Frequencies:
a_count
stores the character counts for strings
.b_count
stores the character counts for stringt
.
- Compare the Two Counters:
- If
a_count
andb_count
are identical, it meanss
andt
have the same characters in the same frequencies, so they are anagrams. - If the counters differ, the strings are not anagrams.
Complexity
Time Complexity: O(n)
- Counter(s) operates in O(n) time where n is the length of string s
- Counter(t) operates in O(m) time where m is the length of string t
- Dictionary comparison (
a_count == b_count
) takes O(k) where k is the number of unique characters- For lowercase letters, k ≤ 26
Total time complexity is O(n + m + k) which simplifies to O(n).
Space Complexity: O(1)
a_count
stores at most 26 entries (one per lowercase letter)b_count
also stores at most 26 entries- Since we're limited to lowercase letters, the space used is bounded
- Thus space usage is constant regardless of input size
Discussion