349. Intersection of Two Arrays
Discover three efficient approaches to finding common elements between arrays in Python. Learn how to optimize your code using dictionaries, arrays as hash maps, and built-in set operations - improving runtime while handling duplicates elegantly.
Problem Definition
LeetCode Link: 349. Intersection of Two Arrays
Given two integer arrays nums1
and nums2
, return an array of their
intersection. Each element in the result must be unique and you may return the result in any order.
Example 1:
Input: nums1 = [1,2,2,1], nums2 = [2,2]
Output: [2]
Example 2:
Input: nums1 = [4,9,5], nums2 = [9,4,9,8,4]
Output: [9,4]
Explanation: [4,9] is also accepted.
Constraints:
1 <= nums1.length, nums2.length <= 1000
0 <= nums1[i], nums2[i] <= 1000
Thought Process
When searching for common elements between two arrays, the obvious approach might be to compare each element against every other—but this brute force method comes with a costly O(n²) runtime that quickly becomes impractical as arrays grow larger.
Fortunately, we can leverage more sophisticated data structures to dramatically improve performance. By utilizing hash sets and dictionaries, we can achieve O(1) lookup times on average, reducing the overall complexity to O(m + n), where m and n represent the respective array sizes.
Let's explore three key strategies:
- The Dictionary-Set Approach: This method combines the speed of dictionary lookups with set operations to ensure unique results. By storing element frequencies in a dictionary, we can rapidly verify the presence of elements from the second array, making this approach both efficient and flexible.
- Array-Based Frequency Counting: For scenarios with a known, limited range of values (say, 0 to 1000), array-based frequency counting offers an elegant solution. While this approach shines in terms of both speed and memory usage within its constraints, it becomes impractical when dealing with arbitrary or extensive value ranges.
- Python's Built-in Set Operations: Python provides a remarkably clean solution through its built-in set intersection capability. Simply converting both arrays to sets and computing their intersection yields a concise, readable solution. While this approach may offer less flexibility for complex frequency-based operations, its simplicity makes it an excellent default choice for many use cases.
Solution 1 - Using a Dictionary and a Set
How It Works
- Dictionary for Frequency Counting:
First, we process the first array (nums1
) and store each element’s frequency in a dictionary (table
). This gives us an instant check to see if an element has appeared and how many times. - Set for Result Storage:
We then iterate through the second array (nums2
). For each element, we check if it appears in the dictionary. If it does, we add it to a set calledres
. Using a set here ensures that we never insert the same element twice, thereby maintaining uniqueness effortlessly. - Final Conversion:
After processing, we convert the resulting setres
to a list before returning. This meets the problem’s requirement of returning a list structure.
This approach strikes a good balance between clarity and efficiency. The dictionary lookups are O(1) on average, and so are set insertions.
Code Explanation
class Solution:
def intersection(self, nums1: List[int], nums2: List[int]) -> List[int]:
# Step 1: Build a dictionary for elements in nums1
table = {}
for num in nums1:
table[num] = table.get(num, 0) + 1
# Step 2: Create a set to hold the intersection results
res = set()
# Step 3: Check elements of nums2 against the dictionary
for num in nums2:
if num in table:
# Found an intersection, add to the result set
res.add(num)
# Remove from the dictionary to prevent adding duplicates
del table[num]
# Return the result set as a list
return list(res)
Complexity Analysis
- Time Complexity: O(m + n), where m is the length of
nums1
and n is the length ofnums2
. We do one pass through each array. - Space Complexity: O(m + n) in the worst case, as we might store up to all elements of
nums1
in the dictionary and potentially all intersection elements in the set. In practice, it’s usually less.
Solution 2 - Using Arrays as Hash Maps
How It Works
- Frequency Arrays Instead of Dictionaries:
If the range of numbers is limited and known (e.g.,0 <= element <= 1000
), we can replace dictionaries with fixed-size arrays. Here, we create two arrayscount1
andcount2
, each with a size that covers the entire known range of possible elements. - Counting Frequencies:
We iterate overnums1
and increment the frequency count atcount1[element]
. Similarly, we incrementcount2[element]
for each element innums2
. After these passes,count1[i]
tells us how many timesi
appeared innums1
, andcount2[i]
tells us how many timesi
appeared innums2
. - Finding Intersection:
We then go through the index range of these arrays (from0
to1000
), and whenever bothcount1[i] > 0
andcount2[i] > 0
, it meansi
is present in both arrays. We add it to the result.
When to Use This Approach?
This method is extremely fast because lookups and increments in arrays are O(1) without hashing overhead. However, it’s only suitable if the problem guarantees that all values fall within a small range. If the range is large or unknown, the memory usage becomes impractical.
Code Explanation
class Solution:
def intersection(self, nums1: List[int], nums2: List[int]) -> List[int]:
# Initialize counters for a known range [0, 1000]
count1 = [0]*1001
count2 = [0]*1001
result = []
# Count frequencies in nums1
for i in range(len(nums1)):
count1[nums1[i]] += 1
# Count frequencies in nums2
for j in range(len(nums2)):
count2[nums2[j]] += 1
# Identify intersection elements
for k in range(1001):
if count1[k] > 0 and count2[k] > 0:
result.append(k)
return result
Detailed Explanation of Steps:
- We create two frequency arrays,
count1
andcount2
, each of length 1001, initialized to zero. - By incrementing the corresponding indices as we process each array, we record the presence of each number.
- After processing both arrays, we know exactly which indices appear in both, as they’ll have a frequency greater than zero in both
count1
andcount2
. - Finally, we scan through the entire frequency range and collect any index that appears in both arrays.
Complexity Analysis
- Time Complexity: O(m + n + R), where R is the fixed range (in this case, 1001). This is effectively O(m + n) given a constant R.
- Space Complexity: O(R), since we rely on two fixed-size arrays. For R=1001, it’s constant additional space.
Solution 3 - Using Built-in Set Operations
How It Works
- Convert Both Arrays to Sets: By converting
nums1
andnums2
into sets, we automatically remove duplicates. We now have two unique sets of elements, sayset1
andset2
. - Intersection Operation: Python sets provide a built-in intersection operator (
&
). Usingset1 & set2
directly gives us the intersection of the two sets without manually checking elements. - Convert Back to List: Since the problem expects a list, we convert the final set intersection into a list before returning.
Why This Approach Is Elegant?
This method relies heavily on Python’s built-in set operations, which are both efficient and highly readable. There’s no need to manually handle duplicates or maintain separate data structures. However, it doesn’t offer the fine-grained control (like counting frequencies) that a dictionary might provide. Still, for a simple intersection, it’s perfectly suitable.
Code Explanation
class Solution:
def intersection(self, nums1: List[int], nums2: List[int]) -> List[int]:
# Convert both lists to sets
set1 = set(nums1)
set2 = set(nums2)
# Directly compute the intersection and return it as a list
return list(set1 & set2)
Detailed Explanation of Steps:
- Converting
nums1
andnums2
to sets strips away duplicates right from the start. - The intersection operation
set1 & set2
is a built-in, optimized way to find common elements. - The result of
set1 & set2
is another set containing only elements found in both sets. Converting it to a list meets the return type requirement.
Complexity Analysis
- Time Complexity: O(m + n), due to creating sets and then performing a set intersection which averages O(min(m, n)).
- Space Complexity: O(m + n), for storing the sets. The intersection itself does not exceed the size of these sets.
Key Takeaways
- Efficiency Through Hashing:
Leveraging hash-based data structures like sets and dictionaries can reduce complexity from O(n²) to O(n). - Uniqueness Handling:
Sets inherently manage uniqueness, making them ideal for problems that require removing duplicates. - Trade-offs in Data Structures:
- Dictionary + Set: More flexible, allows frequency tracking, and explicit control over duplicates.
- Arrays as Hash Maps: Very fast and simple if the range is known and small, but not suitable for large or unknown ranges.
- Built-in Set Operations: Extremely clean and concise, perfect for straightforward intersection tasks without extra frequency logic.
- Selecting the Right Tool:
Choose the approach that best suits the constraints. If the problem’s value range is small, arrays can be fastest. If the range is large or unknown, sets and dictionaries are more efficient and flexible. If simplicity is key and you just want the intersection, built-in set intersection is the simplest route.
By understanding these approaches, you can pick the one that aligns with the problem constraints and your specific needs.
Discussion