Problem Definition

LeetCode Link: 349. Intersection of Two Arrays

Given two integer arrays nums1 and nums2, return an array of their 

intersection. Each element in the result must be unique and you may return the result in any order.

Example 1:

Input: nums1 = [1,2,2,1], nums2 = [2,2]
Output: [2]

Example 2:

Input: nums1 = [4,9,5], nums2 = [9,4,9,8,4]
Output: [9,4]
Explanation: [4,9] is also accepted.

Constraints:

  • 1 <= nums1.length, nums2.length <= 1000
  • 0 <= nums1[i], nums2[i] <= 1000

Thought Process

When searching for common elements between two arrays, the obvious approach might be to compare each element against every other—but this brute force method comes with a costly O(n²) runtime that quickly becomes impractical as arrays grow larger.

Fortunately, we can leverage more sophisticated data structures to dramatically improve performance. By utilizing hash sets and dictionaries, we can achieve O(1) lookup times on average, reducing the overall complexity to O(m + n), where m and n represent the respective array sizes.

Let's explore three key strategies:

  • The Dictionary-Set Approach: This method combines the speed of dictionary lookups with set operations to ensure unique results. By storing element frequencies in a dictionary, we can rapidly verify the presence of elements from the second array, making this approach both efficient and flexible.
  • Array-Based Frequency Counting: For scenarios with a known, limited range of values (say, 0 to 1000), array-based frequency counting offers an elegant solution. While this approach shines in terms of both speed and memory usage within its constraints, it becomes impractical when dealing with arbitrary or extensive value ranges.
  • Python's Built-in Set Operations: Python provides a remarkably clean solution through its built-in set intersection capability. Simply converting both arrays to sets and computing their intersection yields a concise, readable solution. While this approach may offer less flexibility for complex frequency-based operations, its simplicity makes it an excellent default choice for many use cases.

Solution 1 - Using a Dictionary and a Set

How It Works

  1. Dictionary for Frequency Counting:
    First, we process the first array (nums1) and store each element’s frequency in a dictionary (table). This gives us an instant check to see if an element has appeared and how many times.
  2. Set for Result Storage:
    We then iterate through the second array (nums2). For each element, we check if it appears in the dictionary. If it does, we add it to a set called res. Using a set here ensures that we never insert the same element twice, thereby maintaining uniqueness effortlessly.
  3. Final Conversion:
    After processing, we convert the resulting set res to a list before returning. This meets the problem’s requirement of returning a list structure.

This approach strikes a good balance between clarity and efficiency. The dictionary lookups are O(1) on average, and so are set insertions.

Code Explanation

class Solution:
    def intersection(self, nums1: List[int], nums2: List[int]) -> List[int]:
        # Step 1: Build a dictionary for elements in nums1
        table = {}
        for num in nums1:
            table[num] = table.get(num, 0) + 1
        
        # Step 2: Create a set to hold the intersection results
        res = set()
        
        # Step 3: Check elements of nums2 against the dictionary
        for num in nums2:
            if num in table:
                # Found an intersection, add to the result set
                res.add(num)
                # Remove from the dictionary to prevent adding duplicates
                del table[num]
        
        # Return the result set as a list
        return list(res)

Complexity Analysis

  • Time Complexity: O(m + n), where m is the length of nums1 and n is the length of nums2. We do one pass through each array.
  • Space Complexity: O(m + n) in the worst case, as we might store up to all elements of nums1 in the dictionary and potentially all intersection elements in the set. In practice, it’s usually less.

Solution 2 - Using Arrays as Hash Maps

How It Works

  1. Frequency Arrays Instead of Dictionaries:
    If the range of numbers is limited and known (e.g., 0 <= element <= 1000), we can replace dictionaries with fixed-size arrays. Here, we create two arrays count1 and count2, each with a size that covers the entire known range of possible elements.
  2. Counting Frequencies:
    We iterate over nums1 and increment the frequency count at count1[element]. Similarly, we increment count2[element] for each element in nums2. After these passes, count1[i] tells us how many times i appeared in nums1, and count2[i] tells us how many times i appeared in nums2.
  3. Finding Intersection:
    We then go through the index range of these arrays (from 0 to 1000), and whenever both count1[i] > 0 and count2[i] > 0, it means i is present in both arrays. We add it to the result.

When to Use This Approach?
This method is extremely fast because lookups and increments in arrays are O(1) without hashing overhead. However, it’s only suitable if the problem guarantees that all values fall within a small range. If the range is large or unknown, the memory usage becomes impractical.

Code Explanation

class Solution:
    def intersection(self, nums1: List[int], nums2: List[int]) -> List[int]:
        # Initialize counters for a known range [0, 1000]
        count1 = [0]*1001
        count2 = [0]*1001
        result = []
        
        # Count frequencies in nums1
        for i in range(len(nums1)):
            count1[nums1[i]] += 1
        
        # Count frequencies in nums2
        for j in range(len(nums2)):
            count2[nums2[j]] += 1
        
        # Identify intersection elements
        for k in range(1001):
            if count1[k] > 0 and count2[k] > 0:
                result.append(k)
        
        return result

Detailed Explanation of Steps:

  • We create two frequency arrays, count1 and count2, each of length 1001, initialized to zero.
  • By incrementing the corresponding indices as we process each array, we record the presence of each number.
  • After processing both arrays, we know exactly which indices appear in both, as they’ll have a frequency greater than zero in both count1 and count2.
  • Finally, we scan through the entire frequency range and collect any index that appears in both arrays.

Complexity Analysis

  • Time Complexity: O(m + n + R), where R is the fixed range (in this case, 1001). This is effectively O(m + n) given a constant R.
  • Space Complexity: O(R), since we rely on two fixed-size arrays. For R=1001, it’s constant additional space.

Solution 3 - Using Built-in Set Operations

How It Works

  1. Convert Both Arrays to Sets: By converting nums1 and nums2 into sets, we automatically remove duplicates. We now have two unique sets of elements, say set1 and set2.
  2. Intersection Operation: Python sets provide a built-in intersection operator (&). Using set1 & set2 directly gives us the intersection of the two sets without manually checking elements.
  3. Convert Back to List: Since the problem expects a list, we convert the final set intersection into a list before returning.

Why This Approach Is Elegant?
This method relies heavily on Python’s built-in set operations, which are both efficient and highly readable. There’s no need to manually handle duplicates or maintain separate data structures. However, it doesn’t offer the fine-grained control (like counting frequencies) that a dictionary might provide. Still, for a simple intersection, it’s perfectly suitable.

Code Explanation

class Solution:
    def intersection(self, nums1: List[int], nums2: List[int]) -> List[int]:
        # Convert both lists to sets
        set1 = set(nums1)
        set2 = set(nums2)
        
        # Directly compute the intersection and return it as a list
        return list(set1 & set2)

Detailed Explanation of Steps:

  • Converting nums1 and nums2 to sets strips away duplicates right from the start.
  • The intersection operation set1 & set2 is a built-in, optimized way to find common elements.
  • The result of set1 & set2 is another set containing only elements found in both sets. Converting it to a list meets the return type requirement.

Complexity Analysis

  • Time Complexity: O(m + n), due to creating sets and then performing a set intersection which averages O(min(m, n)).
  • Space Complexity: O(m + n), for storing the sets. The intersection itself does not exceed the size of these sets.

Key Takeaways

  1. Efficiency Through Hashing:
    Leveraging hash-based data structures like sets and dictionaries can reduce complexity from O(n²) to O(n).
  2. Uniqueness Handling:
    Sets inherently manage uniqueness, making them ideal for problems that require removing duplicates.
  3. Trade-offs in Data Structures:
    • Dictionary + Set: More flexible, allows frequency tracking, and explicit control over duplicates.
    • Arrays as Hash Maps: Very fast and simple if the range is known and small, but not suitable for large or unknown ranges.
    • Built-in Set Operations: Extremely clean and concise, perfect for straightforward intersection tasks without extra frequency logic.
  4. Selecting the Right Tool:
    Choose the approach that best suits the constraints. If the problem’s value range is small, arrays can be fastest. If the range is large or unknown, sets and dictionaries are more efficient and flexible. If simplicity is key and you just want the intersection, built-in set intersection is the simplest route.

By understanding these approaches, you can pick the one that aligns with the problem constraints and your specific needs.