Jc-alt logo
jonathancamberos
data structures and algorithms

LeetCode: Arrays and Hashing

LeetCode: Arrays and Hashing
69 min read
#data structures and algorithms
Table Of Content

Arrays & Hashing Intro:

Leetcode problems with elegant solutions using hashing.

What is a Hashmap

A dictionary is just a direct mapping of keys->values:

Dictionary: ['a': 2, 'b': 2, 'c' : 3, ...]

A Hashmap is just a dictionary that maps keys->values using a hash function. Hashmaps are a common use case for hashing. We choose hashing to maximize randomness and minimize collisions between elements to keys.

Hashing

Hashing is simply a function. It takes in an input, and spits out an output. As an example, here is the function mod, which takes in an integer and spits out an integer:

1 mod (3) = 1

2 mod (3) = 2

3 mod (3) = 0

4 mod (3) = 1

Now this is could be our function for our hashmap, but it would lead to high collisions and low randomness as there are only 3 possible key results: 0, 1, and 2. So hashmaps are only efficient as the chosen function.

With a good Hash function:

Insert, Lookup, and Delete take O(1). In this case, every element gets its own unique key and so a lookup using this function would be constant O(1).

With a bad Hash function:

Insert, Lookup, and Delete take O(n). Lets say our function is hash():

    hash()  ->  [key mod (5)]
    hash(10) = 10 % 5 = 0
    hash(22) = 22 % 5 = 2
    hash(31) = 31 % 5 = 1
    hash(14) = 14 % 5 = 4
    hash(17) = 17 % 5 = 2 (collision!)

    Handle collision using chaining with a linked list of elements.

    Index 0  |  Index 1  |  Index 2         |  Index 3  |  Index 4
    -------- | --------- | ---------------- | --------- | ---------
    [10]     |  [31]     |  [22 -> 17]      |  [  ]     |  [14]

In this case there are only 5 keys or buckets would be hashed to. After inserting n elements into one key or bucket, performing a lookup would for some element k would result in traversing a list of size n in the worst case leading to O(n).

This would degrade insert, lookup, and delete from the original O(1) to O(n).

Hashmaps having quick O(1) average time for insert(), lookup(), and delete(), are great for storing key pairs of lists of elements to perform more complex tasks where performance in time and space complexity are important.

Balancing

Balancing occurs when a hash table has passed its set load factor.

Load Factor = n/m Where n = num of elements, m = num of keys

Thus, when certain fraction of the table size, say 75% full with 3 elements and 4 potential keys. The table must increase in size and rehash/reinsert every element.

The efficiency of a hash table decreases significantly without balancing, as this leads to increased collisions as the table fills up, which leads to time needed to resolve collisions, as we traverse through list to find the element in linear time O(n). Operations like search, insert, and delete, degrade from O(1) on average to O(n) in the worst case.

Usually, when the load factor is reached, the table size doubles to 2n. And usually, a common load factor is 0.75 or 75% full. Finally, we must rehash every element leading to: [(double table size * load factor) - current n elements] =

(2n * 0.75) - n =

1.5n - n =

0.5n =

n/2 = additional space after rebalancing

Rehashing is notably expensive as we must rehash every element with the new hash function, leading to O(n) for resizing and reinserting n elements.

Array Application: In place Transformations

Perform transformations or reorderings on the array without using extra space.

Ex: Rotate an array to the right by k steps.

    def rotate(nums: List[int], k: int) -> None:
        n = len(nums)
        k %= n  # handle cases where k > n
        
        def reverse(start: int, end: int) -> None:
            while start < end:
                nums[start], nums[end] = nums[end], nums[start]
                start += 1
                end -= 1

        reverse(0, n - 1)    # Reverse entire array
        reverse(0, k - 1)    # Reverse first k elements
        reverse(k, n - 1)    # Reverse remaining elements

HashMap Application: Representations

Using a HashMap to represent objects or data based on specific criteria.

Ex: Representing a string by character frequency.

    def freqCount():
        s = "aabbcc"
        freq = {}
        for char in s:
            freq[char] = freq.get(char, 0) + 1
        # freq = {'a': 2, 'b': 2, 'c': 2}

HashMap Application: Grouping by Criteria

Group elements based on a defined criterion, such as sorting or categorization.

Ex: Grouping strings that are anagrams of each other:

    def groupAnagrams(strs):
        anagrams = {}
        for word in strs:
            key = "".join(sorted(word))
            if key not in anagrams:
                anagrams[key] = []
            anagrams[key].append(word)
        # anagrams = {'aet': ['eat', 'tea', 'ate'], 'ant': ['tan', 'nat'], 'abt': ['bat']}

HashMap Application: Memoization in Dynamic Programming

Store solutions to subproblems in a HashMap to avoid redundant calculations.

Ex: Fibonacci number computation with memoization:

    def fib(n, memo={}):
        if n <= 1:
            return n
        if n not in memo:
            memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
        return memo[n]

HashMap Application: Backtracking with Caching for Pruning

Use a HashMap to cache explored paths or states to prune the search space effectively.

Ex: Subset Sum with caching:

    def subsetSum(nums, target):
        # Cache to store paths that can no longer lead to valid solutions
        cache = {}
        result = []

        def backtrack(current, index, total):
            # Check if we've encountered this path before
            state = (tuple(current), total)
            if state in cache:
                return
            cache[state] = True

            # Prune: If the total exceeds the target
            if total > target:
                return
            
            # Valid solution
            if total == target:
                result.append(list(current))
                return

            # Explore further candidates
            for i in range(index, len(nums)):
                # Choose
                current.append(nums[i])
                # Recurse
                backtrack(current, i + 1, total + nums[i])
                # Undo the choice
                current.pop()

        backtrack([], 0, 0)
        return result

HashMap Application: Representing Relationships

Model relationships between entities.

Ex: Adjacency list representation of a graph:

    def graph():
        edges = [(1, 2), (2, 3), (1, 3)]  # List of edges in the graph
        graph = {}  # Initialize an empty hashmap to represent the graph

        # Build the adjacency list
        for u, v in edges:
            if u not in graph:  # If the node `u` is not yet in the graph, initialize it
                graph[u] = []
            if v not in graph:  # If the node `v` is not yet in the graph, initialize it
                graph[v] = []
            graph[u].append(v)  # Add `v` to the list of neighbors for `u`
            graph[v].append(u)  # Add `u` to the list of neighbors for `v`

        # Resulting adjacency list representation
        # graph = {1: [2, 3], 2: [1, 3], 3: [2, 1]}

HashMap Application: Logic Mapping

Use a HashMap to represent problem specific logic or conditions.

Ex: Map complements to indices for quick lookup:

    def twoSum(nums, target):
        # Map to store complement -> index
        complement_map = {}

        for i, num in enumerate(nums):
            complement = target - num
            
            # Check if current number is a complement of a previously seen number
            if num in complement_map:
                return [complement_map[num], i]
            
            # Store complement of current number with index for future lookup
            complement_map[complement] = i

        return []

HashMap Application: Algorithm

Case where a problem that seems to require a HashMap is easily solved by an algorithm specifically made for that problem

Ex: Boyer-Moore Voting Algorithm

def majorityElement(nums: List[int]) -> int:
    candidate = None
    count = 0

    # Phase 1: Find a candidate for the majority element
    for num in nums:
        if count == 0:  # Reset candidate
            candidate = num
        count += (1 if num == candidate else -1)

    # Phase 2: Confirm the candidate (optional if majority element is guaranteed)
    return candidate

217. Contains Duplicate ::2:: - Easy

Topics: Array, Hash Table, Sorting

Intro

Given an integer array nums, return true if any value appears at least twice in the array, return false if every element is distinct.

Example InputOutput
nums = [1,2,3,1]true
nums = [1,2,3,4]false
nums = [1,1,1,3,3,4,3,2,4,2]true

Constraints:

Array Length: 1 ≤ nums.length ≤ 105

Integer Size: -109 ≤ nums[i] ≤ 109

Abstraction

Given a list of elements, we need to check if any of those elements are duplicated.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Brute Force (all pairs comparison)O(n2)O(1)For each pair of elements, we compare them, leading to quadratic number of comparison O(n2)No additional memory is allocated O(1)
Tim SortO(n log n)O(n)Tim sort over the array dominates, leading to O(n log n)Temporary arrays used during the merge phase dominate, leading to O(n)
HashmapO(n)O(n)Iterate over every element O(n), with O(1) lookup and insertion into the map, leading to linear time O(n)We store each each element in a hashmap, which requires additional memory proportional to the number of elements O(n)
HashsetO(n)O(n)Same logic as HashmapSame logic as Hashmap
BugError
Find the Bug: (iterating element-wise)Iterating element-wise will cause elements to compare with themselves.

Brute Force: (all pairs comparison)

    def containsDuplicate(self, nums: List[int]) -> bool:

        # time complexity: iteration of O(n)
        for i in range(len(nums)):

            # time complexity: iteration of O(n^2)
            # 1 + 2 + 3 ... (n - 3) (n - 2) (n - 1) -> 
            # sandwich with reverse sum and cross out like terms -> n(n-1)/2 = O(n^2)
            for j in range(i + 1, len(nums)):

                # time complexity: comparison O(1)
                if nums[i] == nums[j]:
                    return True
        
        # overall: time complexity O(n^2) 
        # overall: space complexity O(1)
        return False
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Outer LoopO(n)O(1)Iteration over the list takes linear time O(n)No additional memory is allocated O(1)
Inner LoopO(n2)O(1)For each element in the outer loop, the inner loop iterates through all subsequent elements with Big(O) is O(n2). Derivation hereNo additional memory is allocated O(1)
ComparisonO(1)O(1)Each pair of elements is compared in constant time O(1)No additional memory is needed for comparisons O(1)
OverallO(n2)O(1)The inner loop dominates leading to O(n2) time complexity.No additional memory was allocated leading to constant O(1) space complexity.

Inefficient: Sorting (Tim Sort)

   def containsDuplicate(self, nums: List[int]) -> bool:
        
        # sorts small segment of array using in-place Insertion Sort 
        def insertionSort(arr: List[int], start: int, end: int) -> None:
            
            # insertion sort algo on small segment [start+1, end]
            for i in range(start + 1, end + 1):
                key = arr[i]
                j = i - 1

                while j >= start and arr[j] > key:
                    arr[j + 1] = arr[j]
                    j -= 1
                arr[j + 1] = key

        # merges two sorted subarrays into a single sorted segment using in-place Merge Sort
        def merge(arr: List[int], start: int, mid: int, end: int) -> None:
            
            # temporary arrays for left and right halves
            left = arr[start:mid + 1]
            right = arr[mid + 1:end + 1]

            # pointer for left and right arrays
            i = j = 0

            # pointer for main array where merged elements are placed
            k = start

            # merge sort algo elements into arr[start...end]
            while i < len(left) and j < len(right):
                if left[i] <= right[j]:
                    arr[k] = left[i]
                    i += 1
                else:
                    arr[k] = right[j]
                    j += 1
                k += 1

            # place left over elements within the array, cannot use extend()
            while i < len(left):
                arr[k] = left[i]
                i += 1
                k += 1
            while j < len(right):
                arr[k] = right[j]
                j += 1
                k += 1

        # sorts array into place using runs of 32
        def timSort(arr: List[int], run: int = 32) -> None:
            
            n = len(arr)

            # Sort small runs using Insertion Sort
            # time complexity: initial sort runs O((n/run)) * run^2) ~ O(n)
            for i in range(0, n, run):
                insertionSort(arr, i, min(i + run - 1, n - 1))

            # Iterate over array and merge sort runs
            # time complexity: O(log(n/run) * n) ~ O(n log n)
            size = run
            while size < n:

                # merge neighbor pairs of runs of length 'size'
                for start in range(0, n, size * 2):
                    mid = min(start + size - 1, n - 1)
                    end = min(start + size * 2 - 1, n - 1)

                    # merge if two valid runs exists (mid < end)
                    if mid < end:
                        merge(arr, start, mid, end)

                # merged two runs of length size
                # update new run length = old run len + old run len
                size *= 2

        timSort(nums)
        
        # check for duplicates in the sorted array
        # time complexity: iterate over sorted array of n length O(n)
        for i in range(1, len(nums)):
            if nums[i] == nums[i - 1]:
                return True
        
        # overall: time complexity O(n log n)
        # overall: space complexity O(n)
        return False
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Insertion SortO(n)O(1)Sorting small runs of size run with O(run^2) per segment, totaling O(n)No additional memory allocation for in-place sorting O(1)
Merging SegmentsO(n log n)O(n)log(n/run) levels of merges, each requiring O(n) time, O(n log (n/run)), leading to O(n log n)Temporary arrays during merge O(n)
Duplicate ValidationO(n)O(1)Iterating over sorted array of n length O(n)No additional memory for comparison operation O(1)
OverallO(n log n)O(n)Merge sort dominates time complexity, leading to O(n log n)Temporary arrays during merge dominates, leading to O(n)

Find the Bug: (iterating element-wise)

    def containsDuplicate(self, nums: List[int]) -> bool:

        # time complexity: iteration of O(n)
        for i in nums:

            # time complexity: iteration of O(n)
            for j in nums: 

                # INCORRECT: iterating element-wise will comparing elements to themselves
                # use index iteration in this case
                if i == j 
                    return True

        return False 

Solution 1: Hashmap - Hashmap/Representation

    def containsDuplicate(self, nums: List[int]) -> bool:

        # space complexity: dictionary of O(n)
        count = defaultdict(int)

        # time complexity: iteration over list of n length O(n)
        for num in nums:

            # time complexity: lookup operation in constant O(1)
            if count[num] >= 1:
                return True
            count[num] += 1

        # overall: time complexity  O(n) 
        # overall: space complexity O(n)
        return False
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iteration + InsertionO(n)O(n)Iteration over the list takes linear time O(n) with insertion into Hashmap in O(1) time. Iteration dominates O(n).Hashmap grows proportional to the number of unique elements. Worst case O(n)
LookupO(1)O(1)Each lookup is O(1), performed once per elementLookup operations do not require additional memory allocation.
OverallO(n)O(n)Iteration dominates over insertion for time complexity linearly for O(n)Memory allocation for Hashmap dominates leading to O(n) space complexity.

Solution 2: Hashset - Hashmap/Representation

    def containsDuplicate(self, nums: List[int]) -> bool:
        
        # space complexity: hashset of O(n)
        seen = set()

        # time complexity: iteration of O(n)
        for n in nums:

            # time complexity: lookup operation of O(1)
            if n in seen:
                return True
            seen.add(n)
            
        # overall: time complexity  O(n)
        # overall: space complexity O(n)
        return False
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iteration + InsertionO(n)O(n)Iteration over the list takes linear time O(n) with insertion into Hashset in O(1) time. Iteration dominates O(n).Hashset grows proportional to the number of unique elements. Worst case O(n)
LookupO(1)O(1)Lookup operation O(1), performed once per element, is still O(1)Lookup operations do not require additional memory allocation. O(1)
OverallO(n)O(n)Iteration dominates over insertion for time complexity linearly for O(n)Memory allocation for Hashset dominates leading to O(n) space complexity.

242. Valid Anagram ::2:: - Easy

Topics: Hash Table, String, Sorting

Intro

Given two strings s and t, return true if t is an anagram of s, and false otherwise.

An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

Example Input 'S'Example Input 'T'Output
"anagram""nagaram"true
"rat""car"false

Constraints:

String Length: 1 ≤ s.length, t.length ≤ 5 * 104

Characters: s and t consist of lowercase English letters

Abstraction

Given a string, we represent its character count as a Hashmap. We then validate it to the second string character count.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Brute Force (Merge Sort)O(n log n)O(n)Sorting an individual string O(n log n) dominates as comparing strings by character sequentially is linear O(n)No additional memory is allocated O(1)
HashmapO(n)O(1)Iteration over the list takes linear time O(n), with O(1) lookup and insertion into the map, leading to linear time O(n)We store each each element in a hashmap, which requires memory proportional to the number of elements O(n)
Hashmap (array of 26)O(n)O(1)Same as logic as HashmapSame as logic as Hashmap
BugError
Find the Bug: (Set Frequency Count)Does not take into account frequency count of characters since sets can only have unique elements.

Brute Force (Merge Sort)

    def isAnagram(self, s: str, t: str) -> bool:
        
        # time complexity: comparison of O(1)
        if len(s) != len(t):
            return False
        
        # time complexity: sorting of O(n log n) for sorting each string
        # space complexity: overall of O(n) = temporary arrays O(n) + recursion stack of O(log n)
        def mergeSort(string):
            
            # base case of list with single element ['']
            # time complexity: O(1)
            # space complexity: O(1)
            if len(string) <= 1:
                return string
            
            # time complexity: dividing operation in constant O(1)
            mid = len(string) // 2

            # mergeSort both subarrays
            left = mergeSort(string[:mid])
            right = mergeSort(string[mid:])
            
            # base case of merge two lists with single element merge ([''], [''])
            return merge(left, right)
        
        # Time Complexity: O(n) per merge
        # Space Complexity: O(n) for temporary storage of the merged array
        def merge(left, right):
        
            sorted = []
            i = j = 0

            # time complexity: iterate over two strings combined n length O(n) 
            while i < len(left) and j < len(right):

                # sort characters into sorted while iterating
                if left[i] <= right[j]:
                    sorted.append(left[i])
                    i += 1
                else:
                    sorted.append(right[j])
                    j += 1
            

            # time complexity: O(n) in worst case if either list is empty
            sorted.extend(left[i:]) # If elements remain in `left`
            sorted.extend(right[j:]) # If elements remain in `right`

            return sorted

        # Time Complexity: O(n log n) for sorting each string
        # Space Complexity: O(n) for temporary arrays and recursion stack
        sorted_s = mergeSort(list(s))
        sorted_t = mergeSort(list(t))


        # overall: time complexity: O(n log n)
        # overall: space complexity: O(n)
        return sorted_s == sorted_t
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
SortingO(n log n)O(n)Sorting the strings involves recursively dividing the input and merging them back leading to O(n log n) time complexity. Merge sort derivation here.Temporary arrays are used during merging, leading to O(n) space
ComparisonO(n)O(1)Comparing the sorted strings character by-element takes linear timeNo additional memory is allocated O(1)
OverallO(n log n)O(n)Sorting dominates the time complexity as comparison adds negligible overhead leading to O(n log n) time complexity.Temporary arrays required for merge sort is proportional to input size leading to O(n) space complexity.

Find the Bug: (Set Frequency Count)

    def isAnagram(self, s: str, t: str) -> bool:
        
        # checking if length match
        if len(s) != len(t):
            return False

        # INCORRECT: checks only if all characters of s are in t
        # and not vice versa
        for char in s:
            if char not in t:
                return False
        
        # INCORRECT: does not validate that character frequency counts match
        return True

Solution 1: Hashmap - Hashmap/Representation

    def isAnagram(self, s: str, t: str) -> bool:
        
        # space complexity: hashmap of 26 elements constant  O(1) 
        count = defaultdict(int)

        # time complexity: iteration over string of n length O(n)
        for x in s:
           count[x] += 1

        # time complexity: iteration over string of n length O(n)
        for x in t:
           count[x] -= 1

        # time complexity: iteration over string of n length O(n)
        for value in count.values():

            # time complexity: comparison operation in constant O(1)
            if value != 0:
                return False

        # overall: time complexity  O(n)
        # overall: space complexity O(1)
        return True
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iteration + InsertionO(n)O(n)Iterating over the first string takes linear time O(n) with O(1) insertion.Hashmap size depends on the number of unique characters in the input. Worst case O(n)
Editing MapO(n)O(1)Iterating over the second string to edit the Hashset takes linear time O(n)No additional memory is allocated O(1)
VerificationO(n)O(1)Iterating over the Hashmap takes linear time O(n)No additional memory is allocated O(1)
OverallO(n)O(n)Character counting and verification result in O(3n) which Big(O) abstracts to O(n).Memory allocation for Hashmap dominates leading to O(n) space complexity

Solution 2: Array of 26 - Hashmap/Representation

    def isAnagram(self, s: str, t: str) -> bool:

        # note: ord() converts a single uni-code character to its integer representation
        # int: 97-122 -> lowercase English alphabet (a-z)
        # we can calculate any char (a-z)'s uni-code integer by subtracting uni-code of 'a'
        # '(a-z)' - 'a': returns a value of 0-25 which we can index into our array
        # this simulates a hashing into our array

        # space complexity: array of 26 elements constant O(1)
        count = [0] * 26

        # time complexity: iteration of O(n)
        for x in s:
           count[ord(x) - ord('a')] += 1

        # time complexity: iteration of O(n)
        for x in t:
           count[ord(x) - ord('a')] -= 1

        # time complexity: iteration of O(n)
        for value in count:

            # time complexity: comparison of O(1)
            if value != 0:
                return False

        # overall: time complexity: O(n)
        # overall: space complexity O(1)
        return True
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iteration + InsertionO(n)O(1)Iterating over the first string to populate the Array takes linear time O(n)Fixed-sized array of 26 elements is used O(1)
Editing ArrayO(n)O(1)Iterating over the second string to edit the Array takes linear time O(n)No additional memory is allocated O(1)
VerificationO(n)O(1)Iterating over the Array takes linear time O(n)No additional memory is allocated O(1)
OverallO(n)O(1)counting and verification result in O(3n) which Big(O) abstracts to O(n)Space usage optimal due to a constant size array O(1)

1. Two Sum I (Input Array Is Not Sorted) ::1:: - Easy

Topics: Array, Hash Table

Intro

Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target. You may assume that each input would have exactly one solution, and you may not use the same element twice. You can return the answer in any order.

nums[]TargetOutput
[2,7,11,15]9[0,1]
[3,2,4]6[1,2]
[3,3]6[0,1]

Constraints:

Only one valid answer exists

Target = n + m, n/m value: 2 ≤ nums.length ≤ 10^4

Target value: -10^9 ≤ target ≤ 10^9

Abstraction

We need to find two elements that add up to the target and return their indexes.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Brute Force (Comparing all pairs)
Hashmap 1-pass

Brute Force (Comparing all pairs)

    def twoSum(self, nums: List[int], target: int) -> List[int]:
        
        # time complexity: iteration of O(n)
        for i in range(len(nums)):

            # time complexity: iteration of O(n^2) 
            # 1 + 2 + 3 ... (n - 3) (n - 2) (n - 1) -> 
            # sandwich with reverse sum and cross out like terms -> n(n-1)/2 = O(n^2)
            for j in range(i + 1, len(nums)):

                # time complexity: summation + comparison of O(1)
                if nums[i] + nums[j] == target:
                    return [i, j]

        # overall: time complexity  O(n^2) 
        # overall: space complexity O(1)
        return []
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Outer LoopO(n)O(1)Iterates over the array once for each element O(n)No additional memory is allocated O(1)
Inner LoopO(n2)O(1)For each element in the outer loop, iterates through all remaining elements O(n2)No additional memory is allocated O(1)
Summation + ComparisonO(1)O(1)Summing two numbers takes constant time O(1) and comparison to target takes constant time O(1)No additional space is needed for summation O(1), no additional space is needed for comparison O(1)
OverallO(n2)O(1)The nested loops result in O(n2) time complexity. Derivation of average case is found O(n2) is found here.No additional space was allocated, so it is independent of input size O(1)

Find the Bug: (overwriting duplicates)

    def twoSum(self, nums: List[int], target: int) -> List[int]:
        
        # space complexity: hashmap of (int, index) for list n length O(n)
        tracking = {} 

        # time complexity: iterate over list n length o(n)
        for i, num in enumerate(nums):

            # INCORRECT: Overwrites already existing value in the tracking map
            # nums = [3, 3] target = 6 would fail
            tracking[num] = i

            # find complement
            complement = target - num

            # grab complement 
            if complement in tracking:
                return [tracking[complement], i]

        return []

Solution 1: Hashmap - Hashmap/Logic Mapping

    def twoSum(self, nums: List[int], target: int) -> List[int]:
        
        # map of {complement: index, complement: index...}
        # space complexity: hashmap of list of n length  O(n)
        tracking = {}
 
        # time complexity: iterating over list of size n O(n)
        for i in range (len(nums)):

            # grab curr complement -> (target = nums[i] + complement) 
            # time complexity: subtraction O(1) 
            complement = target - nums[i]

            # time complexity: lookup operation O(1)
            if complement in tracking:
                
                # [curr_index, complement_index]
                return [i, tracking[complement]]
    
            # if miss, track currElement as potential complement
            # time complexity: insert operation O(1)
            tracking[nums[i]] = i

        # overall: time complexity  O(n) 
        # overall: space complexity O(n)
        return []
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iteration + InsertionO(n)O(n)Iterating over the first string to populate a Hashmap takes linear time O(n)Hashmap size is proportional to the number of unique integers. Worst case O(n).
LookupO(1)O(1)Each lookup operation takes constant O(1) time.Lookups do not require additional memory.
InsertionO(1)O(n)Each insertion operation takes constant O(1) time.Hashmap size is proportional to the number of unique integers. Worst case O(n).
OverallO(n)O(n)Iterating dominates time complexity leading to linear O(n) time complexity.Memory allocation for Hashmap dominates leading to O(n) space complexity.

49. Group Anagrams ::2:: - Medium

Topics: Array, Hash Table, String, Sorting

Intro

Given an array of strings strs, group the anagrams together. You can return the answer in any order. An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

InputOutput
["eat","tea","tan","ate","nat","bat"][["bat"],["nat","tan"],["ate","eat","tea"]]
[""][[""]]
["a"][["a"]]

Constraints:

strs[i] consists of lowercase English letters

Abstraction

There are different ways to solve this problem.

We could then use the Hashmaps of character counts as keys to group anagrams in a list of strings: Hashmap(Hashmap(charCount) -> [anagram list string])

We could sort the string and use that as the key: Hashmap ( sorted_string -> [[unsorted anagram string group]])

Lets see:

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Brute Force (Merge sort)O(n * k log k)O(n * k)Iterating over the Array takes O(n). Sorting each string with merge sort takes O(k log k). Leading to O(n * k log k)Requires storage for sorted strings O(k) for O(n) strings, groups are stored in memory
Hashmap (array of 26) -> Tuple KeyO(n * k)O(n * k)Iterating over the Array takes O(n). Computing character count per string takes O(k). Leading to O(n * k)Creates a Hashmap per string O(n) for k character key map per string O(k). Leading to O(n * k)
Hashmap (array of 26) -> String KeyO(n * k)O(n * k)Iterating over the Array takes O(n). Computing character count per string takes O(k). Leading to O(n * k)Creates a Hashmap per string O(n) for k character key map per string O(k). Leading to O(n * k)
BugError
Find the Bug: Set -> String Key (Set Frequency Count)Does not take into account frequency count of characters since sets can only have unique elements.

Brute Force: (Merge Sort)

    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        
        # time complexity: O(k log k) = merge two halves O(k) * num of levels of recursion O(log k) 
        # space complexity: O(k) = temporary array during merging O(k) + recursion stack memory O(log k)
        def mergeSort(word):
        
            # base case: list with 1 element
            # time complexity: O(1)
            # space complexity: O(1)
            if len(word) <= 1:
                return word
            
            mid = len(word) // 2
            

            # space complexity: temporary array O(k) + recursion stack memory O(log k)
            left = mergeSort(word[:mid])
            right = mergeSort(word[mid:])
            
            # space complexity: temporary array O(k)
            sorted = []
            i = j = 0
            
            # time complexity: iteration of both subarrays O(n)
            # space complexity: temporary arrays of O(n)
            while i < len(left) and j < len(right):
                if left[i] < right[j]:
                    sorted.append(left[i])
                    i += 1
                else:
                    sorted.append(right[j])
                    j += 1
            
            # Adding remaining elements from either half
            sorted.extend(left[i:])  # time complexity: O(n)
            sorted.extend(right[j:]) # time complexity: O(n)

            return ''.join(sorted_word)
    
        # space complexity: O(n * k) = for n sorted string key O(n) and original string value of length k O(k)
        anaGroup = {}
        
        # time complexity: O(n * k log k), where n is number of strings and k is maximum length
        for word in strs:
            
            # time complexity: merge sort O(k log k)
            sorted = mergeSort(word)

            # time complexity: lookup + insertion O(1)
            if sorted in hashmap:
                hashmap[sorted].append(word)
            else:
                hashmap[sorted] = [word]
        

        # overall: time complexity 
        # overall: space complexity
        return list(anaGroup.values())
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Merge Sort per stringO(k log k)O(k)Sorting a single string of length k.Temporary arrays O(k) and recursion stack O(log k)
Sorting all stringsO(n * k log k)O(k) (temporary per string sequentially)Sorting n strings, each of length kO(k) per string dominates recursion stack O(log k)
Hashmap insertionO(1)O(n * k)Lookup and insertion are O(1)Memory allocation for sorted n string keys O(n) * original string values O(k)
OverallO(n * k log k)O(n * k)Merge sort O(k log k) for all strings O(n) dominates. Leading to O(n * k log k)Storing sorted string keys O(n) with original string value O(k) dominates. Leading to O(n * k)

Find the Bug: Hashmap Key for Hashmap (Unhashable type)

    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        
        # space complexity: stores unique m tuple keys O(m) and lists of k strings O(k), O(m * k)
        anaGroup = {}
        
        # Iterating over all strings O(n)
        for word in strs:
            
            # Counting character frequencies O(k)
            charCount = [0] * 26
            for char in word:
                charCount[ord(char) - ord('a')] += 1
            
            # INCORRECT: Mutable list `charCount` used as key
            if charCount not in anaGroup:
                anaGroup[charCount] = []  # Initialize list for group if key doesn't exist
            
            anaGroup[charCount].append(word)
        
        # overall: time complexity
        # overall: space complexity 
        return list(anaGroup.values())

Find the Bug: Hashmap Key Order (Hashmaps are not sorted)

    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        
        # space complexity: stores unique m tuple keys O(m) and lists of k strings O(k), O(m * k)
        groupAna = {}

        # Iterating over all strings O(n)
        for s in strs:
            count = defaultdict(int)

            for c in s:
                count[c] += 1

            tupKey = tuple(count)

            # INCORRECT: hashmaps (count obj) is not sorted
            # so tupKey will vary for "cat" vs "tac"
            if tupKey not in groupAna:
                groupAna[tupKey] = []

            groupAna[tupKey].append(s)

        return list(groupAna.values())

Find the Bug: Set to String Key (Set Frequency Count)

    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        
        # space complexity: stores n tuple keys O(n) and lists of original k strings O(k), O(n * k)
        anaGroup = {}
        
        # time complexity: iterating over list of n length O(n)
        for word in strs:
            
            # INCORRECT: Set creates a key excluding duplicates
            charSet = set()  
            for c in word:
                charSet.add(c) 
            
            # set sort string key
            setSort = ''.join(sorted(charSet))
            
            if setSort not in anaGroup:
                anaGroup[setSort] = [] 
            anaGroup[setSort].append(word)
        
        return list(anaGroup.values())

Solution 1: Array of 26 to Tuple Key - Hashmap/Grouping by Criteria

    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        
        # Note: Tuples in python are immutable which makes them suitable as keys for a hashmaps
        # Lists are mutable and as such not hashable for hashmaps
        # You should use tuples where the position carries semantic meaning. 

        # [26 char count tuple key -> [anagram group list], ...]
        
        # space complexity: stores unique m tuple keys O(m) and lists of k strings O(k), O(m * k)
        anaGroup = {}

        # time complexity: iterating over all strings O(n) 
        for word in strs:
            
            # space complexity: fixed-sized array for 26 lowercase letters O(1)
            charCount = [0] * 26  
            
            # time complexity: counting chars in string of k length O(k)
            for char in word:
                charCount[ord(char) - ord('a')] += 1
            
            # space complexity: fixed-size tuple of length 26 O(1)
            key = tuple(charCount)  
            
            # time complexity: lookup operation of O(1) 
            if key not in anaGroup:
                anaGroup[key] = []  

            # time complexity: append operation to list O(1)
            anaGroup[key].append(word)  


        # overall: time complexity  O(n * k)  
        # overall: space complexity O(m * k)
        return list(anaGroup.values())
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iterating over stringsO(n)O(1)Loop over n strings O(n)No additional memory per iteration O(1)
Counting CharactersO(k) per stringO(1)counting characters for string k length O(k)fixed-sized array for 26 chars O(1)
TupleO(1)O(1)Converting fixed-size array to tuple in constant O(1)fixed-size tuple of 26 elements O(1)
Lookup in hashmapO(1)O(n * k)Lookup operation takes constant O(1)For n string keys of length k O(n * k)
Appending to listO(1)O(n * k)Appending operation takes constant O(1)Hashmap of n keys to list of strings k length O(n * k)
OverallO(n * k)O(n * k)Iteration and character counting dominates with O(n * k)Memory allocation for n string keys with for list of strings k length dominates with O(n * k)

Solution 2: Array of 26 to String Key - Hashmap/Grouping by Criteria

    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:

        # Note: using a list to build the string key is more efficient than repeatedly appending to a string, 
        # as it avoids the overhead of creating new string objects on each append
        # list takes O(m) time vs string append which takes O(m^2)

        # space complexity: stores n tuple keys O(n) and lists of original k strings O(k), O(n * k)
        anaGroup = {}

        # time complexity: iterating over n strings O(n)
        for word in strs:

            # space complexity: fixed-sized array of 26 lowercase characters O(1)
            charCount = [0] * 26
            
            # time complexity: counting characters for string k length O(k)
            for char in word:
                charCount[ord(char) - ord('a')] += 1
            
            # time complexity: creating unique char count key from fixed-sized array of 26 O(1)
            # Using a list to build the key
            key_parts = []
            for count in charCount:
                key_parts.append(str(count))
                key_parts.append("#")  # Separator

            # Concatenate once at the end
            key = ''.join(key_parts)

            # time complexity: lookup operation for key O(1)
            if key not in anaGroup:
                anaGroup[key] = []  # Initialize list for key if doesn't exist
            anaGroup[key].append(word)  # Add word to corresponding key group

        # overall: time complexity
        # overall: space complexity 
        return list(anaGroup.values())
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iterating over stringsO(n)O(1)Iterating over n strings O(n)No additional memory per iteration O(1)
Counting CharactersO(k) per stringO(1)Counting characters for string k lengthFixed-sized character count array of 26 lowercase letters O(1)
Key GeneratorO(1)O(1)Iterating over fixed-sized 26 character array to generate key O(1)Unique string key generated O(1)
Lookup in hashmapO(1)O(n * k)Lookup operation in hashmaps takes constant O(1)Hashmap of n string keys to list of strings k length O(n * k)
Appending to listO(1)O(1)Appending operation to list takes constant O(1)No additional memory required for append O(1)
OverallO(n * k)O(n * k)Iterating over n strings and counting characters for strings k length dominates O(n * k)Hashmap of n string keys with lists of strings k length dominates O(n * k)

347. Top K Elements in List ::3:: - Medium

Topics: Array, Hash Table, Divide and Conquer, Sorting, Heap (Priority Queue), Bucket Sort, Counting, Quickselect

Intro

Given an integer array nums and an integer k, return the k most frequent element within the array. Test cases are generated such that the answer is always unique. You may return the output in any order

InputkOutput
[1,2,2,3,3,3,3]2[2,3]
[7,7]1[7]

Constraints:

Most Occurrences: 1 ≤ k ≤ number of distinct elements in nums

Integer value: -1000 ≤ nums[i] ≤ 1000

Abstraction

To find the k most frequent elements, we must first create an occurrence counter for each element in the list. Now that we have the count, we just grab the top k highest occurring elements.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Brute Force (Insertion sort)O(m2)O(n + k)Insertion sort on m unique elements dominates leadin
Hashmap -> Bucket SortO(n)O(n)
MinHeap
Divide and Conquer
Quickselect

Brute Force (Insertion sort)

    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        
        # space complexity: frequency count for all unique integers O(m)
        count = defaultdict(int)

        # time complexity: iterate over list of n integers O(n)
        for num in nums: 

            # time complexity: insertion operation takes constant O(1)
            count[num] += 1


        # time complexity: iteration for conversion to list takes O(m)
        # space complexity: list size is equal to number of unique integers O(m)
        countTuple = list(count.items()) # [(num, freq), ...]

        # time complexity: Insertion sort 
        # Best case O(m), Average O(m) = O(m^2/2), Worst O(m^2)
        for i in range(1, len(countTuple)):
            key = countTup[i][1]
            j = i - 1

            # while left element is greater than key
            while j >= 0 and countTuple[j][1] < key:
                countTuple[j + 1] = countTuple[j]
                j -= 1
            
            countTuple[j + 1] = key # place key

        # time complexity: grab k top occurring ints from frequency list O(k)        
        result = []
        for i in range(k):
            
            # grabbing int key from tuple [(int, count), ...]
            result.append(countTuple[i][0]) 

        # overall: time complexity  O(m^2) = O(n + m^2 + k)
        # overall: space complexity O(m)
        return result
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iteration + InsertionO(n)O(m)Iterating over array of n integers O(n)Hashmap with m unique elements results in O(m) memory allocation with O(n) in worst case
Conversion to listO(m)O(m)Iterating over hashmap to create list takes O(m)List of tuples stores m (int, count) pairs O(m)
Insertion sortBest O(m), O(m2) average, O(m2) worstInsertion sort on tuples of (int, count) sorting by count results in usual time complexity of insertion sortNo additional memory allocation required for in-place insertion sort O(1)
Grabbing top-k elementsO(k)O(k)Iterating over sorted list for k elements O(k)Storing top k elements into result array O(k)
OverallO(m2)O(m)Insertion sort dominates leading to O(m2)Hashmap and list for m unique elements dominates leading to O(m)

Solution 1: MaxHeap Track N Elements - HashMap/Algorithm

    def topKFrequent(self, nums: List[int], k:int) -> List[int]:

        # Note: A heap only guarantees that the root is the max (max heap) or min (min heap).
        # The heap data structure only allows removing the root efficiently (O(log n))
        # The heap property says nothing about the relative order of leaves or internal nodes.
        # The smallest element could be anywhere in the leaves or internal nodes, not necessarily a leaf.
        # Thus for maxHeap, we need to add all of the elements to the tree
        # since we can't removed the smallest leaf as we go

        # space complexity: count for m unique elements worst case n elements O(n)
        count = defaultdict(int)

        # time complexity: iterate over list of n length O(n)
        for num in nums:
            count[num] += 1

        # space complexity: maxHeap must store all n elements O(n)
        maxHeap = []

        # time complexity: iteate over m unique elements worst case over n elements O(n)
        for num, freq in count.items():

            # time complexity: push onto heap O(log m) worst case O(log n) leading to best case O(m log m) worst case O(n log n) 
            heapq.heappush(maxHeap, (-freq, num))

        # space complexity: list of k elements worst case n elements best O(k) worst O(n)
        result = []

        # cannot iterate over array, as order is not guaranteed left to right
        # time complexity: iterate for k elements worst case n elements best O(k) worst O(n)
        for _ in range(k):

            # time complexity: pop top element O(log n) for k elements, worst case O(log n) for n elements, best case O(k log n) worst O(n log n)
            freq, num = heapq.heappop(maxHeap)
            result.append(num)

        # same as above
        # res = [num for _, num in (heapq.heappop(maxHeap) for _ in range(k))]

        # overall: time complexity O(n log n)
        # overall: space complexity O(n)
        return result
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Counting FrequenciesO(n)O(n)iterate trough input array of n length O(n)Memory allocation for frequency count for n unique elements O(n)
Building MaxHeapO(n log n)O(n)Push n elements O(n) using bubbleUp O(log n) leading to O(n log n)Memory allocation for array representing heap O(n)
Extracting K elementsbest: O(k log n) worst: O(n log n)best: O(k) worst O(n)Pop k elements O(k) using bubbleDown O(log n) leading to O(k log n)Memory allocation to store k elements O(k)
overallO(n log n)O(n)Heap construction dominates leading to O(n log n)Memory allocation for hashmap and heap dominate leading to O(n)

Solution 2: MinHeap Track K Elements - HashMap/Algorithm

    def topKFrequent(self, nums: List[int], k:int) -> List[int]:

        # Note: A heap only guarantees that the root is the min (min heap) or max (max heap).
        # The heap data structure only allows removing the root efficiently (O(log n))
        # The heap property says nothing about the relative order of leaves or internal nodes.
        # The largest element could be anywhere in the leaves or internal nodes, not necessarily a leaf.
        # Thus for a minHeap, the root always holds the smallest element.
        # If we only remove the smallest element when heap as exceeded size k, 
        # the heap will always contains the k largest frequencies seen so far.

        # space complexity: count for m unique elements worst case n elements O(n)
        count = defaultdict(int)

        # time complexity: iterate over list of n length O(n)
        for num in nums: 
            count[num] += 1

        # space complexity: minHeap tracks k elements O(k)
        minHeap = []

        # time complexity: worst case iterate over n unique elements O(n)
        for num, freq in count.items(): 

            # time complexity: push onto heap O(log n) for n elements, leading to O(n log n) 
            heapq.heappush(minHeap, (freq, num)) 
            
            # if heap grows to size k + 1
            if len(minHeap) > k:

                # pop smallest element, heap will be back to size k
                # time complexity: # pop smallest O(log k) for worst case n elements, leading to O(n log k)
                heapq.heappop(minHeap) 
        
        # can iterate over heap array as only holding top k elements
        # time complexity: grab all k elements from minHeap worst case grab n elements O(n)
        result = [num for freq, num in minHeap]

        # for _, num in minHeap:
        #    result.append(num) 
        # same as above

        # overall: time complexity O(n log n)
        # overall: space complexity O(n)
        return result
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Counting FrequenciesO(n)O(n)iterate trough input array of n length O(n)Memory allocation for frequency count for n unique elements O(n)
Building MinHeapbest: O(n log k) worst: O(n log n)best: O(k) worst: O(n)Iterates over list of n length O(n) and performs bubbleUp on heap with height k O(log k), leading to O(n log k)Memory allocation to represent heap of k length O(k)
Extracting top kbest: O(k) worst: O(n)best: O(k) worst: O(n)Iterate over heap for k elements O(k)Memory allocation to store top k elements O(k)
Overallbest: O(n log k) worst: O(n log n)O(n)Worst case iterating over count for n unique elements O(n) and pushing values to heap to track n values O(log n), leading to O(n log n)Memory allocation to track in worst case n unique element counts O(n)

Solution 3: Descending QuickSelect BinarySearch - Hashmap/Algorithm

    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
    
        # Note: QuickSelect is a modified quicksort
        # Once the finalPartitionMarker is placed in its correct position, 
        # elements to the left and right are guaranteed to be smaller/larger
        # This avoids sorting the entire array and isolates the elements we need

        # In-place partition of array based on pivot value
        def partitionSection(left, right, randPivotElemIndex):
            
            # time complexity: partition processes m elements, n in worst case O(n)
            # Frequency of pivot element
            pivotElemFreq = frequency[unique[randPivotElemIndex]]

            # Move pivot to end
            # tuple unpacking in python, allows to swap two variables in a single line
            unique[randPivotElemIndex], unique[right] = unique[right], unique[randPivotElemIndex]
            
            # Index to partition larger elements to the left side of the section
            partitionIndex = left

            # Partition elements with freq more than pivotElemFreq to left
            # time complexity: iterates over unique elements between left and right segment O(m)
            for i in range(left, right):

                # Higher frequency -> "greater"
                if pivotElemFreq < frequency[unique[i]]:  

                    # swap larger element to left, sorting high to low
                    unique[i], unique[partitionIndex] = unique[partitionIndex], unique[i]
                    partitionIndex += 1

            # swap pivot to its final correct partitioned index
            # now elements to the right and left follow the conditions according to pivot element
            unique[partitionIndex], unique[right] = unique[right], unique[partitionIndex]
            return partitionIndex

        # time complexity: average recursion depth O(log m)
        # space complexity: recursion stack for in place partitioning on average O(log m)
        def quickSelectBinarySearchHelper(left, right, finalPartitionMarker):
            
            # Base Case: see 'For descending' below
            if left == right:
                return

            # Randomly choose a pivot index
            # differs from QuickSort "median-of-three" approach
            # random pivot is focused on finding the k-th smallest or largest
            # rather than fully sorting the array, random pivot avoids degrading to worst case O(n^2)
            randPivotElemIndex = random.randint(left, right)

            # Partition the array
            resultPivotElemIndex = partitionSection(left, right, randPivotElemIndex)
            
            # Base Case: see 'For descending' below
            if finalPartitionMarker == resultPivotElemIndex:
                return 
            
            # Binary Search Modification:
            # Selects next partition pivot
            # Recursively QuickSelect on partitioned section (left or right)
            # where finalPartitionMarker belongs to

            # finalPartitionMarker is in the left partition of resultPivotElemIndex
            elif finalPartitionMarker < resultPivotElemIndex:
                quickSelectBinarySearchHelper(left, resultPivotElemIndex - 1, finalPartitionMarker)
            
            # finalPartitionMarker is in the right partition of resultPivotElemIndex
            else:
                quickSelectBinarySearchHelper(resultPivotElemIndex + 1, right, finalPartitionMarker)

        # time complexity: iterate over list of n length O(n)
        frequency = defaultdict(int)
        for num in nums:
            frequency[num] += 1

        # QuickSelect BinarySearch to find the k most frequent elements
        unique = list(frequency.keys())
        n = len(unique)
        finalPartition = k - 1
        l, r = 0, n - 1

        # For descending:
        # say we have a list of 6 elements, we are trying to grab the largest 2 elements
        # if we do 2 - 1 we get 1.
        # so if we find we have partitioned and set index 1
        # we know that the elements 0 and 1 are the 2 largest elements
        # so we splice [:2] = [0, 1]
        quickSelectBinarySearchHelper(l, r, finalPartition)

        # overall: time complexity
        # overall: space complexity 
        return unique[:k]

Solution 4: Ascending QuickSelect BinarySearch - Hashmap/Algorithm

    def topKFrequent(self, nums: List[int], k: int) -> List[int]:

        # Note: QuickSelect is a modified quicksort
        # Once the finalPartitionMarker is placed in its correct position, 
        # elements to the left and right are guaranteed to be smaller/larger
        # This avoids sorting the entire array and isolates the elements we need

        # In-place partition of array based on pivot frequency (low to high sorting)
        def partitionSection(left, right, randPivotElemIndex):
            
            # Frequency of pivot element
            pivotElemFreq = frequency[unique[randPivotElemIndex]]

            # Move pivot to end
            # tuple unpacking in python, allows to swap two variables in a single line
            unique[randPivotElemIndex], unique[right] = unique[right], unique[randPivotElemIndex]
            
            # Index to place smaller elements
            partitionIndex = left

            # Partition elements with freq less than pivotElemFreq to left
            # time complexity: iterates over unique elements between left and right segment O(m)
            for i in range(left, right):

                # Lower frequency -> "lower"
                if frequency[unique[i]] < pivotElemFreq:

                    # swap larger element
                    unique[i], unique[partitionIndex] = unique[partitionIndex], unique[i]
                    partitionIndex += 1

            # swap pivot to its final correct partitioned index
            # now elements to the right and left follow the conditions according to pivot element
            unique[partitionIndex], unique[right] = unique[right], unique[partitionIndex]
            return partitionIndex

        # QuickSelect helper for low to high sorting
        # Avg recursion depth: O(log m), space: O(log m)
        def quickSelectBinarySearchHelper(left, right, finalPartitionMarker):
            
            # Base Case: see 'For ascending' below
            if left == right:
                return

            # Randomly choose a pivot index
            # differs from QuickSort "median-of-three" approach
            # random pivot is focused on finding the k-th smallest or largest
            # rather than fully sorting the array, random pivot avoids degrading to worst case O(n^2)
            partitionElemIndex = random.randint(left, right)
            
            # Partition the array
            finalPartitionElemIndex = partitionPivot(left, right, partitionElemIndex)

            # Base Case: see 'For ascending' below
            if finalPartitionMarker == finalPartitionElemIndex:
                return

            # Binary Search Modification:
            # Selects next partition pivot
            # Recursively QuickSelect on partitioned section (left or right)
            # where finalPartitionMarker belongs to

            # finalPartitionMarker is in the left partition of resultPivotElemIndex
            elif finalPartitionMarker < finalPartitionElemIndex:
                quickSelectBinarySearchHelper(left, finalPartitionElemIndex - 1, finalPartitionMarker)
            
            # finalPartitionMarker is in the right partition of resultPivotElemIndex
            else:
                quickSelectBinarySearchHelper(finalPartitionElemIndex + 1, right, finalPartitionMarker)

        # time complexity: iterate over list of n length O(n)        
        frequency = defaultdict(int)
        for num in nums:
            frequency[num] += 1

        # Quickselect to find the k most frequent elements
        unique = list(frequency.keys())
        n = len(unique)
        finalPartition = n - k
        l, r = 0, n - 1

        # For ascending:
        # say we have a list of 6 elements, we are trying to grab the largest 2 elements
        # if we do 6 - 2 we get 4.
        # so if we find we have partitioned and set index 4
        # we know that the elements 4 and 5 are the 2 largest elements
        # so we splice [6-2:] = [4:] = [4, 5]
        # quickSelectBinarySearchHelper(left, right, finalPartitionMarker)
        quickSelectBinarySearchHelper(l, r, finalPartition)

        # overall: time complexity
        # overall: space complexity
        return unique[n - k:]
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Frequency CountO(n)O(n)Iterate over list of n length O(n)Frequency count for at most n keys O(n)
Unique listO(m)O(m)Grab unique keys from frequency map O(m)Unique list contains m unique elements O(m)
Single PartitionO(m)O(1)Iterates over m unique elements in portion for a single partition O(m)No additional memory allocation for in-place partition O(1)
QuickselectO(m log m)O(log m)Recursion depth of O(log m) for O(m) elements per level O(m log m)Recursion stack depth of O(log m)
OverallO(n + m log m)O(m)Iterating over list of n length competes with Quickselect m number of unique elements O(m log m), depending if m << n or m ≈ n, leading to O(n + m log m)Frequency count dominates leading to O(m)

Solution 5: BucketSort by Count - Hashmap/Algorithm

    def topKFrequent(self, nums: List[int], k:int) -> List[int]:
            
        # space complexity: frequency count for unique integers O(m) 
        # time complexity: iterate over list of n integers O(n)
        count = defaultdict(int)
        for key in nums:

            # time complexity: insertion operation takes constant O(1)
            count[key] += 1

        # numBuckets equal to length to account for max possible frequency count
        # the case where list is full of only 1 element O(n)
        # + 1 for case of element with 1 list, need bucket of frequency 0 and bucket of frequency 1
        numBuckets = len(nums) + 1

        # below is saying: create a list of empty lists numBuckets amount of times 
        # time complexity: iterating to set empty list for each bucket
        # space complexity: creating O(n) bucket lists
        freqBuckets = [[] for i in range(numBuckets)]

        # time complexity: iterate over frequency list for m unique integer tuples (int, occurrences) O(m) 
        for int, occurrences in count.items():

            # time complexity: insert operation takes constant O(1)
            freqBuckets[occurrences].append(int)

        # space complexity: grabbing top k integers O(k)
        res = []

        # time complexity: iterate over n buckets O(n)
        for i in range(len(freqBuckets) - 1, 0, -1):
            
            # time complexity: iterate over all entries in current bucket O(n)
            for num in freqBuckets[i]:
                
                # time complexity: insert operation takes constant O(1)
                res.append(num)
                
                # time complexity: continue while less than k integers have been grabbed O(k)
                if len(res) == k:
                    return res

        # overall: time complexity O(n)
        # overall: space complexity O(n)
        return res
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iteration + InsertionO(n)O(m)Iteration over array of n integers O(n)Hashmap for m unique integers O(m)
Bucket InitializationO(n)O(n)Iterating over list length to create n empty buckets O(n)At least n buckets needed for worst case scenario of list filled with only 1 integer O(n)
Bucket PopulationO(m)O(m)Inserting m unique integers into their respective bucketsBuckets will store m unique integers in their respective buckets O(m)
Result ExtractionO(n)O(k)Iterating over n buckets to collect k integers, worst case all buckets must be traverse O(n)Top k integers must be stored in result list O(k)
OverallO(n)O(n)Iterating over n buckets dominates leading to O(n)Initializing n buckets dominates leading to O(n)

238. Product of Array Except Self ::2:: - Medium

Topics: Array, Prefix Sum

Intro

Given an integer array nums, return an array answer such that answer[i] is equal to the product of all the elements of nums except nums[i]. The product of any prefix or suffix of nums is guaranteed to fit in a 32-bit integer. You must write an algorithm that runs in O(n) time and without using the division operation.

InputOutput
[1,2,3,4][24,12,8,6]
[-1,1,0,-3,3][0,0,9,0,0]

Constraints:

The product of any prefix or suffix of nums is guaranteed to fit in a 32-bit integer

Abstraction

Given a list of nums, return a list of nums that is the product of the array except itself.

So, for some num n, the result should be the product of all the numbers to the left of it, times the product of all the numbers to the right of it:

Array:           [   2,    3,    4,    5,    6  ]

Prefix:          [   1,    2,    6,   24,  120  ]
x
Postfix:         [ 360,  120,   30,    6,    1  ]
=
Result:          [ 360,  240,  180,  120,  120  ]

To calculate the prefix, we can just iterate over the array multiplying each integer. To calculate the postfix, we can just iterate over the array backwards multiplying each integer. To calculate the result array, we can just iterate over both prefix and postfix arrays multiplying corresponding index values.

This would lead to O(n) = O(3n) for time and O(n) = O(3n) space. Now lets see how we can do it in O(1) for space.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Brute Force (all pairs comparison)O(n2)O(n)For each pair of elements, we compare them, leading to quadratic number of comparison O(n2)Memory allocation for result array O(n)
Prefix & PostfixO(n)O(n)Three iterations over the array for prefix, postfix, and result calculation O(n) = O(3n)Memory allocation for prefix, postfix, and result arrays O(n) = O(3n)
Prefix & PostfixO(n)O(1)Two iterations over the array for prefix and postfix calculations O(n)

Brute Force (all pairs comparison)

    def productExceptSelf(self, nums: List[int]) -> List[int]:

        # space complexity: result array of length n O(n)
        res = [0] * len(nums)

        # time complexity: iterate over list of n integers O(n) 
        for i in range(len(nums)):

            # starting value
            product = 1

            # time complexity: iterate over list of n integers O(n)
            for j in range(n):

                # skip current index i while multiplying
                if i != j:

                    # time complexity: multiplication operation takes constant O(1)
                    product *= nums[j]
                
            # time complexity: insert operation takes constant O(1)
            res[i] = product
        
        # overall: time complexity  O(n^2)
        # overall: space complexity O(n)
        return res
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Result arrayO(1)O(n)Constant time to instantiate array of length n O(1)Array to store n elements O(n)
Outer loopO(n)O(1)Iteration over the list takes linear time O(n)No additional memory is allocated O(1)
Inner loopO(n2)O(1)For each element in the outer loop, iterates through all remaining elements O(n2)No additional memory is allocated O(1)
OverallO(n2)O(1)The nested loops result in O(n2) time complexity. Derivation of average case is found O(n2) is found here.Memory allocation for array to store n result elements O(n)

Solution 1: Prefix and Postfix O(n) - Array/In Place Transformations

    def productExceptSelf(self, nums: List[int]) -> List[int]:
        
        # space complexity: prefix, postfix, and result arrays to calculate and store final value for n integers O(n) = O(3n)
        prefix = [1] * len(nums)
        postfix = [1] * len(nums)
        res = [1] * len(nums)

        # Compute prefix products
        # time complexity: iteration over list of n length O(n)
        for i in range(1, len(nums)):
            
            # time complexity: lookup + multiplication operations take constant O(1)
            # (prefix of i) = (prefix of i - 1) * (nums[i - 1])
            prefix[i] = prefix[i - 1] * nums[i - 1]

        # Compute postfix products
        # time complexity: iteration over list of n length O(n)
        for i in range(n - 2, -1, -1):

            # time complexity: lookup + multiplication operations take constant O(1)
            # (postfix of i) = (postfix of i + 1) * (nums[i + 1])
            postfix[i] = postfix[i + 1] * nums[i + 1]

        # Combine prefix and postfix products
        # time complexity: iteration over list of length n O(n)
        for i in range(n):

            # time complexity: lookup + multiplication operations take constant O(1)
            # (product except self of i) = (prefix of i) * (postfix of i)
            res[i] = prefix[i] * postfix[i]


        # overall: time complexity O(n)
        # overall: space complexity O(n)
        return res
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Prefix ComputationO(n)O(n)Iterate over list of n length O(n)Memory allocation for prefix array of n length O(n)
Postfix ComputationO(n)O(n)Iterate over list of n length O(n)Memory allocation for postfix array of n length O(n)
Product except self computationO(n)O(n)Iterate over prefix/postfix list of n length O(n)Memory allocation for result array of n length O(n)
OverallO(n)O(n)Iteration over list of n length takes linear time leading to O(n)Memory allocation for list of n length leading to O(n)

Solution 2: Prefix and Postfix Optimal O(1) - Array/In Place Transformations

    def productExceptSelf(self, nums: List[int]) -> List[int]:
        
        # instantiate to 1, start prefix calculation at res[1], prefix of res[0] is always 1
        # space complexity: array to store results for n integers O(n) 
        res = [1] * len(nums)

        # Compute prefix products in res
        # time complexity: iterate over list of size n O(n)
        for i in range(1, len(nums)):

            # time complexity:
            # (prefix of i) = (prefix of i - 1) * (num at i - 1)
            res[i] = res[i - 1] * nums[i - 1]

        # accumulate running postfix through reverse iteration
        # postfix starts at 1, the postfix of len(n) - 1
        postfix = 1

        # start iteration at (len(n)-1), postfix of last integer will always be 1
        # time complexity: iterate over list of size n in reverse O(n)
        for i in range(len(nums) - 1, -1, -1):
            
            # (product except i) = (prefix of i) * (postfix of i)
            res[i] *= postfix
            
            # (postfix of i - 1) = (postfix of i) * (num at i) 
            postfix *= nums[i]

        # overall: time complexity O(n)
        # overall: space complexity O(1)
        return res
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Prefix ComputationO(n)O(1)Iteration over list of n length O(n)Prefix value stored directly in result array O(1)
Postfix Computation + Product except self computationIteration over list of n length O(n)Postfix value stored in integer, product except self stored in result array O(1)
OverallO(n)O(1)Iteration over list of n length O(n)Prefix, postfix, and product except self all stored in result array or an integer leading to constant O(1)

36. Valid Sudoku ::2:: - Medium

Topics: Array, Hash Table, Matrix

Intro

Determine if a 9 x 9 Sudoku board is valid. Only the filled cells need to be validated according to the following rules: Each row must contain the digits 1-9 without repetition. Each column must contain the digits 1-9 without repetition. Each of the nine 3 x 3 sub-boxes of the grid must contain the digits 1-9 without repetition.

Constraints:

A Sudoku board that is partially filled could be valid, but not necessarily solvable. Only the currently filled cells need to be validated.

InputOutput
a sudoku table is too big to put here loljust look at -> sudoku board example

Abstraction

With sudoku, valid boards , we have rows are looking for duplicates much like the original 217. Contains Duplicates.

The only difference here, is that we are looking for duplicates in 3 sets along the rows, columns, and 3x3 squares.

So we simply create sets for all of these groups and as we traverse the board, we check for duplicates

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Brute ForceO(n2) = O(3 * n2)
HashmapO(n2)O(n)

Brute Force (check row, column, box)

    def isValidSudoku(self, board: List[List[str]]) -> bool:
        
        # time complexity: iterating over row, col, or box (flattened) of n length (+ O(n)) not (* O(n))
        def isValidLine(line: List[str]) -> bool:
            seen = set()
            for num in line:
                if num != ".":
                    if num in seen:
                        return False
                    seen.add(num)
            return True

        # time complexity: iterating over board with n rows of length n and validating O(n^2)
        for row in board:
            if not isValidLine(row):
                return False

        # time complexity: iterating over board with n columns of length n and validating O(n^2)
        for c in range(9):
            column = []
            for r in range(9):
                column.append(board[r][c])
            if not isValidLine(column):
                return False

        # time complexity: iterating over n boxes of flattened length n <- (√n x √n) and validating O(n^2)
        for box_row in range(0, 9, 3):
            for box_col in range(0, 9, 3):
                line = []
                for r in range(box_row, box_row + 3):
                    for c in range(box_col, box_col + 3):
                        line.append(board[r][c])

                if not isValidLine(line):
                    return False

        # overall: time complexity O(n^2) = (3 * n^2)
        # overall: space complexity O(1)
        return True

Solution 1: Defaultdict Matrix - Hashmap/Representation

    def isValidSudoku(self, board: List[List[str]]) -> bool:
        
        # space complexity: hashsets for rows, columns, and flattened boxes of n length O(n) = O(3n)
        cols = defaultdict(set)
        rows = defaultdict(set)
        grids = defaultdict(set)

        # time complexity: iterating over all cells (r * c) O(n^2)
        for r in range(9):
            for c in range(9):
                
                tmp = board[r][c]
                if tmp != ".":

                    # indexing box sets by tuple key (r/3, c/3)
                    gridTuple = (r // 3, c // 3)

                    # time complexity: lookup operation for element to corresponding sets in constant O(1)
                    if (tmp in rows[r] or 
                        tmp in cols[c] or 
                        tmp in grids[gridTuple] ):
                        return False
                    
                    # time complexity: insert operation for new element to corresponding sets in constant O(1)
                    cols[c].add(tmp)
                    rows[r].add(tmp)
                    grids[gridTuple].add(tmp)

        # overall: time complexity O(n^2)
        # overall: space complexity O(n)
        return True

Solution 2: [[]] Matrix - Hashmap/Representation

    def isValidSudoku(self, board: List[List[str]]) -> bool:
        
        # space complexity: hashsets for rows, columns, and flattened boxes of n length O(n) = O(3n)
        rows = [[], [], [], [], [], [], [], [], []]
        col = [[], [], [], [], [], [], [], [], []]
        grids = [[], [], [], [], [], [], [], [], []]

        # time complexity: iterating over all cells (r * c) O(n^2)
        for r in range(9):
            for c in range(9):
                
                tmp = board[r][c]
                if tmp != ".":  

                    # indexing box sets by unique int calculation 
                    #  (j//3) = 0, 1, 2
                    #  (3 * (i//3)) = 0, 3, 6
                    #    0 1 2
                    # 0  0 1 2
                    # 3  3 4 5
                    # 6  6 7 8
                    # leads to 0-8 index indexing
                    gridKey = (3 * (i//3)) + (j//3)

                    # time complexity: lookup operation for element to corresponding sets in constant O(1)
                    if (tmp in rows[r] or 
                        tmp in col[c]  or 
                        tmp in grids[gridKey]):
                        return False

                    # space complexity: lookup operation for new element to corresponding sets in constant O(1)
                    col[c].append(tmp)
                    rows[r].append(tmp)
                    grids[gridKey].append(tmp)
        
        # overall: time complexity O(n^2)
        # overall: space complexity  O(n)
        return True
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Row, Col, Box InitializationO(1)O(n2)Memory allocation for n rows, cols, boxs, of n length O(n2)
Iterating over all cellsO(n2)O(1)Iterating over n x n board O(n2)No additional memory allocation for iteration O(1)
LookupO(1)O(1)Lookup operation takes constant O(1)No additional memory allocation for lookups O(1)
InsertionO(1)O(n2)Insertion operation takes constant O(1)List of n lists each of n length O(n2)
OverallO(n2)O(n2)Iterating over all cells in n x n board dominates, leading to O(n2)List of n lists of n length dominates leading to O(n2)

Note: Solution 1.2 is faster than Solution 1:

While sets are efficient for large data due to constant time insert, lookup, delete, for a fixed-sized 9x9 board like this, the overhead of defaultdict and set operations is slower than the [[]] solution.

128. Longest Consecutive Sequence ::3:: - Medium

Topics: Array, Hash Table, Union Find

Intro

Given an array of integers nums, return the length of the longest consecutive sequence of elements. A consecutive sequence is a sequence of elements in which each element is exactly 1 greater than the previous element You must wrtie an algorithm that runs in O(n) time.

InputOutput
[2, 20, 4, 10, 3, 4, 5]4
[0, 3, 2, 5, 4, 6, 1, 1]7

Constraints:

Abstraction

Give a list of integers, find the longest sequence of increasing integers A longest consecutive sequence on a number line would look like this:

Sequence of length 4:     [ 1,  2,  3,  4,  2  ]
Sequence of length 3:     [ 3,  5,  9,  10, 4  ]

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Brute Force
Hashmap

Brute Force

    def longestConsecutive(nums: List[int]) -> int:
        
        longest = 0

        # time complexity: Iterate over list of n length O(n)
        for num in nums:

            currLen = 1
            
            # time complexity: Iterate over potential n consecutive integers O(n)
            # time complexity: Lookup in list takes O(n)
            # time complexity: O(n^2)
            while num + currLen in nums:
                currLen += 1

            # compare to global max
            longest = max(longest, currLen)

        # Overall time complexity: O(n^3) for worst-case where we scan sequences n times for n numbers
        # Space complexity: O(1)
        return longest
AspectTime ComplexitySpace ComplexityTime RemarkSpace Remark
IterateO(n)O(1)Iterate over list of n length O(n)No additional memory allocation for iteration O(1)
Iterate consecutive sequenceO(n)O(1)Iterate over potential n consecutive integers O(n)No additional memory allocation for iteration O(1)
List lookupO(n)O(1)Lookup in list takes O(n)No additional memory allocation for list lookup O(1)
OverallO(n3)O(1)Iterating * Iterating * Lookup dominate leading to O(n3)No additional memory allocation O(1)

Solution 1: HashMap Continent Boundaries - HashMap/Representation

    def longestConsecutive(nums: List[int]) -> int:
        
        # space complexity: hashmap of list of n length O(n)
        seqLen = defaultdict(int)
        longest = 0

        # Handle duplicates
        nums = set(nums)  

        # Iterate over unique numbers
        for num in nums:
            # Skip if num is already part of an existing sequence
            # Fetch lengths of neighboring sequences
            leftContinentLenFromBoundary = seqLen[num - 1]
            rightContinentLenFromBoundary = seqLen[num + 1]

            # Calculate new sequence length
            bridgedLen = 1 + leftContinentLenFromBoundary + rightContinentLenFromBoundary

            # Update new continent boundaries
            seqLen[num - leftContinentLenFromBoundary] = bridgedLen
            seqLen[num + rightContinentLenFromBoundary] = bridgedLen

            # Update the global max
            longest = max(longest, bridgedLen)

        return longest
AspectTime ComplexitySpace ComplexityTime RemarkSpace Remark
Iteration + InsertionO(n)O(n)Iterating over list of n length O(n)Memory allocation for sequence length for n integers O(n)
LookupsO(1)O(1)Lookup operation takes constant O(1)No additional memory allocation needed for lookup O(1)
Updating sequence countO(1)O(1)Boundary updates for left most, curr, and right most in constant O(1)Insert operation takes constant O(1)
OverallO(n)O(n)Iterating over list of n length dominates O(n)Memory allocation for n sequences O(n)

Solution 2: Set Rummy Run - Hashmap/Representation

def longestConsecutive(self, nums: List[int]) -> int: 
    
    # space complexity: set for list of n length O(n)
    numSet = set(nums)
    longest = 0

    # time complexity: iterate over list of n length O(n)
    for num in numSet:

        # found start of a run
        # time complexity: lookup operation takes constant O(1)
        if (num - 1) not in numSet: 

            # rummy run until missing an element
            currLen = 1
            # time complexity: iterate over m sequence worst case n sequence O(n), lookup O(1), leading to O(n)
            while (num + currLen) in numSet:
                currLen += 1 

            # validate with global max sequence
            longest = max(longest, currLen)

    # overall: time complexity O(n)
    # overall: space complexity O(n)
    return longest        
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
HashsetO(n)O(n)Converting list to set uses insert operation O(1) for n linear O(n)Memory allocation for list of n length O(n)
Outer LoopO(n)O(1)Iterating over set of n length O(n)No additional memory allocation for iteration
Start of sequence checkO(1)O(1)Lookup operation takes constant O(1)No additional space of lookup operation O(1)
while loop sequenceO(n)O(1)Iterate white sequence is valid, note all numbers are checked only once due to start of sequence check O(n)No additional memory allocation for while loop O(1)
OverallO(n)O(n)Iterating over hashset of n length dominates leading to O(n)Memory allocation for hashset of list n length dominates o(n)

Solution 3: Union Find Tree - HashMap/Algorithm

    def longestConsecutive(self, nums: List[int]) -> int:
        
        if len(nums) == 0:
            return 0

        # Union-Find Initialization:
        
        # Stores parent of each number
        # space complexity: hashmap of size n O(n)
        parent = {}  

        # Stores the size of connected components
        # space complexity: hashmap of size n O(n)
        size = {}    

        # Find operation with path compression
        # time complexity: Amortized O(α(n)) per operation
        def find(x):

            # if the parent of x is not itself, we have not reached the representative  
            if parent[x] != x:

                # Path compression
                # set parent of curr, to parent of parent
                parent[x] = find(parent[x])

            # return representative of group
            # either return self
            # or return parent of parent
            return parent[x]

        # Union operation with size optimization
        # time complexity: Amortized O(α(n)) per operation
        def union(x, y):

            # grab representatives of both x and y
            rootX = find(x)
            rootY = find(y)

            # trees are not already connected
            if rootX != rootY:

                # grab tree sizes
                xSize = size[rootX]
                ySize = size[rootY]

                # x tree is larger
                if ySize < xSize:
                    # attach smaller tree to larger tree
                    parent[rootY] = rootX
                    # add smaller tree size to larger tree
                    size[rootX] += size[rootY]

                # y tree is larger
                else:
                    # attach smaller tree to larger tree
                    parent[rootX] = rootY
                    # add smaller tree size to larger tree
                    size[rootY] += size[rootX]

        # Initialize Union-Find structure: 
        # Set all parents to self
        # time complexity: iterate over list of n length O(n)
        for num in nums:
            if num not in parent:
                parent[num] = num
                size[num] = 1

        # Join consecutive numbers: 
        # Union operation on sequences
        # time complexity: iterate over list of n length O(n)
        for num in nums:
            if num + 1 in parent:
                union(num, num + 1)
        
        # return largest group

        # overall: time complexity O(n)
        # overall: space complexity O(n)
        return max(size.values())
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
InitializationO(n)O(n)Insert n elements into parent and size hashmaps O(n)Memory allocation for parent and size for n elements O(n)
Union-Find OperationsAmortized O(α(n))O(1)α(n) is nearly constant for realistic input sizesNo additional memory allocation for Union-Find operations O(1)
Join OperationO(n)O(1)
Grab LargestO(n)O(1)Grab max size in list of n elements O(n)No additional memory allocation for grab
OverallO(n)O(n)Initialization dominates, leading to O(n)