Jc-alt logo
jc
data structures and algorithms

LeetCode: Arrays and Hashing

LeetCode: Arrays and Hashing
51 min read
#data structures and algorithms
Table Of Contents

Arrays & Hashing Intro:

LeetCode problems with solutions using hashmaps.

What is a Hashmap

A dictionary is just a direct mapping of keys->values:

Dictionary: ['a': 2, 'b': 2, 'c' : 3, ...]

A Hashmap is just a dictionary that maps keys->values using a hash function. Hashmaps are a common use case for hashing. We choose hashing to maximize randomness and minimize collisions between elements to keys.

Hashing With A Function

Hashing is simply a function. It takes in an input and spits out an output. Here is the function mod, which takes in an integer and spits out an integer:

  • 1 mod (3) = 1
  • 2 mod (3) = 2
  • 3 mod (3) = 0
  • 4 mod (3) = 1

Now this is could be our function for our hashmap, but it would lead to high collisions and low randomness as there are only 3 possible key results: 0, 1, and 2. So hashmaps are only efficient as the chosen hash function.

Choosing a Hash Function

With a good hash function: Insert, Lookup, and Delete take O(1). In this case, every element gets its own unique key and so a lookup using this function would be constant O(1).

With a bad hash function: Insert, Lookup, and Delete take O(n). Lets say our function is hash():

Handling Collisions

Even with a good hash function, we may run into collisions sometimes. In those cases, we handle collision using chaining with a linked list of elements.

Here, there are 5 keys or buckets we could hash to.

    hash()  ->  [key mod (5)]
    ----------------------------
    hash(10) = 10 % 5 = 0
    hash(22) = 22 % 5 = 2
    hash(31) = 31 % 5 = 1
    hash(14) = 14 % 5 = 4
    hash(17) = 17 % 5 = 2 (collision!)
Key (index)Value (element)
0[10]
1[31]
2[22, 17]
3[]
4[14]

Balancing Hash Table With Load Factor

Load factors are calculated to ensure a hash table maintains its O(1) time complexity.

Balancing occurs when a hash table has passed its set load factor.

Load Factor = n/m Where n = num of elements, m = num of keys

Thus, when certain fraction of the table size, say 75% full with 3 elements and 4 potential keys. The table must increase in size and rehash/reinsert every element.

The efficiency of a hash table decreases significantly without balancing, as this leads to increased collisions as the table fills up, which leads to time needed to resolve collisions, as we traverse through list to find the element in linear time O(n). Operations like search, insert, and delete, degrade from O(1) on average to O(n) in the worst case.

Usually, when the load factor is reached, the table size doubles to 2n. 0.75 or 75% full is a common load factor for a hash table.

Upon balancing, we need to rehash every element leading to n/2 additional space:

    New table size - current number of elements 
    (double table size * load factor) - current n elements
    (2 n * 0.75) 
    1.5 n  = New Load Factor Element Count

    New Load Factor Element Count - Current Element Count
    1.5 n - n 
    0.5
    n/2 = Additional space until next rebalance

Rehashing is expensive as we need to rehash every element with the new hash function into their new bucket, leading to O(n) for resizing and reinserting n elements.

Array Application: In place Transformations

We can perform transformations or reorderings on the array itself without using extra space.

Ex: Rotate an array to the right by k steps.

    def rotate(nums: List[int], k: int) -> None:
        
        n = len(nums)
        k %= n  # handle cases where k > n
        
        # Opposite ends reverse subarray [start, end]
        def reverse(start: int, end: int) -> None:
            
            # Reverse until hit middle of subarray
            while start < end:
                nums[start], nums[end] = nums[end], nums[start]
                start += 1
                end -= 1

        # Reverse entire array
        reverse(0, n - 1)
        # Reverse first k elements [0, k)
        reverse(0, k - 1)
        # Reverse remaining elements [k, n)
        reverse(k, n - 1)

HashMap Application: Representations

We can represent objects or data based on specific criteria.

Ex: Representing a string by character frequency

    def freqCount():

        # object
        s = "aabbcc"

        # representation
        freq = {}

        # mapping
        for char in s:
            freq[char] = freq.get(char, 0) + 1

    # freq = {'a': 2, 'b': 2, 'c': 2}

HashMap Application: Grouping by Criteria

We can group elements based on a defined criterion, such as sorting or categorization, using hashing to push values into the corresponding bucket.

Ex: Grouping string anagrams

    def groupAnagrams(strs):
        
        # groups
        anagrams = {}

        # for list
        for word in strs:

            # create key
            key = "".join(sorted(word))
            if key not in anagrams:
                anagrams[key] = []
            
            # hash key and put value into bucket
            anagrams[key].append(word)

    # anagrams = {'aet': ['eat', 'tea', 'ate'], 'ant': ['tan', 'nat'], 'abt': ['bat']}

HashMap Application: Memoization in Dynamic Programming

We can store solutions to sub problems to avoid redundant calculations.

Ex: Fibonacci number computation with memoization

    def fib(n, memo={}):
        if n <= 1:
            return n
        if n not in memo:
            memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
        return memo[n]

HashMap Application: Backtracking with Caching for Pruning

We can cache previous paths or states to prune the search space effectively as we explore.

Ex: Subset Sum with caching:

    def subsetSum(nums, target):

        result = []

        # Cache to avoid exploring previously visited paths
        cache = {}

        def dfs_backtrack(current, index, total):
            
            # Check if already explored state
            state = (tuple(current), total)
            if state in cache:
                return

            # Mark as explored
            cache[state] = True

            # Early Prune -> no potential valid path is current total exceeds target
            if total > target:
                return
            
            # Valid solution -> no need to explore further, backtrack
            if total == target:
                result.append(list(current))
                return

            # Explore -> branches from current state
            for i in range(index, len(nums)):
                
                # Build
                current.append(nums[i])
                # Explore
                dfs_backtrack(current, i + 1, total + nums[i])
                # Backtrack
                current.pop()

        dfs_backtrack([], 0, 0)
        return result

HashMap Application: Representing Relationships

We can model relationships between entities.

Ex: Adjacency list for a graph

    def graph():

        # List of edges in the graph
        edges = [(1, 2), (2, 3), (1, 3)]

        # Adjacency list
        graph = {}

        # Build the adjacency list
        for u, v in edges:

            # If the node 'u' or 'v' is not yet in the graph, initialize it
            if u not in graph:
                graph[u] = []
            if v not in graph:
                graph[v] = []

            # Add 'v' to the list of neighbors for 'u' and 'u' to 'v'
            graph[u].append(v)
            graph[v].append(u)

    # graph = {1: [2, 3], 2: [1, 3], 3: [2, 1]}

HashMap Application: Index with Data Key

We can index into arrays based on data or data structures.

Ex: Store integer complements for quick lookup:

    def twoSum(nums, target):

        # Complement -> index
        complement_map = {}

        for i, num in enumerate(nums):
            
            # Calculate complement
            complement = target - num
            
            # Lookup(complement)
            if num in complement_map:
                return [complement_map[num], i]
            
            # Put(complement)
            complement_map[complement] = i

        return []

HashMap Application: Algorithm

There are cases where problem that seems to require a hashmap have an existing algorithm made for that problem.

Ex: Boyer Moore Voting Algorithm

def majorityElement(nums: List[int]) -> int:
    
    # Find element that appears more then floor(n/2) times

    # votes for candidate
    count = 0

    # current candidate
    candidate = None

    for num in nums:
        
        # Reset candidate
        if count == 0:
            candidate = num

        if num == candidate:
            count += 1
        else:
            count -= 1

    # Confirm the candidate ( optional if majority element is guaranteed)
    return candidate

217. Contains Duplicate ::4:: - Easy

Topics: Array, Hash Table, Sorting

Intro

Given an integer array nums, return true if any value appears at least twice in the array, return false if every element is distinct.

Example InputOutput
nums = [1,2,3,1]true
nums = [1,2,3,4]false
nums = [1,1,1,3,3,4,3,2,4,2]true

Constraints:

1 ≤ nums.length ≤ 105

-109 ≤ nums[i] ≤ 109

Abstraction

Given a list of elements, check for duplicates.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
HashmapO(n)O(n)Iterate over array O(n) + get, put in O(1)Dictionary size relative to input O(n)
SetO(n)O(n)Iterate over array O(n) + get, in in O(1)Set size relative to input O(n)
Sort + IterateO(n log n)O(1)TimSort over array with Worst/Avg: O(n log n), Best: O(n)No additional memory allocation O(1)

Solution 1: Hashmap [TC Opt] - Hashmap/Representation

    def containsDuplicate(self, nums: List[int]) -> bool:

        # sc: hashmap relative to input O(n)
        count = defaultdict(int)

        # tc: iterate over list O(n)
        for num in nums:

            # tc: in operation constant O(1)
            if count[num] >= 1:
                return True
            # tc: put operation constant O(1)
            count[num] += 1

        # overall: tc O(n) 
        # overall: sc O(n)
        return False
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
IterationO(n)O(n)Iterate over array in linear O(n) + get, put in O(1)Dictionary relative to input O(n)
OverallO(n)O(n)Iteration over input dominates, O(n)Allocation for dictionary dominates, O(n)

Solution 2: Seen Set [TC Opt] - Hashmap/Representation

    def containsDuplicate(self, nums: List[int]) -> bool:
        
        # sc: set relative to input O(n)
        seen = set()

        # tc: iterate over list O(n)
        for n in nums:

            # tc: in operation O(1)
            if n in seen:
                return True
            # tc: add operation O(1)
            seen.add(n)
            
        # overall: tc O(n)
        # overall: sc O(n)
        return False
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
IterationO(n)O(n)Iteration over list in linear O(n) + in, add operation O(1)Set size relative to input O(n)
OverallO(n)O(n)Iteration over list dominates, O(n)Allocation relative to input dominates, O(n)

Solution 3: Unique Set Length Comparison - Hashmap/Representation

    def containsDuplicate(self, nums: List[int]) -> bool:
        
        # Create unique set
        # tc: iterate over original n list O(n)
        # sc: set relative to input O(n)
        unique = set(nums)

        # Compare if unique set is different than original list
        # tc: length operation constant O(1)
        res = len(unique) != len(nums)
            
        # overall: tc O(n)
        # overall: sc O(n)
        return res
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
SetO(n)O(n)Creating new set requires iteration over list in linear O(n) + in, add operation O(1)Set size relative to input O(n)
OverallO(n)O(n)Iteration over list dominates, O(n)Allocation relative to input dominates, O(n)

Solution 4: Sort Iterate Comparison [SC Opt] - Hashmap/Representation

    def containsDuplicate(self, nums: List[int]) -> bool:
        
        # tc: python (2.3-3.10) TimSort, Python (3.11+) PowerSort Worst/Avg: O(n log n), Best: O(n)
        # sc: sorts in place, no extra memory O(1)
        nums.sort()

        # tc: iterate over list O(n)
        for i in range(1, len(nums)):

            # tc: comparison operation O(1)
            if nums[i] == nums[i - 1]
                return True

        # overall: tc O(n log n)
        # overall: sc O(1)
        return False
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
SortingO(n log n)O(1)TimSort over array with Worst/Avg: O(n log n), Best O(n)In place sorting, no extra memory O(1)
IterationO(n)O(1)Iteration over list in linear O(n) + (in, add) operation O(1)No additional memory allocation O(1)
OverallO(n)O(n)Iteration over list dominates, O(n)In place sorting, no additional memory allocation O(1)

242. Valid Anagram ::4:: - Easy

Topics: Hash Table, String, Sorting

Intro

Given two strings s and t, return true if t is an anagram of s, and false otherwise. An Anagram is a word or phrase formed by rearranging the letters of a different word using all original letters exactly once.

Example Input 'S'Example Input 'T'Output
"anagram""nagaram"true
"rat""car"false

Constraints:

1 ≤ s.length, t.length ≤ 5 * 104

s and t consist of lowercase English letters

Abstraction

Given two strings determine if they have the same character count.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
HashmapO(n)O(1)Two iterations over each word, linear O(n) + lookup, put in constant time O(1)Dictionary will be proportional to input, space O(n)
Array of 26O(n)O(1)Single iteration over both words at same time O(n) + lookup, put in constant time O(1)Array will be constant 26 elements, space O(1)

Solution 1: Hashmap Foreach Double Pass [TC Opt] - Hashmap/Representation

    def isAnagram(self, s: str, t: str) -> bool:
        
        # sc: hashmap of 26 elements O(1) 
        count = defaultdict(int)

        # tc: two iteration over string length O(n) * 2 ~= O(n)
        for x in s:
           count[x] += 1
        for x in t:
           count[x] -= 1

        # tc: iteration over char count O(n)
        for value in count.values():

            # tc: get() in constant O(1)
            if value != 0:
                return False

        # overall: tc O(n)
        # overall: sc O(1)
        return True
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
IterationO(n)O(n)Iterate over two strings in linear O(n) + put, get in O(1)Dictionary relative to input O(n)
VerificationO(n)O(1)Iterating over char count in linear O(n) + get in O(1)No additional memory O(1)
OverallO(n)O(n)Iteration over input dominates, O(n)Allocation for dictionary relative to input dominates, O(n)

Solution 2: Array of 26 Indexing Single Pass [SC Opt] - Hashmap/Representation

    def isAnagram(self, s: str, t: str) -> bool:

        # Note:
        # ord() converts unicode char to int representation
        # and use int to index into array

        # sc: array of 26 constant O(1)
        count = [0] * 26

        # Check: mismatch length
        if len(s) != len(t):
            return False

        # tc: single iterate over both strings at same time O(n)
        for i in range(len(s)):
            count[ord(s[i]) - ord('a')] += 1
            count[ord(t[i]) - ord('a')] -= 1

        # tc: iterate over count array length 26 O(1)
        for value in count:
            if value != 0:
                return False

        # overall: tc O(n)
        # overall: sc O(1)
        return True
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
IterationO(n)O(1)One iteration over both strings in linear O(n) + put, get in O(1)Array of constant 26 size O(1)
VerificationO(n)O(1)Iterating over count array length 26 in constant O(1)No additional memory O(1)
OverallO(n)O(1)Iteration over input dominates, O(n)Allocation for array is constant O(1)

Solution 3: [Follow up] Unicode Extension Normalization For Hashmap Foreach Double Pass - Hashmap/Representation

    def isAnagram(self, s: str, t: str) -> bool:
        
        # Note:
        # Why Normalization is required in real life?
        # Because Unicode allows characters to be represented in multiple valid ways.
        # Why? Doesn't that make it more complicated?
        # Yes, but the alternative would be needing millions of characters:
        # Some languages have hundred of diacritics (e.g, dots, tildes, strokes) 
        # that can be superimposed to create the letters for their alphabet.
        # When Unicode was created, it had to support existing encodings (Latin-1, MacRoman, Windows-1252)
        # which have pre-composed/defined characters (é, ñ, ü) as well as other encodings as mentioned above 
        # with use diacritics.
        # The solution is to allow a composing system, that instead of pre-encoding every possible combination (infinite)
        # instead we use base characters + combining marks. (letter + marks)
        
        # Thus, we need normalization to allow visually equal characters which are built/composed in different
        # ways to be equal.

        # Real world examples:
        # - Text copied from different platforms or editors
        # - User input from different keyboards and OSes
        # - Databases storing mixed Unicode forms
        # - International and multilingual applications
        # - Prevent subtle equality bugs and security issues
        # - etc.
        
        # Without normalization (built/composed differently do not equal, even if visually equal):
        #   "é" != "e\u0301"
        
        # With normalization (build/composed differently do equal, because they are visually equal):
        #   "é" == "e\u0301"


        # NFC = Normalization Form C, where the C means Canonical Composition
        # Combines characters into their precomposed forms when possible.
        # So e +  ́ = é.
        # allows for even comparison

        # NFD = Normalization Form C, where the C means Canonical Decomposition
        # so the inverse

        # tc: iteration relative to input O(n)
        # sc: 2 string copies relative to input O(n)
        import unicodedata
        s = unicodedata.normalize("NFC", s)
        t = unicodedata.normalize("NFC", t)

        # early len check
        if len(s) != len(t):
            return False

        # sc: dictionary possibly up to number of distinct characters O(k)
        count = defaultdict(int)

        # tc: O(n) over s
        for c in s:
            count[c] += 1

        # tc: O(n) over t
        for c in t:
            count[c] -= 1

        # tc: O(k) over unique keys (≤ n)
        for value in count.values():
            if value != 0:
                return False

        # overall: tc O(n)
        # overall: sc O(k) <= O(n)
        return True
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
NormalizationO(n)O(n)Unicode normalization scans full stringNormalized string creates a copy
IterationO(n)O(k)Two linear iterations over inputDictionary grows with unique characters
VerificationO(k)O(1)Iterate over dictionary valuesNo additional memory allocation
OverallO(n)O(n)Linear iterations dominate, O(n)Worst case unique chars proportional to input O(n)

Solution 4: [Follow up] Unicode Extension Normalization For Array of 26 Indexing Single Pass - Hashmap/Representation

    def isAnagram(self, s: str, t: str) -> bool:

        # Note: Unicode cannot be extended for fixed array of 26 solution 

        # English lowercase letters are bounded: only 26
        # Unicode is not bounded:
        # > 140,000+ code points
        # multiple languages
        # emojis
        # symbols
        # combining characters

        # Thus, no fixed small upper bound we can use for our fixed array single pass solution
        # We need a hashmap / dictionary approach for variable size flexibility

        N/A
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
N/Axxxx

1. Two Sum I Not Sorted ::1:: - Easy

Topics: Array, Hash Table

Intro

Given an array of integers nums and an integer target, return indices of two numbers such that they add up to target. You can assume each test case only has one solution and you may cant use the same element twice. Answer can be returned in any order.

nums[]TargetOutput
[2,7,11,15]9[0,1]
[3,2,4]6[1,2]
[3,3]6[0,1]

Constraints:

Only one valid answer exists

2 ≤ nums.length ≤ 10^4

-10^9 ≤ target ≤ 10^9

Abstraction

We need to find two elements that add up to the target and return their indexes.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Hashmap of ComplementsO(n)O(n)Iterate over array O(n)Allocation relative to input O(n)

Solution 1: [Follow up] Hashmap Tracking Complement Target [TC Opt] - Hashmap/Index with Data Key

    def twoSum(self, nums: List[int], target: int) -> List[int]:
        
        # Note:
        # dictionary is complement -> index 

        # sc: dictionary size relative to input O(n)
        tracking = {}
 
        # tc: iterate over list O(n)
        for i in range(len(nums)):

            # find the complement we should be tracking
            complement = target - nums[i]

            # tc: in operation O(1)
            if complement in tracking:
                return [i, tracking[complement]]
    
            # tc: put operation O(1)
            tracking[nums[i]] = i

        # overall: tc O(n) 
        # overall: sc O(n)
        return []
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
IterationO(n)O(n)Iterate over list in linear O(n) + in, put O(n)Dictionary relative to input O(n)
OverallO(n)O(n)Iterate over list dominates, O(n)Allocation relative to input dominates, O(n)

49. Group Anagrams ::2:: - Medium

Topics: Array, Hash Table, String, Sorting

Intro

Given an array of strings strs, group the anagrams together. You can return the answer in any order. An Anagram is a word formed by rearranging the letters of a different word using all the original letters exactly once.

InputOutput
["eat","tea","tan","ate","nat","bat"][["bat"],["nat","tan"],["ate","eat","tea"]]
[""][[""]]
["a"][["a"]]

Constraints:

strs[i] consists of lowercase English letters

Abstraction

Group the matching anagrams together.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Array to Tuple KeyO(n * k)O(n * k)Iterate over list O(n) * iterate over string k length O(k)Dictionary relative to input O(n * k)
Array to String KeyO(n * k)O(n * k)Iterate over list O(n) * iterate over string k length O(k)Dictionary relative to input O(n * k)

Bug: Mutable object, list, cannot be hashed

    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        
        anaGroup = {}
        
        for word in strs:
            
            charCount = [0] * 26
            for char in word:
                charCount[ord(char) - ord('a')] += 1
            
            # BUG: 
            # Attempted to hash mutable list charCount
            if charCount not in anaGroup:
                anaGroup[charCount] = []
            
            anaGroup[charCount].append(word)
        
        return list(anaGroup.values())

Bug: Cannot use unsorted hashmap to tuple as key

    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        
        groupAna = {}

        for s in strs:
            count = defaultdict(int)

            for c in s:
                count[c] += 1

            tupKey = tuple(count)

            # BUG: 
            # hashmaps are not sorted
            # so tuple will vary (e.g., "cat" vs "tac")
            if tupKey not in groupAna:
                groupAna[tupKey] = []

            groupAna[tupKey].append(s)

        return list(groupAna.values())

Solution 1: Array to Tuple Count Key [TC Opt] - Hashmap/Grouping by Criteria

    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        
        # Note: 
        # Tuples immutable, allowing them to be hashed
        # Array allows for constant ordered representation

        # sc: store k chars for n words O(n * k)
        anaGroup = {}

        # tc: iterate over list O(n) 
        for word in strs:
            
            # sc: array for 26 O(1)
            charCount = [0] * 26  
            
            # tc: iterate over string k length O(k)
            for char in word:
                charCount[ord(char) - ord('a')] += 1
            
            # sc: array of 26 to tuple O(1)
            key = tuple(charCount)  
            
            # tc: in operation O(1) 
            if key not in anaGroup:
                anaGroup[key] = []  

            # tc: put operation O(1)
            anaGroup[key].append(word)  


        # overall: tc O(n * k)
        # overall: sc O(n * k)
        return list(anaGroup.values())
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iterate listO(n)O(1)Iterate over list O(n)No allocation for iteration O(1)
Iterate stringO(k)O(1)Iterate over string k length O(k)Array of 26 per iteration O(1)
Store string in anagram groupO(1)O(n * k)Put operation constant O(1)Dictionary stores n words of k length O(n * k)
OverallO(n * k)O(n * k)Iteration over list and strings dominate, O(n * k)Allocation for all strings dominates, O(n * k)

Solution 2: Array to String Delimited Count Key [TC Opt] - Hashmap/Grouping by Criteria

    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:

        # Note:
        # Appending to a list and joining once is more efficient than repeatedly
        # appending to a string. Strings are immutable, so each concatenation
        # creates a new string and copies all existing characters.
        # tc: list append + join: O(m), repeated string concat: O(m^2)

        # sc: stores n tuple keys O(n) and lists of original k strings O(k), O(n * k)
        anaGroup = {}

        # tc: iterate over list O(n)
        for word in strs:

            # sc: array of 26 constant O(1)
            charCount = [0] * 26
            
            # tc: iterate over string O(m)
            for char in word:
                charCount[ord(char) - ord('a')] += 1
            
            # tc: array of 26 to list O(1)
            values = []
            for count in charCount:
                values.append(str(count))
                # delimiter
                values.append("#")  

            # tc: concat list of 26 to string O(1)
            groupCountKey = ''.join(values)

            # tc: in operation O(1)
            if groupCountKey not in anaGroup:
                anaGroup[groupCountKey] = []

            # tc: put operation O(1)
            anaGroup[groupCountKey].append(word)

        # overall: tc O(n * k)
        # overall: sc O(n * k)
        return list(anaGroup.values())
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iterate listO(n * m)O(1)Iterate over list O(n)No allocation for iteration O(1)
Iterate stringO(k)O(1)Iterate string of k length O(k)Array of 26 O(1)
OverallO(n * k)O(n * k)Iterating over list and strings dominate, O(n * k)Allocation for all strings dominates, O(n * k)

347. Top K Elements in List ::3:: - Medium

Topics: Array, Hash Table, Divide and Conquer, Sorting, Heap (Priority Queue), Bucket Sort, Counting, Quickselect

Intro

Given an integer array nums and an integer k, return the k most frequent element within the array. Test cases are generated such that the answer is always unique. You may return the output in any order

InputkOutput
[1,2,2,3,3,3,3]2[2,3]
[7,7]1[7]

Constraints:

1 ≤ k ≤ number of distinct elements in nums

-1000 ≤ nums[i] ≤ 1000

Abstraction

To find the k most frequent elements, we must first create an occurrence counter for each element in the list. Now that we have the count, we just grab the top k highest occurring elements.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
MinHeapO(n log k)O(n)Iterating over freq counts * heap operations dominate, O(n log k)Allocation for freq count of n elements dominates, O(n)
QuickSelectAverage: O(n) Worst: O(n2)O(n)Good pivot splits array in half on average O(n), Bad pivot each iteration, smallest or largest, causes O(n2)Allocation for freq count of n elements dominates, O(n)
BucketSortO(n)O(n)Iterating over list dominates, O(n)Allocating n buckets dominates, O(n)

Bug: Inverted Binary Search if statement:

    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        
        def partitionSection(left, right, randIndex):
            freqPartition = frequency[unique[randIndex]]
            leftPartition = left
            unique[randIndex], unique[right] = unique[right], unique[randIndex]

            for i in range (left, right):
                if frequency[unique[i]] > freqPartition:
                    unique[i], unique[leftPartition] = unique[leftPartition], unique[i]
                    leftPartition += 1
            
            unique[right], unique[leftPartition] = unique[leftPartition], unique[right]

            return leftPartition

        def quickSelectBinarySearchHelper(left, right, finalMarker):
            
            # ran out of space
            if left == right:
                return

            randIndex = random.randint(left, right)
            resIndex = partitionSection(left, right, randIndex)

            if resIndex == finalMarker:
                return

            # BUG:
            # this format is confusing
            # we are searching for the finalMarker, so always have that on 
            # the left.
            # The correct order should be:
            # if the finalMarker is less than the result, then we shift right
            # but because of the confused order, we are doing the reverse
            # INSTEAD:
            # elif finalMarker < resIndex 
            #   quickSelectBinarySearchHelper(left, resIndex-1, finalMarker)
            elif resIndex < finalMarker:
                quickSelectBinarySearchHelper(left, resIndex-1, finalMarker)
            else:
                quickSelectBinarySearchHelper(resIndex+1, right, finalMarker)

        frequency = defaultdict(int)
        for n in nums:
            frequency[n] += 1
        
        unique = list(frequency.keys())
        n = len(unique)
        l, r = 0, n-1

        finalIndex = k-1
        quickSelectBinarySearchHelper(l, r, finalIndex)
        return unique[:k]

Solution 1: MinHeap Track K Elements [TC Opt] - HashMap/Algorithm

    def topKFrequent(self, nums: List[int], k:int) -> List[int]:

        # MinHeap: 
        # Heap only guarantees that the root is the min or max, and
        # provides no guarantee about the relative order of nodes. 
        # The minHeap root will hold the smallest element,
        # as we iterate we will add elements, 
        # and we only remove the root/smallest element when heap exceeds size k.
        # This ensures the heap will hold the k most frequent
        # elements we have seen so far

        # sc: freq count for list, m unique elements in worst case O(n)
        freq = defaultdict(int)

        # Calculate freq count for each element
        # tc: iterate over list O(n)
        for num in nums: 
            freq[num] += 1

        # sc: minHeap holds smallest k elements O(k)
        minHeap = []

        # tc: iterate over freq count, m unique elements in worst case O(n)
        for (num, freq) in freq.items(): 

            # tc: push operation O(log n)
            heapq.heappush(minHeap, (freq, num)) 
            
            # if heap grows to size k + 1, pop least freq root so we keep the k most freq elements
            if len(minHeap) > k:
t6
                # tc: pop smallest element O(log k)
                heapq.heappop(minHeap) 
        

        # alternative expanded loop:
        # tc: grab top k occurring elements from minHeap, worst case grab n elements O(n)
        result = []
        for (_, num) in minHeap:
            result.append(num) 

        # alternative shorter notation
        # result = [num for freq, num in minHeap]

        # overall: tc O(n log n)
        # overall: sc O(n)
        return result
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iterate ListO(n)O(n)Iterate over list O(n)Allocation for n elements O(n)
Iterate Freq Counts + MinHeapO(n log k)O(k)Iterate over counts * Push/BubbleUp, Pop O(log k) for heap of size kAllocation for heap of max k elements O(k)
MinHeap Grab kO(k)O(k)Iterate over minHeap of k size O(k)Result list of size k O(k)
OverallO(n log k)O(n)Iterating over freq counts * heap operations dominate, O(n log k)Allocation for freq count of n elements dominates, O(n)

Solution 2: QuickSelect BinarySearch High to Low Sort [TC Opt] - Hashmap/Algorithm

    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
    
        # QuickSelect Modified QuickSort:
        # QuickSort sorts based on a partitioned element, whereas when QuickSort
        # complete, everything to the left of the partitioned element is less than it
        # and everything to the right of the partitioned element is greater than it.
        # QuickSelect uses a finalPartitionMarker marks the final expected correct position, 
        # where at which point it stops and elements to the left and right of finalPartitionMarker
        # are guaranteed to be smaller/larger.
        # This avoids sorting the entire array by only sorting as many elements until the 
        # finalPartitionMarker is in the correct place which
        # isolates the minimum elements needed to complete the task

        # Helper: 
        # In place partition of subarray [left,right] relative to random pivot index value
        def partitionSection(left, right, randPivotElemIndex):
            
            # tc: subarray m elements, n in worst case O(n)
            
            # Frequency count of random pivot element
            pivotElemFreq = frequency[unique[randPivotElemIndex]]

            # Move random pivot element to end
            # Tuple unpacking in python allows swapping two variables
            unique[randPivotElemIndex], unique[right] = unique[right], unique[randPivotElemIndex]
            
            # Left index to partition/move larger elements to the left boundary of subarray
            partitionIndex = left

            # tc: iterate over subarray, n in worst case O(n)
            for i in range(left, right):

                # Partition elements with frequencies above the pivotElem Freq (pivotElemFreq < freq),
                # move them to left (move larger elements greater than the pivot to the left of the pivot)
                if pivotElemFreq < frequency[unique[i]]:  
                    
                    # Slicing:
                    # Placing larger freq element on left [high... low] 
                    # allows [:k] to grab the first top k freq elements
                    unique[i], unique[partitionIndex] = unique[partitionIndex], unique[i]
                    partitionIndex += 1

            # set pivot to correct partitioned index,
            # elements left and right of pivot follow [greater... pivot ... lesser]
            unique[partitionIndex], unique[right] = unique[right], unique[partitionIndex]
            return partitionIndex

        # Helper: 
        # Wrapper function that continues to pick random partition element until we have 
        # placed some element at the "finalPartitionMarker" index
        # tc: average recursion depth O(log m)
        # sc: recursion stack for in place partitioning on average O(log m)
        def quickSelectBinarySearchHelper(left, right, finalPartitionMarker):
            
            # Base Case:
            if left == right:
                return

            # Random pivot index:
            # Differs from QuickSort "median-of-three" approach.
            # Random pivot focuses on finding the kth smallest or largest,
            # rather than fully sorting the array, and using random pivot avoids 
            # degrading to worst case O(n^2)
            randPivotElemIndex = random.randint(left, right)

            # Partition subarray
            resultPivotElemIndex = partitionSection(left, right, randPivotElemIndex)
            
            # Base Case: Random partition marker ended up in correct place
            if finalPartitionMarker == resultPivotElemIndex:
                # -> allows us to grab k top freq count by [:k]
                return 
            
            # Binary Search Modification:
            # Selects next subarray and partition pivot
            # resultPivotElemIndex must recurse towards side where finalPartitionMark is on

            # Result was to the right of final expected, 
            # search to the left of the result: [left, resultPivot-1]
            elif finalPartitionMarker < resultPivotElemIndex:
                quickSelectBinarySearchHelper(left, resultPivotElemIndex - 1, finalPartitionMarker)
            
            # Result was to the left of final expected,
            # search to the right of the result: [resultPivot+1, right]
            else:
                quickSelectBinarySearchHelper(resultPivotElemIndex + 1, right, finalPartitionMarker)

        # Calculate freq count for each element
        # tc: iterate over list O(n)
        # sc: allocate for list, n unique elements worst case O(n)
        frequency = defaultdict(int)
        for num in nums:
            frequency[num] += 1

        # Top Kth Elements Indexing: Top 2 elements are indexes [0, 1]
        # [9 7 4 3 1]
        #  0 1 2 3 4
        finalPartition = k - 1

        # Unique count to sort by during partitioning
        unique = list(frequency.keys())
        n = len(unique)
        l, r = 0, n - 1

        # For High to Low (descending) indexing:
        # with list of 6 elements, grab largest 2 elements
        # 2 - 1 = 1 (our 2nd largest)
        # splice [:2] = [0, 1]
        quickSelectBinarySearchHelper(l, r, finalPartition)

        # Pivot is now in the correct partitioned index.
        # Elements left and right of pivot follow [greater... pivot ... lesser]
        # Slice the first k to grab the top k elements

        # overall: tc O(n)
        # overall: sc O(n)
        return unique[:k]
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iterate ListO(n)O(n)Iterate over list O(n)Allocation for n elements O(n)
QuickSort PartitionAverage: O(n) Worst: O(n2)O(1)Good pivot splits array in half on average O(n), Bad pivot each iteration, smallest or largest, causes O(n2)Partition of freq count array occurs in place O(1)
QuickSort RecursionO(k)O(k)Iterate over buckets for k steps O(k)Result list of size k O(k)
Result SpliceO(k)O(k)Splicing array for first k elementsResult list of k elements
OverallAverage: O(n) Worst: O(n2)O(n)Bad pivot partition dominates O(n2)Allocation for freq count of n elements dominates, O(n)

Solution 3: BucketSort by Count [TC Opt] - Hashmap/Algorithm

    def topKFrequent(self, nums: List[int], k:int) -> List[int]:
        
        # BucketSort:
        # Grouping values by freq count into buckets.
        # Allows us to iterate over most freq buckets and grab top k elements

        # Calculate freq count for each element
        # tc: iterate over list of n integers O(n)
        # sc: frequency count for unique integers O(m) 
        count = defaultdict(int)
        for key in nums:
            count[key] += 1

        # numBuckets index = items with that frequency count
        # numBuckets must account for max freq count case, where list is full of only 1 element
        # numBuckets must account for empty list, (but still needs a bucket index 0), so we need
        # len(num)+1,
        # empty: 1 bucket: [0] representing empty group
        # else: list of len 5, 6 buckets: [0, 1, 2, 3, 4, 5] each representing length group,
        # with max freq count of 5
        numBuckets = len(nums) + 1

        # list of empty lists, numBucket amount of times 
        # tc: iterate over list length O(n)
        # sc: create len(list)+1 buckets O(n)
        freqBuckets = [[] for i in range(numBuckets)]

        # tc: iterate over frequency list for m unique integer tuples (int, occurrences) O(m) 
        for int, occurrences in count.items():
            freqBuckets[occurrences].append(int)

        # sc: grabbing top k integers, n worst case O(k)
        res = []

        # tc: iterate over len(list)+1 buckets O(n)
        for i in range(len(freqBuckets) - 1, 0, -1):
            
            # tc: iterate over all elements in curr bucket O(m)
            for num in freqBuckets[i]:
                
                # tc: insert operation O(1)
                res.append(num)
                
                # tc: continue while less than k elements grabbed O(k)
                if len(res) == k:
                    return res

        # overall: tc O(n)
        # overall: sc O(n)
        return res
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Iterate listO(n)O(n)Iteration over list O(n)Allocation for n elements O(n)
Bucket CreationO(n)O(n)Iterate len(list)+1 steps O(n)Allocate len(list)+1 buckets O(n)
Bucket PopulationO(n)O(n)Inserting n integers into buckets O(n)Buckets store n integers O(m)
Bucket Grab kO(k)O(k)Iterate over buckets for k steps O(k)Result list of size k O(k)
OverallO(n)O(n)Iterating over list dominates, O(n)Allocating n buckets dominates, O(n)

238. Product of Array Except Self ::2:: - Medium

Topics: Array, Prefix Sum

Intro

Given an integer array nums, return an array answer such that answer[i] is equal to the product of all the elements of nums except nums[i]. The product of any prefix or suffix of nums is guaranteed to fit in a 32-bit integer. You must write an algorithm that runs in O(n) time and without using the division operation.

InputOutput
[1,2,3,4][24,12,8,6]
[-1,1,0,-3,3][0,0,9,0,0]

Constraints:

Product of any prefix or suffix is guaranteed to fit into 32 bit integer

Abstraction

Given a list of nums, return a list of nums that is the product of the array excluding itself. For num n, the result should be the product of all the numbers to the left of it, times the product of all the numbers to the right of it.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Prefix & PostfixO(n)O(n)Three iterations, prefix, postfix, result, O(3n) ~= O(n)Allocation for 3 arrays O(3n) ~= O(n)
Prefix & PostfixO(n)O(1)Two iterations, prefix, postfix, O(n)Allocation for 1 array O(n)

Solution 1: Prefix, Postfix, Result Arrays - Array/In Place Transformations

    def productExceptSelf(self, nums: List[int]) -> List[int]:
        
        n = len(nums)

        # sc: prefix, postfix, result, arrays O(n)
        prefix = [1] * n
        postfix = [1] * n
        res = [1] * n

        # Prefix
        # tc: iterate list O(n)
        for i in range(1, n):
            
            # calculate prefix of i = (prefix of i - 1) * (nums[i - 1])
            prefix[i] = prefix[i - 1] * nums[i - 1]

        # Postfix
        # postfix starts at 1 = the postfix of len(n) - 1
        # start reverse iteration at: len(n) - 2
        # tc: iterate list O(n)
        for i in range(n-2, -1, -1):

            # calculate postfix of i = (postfix of i + 1) * (nums[i + 1])
            postfix[i] = postfix[i + 1] * nums[i + 1]

        # Result
        # tc: iterate list O(n)
        for i in range(n):

            # calculate product except self for i = (prefix of i) * (postfix of i)
            res[i] = prefix[i] * postfix[i]


        # overall: tc O(n)
        # overall: sc O(n)
        return res
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
PrefixO(n)O(n)Iterate over list O(n)Allocation for prefix array O(n)
PostfixO(n)O(n)Iterate over list O(n)Allocation for postfix array O(n)
ResultO(n)O(n)Iterate over list O(n)Allocation for result array O(n)
OverallO(n)O(n)Iteration over list dominates, O(n)Allocation for list dominates, O(n)

Solution 2: Result Array - Array/In Place Transformations

    def productExceptSelf(self, nums: List[int]) -> List[int]:
        
        n = len(nums)

        # sc: result array O(1) 
        res = [1] * n

        # Running Prefix: 
        # Generated through iteration with a starting prefix for the first element being 1, 
        # ignore and start at element 2 (index 1)

        # tc: iterate list O(n)
        for i in range(1, n):

            # (prefix of i) = (prefix of i - 1) * (num at i - 1)
            res[i] = res[i - 1] * nums[i - 1]

        # Running Postfix: 
        # Generated through reverse iteration with a starting postfix for the last element is 1, 
        # start at the last element (index n-1)
        postfix = 1

        # tc: iterate list in reverse O(n)
        for i in range(n-1, -1, -1):
            
            # calculate product except self for i = (prefix of i) * (postfix of i)
            res[i] *= postfix
            
            # update postfix = (postfix of i) * (num at i) 
            postfix *= nums[i]

        # overall: tc O(n)
        # overall: sc O(1)
        return res
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
PrefixO(n)O(1)Iteration over list O(n)Allocation for result array O(1)
ResultO(n)O(1)Iteration over list O(n)Allocation for result array O(1)
OverallO(n)O(1)Iteration over list dominates, O(n)Allocation for result array dominates, O(1)

36. Valid Sudoku ::2:: - Medium

Topics: Array, Hash Table, Matrix

Intro

Determine if a 9 x 9 Sudoku board is valid. Only the filled cells need to be validated according to the following rules: Each row must contain the digits 1-9 without repetition. Each column must contain the digits 1-9 without repetition. Each of the nine 3 x 3 sub-boxes of the grid must contain the digits 1-9 without repetition.

Constraints:

Only the currently filled cells need to be validated, regardless is sudoku board is actually solvable or not.

InputOutput
a sudoku table is too big to put here loljust look at -> sudoku board example

Abstraction

Abstract board into sets of rows, cols, and boxes, and validate whether duplicate values exist in any set.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
SetO(n2)O(n2)Iterate over grid dominates, O(n2)Allocation for grid cells O(n2)
ArrayO(n3)O(n2)Iterate over grid * linear array in operation dominates, O(n3)Allocation for grid dominates, O(n2)

Bug: Initialize DefaultDict of Sets Incorrectly

    def isValidSudoku(self, board: List[List[str]]) -> bool:

        # BUG:
        # defaultdict argument must be callable or None
        # --------
        # Differences: [], list, set:
        # []: instance of the list, not callable
        # list: class creates a new list object when called, callable
        # set: class creates a new set object when called, callable

        # Instead:
        # defaultdict(set)
        
        rows = defaultdict([])
        cols = defaultdict([])
        boxs = defaultdict([])

        for r in range(9):
            for c in range(9):
                tmp = board[r][c]

                if tmp != ".":

                    boxKey = (r//3, c//3)

                    if (tmp in rows[r] or
                        tmp in cols[c] or
                        tmp in boxs[boxKey]):
                        return False

                    rows[r].add(tmp)
                    cols[c].add(tmp)
                    boxs[boxKey].add(tmp)
        return True

Bug: 81 Tuples instead of 9 Tuples for Boxes

    def isValidSudoku(self, board: List[List[str]]) -> bool:
        rows = defaultdict(set)
        cols = defaultdict(set)
        boxs = defaultdict(set)


        for r in range(9):
            for c in range(9):
                tmp = board[r][c]

                if tmp != ".":

                    # BUG:
                    # will create 81 possible boxes (9 rows * 9 columns)
                    # Instead:
                    # boxKey = tuple((r//3, c//3))
                    # creates 9 boxes 
                    # (9//3), (9//3) = (1-3, 1-3) = 9 possibilities
                    boxKey = tuple((r, c))

                    if (tmp in rows[r] or
                        tmp in cols[c] or
                        tmp in boxs[boxKey]):
                        return False

                    rows[r].add(tmp)
                    cols[c].add(tmp)
                    boxs[boxKey].add(tmp)

        return True

Solution 1: DefaultDict With // Floor Division For (r,c) Immutable/Hashable Tuple Key - Hashmap/Representation

    def isValidSudoku(self, board: List[List[str]]) -> bool:
        
        # Dictionaries And Immutable/Hashable Keys:
        # Since dictionaries use hashing, we can use any object that is 
        # both immutable and hashable as the key, 
        # which allows us to create tuples for our keys

        # sc: dictionary for rows, columns, boxes O(n)
        cols = defaultdict(set)
        rows = defaultdict(set)
        grids = defaultdict(set)

        # tc: iterate over grid (r * c) O(n^2)
        for r in range(9):
            for c in range(9):
                
                cell = board[r][c]
                if cell != ".":

                    # index between 9 box sets by tuple key (r/3, c/3)
                    gridTuple = (r // 3, c // 3)

                    # tc: lookup operation O(1)
                    if (cell in rows[r] or 
                        cell in cols[c] or 
                        cell in grids[gridTuple] ):
                        return False
                    
                    # tc: put operation O(1)
                    cols[c].add(cell)
                    rows[r].add(cell)
                    grids[gridTuple].add(cell)

        # overall: tc O(n^2)
        # overall: sc O(n)
        return True
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Row, Col, Box InitO(1)O(n2)No iteration O(1)Dictionary allocation for grid cells O(n2)
Iterate over gridO(n2)O(1)Iterate over grid O(n2)No allocation O(1)
OperationsO(1)O(1)Lookup, Put operation O(1)No allocation for operations O(1)
OverallO(n2)O(n2)Iterate over grid dominates, O(n2)Allocation for grid cells O(n2)

Solution 2: Array of Arrays [[]] With // Floor Division For 9 Indexing Options [TC Opt] - Hashmap/Representation

    def isValidSudoku(self, board: List[List[str]]) -> bool:
        
        # Array of Array vs Dictionary:
        # Due to the overhead of the dictionary functions,
        # for a small use case such as 9x9 using direct array of arrays is faster.
        # For larger use cases though, dictionary would be more optimal.

        # sc: arrays for rows, columns, boxes O(n)
        rows = [[], [], [], [], [], [], [], [], []]
        col = [[], [], [], [], [], [], [], [], []]
        grids = [[], [], [], [], [], [], [], [], []]

        # tc: iterate over grid (r * c) O(n^2)
        for r in range(9):
            for c in range(9):
                
                cell = board[r][c]
                if cell != ".":  

                    # indexing box sets by unique int calculation:
                    # ------------
                    #  (j//3)     = 0, 1, 2
                    #  (i//3) * 3 = 0, 3, 6
                    # ------------
                    #    0 1 2
                    # 0  0 1 2
                    # 3  3 4 5
                    # 6  6 7 8
                    # ------------
                    # allows to 9 indexing options

                    boxKey = (r//3) + ((c//3) * 3)

                    # tc: lookup operation O(1)
                    if (cell in rows[r] or 
                        cell in col[c]  or 
                        cell in grids[boxKey]):
                        return False

                    # sc: put operation O(1)
                    col[c].append(cell)
                    rows[r].append(cell)
                    grids[boxKey].append(cell)
        
        # overall: tc O(n^2)
        # overall: sc  O(n)
        return True
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Row, Col, Box InitO(1)O(n2)No iteration O(1)Dictionary allocation for grid cells O(n2)
Iterate over gridO(n2)O(1)Iterate over grid O(n2)No allocation O(1)
OperationsO(n)O(1)In array operation O(n), Put operation O(1)No allocation for operations O(1)
OverallO(n3)O(n2)Iterate over grid * linear array in operation dominates, O(n3)Allocation for grid dominates, O(n2)

Note: Solution 2 is faster than Solution 1 for small sets, such as our 9x9 board, due to defaultdict overhead

128. Longest Consecutive Sequence ::3:: - Medium

Topics: Array, Hash Table, Union Find

Intro

Given an array of integers nums, return the length of the longest consecutive sequence of elements. A consecutive sequence is a sequence of elements in which each element is exactly 1 greater than the previous element You must wrtie an algorithm that runs in O(n) time.

InputOutput
[100,4,200,1,3,2]4 from [1, 2, 3, 4]
[0,3,7,2,5,8,4,6,0,1]9 from [0, 1, 2, 3, 4, 5, 6, 7, 8]
[1, 0, 1, 2]3 from [0, 1, 2]

Constraints:

0 ≤ nums.length ≤ 105

-109 ≤ nums[i] ≤ 109

Abstraction

Give a list of integers, find the longest sequence of increasing integers. This sequence does not have to follow the left to right order of the array.

Space & Time Complexity

SolutionTime ComplexitySpace ComplexityTime RemarkSpace Remark
Brute Force
Hashmap

Brute Force

    def longestConsecutive(nums: List[int]) -> int:
        
        longest = 0

        # time complexity: Iterate over list of n length O(n)
        for num in nums:

            currLen = 1
            
            # time complexity: Iterate over potential n consecutive integers O(n)
            # time complexity: Lookup in list takes O(n)
            # time complexity: O(n^2)
            while num + currLen in nums:
                currLen += 1

            # compare to global max
            longest = max(longest, currLen)

        # Overall time complexity: O(n^3) for worst-case where we scan sequences n times for n numbers
        # Space complexity: O(1)
        return longest
AspectTime ComplexitySpace ComplexityTime RemarkSpace Remark
IterateO(n)O(1)Iterate over list of n length O(n)No additional memory allocation for iteration O(1)
Iterate consecutive sequenceO(n)O(1)Iterate over potential n consecutive integers O(n)No additional memory allocation for iteration O(1)
List lookupO(n)O(1)Lookup in list takes O(n)No additional memory allocation for list lookup O(1)
OverallO(n3)O(1)Iterating * Iterating * Lookup dominate leading to O(n3)No additional memory allocation O(1)

Bug: Empty list for Union tree leads to fail

    def longestConsecutive(self, nums: List[int]) -> int:

        # BUG:
        # need to check for empty list case
        # or else max(size.values()) will return error

        # Instead:
        # if len(nums) == 0
        #   return 0

        parent = {}
        size = {}

        def find(x):

            if parent[x] != x:
                parent[x] = find(parent[x])

            return parent[x]


        def union(x, y):
            root_x = find(x)
            root_y = find(y)

            if root_x != root_y:

                size_x = size[root_x]
                size_y = size[root_y]

                if size_x < size_y:
                    parent[root_x] = root_y
                    size[root_y] += size[root_x]
                else:
                    parent[root_y] = root_x
                    size[root_x] += size[root_y]

        setNum = set(nums)
        for n in setNum:
            parent[n] = n
            size[n] = 1

        for n in setNum:
            if (n+1) in parent:
                union(n, n+1)

        return max(size.values())

Solution 1: HashMap Bridging Left/Right Neighbor Continents And Updating Boundaries [TC Opt] - HashMap/Representation

    def longestConsecutive(nums: List[int]) -> int:
        
        # Continent Boundaries:
        # If we imagined a number line, with runs of consecutive numbers being colored
        # different colors, we could have:
        #
        #    0 1   4 5 6 7 8   11 12 13 14  19 
        #    ***   #########   -----------  ++ 
        #
        # Continent boundaries builds the lengths of these 'continents' as we iterate
        # by checking if a number has left and right neighbors who already have a 
        # continent length, and then just joins the continent lengths by 'bridging the continents'
        # by adding both the left and right neighbor continent lengths and adding +1
        # to the total length:
        #
        #     leftContinent_length   "+1 Bridge"   rightContinent_length 

        longest = 0

        # sc: hashmap of list of n length O(n)
        seqLen = defaultdict(int)

        # sc: ignore duplicates O(n)
        numSet = set(nums)

        # tc: iterate list O(n)
        for num in numSet:
            
            # Get lengths of neighbor sequences
            leftContinentLenFromBoundary = seqLen[num - 1]
            rightContinentLenFromBoundary = seqLen[num + 1]

            # Calculate new bridged sequence length
            bridgedLen = 1 + leftContinentLenFromBoundary + rightContinentLenFromBoundary

            # Update new continent boundaries
            seqLen[num - leftContinentLenFromBoundary] = bridgedLen
            seqLen[num + rightContinentLenFromBoundary] = bridgedLen

            # Update the global max
            longest = max(longest, bridgedLen)

        # overall: tc O(n)
        # overall: sc O(n)
        return longest
AspectTime ComplexitySpace ComplexityTime RemarkSpace Remark
SetO(n)O(n)Converting list unique list O(n)Allocation for unique set O(n)
Iterate over setO(n)O(n)Iterate set O(n)Dictionary allocation for sequence length per element O(n)
OperationsO(1)O(1)Get, Put operation O(1)No allocation O(1)
OverallO(n)O(n)Iterate unique list dominates O(n)Allocation for unique elements, sequence length dominates, O(n)

Solution 2: Unique Set Creating Rummy Run While Run Exists [TC Opt] - Hashmap/Representation

def longestConsecutive(self, nums: List[int]) -> int: 

    # Rummy Run:
    # Each number is part of only 1 sequence
    # and thus are only iterated over once.
    # Thus, the number of times the loop runs across 
    # all iterations is O(n)

    longest = 0

    # sc: ignore duplicates O(n)
    numSet = set(nums)

    # tc: iterate list O(n)
    for runStartCandidate in numSet:

        # Check if we have found start of a new run: the left most element of a run
        # tc: lookup constant O(1)
        if (runStartCandidate-1) not in numSet: 

            currRunLen = 1

            # Check if run's next element exists
            # tc: iterate over run O(n)
            while (runStartCandidate + currRunLen) in numSet:
                # Point to next sequential element
                currRunLen += 1 

            # Compare max
            longest = max(longest, currRunLen)

    # overall: tc O(n)
    # overall: sc O(n)
    return longest        
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
SetO(n)O(n)Converting list unique list O(n)Allocation for unique set O(n)
Iterate over setO(n)O(1)Iterate set O(n) * Sequence Start Check In Operation O(1)No allocation O(1)
Rummy IterateO(n)O(1)Iterate while valid sequence, n steps across all iterations O(n)No allocation O(1)
OverallO(n)O(n)Iterate set dominates, O(n)Set allocation dominates, O(n)

Solution 3: Union Find Tree Grouping Elements By Run [TC Opt] - HashMap/Algorithm

    def longestConsecutive(self, nums: List[int]) -> int:
        
        # Union Find Tree:
        # Tracks sets of elements broken into non overlapping groups.
        # In this case it tracks elements broken into groups of corresponding runs.

        # Check Empty: 
        # Avoid max(size.values()) return error
        if len(nums) == 0:
            return 0

        # Union Find Initialization:
        
        # Stores parent of each number
        # sc: dictionary for list O(n)
        parent = {}  

        # Stores the size of connected components
        # sc: dictionary for list O(n)
        size = {}    

        # Find + path compression
        # tc: Amortized O(α(n)) per operation
        def find(x):

            # Find Representative:
            # If the parent of x is not itself, keep going up, we have not reached the representative  
            if parent[x] != x:

                # Path compression:
                # Set parent of current, to the parent of its eventual parent,
                # By the end the representative will be the parent of all children
                parent[x] = find(parent[x])

            # Representative Found:
            # Reached top of the group, either through compression or just regular parent, return representative
            return parent[x]

        # Union operation with size optimization
        # tc: Amortized O(α(n)) per operation
        def union(x, y):

            # Get parent for trees x and y
            rootX = find(x)
            rootY = find(y)

            # If trees do not share representative
            if rootX != rootY:

                # Grab tree sizes
                xSize = size[rootX]
                ySize = size[rootY]

                # Attach smaller tree to larger tree:
                # 1. update larger tree size 
                # 2. update smaller tree parent
                if ySize < xSize:
                    parent[rootY] = rootX
                    size[rootX] += size[rootY]
                else:
                    parent[rootX] = rootY
                    size[rootY] += size[rootX]

        # Init Union Find Tree: 
        # tc: iterate n O(n)
        for num in nums:

            # Ignore duplicates that have already been initialized
            if num not in parent:
                # 1. Set all parents to self
                # 2. Set all sizes to 1
                parent[num] = num
                size[num] = 1

        # Move elements into groups of runs:
        # Merge consecutive number sequences as trees
        # tc: iterate list O(n)
        for num in nums:
            if num + 1 in parent:
                union(num, num + 1)
        
        # Return group/tree (sequence of numbers) with longest size/length
        res = max(size.values())

        # overall: tc O(n)
        # overall: sc O(n)
        return res
AspectTime ComplexitySpace ComplexityTime RemarksSpace Remarks
Union Tree InitO(n)O(n)Iterate list O(n)Dictionary allocation for parent and size O(n)
UnionAmortized O(α(n))O(1)Each union operation is nearly constant with path compression and size optimization α(n)No allocation O(1)
FindAmortized O(α(n))O(1)Find with path compression is nearly constant O(α(n))No allocation O(1)
Iterate JoinO(n)O(1)Iterate list O(n) * Union which is constant, so just O(n)No allocation O(1)
Get LargestO(n)O(1)Grab max from dictionary O(n)No allocation O(1)
OverallO(n)O(n)Iterate * Union dominates, O(n)Dictionary allocation for parent and size dominates, O(n)