Reddit Question · Apr 2026

Small to Large Merging Notes [Advanced]


Full Details

Here are some notes on small-to-large merging. Written by me (a human), not AI. Please note this is a more advanced topic and not for interviews, but it does appear on LeetCode.

1: Naive version

Say we have a DFS function that we call on the root of a tree. At each node, we iterate over every pair of nodes drawn from two different child subtrees. This looks like O(n^3) because a single node can do up to n^2 work, but it is actually O(n^2): each pair of nodes is considered exactly once, at its LCA (pairs inside the same child subtree are handled deeper in the recursion).

```python
from itertools import combinations

def dfs(node):
    for child1, child2 in combinations(children[node], 2):
        for u in subtree[child1]:       # every node under child1...
            for v in subtree[child2]:   # ...paired with every node under child2
                pass  # do something with the pair (u, v)
```

2: Speedup with small-to-large merging

Now, instead of looping over pairs of children, we just loop over the children. Each child returns a bucket of data of size c, where c is the size of that child's subtree.

```python
def dfs(node):
    accumulator = []
    for child in children[node]:
        childData = dfs(child)  # the size of this bucket is the size of that child's subtree
        if len(accumulator) < len(childData):
            # crucial small-to-large swap: this is what gives O(n log n)
            accumulator, childData = childData, accumulator
        for val in childData:
            pass  # do work: query accumulator with val
        for val in childData:
            pass  # only then safely update accumulator with val
    return accumulator
```

Proof of the O(n log n) time complexity: consider any element e. Each time e sits in the smaller bucket during a merge, it is moved into a bucket at least as large as its own, so the size of the container holding e at least doubles. A container's size can double at most log2(N) times, so e is moved at most O(log N) times, and the total work is O(n log n).

3: Simpler version of small-to-large merging that is still O(n log n)

I have found that instead of swapping accumulator and childData, we can just pick the heaviest child's bucket as the root container and merge everything else into it. If we initialize the accumulator from the largest child, every other child's bucket is no larger than it, so whenever an element moves, the size of its container at least doubles and the previous argument still holds.
```python
def dfs(node):
    # one bucket per child, each bucket the size of that child's subtree
    childrenData = [dfs(child) for child in children[node]]
    childrenData.sort(key=len, reverse=True)
    accumulator = childrenData[0] if childrenData else []
    for i in range(1, len(childrenData)):
        for val in childrenData[i]:
            pass  # merge this child's bucket into the root container
    return accumulator
```

4: Traps

It is not safe to execute O(len(heavyChild)) work at each node, like this:

```python
def dfs(node):
    childrenData = [dfs(child) for child in children[node]]
    childrenData.sort(key=len, reverse=True)
    heavyChild = childrenData[0]
    newDataStructure = fn(heavyChild)  # takes O(len(heavyChild)) work, NOT SAFE
```

Imagine a stick (path) tree: we would do 1 + 2 + 3 + ... + N = O(n^2) total work. Example bad submission (that somehow still passed): https://leetcode.com/problems/maximum-xor-of-two-non-overlapping-subtrees/submissions/1967670898/
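To make the swap-based technique from section 2 concrete, here is a minimal end-to-end sketch: counting the number of distinct values in every subtree by merging child sets small-to-large. The function name, edge format (parent, child with node 0 as root), and use of Python sets are my own illustrative assumptions, not from the notes above.

```python
from collections import defaultdict

def distinct_per_subtree(n, edges, values):
    """For each node, count the distinct values in its subtree,
    merging child sets small-to-large."""
    children = defaultdict(list)
    for parent, child in edges:  # assumed format: (parent, child), root = 0
        children[parent].append(child)

    answer = [0] * n

    def dfs(node):
        acc = {values[node]}
        for child in children[node]:
            child_set = dfs(child)
            if len(acc) < len(child_set):        # swap so that we always
                acc, child_set = child_set, acc  # iterate the smaller set
            acc |= child_set                     # merge smaller into larger
        answer[node] = len(acc)
        return acc

    dfs(0)
    return answer

# Path 0 -> 1 -> 2 with values [1, 2, 1]:
# distinct_per_subtree(3, [(0, 1), (1, 2)], [1, 2, 1]) returns [2, 2, 1]
```

Here each "bucket" is a set rather than a list, but the accounting is identical: an element only gets copied when it is in the smaller set, so the total number of copies is O(n log n).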
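To see why the trap in section 4 is quadratic, here is a small counting experiment on an assumed stick tree (node i's only child is i + 1, so each node's heavy bucket is its entire child subtree). It tallies the extra O(len(heavyChild)) step at every node and recovers exactly the 1 + 2 + ... + (N - 1) total from the notes:

```python
def stick_trap_cost(n):
    """On a path tree 0 -> 1 -> ... -> n-1, doing O(len(heavyChild))
    work at each node costs 0 + 1 + 2 + ... + (n - 1) operations total."""
    bucket_size = [0] * n
    ops = 0
    for node in range(n - 1, -1, -1):          # process leaves-first
        heavy = bucket_size[node + 1] if node + 1 < n else 0
        ops += heavy                           # the unsafe O(len(heavyChild)) step
        bucket_size[node] = heavy + 1          # node's own element joins the bucket
    return ops

# stick_trap_cost(1000) returns 499500, i.e. 1000 * 999 / 2: quadratic growth
```

Merging the smaller bucket instead (the safe pattern) would move each element only when its container doubles, which on this degenerate tree means almost no copying at all.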


Topics

Trees Graphs Sorting Bit Manipulation