The iterated logarithm is the number of times the logarithm function must be applied to a number before the result is at most . A runtime of is almost, very nearly almost constant but not quite as good as constant from an algorithmist standpoint. The iterated log of the number of atoms in the universe is at most . Perhaps this is the reason why implementationists regard it as a constant for all intents and purposes.
Here, I will introduce three problems with algorithms that have log star runtime factors. Some of these algorithms are quite old but I’ve been thinking about iterated log algorithms in relation to a couple of research questions so thought I would still share some of what I read in the process.
A logical way of representing a set is to represent the set as a tree with the name of the set attached to the root of the tree. Each element in a set is a node in the tree. We may perform a merge (union operation) between two sets by adding a pointer from the root of one tree to the root of the other tree. To find which set an element belong to (find operation), we can follow parent pointers from the element up the tree to the root to find the name of the set. If we maintain a balanced tree for each set, then each merge operation takes time and each lookup operation would take time. Not yet quite what we want.
The two key ideas that will help us improve the runtime of Union-Find is that of a special rank that is used to determine which of the two trees’ roots becomes the root of the new merged tree and the idea of lazy updates of root pointers.
First, we’ll explain briefly the idea of root pointers and lazy updates of such pointers. Rather than searching from the node of a tree (representing an element of a set) to the root of the tree each time we perform a find operation, we maintain pointers from each node to the root of its tree. Thus, to find the set that an element belongs to, we simply have to follow the root pointer to the corresponding root with the label of the set that the element belongs to. However, we do not update the root pointers with each merge because that could be time consuming. Instead, we perform a type of lazy update depending on whether a set was constructed, merged (union operations) or the element of a set was queried (find operation):
- make_set(x): is the element to be added to the newly created set. The rank of is set to and a pointer is added that points to itself.
- union(x, y): if the root of element and the root of element have the same rank, then one of the roots is arbitrarily chosen to be the root of the merged set. If one of the roots has larger rank, then that root is automatically chosen to be the root of the merged set.
- find(x): starting from , follow the parent pointers until we reach the root of the set. Note that the root pointers of all nodes that are traversed during this procedure are also updated to point to the correct root.
The lazy updates only update the root pointers during the find procedure.
Although all nodes have ranks, we only care about the rank of the root. The rank of the root determines which of the previous roots remains the root of the merged tree. The root with the larger rank becomes the root of the merged sets. We define and update rank as follows:
- When creating a set (a single new node), we set the rank of the node to be and create a parent pointer to itself.
- When two sets are merged and one root has a larger rank than the other, then we create a parent pointer from the root with smaller rank to the root with larger rank and make the root with smaller rank a child of the root with larger rank. If the roots have equal rank, we arbitrarily pick a root to be the new root of the merged sets and increase its rank by .
Note that one key aspect of this rank updating procedure is that once a node becomes a non-root node, then its rank is never updated again. Therefore, all non-root nodes have ranks that are at most the rank of the root of the tree (set) it is part of. In fact, all non-root nodes have strictly smaller rank.
We also make the observation that there are at most nodes of rank . This can be proven by noticing that for a root to gain in rank, the size of its tree must double. Therefore, a root of rank must have nodes in its tree. We know that all non-root nodes in a tree have smaller rank. Thus, there can be at most trees with roots of rank . Using these observations, we may proceed with the proof of the runtime of Union-Find.
Theorem 1: Given Union-Find operations, the runtime of performing all operations is . We also reasonably assume that .
Proof Ideas: The main idea of the proof is to divide the non-root nodes into buckets that contain nodes of rank . This gives us a total of buckets. Therefore, there is a maximum charge of for elements in different buckets. Furthermore, a bucket with maximum value gets charged since we have at most elements with range . For a more detailed proof, please refer to the original paper or a very good summary of the proof.
The definition of the -coloring problem is given a graph with vertices and edges, we produce a coloring of the graph by assigning each node a color. A valid coloring of the graph uses at most colors such that no two adjacent vertices are colored the same color. Pages 5 and 6 of the paper presents and proves a parallel algorithm that solves this problem. The paper further extends this result to the -coloring problem.
Given a list of unordered elements (assuming all elements are distinct), we are allowed to make comparisons each round. How many rounds of comparisons are needed to obtain an element from the biggest elements in the list? Turns out we only need rounds. The proof of this bound is a bit too elaborate for me to explain here. Please refer to the original paper for the complete proof.