EmBeeZed's blog

By EmBeeZed, history, 7 months ago, In English

It turns out, that the final amortized time complexity is $$$O(α(n))$$$, where $$$α(n)$$$ is the inverse Ackermann function, which grows very slowly. In fact it grows so slowly, that it doesn't exceed $$$4$$$ for all reasonable $$$n$$$ (approximately $$$n<10^{600})$$$.

code

Question: Why this works in $$$O(α(n))$$$?

 
 
 
 
  • Vote: I like it
  • +7
  • Vote: I do not like it

»
7 months ago, # |
  Vote: I like it +6 Vote: I do not like it
»
7 months ago, # |
  Vote: I like it +1 Vote: I do not like it

Read this blog to understand, atleast slightly, what exactly is going on.

  • »
    »
    7 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    ?

    The linked blog doesn't even mention inverse Ackermann.

    • »
      »
      »
      7 months ago, # ^ |
      Rev. 4   Vote: I like it +12 Vote: I do not like it

      I thought it would help him get a jist of what is exactly happening within DSU. Also, the blog does introduce a function, $$$log^{*}(n)$$$, although it doesn't mention any name. This function and how it comes up in the complexity is quite easy to see, and provides quite good bound too.

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

thanks

»
7 months ago, # |
  Vote: I like it +50 Vote: I do not like it

Using path compression heuristic alone does not make your find operation work in $$$\mathcal{O}(\alpha(n))$$$. You have to use the union-by-rank heuristic as well.

  • »
    »
    7 months ago, # ^ |
    Rev. 5   Vote: I like it +42 Vote: I do not like it

    Path compression alone gives amortized bound $$$O(\log n)$$$ for one operation. For all details see paper: http://www.csd.uwo.ca/~eschost/Teaching/07-08/CS445a/p245-tarjan.pdf (see theorem 4. on page 264; the construction of the worst-case scenario starts in the middle of page 261 -- the essence is described in the last paragraph on this page and the general idea is depicted on figure 9 on page 262; also a nice summary of many variants of DSU is given in table I and II on page 280). So the whole idea is based on taking advantage of binomial tree's structure and arranging unions to construct the tree and then in such what that each iteration takes $$$\Theta(\log n)$$$.

    Once I wrote a task that exposes this nuance in path-compression-only DSU: You are given graph of $$$n$$$ nodes and $$$m$$$ $$$(1 \leq n, m \leq 4\cdot 10^6)$$$ weighted edges and you have compute the sum of weights of edges in the MST. Input: $$$n$$$ and $$$m$$$ in the first line. Then $$$m$$$ triples $$$a_i, b_i, w_i$$$ meaning an edge of weight $$$w_i$$$ connects vertices $$$a_i$$$ and $$$b_i$$$ where $$$a_i \neq b_i, 1 \leq a_i, b_i \leq n, 1 \leq w_i \leq 10^9$$$ and $$$w_1 \leq w_2 \leq \ldots \leq w_m$$$.

    The obvious solution is to use Kruskal's algorithm.

    Model solution using DSU with path compression and ranks:
    Solution using DSU with path compression only:
    Binomal worst-case test generator for path compression only:
  • »
    »
    7 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I prefer union-by-size. It's slightly less "correct" as a heuristic, but has the same $$$O$$$ complexity, and size is the more useful statistic for the "client" side.

    • »
      »
      »
      7 months ago, # ^ |
        Vote: I like it +3 Vote: I do not like it

      If I remember correctly, union by rank alone guarantees $$$\mathcal{O}(logN)$$$ performance for the find operation. The worst possible test case being that you combine sets of the same size only.

      • »
        »
        »
        »
        7 months ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        According to GeeksForGeeks "using size as rank also yields worst case time complexity as $$$O(\log n)$$$" but the link to the proof is defunct and redirects to some kind of news website.

        • »
          »
          »
          »
          »
          7 months ago, # ^ |
            Vote: I like it +3 Vote: I do not like it

          Well, the proof for the famous "small-to-large" technique also works for explaining why the union by rank heuristic provides that time complexity. I think that the proof can be found somewhere on Codeforces. Do pardon me for my laziness to Google.

»
7 months ago, # |
  Vote: I like it +1 Vote: I do not like it

Just a little surprised: Why did the blog post receive downvotes? I feel the above question is legitimate to ask, and it seems like it was an effective way for others to provide helpful comments / links.