SecondThread's blog

By SecondThread, history, 4 years ago, In English

According to Wikipedia, an RMQ can be built with $$$O(n)$$$ memory and time precomp that can answer queries in $$$O(1)$$$. The idea is to divide the array into $$$log(n)/4$$$ blocks and then build an RMQ/sparse table normally on each of the blocks using $$$O(nBlocks*log(nBlocks))$$$ time and memory. In this case, $$$nBlocks$$$ is roughly $$$n/log(n)$$$ so the memory and time used to precomp everything is linear. By doing this, we can find the min of all completely covered blocks. That part sounds easy.

The hard part is dealing with the blocks that a query starts and ends in. Wikipedia says that I need to map each block to its corresponding Cartesian tree, and then I can precompute all the answers for every Cartesian tree. I see how, if I could do that, it would be easy to answer queries in $$$O(1)$$$. As I understand it, I could just store which index for a given Cartesian tree contains the relevant min for all pairs of queries within that subarray in $$$O(blockSize^2 * nCartesianTrees)$$$.

But how do I go about quickly (and hopefully cleanly) mapping a subarray to its corresponding Cartesian tree?

Side question: Are there even performance gains to be made here, or am I just wasting my time trying to get rid of this log factor?

  • Vote: I like it
  • +43
  • Vote: I do not like it

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by SecondThread (previous revision, new revision, compare).

»
4 years ago, # |
  Vote: I like it +21 Vote: I do not like it

Is it enumerated?

»
4 years ago, # |
  Vote: I like it +6 Vote: I do not like it

There's a better way to go about it, it's described in this Topcoder article.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    The method they describe only applies to arrays where adjacent elements differ by +1 or -1, which is a very severe restriction.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +2 Vote: I do not like it

      They also show how to reduce the general problem to the restricted version. See the "From RMQ to LCA" section.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Oh, I see now. That’s pretty clean; thanks for the link!

»
4 years ago, # |
  Vote: I like it +26 Vote: I do not like it

You can detect the Cartesian tree with the algorithm presented here starting at slide 119, and you can represent a tree on $$$N$$$ nodes in $$$2N$$$ bits using a balanced parentheses representation.

Some thoughts on practicality from recent discussion with neal:

  • The difference in time between $$$O(N)$$$ precomputation and cache-efficient $$$O(N \log {N})$$$ precomputation isn't typically important. However, the memory improvements might make this worthwhile.
  • The speed of your $$$O(1)$$$ lookups is more commonly going to be what you want to optimize for. $$$N \log{N}$$$ RMQ can handle a lookup by performing two memory accesses. The Cartesian tree version needs to do a lot more: 2 lookups on the block sparse table, 2 lookups on the Cartesian tree table, with additional lookups on the initial data to join those results.
  • $$$log(N)/4$$$ makes sense for the asymptotic analysis (since the Catalan numbers grow like $$$4^N$$$), but it's tiny for practical purposes. For $$$N = 10^5$$$, it's $$$4$$$. If you do try implementing this, you'll fare better with a larger block size (say, $$$8$$$) which still produces a compact Cartesian tree table, but sees bigger improvements on the size of the block sparse table.
»
4 years ago, # |
  Vote: I like it +22 Vote: I do not like it

Here you can find how to calculate the part for in-block-queries. But there is another way to do this without the "cartesian" tree way. For each position $$$i$$$ inside a block calculate a binary mask with the bits at positions $$$b_0 > b_1 > \ldots > b_k$$$ set to one if $$$a[i] > a[b_0] > a[b_1] > \ldots > a[b_k]$$$ ($$$i > b_0$$$), here $$$b_j$$$ is maximum. Those masks can be represented in an integer if the size of a block is less than the size of a word in c++. Then to get the minimum position in the interval $$$[l, r]$$$ inside a block, you only have to find the smallest active bit $$$b$$$ in $$$mask[r]$$$ such that $$$b\ge l$$$. If there is none active bit satisfying this condition the answer will be $$$r$$$.