daubi's blog

By daubi, history, 3 years ago, translation, In English

RMQ $$$O(n)/O(1)$$$

Problem

For convenience, we will assume that the array numbering starts from zero.

Let's split our array into blocks of $$$\lceil\log_{2}(n)\rceil$$$ numbers. Let's denote $$$bl = \lceil\log_{2}(n)\rceil$$$. The first block contains the numbers from $$$a_{0}$$$ to $$$a_{bl - 1}$$$, the second from $$$a_{bl}$$$ to $$$a_{2 \cdot bl - 1}$$$, and so on; the last block may contain fewer than $$$bl$$$ numbers.

Thus, we get $$$\lceil\frac{n}{bl}\rceil$$$ blocks. Let's first learn how to find the minimum on a segment that lies entirely inside one block.

Let us consider the blocks independently: in each block we go from left to right and maintain a stack of minimums. For each boundary $$$r$$$ we save the stack of minimums for the segment from the beginning of $$$r$$$'s block to $$$r$$$. We store the stack of minimums as a mask of zeros and ones: the $$$i$$$-th bit is $$$1$$$ if the $$$i$$$-th number of the block is currently on the stack of minimums.

Suppose now we want to find the minimum on the segment from $$$l$$$ to $$$r$$$, where $$$l$$$ is in the same block as $$$r$$$. Note that the minimum is located at the first element of the stack of minimums saved at $$$r$$$ whose position is at least $$$l$$$; in the mask this is the lowest set bit with index at least $$$l - start_{l}$$$, where $$$start_{l} = \lfloor\frac{l}{bl}\rfloor \cdot bl$$$ is the start of the block. Let's denote the mask of the stack of minimums saved at $$$r$$$ by $$$mask$$$. Then the bit we need is the lowest set bit of $$$mask \gg (l - start_{l})$$$. The count of different masks is $$$2^{bl} < 2 \cdot n$$$, so with the help of dynamic programming we can pre-calculate the position of the lowest set bit of each mask.
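The in-block part above can be sketched as follows. This is my own illustration, not the blog's hidden code: identifier names are mine, `__builtin_clz` is a GCC/Clang builtin used to find the current stack top, and `low` is the dynamic-programming table of lowest set bits described in the text.

```cpp
#include <vector>
using namespace std;

// mask[r] encodes the stack of minimums of r's block prefix ending at r:
// bit i is set iff the i-th element of the block is currently on the stack.
// low[m] is the precomputed position of the lowest set bit of m.
struct InBlockRMQ {
    int n, bl;
    vector<int> a;
    vector<unsigned> mask;
    vector<int> low;

    InBlockRMQ(const vector<int>& v) : n(v.size()), a(v) {
        bl = 1; while ((1 << bl) < n) bl++;        // bl = ceil(log2(n))
        low.assign(1u << bl, 0);
        for (unsigned m = 2; m < (1u << bl); m++)  // low[1] = 0 already
            low[m] = (m & 1) ? 0 : low[m >> 1] + 1;
        mask.assign(n, 0);
        unsigned m = 0;
        for (int r = 0; r < n; r++) {
            if (r % bl == 0) m = 0;                // a new block starts
            int start = r - r % bl;
            while (m) {                            // pop stack tops that are >= a[r]
                int top = 31 - __builtin_clz(m);   // highest set bit = stack top
                if (a[start + top] < a[r]) break;
                m ^= 1u << top;
            }
            m |= 1u << (r % bl);                   // push r itself
            mask[r] = m;
        }
    }

    // minimum on [l, r], assuming l and r lie in the same block
    int query(int l, int r) const {
        return a[l + low[mask[r] >> (l % bl)]];
    }
};
```

The shifted mask is never zero, because the bit for $$$r$$$ itself is always set and survives the shift, so `low[0]` is never consulted.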

Thus, we made an $$$O(n)$$$ pre-calculation and are able to find the minimum on a segment inside a block in $$$O(1)$$$. Now we need to learn how to find the minimum when $$$l$$$ and $$$r$$$ are in different blocks. In that case the answer is $$$min(getmin(l, end_{l}), getmin(start_{r}, r), getmin(end_{l} + 1, start_{r} - 1))$$$, where $$$end_{l}$$$ is the last position of $$$l$$$'s block and $$$start_{r}$$$ is the first position of $$$r$$$'s block. We already know how to find the first two values, since their boundaries lie in one block, and the third value is the minimum over a segment of whole blocks. So let's find the minimum in each block and build a sparse table on the array of these minimums. Pre-calculation of the sparse table takes $$$O(\lceil\frac{n}{bl}\rceil \cdot \log_{2}(\lceil\frac{n}{bl}\rceil))$$$, which is $$$O(n)$$$.
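A sketch of the cross-block combine (again with names of my own choosing). To keep it short, in-block minimums are found here by a direct scan over at most $$$bl$$$ elements; replacing that scan with the mask lookup described above is what gives the $$$O(1)$$$ query.

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// Sparse table over the per-block minimums; a query touches the two
// partial blocks at the borders plus a range of whole blocks in between.
struct BlockRMQ {
    int n, bl, nb;
    vector<int> a, lg;
    vector<vector<int>> sp;

    BlockRMQ(const vector<int>& v) : n(v.size()), a(v) {
        bl = 1; while ((1 << bl) < n) bl++;         // bl = ceil(log2(n))
        nb = (n + bl - 1) / bl;                     // number of blocks
        lg.assign(nb + 1, 0);
        for (int i = 2; i <= nb; i++) lg[i] = lg[i / 2] + 1;
        sp.assign(lg[nb] + 1, vector<int>(nb));
        for (int b = 0; b < nb; b++)
            sp[0][b] = inBlock(b * bl, min(n, (b + 1) * bl) - 1);
        for (int k = 1; k <= lg[nb]; k++)
            for (int b = 0; b + (1 << k) <= nb; b++)
                sp[k][b] = min(sp[k - 1][b], sp[k - 1][b + (1 << (k - 1))]);
    }

    int inBlock(int l, int r) const {               // stand-in for the mask query
        return *min_element(a.begin() + l, a.begin() + r + 1);
    }

    int query(int l, int r) const {
        int bL = l / bl, bR = r / bl;
        if (bL == bR) return inBlock(l, r);
        int res = min(inBlock(l, (bL + 1) * bl - 1), inBlock(bR * bl, r));
        if (bL + 1 <= bR - 1) {                     // whole blocks strictly between
            int k = lg[bR - bL - 1];
            res = min({res, sp[k][bL + 1], sp[k][bR - (1 << k)]});
        }
        return res;
    }
};
```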

We have learned how to find the minimum on a segment in $$$O(1)$$$ after an $$$O(n)$$$ pre-calculation.

For $$$n = 10^6$$$ my implementation of this algorithm works as fast as the non-recursive segment tree.

Code

An improved version of this algorithm, supporting updates

Problem

Recently I thought about how this structure can be modified and came up with the following idea: instead of a sparse table on the block minimums, I can once again split the minimums into blocks of $$$\lceil\log_{2}(n)\rceil$$$, calculate the stacks of minimums, and build this structure again. In effect, we get several levels of the structure; at each level we need to maintain the corresponding masks and block sizes. We keep building the next, shorter level as long as the count of numbers on the current level is greater than $$$2$$$. At each step the count of numbers shrinks by a factor of $$$\lceil\log_{2}(len)\rceil$$$, where $$$len$$$ is the length of the array on the current level. The pre-calculation at each level is linear.

Thus, let $$$f(n) = f(\lceil\frac{n}{\lceil\log_{2}(n)\rceil}\rceil) + n$$$, then pre-calculation requires $$$O(f(n))$$$.

Let's consider an update request: the stack of minimums changes in one block at the longest level, so we recalculate it directly; after that the minimum of that block may change, so we need to recalculate the stack of minimums in one block at the next level, and so on. Thus, let $$$g(n) = g(\lceil\frac{n}{\lceil\log_{2}(n)\rceil}\rceil) + \log_{2}(n)$$$; then an update request takes $$$O(g(n))$$$.

Let's consider a request for the minimum on a segment: if the boundaries lie in one block at the longest level, we simply find the answer. Otherwise, we need to find the minimum on a small subsegment on the left and on the right (each in $$$O(1)$$$) and on a segment of whole blocks. Thus, let $$$t(n) = t(\lceil\frac{n}{\lceil\log_{2}(n)\rceil}\rceil) + 1$$$; then a minimum-on-a-segment request takes $$$O(t(n))$$$.
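The three recurrences are easy to evaluate numerically. The base case here is my reading of the construction (a level of length at most $$$2$$$ costs nothing), and I use $$$\lceil\log_{2}\rceil$$$ for the per-level block length throughout; with these assumptions the values match the figures the blog reports further below.

```cpp
// bl(n) = ceil(log2(n)), the block length used on a level of length n.
int bl(long long n) { int b = 1; while ((1LL << b) < n) b++; return b; }

// Pre-calculation, update, and query recurrences; ceil division gives
// the length of the next level.
long long f(long long n) { return n <= 2 ? 0 : f((n + bl(n) - 1) / bl(n)) + n; }
long long g(long long n) { return n <= 2 ? 0 : g((n + bl(n) - 1) / bl(n)) + bl(n); }
long long t(long long n) { return n <= 2 ? 0 : t((n + bl(n) - 1) / bl(n)) + 1; }
```

For $$$n = 10^6$$$ the level lengths are $$$10^6 \to 50000 \to 3125 \to 261 \to 29 \to 6 \to 2$$$, giving $$$f = 1053421$$$, $$$g = 65$$$, $$$t = 6$$$.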

Thus, an update operation takes longer than $$$O(\log_{2}(n))$$$ but less than $$$O(\log_{2}(n)^2)$$$ (because the number of levels is at most $$$\log_{2}(n)$$$ and at each level the block length is at most $$$\log_{2}(n)$$$). I could not give a more accurate estimate.

The get operation is faster than $$$O(\log_{2}(n))$$$, since each time the length of the level is reduced by at least a factor of $$$2$$$. Again, I don't know how to estimate this more accurately, but I ran several tests.

For $$$n = 10^6$$$, this structure takes on average $$$1250ms$$$, recursive segment tree takes $$$850ms$$$, non-recursive segment tree $$$680ms$$$.

When the number of get requests is $$$100$$$ times the number of update requests, this structure takes on average $$$1170ms$$$, a recursive segment tree $$$1600ms$$$, a non-recursive segment tree $$$1200ms$$$. I don't know why the running time of my structure did not change much; most likely this is because of the constant factor, but in theory it should work much faster, since for $$$n = 10^6$$$: $$$t(n) = 6$$$, $$$g(n) = 65$$$, $$$f(n) = 1053421$$$, which is almost $$$10^6$$$. The tests I ran are random.

Code

»
3 years ago, # |

Auto comment: topic has been updated by daubi (previous revision, new revision, compare).

»
3 years ago, # |

Finally, an educational blog that makes sense to me. Thanks!!

»
3 years ago, # |

This happens and tutorial comes out.. what are the odds though :P

»
3 years ago, # |

First part of this blog is described in https://codeforces.com/blog/entry/78931.

»
3 years ago, # |

Here's my code for the static data structure above. It's not perfect, but it's pretty fast.

As for the dynamic structure: $$$f(n)$$$ is obviously linear. My scientific guess is that $$$g(n) = \log(n) \cdot t(n)$$$ and $$$t(n) = \log^*(n)$$$.

  • »
    »
    3 years ago, # ^ |

    Yes, thanks, I agree about the linearity of $$$f(n)$$$, but the second statement does not hold if you take a large enough $$$n$$$ and look at these functions.

»
3 years ago, # |

FWIW, the number of 'levels' $$$t(n)$$$ is $$$\Theta(\frac{\log{n}}{\log{\log{n}}})$$$, and likewise $$$g(n)$$$ is in $$$\Theta(\frac{(\log{n})^2}{\log{\log{n}}})$$$.

Proof sketch for t(n)
»
3 years ago, # |
Rev. 2

For g's complexity, going from $$$n$$$ to $$$\sqrt{n}$$$ adds $$$\mathcal{O}\left(\log_2(n) \log_{\log_2(n)}(n)\right)$$$ or $$$\frac{\log_2(n)^2}{\log_2 \log_2(n)}$$$. Note that $$$\log_2(\sqrt{n}) = \frac{1}{2} \log_2(n)$$$, so the next term $$$\frac{\log_2(\sqrt{n})^2}{\log_2 \log_2(\sqrt{n})}$$$ is at most $$$\frac{1}{2}$$$ the first term. Thus, the complexity is $$$\mathcal{O}\left(\frac{\log_2(n)^2}{\log_2 \log_2(n)}\right)$$$. Similarly, the complexity of t is $$$\mathcal{O}\left(\frac{\log_2(n)}{\log_2 \log_2(n)}\right)$$$.

Edit: sniped :)