pakhomovee's blog

By pakhomovee, 3 years ago, translation, In English

Mo's algorithm

Problem 1
Problem 2

Let's solve the first problem. We can solve it using persistent data structures, but we want to be able to solve problem 2 as well.

Algorithm 1

Let's solve the task offline. We need to sort the queries using the <$$$\frac{l}{k}$$$, r> comparator, where k is a fixed integer. It is quite obvious that we can easily add/delete one integer (we can do that using another array B, $$$B_{i}$$$ equals to the number of occurrences of integer i). Let's divide the queries into blocks, $$$i_{th}$$$ block contains all queries, which have $$$\left \lfloor \frac{l}{k} \right \rfloor$$$ equal to $$$i$$$. There are $$$O(\frac{n}{k})$$$ blocks. Let's assume that we have somehow calculated the answer for segment $$$[L; R]$$$. We can get the answer for segment $$$[L'; R']$$$, if we move the left and the right borders, until they are equal to $$$L'$$$ and $$$R'$$$. It is obvious that for each block the left border can move by $$$O(k)$$$ positions only for a single query, the right border — $$$O(n)$$$ positions in total, because the right borders are sorted in the increasing order. In total the left pointer will move by $$$O(q \cdot k)$$$ positions, the right border — $$$O(\frac{n^{2}}{k})$$$. If we set $$$n = q$$$, $$$k = \sqrt{n}$$$, we get the total complexity of $$$O((n + q)\cdot\sqrt{n})$$$.

Sample code

This code has a couple of problems. Firstly, it is quite slow.

Think about the second problem yourself.

The problem

Algorithm 2

Let's sort the queries the same way as in the first algorithm.

Let's call a block all the queries with the same $$$\left \lfloor \frac{l}{k} \right \rfloor$$$. There are $$$O(k)$$$ blocks in total. Let's answer the queries of length $$$\leq k$$$. We can answer each query in $$$O(k)$$$. Now, we need to answer the queries with length $$$> k$$$. For each query like that the position $$$k \cdot \left \lceil \frac{l}{k} \right \rceil$$$ is in the query. It is obvious that for each block we can move the right border as in the previous algorithm. For this we need the add operations only. For every query in the block we need to move the left border by $$$O(k)$$$ positions only, if we set $$$L = k \cdot \left \lceil \frac{l}{k} \right \rceil$$$ (the left border we need is $$$L'$$$) and try to make $$$L = L'$$$ by moving $$$L$$$ to left left. To sum up, we need $$$O(k)$$$ for each query to move the left border, $$$O(n)$$$ for each block to move the right border. This makes the complexity of $$$O(q \cdot k + \frac{n^2}{k})$$$ in total (assuming $$$n = q$$$). if we set $$$k = \sqrt{n}$$$, we get $$$O((n + q) \sqrt{n})$$$.

Some notes

3D Mo

Let's solve the second task.

Algorithm 1.

We can use sqrt decomposition while using Mo's algorithm. Let's divide the queries into blocks of size $$$k$$$. Inside each block we can use Mo's algorithm. The final complexity is $$$O(\frac{q}{k} \cdot (k^{2} + \frac{n^{2}}{k}))$$$. If we set $$$n = k$$$, we get $$$O(nk + \frac{n^3}{k^{2}})$$$. If we set $$$k = n^{\frac{2}{3}}$$$, we get $$$O(n^{\frac{5}{3}})$$$.

Sample code

Algorithm 2.

Let's sort the queries by <l / k, r / k, i>. Each of the query borders is located in a fixed block (the blocks are defined as in the previous algorithms). For each pair of blocks we can answer the queries. Let's look through the queries in the chronological order. For each pair of blocks we look at all the change queries. We move the pointers as in the first algorithm. To do all of this fast, we can maintain the index of the last query we have already checked. The total complexity is $$$O(\frac{n}{k} \cdot \frac{n}{k} \cdot q + qk)$$$, because every l/r pointer can move by $$$O(k)$$$ steps per query. If we set $$$n = q$$$, we get $$$O(\frac{n^{3}}{k^{2}}+qk)$$$. If we set $$$k = n^{\frac{2}{3}}$$$, we get $$$O(n^{\frac{5}{3}})$$$.

Sample code

Algorithm 3.

Let's sort by <i / k, l / k, r>.

Let's call a block all the queries with the same i / k and l / k. For each block we can look at all the change queries, which we haven't already processed and which are connected with this block. The left pointer can move by $$$O(k)$$$ positions per query, the left pointer can move by $$$O(n)$$$ positions per block. If we set $$$n = q$$$, we get $$$O(\frac{n^{3}}{k^{2}} + nk)$$$. If we set $$$k = n^{\frac{2}{3}}$$$, we get $$$O(n^{\frac{5}{3}})$$$.

Some more problems.

Problem 1
Problem 2
Problem 3
Problem 4
Problem 5
Problem 6
  • Vote: I like it
  • +88
  • Vote: I do not like it

»
3 years ago, # |
  Vote: I like it +33 Vote: I do not like it

You can find a lot of difficult sqrt problems (and some polylog problems) here prepared by ODT

»
3 years ago, # |
  Vote: I like it -39 Vote: I do not like it

Well known in CNOI because of ODT lol

»
3 years ago, # |
  Vote: I like it +5 Vote: I do not like it

I think a better way of thinking about Mo's algorithm that allows you to also optimize it in a lot of cases is to think about inserting a separator each $$$k$$$ positions and "bucketing" the queries depending on the first separator that is inside the query (not surprisingly, the l/k-th). For each bucket, you sort the queries increasingly by r and do sweep-line. This way, you only need a data structure that should be able to:

  • insert a value (in good complexity, say $$$O(1)$$$);
  • compute the answer (in relatively good complexity, say $$$O(\sqrt{N})$$$);
  • restore to a previous checkpoint [e.g. undo] (in relatively good complexity, say $$$O(\sqrt{N})$$$).

Sometimes not even that is needed for the problem. Nonetheless, you don't need to explicitly have the option to remove.

This is important in a lot of scenarios, for example the "most frequent element in range" problem, which you can solve in $$$O(Q \sqrt{N} \log{N})$$$ with the "standard" Mo's algorithm but rather trivially in $$$O(Q \sqrt{N})$$$ using this formulation. The only downside is that you have to write extra code to handle small queries (length smaller than $$$k$$$), but usually that is just glue code, as you can use the same data structure for that.