sliviu's blog

By sliviu, history, 4 years ago, In English

Background

Square root decomposition / Mo's algorithm is a known method used to solve a large variety of problems in O(N sqrt(N)). However, it is not used all that often by more experience programmers because we can use a BIT or segment tree instead and attain logarithmic time. However, I've recently stumbled upon problem which I only know how to solve using sqrt decomposition.

The problem

You are given an array A with N elements (indexing starts from 1). You need to answer M queries of the type: "Given three indices l r k, find the number of elements which appear exactly k times in the subarray A[l], A[l+1], ..., A[r].

Input:

11 3
1 2 4 3 2 5 6 4 5 2 1
1 6 2
2 7 3
4 11 1

Output:

1
0
4

I've tried implementing online/offline solutions using persistent segments and binary indexed trees, but to no avail. Is there a O(N logN) or O(N log^2 N) solution for this specific problem?

  • Vote: I like it
  • +4
  • Vote: I do not like it

| Write comment?
»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Can you explain the sqrt decomposition approach? I can't think of one without having a factor of k in the time complexity.

  • »
    »
    4 years ago, # ^ |
    Rev. 3   Vote: I like it +3 Vote: I do not like it

    You divide the queries into blocks of sqrt(n) and sort them. The comparision function would be

    return ((l1/(sqrt(n)) == l2/(sqrt(n))) ? r1<r2 : l1/(sqrt(n)) < l2/(sqrt(n)));
    

    You keep two variables left and right which represent the indices of the current query. You go through all the queries and you increment/decrement the left and right variables until you reach the indices of the current query.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Btw, u want the solution to be (n logn) for each query (so m(n logn) or for all together?

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +3 Vote: I do not like it

        O(m*n log n) would be worse than brute force. I meant O(log n) per query

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Yeah sorry for the stupid question lol, I didn't understand your solution well, I'm not really versed in this sqrt thing, will have to think about it a bit more.

»
4 years ago, # |
  Vote: I like it +19 Vote: I do not like it

You can do it in O((N + Q) * log(N)) using persistent segment trees (also some other segment trees also work). For every a[i] find the position of (k + 1)-th occurrence of a[i] in [i, n] range, let's denote it x. If such occurrence doesn't exist, set x to be infinity or such. Now, build a segment tree over x values. The answer to the query [l, r] is simply the number of x values that satisfy x <= r. You can do this part via persistent segment trees in O(log(n)) (or merge sort tree in O(log^2(n)) or whatever else fits in TL). This is a well-known problem to be solved via persistent segment trees. Just query tree(r) and tree(l — 1), and etc.

»
4 years ago, # |
  Vote: I like it +27 Vote: I do not like it

Let's make the array have size $$$2n$$$ for convenience. (This won't affect the asymptotic time complexity). We can adapt the reduction from here to show the problem is at least as hard as multiplying two $$$\sqrt{n}$$$ by $$$\sqrt{n}$$$ boolean matrices. (This isn't a proof of impossibility because it is theoretically possible in $$$o(n^{1.5})$$$, but you'll probably never use it in practice).

Multiplication of two $$$\sqrt{n}$$$ by $$$\sqrt{n}$$$ boolean matrices can be restated as follows:

Let $$$S=\{1,2,\ldots,\sqrt{n}\}$$$. Given $$$\sqrt{n}$$$ sets $$$A_1,A_2,\ldots, A_\sqrt{n} \subseteq S$$$ and $$$\sqrt{n}$$$ sets $$$B_1,B_2,\ldots, B_\sqrt{n} \subseteq S$$$, compute $$$|A_i\cap B_j|$$$ for each $$$1\leq i,j\leq \sqrt{n}$$$.

We can reduce this to your problem by dividing the array into $$$2\sqrt{n}$$$ blocks of size $$$\sqrt{n}$$$. Every block will contain each element of $$$\{1,2,\ldots,\sqrt{n}\}$$$ exactly once. For $$$1\leq i \leq \sqrt{n}$$$, the $$$i$$$th block from the beginning will contain the elements of $$$S-A_i$$$ followed by elements of $$$A_i$$$. For $$$1\leq j \leq \sqrt{n}$$$, the $$$j$$$th block from the end will contain the elements of $$$B_j$$$ followed by elements of $$$S-B_j$$$. For each $$$1\leq i,j\leq \sqrt{n}$$$, we can choose an interval containing $$$A_i$$$, $$$B_j$$$, and some full blocks. Then, we query how many numbers occur two more times than the number of full blocks. This will be exactly $$$|A_i\cap B_j|$$$, as desired.