BERNARB.01's blog

By BERNARB.01, 4 weeks ago, In English

How to solve this?

You have queries of the following three types:

  • $$$add(i, v)$$$: add an element with value $$$v$$$ at index $$$i$$$.

  • $$$query(l, r)$$$: count the number of distinct values of elements on the range from $$$l$$$ to $$$r$$$ (inclusive).

  • $$$remove(i, v)$$$: remove one of the elements with value $$$v$$$ from the index $$$i$$$.

Notes:

  • let $$$Q$$$ be the number of queries of the first and third types, and $$$Q'$$$ the number of queries of the second type; $$$1 \le Q \le 10^5$$$, $$$1 \le Q' \le Q \log Q$$$.

  • $$$1 \le v \le 10^5$$$.

  • $$$0 \le i, l, r \le 10^8\space\space(l \le r)$$$.

  • in $$$add$$$ queries, an index might hold multiple elements at the same time, and some of them may share the same value.

  • in $$$remove$$$ queries it is guaranteed that there will be at least one value at index $$$i$$$ which equals $$$v$$$.

  • $$$query$$$ queries must be performed online, but the other two types can be preprocessed if needed.

  • notice the unusual constraints on $$$l$$$, $$$r$$$ and $$$i$$$.

Is there any way to do this in $$$O(\log n)$$$ time for queries of the second type and $$$O(\log^2 n)$$$ or faster for the first and third types?
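For reference, here is a brute-force baseline that pins down the intended semantics of the three queries (names such as `remove_one` are my own; this is far too slow for the stated limits and is only a correctness sketch):

```cpp
#include <map>
#include <set>

// index -> multiset of values currently stored at that index
std::map<int, std::multiset<int>> elems;

void add(int i, int v) { elems[i].insert(v); }

void remove_one(int i, int v) {
    auto &s = elems[i];
    s.erase(s.find(v));            // erase exactly one copy of v
    if (s.empty()) elems.erase(i);
}

int query(int l, int r) {
    std::set<int> distinct;
    // iterate only over indices that actually hold elements
    for (auto it = elems.lower_bound(l); it != elems.end() && it->first <= r; ++it)
        distinct.insert(it->second.begin(), it->second.end());
    return (int)distinct.size();
}
```

Each `query` here can touch every occupied index in the range, so it is linear in the worst case; the whole question is how to beat this.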


»
4 weeks ago, # |

What is the limit on a[i]?

If a[i] is in the hundreds or so, then you can simply iterate over every candidate value from 1 to that limit and binary search whether it occurs within the range.

If you can't, my next thought would be to use a segment tree to answer the queries. A data structure called a "merge-sort tree" (a segment tree in which every node stores a sorted copy of its subarray, requiring $$$n \log n$$$ memory) makes it possible to answer the queries with binary search, or with fractional cascading as an alternative (see Fractional Cascading).
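As a sketch of the merge-sort-tree idea, here is my own minimal static version, which counts how many elements in a range are at most some value (the structure and method names are illustrative, not from the comment):

```cpp
#include <algorithm>
#include <vector>

// Static merge-sort tree: node k stores a sorted copy of its subarray,
// so "how many a[i] in [l, r) are <= x" costs O(log^2 n).
struct MergeSortTree {
    int n;
    std::vector<std::vector<int>> t;

    MergeSortTree(const std::vector<int>& a) : n((int)a.size()), t(2 * n) {
        for (int i = 0; i < n; i++) t[n + i] = {a[i]};
        // build each internal node by merging its two sorted children
        for (int i = n - 1; i > 0; i--)
            std::merge(t[2 * i].begin(), t[2 * i].end(),
                       t[2 * i + 1].begin(), t[2 * i + 1].end(),
                       std::back_inserter(t[i]));
    }

    // count of elements <= x on the half-open range [l, r)
    int count_leq(int l, int r, int x) {
        int res = 0;
        for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
            if (l & 1) { res += (int)(std::upper_bound(t[l].begin(), t[l].end(), x) - t[l].begin()); l++; }
            if (r & 1) { r--; res += (int)(std::upper_bound(t[r].begin(), t[r].end(), x) - t[r].begin()); }
        }
        return res;
    }
};
```

The binary search in every covering node is what fractional cascading would shave down to a single search overall.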

For remove queries it is much harder, as you would have to remove that number from $$$\log n$$$ sorted subarrays. It might just be better to use a boolean array to check whether that number has been used.

These are some of the ideas I have, but they're not definitive. Do you know what the time limit of this problem is? Normally, the size of the array would not be so large.

  • »
    »
    4 weeks ago, # ^ |

    All the constraints are listed in the notes.

    The time limit is about 2 seconds.

»
4 weeks ago, # |

I think that if rectangle queries can be solved in $$$O(\log n)$$$, then there should probably be a nice way to solve your problem efficiently.

Similar problem : https://www.codechef.com/problems/DISTNUM

One way to solve DISTNUM.
»
4 weeks ago, # |

Can you post a link to the problem?

»
4 weeks ago, # |

The best algorithm I can come up with is $$$\mathcal{O}(\log^2 n)$$$ per query (which should still pass).

First, compress indices so that we have exactly one add and exactly one remove operation at every index. Say the value that is added at position $$$i$$$ is $$$v_i$$$.

Call a value active if it has been added but not deleted yet. Maintain an array where position $$$i$$$ holds $$$-1$$$ if it is not active, and otherwise the next active position $$$j > i$$$ with $$$v_i = v_j$$$ (or $$$+\infty$$$ if no such position exists). You can maintain these values in $$$\mathcal{O}(\log n)$$$ per query.
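A sketch of how those next-pointers can be maintained in $$$\mathcal{O}(\log n)$$$ set operations per update, assuming positions are already compressed (my own naming; here inactive positions are simply absent from `nxt` rather than holding $$$-1$$$, and the matching segment-tree updates are omitted):

```cpp
#include <iterator>
#include <map>
#include <set>

const int INF = 1 << 30;                 // stands in for +infinity

std::map<int, std::set<int>> pos;        // value -> active positions holding it
std::map<int, int> nxt;                  // position -> next active position with same value

void activate(int i, int v) {
    auto& s = pos[v];
    auto it = s.insert(i).first;
    auto after = std::next(it);
    nxt[i] = (after == s.end()) ? INF : *after;
    if (it != s.begin())
        nxt[*std::prev(it)] = i;         // predecessor now points to i
}

void deactivate(int i, int v) {
    auto& s = pos[v];
    auto it = s.find(i);
    if (it != s.begin()) {               // predecessor skips over i
        auto after = std::next(it);
        nxt[*std::prev(it)] = (after == s.end()) ? INF : *after;
    }
    s.erase(it);
    nxt.erase(i);
}
```

Each update touches only the changed position and its predecessor in the per-value set, which is why the corresponding segment-tree structure only needs $$$\mathcal{O}(1)$$$ point updates per operation.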

Now, you can count the number of distinct active values $$$v_i$$$ in range $$$[a, b]$$$ by counting the positions in $$$[a, b]$$$ whose stored value is greater than $$$b$$$: those are exactly the last active occurrences within the range of each distinct value.

To do this, maintain a segment tree with, at every node, an indexed set of the values currently stored below it. You can count the number of values in $$$[a, b]$$$ greater than $$$b$$$ with $$$\log n$$$ queries of the form "how many values in this set are greater than $$$b$$$?", each answerable in $$$\log n$$$ time, for $$$\mathcal{O}(\log^2 n)$$$ time per query.
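One possible sketch of that structure, using GCC's pb_ds order-statistics tree as the "indexed set" (GCC-only; the struct and method names are mine, and index compression is assumed to have happened already):

```cpp
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <climits>
#include <utility>
#include <vector>
using namespace __gnu_pbds;

// Balanced BST with order_of_key in O(log n). Storing (value, position)
// pairs keeps duplicate values distinct.
typedef tree<std::pair<int, int>, null_type, std::less<std::pair<int, int>>,
             rb_tree_tag, tree_order_statistics_node_update> ordered_set;

struct CountTree {
    int n;
    std::vector<ordered_set> t;
    CountTree(int n) : n(n), t(2 * n) {}

    void set_val(int p, int val) {    // record nxt[p] = val
        for (int i = p + n; i >= 1; i >>= 1) t[i].insert({val, p});
    }
    void unset_val(int p, int val) {  // undo it before p changes
        for (int i = p + n; i >= 1; i >>= 1) t[i].erase({val, p});
    }
    int greater_in(ordered_set& s, int b) {   // #stored values > b
        return (int)s.size() - (int)s.order_of_key({b + 1, INT_MIN});
    }
    // how many positions in [l, r) store a value > b
    int count_greater(int l, int r, int b) {
        int res = 0;
        for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
            if (l & 1) res += greater_in(t[l++], b);
            if (r & 1) res += greater_in(t[--r], b);
        }
        return res;
    }
};
```

Each point update touches $$$\log n$$$ nodes with a $$$\log n$$$ set operation apiece, matching the $$$\mathcal{O}(\log^2 n)$$$ bound stated above.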

  • »
    »
    4 weeks ago, # ^ |

    Well, I thought of something similar, but the problem with this (and the reason why I asked for an $$$O(\log n)$$$ solution) is that I perform the queries of the second type inside a binary search, which makes it $$$O(\log^3 n)$$$ (I reworked the constraints so that this is clearer now).

    I also couldn't find a way to compress the indices so that exactly one add and one remove operation happens at each index, even if the $$$query$$$ queries were offline, because we would have to map them to the correct new indices (i.e. we would have to extend or shrink the ranges of the queries), which was hard for me. Can you describe a way to do this compression?

    And can we replace the segment tree with an indexed set in each node by another data structure that answers the same queries (how many values are greater than $$$b$$$) faster, while still supporting the same updates (even if the updates become slower, like $$$O(\log^2 n)$$$)?

    • »
      »
      »
      4 weeks ago, # ^ |

      Here's code for compressing the indices. It's basically just a bunch of binary searching. The second value in comp records whether we have added at that position and whether we have deleted at that position.

      code
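The collapsed code above isn't visible here, so the following is only my own guess at the binary-searching approach described: give every add operation its own compressed slot keyed by (original index, arrival order), so each slot sees exactly one add and at most one remove, and remap a query range with two binary searches.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical offline compressor; all names are mine, not the commenter's.
struct Compressor {
    // (original index, arrival order) of every add query, sorted
    std::vector<std::pair<int, int>> slots;

    // call once with all (index, arrival order) pairs of the add queries
    void build(std::vector<std::pair<int, int>> adds) {
        std::sort(adds.begin(), adds.end());
        slots = std::move(adds);
    }

    // compressed half-open range [lo, hi) covering original range [l, r]
    std::pair<int, int> map_query(int l, int r) const {
        int lo = (int)(std::lower_bound(slots.begin(), slots.end(),
                                        std::make_pair(l, -1)) - slots.begin());
        int hi = (int)(std::lower_bound(slots.begin(), slots.end(),
                                        std::make_pair(r + 1, -1)) - slots.begin());
        return {lo, hi};
    }
};
```

Since only the add operations are needed to build `slots`, the queries themselves can still be remapped online, which matches the follow-up discussion below.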
      • »
        »
        »
        »
        4 weeks ago, # ^ |

        I didn't think about it this way at all!! With this I can also do the compression for the queries online, right?

        I can store only the $$$add$$$ and $$$remove$$$ queries in comp and do a binary search to find where $$$l$$$ and $$$r$$$ should be at each query. (??)

        • »
          »
          »
          »
          »
          4 weeks ago, # ^ |

          Yes, you only need to know where the add operations will be.