svg_af's blog

By svg_af, history, 8 years ago, In English

Hello There

So i found this Problem where i have to find the LIS of 10^5 pairs of integers

a pair is greater than another pair if x1<x2 and y1<y2 strictly

now finding LIS using a segment tree for such a large input is no issue ... but sorting the pairs kinda confused me

any ideas or hints how should i think about this problem ?

  • Vote: I like it
  • +7
  • Vote: I do not like it

»
8 years ago, # |
Rev. 3   Vote: I like it +6 Vote: I do not like it

I would do it with 2d segment tree for max. The LIS for the current position is just max in rectangle [(0, 0) — (xi — 1, yi — 1)] + 1. After that we need to update current position (xi, yi) with the current value of LIS. Its really similar to the solution of LIS of integers except the fact it uses 2d segment tree/BIT. In my opinion this solution is much easier to write and understand.

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    How could be the implementation of a 2D segment tree with those constraints? I know we can do some coordinate compression, but still there may be 10^5 different x and y coordinates.

    Sorry, I've never done a 2D segment tree. I thought that 2D segment trees were Quad Trees, but it seems they are segment trees whose nodes are segment trees too @_@

    • »
      »
      »
      8 years ago, # ^ |
      Rev. 2   Vote: I like it +4 Vote: I do not like it

      Sorry it took so long, but it ca easily be done using max dynamic fenwick or dynamic segment tree. Or just by making it a segment tree in every node of which there is an implicit treap (or just treap). Or just compress the cordinates.

      • »
        »
        »
        »
        8 years ago, # ^ |
          Vote: I like it +5 Vote: I do not like it

        You say "easily", but that doesn't sound easy at all for me hahaha. I got a lot to learn then.

        Thanks!

  • »
    »
    8 years ago, # ^ |
      Vote: I like it +4 Vote: I do not like it

    but what about memory ??

    the concept is beautiful and new to me ... but is there a way to implement it with constraints being 10^5 ??

»
8 years ago, # |
  Vote: I like it -10 Vote: I do not like it

You can just process all pairs by increasing y, and if some of them have the same y then process them by increasing x. Now if we are proccesing pair (xi, yi), then its' result ri equals to 1 + max rj where j < i, xj < xi. You can get this max by using segment tree where for each y you store maximal result of already proccesed pairs with exact y. Because cordinates can be up to 109, you need to compress them so you will be able to store them in tree (it's needed only for y). Answer for whole task is maximum result of all pairs.

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Wouldn't that be O(n) per query in the worst case?

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Why would it be? You are getting max rj in log n because it's just simple query on segment tree, and you update a tree also in log n.

      • »
        »
        »
        »
        8 years ago, # ^ |
          Vote: I like it +10 Vote: I do not like it

        Sorry I didn't understand your approach very well.

        If you have 3 points (3,3), (2,2), (1,1), if you process them by increasing y (breaking ties with increasing x) wouldn't your algorithm produce 3 as an answer?

        • »
          »
          »
          »
          »
          8 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Oh yea, I missunderstood a problem I thought it's set of point, my bad.

»
8 years ago, # |
Rev. 2   Vote: I like it +40 Vote: I do not like it

Numerous O(nlg^2n) solution exists for this problem, and this is my favorite solution.

dp formula is : dp[i] = Max(xj < xi, yj < yi, j < i) dp[j] + 1. In this formula we are considering all O(n^2) pairs one by one — which is slow. Now, I will calculate this table faster with divide & conquer approach.

Let solve(s, e) -> calculate all dp value pairs from s to e. What we will do is -

  • call solve(s, m) to get dp values (m = (s+e)/2)

  • "propagate" the dp values from [s, m] to [m+1, e]

  • call solve(m+1, e) to get the rest.

when we are doing D&C, we should still consider all O(n^2) pairs. This is why we propagate all the values from [s, m] to [m+1, e].

Naive propagation can be done by this short code :

for(i = [m+1, e]) update dp[i] to Max(xj < xi, yj < yi, j = [s, m]) dp[j] + 1

But notice that [s, m] and [m+1, e] are disjoint intervals — which eliminates the needs to consider for i < j. We can preprocess the values in [s, m] by sorting in xcoord & maintaining segment trees by ycoord. Now, updating values in [m+1, e] can be replaced by simple queries.

Time complexity is T(N) = 2T(N/2) + O(NlgN), we can show T(N) = O(Nlg^2N) by master's theorem. Memory complexity is O(N) (which is a lot better than 2d segtrees.)

  • »
    »
    8 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    It is actually T(N) = 2T(N/2) + O(NlogN) which is O(Nlog^2N) too, isn't it?

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      You're right. Thanks for pointing out that.

»
8 years ago, # |
Rev. 3   Vote: I like it +10 Vote: I do not like it

I implemented a solution using 2D BIT with coordinate compression. We only create entries on the map as needed, so memory and runtime are both O(Nlog^2N). I used unordered_map to speed up the queries, but I'm still getting TLE #3.

I tested it on my computer and it runs in around 2 seconds for the worst test case, but the problem asks for runtime under 0.345 seconds...

I wonder if anyone has anything faster than O(Nlog^2N).

  • »
    »
    8 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    Actually, my above solution can be optimized to NlgN since -

    • There is O(N+Q) RMQ.

    • Sorting step can be eliminated by mergesort-fashioned way (cache the result like merge sort tree)

    • since we can't update the linear RMQ, we should compress the coordinate at every recursion stage. Solution to this is similar as above.

    So better asymptotic is "possible". But I've never heard about "practical" O(nlgn) solution.

    btw if you are getting TL with 2d segtrees, I "strongly" recommend you to write above solution. I guess it will run at least 2-3 times faster than 2d segtrees (which are quite slow + memory hungry in practice).

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 4   Vote: I like it 0 Vote: I do not like it

      Hi, could you explain more on this algorithm? I don't see how linear-time RMQ can be used here. Are you talking about 2D static RMQ, or 1D dynamic, or some other version?

»
8 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

I believe it can be solved in O(nlog(n)^2) in the same manner as usual LIS but with constant less than in segment tree. Let dp[i] denote the set containing smallest elements where subsequence of length i could end(if we sort first elements in this set, the second elements will be anti-sorted).Now upper bound will work in the same manner instead of array of sorted elements d[i] we get array of sorted array sets. I am not sure about that:)

»
8 years ago, # |
  Vote: I like it 0 Vote: I do not like it

There was a comment that said this problem is similar to problem NICEDAY on spoj. But I can't see the similarity. Does anyone know how?