CleanAirAndCleanWater's blog

By CleanAirAndCleanWater, history, 9 years ago, In English

Hi all,

I have been struggling with this problem for a long time, that would be really helpful if you could give me some hints :)

Given an array A of N elements integer number. N can be up to 200000. Each element A[i] ranges from -1e9 to 1e9, no element equals to Zero (0). We define a "valid" sequence such that 1. {x, -x} is valid sequence. 2. {x, S, -x} is valid sequence if S is a valid sequence. 3. {S1, S2} is valid sequence if S1, S2 are both valid sequences.

There are M queries, M can be up to 200000. Each query asks you to find the longest valid sequence in range [L, R] with 1 <= L <= R <= N.

Link to the problem: http://vn.spoj.com/VM14/problems/VMLSEQ/ (Vietnamese, my friend told me the contest is over so submission is not allowed anymore, but I think it's still very interesting/hard problem for discussion)

I came up with a hashing — based method to compute the "nearest" index j for each index i so that sequence [i, j] is valid, but I couldn't go any further. I thought the solution could be some kinds of merging — technique with two consecutive ranges in Interval tree (suffix — prefix), but so far I have no efficient method for that :(

| Write comment?
»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Is a sequence like {-5, 5} valid? I was looking at the original statement and I couldn't find an example of a sequence starting with a negative and ending with a positive number. If it always starts with a positive, that is of type {+x, -x}, then constructing a solution might be considerably easier.

»
9 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

You can answer each query in O(log^2(N)) time. My idea is using segment tree + hashing where in each node I save the lenght of the largest valid sequence, lenght of longest prefix and suffix. So the new lenght of the of longest valid sequnce will be ( LeftChild . suf + RightChild . pref) + bin_search(Here we will compute the lenght of the longest common subsequnce (such that if the first string is a and the second is b for each element a[i] = -b[i]) of [start_left_suf - 1; l] and [end_right_pref + 1; r]here l, r are borders of the current segment/node).

So the binary search is working in O(log(lenght)) = O(log(N)). That means that the merging is also done in logN time witch leads us to the total complexity of O(Nlog(N)) build and O(log^2(N)) query.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Could you elaborate on the idea of binary searching when merging? Perhaps with an example?

    • »
      »
      »
      9 years ago, # ^ |
      Rev. 2   Vote: I like it +8 Vote: I do not like it

      I guess by [start_left_suf — 1; l] he means [l; start_left_suf — 1] but starting from start_left_suf-1, that is [l; start_left_suf — 1] but reversed.

      Consider the sequence {1, 1, 1, 1, -1, -1, -1, -1} containing 8 elements. Let's assume that we have somehow computed the answers for [1;4] and [5;8]. All of those answers are actually zero so let's merge them. He said that we take LeftChild.suf+RightChild.pref which is 0. Then assuming we have hashed the array considering all elements from the right segment [5;8] being multiplied by minus one, we get {1, 1, 1, 1, 1, 1, 1, 1} hashed in some way and we can do binary search to find the length of the longest prefix of [5;8] that is also a suffix of [1;4] which gives us 4.

      However, here are some of my questions that don't seem to be answered in his comment:

      1) How does one compute the "length of longest prefix and suffix" stored in each node?

      2) Is the length we find with the binary search actually the longest valid subsequence? I don't think so and here is a counter example:

      {-6, 6, -7, -4, 17, 3, -2, 2}

      What we find with the binary search will be 0 since -4 differs from -17, however there are two subsequences of length 2 ({-6, 6}, {-2, 2}).

      So we need to compare the answer obtained with the binary search and the longest valid subsequences for both of node's children.

      But this is going to work only if the binary search really finds the longest valid subsequence that has some elements from the left and some elements from the right child.

      3) Does the binary search actually find what it is supposed to?

      Here is an example:

      {1, 2, -2, 3, -3, 1, 1, 1}

      We will again assume that we have computed the answers for [1;4] and [5;8]. So LeftChild.suf+RightChild.pref gives us 0. Then the binary search will result into getting 2 as the length of the longest subsequence ({3, -3}) but the answer is 4 ({2, -2, 3, -3}) so his solutions seems to be incorrect.

      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it +3 Vote: I do not like it

        This is precisely why I asked for more details, it seemed that the binary search would neither be implemented easily nor work correctly. Maybe he can explain further if his solution is actually correct :)

      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Sorry for wasting your time :/ . The 3rd error proved that the solution won't work.

»
9 years ago, # |
Rev. 14   Vote: I like it +16 Vote: I do not like it

Everyone whose a SPOJ account can practice the problem described in this link: VMLSEQ

I solved this problem. Here comes some main ideas:

We can split the given sequence into some arrays of indices (I will call them as chain): For example

10 2

2 1 -1 -2 5 1 -1 7 -7 5

can be splitted like this:

0 4

1 3

5 7 9

6

8

10

Property:

In each chain, for every two element x and y, a[x + 1..y] is a valid subsequence.

It's not hard to build these chains, I think you can do it yourself.

Now for each query l r, change it int to l-1 r you have to find two indices x and y which satisfy l <  = x <  = y <  = r and x, y are in the same chain.

My first approach is sort all queries (l-increase then r-increase). Iterate all queries then update result via a segment tree. Complexity is O(NlogN) for building chain, O(QlogQ) for sorting queries and O(Q * SQRT(Q) * logN) for solving queries. Unfortunately, this solution got TLE in some large tests.

The second approach is SQRT decomposition: sort queries and solve them in each SQRT(N) block, separately. This approach have complexity is O(Q * SQRT(Q)) for solving queries. It got 100 points and took about 2s in runtime. You may read the main idea of this approach HERE.

The third approach is heavy-light. Consider chains which have size smaller than BUBEN is light, otherwise heavy. For light chains, sort queries and iterate them, save local best of each queries. It takes O(Q + |sum.size.of.all.light.chains|) in worst case. For heavy chains, iterator all queries, with each pair of queries and heavy chain, use binary search and segment tree to find local best. It takes O(Q * |number.of.heavy.chains| * logN) in worst case. Combine local best of two phase, we have the answer for all queries. Chose BUBEN as SQRT(N), the complexity O(Q * SQRT(N) * logN) in worst case but indeed this solution is extremely fast (currently best time). I think it's hard to make a test to kill this solution.

My explaination seems confusing since I have no experience in explanning my ideas but I hope this is helpful.

The main idea of my algorithm is solve the chains queries offline, I found a problem which can be solved using this method: http://www.spoj.com/problems/ZQUERY/ . Time limit for this problem is relaxing, so you can take some implementations for benchmark.

All of algorithms I described have complexity SQRT(N) as a factor, I'm curious about the poly-log algorithm. Please share here if you have such that kind of algorithm.

UPD: There is an algorithm with complexity O(N4 / 3). Let's separate given sequence into S blocks, we can create dp[i][j] = Length of longest valid subsequence from which starts at block i and ends at block j. Sort all queries then solve each of them with O(N / S) in worst case. So we have a solution with complexity O(S * S + Q * N / S). Set S equal to N2 / 3, we have an O(N4 / 3) solution (considering N = Q). However this approach took 1.8s in the problem, much slower than the third.