swordx's blog

By swordx, history, 4 months ago, In English,

Hello Guys,

Can you share your thoughts on the following question:

Given a string, we have to find the longest non-overlapping repeating subsequence. For example, If we have a string like "abdead", then here the answer is "ad" because longest repeating subsequence is "ad" which is non-overlapping.

My initial thought on this question is that we can use the concept of longest common subsequence but we need to manage the non-overlapping part.

Thanks in advance!

 
 
 
 
  • Vote: I like it
  • +8
  • Vote: I do not like it

»
4 months ago, # |
  Vote: I like it +14 Vote: I do not like it

Split the string into two strings $$$s_1,s_2$$$ such that $$$s=s_1s_2$$$ in all possible ways (there are $$$n-1$$$ ways) and then find the LCS of those two strings. The largest such LCS is the solution. The complexity is $$$O(n^3)$$$. It can probably be done faster though.

  • »
    »
    4 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Thanks @ivan110sic, seems like a good approach. But is there any other way possible which is better than O(n^3)?

  • »
    »
    4 months ago, # ^ |
      Vote: I like it -11 Vote: I do not like it

    You can use ternary search for the split, and it'll be $$$O(n^2log(n))$$$

    • »
      »
      »
      4 months ago, # ^ |
        Vote: I like it +3 Vote: I do not like it

      You cannot use ternary search. For example for s = "sxyaatxysaaxyt" the LCSs for subsequent splits are 1 2 3 3 4 5 4 5 4 3 3 2 1.

      However, there is $$$O(n^2)$$$ algorithm for finding the optimal split.

      • »
        »
        »
        »
        4 months ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        @moonsoon, Can you please share the algorithm to find the optimal split. I tried reading the article you mentioned, But I am not able to understand the algorithm.

  • »
    »
    4 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    if s = "abcadbcd" then longest non onverlapping repeating subsequence is abcd, but with your approach there donot exist a spliting of s in which LCS of s1 and s2 is abcd. please correct me if i am wrong

    • »
      »
      »
      4 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      No, you need to find the longest subsequence which has the form $$$ss$$$ for some string $$$s$$$, not two identical non-overlapping subsequences.

    • »
      »
      »
      4 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I think I miss understood the meaning of non overlapping.

      My understanding is no character of subsequence1 should be at the same position as any character of subsequence2

      What I expect everyone is taking non overlapping meaning: subsequence1 should end before subsequence2 starts

»
4 months ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

For each suffix run z-algorithm. O(n^2) update: only works for segments

  • »
    »
    4 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    But, Z-algorithm will find the longest substring, not the subsequence right??