### Phon1209's blog

By Phon1209, 7 months ago, ,

Problem statement: There are string $A$ and set of string $B$, length of $A <= 100000$, set $B$ have 500 string elements and $|B_i|<=500$. All string consisted of the lowercase alphabet only. The task is to find longest common subsequence of string $A$ to all string $B_i$.

• +2

 » 7 months ago, # |   -17 i don't know much about suffix array, but i think with this data structure you can solve this problem. Good LuckLink about how calculate LCS with suffix array -> http://lpcs.math.msu.su/~pritykin/csr2008presentations/starikovskaya.pdf
•  » » 7 months ago, # ^ |   0 I want to find longest common subsequence, not a substring. But thank you for helping me.
•  » » » 7 months ago, # ^ |   +8 oh, sorry :c
•  » » » » 7 months ago, # ^ |   0 However, about suffix array I think this link can help you to get along with it.
 » 7 months ago, # | ← Rev. 3 →   -6 Time complexity can't be reduced, but you can actually reduce used memory. While calculating dp[i][j] you just need dp[i-1][j] and dp[i][j-1], so you can remeber only last row and row of dp you are currently calculating. EDIT: Nevermind, haven't noticed you've got 500 strings in array, so that definitely won't pass.
 » 7 months ago, # |   +12 There are two problems here which are easy to confuse:For the Longest common subsequence problem, $\mathcal{O}(mn)$ is pretty much the best you can do (apparently you can get $\mathcal{O}(n^2/\log n)$ if $m=n$, see here).For the Longest common substring problem, you can for instance use suffix arrays (as in the link given by DougNobrega) to achieve $\mathcal{O}(m+n)$.Are you sure your problem is asking about subsequences rather than substrings?
•  » » 7 months ago, # ^ |   0 Yes, definitely subsequence.
 » 7 months ago, # |   +20 Note that $|B|=m\leq 500$, so the answer is not to be very large.Let $f_{i,j}$ denote if the length of longest common subsequence is $i$, what is the minimum position of $k$ that $LCS(A[1..k],B[1..j])$ is $i$. There are only $O(m^2)$ states, and the complexity is $O(26n+m^2)$.In your problem there are $500$ queries, so the whole complexity is $O(26n+500m^2)$.
•  » » 7 months ago, # ^ |   0 Can you explain it in detail? It's really like what I looking for.
•  » » » 7 months ago, # ^ |   +10 Precompute: $g[i][j]$ denotes the minimum position $k$ that $A[k]=j$ and $k\geq i$, this can be calculated in $O(26n)$.Start DP from $f[0][0]=0$.Transition Equation: $f[i][j]\rightarrow f[i][j+1]$ $g[f[i][j]+1][B[j+1]]\rightarrow f[i+1][j+1]$
 » 7 months ago, # |   0 Do you have a link to the problem?
 » 7 months ago, # |   +5 Use a bitset to maintain the differentiated array of each row in the trivial O(n * m) DP and you'll get an O(n * m / w) algorithm. (w = 32 or 64 usually)(Sorry that I've got the description of this optimization in Chinese :(