Is there a faster algorithm for finding the longest common subsequence of strings than $$$O(nm)$$$?

Phon1209's blog

By Phon1209, 5 years ago, In English

Problem statement: You are given a string $$$A$$$ and a set of strings $$$B$$$, where $$$|A| \le 100000$$$, the set $$$B$$$ contains 500 strings, and $$$|B_i| \le 500$$$. All strings consist of lowercase letters only. The task is to find the longest common subsequence of $$$A$$$ with each string $$$B_i$$$.

#strings, #lcs

Phon1209
5 years ago
12

Comments (10)

DougNobrega

5 years ago, # |

-17

I don't know much about suffix arrays, but I think you can solve this problem with that data structure. Good luck.

Here is a link on how to compute the LCS with a suffix array: http://lpcs.math.msu.su/~pritykin/csr2008presentations/starikovskaya.pdf

Phon1209

5 years ago, # ^ |

I want to find the longest common subsequence, not a substring. But thank you for helping me.

DougNobrega

5 years ago, # ^ |

oh, sorry :c

KeNaj712

5 years ago, # |

-6

The time complexity can't be reduced, but you can reduce the memory used. While calculating dp[i][j] you only need dp[i-1][j-1], dp[i-1][j] and dp[i][j-1], so you can remember only the previous row and the row of dp you are currently calculating. EDIT: Never mind, I hadn't noticed you have 500 strings in the array, so that definitely won't pass.
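
For reference, the two-row memory optimization could look roughly like the sketch below. It is a minimal Python illustration and not code from this thread; the function name `lcs_two_rows` is made up for the example. It still runs in $$$O(nm)$$$ time and only cuts memory to $$$O(m)$$$. The diagonal value dp[i-1][j-1] is read from the previous row.

```python
def lcs_two_rows(a, b):
    """Length of the LCS of a and b, keeping only two dp rows in memory."""
    prev = [0] * (len(b) + 1)  # dp row for the previous prefix of a
    for ch in a:
        cur = [0] * (len(b) + 1)  # dp row for the current prefix of a
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                # Extend the LCS of the two shorter prefixes (diagonal cell).
                cur[j] = prev[j - 1] + 1
            else:
                # Drop the last character of a or of b.
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return prev[len(b)]
```

Calling `lcs_two_rows(A, B_i)` once per string gives the answer for that pair, but with $$$n = 100000$$$ and 500 strings of length 500 the total work is still too large, as the EDIT above notes.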

pwild

5 years ago, # |

+12

There are two problems here which are easy to confuse:

For the Longest common subsequence problem, $$$\mathcal{O}(mn)$$$ is pretty much the best you can do (apparently you can get $$$\mathcal{O}(n^2/\log n)$$$ if $$$m=n$$$, see here).

For the Longest common substring problem, you can for instance use suffix arrays (as in the link given by DougNobrega) to achieve $$$\mathcal{O}(m+n)$$$.

Are you sure your problem is asking about subsequences rather than substrings?

Claris

5 years ago, # |

+20

Note that $$$|B|=m\leq 500$$$, so the answer cannot be very large: it is at most $$$m$$$.

Let $$$f_{i,j}$$$ denote the minimum position $$$k$$$ such that the length of $$$LCS(A[1..k],B[1..j])$$$ is $$$i$$$. There are only $$$O(m^2)$$$ states, and the complexity is $$$O(26n+m^2)$$$.

In your problem there are $$$500$$$ queries, so the whole complexity is $$$O(26n+500m^2)$$$.

Phon1209

5 years ago, # ^ |

Can you explain it in detail? It looks like exactly what I'm looking for.

Claris

5 years ago, # ^ |

+10

Precompute $$$g[i][j]$$$, the minimum position $$$k\geq i$$$ such that $$$A[k]=j$$$; this can be calculated in $$$O(26n)$$$.

Start DP from $$$f[0][0]=0$$$.

Transition Equation:

$$$f[i][j]\rightarrow f[i][j+1]$$$
$$$g[f[i][j]+1][B[j+1]]\rightarrow f[i+1][j+1]$$$
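
The precomputation and the two transitions above can be sketched in Python as follows. This is an illustration under a few assumptions rather than Claris's own code: the names `build_next` and `lcs_by_position` are made up, positions in $$$A$$$ are 1-based as in the comment, the value $$$n+1$$$ stands for "no such position", and all strings are assumed to contain lowercase letters only.

```python
def build_next(a):
    """g[i][c] = smallest 1-based k >= i with a[k] == chr(97 + c), or n + 1 if none."""
    n = len(a)
    g = [[n + 1] * 26 for _ in range(n + 2)]  # rows 0 and n + 1 are sentinels
    for i in range(n, 0, -1):
        g[i] = g[i + 1][:]               # inherit answers from position i + 1
        g[i][ord(a[i - 1]) - 97] = i     # a[i] itself is the nearest occurrence
    return g


def lcs_by_position(g, n, b):
    """LCS length of A and b, where g = build_next(A) and n = len(A)."""
    m = len(b)
    inf = n + 1
    # f[i][j] = smallest k with LCS(A[1..k], b[1..j]) of length i, or inf.
    f = [[inf] * (m + 1) for _ in range(m + 1)]
    f[0][0] = 0
    for j in range(m):
        c = ord(b[j]) - 97               # this is B[j+1] in 1-based notation
        for i in range(j + 1):
            k = f[i][j]
            if k > n:
                continue                 # state is unreachable
            # Transition 1: skip B[j+1].
            if k < f[i][j + 1]:
                f[i][j + 1] = k
            # Transition 2: match B[j+1] with its next occurrence after k.
            p = g[k + 1][c]
            if p < f[i + 1][j + 1]:
                f[i + 1][j + 1] = p
    return max(i for i in range(m + 1) if f[i][m] <= n)
```

The table `g` is built once for $$$A$$$ and then reused for every $$$B_i$$$, which is where the $$$O(26n+500m^2)$$$ total comes from.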

Redux

5 years ago, # |

Do you have a link to the problem?

djq_cpp

5 years ago, # |

Use a bitset to maintain the difference array of each row in the trivial O(n * m) DP and you'll get an O(n * m / w) algorithm (usually w = 32 or 64).

(Sorry, the only description of this optimization I have is in Chinese.)
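
One common bit-parallel formulation of this idea is sketched below; whether it matches the description djq_cpp had in mind is an assumption. Bit $$$j$$$ of `d` is set when the current DP row increases between columns $$$j$$$ and $$$j+1$$$, so the number of set bits is the LCS length. The sketch uses a Python integer as the bitset (Python integers are arbitrary precision and processed word by word internally), and the name `lcs_bitset` is made up for the example.

```python
def lcs_bitset(a, b):
    """Length of the LCS of a and b using a bit-parallel row update."""
    # masks[ch] has bit j set exactly when b[j] == ch.
    masks = {}
    for j, ch in enumerate(b):
        masks[ch] = masks.get(ch, 0) | (1 << j)

    d = 0  # difference array of the current dp row, one bit per column
    for ch in a:
        x = d | masks.get(ch, 0)   # old increases plus possible new matches
        y = (d << 1) | 1           # old increases shifted by one column
        # The subtraction propagates a borrow through each block of x,
        # which keeps exactly one increase per block at its lowest set bit.
        d = x & (x ^ (x - y))
    return bin(d).count("1")
```

For example, `lcs_bitset("ab", "ba")` evaluates to 1. In C++ the same update needs a hand-written multi-word shift and subtraction with borrow, since `std::bitset` does not provide subtraction.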
