Finding lexicographically minimal merge of two strings in linear time

After reading the editorial for RCC Elimination round problem E, I thought of an easier problem: merging two strings so that the result is lexicographically minimal. Formally, a merge of two strings a and b is a string s of length |a| + |b| such that there exist two strictly increasing sequences of indices i_1, i_2, ..., i_|a| and j_1, j_2, ..., j_|b| with a = s_{i_1} s_{i_2} ... s_{i_|a|}, b = s_{j_1} s_{j_2} ... s_{j_|b|}, and each index of s appears exactly once among i_1, ..., i_|a|, j_1, ..., j_|b|.

The editorial mentioned above gives an algorithm for this problem that runs in O(n log n) time and uses hashes. In fact, the problem can be solved in linear time. The O(n log n) solution works roughly like this: maintain the current position p_a in a and p_b in b. On each step, lexicographically compare the suffix of a starting at p_a with the suffix of b starting at p_b, and take the next character from whichever suffix is smaller. (For this to work, each string must be terminated with a character greater than any character in the strings, so that if one suffix is a prefix of the other, the shorter suffix is considered larger, not smaller.) The author proposes comparing the suffixes using binary search and hashing, which takes O(log n) time per comparison. However, the comparison can be done in constant time.
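To make the greedy step and the role of the terminating character concrete, here is a minimal sketch. It is not the editorial's code: it compares suffixes naively by slicing, so it runs in quadratic time, and the name `minimal_merge` and the integer sentinel are my own assumptions for illustration.

```python
def minimal_merge(a: str, b: str) -> str:
    """Greedy lexicographically minimal merge (naive suffix comparison)."""
    # Assumption: characters are mapped to integers so that the sentinel can be
    # an ordinary number strictly greater than every real character.
    sentinel = max(map(ord, a + b), default=0) + 1
    sa = [ord(c) for c in a] + [sentinel]
    sb = [ord(c) for c in b] + [sentinel]

    pa = pb = 0
    out = []
    while pa < len(a) or pb < len(b):
        # Python compares lists lexicographically; each comparison here costs
        # O(n), which is the part a faster suffix comparison would replace.
        if sa[pa:] <= sb[pb:]:
            out.append(a[pa])
            pa += 1
        else:
            out.append(b[pb])
            pb += 1
    return "".join(out)
```

For a = "b" and b = "ba", comparing without the sentinel would take "b" from a first and produce "bba". With the sentinel, the suffix of a counts as larger, and the result is "bab".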

Actually, this is the well-known Longest Common Extension problem. One constant-time solution is as follows: build a suffix tree of the strings, then preprocess it with one of the Lowest Common Ancestor algorithms that answer LCA queries in constant time. The lowest common ancestor of the two leaves corresponding to two suffixes gives the length of the longest common prefix of those suffixes (it is the string depth of that node). From that, lexicographical comparison is easy: compare the first characters after the common prefix.
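The reduction from suffix comparison to a Longest Common Extension query can be sketched as follows. The `lce_naive` function is only a linear-time stand-in for the suffix tree plus LCA structure, which is not implemented here. All names, and the choice to concatenate both strings with a shared sentinel value, are assumptions made for illustration.

```python
def lce_naive(s, i, j):
    """Length of the longest common prefix of s[i:] and s[j:].

    Stand-in only: a suffix tree with constant-time LCA would answer this
    query in O(1) after linear preprocessing.
    """
    k = 0
    while i + k < len(s) and j + k < len(s) and s[i + k] == s[j + k]:
        k += 1
    return k


def suffix_less(s, i, j, lce=lce_naive):
    """Return True if s[i:] is lexicographically smaller than s[j:]."""
    if i == j:
        return False
    k = lce(s, i, j)
    if j + k == len(s):  # s[j:] is a proper prefix of s[i:], so it is smaller
        return False
    if i + k == len(s):  # s[i:] is a proper prefix of s[j:]
        return True
    return s[i + k] < s[j + k]  # first character after the common prefix


def merge_with_lce(a: str, b: str, lce=lce_naive) -> str:
    """Greedy merge that compares suffixes through an LCE query."""
    sentinel = max(map(ord, a + b), default=0) + 1
    # One sequence holding both terminated strings: a, sentinel, b, sentinel.
    s = [ord(c) for c in a] + [sentinel] + [ord(c) for c in b] + [sentinel]
    offset = len(a) + 1  # where b starts inside s
    pa = pb = 0
    out = []
    while pa < len(a) or pb < len(b):
        if suffix_less(s, pa, offset + pb, lce):
            out.append(a[pa])
            pa += 1
        else:
            out.append(b[pb])
            pb += 1
    return "".join(out)
```

With a constant-time `lce`, each step of the merge loop would take constant time, which is where the linear bound comes from. When the two terminated suffixes are equal, taking the character from either string leads to the same result, so the tie-breaking in `suffix_less` does not matter.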

It is possible to build and preprocess a suffix tree in linear time, so the overall running time is O(n), but the algorithm is quite complex. Does anyone know of a simpler algorithm with the (asymptotically) same running time?

	Rev.	Language	Who	When	Δ	Comment
	en1		eatmore	2017-05-18 18:42:35	2252	Initial revision (published)
