Sorting sliding windows online [Looking for answers]

By div4only, 15 months ago, In English

Given a $$$1$$$-indexed integer array $$$a=[a_1,\,a_2,\,a_3,\,\ldots,\,a_n]$$$ and a fixed window size $$$k$$$, define the sliding window $$$I_j := [a_{j-k+1}, a_{j-k+2}, \ldots, a_j]$$$ for each $$$j \geq k$$$. For each $$$j \geq k$$$, we need to answer a query:

How many sliding windows in $$$\{I_k, I_{k+1}, \ldots, I_{j-1}\}$$$ are less than or equal to $$$I_j$$$ in lexicographic order?

For example, $$$a=[1, 2, 1, 3]$$$ and $$$k=2$$$:

(1) For $$$j=2$$$, we should answer $$$0$$$.

(2) For $$$j=3$$$, we should answer $$$1$$$, as $$$[1,2] < [2,1]$$$ in lexicographic order.

(3) For $$$j=4$$$, we should answer $$$1$$$, as $$$[1,2] < [1,3]$$$ in lexicographic order (while $$$[2,1] > [1,3]$$$).
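
To pin down the query, here is a minimal brute-force sketch in Python. It is only a reference for checking faster solutions, not an intended approach; the function name `answers_bruteforce` is made up for illustration. It relies on Python comparing lists lexicographically.

```python
def answers_bruteforce(a, k):
    """Answer every query by direct comparison; O(n^2 * k) time overall."""
    windows = []  # windows I_k, ..., I_{j-1} seen so far, in arrival order
    out = []
    for j in range(k, len(a) + 1):  # j is the 1-indexed right end of the window
        cur = a[j - k:j]            # I_j as a Python list
        # Python compares lists lexicographically, which is the order we need.
        out.append(sum(1 for w in windows if w <= cur))
        windows.append(cur)
    return out


print(answers_bruteforce([1, 2, 1, 3], 2))  # [0, 1, 1]
```

The printed list matches the three answers in the example above.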

Suffix array + LCP array can solve it in $$$O(n \log n)$$$ offline. But how about solving it online? For example, what if $$$a$$$ is an unbounded data stream instead of an array? In the data stream case, you have to process $$$a_j$$$ and answer the query for $$$I_j$$$ before reading $$$a_{j+1}$$$.

strings

Comments (6)

MZuenni

15 months ago

+10

With hashing plus an order statistics tree you can solve it in $$$O(n\log(n)\log(k))$$$. The order statistics tree gives you the position of each $$$I_j$$$ among all windows inserted so far, in $$$O(n\log(n)\cdot\texttt{cmp})$$$ total, where $$$\texttt{cmp}$$$ is the time needed to compare two windows. With hashing you can compare two windows in $$$O(\log(k))$$$ by binary searching for the length of their longest common prefix and then comparing the elements at the first position where they differ.
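
A rough Python sketch of this idea, under a few assumptions of mine: the class name `WindowRanker`, the modulus $$$2^{61}-1$$$ and the random base are illustrative choices, and a plain sorted list stands in for the order statistics tree. The list makes each insertion $$$O(n)$$$, so only the comparison count matches the bound above; a balanced tree with subtree sizes would be needed for the full complexity. Answers are correct unless a hash collision occurs.

```python
import random

MOD = (1 << 61) - 1  # assumed modulus: a large Mersenne prime


class WindowRanker:
    """Online ranking of length-k windows via prefix hashes (illustrative sketch)."""

    def __init__(self, k, seed=None):
        self.k = k
        self.a = []                    # all elements read so far
        self.base = random.Random(seed).randrange(256, MOD - 1)
        self.pref = [0]                # pref[i] = hash of a[0:i]
        self.pw = [1]                  # pw[i] = base**i mod MOD
        self.sorted_starts = []        # start indices of earlier windows, in sorted order

    def _hash(self, lo, hi):
        # Hash of a[lo:hi], derived from two prefix hashes.
        return (self.pref[hi] - self.pref[lo] * self.pw[hi - lo]) % MOD

    def _lcp(self, s, t):
        # Longest common prefix of the windows starting at s and t (binary search on hashes).
        lo, hi = 0, self.k
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if self._hash(s, s + mid) == self._hash(t, t + mid):
                lo = mid
            else:
                hi = mid - 1
        return lo

    def _less_or_equal(self, s, t):
        common = self._lcp(s, t)
        return common == self.k or self.a[s + common] < self.a[t + common]

    def push(self, x):
        """Read x; return the answer for the new window, or None if fewer than k elements."""
        self.a.append(x)
        self.pref.append((self.pref[-1] * self.base + x + 1) % MOD)
        self.pw.append(self.pw[-1] * self.base % MOD)
        if len(self.a) < self.k:
            return None
        t = len(self.a) - self.k       # 0-indexed start of the new window
        lo, hi = 0, len(self.sorted_starts)
        while lo < hi:                 # count earlier windows that are <= the new one
            mid = (lo + hi) // 2
            if self._less_or_equal(self.sorted_starts[mid], t):
                lo = mid + 1
            else:
                hi = mid
        self.sorted_starts.insert(lo, t)  # O(n) here; a tree would make this O(log n)
        return lo
```

Feeding `1, 2, 1, 3` into `WindowRanker(2)` one element at a time returns `None, 0, 1, 1`, which agrees with the example in the post.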

div4only

15 months ago

But here is a problem: The hash value of a window might be exponentially large, and you may need $$$O(k)$$$ time to maintain such a large number.

Svyat

15 months ago

Hash functions are many-to-one (not injective). There's no point in making one injective, for performance reasons. The downside is that collisions are possible. That's why you should minimize the chance of a collision (random modulus or base).
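
As a small illustration of why the hash never gets large, here is a sketch of a polynomial rolling hash kept modulo a fixed prime and updated in $$$O(1)$$$ per element. The function name `rolling_window_hashes` and the modulus $$$2^{61}-1$$$ are my own illustrative choices; the base is meant to be drawn at random.

```python
from collections import deque

MOD = (1 << 61) - 1  # assumed modulus: a large Mersenne prime


def rolling_window_hashes(stream, k, base):
    """Yield the hash of each length-k window; every value stays below MOD."""
    window = deque()
    top = pow(base, k - 1, MOD)  # weight of the oldest element in a full window
    h = 0
    for x in stream:
        if len(window) == k:
            # Drop the contribution of the element that leaves the window.
            h = (h - window.popleft() * top) % MOD
        window.append(x)
        h = (h * base + x) % MOD  # shift the remaining elements and add the new one
        if len(window) == k:
            yield h
```

Equal windows always get equal hashes, and different windows get different hashes unless a collision occurs, which a random base makes unlikely.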

div4only

15 months ago

But how can we compare windows once their hashes have been reduced modulo some number?

Svyat

15 months ago

link1 (cf)
link2 (cp-algorithms)

div4only

15 months ago

Thanks!
