### xuanji's blog

By xuanji, history, 7 weeks ago, Curious if anyone knows how to solve the following problem faster than $O(n^2)$.

Given an array $a$ of length $n$ and an integer $n$, find any two numbers $l$ and $r$ ($l \le r)$ such that:

• Let $a'$ be the subarray $a_l, a_{l+1}, ... a_r$. Then for each $x$ that appears in $a'$, $x$ appears in $a'$ at least $k$ times (i.e. $k$ or more array elements are equal to $x$).
• The value $r-l$ is maximized.

Source of problem: I misread https://codeforces.com/contest/1676/problem/F Comments (7)
 » 7 weeks ago, # | ← Rev. 3 →   [deleted] sorry i misread the blog
 » 7 weeks ago, # | ← Rev. 2 →   Yeah bro, I solved this problem only and then looked at output format after writing entire code:PS: By segment, I mean continuous elements.Observation: If frequency of any element in the array < k, we surely cannot take this. So the array will be divided into segments. You can only take elements from one segment, and this segment is a repeating problem(not a subproblem). Solution: Solution(We'll be iterating from left to right) Create map freq: Frequency of elements which can be considered. Create map cur_freq: Frequency of elements in the segment we are considering Iterate from left to right: If freq[a[i]]+cur_freq[a[i]] >= k, then this element can be taken freq[a[i]]--; // remove the element from considerable and cur_freq[a[i]]++; // add this element to current segment else // this element cannot be taken so check whether the previous segment is required segment or not by iterating over the elements in cur_freq and checking whether all have cur_freq >= k or not and then CLEAR cur_freq as you are going into a new segment. Since every element is added once and removed from the cur_freq array atmost once. So time complexity: O(N*logN).
 » If all array elements occur at least $k$ times, then the answer is $n-1$. Otherwise there is an element that occurs less than $k$ times and we will "split" the array into subarrays that don't contain that element and recursively solve those.Store all indices of all array elements in a map> loc; that we can use binary search on to calculate the frequency of an element $x$ in an interval $[l,r]$. Let's write a function $count(x,l,r)$ that does this. We will write a function $solve(l,r)$ that calculates the maximum answer on the subarray $a[l,r]$. We will use a 2-pointer trick with divide and conquer to achieve a good time complexity. Move the $l$ pointer to the right and the $r$ pointer to the left until we find an element that occurs less that $k$ times, and recurse if necessary. The code will look something like this int solve(int l, int r) { if(l > r) return -1; int L = l, R = r; while(l <= r) { if(count(a[l], L, R) < k) { return max(solve(L, l-1), solve(l+1, R)); } l++; if(count(a[r], L, R) < k) { return max(solve(L, r-1), solve(r+1, R)); } r--; } return R-L; // all elements occur >= k times } if we recurse on index $i$, then the recurrence relation is$T(l,r) = T(l,i-1) + T(i+1, r) + O(min(i-l,r-i)*log(n))$The solution to this recurrence is $O(n*log^2(n))$, which is the overall time complexity.
 » Yes, this problem can be solved in $O(n \log n)$.First, let's assume we have a function that tells us whether a given subarray $l \ldots r$ is suitable and if not, tells us the position of an offending element, i.e. one that appears in this subarray more than 0 but less than $k$ times. Later we will discuss how to implement this efficiently, i.e. in time $O(\log n)$.To solve the problem: Try the entire array. If this does not work: Find a position of a value that appears more than 0 but less than $k$ times. Clearly this position can not be in the solution at all. Cross it off, you're left with one or two subarrays. Recurse on these subarrays, return the longest solution found. The complexity is $O(n \log n)$: every element is crossed off at most once as the subarrays you recurse on are always distinct.Now let's discuss how to check if a subarray is suitable. First, we calculate values $\mathrm{need}[i]$: suppose $a_i$ is the $j$-th occurrence of $a_i$ in the array. Then $a_{\mathrm{need[i]}}$ should be the $j-k+1$-th occurrence of $a_i$ in the array. If $j < k$, then set $\mathrm{need}[i] = -\infty$. In other words, $\mathrm{need}[i]$ is the rightmost possible left endpoint of the subarray if $a_i$ is included and it is the last of its value in the subarray. How to calculate $\mathrm{need}[i]$ is left as an exercise to the reader but it is straightforward.Now we build a persistent segment tree that stores pairs, supports range minimums and point updates. Initialize it with everything set to $\langle \infty, \infty \rangle$.The $i$-th revision of the segment tree is built as follows based on the $(i - 1)$-th revision: Set the $i$-th element of the segment tree to $\langle \mathrm{need}[i], i \rangle$. Let $j$ be the last occurrence of $a_i$ before $i$ (if there is none, skip this step). Set the $j$-th element of the segment tree to $\langle \infty, \infty \rangle$. To check if subarray $l \ldots r$ is suitable, take range minimum on the segment tree on the range $l \ldots r$ and revision $r$. If the first element is $\ge l$, the subarray is suitable. Otherwise, the second element tells you a faulty position.
 » Why is it that every time I see a blog like this, it has existed for 9 hours, unanswered, last comment in 3 hours. Then when I start writing my response suddenly there are already 2 comments explaining the solution.
•  » » Haha, I thought the same! At least the solutions are a little different
 » 7 weeks ago, # | ← Rev. 5 →   Yes you can solve this in $O (n log n)$. Use Sweepline + Min with Range add segment tree. Iterate over the left border $l$, define $weights[i] = 1$ if position $i$ is the first occurrence of $a_i$ on $[l,n]$, $-1$ if position $i$ is the $k$-th occurrence of $a_i$ on $[l,n]$, and 0 in all other cases. The segment tree will store a prefix sum of this weight. So the rightmost value r such that $[l,r]$ being valid is precisely the the rightmost value 0 on this segment tree. This can be found with segment tree descent (note that prefix sum of weight is non-negative, so we can find 0s by a descent.). It can be seen that every time you move $l$ to $l+1$, you just need to do upto 4 range add/subtracts.