Median of the subarray.

→ Pay attention

Before contest
Pinely Round 4 (Div. 1 + Div. 2)
37:59:27
Register now »

*has extra registration

→ Streams

Atcoder ABC #364 Solution Discussion

By aryanc403

Before stream 13:09:26

View all →

→ Top rated

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	awoo	161
1	maomao90	161
3	adamant	156
4	maroonrk	153
5	-is-this-fft-	148
5	atcoder_official	148
5	SecondThread	148
8	Petr	147
9	nor	144
10	TheScrasse	142

View all →

→ Find user

→ Recent actions

Detailed →

v0rtex's blog

Median of the subarray.

By v0rtex, history, 3 years ago, In English

If you are given an unsorted array of size N and some queries Q which consists of l,r as the range of the subarray. The task is to find the median of this subarray. 1<=N,Q<=5*10^4. Now if we sort the subarray per query the time complexity will be Q*nlogn (here n is the length of the subarray) which will give us TLE. Can anybody share the optimal approach?

range query, median, sorting, time limit exceeded

v0rtex
3 years ago
16

Comments (16)

Write comment?

Samboor

3 years ago, # |

+11

Let's build merge-sort tree on our array. We can use binary search to check if the given value k is less than or equal to the median of subarray. Let's decompose interval into segments on our merge-sort tree. We can sum up the number of elements that are less than or equal to k in each of the segments with lower_bound or sth. similar. Based on this sum we can make transitions in binary search (if the sum is smaller than (r-l+1)/2 then the median is bigger etc). We can answer each query in O(log^3 N) since we use lower_bound on O(log N) segments in each iteration of binary search. I hope my explanation is clear.

→ Reply

v0rtex

3 years ago, # ^ |

I know the concept of segment tree but I didn't know about merge-sort tree so I was sure that something special is needed for this problem, something like segment tree. Now that you have cleared my doubt, I can solve this. Thanks!

→ Reply

Samboor

3 years ago, # ^ |

Merge-sort tree is basically a segment tree. In its segment responsible for interval [l, r] it keeps sorted subarray [l, r].

→ Reply

SecondThread

3 years ago, # ^ |

FYI, another name for this structure is a wavelet tree, in case you look it up online.

→ Reply

SuperJ6

3 years ago, # ^ |

← Rev. 2 →

+10

Wavelet tree is not the same. It is has same functionality as described but is an optimized version of merge sort tree via fractional cascading.

→ Reply

Amtek

3 years ago, # ^ |

Wavelet trees are not mergesort trees (even when you use fractional cascading). Wavelet trees are actually sort of the opposite of mergesort trees, because instead of keeping values a_i such that l<=i<=r in its nodes, we keep values a_i such that l<=a_i<=r. Wavelet trees are also usually more versatile, due to their ability to perform more complex updates on the sequence than classic mergesort trees allow.

→ Reply

LanceTheDragonTrainer

3 years ago, # |

+15

This problem can easily be reduced to SPOJ's MKTHNUM task. There are a variety of solutions to that including a $$$\mathcal{O}(QlogN)$$$ one using persistent segment tree.

→ Reply

LetterC67

3 years ago, # |

Mo's algorithm should also work here as $$$N, Q$$$ is not that big. Using a built-in order statistic tree or Segment Tree to quickly find median give us an $$$O((N + Q) * \sqrt{N} * log(N))$$$ solution!

→ Reply

LanceTheDragonTrainer

3 years ago, # ^ |

← Rev. 2 →

You don't need the order statistic tree or any another hand-written trees actually.

This problem is special because it asks only for the median of the range.

Hence, in this case, we can use the built-in balance binary search tree (set in C++ and TreeSet in Java). We will need two of them in this case. Then, we just need to follow the two-heap algorithm for solving the famous problem "find median of a running stream of integers" (you can easily google for this). Of course, we need to handle delete operations for our task. That's why we used a balanced binary search tree instead of a heap.

→ Reply

LetterC67

3 years ago, # ^ |

Oh I never seen that before, really nice technique!

→ Reply

WildHamburger

3 years ago, # |

← Rev. 3 →

+60

There 3(maybe more) methods of solving this problem.

1) Persistent Segment Tree

Finding the k-th smallest number in a range

In this task you just need find $$$n/2$$$-th smallest number in a range.

Complexity : $$$O($$$ $$$N$$$$$$log(N)$$$ $$$+$$$ $$$Q$$$ $$$*$$$ $$$log$$$$$$(N)$$$$$$)$$$

2) Merge Sort Tree.

Saving the entire subarrays in each vertex:

(Described in the other comment)

Complexity : $$$O($$$ $$$N$$$$$$log(N)$$$ $$$+$$$ $$$Q$$$ $$$*$$$ $$$log^3$$$$$$(N)$$$$$$)$$$

3) Mo's algorithm

Mo's algorithm

If you want to solve it with MO's algorithm you need to use some data structure that can :

Add element
Delete element
Find the $$$n/2$$$-th element of the multiset.

You can use segment tree to do this.

Complexity : $$$O($$$ $$$Q$$$$$$log(Q)$$$ $$$+$$$ $$$(N + Q)$$$ $$$*$$$ $$$log$$$$$$(N)$$$ $$$*$$$ $$$sqrt(N)$$$ $$$)$$$

Sorry for my bad english.

→ Reply

Iftekhar_Hakim_K

3 years ago, # ^ |

For persistent segment tree, complexity should be $$$O(NlogN+QlogN)$$$.

→ Reply

Into_Your_Arms

3 years ago, # |

← Rev. 2 →

Ok I think we can solve it by maintaining one min and one max heap. Like we do in case of finding median of stream of numbers. Of course in this case we should do it offline by sorting the queries. As we proceed in the queries in sorted order we will encounter some addition or deletion so just keep balance between these two heap. Time complexity would be $$$O(NlogN+QlogQ)$$$.

I think it should work but you are welcome to point it out if you think it doesn't.

→ Reply

LanceTheDragonTrainer

3 years ago, # ^ |

I am not sure what is your sort key. But plainly sorting by left or right query endpoint does not work. It is easy to create test cases to make that strategy get TLE.
Using the built-in heap (for Java and C++) doesn't work. Because you'll need to delete elements from your data structures efficiently. By default, deletion in heap is bad because you have to search the entire heap for that element

→ Reply

Into_Your_Arms

3 years ago, # ^ |

← Rev. 2 →

No no. I said heap just for understanding. Of course we are going to use multisite for this purpose otherwise it will TLE.

But I feel your 1st point is valid.

→ Reply

bihariforces

11 months ago, # |

You can solve this using a Trie in $$$O((N + Q) \log N \log max(A_i))$$$, build a Trie and insert each number from left to right as a padded decimal string, and while inserting, for each Trie Node in path, append the index of the number in node. ie. If values with indices 4, 6, 9 passed through some node while inserting, then node will consist of $$$[4, 6, 9]$$$, if you insert in left to right order, these will be sorted for each node by default.

This allows us to simulate a trie that only consists of elements in a range, for example, if range query is $$$[L, R]$$$ then for each node binary search the count of indices in range $$$[L, R]$$$, this will be number of values with given prefix in given range, Now we can do all query operations on this trie like any usual trie.

For this problem the operation is $$$Kth$$$ smallest element in Trie $$$(K = N / 2)$$$.

This can be performed as shown in this blog https://codeforces.com/blog/entry/104650

All we do is replace node frequency with binary search on indices in range, This solves the problem, and we can also extend this idea to allow updates (delete old and insert new) by keeping ordered set in each Trie Node, which allows $$$O(\log N \log max(A_i))$$$ updation but higher constant factor.

→ Reply