[Tutorial] Cartesian tree and some related problems

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

#	User	Contrib.
1	awoo	161
2	maomao90	160
3	adamant	156
4	maroonrk	153
5	atcoder_official	148
5	-is-this-fft-	148
5	SecondThread	148
8	Petr	147
9	nor	144
10	TheScrasse	142

The Cartesian Tree is a useful concept for solving some problems which revolve mainly about thinking of the range of maximality of elements. It also provides a sort of "persistance" to the algorithm of monotonic stack, if you will. In this blog, we will learn what a Cartesian Tree is, how to construct one and solve some related problems to strengthen our understanding. Let's begin!

What is a Cartesian Tree?

Let's consider an array $$$a$$$ where all values are distinct. The Cartesian Tree for the array $$$a$$$ is a binary labeled tree build recursively from the function $$$get(l,r)$$$ which constructs our desired tree from the subarray $$$l...r$$$.

Now let's define $$$get(l,r)$$$

First root the tree at index $$$i$$$ such that $$$a_i = min(a_l,a_{l+1},...,a_r)$$$. Note that because all values are distinct, $$$i$$$ will always be unique.
The aforementioned index will become the root of the Cartesian Tree. It's left and right subtrees will be recursively generated by $$$get(l,i-1)$$$ and $$$get(i+1,r)$$$ respectively. If $$$i=r$$$ we will ignore the right subtree, if $$$i=l$$$ we will ignore the left subtree.

As an example, consider $$$a=[9,3,7,1,8,12,10,20,15,18,5]$$$. After calling $$$get(1,11)$$$, we get the following tree:

picture from here

By construction, the Cartesian Tree for an array $$$a$$$ where all values are distinct is always unique. Usually, when there can be repeating values, we will construct the Cartesian Tree in the same manner as before, but we will build it from the pairs {$$$a_i, i$$$}.

The above process can be easily simulated for building a Cartesian Tree in $$$O(nlogn)$$$ using a range minimum query.

Properties of the Cartesian Tree

Heap: It has a min/max heap structure

Corollary: Let $$$S_i$$$ be a set where $$$x \in S_i$$$ if and only if there exists $$$1 \le l \le i \le r \le n$$$ such that $$$min(l...r)=x$$$. If we traverse the path from node $$$i$$$ to the root, we will get all values in $$$S$$$ in decreasing order.

Inorder Completeness: Each subtree represents a range in the original array

Equivalence to RMQ: Consider $$$2$$$ nodes $$$(i,j)$$$. Let $$$x$$$ be their lca in the Cartesian Tree. We can show that $$$x = min(i...j)$$$.

Why?

Note that by using the last claim we can easily support range minimum query operations on the Cartesian Tree.

Also note that the second claim allows us to build a cartesian tree in $$$O(n)$$$, we can find the parent of each index using monotonic stacks.

Yet Another Array Counting Problem

Let's start with a problem authored by my friend Gheal.

This problem asks us, given an array $$$a$$$, to count the number of arrays $$$b$$$ such that for any subarray, the position of the maximum value is the same in both arrays, given that if there are multiple maximums, we take leftmost. We can easily note that this is equivalent to counting arrays $$$b$$$ for which their Cartesian Tree has the same structure as the Cartesian Tree for $$$a$$$.

And here goes the solution, let $$$dp_{i,j}$$$ hold the number of arrays $$$b$$$ that yield the same Cartesian Tree as the subtree of node $$$i$$$ and the attributed value for the node $$$i$$$ is at most $$$j$$$. We can note that our answer is $$$dp_{root,m}$$$. To calculate we either chose the value of our current node to be $$$j$$$ or rely on $$$dp_{i,j-1}$$$. Thus we can get quick tranisitons:

$$$dp_{i,j} = dp_{l,j-1} \cdot dp_{r,j} + dp_{i,j-1}$$$

An implementation of this idea can be viewed here.

Comfortably Numb

Let's build the max Cartesian Tree of the input array. Let's see what the task changes to when switching the array to its Cartesian Tree. It turns out we need, for each root to find $$$max(a_{node} \oplus x \oplus y)$$$ where $$$x$$$ is the prefix xor of some node from the left subtree and $$$y$$$ is the prefix xor of some node in the right subtree ( including the current root ). We can precompute prefix xors and iterate over the smaller subtree and the task becomes finding maximum xor pair which can be done with a trie. Since we merged smallest into largest the complexity is $$$O(n \cdot logn \cdot logM)$$$ where $$$M$$$ is the maximum value in the input.

An implementation of this idea can be viewed here

A harder problem, Vaporeon

Since the statement is only available in romanian, I will translate it for you.

Consider an array with $$$a$$$ with only distinct values ( the problem allows repeted elements, but figuring all the details about making the following solution work on repeating values is left as an exercise to the reader ).

We call $$$f(x)$$$ the depth in the Cartesian Tree of index $$$x$$$. We need to tolerate the following $$$2$$$ operations:

Update: $$$swap(a_i,a_{i+1})$$$
Query: range sum on $$$f$$$

Let's reformulate the problem a little. For each index $$$a_i$$$ we will find the largest range where $$$a_i$$$ is maximum. The depth of the Cartesian Tree of some index is now just the number of intervals that pass through this index. Let's see how we can update those intervals when tolerating a swap update.

We will keep $$$2$$$ arrays of treaps $$$l$$$ and $$$r$$$ where the treap at index $$$l_i$$$ holds all the segments that end at $$$i$$$ ( as left endpoint ) and $$$r_i$$$ keeps all segments ending at $$$i$$$ ( as right endpoint ). WLOG assume we are doing $$$swap(a_i, a_{i+1})$$$ and $$$a_i > a_{i+1}$$$. Now we will analyze how $$$l_{i+1},l_{i+2},r_i,r_{i-1}$$$ change after an update:

$$$l_{i+1} = \varnothing$$$
$$$l_{i+2} = l_{i+1} \cup l_{i+2}$$$
$$$r_{i-1} = r_{i-1} \cap [1,2,3,...,a_{i+1}]$$$
$$$r_i = r_{i-1} \setminus [1,2,3,...,a_{i+1}]$$$

With a bit of extra analyzing we can note that all these operations are doable because $$$l$$$ and $$$r$$$ are kept as treaps and we always merge smaller values with larger values. After performing all these operations, all we are left to do is update the range of $$$a_{i+1}$$$ which we can do via binary searching on a segment tree in $$$O(logn)$$$ and inserting the new range in the corresponding treaps.

Since every update we do a split and a merge on the aforementioned treaps and update a single range, the complexity will be $$$O(logn)$$$. And at the same time there are $$$O(1)$$$ modifications in $$$f$$$ so we can update our range sum data structure fast.

An implementation of this idea can be viewed here.

Constellation 3 ( JOISC 2020 )

We are given a binary matrix. It is known that each column of the binary matrix is white for $$$a_i$$$ cells and then it's just black. We are also given some points with coefficients and are asked to activate a subset of points with maximal cost such that no $$$2$$$ points can see eachother. We consider $$$2$$$ points being able to spot each other iff there exists a black reactangle containing both.

Let's say we took a star on the same Y-coordinate as "smallest" black column ("tallest" building). We can notice that this would imply a restriction of the type "from now on we are only allowed to take stars that are at most $$$a_i$$$ units tall". This is because we will always be able to see that star otherwise.

It turns out this observation is enough for a polynomial solution based on tree DP. This way, we build the min Cartesian Tree and let $$$dp_{i,j}$$$ be the maximum cost of a subset consisting only of nodes from the subtree of $$$i$$$ and my restriction forces me to only take stars at most $$$j$$$ units tall. Transitions are very easy, we can just decide weather we take a star on current column or not. As a detail, if we don't take any star, at least one of the children needs to go at most $$$a_i$$$ units tall restriction, because we would see the 2 stars because they would both pass through $$$i$$$. For a better understanding of this solution, I recommend reading my code.

To achieve a faster solution we must exploit the Cartesian Tree and reduce the problem to something else. The magic reduction uses the corollary of the heap property. Thus we can find that each star's visibility can be expressed as a path from its columns node to the root in the Cartesian Tree. What's more exciting is using the fourth propriety, one can note that $$$2$$$ stars intersect iff their paths intersect, because they can both see the minimum in their range, meaning they can see everything else.

Thus our problem is reduced to chosing some disjoint chains in a tree such their coefficient sum is maximum. Fun fact, this new problem appeared on infoarena before and can be viewed here.

To solve it we just keep in $$$dp_i$$$ the maximum coefficient sum such that all used chains are fully included in the subtree of $$$i$$$. To calculate it we consider each chain that ends in node $$$i$$$ and try to include it in our solution. We need to add all $$$dp$$$ values that are directly attatched to the used chain, which can be done using a bit or any data structure that can support sum on chains with point updates.

For finding the chains we can just use binary lifting, thus our final time complexity is $$$O(nlogn)$$$.

An implementation of this idea can be viewed here.

And last but not least, thanks to Gheal, valeriu, SlavicG for reviewing this blog.

Comments (12)

Write comment?

defnotmee

17 months ago, # |

← Rev. 2 →

+65

There is also an $$$O(n)$$$ construction process for cartesian tree. Suppose you have the cartesian tree fully constructed for all indices $$$j < i$$$ and you're trying to modify it to insert the $$$i^{th}$$$ indice into it.

If $$$a_i$$$ is minimum between all $$$a_j$$$, it is the new root of the tree.

Otherwise the parent of $$$i$$$ will be the rightmost $$$j < i$$$ such that $$$a_j < a_i$$$ and the left child of $$$i$$$ will be the previous right child of that $$$j$$$ (if it had one).

You can check that $$$j$$$ in amortized $$$O(1)$$$ for each $$$i$$$ with monotonic stack ideas.

Impl: https://judge.yosupo.jp/submission/125694

Edit: evil tibinyte edited it into the blog and now my comment makes no sense :rage:

→ Reply

Dominater069

Another problem on this topic

https://oj.uz/problem/view/CEOI20_fancyfence

Geanina_the_Great

treapynite pog

Everule

+41

If infoarena is worth it's salt, prove it to me that you can implement this with cartesian trees. $$$n \le 10^6$$$. Additionally $$$n\log{A}$$$ integers don't fit in memory. Good luck.

For those that are not tibinyte, here is reference solution for what it looks like, and here is explanation for the code. Make sure to read the time and memory limit.

awoo

17 months ago, # ^ |

← Rev. 8 →

Is it that hard? During the contest, I would probably take time to avoid cartesian trees, but I think it's trivial to get a solution with them, and your solution looks quite complicated. ~~Idk maybe the constant is too large, can't bother to implement.~~

Solution

UPD: ACed it freely 193600331