Hi, I’d like to introduce a simple trick about segment tree in this blog as I promised in this comment. Sorry for the long delay, as a sophomore in Peking University, I've just finished a tired semester and a painful final exam. And now I finally have enough time to do a simple introduction to this interesting algorithm.

It may be a huge project for me since my English is not good. I think I will finish this blog in several steps and I will try to finish it as soon as possible :)

In China, all of the 15 candidates for the Chinese National Team are asked to write a simple research report about algorithms in informatics Olympiad, and the score will be counted in the final selection. There are many interesting ideas and algorithms in these reports. And I find that some of them are quite new for competitors in CF although they are well known in China from the final standings of some recent contests. For example, In the last contest CF Round #458, the problem G can be easily solved using "power series beats" (Thanks for matthew99 to give this name :)) in *O*(2^{n}*n*^{2}) which was imported in China by Vfleaking in 2015.

This blog is about my report which is written about two years ago. I am satisfied with this work although there is a flaw about time complexity. xyz111 gave me a lot of inspiration, and the name “Segment tree beats” is given by C_SUNSHINE which is from a famous Japanese anime “Angel Beats”.

Here is the link about the Chinese version of this report.

### Part1. What it can do.

This work has two parts:

- Transform interval max/min (for ) operations into interval add/subtract operations in
*O*(1) or - Transform history max/min/sum queries into interval max/min operations in
*O*(1).

Here are some sample problems I used in my report. At first you have a array *A* of length *n* and two auxiliary arrays *B*, *C*. Initially, *B* is equal to *A* and *C* is all zero.

**Task 1**. There are two kinds of operations:

- For all , change
*A*_{i}to*max*(*A*_{i},*x*) - Query for the sum of
*A*_{i}in [*l*,*r*]

It can be solved in

**Task 2**. We can add more operations to Task 1:

- For all , change
*A*_{i}to*max*(*A*_{i},*x*) - For all , change
*A*_{i}to*min*(*A*_{i},*x*) - For all , change
*A*_{i}to*A*_{i}+*x*,*x*can be a negative number - Query for the sum of
*A*_{i}in [*l*,*r*]

It can be solved in and I could not proved the exact time complexity yet, maybe It is still ;)

**Task 3**. And we can query for some other things:

- For all , change
*A*_{i}to*max*(*A*_{i},*x*) - For all , change
*A*_{i}to*min*(*A*_{i},*x*) - For all , change
*A*_{i}to*A*_{i}+*x*,*x*can be a negative number - Query for the sum of
*B*_{i}in [*l*,*r*]

After each operation, for each *i*, if *A*_{i} changed in this operation, add 1 to *B*_{i}.

It’s time complexity is the same as Task 2.

**Task 4**. We can also deal with several arrays synchronously. Assume there are two arrays *A*_{1} and *A*_{2} of length *n*.

- For all , change
*A*_{a, i}to*min*(*A*_{a, i},*x*) - For all , change
*A*_{a, i}to*A*_{a, i}+*x*,*x*can be a negative number - Query for the max
*A*_{1, i}+*A*_{2, i}in [*l*,*r*]

It can be solved in and if there are *k* arrays, the time complexity will be raised to .

**Task 5**. And there are some tasks about historic information.

- For all , change
*A*_{i}to*A*_{i}+*x*,*x*can be a negative number. - Query for the sum of
*B*_{i}in [*l*,*r*]. - Query for the sum of
*C*_{i}in [*l*,*r*]

After each operation, for each *i*, change *B*_{i} to *max*(*B*_{i}, *A*_{i}) and add *A*_{i} to *C*_{i}.

The query for *B*_{i} can be solved in and the query for *C*_{i} can be solved in .

**Task 6**. We can even merged the two parts together.

- For all , change
*A*_{i}to*max*(*A*_{i},*x*) - For all , change
*A*_{i}to*A*_{i}+*x*,*x*can be a negative number. - Query for the sum of
*B*_{i}in [*l*,*r*].

After each operation, for each *i*, change *B*_{i} to *min*(*B*_{i}, *A*_{i}).

It can be solved in .

There are 11 sample tasks in my report and here are 6 of them. All of them are interesting and are hard to solve using the traditional techniques such as lazy tags.

### Part2. The main idea

#### Interval min/max operations

To make the description clearer, I think it’s better to introduce an extended segment tree template.

I think most of the competitors’ templates of the lazy tag is like this ([*l*, *r*] is the node's interval and [*ll*, *rr*] is the operation's interval):

```
void modify(int node, int l, int r, int ll, int rr) {
if (l > rr || r < ll) return;
if (l >= ll && r <= rr) {
puttag(node); return;
}
pushdown(node);
int mid = (l + r) >> 1;
modify(node * 2, l, mid, ll, rr);
modify(node * 2 + 1, mid + 1, r, ll ,rr);
update();
}
```

The main idea is **return whenever we can, put the tag whenever we can**:

- When the operation's interval and the node's interval are no longer intersected, the information inside this subtree must not be affected. So we can return immediately.
- When the node's interval is contained by the operation's interval, all the information inside the subtree will be changed together. So we can put the tag on it and return.

In other words, we can replace the two conditions arbitrarily, i.e., we can extend the template like this:

```
void modify(int node, int l, int r, int ll, int rr) {
if (break_condition()) return;
if (tag_condition()) {
puttag(node); return;
}
pushdown(node);
int mid = (l + r) >> 1;
modify(node * 2, l, mid, ll, rr);
modify(node * 2 + 1, mid + 1, r, ll ,rr);
update();
}
```

What's the use of such a modification? In some advanced data structure tasks, it's impossible for us to put tags in such a weak condition `l >= ll && r <= rr`

. But we can put it when the condition is stronger. We can use this template to deal with this kind of tasks but we need to analyze the time complexity carefully.

**Simple task:** There are three kinds of operations:

- For all , change
*A*_{i}to - For all , change
*A*_{i}to*x* - Query for the sum of
*A*_{i}in [*l*,*r*]

It's a classic problem (it's the simple extension of 438D - The Child and Sequence) and the traditional solution is to use balanced tree such as splay/treap to maintain the continuous segments with the same *A*_{i} and for operation 2, we find out all the segments with *A*_{i} ≥ *x* and change the value of each one.

But if we use segment tree, we can get a much simpler solution: let `break_condition`

be `l > rr || r < ll || max_value[node] < x`

and let `tag_condition`

be `l >= ll && r <= rr && max_value[node] == min_value[node]`

. And we can find that the time complexity of this segment tree is also .

And now, we can easily describe the main idea of "segment tree beats". For each node, we maintain the maximum value `max_value[node]`

and the strict second maximum value `second_value[node]`

.

When we are doing the interval min operation for number *x*. let `break_condition`

be `l > rr || r < ll || max_value[node] <= x`

and let `tag_condition`

be `l >= ll && r <= rr && second_value[node] < x`

. Under such an condition, after put this tag, all of the maximum values inside this subtree will be changed to *x* and they are still the maximum values of this subtree.

To make it easier to merge with other operations, we can maintain the values in this way: For each node, we maintain the maximum value inside this subtree and other values separately (the maximum values are the first kind and others are the second). Then, interval max operation will be changed to "add a number to the first kind values in some intervals". Keep the meanings of each kinds of values and the tags, you will find that the processes of pushdown and update will be much clearer.

That's the main idea of the first part of "segment tree beats". It is very simple, right? And it also has a very nice time complexity. (I will give out the proof of the time complexity in the third part of this article).

And let's see the first two sample tasks in part 1.

**Task 1**. In this task, we can maintain the number of the first kind of values inside each node `t[node]`

, and when we put tag "add *x* to the first kind values in this subtree", the sum will be added by `t[node] * x`

. Then we can easily maintain the information we need.

**Task 2**. In this task, we can maintain the maximum values/ minimum values and others separately, i.e., there are three kinds of numbers now, and we will use three sets of tags for each kind of values. Also, to deal with the queries of the interval's sum, we need to maintain the numbers of the first kind and the second kind of values. Pay attention to some boundary conditions, if there are just two different values in this subtree, there will be no third kind of values and if all the values are the same, the set of the first kind and the second kind will be the same

#### Historic information

In this part, we will focus on three special values which I named "historic information":

- historic maximal value: after each operation, change
*B*_{i}to*max*(*B*_{i},*A*_{i}) - historic minimal value: after each operation, change
*B*_{i}to*min*(*B*_{i},*A*_{i}) - historic sum: after each operation, add
*C*_{i}by*A*_{i}.

You may wonder why we need to consider these values. But in China, there had been several problems about these values before I wrote this report. And maintaining these values is a much harder task than it looks like. Here is a data structure task in Tsinghua University Training 2015:

**Sample Task**: Here is the link of this problem. There are five kinds of operations (*x* is a positive integer.):

- For all , change
*A*_{i}to*A*_{i}+*x*. - For all , change
*A*_{i}to*max*(*A*_{i}-*x*, 0). - For all , change
*A*_{i}to*x*. - Query for
*A*_{i}. - Query for
*B*_{i}.

After each operation, change *B*_{i} to *max*(*B*_{i}, *A*_{i}).

To solve this task, ordinary segment tree is enough. But It's not easy to deal with the relationship between the lazy tags and the historic information. I think it's a good sample problem to show the features of historic information.

**Here is the solution**

We have already gotten the way to deal with the interval min/max operations. And they are good tools to maintain historic information.

Now, let use consider such a kind of problems (Part1. Task 5): interval add/subtract, query for the interval sum of historic information. It's quite hard since we can't just use the traditional lazy tag technique to solve it. But we can use an auxiliary array *D*_{i} with the initial value of all zero to do some transformation:

- Historic maximal value: let
*D*_{i}=*B*_{i}-*A*_{i}. Then if we change*A*_{i}to*A*_{i}+*x*,*D*_{i}will be changed to*max*(*D*_{i}-*x*, 0). - Historic minimal value: let
*D*_{i}=*B*_{i}-*A*_{i}. Then if we change*A*_{i}to*A*_{i}+*x*,*D*_{i}will be changed to*min*(*D*_{i}-*x*, 0). - Historic sum: let
*D*_{i}=*C*_{i}-*tA*_{i}while*t*is the current operation id. Then if we change*A*_{i}to*A*_{i}+*x*,*D*_{i}will be changed to*D*_{i}- (*t*- 1)*x*.

We can use the technique introduced in the previous part about interval min/max operations to maintain the sum of *D*_{i}. And then we could get the sum of *B*_{i} or *C*_{i}.

What's more, since we have already transformed interval min/max operations to interval add/subtract operations, we can also maintain the interval sum of historic information under interval min/max operations (Part1. Task 6). This is a simple expansion and I think it's better to leave it as an exercise.

### Part 3. The time complexity

In this part, I will give the proofs of the time complexity of Part 1 Task 1 and Part 1 Task 2. You can find that the proof of Task 2 is universal: it is equally applicable to Task 3 to Task 5. And Task 6 can be regarded as two layer nested “segment tree beats”. It's time complexity can be proved using the same way as Task 2.

#### Task 1

In this part, we will proof the algorithm showed in Part 2 of this problem is : interval **min** operation, query for the interval sum.

gepardo has given a clear proof to show the time complexity of this problem is in this comment. Here, I'd like to show another proof which is much more complex. But I think it is my closest attempt to proof Task 2 is . And this proof will show the relationship between "segment tree beats" and the solution of this problem given by xyz111 (you can find this solution in this website. The name of this problem is "Gorgeous Sequence" and the solution is written in English).

The main idea is to transform max value into tags. Let the tag of each node in segment tree be the maximum value inside its interval. And if a node has the same tag as its father, we will remove the tag on it. Here is an simple example:

The left picture shows the maximum value of each node in a segment tree of array [1, 2, 4, 3]. And the left picture is the tags.

After the transformation, we can find that the strict second maximum value we maintained is just the maximum tag inside each subtree.

One important property of these tags is each tag is strictly greater than the tags inside its subtree. But why this property still holds after interval min operations? Consider the difference between this algorithm with the original segment tree: We will visit more nodes when `second_value[node] >= x`

. This process ensure that we will put a tag only if all the tags inside this subtree is smaller than the new tag, i.e., we visit these large tags and delete them.

When do the tags change? One possibility is after an interval min operation, we will delete some tags and add some new tags. And another possibility is that if we just change some of the values inside the node, the previous tag may be pushed down after `pushdown()`

(The maximum value of this subtree may be changed.)

For the convenience of description, let us define tag classes:

- All the tags added in the same interval min operation belong to the same tag class.
- After a pushdown, the new tag belongs to the same tag class as the old tag.
- Any two tags which do not satisfy above two conditions belong to different tag classes.

And we can divide the nodes we visit in a single interval min operation into two sets: ordinary nodes and extra nodes. Ordinary nodes are those nodes which an ordinary segment tree also visit in an operation of [*l*, *r*], and extra nodes are those nodes which we visit because the stronger `break_condition`

.

Then, let *w*(*T*) (*T* is a tag class) be the number of nodes which has at least one tag in *T* inside its subtree, and let the potential function Φ(*x*) be the sum of *w*(*T*) for all existing *T*.

Since there are at most *n* tags in the segment tree, so the initial value of Φ(*x*) is . Consider the influences to Φ(*x*):

- After a
`pushdown()`

, the old tag will be pushed down to two children of`node`

, so Φ(*x*) will be increased by 2. - The new tag will occur on only ordinary nodes. So for the new tag class
*T*,*w*(*T*) is . Since the number of ordinary nodes is . - The reason we visit an extra node is that we need to delete some tags inside its subtree. And a clear fact is that if we delete a tag in tag set
*T*inside a subtree, all the tags in*T*inside this subtree will be also deleted. So, after visiting each extra node, Φ(*x*) will be subtract by at least 1.

Since the total increase of Φ(*x*) is and each extra node will subtract 1 from Φ(*x*), we know that the time complexity is .

You may find that this potential function is just the same as the proof given by [usr:gepardo]. But using tags can bring us much more useful properties, and we can come up with some useful potential functions from the tags of which the definitions can be very strange if we want to define them just from the values. You may see it from the proof of the task 2.

#### Task 2

In this task, we will still use the tags defined in previous proof, and the potential function we choose is a little strange: let *d*(*t*) ( *t* is a tag ) be the depth of *t* in the segment tree and Φ(*x*) be the sum of *d*(*t*) for all existing tags. And since *d*(*t*) is and the number of the tags is no more than *n*. So the initial value of Φ(*x*) is .

Then, let's consider the influences to Φ(*x*):

- After an interval min operation, there will be no tags on the extra nodes. So we can just consider
`pushdown`

on ordinary nodes (For the interval add/subtract operations, we visit only ordinary nodes, so the following discussion is still correct) . And after a`pushdown`

, the old tag will be pushed down to the two children of`node`

, and Φ(*x*) will be added by . Since there are ordinary nodes,`pushdown`

will make Φ(*x*) increased by each interval min operation. - The new tag will occur on only ordinary nodes, so it will also make Φ(
*x*) increased by . - Each extra node will make Φ(
*x*) decreased by 1 because there must be a tag deeper than this node which is deleted after this operation.

Since the total increase of Φ(*x*) is and each extra node will subtract 1 from Φ(*x*), we prove that the time complexity is .

What's more, in this proof, we do not use any special properties of interval add/subtract operation. So the proof is still hold for other operations. That's why I call this proof universal.

So far, this article is finished. I've never thought that it will be so popular. Thanks a lot for the encouragement and the support. I hope this article can help :)

The research reports from candidates for the Chinese National Team are publicly available?

Sure, but we have only Chinese version qwq

Can you provide a link of all those reports of the lasts years??

You could find them with 國家集訓隊XXXX論文集(where XXXX is the year)

Also I've found some of them from Voleking 's github

Thank you very much!!!

Thanks for jiry_2 and other Chinese guy for those amazing thoughts, lol

I said why I got so many stars recently_(:з」∠)_

Please, post them :D

The research component of the Chinese OI camp sounds amazing. It would be really good if the reports could be shared/translated for the rest of the world as well.

Definitely agree! I'm so sad Chinese is so hard to translate. Maybe some Chinese volunteers would try to translate them. I'd pay for that... Right now I'm more interested in what are the most useful pieces of work of this kind? If this is just one of them, I can't imagine how much these tricks achieve

Btw if someone agrees to translate them, I'm also willing to pay, as judging by the already translated Chinese "tricks" and data structures these should also be nice and useful.

By the way, if you want to publish it on e-maxx-eng too, it should be easy enough as both blogs at CF and e-maxx-eng use MarkDown and TeX. And I'm ready to help if needed. :)

Set power series (I don’t know how to describe it in English so I just use the phrase given by Baidu translate)Hmmm...I don't know the official name for this term either. What about "power series beats". It just sounds as sassy as "segment tree beats".

Finally ! I've been waiting for this post for such a long time. I used to translate the Chinese version by Google Translate and it was really hard to understand lol.

Btw, here are some problems which (i think) can be applied Segment Tree Beats to:

HDU Gorgeous Sequence (this is pure, basic STB)

Bear and Bad Powers of 42

Nagini

Little Pony and Lord Tirek

Julia the snail

Thanks for great post ! Looking forward to seeing your next post about Complexity.

Actually... 453E - Little Pony and Lord Tirek An ordinary segment tree is enough.

Thanks a lot. I know the first two problems can be solved this way and I will check the remaining three problems once I have chance to use computer.

BTW, I've published some exercises about it on some Oline Judges in China. I'll give out the links later, too.

Here are the links of some exercises:

Well, I think I made some mistakes yesterday. Maybe "segment tree beats" can not help with solving the second task?

And I think the task 1,3,4 are good exercises of this article. Thanks a lot!

How problem G (CF round 458) can be solved in

O(2^{n}*n^{2}), if n = 1000000?I think he meant n = 17 for that problem

What means "history informations" in task 5?

Could be going about pertistency.

I don't think so. Because persistency for this segment tree is the same as for standard segment tree.

Sorry for the confusing description. "history information" means the value inside array B and C since Bi is the minimum value and Ci is the sum of the values on this position among all previous versions.

I'm so sorry for the stupid spelling mistakes in this blog such as "informations". I will try to fix them tonight qwq.

But I don't understand how can we use B and C. Or they'll help in some tasks?

Well, I put this inside my report because there are some data structure tasks about historic information in China, and I find that this kind of tasks can be easily solved in this way.

I have not thought about the applications of these values qwq.

Array B and C contains the historical information,which array B means the historical maximum value and array C means the sum of all the versions

What about code?

I think it'll be a good idea to require all the reports written in English from now on,since i'll never have a chance of writing one.>_<

Auto comment: topic has been updated by jiry_2 (previous revision, new revision, compare).Another tutorial post on the same topic (In Chinese): http://www.shuizilong.com/house/archives/hdu-5306-gorgeous-sequence/

Great article! I didn't manage to solve even task 1.

I felt that this technique is somewhat similar to the solution of 453E - Little Pony and Lord Tirek, but segment tree beats seem more advanced and general.

Can you please upload these task series in some online judge like SPOJ? It will be really helpful.

It seems that task 2 is still . I have a proof for it.

Let's denote the value

C.Cis a sum for all theD[l...r] where [l...r] is a segment in the segment tree.D[l...r] is a number of distinct numbers on a segment [l...r]. It's not hard to see that initiallyCdoesn't exceed . Also,Cwill increase on at most each update query.It's obvious that for each query we can go to more than into segments. Some segments we visit I will call

extra. Anextrasegment is a segment for whichl≥ll&& r ≤rr&& second_value_{node}≥x&& max_value_{node}≥x, i. e. the segment coincides with our query and we are not returning due to break condition or tag condition. Eachextrasegment processing takesO(1) (maybe from thisextrasegment we will go to some otherextrasegments, but we'll consider eachextrasegment separately, without its children, in this proof). HowCchanges if we process anextrasegment? All the values equal tosecond_valueandmax_valueinside this segment will be replaced with the query parameter, so the number of distinct numbers is decreased by at least one, i. e.Cwill be decreased by at least one for eachextraquery. The same is held both for minimum or maximum updates.But, as we know,

Ccan be increased at most by , so we will visit no more thanextravertices. So, we proved the time complexity.Correct me if I am wrong.

I've considered this potential function before. But the main problem is that the potential function will be changed a lot after an interval operation. Here is an example:

Ais an array of length 2nand . Then if I addnto allA_{i}in [1,n].Cwill be increased byO(n).BTW, you have already gotten the way to prove the time complexity of task 1. It's a very clear proof and I even want to just copy your comment in my next update :)

Thanks a lot!

You are welcome :)

By the way, I noticed that I misread the task 2. I didn't notice the adding to a segment. If we have no

`A[i] += x`

operation, just`A[i] min= x`

and`A[i] max= x`

, the proof works.Auto comment: topic has been updated by jiry_2 (previous revision, new revision, compare).I have read the Chinese version before, but I can't truly understand it until I read this post with those codes, thanks a lot.

You're welcome :)

I try another way to write this article. I believe it's much clearer than the word description in the Chinese version.

In the end of the article there are 3 paragraphs, I think that in the second one it must be written "Historic minimal value", not maximal. Aren't I right?

Oh, I forgot to change it after copying from the first line. I've fixed it, thanks a lot.

Auto comment: topic has been updated by jiry_2 (previous revision, new revision, compare).You've failed a little bit again) Part 3, Task 1, end of the second paragraph. I think that [usr:xyz111] isn't exactly what you wanted to write) (Sorry for my remarks)

In part 3, it seems that the proof of task 2 doesn't make use of the property of "depth" and only counts the number of tags. I think we can achieve a better bound if we slightly change the potential function.

Let's define the potential as the number of tags in the segment tree, i.e. set

d(t) to a constant function 1. Then the increases in the potential caused by interval min operation and new tags become instead of (because we don't sum the depth but simply count the occurrence) and the visit of each extra node still decreases the potential by 1 (removal of a tag). As a result, the total increase in the potential is , and the overall time complexity is bounded by (the initial value of the potential is at mostn).Please correct me if I am wrong.

Well, to remove a tag, you may visit at most extra points. That's why I use the depth: each visit will absolutely make Φ(

x) decrease by 1 :)I tried to code this tree for Gorgeous Sequence based on your description and got AC within 22XX ms

I stored in each node the 1st & 2nd maximums, sum and count of numbers = 1st maximum and when 1st maximum of a child is bigger than the parent's I do lazy propagation

My code if anyone is interested: Github

What s the meaning of "tags"? I thpught they were conditions where we dont transverse down the tree anymore. But as i read on i started to to get confused :/.

Can someone please provide neat code for task 2? I have a lot of problems with implementation when there are minimum, maximum and addition.

Ok, I don't understand even the first simple task. Even the break condition: max_value[node] < x. For second operation you also have to change smaller values than x, it's just an assignment. Why shouldn't we bother with A[i] < x?

Besides there are 2 types of update operation but there is only one type of break condition and tag condition described. Also the explanation of the traditional solution is not understandable, because of the same reason I wrote above: "and for operation 2, we find out all the segments with Ai ≥ x and change the value of each one." — this is just an assignment, so why can we omit smaller values...?

Can you or somebody provide code with comments for some tasks brought up in part one please, for clarity. I have read the article, but maybe in this case, some pieces of code would make it clearer.