A brief introduction of Segment Tree

Revision en1, by RDFZzzx, 2022-08-08 08:17:54

# What can Segment Tree do?

Segment Tree is a useful data structure that can solve problems like RMQ in $O(n)$ build tree and $O(\log(n))$ for each query or update.

What can it query?

• sum

• maxinum / minium

• gcd / lcm

• other thing that can be push up

# How to use Segment Tree?

Lets use segment Tree to query the sum of an array.

## Build Tree

Lets look at the example, the tree is just like many segments!

It's easy to find that the number on the bottom is the sum of the two numbers on it.

The $l$ and $r$ note the left endpoint and the right endpoint of this segment.

The next step is the give each segment an idx to mark their position. The purple letters on top of the segments are the idxs.

It's not beautiful to put big segments in the bottom, so we can reverse it like this:

We always use $ls$ to represent the left son of a segment and use $rs$ to represent its right son. We can find that $ls = pos \times 2$ and $rs = ls + 1$ ($pos$ is the id of the father segment). So we have:

#define ls (o << 1)
#define rs (ls | 1)


Note that there are some segments which have many idxs. This operation can help us to find the $ls$ and $rs$ quickly, but it will waste some memery. Scientists calculation shows that the total memery we use will not exceed four times the length of the interval.

code of build tree:

void build(int o, int l, int r)
{
if (l == r)
{
sum[o] = a[l];
return;
}
int mid = (l + r) >> 1;
//get the middle and cut the segment into two
build(ls, l, mid);
build(rs, mid + 1, r);
//build the left son and the right son
sum[o] = sum[ls] + sum[rs];
}


## Query without update

Let look at the example. If I want to query the sum from $a_3$ to $a_9$, we don't have to add them up, we can add sum[l = 3, r = 4], sum[l = 5, r = 8] and sum[l = 9, r = 9]. The time complexity is only $O(\log(n))$.

Prove: we will add only a number in a layer, and there are at most $\log(n)$ layers, so the number of the operation will not exceed $O(\log(n))$.

ll query(int o, int l, int r)
{
if (ql <= l && r <= qr) return sum[o];
int mid = (l + r) >> 1;
ll s = 0;
push_down(o, l, r);
//push_down is used to update, I will explain it next.
if (ql <= mid) s += query(ls, l, mid);
if (mid < qr) s += query(rs, mid + 1, r);
return s;
}


In the code, I use $ql$ to represent the left endpoint of the query, and use $qr$ to represent the right endpoint.

if (ql <= l && r <= qr) return sum[o]; This means if our query contains the segment, add it up.

if (ql <= mid) s += query(ls, l, mid);
if (mid < qr) s += query(rs, mid + 1, r);


If a part of the query is in the left son, visit it. If a part of the query is in the right son, visit it.

## Update and "Lazy Tag"

### Lazy Tag

"Lazy Tag" is the most important part in updation. This method can reduce the time complexity to $O(\log(n))$ for each updation. It's obvious that we can't update the whole tree, or the time complexity will be $O(n)$ like build tree. To reduce the time complexity, if a segment is not queried now, we don't have to update it at present. We usually give it a tag to record how we update it. For example, if we want to update an array and query the sum of an interval, for each "add" operation, we do not add numbers to each segment, we only have to give a "add tag" to the some segments.

For example, if we want to add $2$ from $2$ to $5$. We should give segment number $9$, $17$ and $20$ an "add tag" to sign them.

Note that segment we signed are all a part of the interval we operate on.

This picture shows what the tree is like after the operation.

void change(int o, int l, int r)
{
if (ql <= l && r <= qr)
{
return;
}
push_down(o, l, r);
int mid = (l + r) >> 1;
if (ql <= mid) change(ls, l, mid);
if (mid < qr) change(rs, mid + 1, r);
// just like what we do in query
sum[o] = sum[ls] + sum[rs];
// push up the sum of the segment
}

if (ql <= l && r <= qr)
{
return;
}


This means if the segment is contain in the interval, we don't have to visit its sons because when we query we can push up this segment directly without ask the sons. This is the reson why we can operate on it and return.

void apply_add_tag(int o, int l, int r, ll v)
{
sum[o] += (ll)(r - l + 1) * v;
}


sum[o] += (ll)(r - l + 1) * v; This means the real value the segment should add = the value each node add * the length of the segment.

### Push Down

When we query the segment which has a tag. We should update it. (of course we don't have to use extra time, we can update this when we query / update a new interval.

void push_down(int o, int l, int r)
{
{
int mid = (l + r) >> 1;
}
}


add[o] = 0; Node that we have to clear the tag on the father segment after we push the tag down.

• Someone may ask " when we clear the tag, how can we know know the real value of the segment?"

• Don't forget that there is a push up operation when we update sum[o] = sum[ls] + sum[rs]; and there is a push down operation when we query.