### paladin8's blog

By paladin8, 9 years ago, This blog post is motivated by an interesting problem I solved today: Timus 1846. I knew about segment trees previously for solving range-minimum query, but was not fully aware of the obvious generalizations of this nice data structure. Here I will describe the basics of a segment tree and some applications.

### Problem

The problem that a segment tree can solve is the following. We are given an array of values a, a, ..., a[N - 1]. Assume without loss of generality that N = 2n; we can generally pad the computations accordingly. Also, consider some associative binary function f. Examples of f include sum, min, max, or gcd (as in the Timus problem). Segment trees allow for each of the following two operations on O(logN) time:
• compute f(a[i], a[i + 1], ..., a[j]) for i ≤ j; and
• update a[x] = v.

### Description

So how does a segment tree work? It's quite simple, in fact. We build a single two-dimensional array t[0...N - 1][0..n] where t[x][y] = f(a[x], a[x + 1], ..., a[x + 2y - 1]) where x is divisible by 2y (i.e. most of the entries are ignored, but it is easiest to represent this way). We can initialize the entries by the following procedure:
• t[x] = a[x]; and
• t[x][y] = f(t[x][y - 1], t[x + 2y - 1][y - 1]) for y = 1, ..., n,
where the second step uses the associative property of f. The logic is that t[x][y - 1] computes f over the range [x, x + 2y - 1) and t[x + 2y - 1][y - 1] computes f over the range [x + 2y - 1, x + 2y) so when we combine them we get the corresponding computation of f for t[x][y]. Now we can describe how each of the two operations is implemented (at a high level).

We'll start with computing f(a[i], a[i + 1], ..., a[j]). It's pretty easy to see that this amounts to representing the interval [i, j] as disjoint intervals [i1, j1];[i2, j2];...;[ik, jk] where each of these is of the form [x, x + 2y - 1] where x is divisible by 2y (i.e. it shows up in the table t somewhere). Then we can just combine them with f (again using the associative property) to get the result. It is easy to show that k = O(logN) as well.

Now for updating a[x] = v, notice that there are only a few terms in the segment tree which get affected by this change. Obviously, t[x] = v. Also, for y = 1, ..., n, only the entry t[x - (x&(2y - 1))][y] will update. This is because x - (x&(2y - 1)) is the only value divisible by 2y (notice the last y bits are zero) that also covers x at level y. Then it suffices to use the same formula as in the initialization to recompute this entry in the tree.

### Code

Here's a somewhat crude implementation of a segment tree. Note that the value of IDENTITY should be such that f(IDENTITY, x) = x, e.g. 0 for sum,  + ∞ for min,  - ∞ for max, and 0 for gcd.

void init() {
for (int x = 0; x < N; x++)
t[x] = a[x];
for (int y = 1; y <= n; y++)
for (int x = 0; x < N; x+=(1<<y))
t[x][y] = f(t[x][y-1], t[x+(1<<(y-1))][y-1]);
}

void set(int x, int v) {
t[x] = a[x] = v;
for (int y = 1; y <= n; y++) {
int xx = x-(x&((1<<y)-1));
t[xx][y] = f(t[xx][y-1], t[xx+(1<<(y-1))][y-1]);
}
}

int get(int i, int j) {
int res = IDENTITY, h = 0; j++;
while (i+(1<<h) <= j) {
while ((i&((1<<(h+1))-1)) == 0 && i+(1<<(h+1)) <= j) h++;
res = f(res, t[i][h]);
i += (1<<h);
}
while (i < j) {
while (i+(1<<h) > j) h--;
res = f(res, t[i][h]);
i += (1<<h);
}
return res;
}

### Application

One possible application is letting f just be the sum of all arguments. This leads to a data structure that allows you to compute sums of ranges and do updates, but is slightly less efficient than a Binary Indexed Tree. The most well-known application seems to be the range-minimum query problem, where f is the minimum of all arguments. This also allows you to compute the least-common ancestor of nodes in a tree.

Lastly, in Timus 1846, we can use it with f as the GCD function. Every time we remove a number we should update the corresponding entry to zero (effectively removing it from the GCD computation). Then the answer at each step is f(a, a, ..., a[k]) which can be computed directly from the segment tree.  Comments (37)
 Actually, it's known that you need only O(N) memory to store Segment Tree with N leaves. As you see, your array has many elements, that are not beeing updated or used at all. You can store segment tree in one-dimensional array.Root is being number 1. Two children of node v are 2v and 2v + 1.So the parent of node v is v / 2. One can easily prove that it contains only O(N) nodes.
•  9 years ago, # ^ | ← Rev. 2 →   Yes, you are right. But I think it is easier to think about it as a 2-dimensional array and only use the 1-dimensional compression when you have to.
•  3 years ago, # ^ | ← Rev. 2 →   "But I think it is easier to think about it as a 2-dimensional array and only use the 1-dimensional compression when you have to."If you are doing zkw segment tree anyways might was well do it in O(n) memory, as opposed to this (arguably more intuitive) O(n^2).Building an O(n^2) segment tree is very expensive, your implementation took O(n^2) just to build the tree. Sparse table can be built in the same speed and compute rmq in O(1), as opposed to O(logn).The only advantage this 2D segment tree still has over the sparse table is O(logn) updates, but many problems where segment tree is useful probably won't allow huge O(n^2) preprocessing anyways.1D Array Segment Tree: http://codeforces.com/blog/entry/18051Sparse Table for RMQ: https://www.geeksforgeeks.org/range-minimum-query-for-static-array/Clarification: zkw segment tree is iterative segtree
•  6 weeks ago, # ^ | ← Rev. 2 →   I think you are confusing N*n with N*N. This implementation takes O(N*n) where n = O(logN).
•  you actually need 2*2^(log(N)+1) for memory, don't you? It contains O(N) leaves, not nodes. But our colors suggest i'm wrong and you're right, can you share your prof?
•  2 * 2 ^ (log(N) + 1) = 4 * N = O(N)
•  O(N) is a set. And 4 * N isn't. So you can't say that they are equal. Instead you can say 4 * N is a member of O(N).
•  I don't really say they are equal. Please read this.
 Look at this cool implementation of Range Sum Query (RSQ). This implementation needs 2 * N memory, there is some implementations with only N memory, but as for me, this implementation is very simple, and it can be easily extended with some non trivial actions, (for example, we can build an array with all indexes we need to update, and run for them a[i] = F(a[i * 2], a[i * 2 + 1]) in order)
•  9 years ago, # ^ | ← Rev. 3 →   BTW, it can be easily extended to 2D or 3D arrays, but in this case it takes (2 * n)^k memory, where k is 1, 2, 3 for 1d, 2d, 3d, ...upd
•  Right, and the operation cost becomes O((logN)k).
•  @Alias I would like to know how to update and query in 2d segment tree??
•  there is two ways. look at implementation.
•  Do you know how to implement Lazy Propagation on 2D Segment Tree? I'm just being curious. Ideas or codes would be really useful!
•  Who can help me with practice problems on 2d-, persistent and compressed segment tree? I appreciate links to OJ problems.
 Thanks,I will take this code as a supplement in regionals. : )Must be okay with you. :)
 Can you please explain the  method for updating values over a range of indexes. I looked up in the internet and saw a solution called lazy propagation, but could not understand clearly. It would be great if you can explain a little. Thanks a lot ! .
•  Lets start with my explanation, hope that somebody will give better explanation.Lazy means "Update value only when you need it" .That is you keep a flag at the "parent node" , and update the children only when you encounter them while querying.
 » great, thanks ;-)
 » Another tutorial about it http://prodeportiva.wordpress.com/2013/02/08/segment-trees/
 » Could you please give me an idea of how to implement a 2d-segment tree?. For example, to solve the problem 11297 — Census. Thanks in advance.
•  » » 8 years ago, # ^ | ← Rev. 3 →   The idea behind 2D segment tree is that you first build segment tree for every row, then you build segment tree for every column. Its easier to operate with NxM array if N and M are degrees of 2, if not, you can fill the array with 0s, but if memory is more important, you can do without it. For example, you have the following 4x4 array:1 2 3 4 4 3 2 15 6 7 88 7 6 5Build a segment tree for every row. It will look like this:10 3 7 1 2 3 410 7 3 4 3 2 126 11 15 5 6 7 826 15 11 8 7 6 5(the first element of the row is the sum of elements (1...n), the second is the sum of (1...n/2), the third is the sum of (n/2+1...n) ...Then build a segment tree for every column of the above matrix in the same way:72 36 36 18 18 18 1820 10 10 5 5 5 552 26 26 13 13 13 1310 3 7 1 2 3 410 7 3 4 3 2 126 11 15 5 6 7 826 15 11 8 7 6 5The tree contains 2N-1 rows and 2M-1 columns, and our initial matrix starts with element (N,M)If you want to update element (x,y), you update it, then all its parents in its column ((x,y/2),(x,y/4),...(x,1)), then divide x by 2, and do the same thing while x>0.For finding the sum of the rectangle with opposite vertices (R1,C1) (R2,C2) (where R1
•  » » » Very nice explanation, thank you.
•  » » » 8 years ago, # ^ | ← Rev. 2 →   What if I want to update a rectangle, say (R1,C1) (R2,C2)? Is the complexity still log(N)*log(M)? Could we now use the Lazy Propagation technique here, like the one in the 1D Segment Tree?
•  » » here is a tutorial about 2D segment tree. LinkYou might need google translate if you are not Russian. :)
 » Do you know about other problems where segment trees should be used with some other associative binary functions?
•  » »
•  » » » Thanks for your reply. I already solved the first one long time ago, and as I remember it was not about using some associative operation, but storing the progression terms in the nodes with lazy propagation. As for the second one, the link points to nowhere.
•  » » » » this one is more interesting: http://coj.uci.cu/24h/problem.xhtml?abb=2511this one its only a sum: http://coj.uci.cu/24h/problem.xhtml?abb=1850this is only a sum but it's in 2D: http://coj.uci.cu/24h/problem.xhtml?abb=1904and for the Interval Product problem, you dont need lazy propagation, but you need it for this one: http://www.spoj.com/problems/HORRIBLE/
•  » » » » » Links to coj.uci.cu problems actually uses pid instead of abb.
•  » » » » » » I wrote that long time ago, today the links are broken.
 » 7 years ago, # | ← Rev. 18 →   We can generalize that a bit more: Let (M, f) be a semigroup (that is, a set M together with an associative binary operator ).Let (T, c) be a monoid with identity element tid and a family of transformation functions (range updates), that take as their arguments a parameter and a sequence and output a sequence of the same length. gk should respect the monoid structure of t, i.e. for , gk(c(t1, t2), ·) should be equivalent to the composition and gk(tid, ·) should be the identity function. Intuitively, this allows us to compress a series of updates into one single update.Let be the set of integers between 1 and n.A segment tree is able to maintain a dynamic sequence . It allows the following operations: range query: Given two integers , return f(al, ..., ar) (the notation is just for convenience, the function f is in fact applied r - l times). range update: Given two integers and a parameter , replace the subsequence  < al, ..., ar >  by g(t,  < al, ..., ar > ). It is able to do so in time per operation if the following conditions are met: f and c can be computed in . There is a function , that, given an update and the value f(a) for a sequence a, can compute f(gt(a)) in . Intuitively, we need to be able to compute the new aggregated value of an interval when we only know the old aggregated value and the update that happened since. The implementation works by assigning nodes to ranges in a binary tree-like fashion and storing f(al, ..., ar) and a lazy update t inside the node responsible for the range [l, r].The tree can be implemented without any further knowledge about the internal structure. It just needs implementations of f, c, h and it needs to know tid.This characterization is general enough to directly allow a scenario like: allow range queries for min, max and sum of an interval allow range updates "set all to v" and "add v to each value" Here the values in M are tuples (min, max, sum, length) and the values in T are tuples (action_type, v). The functions f (combine range data), c (combine updates) and h (apply update to range data) look like this (in pseudo-Haskell pattern matching syntax): f (min1, max1, sum1, length1) (min2, max2, sum2, length2) = (min min1 min2, max max1, max2, sum1 + sum2, length1 + length2) c _ (Set, v) = (Set, v) c (Add, x) (Add, y) = (Add, x + y) c (Set, x) (Add, y) = (Set, x + y) h (Set, v) (min, max, sum, length) = (v, v, v, length) h (Add, v) (min, max, sum, length) = (min + v, max + v, sum + v * length, length) We also need to supply tid = (add, 0).For problems like http://www.spoj.com/problems/SEGSQRSS/ or http://www.spoj.com/problems/GSS3/ we need to get a bit more creative and store additional values in the M tuples that are not actually needed to answer queries directly, but to implement the function h efficiently.
 » Hi I have found one good example here Segment Tree — Data Structure
 » Correct me if I'm wrong, but an O(nlogn) is getting TLE on the problem you stated initially. Here is the problem: Timus 1846Unfortunately, I can't provide a link to my solution since the solutions are private. But I'd be very happy to hear a solution using Segment tree since any segment tree must use O(nlogn). Correct me if i am wrong again!Thanks in advance
•  » » My solution is O(n*log n) using Segment tree and gets AC in 0.39 execution time (in Java).
 » I think this should be named "sparse table with updates". Thanks for your effort anyway!
 » Help Post, Why my solution was getting WA? Problem My Solution. Why WA Plz Help me. To Find Out Plz.