XDEv11's blog

By XDEv11, 3 years ago, In English

Hey, codeforces community, this is my first blog, so your suggestions are very welcome. Also, I am not a native speaker. Please excuse language errors and let me know.

I recently found this blog entry, Efficient and easy segment trees. It's an old post from six years ago, so there might be a proof about this. However, I don't see one by far. That's why I try to write it. Hope it will help someone who wants to know this a little bit more.

Definition

We call a node legal if

  1. It's between n and n + (n - 1). (an array element)
  2. Both its children are legal and represent subarrays of same size. (a combined value from two children)

Otherwise, it's called illegal.

No overlaps

Let $$$[L_i:R_i)$$$ be the left-closed right-open interval of the i-th layer from bottom (a continuous sequence of legal nodes that represent subarrays of same size.). We know that $$$L_0=n$$$ and $$$R_0=2n$$$, and we have $$$L_{i+1}={\Large\lceil}\frac{L_i}{2}{\Large\rceil}$$$ and $$$R_{i+1}={\Large\lfloor}\frac{R_i}{2}{\Large\rfloor}$$$.

We want to prove that $$$R_{i+1}\le L_i$$$. First, the statement holds for $$$i=0$$$ because $$$R_1=n\le L_0=n$$$. Second, assume it holds for $$$i=k$$$ ($$$R_{k+1}\le L_k$$$), then it also holds for $$$i=k+1$$$ because $$$R_{k+2}={\Large\lfloor}\frac{R_{k+1}}{2}{\Large\rfloor}\le\frac{R_{k+1}}{2}\le\frac{L_k}{2}\le{\Large\lceil}\frac{L_k}{2}{\Large\rceil}=L_{k+1}$$$. Thus, by mathematical induction, the statement holds for every $$$i\ge0$$$.

As a consequence, the structure of this kind of segment tree with an arbitrary size is always feasible.

Correctness of modify()

void modify(int p, int value) { // set value at position p
    for (t[p += n] = value; p > 1; p >>= 1) t[p>>1] = t[p] + t[p^1];
}

It starts from a leaf node and keeps going up to its parent, so we can make sure every legal node associated with this array element has been updated after modify().

Although, some values in illegal nodes have also been changed, we don't use them and don't care anyway.

Correctness of query()

int query(int l, int r) { // sum on interval [l, r)
    int res = 0;
    for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
        if (l&1) res += t[l++];
        if (r&1) res += t[--r];
    }
    return res;
}

Here, we have a loop invariant -- All nodes in $$$[$$$l$$$:$$$r$$$)$$$ are legal.

At the beginning of the loop, it's easy to see it's true if we get the proper parameters.

Every time it enters the loop, l is less than r, and after the two expressions, l is less than or equal to r.

  • If l is less than r and $$$[$$$l$$$:$$$r$$$)$$$ shrinks to $$$[$$$l >> 1$$$:$$$r >> 1$$$)$$$. By definition, a node is legal if both children are legal, so the loop invariant holds.
  • Or if l is equal to r, l >> 1 is surely equal to r >> 1 and the loop is terminated.

Why we get the answer this way? Every time before $$$[$$$l$$$:$$$r$$$)$$$ shrinking to $$$[$$$l >> 1$$$:$$$r >> 1$$$)$$$, if l is a right child or r - 1 is a left child, the values of corresponding array elements are counted into res, and the others are still covered by the interval. At the end, the interval becomes empty, so all array elements in the original interval have already been counted . Moreover, with the loop invariant, no illegal node would ever be counted.

Advantage

  • Compact code
  • Easy to implement
  • No recursion and less computations (very efficient)

Disadvantage

  • Some space ("illegal" nodes) are wasted, so we lost a little benefits from segment tree. There is an mothod that also uses $$$2N$$$ memory and fully use them.
  • Querying some special intervals like $$$[0:n)$$$ are a bit slower than other implementations because it still goes up the tree.

Here is my code with OOP and GP. Thanks for D4nnyLee for collaboration. Feel free for your own usage. For simplicity, I didn't write a build() function. You can just write a loop to set every array element.

// segment tree
// contributed by XDEv11 & D4nnyLee
template<typename value_t, class merge_t>
class SGT {
    int n;
    vector<value_t> t; // root starts at 1
    merge_t merge; // associative function for SGT
public:
    explicit SGT(int _n, const merge_t& _merge = merge_t{})
        : n{_n}, t(2 * n), merge{_merge} {}
    void modify(int p, const value_t& x) {
        for (t[p += n] = x; p > 1; p >>= 1) t[p >> 1] = merge(t[p], t[p ^ 1]);
    }
    value_t query(int l, int r, value_t init) { // [l:r)
        for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
            if (l & 1) init = merge(init, t[l++]);
            if (r & 1) init = merge(init, t[--r]);
        }
        return init;
    }
};

For generic reason, I add the third parameter "init" in query() function. Although, you can add a bool or use std::optional since c++17 to avoid this, it's more intricate in my opinion.

Examples of how to use it :

query for sum
query for min

By the way, if you want to do other modifications rather than assignment, you can query that element first and assign its new value or you can change t[p += n] = x in modify() to what you need.

Thanks the original author for this amazing implementation and MikeMirzayanov for the excellent platform.

UPD: My implementation has been changed. (std::function removed)

UPD2: More formal proof about "No overlaps".

Full text and comments »

  • Vote: I like it
  • +109
  • Vote: I do not like it