retrograd's blog

By retrograd, history, 5 months ago, In English,

There seems to be a lot of encouragement for new people to learn segment trees, and in particular the lazy propagation technique, and it seems to me that most of the time it is not actually needed.

As a quick refresher (although I feel most of you would already know), lazy propagation is a technique in segment trees that lets you do updates on whole ranges of elements, by first only updating the smallest factoring of the update range in the tree. For example, in order to update range $$$[2, 5]$$$, what you do is you update only ranges $$$[2, 2], [3, 4], [5, 5]$$$ in the segment tree, and next time node $$$[3, 4]$$$ is accessed you "propagate" the updates downward into ranges $$$[3, 3]$$$ and $$$[4, 4]$$$. This allows you to effectively aggregate (combine) multiple updates on a given range, to save extra work.

Usually when people talk about lazy propagation, some tasks like "add on range" — "minimum on range" or "add on range" — "sum of range" naturally come to mind. However, I feel like these are poor examples of applications, as most of these can be solved without lazy propagation.

OMG, how is that possible?

An alternative way one could approach these kind of problems is to keep an extra array $$$lazy$$$ with the usual segment tree. $$$lazy(node)$$$ is an aggregate of all the update operations done on $$$node$$$ until present time. In the above examples, $$$lazy(node)$$$ would be the sum of all the updates done on the node. Then the recurrence becomes $$$data(node) = lazy(node) + min(data(node * 2), data(node * 2 + 1))$$$ in the "minimum on range" case and $$$data(node) = data(node * 2) + data(node * 2 + 1) + lazy(node) * length(node)$$$ in the "sum on range" case (here $$$length(node)$$$ denotes the length of the range corresponding to $$$node$$$ in the tree.

The way to query is by having an extra parameter passed down in the traversal, which aggregates the operations inside $$$lazy$$$ while going down in the tree. The technique is similar to something called "upwards-downwards dynamic programming on trees" in Romania.

An example implementation is here.

So, you still store lazy array, but you just don't propagate. Why should we care?

Honestly, firstly I would say that it's more simple and natural. A lot of data structure problems I've encountered need some sort of "lazy" value that holds an aggregate of operations that affect the whole structure (take a simple data structure problem, where you have to model adding a constant value to all elements of a collection, and pop the min element, which can be done by using a priority queue, along with an extra value that lazily aggregates all add operations).

Second of all, I think it's easier to debug a solution that does not propagate, as I feel that I can reason about the correctness of the segment tree values more easily with this approach. In contests like ACM ICPC, it is key to debug on paper as much as possible when applicable, and with this approach one could simulate the range updates and build expectations upon the data structure at different points in time easier.

Third of all (and maybe most importantly), it is way faster. I will probably make a full comparison if people are interested, but from my experience I found that lazy propagation yields a particularly slow solution (hidden constant is bigger), probably because it does a lot more mutations on the underlying array, and this approach reduces that constant very considerably.

Ok, then, why ever propagate?

Well, it seems that not all problems can be solved via this method. For example, a "set to constant value on range" and "sum on range" type problem cannot be easily solved by this. What is particular about these scenarios is that the update operations are order-dependent (in other words, they are not commutative). However, in a lot of cases the update operations are commutative, and I don't see why any of these kind of problems would use lazy propagation instead of this technique.

I'm curious to hear your opinions on this.

 
 
 
 
  • Vote: I like it
  • +70
  • Vote: I do not like it

»
5 months ago, # |
  Vote: I like it +34 Vote: I do not like it

Wait, so that recursively passed val in your code is the same as the lazy value in "standard" lazy propagation? Meh, then there isn't really a difference IMO.

  • »
    »
    5 months ago, # ^ |
      Vote: I like it +23 Vote: I do not like it

    Maybe. However, some tasks require you to design a data structure that can handle for example "add on range" and "query for global minimum". In this case there wouldn't be any actual "propagation" downwards, and performance-wise it is notably better to avoid propagating here. Still, I feel that a lot of people would actually do lazy propagation in this scenario, which is slower and a bit more cumbersome.

»
5 months ago, # |
Rev. 3   Vote: I like it +8 Vote: I do not like it

I think you answered yourself. However, in a lot of cases the update operations are commutative, and I don't see why any of these kind of problems would use lazy propagation instead of this technique.

But, in my opinion, is an optimized version of lazy propagation. You still stop and continue when another query/update has to get deeper nodes. You just do the propagation by sum along the path.

As a Romanian, I have to say that I used just the second version, I never thought of the first one.

»
5 months ago, # |
  Vote: I like it +8 Vote: I do not like it

There seems to be a lot of encouragement for new people to learn segment trees, and in particular the lazy propagation technique

But there's not so much of encouragement to use it blindly, I believe. Mine impression is that people usually learn both techniques, avoiding (and under-practicing) lazy-prop exactly because it is comparatively cumbersome.

»
5 months ago, # |
  Vote: I like it +33 Vote: I do not like it

It's called "标记永久化" in Chinese. Quite well-known in China.

»
5 months ago, # |
  Vote: I like it +46 Vote: I do not like it

clickbait of the year

  • »
    »
    5 months ago, # ^ |
      Vote: I like it +13 Vote: I do not like it

    My contribution has been going down during the whole year. Something had to be done about it.

»
5 months ago, # |
  Vote: I like it +39 Vote: I do not like it

More interesting thing that you can't do lazy propagation in 2D segment tree, but you can do "add on rectangle" query using described approach.

  • »
    »
    5 months ago, # ^ |
      Vote: I like it +19 Vote: I do not like it

    Could you please elaborate on that? What is the recurrence relation upwards in the 2D case?

»
2 months ago, # |
Rev. 4   Vote: I like it +10 Vote: I do not like it

Maybe some lazytags are not propagatable?

For example, the simplest one: all numbers in an interval set to a give number (interval covering)

Oh I see you may want to say: some problems can solve by not propagating lazytags, but why many people choose to do that?

I may say, this is a clever way, but not everytime easier to think? I think propagating is more clear and understandable. when you solving a problem, this helps you think faster, maybe?

In practice, I also found that it's not easier to write, at least for my segment-tree coding style, may because I'm not very familiar to this method?

More precisely, my segment-tree used Query(int node_id, int node_l, int node_r, int query_l, int query_r)-like functions. To calculate a tag's contribution, it must write sum += (max(node_r, query_r) - min(node_l, query_l) + 1) * tag_val[node_id]-like codes. It's definitely more complex than the old one: return sum[node_id].

  • »
    »
    2 months ago, # ^ |
    Rev. 3   Vote: I like it +10 Vote: I do not like it

    I see.

    Funny enough, usually my way of coding walks on segment tree (courtesy of Petr's streams, LOL) starts with the equivalent of query_l = max(query_l, node_l); query_r = min(query_r, node_r), which might make everything substantially easier in this approach, hehe.

    On a side note: I'm not sure why you crossed the first two sentences. As I remember, I didn't manage to find a way to use this approach with "set value on interval". Have you?

    • »
      »
      »
      2 months ago, # ^ |
      Rev. 4   Vote: I like it +5 Vote: I do not like it

      Yeah, it depends on different sgt coding styles.

      For the second question, here's some ODT's opinion (famous in China because high-level DS(and sqrt-decomposition) understandings and hard DS(and sqrt-decomposition) problems):

      For "标记永久化", the tags must "be commutative" in some way, that means for an node, if two tags have been added on it at different time, when we swap them, the "information" of the node should not change.

      But regular tag may not have the property, just like "set value on interval" tags. That's why it cannot solve some types of problems.

      (Sorry for not to read your blog carefully, I just saw you claimed the same idea in the last paragraph. Apologize for that.)


      Add some my own opinion: in persistent sgts, when it needs to do range modify operations, if we use don't propagate and use this method, it'll save not only time constant but also space, and most of times, smaller space means faster addressing and cache, another great use!

    • »
      »
      »
      2 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      There's a way to not use propagate and solve the "set value on interval" problem:

      just add another information to the tag, that is, the time when it's operated.

      You can find that, now it's commutative somehow, and solving the problem is possible.

      Is every tag type can be modified to be commutative like this? Unsure. Further discussion welcomed.

      • »
        »
        »
        »
        2 months ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        You could do that, but I think you wouldn’t be able to apply the lazy tag, you would have to somehow know the values in the subtree at that specific time point, and I don’t see how you would be able to.

        To make it more clear, say you set (1, 1) to be 1, then (1, 2) to be 2 and (2, 2) to be 3. How would you know in the parent node that the leftmost value should be a 2 instead of a 1, whereas the rightmost value should be unchanged?

        • »
          »
          »
          »
          »
          2 months ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Ugh.. seems I made a mistake, I wasn't thinking carefully at that time. Forget about it. We have to admit that lazy-propagate is a great trick right ;)

»
2 months ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

What about persistent segment trees? I think in many cases lazy propagation can be replaced by persistent data structures. However, multiplication + addition on segment can't be done without lazy propagation.

Simple OCaml code for setting interval values
  • »
    »
    2 months ago, # ^ |
    Rev. 2   Vote: I like it +8 Vote: I do not like it

    I think the main problem with persistent trees is the big memory overhead, which makes it frowned upon, as memory overhead rapidly translates into slower solutions, which is one of the key points in my blog about non-lazy probagation.

    I think you can handle multiplication + addition in the same way you handle it on a regular lazy propagation segment tree (you keep a $$$\text{lazy_add}$$$ and $$$\text{lazy_mul}$$$ for each segment, such that, for every value $$$x$$$ inside the segment, its real value is $$$x \cdot \text{lazy_mul} + \text{lazy_add}$$$ (initially, $$$\text{lazy_mul} = 1$$$ and $$$\text{lazy_add} = 0$$$, additions will transform $$$(\text{lazy_add}, \text{lazy_mul}) \rightarrow (\text{lazy_add} + z, \text{lazy_mul})$$$, and multiplications will transform $$$(\text{lazy_add}, \text{lazy_mul}) \rightarrow (\text{lazy_add} \cdot z, \text{lazy_mul} \cdot z)$$$).

    Why are you writing competitive programming snippets in OCaml, anyways?

»
2 months ago, # |
  Vote: I like it 0 Vote: I do not like it

i didn't fully under stood your method like if we have update on range 3-4 we will go to node 3-4 and store in the lazy of 3-4 the update. no if the query is of 1-4 we will reach the node 1-4 and take all the values in the segment 1-4 and lazy 1-4 and go back from here.we are not goning to 3-4 so how do we come to know about the update ??? please help.