When submitting a solution in C++, please select either C++14 (GCC 6-32) or C++17 (GCC 7-32) as your compiler. ×

BERNARD's blog

By BERNARD, 3 years ago, In English

Sometimes you might need to use multiple segment trees over some set of elements, for example, you have to answer range queries on a single array from a lot of small arrays or to make a segment tree on a tree to answer path queries.

I knew that in cases like these, it might be more efficient to use a single segment tree to represent everything, for example, instead of building a segment tree on each one of the small arrays, you represent each array as a continuous range in one big segment tree.

I've used this whenever I needed an unknown/big number of segment trees, thinking that this is more efficient and uses less memory, but sometimes it wasn't the case as doing this has made my code slower, I once ended up doing a binary search on every query to find the actual borders of the queries, which I could've avoided by making a separate segment tree on each array.

I've thought about the advantages of using this technique, but I really couldn't find any.

I've found that, in both ways, I'm using the same amount of memory ($$$4\sum n = \sum 4n$$$), and both had the same speed.

So what I want to ask is, when does this technique give me real practical advantages to not use multiple segment trees in those cases? (other than HLD)

  • Vote: I like it
  • +60
  • Vote: I do not like it

»
3 years ago, # |
  Vote: I like it +9 Vote: I do not like it

I don't think there are any. In fact, using multiple segment trees could be better in terms of performance. Since they are more likely to be cache-friendly.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +7 Vote: I do not like it

    I've actually thought the contrary, but now after a bit more thinking... yes, it is indeed more cache-friendly.

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Ok so let's say you have a big segtree of size $$$N$$$, and instead of that you could have had smaller segtrees of size $$$n$$$ each. (Here I have assumed for simplicity that all smaller segtrees are same in size but it doesn't matter)

Anyway... We have $$$n \le N$$$, so obviously, $$$log(n) \le log(N)$$$ (cost of a single update/query). Using multiple segtrees is the asymptotically better option.

Regarding cache-friendly behaviour, I am not very sure about this but it seems like smaller segtrees are more likely to be better but can't explain why.