ZeyadKhattab's blog

By ZeyadKhattab, history, 4 years ago, In English

A Segment Tree query works by breaking down a segment that is queried into several pre-computed segment tree segments and merge them together to find the answer for the queried segment.

What is a tight upper bound for the number of segment tree segments that are merged together?

I believe it's $$$2*log(n)$$$ because the following scenario is possible:

4 nodes are visited in the majority of the segment tree levels, 2 of them make recursive calls, (visiting 4 more in the next level), and 2 segments are returned.

Is this analysis correct?

Is there a tighter upper bound?

  • Vote: I like it
  • +36
  • Vote: I do not like it

»
4 years ago, # |
  Vote: I like it +26 Vote: I do not like it

$$$2 \cdot lg(n)$$$ sounds reasonable, but is it as tight as it can be?

Imagine we're trying to create the query to maximize the number of segments. For convenience, round $$$n$$$ up to the nearest power of 2 (easiest to work with, and values of $$$n$$$ that are rounded up will have fewer segments overall). Consider what happens at each node of the segment tree. Let's divide this into cases:

Trivial cases: the query doesn't intersect the tree segment at all (+0 segments), or the tree segment is completely contained within the query (+1 segment).

Case 1: The query is completely contained inside one of the halves. We don't want this to happen until the very end, and if this case happens, we also might be able to extend the query into the other half to give us more segments.

Case 2: The query completely contains one of the halves. This adds 1 segment, and we recurse into the other half if necessary.

Case 3: The query is split between the two halves, and fully contains neither of them. Then we will recurse into both halves and solve them independently.

The important thing to note (this is where the segment tree gets its efficiency) is that case 3 will only happen at most once. Why? Let's say we split our query interval. Then on both halves that we recurse into, the query segment will 'hug' either the left or right end of the tree segment. Then, if case 3 were to happen again, it would be impossible without also satisfying case 2 (which takes priority). You can also see that once a segment starts hugging an end, all future segments that we will recurse into will also be hugging an end.

So if we can only split our query at most once, let's do it at the beginning to maximize our intervals. We also want case 2 to happen as much as possible, so we'll try to cover a lot. The optimal interval must cross between $$$n/2$$$ and $$$n/2 + 1$$$ ($$$1$$$-indexed). But if we cover the whole array, we'll just end up with one segment. To provoke case 2 as much as possible, the query segment should include everything except for $$$1$$$ and $$$n$$$.

How many segments do we actually get (assuming $$$n$$$ is a power of 2)? We'll get two segments of length $$$1$$$, two of length $$$2$$$, two of length $$$4$$$, and so on until (and including) $$$n/4$$$. This comes out to exactly $$$2 \cdot \lceil lg(n) \rceil - 2$$$ segments (for example, ($$$2$$$, $$$7$$$) for $$$n = 8$$$ requires $$$4$$$ segments — ($$$2$$$, $$$2$$$), ($$$3$$$, $$$4$$$), ($$$5$$$, $$$6$$$), ($$$7$$$, $$$7$$$)). Maybe it's lower when $$$n$$$ is not a power of 2, but to actually prove things it might get more complex. Asymptotically, there's no way we can do better than around $$$2 \cdot lg(n)$$$.