Upper Bound for number of values returned in Segment Tree query

→ Pay attention

Before contest
Codeforces Round 941 (Div. 1)
17:53:21
Register now »

*has extra registration

Before contest
Codeforces Round 941 (Div. 2)
17:53:21
Register now »

*has extra registration

→ Streams

Atcoder ABC #351 Short Solution Discussion

By aryanc403

Before stream 16:58:21

View all →

→ Top rated

#	User	Rating
1	ecnerwala	3649
2	Benq	3581
3	orzdevinwang	3570
4	Geothermal	3569
4	cnnfls_csy	3569
6	tourist	3565
7	maroonrk	3531
8	Radewoosh	3521
9	Um_nik	3482
10	jiangly	3468

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	161
4	TheScrasse	159
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	SecondThread	147
9	orz	146
10	pajenegod	145

View all →

→ Find user

→ Recent actions

Detailed →

ZeyadKhattab's blog

Upper Bound for number of values returned in Segment Tree query

By ZeyadKhattab, history, 4 years ago, In English

A Segment Tree query works by breaking down a segment that is queried into several pre-computed segment tree segments and merge them together to find the answer for the queried segment.

What is a tight upper bound for the number of segment tree segments that are merged together?

I believe it's $$$2*log(n)$$$ because the following scenario is possible:

4 nodes are visited in the majority of the segment tree levels, 2 of them make recursive calls, (visiting 4 more in the next level), and 2 segments are returned.

Is this analysis correct?

Is there a tighter upper bound?

#segment tree

ZeyadKhattab
4 years ago
1

Comments (1)

Write comment?

galen_colin

4 years ago, # |

+26

$$$2 \cdot lg(n)$$$ sounds reasonable, but is it as tight as it can be?

Imagine we're trying to create the query to maximize the number of segments. For convenience, round $$$n$$$ up to the nearest power of 2 (easiest to work with, and values of $$$n$$$ that are rounded up will have fewer segments overall). Consider what happens at each node of the segment tree. Let's divide this into cases:

Trivial cases: the query doesn't intersect the tree segment at all (+0 segments), or the tree segment is completely contained within the query (+1 segment).

Case 1: The query is completely contained inside one of the halves. We don't want this to happen until the very end, and if this case happens, we also might be able to extend the query into the other half to give us more segments.

Case 2: The query completely contains one of the halves. This adds 1 segment, and we recurse into the other half if necessary.

Case 3: The query is split between the two halves, and fully contains neither of them. Then we will recurse into both halves and solve them independently.

The important thing to note (this is where the segment tree gets its efficiency) is that case 3 will only happen at most once. Why? Let's say we split our query interval. Then on both halves that we recurse into, the query segment will 'hug' either the left or right end of the tree segment. Then, if case 3 were to happen again, it would be impossible without also satisfying case 2 (which takes priority). You can also see that once a segment starts hugging an end, all future segments that we will recurse into will also be hugging an end.

So if we can only split our query at most once, let's do it at the beginning to maximize our intervals. We also want case 2 to happen as much as possible, so we'll try to cover a lot. The optimal interval must cross between $$$n/2$$$ and $$$n/2 + 1$$$ ($$$1$$$-indexed). But if we cover the whole array, we'll just end up with one segment. To provoke case 2 as much as possible, the query segment should include everything except for $$$1$$$ and $$$n$$$.

How many segments do we actually get (assuming $$$n$$$ is a power of 2)? We'll get two segments of length $$$1$$$, two of length $$$2$$$, two of length $$$4$$$, and so on until (and including) $$$n/4$$$. This comes out to exactly $$$2 \cdot \lceil lg(n) \rceil - 2$$$ segments (for example, ($$$2$$$, $$$7$$$) for $$$n = 8$$$ requires $$$4$$$ segments — ($$$2$$$, $$$2$$$), ($$$3$$$, $$$4$$$), ($$$5$$$, $$$6$$$), ($$$7$$$, $$$7$$$)). Maybe it's lower when $$$n$$$ is not a power of 2, but to actually prove things it might get more complex. Asymptotically, there's no way we can do better than around $$$2 \cdot lg(n)$$$.

→ Reply