Is there a simple solution to the problem below in $$$O(1)$$$ per query with reasonable preprocessing, such as $$$O(N \log N)$$$?

**Problem:** Given two nodes $$$u$$$ and $$$v$$$ in a tree, find the second node on the path from $$$u$$$ to $$$v$$$.

If $$$u$$$ is not an ancestor of $$$v$$$, this is easy: the answer is simply the parent of $$$u$$$. If $$$u$$$ is an ancestor of $$$v$$$, though, then the answer is one of the children of $$$u$$$. You could compute the answer with jump pointers in $$$O(\log N)$$$ per query, but is there any easy way to find it in $$$O(1)$$$ per query?
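For concreteness, the jump-pointer baseline could look like this (a minimal Python sketch; the 0-indexing, the `parent[root] == root` convention, and all names are my own assumptions):

```python
# Sketch of the O(log N) per query jump-pointer (binary lifting) baseline.
# Assumes 0-indexed vertices with parent[root] == root; names are my own.
LOG = 17  # supports N up to 2^17

def preprocess(n, parent):
    """up[j][x] = 2^j-th ancestor of x (saturating at the root)."""
    up = [parent[:]]
    for j in range(1, LOG):
        up.append([up[j - 1][up[j - 1][x]] for x in range(n)])
    return up

def kth_ancestor(up, x, k):
    """Walk up k edges using the binary representation of k."""
    for j in range(LOG):
        if k >> j & 1:
            x = up[j][x]
    return x

def second_on_path(up, depth, u, v):
    """Second node on the path from u to v when u is a proper ancestor
    of v: the (depth[v] - depth[u] - 1)-th ancestor of v."""
    return kth_ancestor(up, v, depth[v] - depth[u] - 1)
```

For example, on the tree with edges 0-1, 0-2, 1-3, 1-4, 2-5, the second node on the path from 0 to 3 is 1.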

I can give you an amortized O(1) solution, given that you give me these queries offline.

Thank you for your help. I was hoping for something online, but out of curiosity, what is your solution to the offline version?

Actually, I don't know the solution.

I just wanted to see if you were down to earth and would accept a solution from a person who knows nothing compared to you.

You're a good person. Thanks for the polite reply. I hope you do well in life!

In the offline version you can simply run a dfs: maintain the current stack of ancestors, and when you visit a vertex `v` you can access an ancestor at any height in $$$O(1)$$$ time.

Do a dfs on the tree so that every subtree gets an associated interval. When querying, assume u is an ancestor of v (checkable in O(1) using the intervals; otherwise, as you've mentioned, the query is trivial anyway). The easy approach would normally be to binary search through the children's intervals. Instead, if the interval of your current node is [l, r], you could try to precompute, for each i between l and r, the child whose interval contains i, in O(r-l) per node — that is, O(size of subtree). This is obviously still not good enough: we'd have O(1) query but O(N^2) precomputation. So what can be done is: first, for simplicity, reorder the children (preferably before doing the dfs) so that the first child is the heaviest (this step is not necessary but makes the next part easier to understand). Now, if your range is [l, r] and the range of the heavy child is [l, r'], you only precompute that array for the range [r'+1, r] and consider the heavy child independently (a query landing in [l, r'] answers with the heavy child directly). This is O(sum of the sizes of the children's subtrees, excluding the heaviest) per node, which leads to O(N log N) precomputation (both memory and time). You can then query in O(1).
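A minimal sketch of the offline idea as I read it (group the queries by $$$v$$$ and answer them when the dfs reaches $$$v$$$; all names and the input format are my own assumptions):

```python
def solve_offline(n, children, root, queries):
    """queries: (u, v) pairs with u a proper ancestor of v.  Returns,
    per query, the ancestor of v at depth depth[u] + 1, i.e. the second
    node on the path from u down to v.  Amortized O(1) per query."""
    by_v = [[] for _ in range(n)]
    for qi, (u, v) in enumerate(queries):
        by_v[v].append((qi, u))
    ans = [None] * len(queries)
    depth = [0] * n
    anc = []                        # current root-to-vertex stack
    st = [(root, False)]
    while st:                       # iterative dfs with enter/leave events
        v, leaving = st.pop()
        if leaving:
            anc.pop()
            continue
        anc.append(v)
        depth[v] = len(anc) - 1
        for qi, u in by_v[v]:       # u is already on the ancestor stack
            ans[qi] = anc[depth[u] + 1]
        st.append((v, True))
        for c in children[v]:
            st.append((c, False))
    return ans
```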

You can preprocess explicitly all queries such that the answer is not the biggest son of u, and this is O(n log n) of information. If the answer is not one of the non-biggest sons, then ... it is the biggest son :)

You can do this with a sparse table on the depths in a preorder traversal of the nodes (similar to the reduction from LCA to RMQ). In particular, if $$$u$$$ is an ancestor of $$$v$$$, and the indices in the preorder are $$$i_u$$$ and $$$i_v$$$, then the second node is the vertex in $$$[i_u+1, i_v]$$$ with minimum depth, breaking ties with the node later in the preorder. You can store all of these in a sparse table (store the mins for each length $$$2^0, 2^1, \ldots$$$).
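Here's a rough sketch of this in Python under my own naming and input conventions (0-indexed adjacency lists); not tuned for performance, just to illustrate the tie-breaking:

```python
def build_second_on_path(n, adj, root=0):
    """Returns query(u, v): the second node on the path from u to v,
    for u a proper ancestor of v.  Sparse table over preorder depths,
    minimum depth with ties broken by the LATER preorder position."""
    tin = [0] * n
    order = []
    depth = [0] * n
    st = [root]
    seen = [False] * n
    seen[root] = True
    while st:                       # iterative preorder dfs
        v = st.pop()
        tin[v] = len(order)
        order.append(v)
        for to in reversed(adj[v]):
            if not seen[to]:
                seen[to] = True
                depth[to] = depth[v] + 1
                st.append(to)

    def better(i, j):
        di, dj = depth[order[i]], depth[order[j]]
        if di != dj:
            return i if di < dj else j
        return max(i, j)            # tie: later preorder position wins

    sp = [list(range(n))]           # sp[j][i] covers positions [i, i + 2^j)
    j = 1
    while (1 << j) <= n:
        prev = sp[-1]
        half = 1 << (j - 1)
        sp.append([better(prev[i], prev[i + half])
                   for i in range(n - (1 << j) + 1)])
        j += 1

    def query(u, v):
        l, r = tin[u] + 1, tin[v]   # inclusive preorder range
        k = (r - l + 1).bit_length() - 1
        return order[better(sp[k][l], sp[k][r - (1 << k) + 1])]

    return query
```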

There's an equivalent formulation with an Euler tour, it's the same idea with tiebreaking by latest.

Well, it's probably overkill for this problem, but you can just use some technique for finding the Level Ancestor (link to wikipedia for details).

There are some nice O(1) solutions here; thanks for posting. I was also thinking about this problem earlier today (specifically the version where $$$u$$$ is an ancestor of $$$v$$$) since it's related to today's problem D, and I learned of the O(1) sparse table solution after discussing with saketh (the same solution ecnerwala described above).

Here's what I was using before. It's $$$O(\log \deg u)$$$ rather than O(1), but it's easy to code and definitely faster than jump pointers in practice:

The key idea is that the tour ranges of the children of $$$u$$$ are in sorted order, so we can do a simple binary search to find the right node.

Since degrees are O(1) on average, this will be O(1) on average if our queries are well-dispersed across all the nodes. In the worst case of a star graph where we repeatedly query the center node, our queries will be $$$O(\log n)$$$ but will benefit from caching since we always binary search the same array.
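A possible sketch of this binary-search version (my own names and conventions; assumes $$$u$$$ is a proper ancestor of $$$v$$$):

```python
from bisect import bisect_right

def build_child_search(n, adj, root=0):
    """query(u, v): the child of u on the path toward v, assuming u is
    a proper ancestor of v.  O(log deg u) per query via binary search
    over the starts of the children's (disjoint, sorted) tour ranges."""
    tin = [0] * n
    par = [-1] * n
    order = []
    st = [root]
    seen = [False] * n
    seen[root] = True
    while st:                        # iterative preorder dfs
        v = st.pop()
        tin[v] = len(order)
        order.append(v)
        for to in adj[v]:
            if not seen[to]:
                seen[to] = True
                par[to] = v
                st.append(to)
    children = [[] for _ in range(n)]
    for v in order:
        if par[v] >= 0:
            children[par[v]].append(v)
    for u in range(n):
        children[u].sort(key=lambda c: tin[c])
    starts = [[tin[c] for c in children[u]] for u in range(n)]

    def query(u, v):
        # v's preorder index falls inside exactly one child's range,
        # and the ranges are sorted, so bisect on the range starts
        i = bisect_right(starts[u], tin[v]) - 1
        return children[u][i]

    return query
```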

Thank you so much !!!!

I knew a weird method which can find the $$$k$$$-th ancestor of any node with $$$O(N \log N)$$$ preprocessing and $$$O(1)$$$ per query, which can solve this problem. It is an extension of the regular binary lifting method. I think it is called ladder decomposition, and it can be optimized dirtily into $$$O(N)$$$ preprocessing too.

1) You build the array for binary lifting.

2) Do heavy-light decomposition, but the heavy child is defined as the child with the deepest subtree instead.

3) Now, for each chain in the heavy-light decomposition, let's assume the highest node is $$$v$$$ and the length of the chain is $$$l$$$. You create a thing called a "ladder" by combining the chain with the $$$l$$$ closest ancestors of $$$v$$$.

4) Let's say you now have to find the $$$k$$$-th ancestor of $$$u$$$. First you find the maximum $$$w$$$ such that $$$2^w \le k$$$. Then you find the $$$2^w$$$-th ancestor of $$$u$$$ using the binary lifting array. Let's call it $$$v$$$.

5) Now you have to find the $$$(k-2^w)$$$-th ancestor of $$$v$$$. But you know that the length of the chain containing $$$v$$$ is at least $$$2^w$$$, because $$$v$$$ has a descendant ($$$u$$$) at distance $$$2^w$$$ below it and chains follow the deepest subtree.

6) Because $$$k-2^w \lt 2^w$$$, the ladder corresponding to the chain containing $$$v$$$ also contains the $$$k$$$-th ancestor of $$$u$$$, so you can find it in $$$O(1)$$$.
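The steps above might be sketched like this (my own names and conventions throughout; `parent[root] == root`, 0-indexed vertices, and `k <= depth[v]` assumed):

```python
def build_kth_ancestor(n, parent, root=0):
    """Ladder decomposition sketch: long-path decomposition (heavy child
    = child with the deepest subtree), each path extended upward by its
    own length.  O(N log N) preprocessing, O(1) per query."""
    children = [[] for _ in range(n)]
    for v in range(n):
        if v != root:
            children[parent[v]].append(v)
    bfs = [root]                    # parents appear before children
    for v in bfs:
        bfs.extend(children[v])

    # step 1: binary lifting array
    LOG = max(1, n.bit_length())
    up = [parent[:]]
    up[0][root] = root
    for j in range(1, LOG):
        up.append([up[j - 1][up[j - 1][v]] for v in range(n)])

    # step 2: heavy child = child with the deepest subtree
    height = [0] * n
    heavy = [-1] * n
    for v in reversed(bfs):
        for c in children[v]:
            if height[c] + 1 > height[v]:
                height[v] = height[c] + 1
                heavy[v] = c

    # step 3: each chain of length L becomes a ladder with L extra
    # ancestors above its top (truncated at the root)
    where = [(-1, -1)] * n          # (ladder id, index from the bottom)
    ladders = []
    for v in bfs:
        if v != root and heavy[parent[v]] == v:
            continue                # v is not a chain leader
        path = [v]
        while heavy[path[-1]] != -1:
            path.append(heavy[path[-1]])
        path.reverse()              # deepest vertex first
        x = path[-1]                # top of the chain
        ext = []
        for _ in range(len(path)):
            if x == root:
                break
            x = parent[x]
            ext.append(x)
        lid = len(ladders)
        ladders.append(path + ext)
        for i, y in enumerate(path):
            where[y] = (lid, i)

    # steps 4-6: one binary-lifting jump, then one ladder lookup
    def kth_ancestor(v, k):
        if k == 0:
            return v
        w = k.bit_length() - 1      # largest w with 2^w <= k
        x = up[w][v]                # 2^w-th ancestor of v
        lid, i = where[x]           # x's chain is at least 2^w long,
        return ladders[lid][i + (k - (1 << w))]  # so this index exists

    return kth_ancestor
```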

Are you using "classic" HLD with choosing the subtree with biggest size, or the deepest subtree?

I think the approach will not work with the classic HLD.

Yes you are right. I forgot to mention that :P It is fixed now