Dominant Path Queries

→ Pay attention

Before contest
Codeforces Round (Div. 3)
43:17:50
Register now »

→ Streams

CodeChef Starters 132 Solution Discussion

By aryanc403

Before stream 21:27:49

Codeforces Round 943 Solution Discussion

By aryanc403

Before stream 45:42:49

View all →

→ Top rated

#	User	Rating
1	ecnerwala	3649
2	Benq	3581
3	jiangly	3578
4	orzdevinwang	3570
5	Geothermal	3569
5	cnnfls_csy	3569
7	tourist	3565
8	maroonrk	3531
9	Radewoosh	3521
10	Um_nik	3482

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	maomao90	174
2	awoo	165
3	adamant	161
4	TheScrasse	160
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	orz	146
8	SecondThread	146
10	pajenegod	145

View all →

→ Find user

→ Recent actions

Detailed →

Epilogue's blog

Dominant Path Queries

By Epilogue, history, 7 months ago, In English

Hi!. I'm solving a problem. Given a tree with N nodes with colors and q queries. For each query given two integers (u, v), answer if the path(u, v) on the tree is dominant. A path is called dominant when the most frequently occuring element appears strictly more than pathlength / 2.

Are there any offline, online solutions for this? If you have any ideas please comment bellow.

Epilogue
7 months ago
23

Comments (23)

Write comment?

NeverCompromise

7 months ago, # |

-28

Did you try to google before asking this? If you did, you would atleast know that it can be solved easily offline with Mo's.

→ Reply

Epilogue

7 months ago, # ^ |

← Rev. 2 →

I know Mo for the subtree one but this one is path query. Can you using MO with it or Did you even read the problem properly?

→ Reply

NeverCompromise

7 months ago, # ^ |

-16

Did you even read the problem properly?

Yes, I read your problem properly. You however, still didn't read my comment properly. A simple google search yields this.

→ Reply

Epilogue

7 months ago, # ^ |

Oh I saw that one before but the constraint is like 5*10^5 so.

Anyways tks for reminding me that.

→ Reply

CentroidDecomposition

7 months ago, # |

← Rev. 6 →

Arbitrarily root the tree. Let $$$cnt(i, j)$$$ denote the frequency of color $$$j$$$ on the path from root to node $$$i$$$. Now the number of occurrences of a color $$$x$$$ in $$$path(u, v)$$$ is $$$cnt(u, x) + cnt(v, x) - 2 * cnt(lca(u, v), x) + (color[lca(u, v)] == x)$$$. A path is dominant if the number of occurrences of the median color in the path is strictly greater than $$$\frac{pathlength}{2}$$$. $$$cnt(i, j)$$$ can be calculated online in $$$\mathcal{O}(\log N)$$$ with the help of persistent segment tree and the median color in a path can be calculated by binary search with range sum queries on the same tree. Overall time complexity will be $$$\mathcal{O}(N \log N + Q \log^2 N)$$$ and memory complexity $$$\mathcal{O}(N \log N)$$$.

The problem can also be solved in $$$\mathcal{O}((N + Q) \log N)$$$ by an approach similar to walking on a segment tree once you have the persistent segment trees containing information on the frequency of colors on the path to root for each node. It can also be solved by a binary lifting approach using the fact that the majority of a sequence must be the majority of one of the sequences when it is partitioned into two.

→ Reply

Epilogue

7 months ago, # ^ |

Wow that's nice. I have thought about Persistent Segment Tree but I didn't think about finding the median on the path. Thanks.

→ Reply

JaberSH1

7 months ago, # ^ |

Actually you can get rid of the binary search and just search the segtree going to the child node with the maximum sum on each step, so it'll become $$$O(NlogN + QlogN)$$$.

→ Reply

CentroidDecomposition

7 months ago, # ^ |

← Rev. 3 →

~~You cannot do that.~~ The segment tree of node $$$i$$$ contains the information of the frequencies of colors on the path from root to node $$$i$$$. To get the frequency of colors on $$$path(u, v)$$$ you need to combine the information of three segment trees. On a step in binary search you need to check the sum of frequencies of colors less than mid. The median will be the largest element for which such sum is less than $$$\frac{pathlength}{2}$$$.

→ Reply

JaberSH1

7 months ago, # ^ |

You can keep the 3 segtree roots with you and go to their right or left children on each step of the search.

→ Reply

CentroidDecomposition

7 months ago, # ^ |

Yeah that works.

→ Reply

chromate00

7 months ago, # |

← Rev. 5 →

Two approaches I can think of.

Approach 1

Approach 2

→ Reply

bicsi

7 months ago, # ^ |

You don’t need HLD, you can just apply Approach 2 as a ‘binary lifting’-like dp and check the candidate in $$$O(\log n)$$$, making it $$$O(q \log n)$$$

→ Reply

chromate00

7 months ago, # ^ |

But how would you adapt the Boyer-Moore Majority Vote on the binary-lifting like method? Can we just try two possible candidates for the majority using the fact (which I have not proven, but does seem trivial) that if an element is a majority of $$$A + B$$$ it must either be a majority of $$$A$$$ or a majority of $$$B$$$?

→ Reply

bicsi

7 months ago, # ^ |

The fact that you mentioned is indeed true, it’s not too hard to prove. However, you can just aggregate the results on the two independent paths with the same combining function.

→ Reply

abdelrahman_rashed

7 months ago, # ^ |

Proof for "if an element is a majority of $$$A + B$$$ it must either be a majority of $$$A$$$ or a majority of $$$B$$$" : $$$\newline$$$ Suppose both $$$A$$$ and $$$B$$$ have length of $$$2n + 1$$$. Then let $$$X$$$ and $$$Y$$$ be the number of majority for $$$A$$$ and $$$B$$$ respectively each with count of $$$n + 1$$$. For the majority of $$$A + B$$$ to be some element other than $$$X$$$ or $$$Y$$$, such element will need to have a count greater than that of $$$X + Y$$$. The maximum count we can get for any element other than $$$X$$$ or $$$Y$$$ is $$$|A + B| - cnt(X) - cnt(Y) = 4n + 2 - 2(n + 1) = 2n$$$. Since $$$2n < 2(n + 1)$$$ then it is impossible for any number other than $$$X$$$ or $$$Y$$$ to be the majority of $$$A + B$$$. $$$\newline$$$
Explanation for "how would you adapt the Boyer-Moore Majority Vote on the binary-lifting like method" : $$$\newline$$$ let $$$lft(x, y)$$$ be the resulting node after lifting $$$x$$$ by $$$y$$$, and let $$$dp_{i,j}$$$ be a pair $$$(cnt, val)$$$ that stores the answer after applying Boyer-Moore on all nodes on the path from $$$lft(i, 2^j)$$$ to $$$i$$$. updating the $$$dp$$$ is as follows : $$$\newline$$$ $$$dp_{i,0} = (1, val_i) \newline$$$ $$$(dp_{i,j} \neq dp_{lft(i, 2^j),j}) \implies dp_{i,j+1} = max(dp_{i,j}, dp_{lft(i, 2^j),j}) \newline$$$ $$$otherwise \implies dp_{i, j+1} = (dp_{i,j}.first + dp_{lft(i, 2^j),j}.first, dp_{i,j}.second) \newline$$$ This $$$dp$$$ can be preprocessed in a similar manner to $$$lft$$$. Now lets consider each query in the form : "is $$$path(x, y)$$$ dominant?", you can get Boyer-Moore for $$$path(x, lca(x, y))$$$ in using binary lifting on $$$dp$$$ in a similar manner to getting $$$lca(x, y)$$$. The answer is simply to check if the value of Boyer-Moore on $$$path(x, y)$$$ is the majority or not.

→ Reply

Epilogue

7 months ago, # |

I solved it using Persistent Segment Tree, I think will try some others approach too.

→ Reply

0x0002

7 months ago, # |

Maybe you can randomly choose some vertexs and count the occurance of it. If you randomly choose $$$x$$$ times, the error rate will be $$$\dfrac{1}{2^x}$$$, so $$$x$$$ be about $$$O(\log V)$$$ is enough.

→ Reply

Mr.Whitebird

7 months ago, # ^ |

thats so cooolll

→ Reply

estoy-re-sebado

7 months ago, # |

← Rev. 3 →

First find a candidate for the most common color by calculating a pair (color, score) with a HLD path query. To combine pairs its easy. $$$(c1,s1)+(c2,s2)$$$ is just $$$(c1,s1+s2)$$$ if $$$c1=c2$$$ and $$$(c1,s1-s2)$$$ if $$$c1!=c2$$$ and $$$s1>s2$$$. Otherwise its $$$(c2, s2-s1)$$$

(The cool property of this operation is that, if there is a dominant color then the result of the operation has that color.

Then use another path query to compute real amount of appearances of that color along path (also HLD).

This way you can do online queries with updates.

It's probably overkill but maybe it gives some ideas

→ Reply

estoy-re-sebado

7 months ago, # |

← Rev. 2 →

Another idea

If the path is short (L < 2sqrt(N)) we can just check it in linear time using a hashmap or something.

If the path is long (L > 2sqrt(N)) the color must have at least L/2 appearances on the path, so it must have at least sqrt(N) appearances on the tree. This means that there will not be many colors that have this property and we can try and check all of them using something like binary lifting idk.

This works for online queries but no update s

→ Reply

TimDee

7 months ago, # ^ |

-8

I don't see that updates would be mentioned in post, so I think your solution can be further simplified:

We can have at most sqrt colours with freq >= sqrt.

For each we can build array d[c][u] = how many nodes have colour 'c' on path from u to root. Then calculating frequency of c on path from u to v is trivial in O(1), path_len >= 2sqrt can be solved in O(sqrt) per query. The other case with small length can be checked in O(length) as you said.

*(Note, it works only if ML allows NsqrtN memory)

→ Reply

1802042

7 months ago, # |

Epilogue problem link plz..

→ Reply

lmn207

4 months ago, # |

An O(q.logn.C) solution is you randomize choosing value in path (u, v) Than break the query(u, v) into queries(u, v, w): count number of value w in path (u, v) you can iterate the value of a and than build fenwick tree => we count in O(logn)

→ Reply