### ganesh_6's blog

By ganesh_6, history, 2 months ago,

Can anyone provide a proper explanation of the A2 problem of Round 2 Facebook Hackercup editorial provided here: https://www.facebook.com/codingcompetitions/hacker-cup/2022/round-2/problems/A2/solution

I could not understand the Video editorial as well.

If you have other solution, kindly explain it. I want to learn and upsolve it. It is indeed a good question. Providing code/implementation the explanation is highly appreciated. Thanks!

• +12

 » 2 months ago, # | ← Rev. 2 →   +24 Let me give it a try. I'll assume you have solved A1.Let's say you want to check if A[l..r] is perfectly balanced. In A1 there were only $26$ different kinds of element, so we could just compare each element's frequency in A[l..m] and A[m+1..r]. But in A2 there are $10^6$ different kinds of element, so we can no longer use this method.Instead of comparing each element's frequency, one might come up with this idea: "what about comparing the sum of elements in A[l..m] and A[m+1..r]?". For example, if A[l..r] was [1, 2, 3, 2, 1, 3], left sum is $1+2+3=6$ and right sum is $2+1+3=6$. Both sums are equal, and [1, 2, 3, 2, 1, 3] is indeed perfectly balanced! But if you are given [1, 3, 2, 2], both sums are equal to $4$, but it's not perfectly balanced. Here you can't distinguish between $\{1, 3\}$ and $\{2, 2\}$.This situation is very analogous to string hashing. If we have a string $S=s_1s_2\cdots s_n$ and use hash function as $h(S) = s_1+s_2+\cdots+s_n \mod P$, same problem will arise. Instead we can use hash function like $h(S) = s_1r^{n-1}+s_2r^{n-2}+\cdots+s_nr^0 \mod P$ and lower chances of collisions.We can apply this hashing method to our problem(obviously you can use different hash functions). If we have some subarray A[l..r] and number of occurrences of $1$ in this subarray is $a_1$, number of occurrences of $2$ is $a_2$, ..., number of occurrences of $10^6$ is $a_{10^6}$, let's hash this subarray into $h(A[l..r]) = a_{10^6}r^{10^6-1}+\cdots+a_2r^1+a_1r^0 \mod P$.(You can choose $r$ and $P$ as you like, maybe two big prime numbers)Here, basically we are compressing all frequency information in a subarray into one single number. So to check if a subarray is perfectly balanced, we can just compare left hash value and right hash value.You might ask: "how to calculate the hash value fast?". Let's replace each $1$s in $A[]$ with $r^0\mod P$, $2$s with $r^1 \mod P$, ..., $10^6$s with $r^{10^6-1} \mod P$. Let's call this new array $B$. Then $h(A[l..r]) = B_l+B_{l+1}+\cdots+B_r \mod P$. Thus, if we use segment tree or fenwick tree, we can calculate the hash value in $O(log\hphantom{\text{_}}n)$.Now, to the main problem: how to check if A[l..r] is almost perfectly balanced? Let's say A[l..r] was almost perfectly balanced, and one extra element(say, $k$) belonged to the left half of the subarray. If we calculate $h(A[l..m])$ and $h(A[m+1 ..r])$, we can see $h(A[l..m])=h(A[m+1 ..r])+r^{k-1} \mod P$, and we get $h(A[l..m])-h(A[m+1 ..r]) \mod P=r^{k-1} \mod P$.So, we can calculate $h(A[l..m])-h(A[m+1 ..r]) \mod P$ and check if this value exists in $\{r^0\mod P, r^1 \mod P, \cdots, r^{10^6-1} \mod P\}$. If it exists, then A[l..r] is almost perfectly balanced. If it doesn't, do the same check for $h(A[m ..r])-h(A[l..m-1]) \mod P$.Hope this helps!
•  » » 2 months ago, # ^ |   0 Very well explained! Thank you!
•  » » 2 months ago, # ^ | ← Rev. 3 →   0 I implemented it as you said, but one problem i am facing is that for different r and P prime values i am getting different answers. If for one set it is passing 10 test cases, it is failing for others. What would be your suggestion? https://pastebin.com/ZUP9cjcD Basically i am facing hashing collision
•  » » » 2 months ago, # ^ | ← Rev. 3 →   0 I see some mistakes in your code. Here are what I found: Line 71: you should run the for loop up to $10^6-1$ Line 88, 89: instead of val, you should use pow(mod, val-1) Line 101, 102: you should do subtraction just like you did in Line 103 To avoid hash collisions, you can have multiple $(r, P)$ pairs and check if each pair gives the same result.
•  » » » » 2 months ago, # ^ | ← Rev. 2 →   0 I have made the changes and check for different values of r and P. Still getting WA https://pastebin.com/HY9XwpDp2 test cases are failing
•  » » » » 2 months ago, # ^ | ← Rev. 3 →   0 Also,i saw people using mt19937_64 rng(chrono::steady_clock::now().time_since_epoch().count()); to generate random number. Anything similar for Java? taran_1407 SecondThread
•  » » » » 2 months ago, # ^ |   0
•  » » » » 2 months ago, # ^ |   0 I finally found that the issue is not collision in my code but as you have mentioned in the editorial: https://www.facebook.com/codingcompetitions/hacker-cup/2022/round-2/problems/A2/solution"It's possible for us to be unlucky and get garbage that maps back to a value in h(A)" is the actual problem.
•  » » » » 6 weeks ago, # ^ |   0 Today I was reading an article and realized the meaning of your statement "To avoid hash collisions, you can have multiple (r, P) pairs and check if each pair gives the same result." is having 2 different hashes and if it satisfies all of them then true otherwise false.
 » 2 months ago, # |   0 I solved $A2$ with double multiplicative hashing. Why double hashing?Single hashing gave me FST (due to collisions in hash) which leads to disqualify for Round 3 :(Here is my submission.
•  » » 2 months ago, # ^ |   0 Yeah i am facing same issue currently. Thanks, i will look into this.
•  » » 2 months ago, # ^ | ← Rev. 2 →   0 Can you explain how you did the double multiplicative hashing? or suggest a resource to learn that
•  » » 2 months ago, # ^ |   0
•  » » 2 months ago, # ^ |   0 Your double hashing should help me.
•  » » » 4 weeks ago, # ^ |   0 Today I was reading an article and realized the meaning of your statement "To avoid hash collisions, you can have multiple (r, P) pairs and check if each pair gives the same result." is having 2 different hashes and if it satisfies all of them then true otherwise false.