Meta Hacker Cup Round 2 Editorial
Good morning!
As usual, written editorials are available in the solution section of all problems on the Hacker Cup site. Feel free to give problem feedback and discuss solutions here.
A1/A2: Perfectly Balanced
Feedback
B: Balance Sheet
Feedback
C: Balance Scale
Feedback
D1/D2: WorkLife Balance
Feedback
Feel free to leave your thoughts on the problems below. I hope you found the round, and the scoring distribution, almost perfectly balanced. :)
Can anyone explain how to handle the equal weights in problem C, the Balance Scale? I couldn't grasp the editorial clearly. Thanks in advance.
Anything with an equal weight to the chocolate chips has the same chance of being left as the chocolate chips do. So if you have 3 chocolate chips and 7 raisins of equal weight, just pretend you have 10 chocolate chips, and then multiply your answer by 30% at the end.
"Note that it's irrelevant how ties are broken by choosing the left cookie, because for any random subset lined up in a random order to be placed on the scale, it's equally likely for any two cookies of the same weight to be placed on either side"
How are raisins and chocolate chips of the same weight to be equally likely?
Symmetry. Raisins and chocolate chips of same weight are basically identical, so we can just divide up the probability equally among them.
Thank you! This observation is made to ease the implementation. As an analogy: say we have 5 red cars, 10 blue cars, and 17 green cars, all of the same size $$$x\times y\times z$$$, plus other cars of different sizes and colors. The probability that a chosen car of size $$$x\times y\times z$$$ is red is $$$\binom{5}{1}/\binom{5+10+17}{1}$$$. This problem is identical to that.
Is the problem C analogous to this?
Basically we are multiplying the obtained probability (i.e., the probability that the chosen $$$K+1$$$-cookie subset contains at least one cookie of weight $$$W_1$$$) by the probability that the tie is broken in favor of a batch-1 cookie. I understand this and it is crystal clear.
The doubt I have is how the conditional probability equals $$$C_1/(C_1+Y)$$$.
I know that $$$P(A\mid B)=P(A\cap B)/P(B)$$$ by the definition of conditional probability (the basis of Bayes' theorem), but how does the above formula follow from this? Any explanation is appreciated.
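To see how the definition gives $$$C_1/(C_1+Y)$$$: conditioned on the survivor having weight $$$W_1$$$, symmetry makes the survivor uniform over the $$$C_1+Y$$$ equal-weight cookies. A tiny sketch with made-up counts (the names and numbers are illustrative only, not from the problem):

```python
from fractions import Fraction

# Hypothetical numbers: C1 batch-1 cookies and Y other cookies share the
# maximum weight W1 among the chosen cookies; by symmetry the survivor is
# uniform over these C1 + Y cookies.
C1, Y = 3, 7
survivors = ["batch1"] * C1 + ["other"] * Y
P = Fraction(1, len(survivors))                          # uniform probability of each outcome
P_A_and_B = sum(P for s in survivors if s == "batch1")   # A: survivor is from batch 1
P_B = sum(P for s in survivors)                          # B: survivor has weight W1 (all outcomes here)
P_A_given_B = P_A_and_B / P_B                            # definition of conditional probability
assert P_A_given_B == Fraction(C1, C1 + Y)
```

So the formula is just $$$P(A\mid B)=P(A\cap B)/P(B)$$$ applied to a uniform distribution over the tied cookies.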
Good question. This deserves an upvote!
This idea is to ease the implementation? or this is how generally it is solved?
More people than I expected apparently disliked A1/A2. If you're one of them, what didn't you like about it?
As for me, I wrote an AC solution, but because of my small stack size I got an RE on the second validation test, so after 7 minutes I had no chance to send it. In fact, this task is pretty nice, but the system... After the contest I asked my friend to submit it using his laptop and he got AC.
I don't see how this is an issue specific to this problem: everyone has had two rounds to set up their system to run code with a sufficiently large stack size; at this point, competitors who haven't done so are willingly choosing to MLE on any problem with a sufficiently large input.
During those two rounds every submission got accepted, so I had no problems with stack size. But on this particular problem I did. I mean the task was great, but I didn't like the system. Anyway, it is my fault. I just can't see why the authors created such a weird system. There is a really nice system used on Codeforces, where you don't need any local resources to successfully submit your code.
But there's no recursion in the solution for A... Are you sure you didn't just have UB?
I have segment tree that uses recursion. And as I say earlier, my friend submitted the same code and got accepted
So that's about $$$\log n$$$ depth... is your stack really smaller than 10 KB? I don't think so. The default stack is 10 MB. There's a 99% chance you have UB.
The strongest reason I have for disliking the problem is that ideologically, I feel that it is not appropriate to propose problems where the intended solution is not deterministically guaranteed to solve the problem when you only have one submission available to you.
In Code Jam, usually these problems are set as Visible data sets so you can take a relatively informed risk in implementing something simpler with a lower probability of success, versus taking more time to implement something more complicated that has a higher probability of success. However, here, you have no recourse if you got unlucky but had the spirit of the intended solution.
So you don't like any probabilistic problems at all? Even ones like this where you can write solutions where you are more likely to get hit by an asteroid during the contest than get it incorrect by being "unlucky"?
I think my comment makes it clear that my position exists in the context of the one-submission contest format.
But the same thing happens anywhere with system tests too
I strongly disagree. I think that it's ok to use randomized problems in Hacker Cup. It's up to you to get a small enough probability of failing. If you're not ok with
unsigned long long
and a 0.01% chance to fail, then just use a pair of long longs. You would take a "relatively informed risk" with any contest format. (I agree with others that this task was a check of whether you know a technique or not. That's bad.)
You can solve A deterministically using Mo's algorithm with updates.
It's slow but will probably fit in 6 min
I did the same but it didn't run in time on my computer. Can you please share your code? Thanks in advance.
How can we apply Mo’s when there are point updates as well?
Let's say we have a point update at query 10 and another at query 20; then Mo's algorithm can be applied to all the queries between 11 and 19. Though this is not the best approach, it's something I thought could run on my computer in 6 minutes.
I think this approach will fail in the simple case where every alternate query is a point update.
No, why? My code passed the sample.
I mean I was keeping track of all queries of type 2 in a vector; then whenever a query of type 1 came, I first computed the results for all the queries in the vector, cleared the vector, and then performed the update.
Yes, it would result in TLE if the time limit were 1 or 2 seconds.
See https://codeforces.com/blog/entry/72690 and the editorial for the linked problem https://codeforces.com/problemset/problem/940/F.
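For reference, a rough Python sketch of Mo's algorithm with updates (the technique from the linked blog), applied to a generic "count distinct values in a range, with point updates" problem; the function name and example problem are mine, not from the blog:

```python
def mo_with_updates(a, ops):
    """ops: list of ('Q', l, r) (0-indexed, inclusive) or ('U', pos, val).
    Returns the distinct-value count for each query, in input order."""
    n = len(a)
    updates = []            # [pos, new_val, old_val] recorded in order
    queries = []            # (l, r, t, original_index); t = updates seen so far
    state = a[:]            # scratch copy used only to record old values
    for op in ops:
        if op[0] == 'U':
            _, p, v = op
            updates.append([p, v, state[p]])
            state[p] = v
        else:
            _, l, r = op
            queries.append((l, r, len(updates), len(queries)))
    block = max(1, round(n ** (2.0 / 3.0)))   # 3D Mo's uses ~n^(2/3) blocks
    queries.sort(key=lambda q: (q[0] // block, q[1] // block, q[2]))

    cnt, distinct = {}, 0
    def add(x):
        nonlocal distinct
        c = cnt.get(x, 0)
        if c == 0:
            distinct += 1
        cnt[x] = c + 1
    def rem(x):
        nonlocal distinct
        c = cnt[x]
        if c == 1:
            distinct -= 1
        cnt[x] = c - 1

    arr = a[:]              # array with the currently applied updates
    cl, cr, t = 0, -1, 0    # current window [cl, cr] and applied-update count
    ans = [0] * len(queries)
    for l, r, qt, idx in queries:
        while t < qt:       # roll updates forward in time
            p, new, _old = updates[t]
            if cl <= p <= cr:
                rem(arr[p]); add(new)
            arr[p] = new
            t += 1
        while t > qt:       # roll updates backward in time
            t -= 1
            p, _new, old = updates[t]
            if cl <= p <= cr:
                rem(arr[p]); add(old)
            arr[p] = old
        while cl > l: cl -= 1; add(arr[cl])   # grow/shrink the window
        while cr < r: cr += 1; add(arr[cr])
        while cl < l: rem(arr[cl]); cl += 1
        while cr > r: rem(arr[cr]); cr -= 1
        ans[idx] = distinct
    return ans
```

The key extra axis versus plain Mo's is the "time" coordinate: each query remembers how many updates precede it, and updates are applied or rolled back as the sorted query order moves through time.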
Thanks!
In A1 I disliked the special case where Li == Ri (the substring length is 1). Nothing is left after removing this single letter, and both the first and second halves are empty strings. Is the resulting empty string a palindrome? Searching the Internet reveals that some people think it is (because reversing an empty string results in an empty string), but others think it isn't (because an empty string is not considered to be a word).
Fortunately the A2 problem statement mentioned that [1] is an example of an almost perfectly balanced array and this resolved the ambiguity for me. Still a bit of time was wasted wondering whether I need to ask for a clarification or not.
I had this issue too, but it's clarified in the sample output explanation for A1 (fourth query in the third sample). This obviously isn't ideal (imo, problems should be fully specified even without the sample input/output, and thus the sample I/O and explanation should be essentially redundant), but at least the information was there without competitors being forced to look at A2.
+1 on this.
Ideally the problem statement should clearly mention the expected outcome in case of Li == Ri (substring of length 1) without having to take a look at the test cases in sample input / output.
Does anyone know the idea behind not having partial marking for facebook(meta) hackercup ?
E.g. To scale the score by a multiplying factor which is the percentage of passing test cases
Empty string is a word. This can't be a controversy.
What are you even trying to say?
Thanks for showing that you had difficulty providing an unambiguous answer to my simple question, which required understanding what "word" actually means. Also, the definition of "palindrome" predates computers and computer science by several centuries.
Things are always in context, and you can't say anything if you assume you're talking to the stupidest person in the room. If I say "word", it is a sequence of letters. The empty string is a word. It seems your idea of a word, in this discussion of FBHC Round 2, is from Merriam-Webster: "a written or printed character or combination of characters representing a spoken word". That's hilarious, and I hope you freak out next time you are given a 69420-letter "word" as an input.
Agreed that empty string is a palindrome (as it reads same forwards and backwards)
My point was specific to this problem in the context "the first half of the string can be shuffled to make the string a palindrome"
So, in the case of an empty string there is no first half or second half defined as such, and the expected outcome for such a case could be part of the problem statement for better clarity.
But I get your point. Thanks :)
On another note, does anyone know the idea behind not having partial marking for facebook(meta) hackercup ?
E.g. To scale the score by a multiplying factor which is the percentage of passing test cases
well you see, the empty string is clearly a palindrome, the reason is called a vacuous truth.
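In code, the usual "reads the same reversed" definition agrees with the vacuous-truth view:

```python
def is_palindrome(s: str) -> bool:
    # A string is a palindrome iff it equals its reverse;
    # for "" this holds vacuously, since "" reversed is "".
    return s == s[::-1]

assert is_palindrome("")
```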
I wouldn't say I disliked it personally but I didn't find it very interesting. To me it feels like just a knowledge check (do you know how to hash a multiset) followed by trying to implement it cleanly, which can be fun for some people but I personally don't find that super exciting.
It was just boring. I don't like problems using hashing too. Not because I'm afraid of systest fail, I just don't like it.
Can you please give a bit more detailed explanation for A2? I am new to the topic of hashing a multiset and cannot find any good blog. Though I get the idea, I cannot see why the collision probability is low.
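A minimal sketch of the multiset-hashing idea (not the official editorial code; the names are mine): assign every letter an independent uniform random 64-bit value and hash a multiset as the sum of its members' values. Equal multisets always hash equal by construction; two different multisets collide only if two distinct fixed sums of random values happen to coincide, which occurs with probability about $$$2^{-64}$$$ per comparison.

```python
import random

random.seed(12345)  # fixed seed only so this example is reproducible
MASK = (1 << 64) - 1
value = {c: random.getrandbits(64) for c in "abcdefghijklmnopqrstuvwxyz"}

def multiset_hash(s: str) -> int:
    # Order-independent: depends only on the multiset of letters in s.
    return sum(value[c] for c in s) & MASK

# Anagrams (equal multisets) always hash equal, regardless of order.
assert multiset_hash("listen") == multiset_hash("silent")
```

For range queries one would typically keep prefix sums of these values, so the hash of any substring's multiset is a difference of two prefixes; point updates then fit naturally into a Fenwick tree.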
Problems were actually quite balanced.
Great work to the team :)
The editorial of B says that it's too slow to solve it as a general "find the K longest paths in a DAG" problem, but it isn't. We can construct a graph with N vertices and 2N edges, not N^2 edges. We don't need to add an edge to every possible buyer, only to the one with the smallest cost. To compensate, we add an additional edge "resell immediately to the next-cost buyer".
Code is kinda messy and definitely not easier than the intended solution.
I was going to leave a similar comment. Here is my explanation:
In the problem B it is actually possible to build a DAG with a linear number of edges in $$$n$$$ and then run an $$$O(nk\log{k})$$$ solution that is mentioned in the first paragraph. The graph is as follows:
Consider all pairs $$$(a_i, x_i)$$$ and $$$(b_i, y_i)$$$, denote the set of all such pairs by $$$S$$$. For each pair $$$(c, z)\in S$$$, create the two vertices, say, $$$(c, z, \text{in})$$$ and $$$(c, z, \text{out})$$$. Also create another two vertices, the source and the sink.
For each $$$c$$$, if $$$(c, z_1)$$$ and $$$(c, z_2)$$$ are two consecutive pairs from $$$S$$$ (lexicographically), add edges $$$(c, z_1, \text{in})\to(c, z_2, \text{out})$$$ and $$$(c, z_1, \text{out})\to(c, z_2, \text{out})$$$, both with weight $$$z_2 - z_1$$$. For each guy with parameters $$$(a, b, x, y)$$$, create the following three edges: $$$(a, x, \text{out})\to(b, y, \text{in})$$$, $$$\text{source}\to(b, y, \text{in})$$$, $$$(a, x, \text{out})\to\text{sink}$$$; all of weight $$$0$$$.
Sort the vertices in the following order: source first, then all $$$(c, z, ...)$$$ in increasing lex order of $$$(c, z)$$$, then sink. Now all edges go from left to right, hence it's a DAG. The total number of edges is about $$$5n$$$ or something, and each path from source to sink corresponds to a path from the statement, and vice versa.
However, I think that this solution is easier than the one from the editorial, if one is familiar with the "find K longest in a DAG" idea
What's the intuition behind building a graph like that, and how does it help with the question?
The idea is to reduce the problem from "given a set of market players, find top $$$k$$$ transaction chains" to the one of type "given a weighted directed acyclic graph with source $$$s$$$ and sink $$$t$$$, find top $$$k$$$ paths from $$$s$$$ to $$$t$$$". We want to build such a graph that any transaction chain (sorry, I don't remember how it was called in the problem) transforms into a path from $$$s$$$ to $$$t$$$ with length equal to the cost of the chain, and nothing else transforms into such path.
In the graph we built, every path from $$$s$$$ to $$$t$$$ looks like the following: we start from $$$s$$$, then we go to some vertex $$$(c, z_1, \text{in})$$$, from it we can only go to $$$(c, z_2, \text{out})$$$ and then maybe to $$$(c, z_3, \text{out})$$$ and so on, walking over vertices of type $$$(c, z_i, \text{out})$$$. The main idea is that we make at least one "step" here, staying in the $$$(c, \ldots, \ldots)$$$ area. After this we will move forward along some edge $$$(a, x, \text{out})\to(b, y, \text{in})$$$, which corresponds to a market player, and the whole sequence of steps in the $$$(c, \ldots, \ldots)$$$ area has a total length of the profit for the transaction when the previous player sold something on day $$$c$$$, and this guy bought it. Continuing in this manner, we eventually reach the sink from the last market player's $$$(\ldots, \ldots, \text{out})$$$ vertex.
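For completeness, here is a rough sketch of the "top $$$k$$$ longest paths in a DAG" subroutine this reduction feeds into, done as an $$$O(Ek\log k)$$$ DP in topological order (generic code, not tailored to the contest graph; names and the toy graph below are illustrative):

```python
from collections import deque
import heapq

def k_longest_paths(n, edges, s, t, k):
    """Top-k longest path lengths from s to t in a DAG, in descending order.
    edges: list of (u, v, w). Runs in O(E * k log k)."""
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1
    # Kahn's algorithm for a topological order.
    q = deque(i for i in range(n) if indeg[i] == 0)
    topo = []
    while q:
        u = q.popleft()
        topo.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    # best[v] holds the k largest path lengths s -> v found so far (descending).
    best = [[] for _ in range(n)]
    best[s] = [0]
    for u in topo:
        if not best[u]:
            continue          # u is unreachable from s
        for v, w in adj[u]:
            merged = best[v] + [d + w for d in best[u]]
            best[v] = heapq.nlargest(k, merged)
    return best[t]

# Toy DAG: paths 0->1->3 (5), 0->2->3 (6), 0->1->2->3 (8).
edges = [(0, 1, 2), (0, 2, 1), (1, 3, 3), (2, 3, 5), (1, 2, 1)]
assert k_longest_paths(4, edges, 0, 3, 2) == [8, 6]
```

Since every vertex keeps only its top $$$k$$$ candidates and each edge merges two lists of size at most $$$k$$$, this matches the $$$O(nk\log k)$$$ bound mentioned above for a graph with a linear number of edges.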
Also, the editorial of D2 seems to solve the subtask of type "find the smallest index $$$i$$$ in a Fenwick tree over non-negative values such that the $$$i$$$-th prefix sum is at least $$$x$$$" in $$$O(\log^2{n})$$$, while it is possible to do so in $$$O(\log{n})$$$, restoring the answer bit by bit.
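The single-descent trick looks roughly like this (a generic sketch, assuming all stored values are non-negative so prefix sums are non-decreasing):

```python
class Fenwick:
    """Fenwick (BIT) over non-negative values, 1-indexed."""
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)

    def add(self, i, delta):
        while i <= self.n:
            self.t[i] += delta
            i += i & (-i)

    def lower_bound(self, x):
        """Smallest index i with prefix_sum(i) >= x, in one O(log n) descent;
        returns n + 1 if the total sum is less than x."""
        pos = 0
        for k in range(self.n.bit_length(), -1, -1):
            nxt = pos + (1 << k)
            # Step right only while the accumulated prefix stays below x.
            if nxt <= self.n and self.t[nxt] < x:
                pos = nxt
                x -= self.t[nxt]
        return pos + 1

fw = Fenwick(4)
for i, v in enumerate([1, 0, 2, 1], start=1):
    fw.add(i, v)                  # prefix sums: 1, 1, 3, 4
assert fw.lower_bound(2) == 3     # first prefix sum >= 2 is at index 3
```

Each descent step reuses the partial sums already stored in the tree nodes, so no separate prefix-sum query is needed per bit, which is where the extra $$$\log$$$ factor disappears.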
For A2 I used the classic 1e9 + 7 as big prime for hashing and got FST for 1 test case...
saddest person on earth right now
Petition to identify and remove at least one cheater
We’ve removed at least two but they might not have been in the top 500.
Sir search better please
Any good CF blogs to understand the ideas behind problem A2?
this might help
Immediately after reading A, I thought about Manacher's algorithm. Can we modify Manacher's algorithm to fit this problem?
EDIT: Thanks for clarifying. This is clearly about hashing then :3
Probably not easily. The problem really isn’t about palindromes at all, though it looks like it might be initially. The properties of actual palindromes wind up being very dependent on order, and therefore completely unrelated to what we’re interested in for A.
How do we claim our tshirt, and when will it be shipped out?
I really enjoyed the round, though I could only solve A1. Learnt a lot of new stuff. Thank you for such a great round!
Btw, B can be solved in $$$O((N + K) \log N)$$$. You can find the potentials with DP on the DAG, and use Eppstein's K-th shortest path algorithm. (I got a systest fail because of my lack of braveness to
#define int long long
.)
What is wrong with this approach for D1?
Let sum be the total sum of the array, and curr be the sum of the prefix up to index z. We want curr = sum/2. Assume sum is even and curr > sum/2. Then we have 3 scenarios:
1) Replace a 3 in the prefix with a 1 from the suffix, decreasing curr by 2
2) Replace a 2 in the prefix with a 1 from the suffix, decreasing curr by 1
3) Replace a 3 in the prefix with a 2 from the suffix, decreasing curr by 1
We greedily take as many operations of type 1 as we can, then types 2 and 3 (or 3 then 2; I think the order shouldn't matter, or will it?). It gives a wrong answer on 23 test cases. Does the order of the type 2 and type 3 operations matter? Or is it that sometimes it is better to first increase curr and then decrease it?
Nevermind I got what I was looking for
I like problem A2, D1, D2. Overall contest taught me a lot.
I did exactly the intended solution for A1, and it took 6 minutes 30 seconds to run on my computer.
Can you share your code?
sure, https://pastebin.com/EcKAEprW
Not sure if this is the issue, but I couldn't see the prefix array being reset to 0 for each test case.
EDIT1: My bad, please ignore this suggestion. This part is correctly handled, as the level corresponding to position 0 is always 0 and all other values are computed on each run.
EDIT2: This code https://pastebin.com/EcKAEprW takes about 9 seconds to run for all test cases on my computer.
The prefix sum array is being set correctly: the level corresponding to position 0 is always zero, and all other values are reset for each run.
regarding edit 2, really?
wow there must be something wrong with my computer. let me run some more tests.
edit: OK, when I run it on Windows it takes forever, but if I run it on Linux it takes 8 seconds too. Lesson learned... thanks!
If you like to use Windows as your primary OS, I'd consider using WSL to run programs; I've found that file I/O is much faster in WSL (comparably fast to pure Linux) than in Windows' default terminal.
Interesting, thanks for sharing! Your code runs in around 3.5 seconds on my input on my PC (while my code runs in one second), so it seems like your solution really is of the correct complexity. As far as I can tell, your code should have ACed in contest.
With that said, you can improve your constant factor significantly by swapping the dimensions of your prefix sum array (i.e., position dimension first, then letter dimension second). Since you iterate over all letters at the same position at once, organizing your array this way improves cache efficiency. On my computer, this optimization reduces your runtime to 1.2s, about as fast as my code.
Interesting find, thanks for sharing and also on cache efficiency optimization part!
When running the exact same code, the time difference between my machine (~9 seconds) and your machine (~3.5 seconds) could be due to the CPU difference, I think (or hardware differences more generally, including RAM, CPU, and other components, given we ran the same code).
I'm using an older CPU, an Intel(R) Core(TM) i5-5200U @ 2.20GHz; I'm fairly sure yours is better, given the time difference.
Can anyone help me understand where I went so wrong as to receive so many downvotes on my initial attempt to help (apart from the incorrect suggestion, which I later corrected anyway), so that I don't make the same mistake in the future? :)
I suspect the culprit is Windows and scanf. Could you try changing scanf to cin and see if that helps? When you switch from scanf to cin, remember to also put
cin.sync_with_stdio(false);
at the beginning of main. I'm not sure of the details; I just know that I've had problems with scanf on Windows in the past.
Got an approach to Problem C that seems simpler than the editorial:
We model the process as arrangements of length $$$k+1$$$ of the cookies, in which the first cookie is the one placed on the left pan and the cookies after it are in the order in which they are taken and compared with the current cookie on the weighing balance.
The total number of arrangements is $$$\binom{sum}{k+1}(k+1)!$$$ where $$$sum$$$ is the total number of cookies. If any cookie in the arrangement has a weight strictly greater than $$$w_1$$$, then a cookie of batch $$$1$$$ can never be the final remaining cookie. So the favorable arrangements have all cookies of weight at most $$$w_1$$$.
Now declare two variables $$$small$$$ and $$$equal$$$, where $$$small$$$ is the number of cookies having weight less than $$$w_1$$$ and $$$equal$$$ is the number of cookies having weight $$$w_1$$$ and not from batch $$$1$$$. Also, we precompute the factorials and inverse factorials up to $$$MAX=9\times10^6$$$ in $$$O(MAX)$$$ time.
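The precomputation is the standard modular-combinatorics setup; a sketch (assuming the counts are reduced modulo a prime such as $$$10^9+7$$$, and with a small limit here for illustration):

```python
MOD = 10**9 + 7   # assumed prime modulus for the combinatorics

def build_factorials(limit):
    """fact[i] = i! mod MOD; inv_fact[i] = (i!)^{-1} mod MOD; O(limit) total."""
    fact = [1] * (limit + 1)
    for i in range(1, limit + 1):
        fact[i] = fact[i - 1] * i % MOD
    inv_fact = [1] * (limit + 1)
    inv_fact[limit] = pow(fact[limit], MOD - 2, MOD)  # Fermat's little theorem
    for i in range(limit, 0, -1):
        inv_fact[i - 1] = inv_fact[i] * i % MOD       # one inverse, then roll down
    return fact, inv_fact

def nCr(n, r, fact, inv_fact):
    if r < 0 or r > n:
        return 0
    return fact[n] * inv_fact[r] % MOD * inv_fact[n - r] % MOD
```

With these tables every binomial coefficient and factorial in the case analysis below is an $$$O(1)$$$ lookup, and the final division by the total number of arrangements becomes a multiplication by a modular inverse.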
Now any favorable arrangement can be divided into 3 cases: 
1: It contains exactly one cookie of weight $$$w_1$$$.
The cookie of weight $$$w_1$$$ must be from batch $$$1$$$. Hence the number of such outcomes is $$$c_1\times\binom{small}{k}\times(k+1)!$$$, which can be calculated in $$$O(1)$$$ time.

2: It contains more than one cookie of weight $$$w_1$$$, with the first two cookies of this type in the arrangement satisfying the condition that exactly one of them is from batch $$$1$$$ and the other isn't.
Why such a weird case? Well because in this case lies the key argument that helps us in solving this problem...
Let $$$C_1$$$ and $$$C_2$$$ be those two cookies, in that order. Now consider another arrangement $$$A'$$$ of the $$$k+1$$$ cookies of arrangement $$$A$$$ in which only the positions of $$$C_1$$$ and $$$C_2$$$ are swapped. Then the claim is that $$$A'$$$ is unfavorable.
To prove it, suppose $$$C_1$$$ was the cookie from batch $$$1$$$. So it must have been on the right pan before comparing it with $$$C_2$$$ as if it was on the left pan then it must have been thrown away by bringing in $$$C_2$$$ which would then be on the right pan implying that in the end $$$C_2$$$ would remain. But this is a contradiction as $$$C_2$$$ is not from batch $$$1$$$. So in the arrangement $$$A'$$$, $$$C_2$$$ must have been in the right pan before comparing it with $$$C_1$$$. Since it is in the right pan, $$$C_1$$$ would have been thrown and for all other cookies after $$$C_2$$$ in $$$A'$$$, $$$C_2$$$ would still remain. Hence $$$A'$$$ is unfavorable.
Now suppose $$$C_1$$$ was not from batch $$$1$$$. So it must have been on the left pan before comparing it with $$$C_2$$$ as if it was on the right pan then it would have remained till end which is not possible. So in the arrangement $$$A'$$$, $$$C_2$$$ must have been on the left pan before comparing it with $$$C_1$$$ after which $$$C_1$$$ would be on the right pan and hence it would be the one left in the end making $$$A'$$$ unfavorable.
Thus if the total number of arrangements satisfying 2 is $$$x$$$, then exactly $$$\frac{x}{2}$$$ of them are favorable!
In order to compute $$$x$$$ we iterate over the position of $$$C_2$$$. If it is in the $$$j$$$-th position then there are $$$(j-1)$$$ choices for placing $$$C_1$$$. Also, by the definition of $$$C_1$$$ and $$$C_2$$$, all cookies before the $$$j$$$-th position except $$$C_1$$$ have weight smaller than $$$w_1$$$, while the cookies after $$$C_2$$$ can have any weight not greater than $$$w_1$$$. So the total count is $$$2\times c_1\times equal\times (j-1)\times\binom{small}{j-2}\times(j-2)!\times\binom{small-(j-2)+(equal-1)+(c_1-1)}{k+1-j}\times (k+1-j)!$$$. Summing this over all $$$2\leq j\leq k+1$$$ gives the value of $$$x$$$ above, whose half is what we want. These computations require $$$O(k)$$$ time.

3: It contains more than one cookie of weight $$$w_1$$$, with the first two cookies of this type in the arrangement satisfying the condition that both of them are from batch $$$1$$$.
Let $$$C_1$$$ and $$$C_2$$$ be those two cookies mentioned above in that order. If $$$C_1$$$ was on the left pan before comparison with $$$C_2$$$ then $$$C_2$$$ would be on the right pan after comparison and hence it would be the remaining cookie in the end. If $$$C_1$$$ was on the right pan then it itself would remain in the end. In either case a cookie from batch $$$1$$$ is remaining in the end.
So the number of arrangements satisfying 3 is obtained by again iterating over the position of $$$C_2$$$; i.e., it is the summation of $$$c_1\times (c_1-1)\times (j-1)\times\binom{small}{j-2}\times (j-2)!\times\binom{small-(j-2)+equal+(c_1-2)}{k+1-j}\times (k+1-j)!$$$ for $$$2\leq j\leq k+1$$$. These computations again require $$$O(k)$$$ time.
So summing up the favorable outcomes for all $$$3$$$ cases and dividing it by the total arrangements would give us the desired probability.
The complexity of this algorithm is $$$O(\max(MAX, \sum k \text{ over all test cases}))$$$. $$$\blacksquare$$$
I don't think the editorial solution is more complex than this.