GP of Udmurtia 2018-2019

→ Pay attention

Before contest
Pinely Round 4 (Div. 1 + Div. 2)
38:57:10
Register now »

*has extra registration

→ Streams

Atcoder ABC #364 Solution Discussion

By aryanc403

Before stream 14:07:09

View all →

→ Top rated

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	awoo	161
1	maomao90	161
3	adamant	156
4	maroonrk	153
5	-is-this-fft-	148
5	atcoder_official	148
5	SecondThread	148
8	Petr	147
9	nor	144
10	TheScrasse	142

View all →

→ Find user

→ Recent actions

Detailed →

dragoon's blog

GP of Udmurtia 2018-2019

By dragoon, history, 6 years ago, In English

Let's discuss solution

opencup, opencup2018-2019

dragoon
6 years ago
37

Comments (37)

Write comment?

_LeMur_

6 years ago, # |

+13

How to solve D?

→ Reply

dragoon

6 years ago, # ^ |

← Rev. 3 →

+18

For D, we did a matrix exponentiation on number of different graphs (for n=5, there were only 34 different graphs, link). However, it was still somewhat slow to raise power for every query and precomputing "power of 2" matrices was not enough. So the idea was instead of doing (A(B(CD)))v we did (A(B(C(Dv)))) which is qlogn*n^2 instead of qlogn*n^3. And that was sufficient to get AC (My teammate Bruteforceman mentioned this blog).

→ Reply

krijgertje

6 years ago, # ^ |

← Rev. 3 →

+15

You can normalize each state. Then for n=5 there only remain 34 states, 13 of which are unconnected. If you know the probability distribution of states at time l, you can use matrix exponentiation to get the answer, since 14³·q·log(r - l) fits within the time limit (one extra state for 'having been in any connected state').

Unfortunately 34³·q·log(l) seems too slow to use matrix exponentiation directly for getting this probability distribution. Instead, using matrix exponentiation again, calculate the probability of exactly k edges flipping in l steps (only 11 states so it again fits within the time limit). Now iterate over the (unnormalized) state at l and using previous result it is relatively straightforward to compute the probability that this will be the state at l.

Combining these two steps got me AC in 373ms.

→ Reply

dragoon

6 years ago, # ^ |

Can you elaborate: <<calculate the probability of exactly k edges flipping in l steps (only 11 states so it again fits within the time limit)>>?

Don't you need to know the exact edges? Or it's like, for each bitmask of edges, you check what's the probability only these will flip?

→ Reply

krijgertje

6 years ago, # ^ |

If you want to go from one state to another state, you know how many edges should flip and using the matrix exponentiation you get to know the probability of exactly this many edges flipping. To get the probability of this specific set of edges flipping, divide by $\text{[math]}$ , since each set of edges of the same size has the same probability of being flipped.

→ Reply

AryssonFigueiredo

6 years ago, # ^ |

Could you elaborate how to maintain the "extra state for having been in any connected state"?

→ Reply

TooDumbToWin

6 years ago, # |

How to solve E?

→ Reply

SirirNicheBirirDokan

6 years ago, # ^ |

+18

Generate a million binary strings and pray.

→ Reply

dragoon

6 years ago, # ^ |

← Rev. 3 →

Solved by Bruteforceman. The main idea is: suppose you would like to know what is at i'th position. Then query for:

[i, j], [i+1, j]
[i, j+1], [i+1, j+1]
[i, j+2], [i+1, j+2]

and so on. Say the reply to the first query is X and to the second query is Y. Now, the probability for both being valid is 1/4. Also, you may assume that if: Y = {X, X-1} then may be (X, Y) are valid. So record, X-Y. If you see number of 0 or 1 going over say 2 or 3, then you know that by high probability the answer is that number (0 or 1).

→ Reply

alex9801

6 years ago, # ^ |

+18

Is there any proof for probability bound in this solution? I think the chance of getting AC in one try is quite low, considering that there were 150 test cases :(

→ Reply

rng_58

6 years ago, # ^ |

It's important to use long intervals: When i < 500, use [i, 1000], [i, 999], ..., and when i > 500, use [1, i], [2, i], ....

→ Reply

alex9801

6 years ago, # ^ |

+18

That is exactly what I submitted, but got WA..

→ Reply

TooDumbToWin

6 years ago, # ^ |

Same. Got WA on test 33.

→ Reply

Kaban-5

6 years ago, # ^ |

← Rev. 10 →

-10

Waiting for 4 evidences accumulating either to the side of 0 or to the side of 1 was enough to get AC (but 3 is still not enough, 5 is also OK and 6 exceed the query limit).

UPD This is wrong as pointed by pavel.savchenkov below (I also locally stress tested the solution and achieved the same results).

UPD 2 Interestingly enough, I also made a mistake when accounting for query caching. Actually it is not so wrong, at least for 4 (but 5 already exceeds query limit and 3 sometimes identifies the string incorrectly).

Code: https://pastebin.com/xHKKWFSX Said code passes 1000 random tests with less than 17000 queries used on each. Note that linked code does not ask queries till the end of the string, but rather on substrings of length ≈ 400.

UPD 4 Removed all statements about weak tests, actually it was just me counting queries incorrectly.

→ Reply

pavel.savchenkov

6 years ago, # ^ |

← Rev. 2 →

+38

For waiting for 4 evidences you need ≈ 31950 queries according to my local stress test. Indeed for each 0 ≤ i < 1000 you need to wait for 4 appearances of event with probability $\text{[math]}$ . So average waiting time is $\text{[math]}$ . 2 queries for each step and we get 1000·16·2 = 32000. So even 32000 should not be enough.

→ Reply

rng_58

6 years ago, # ^ |

To get the i-th position you call [i, 1000], [i, 999], ... and [i + 1, 1000], [i + 1, 999], .... Then to get the i + 1-th position you call [i + 1, 1000], [i + 1, 999], ... and [i + 2, 1000], [i + 2, 999], .... This means that many queries can be shared and the number of queries is roughly halved.

→ Reply

pavel.savchenkov

6 years ago, # ^ |

+30

Thanks, that's a good point. It was stupid, I forgot to cache queries in local test, did it only in server version. But I still getting ~ 18700 queries.

→ Reply

rng_58

6 years ago, # ^ |

+30

Wait until the difference of number of evidences becomes 3. In each attempt you get a correct one with probability 1 / 4, and a wrong one with probability 1 / 1000. This means that the failure probability for a single position is about (1 / 250)³ and the failure probability is about 1 / 15625. https://ideone.com/jOqZgM

→ Reply

dacin21

6 years ago, # ^ |

← Rev. 2 →

+13

Let DP[a][b] be the probability that we succed if we're waiting for a pieces of good evidence or b pieces of bad evidence. With a single query, we either get good or bad evidence or nothing. Let p_g be the probability of getting good evidence and p_b the probability of getting bad evidence. Then

$\text{[math]}$

$\text{[math]}$ . If we choose to query the longer side, $\text{[math]}$ . ( $\text{[math]}$ to get at least one random value and $\text{[math]}$ for the last random value to match.) Computing this DP gives us (code).

n	1 - DP[n][n]	1000·(1 - DP[n][n])
1	0.00596421	5.96421
2	0.000106291	0.106291
3	2.10265e-06	0.00210265
4	4.36567e-08	4.36567e-05

By a union bound, the probability to fail at all is at most 1000 times the probability to fail at any point, so 1000·(1 - DP[n][n]).

So it seems like waiting for 2 pieces of evidence is not enough, but waiting for 3 pieces should work at least half of the time if we assume 250 test cases (by another union bound).

The solution where you wait for a difference of at least 3 should be evaluatable with a Markov chain instead of DP. I might do this after dinner.

Edit: Markov chain code. Table:

n	fail[n]	1000·fail[n]
1	0.00596421	5.96421
2	3.59987e-05	0.0359987
3	2.16e-07	0.000216
4	1.296e-09	1.296e-06

So here n = 3 works at least 95% of the time if we assume 250 test cases.

→ Reply

pavel.savchenkov

6 years ago, # ^ |

+10

Our solution which passed stress testing:

Let's guess the second half of the string. For each $\text{[math]}$ ask for K values: [0, r], [1, r], ..., [K - 1, r]. For each $\text{[math]}$ consider multiset $\text{[math]}$ . Sort all pairs by decreasing of frequency of the most frequent value in $\text{[math]}$ .

Now we found some correct differences of some prefix sums. So go through these pairs in sorted order and join indices in DSU (like in Kruskal's algorithm) until we got $\text{[math]}$ components. To get remaining $\text{[math]}$ edges just ask for value in some positions, like in other solutions waiting for 4 evidences. Then do dfs on generated tree and find out all $\text{[math]}$ .

We used $\text{[math]}$ .

→ Reply

rng_58

6 years ago, # ^ |

+45

Let me summarize the solution for E. Many people are downvoting descriptions of solutions, but they are correct and I have a complete proof for it. However, various details are important in the proof and slight difference of algorithm may result in significant difference of number of queries / failure probability.

We decide letters one by one. Basically, to decide the i-th letter, we choose some j, and compute the difference of [i, j] and [i + 1, j]. For a single choice of j, it returns the correct answer with probability at least 1 / 4. So, if we repeat this for various j and get some statistics, it's likely that we get the correct value of s_i.

Suppose that s_i = 0. As we see above, for any j, the probability that we get the evidence of s_i = 0 (i.e., Query[i, j] = Query[i + 1, j]) is 1 / 4. What is the probability that we get the evidence of the opposite thing (i.e., accidentally get Query[i, j] - Query[i + 1, j] = 1)? To keep this probability small, it's important to use long intervals. When j - i ≥ 500, this probability turns out to be about 1 / 1000 (by a simple calculation). There's no such j when i is big, in this case we should use Query[j, i] - Query[j, i - 1] for some j.

Since this is probabilistic, we need to try various js. How many js should we try in particular? Imagine a lottery that gives one of "A" or "B" with probability 1 / 4, the other letter with probability 1 / 1000, and "?" in all other cases. We want to decide which of "A" and "B" is more frequent by drawing this lottery several times. One natural way is to repeat drawing the lottery until the difference between (the number of "A"s) and (the number of "B"s) becomes 3. Again, some computation shows that with this strategy, we can almost always find the correct one: the failure probability is about (1 / 250)³. (This is equivalent to Gambler's ruin. See the section "Unfair coin flipping"). Note that, another natural way is to stop when you get the same letter three times: it has a worse error probability.

If we use the strategy above for all positions, since in each position we fail with probability (1 / 250)³, the total failire probability per testcase is about 1000 * (1 / 250)³ = 1 / 15625. Thus this algorithm will get AC with about 99% probability even with 150 testcases.

What is the number of queries? Again, the detail of the choice of j matters. One possible way is to use j = 1000, 999, 998, ... when j < 500 and use j = 0, 1, 2, ... otherwise. For a single i, we draw the lottery about 12 times on average. This is because in 12 attempts the expected difference between the number of two letters is about 3. Thus, for small i, we send queries for intervals [i, 1000], [i, 999], ..., [i, 989] and for large i, we send queries for [0, i], ..., [11, i]. Note that, when we try to get s_i we use queries [i, j] and [i + 1, j], and when we try to get s_i + 1 we use queries [i + 1, j] and [i + 2, j]. We should memoize the answer for [i + 1, j] and do not ask it multiple times. In total, roughly speaking, we need 12000 queries on average. Of course, the number of attempts for a single i is not always exactly 12 because this is a probabilistic parameter. This raises the number of queries a bit, but still we are comfortable with the limit of 18000.

My code: https://ideone.com/jOqZgM

→ Reply

Nyatl

6 years ago, # ^ |

+28

I'm the author of this problem and I understand that I failed and set too low query limit.

We had too much problems with contest preparation and I asked my team to solve problem E and write solution that you described. Their solution has different constant 6 and uses not more than 18000 queries. But only after contest I realized that they use some optimization. The solution only asks segments with length up to 500. So almost every segment is used four times.

→ Reply

Petr

6 years ago, # ^ |

+25

We didn't think about reusing queries, so we came up with an approach that requires less queries. First, for the first BUBEN numbers, do your solution, but without any memorization (I think we had BUBEN=300). This does on the order of 24 queries per item as you describe. Then, when we know all numbers from 0 to i-th, we can ask queries [j, i+1] to determine (i+1)-th number, since we know the sum in segment [j, i] already. So now we have probability of success 1/2 instead of 1/4, and for each attempt we need 1 query instead of 2. The segments are a bit shorter, though, so to be safe we waited for 5 votes here.

→ Reply

dragoon

6 years ago, # |

+10

How to solve F, G, H?

→ Reply

AlexFetisov

6 years ago, # ^ |

+15

My solution for F: Let's build the sorted list of the players one by one. To find the correct position for the next player we use binary search. That gives us 1000 * log operations which is about what we need.

Now the question is how to compare two players in the binary search? We have 5 groups: higher then 1st, 1st, 2nd, 3rd, lower then 3rd. Let's take any 2 people and ask if one is lower than the other. If we have no answer then we swap them and ask again. You can see that only for (1st, 2nd) and for (1st, 3rd) that will not give you the order. For others you will have the answer and complete the binary search iteration. If during the binary search we failed to insert the element, let's save it for later.

Now in the end we have one or two saved elements and the sorted list. I iterate from the lowest to highest asking the order and that's how we can find 1st-3rd person in the list. Then insert our saved for later there in any order and just sort those 3. That is easy — ask all 6 questions and it's enough to find the order.

I liked the problem personally, really nice. Kudos to the author.

→ Reply

aid

6 years ago, # ^ |

+10

Wow, this sorting algorithm is faster than merge sort. Only $\text{[math]}$ comparisons instead of $\text{[math]}$ ! I solved it with mergesort and thought that this problem is just about casework. Really nice that it has such an awesome solution.

→ Reply

majk

6 years ago, # ^ |

+10

My solution: Pick a guy, let's call him zero. Ask zero about everyone, and everyone about zero. This way, you know whether zero guy is first, second, third (I call these medal guys), above first or below third. Furthermore, you know the categorisation of all the remaining guys (except that sometimes you cannot distinguish a medal guy from either above or below guys). Write quicksort to sort the above and below guys. For the medal guys, you sometimes need special handling:

For instance, when the zero guy is above first, then all the medal guys behave the same with respect to him. Sorting the medal and below together cannot distinguish the medal guys properly, but they will be sorted as the top three in the "below" list. You just ask 6 queries, and one of them will identify second and third guy.
If zero guy is third, you cannot distinguish second from above first, so you sort them together. The second guy will naturally end up last.
and so on

Then it's fairly standard:

Waste a lot of time on implementation.
Get WA on test 49.
Randomise the selection of zero guy.
Get WA on test 41.
Realise contest ends in 5 minutes.
Submit the same code 10 times.
Get AC.

→ Reply

ppavic

6 years ago, # ^ |

I have a simmilar soluton, but we I pick the pivot I only ask the pivot about everyone and not also everyone about the pivot. It requires some extra casework. I actually had a bug in a very special case. And it didn't always work even on the sample but it was so rare I decieded to submit. Firsr submir, WA on 49. Submit the exact same code once more because the solution is randomized, AC :D.

→ Reply

dacin21

6 years ago, # ^ |

G: When deciding where to move, try the following in order (r is some value depending on your dice, described below.)

Move to (0, 0).
Move to a point from which you can reach (0, 0) with a roll of r.
Move directly toward (0, 0) if you're currently to far away.
Move away from (0, 0).

As for dice selection, there are a few possible issues

If we get too few or too low dice, then we spend many moves doing (3).
If we get too many dice, we spend many moves doing (2), waiting to roll r.
If r is too small and we roll some big value, we have to do (4) instead of (2), wasting the next move.

Two dice dice selection strategies that work:

Get four dice with an average of at least 8000 and set r to the sum of maximums. (Needs around $\text{[math]}$ moves to get into distance r and around 6⁴ = 1296 moves to finish which leaves some leeway for random variations.)
Get seven dice each having a duplicate value that is at least half the maximum and set r to the sum of duplicate values. (Needs around $\text{[math]}$ moves to get into distance r and around 3⁷ = 2187 moves to finish. This a bit risky if you get too low dice and can be improved by rejecting dice with an average below 5000.)

(I ignored the time taken to get the dice in my analysis, should be around 1'000 for the first and 7'000 for the second strategy in total, but the influence of this is halved as we're already moving toward (0, 0) during this time.)

→ Reply

aid

6 years ago, # |

+47

In the middle of the contest after 2 WAs on E our team decided to open pdf statements and found out that there is query limit in all interactive problems. :(

→ Reply

krijgertje

6 years ago, # |

+16

I stared for ages at why my codes for E and F gave wrong answer. I just found out that there is a query limit, which was not mentioned in the 'one-by-one' problem statement, but was only mentioned in the pdf which you can get by clicking on 'download statements'.

To be honest, I found it kind of strange that there was no query limit, so I read the statement (many times) letter for letter, but did not think to look in the pdf. Well, now I now better for next time.

→ Reply

redjohn

6 years ago, # |

How to solve A?

→ Reply

dragoon

6 years ago, # ^ |

First the slow solution:

DP(pos, set_of_taken_digits):
ret = 0
for i < pos:
  if digit[i] == digit[pos]:
    ret += DP(i - 1, set_of_taken_digits)
if digit[pos] is not in set_of_taken_digits:
  ret += DP(pos - 1, set_of_taken_digits Union digit[pos])
return ret

If you think a bit, you will be able to optimize the solution. First of all, set_of_taken_digits is a 10 digit bitmask. So, we will store dp[i] for each 0 <= i < len(S) where dp[i] is a 2^10 length array storing dp value for each bitmasks.

To compute dp[pos], we need dp[pos-1] (for the second if) and sum of dp[i] where S[i — 1] == S[pos]. So, we only need two 2^10 array (dp of i-1, and dp of i) and 10 2^10 arrays (for each digit).

→ Reply

helThazar

6 years ago, # |

+16

Does somebody know whether div2 upsolving is already enabled?

→ Reply

rsFalse

6 years ago, # |

Div.2 problem I was the easiest one. To compose lexicographically minimal string of zeroes and ones with given occurrences of '00' .. '11'. First idea was to write code with many "if ... else ...". Later I tried to partly bruteforce (--> spoiler). But what was the most simply or most elegant solution?

Spoiler: partly bruteforce

→ Reply

nar3

6 years ago, # |

How to solve C K and L?

→ Reply

Khazhak

6 years ago, # |

← Rev. 2 →

-8

how to solve K?

UPD: wrong post

→ Reply