antontrygubO_o's blog

By antontrygubO_o, history, 5 years ago, In English

I am just curious.

Suppose some problem has a randomized solution which fails on a single test with probability at most 10^{-6}. If it has, say, 100 tests, the probability of failing at least one of them is about 10^{-4}. Therefore, once in 10^4 submissions it may happen that a correct randomized solution doesn't pass.
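
For concreteness, the back-of-the-envelope calculation behind those numbers, assuming the 100 tests fail independently with probability p = 10^{-6} each:

    \Pr[\text{fail at least one test}] = 1 - (1 - 10^{-6})^{100} \approx 100 \cdot 10^{-6} = 10^{-4}

i.e. roughly one unlucky verdict per 10^4 submissions of a correct solution.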

Has anyone experienced anything like that? Maybe the exact same solution that didn't pass got AC after being submitted a second time? Or am I getting something wrong?

  • Vote: I like it
  • +28
  • Vote: I do not like it

»
5 years ago, # |
  Vote: I like it +4 Vote: I do not like it

Something like that happened to me once when upsolving. I had a solution that used random_shuffle and there was an anti-random_shuffle test case. When I replaced random_shuffle with two random_shuffles it worked.

  • »
    »
    5 years ago, # ^ |
      Vote: I like it +15 Vote: I do not like it

    Mostly because random_shuffle uses rand(), which is sometimes not good enough. Take a look at one of neal's blogs, where he describes the right way to do it.
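
    A minimal sketch of the usual fix (not taken from neal's blog verbatim, just the standard pattern it points to): use std::shuffle driven by a time-seeded std::mt19937 instead of random_shuffle/rand(), so the permutation cannot be predicted from a fixed default seed.

        #include <algorithm>
        #include <chrono>
        #include <random>
        #include <vector>

        int main() {
            std::vector<int> a(100);
            for (int i = 0; i < (int)a.size(); i++) a[i] = i;

            // Seed the RNG from a high-resolution clock so the permutation
            // differs between runs and cannot be hit by a precomputed test.
            std::mt19937 rng(std::chrono::steady_clock::now().time_since_epoch().count());
            std::shuffle(a.begin(), a.end(), rng);
        }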

»
5 years ago, # |
  Vote: I like it 0 Vote: I do not like it

If a randomized solution is expected and intended to pass, the constraints will be set in such a way that it has a very low chance of failing, so nobody gets an unexpected verdict on it. Sometimes solutions that get TLE are even judged a second time, to check whether they pass in case they are the intended randomized solution.

And yes, roughly 1 in 10000 such solutions will get unlucky (0.99995050166 in 10000, to be more precise), which is still fine considering how rarely that happens.

»
5 years ago, # |
  Vote: I like it -10 Vote: I do not like it

There were a lot of people who got WA on system tests because of that on this problem: 840D - Destiny.

»
5 years ago, # |
  Vote: I like it 0 Vote: I do not like it

It happens, but very rarely. You shouldn't worry about it.

The example provided by usernameson shows that sometimes organizers try to fail a particular solution. But then the probability of failing is 1 instead of 1/10^6.

  • »
    »
    5 years ago, # ^ |
    Rev. 2   Vote: I like it +64 Vote: I do not like it

    Note there's a difference between

    1. a solution that works on any test with probability 1 - 10^{-6}
    2. a solution that works fine on all tests except every millionth test

    The first ones are good. The second ones can be failed by hacks, or by authors who predicted your solution.

  • »
    »
    5 years ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    Do organisers actually do things like create anti-random_shuffle test cases? I just assumed such situations arise because a contestant hacks another solution during the contest with an anti-random_shuffle test case, and the hack gets added to the system tests.

    • »
      »
      »
      5 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Some do. I wouldn't.

      • »
        »
        »
        »
        5 years ago, # ^ |
          Vote: I like it +28 Vote: I do not like it

        I wouldn't either, but that left me wondering where exactly the line is.

        To give a related example, in my experience people usually think that it is perfectly fine to include an anti-quicksort test that causes a known bad library implementation to run in Theta(n^2).

        Why should one be OK and the other not? In both cases you are punishing the same thing -- the solver produced a solution that had the correct algorithm, the mistake was in their implementation, and the mistake was predictable in advance which made it possible to construct a deterministic test case that breaks the solution before the solution existed.

        Where should the line be and why? And should the answer be different if the contest system allows for challenges/hacks, which means that other contestants can add those tests anyway?

        I don't really have a good answer yet.

        • »
          »
          »
          »
          »
          5 years ago, # ^ |
            Vote: I like it +8 Vote: I do not like it

          I doubt anybody has a good answer.

          I would say such special tests shouldn't be created if it's about choosing a seed or a modulus (between two equally bad possible values). Why should we punish random_shuffle with seed 0 and not with seed 2? Or hashing modulo 10^9 + 7 and not some other prime? It shouldn't be necessary or useful to choose a strange prime.
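
          To illustrate the hashing case: the usual defense is not a strange prime but a base chosen at random at runtime, so a test prepared in advance cannot target one specific choice. A hedged sketch, assuming standard polynomial hashing mod 10^9 + 7 (the helper names are just for illustration):

              #include <chrono>
              #include <random>
              #include <string>
              #include <vector>

              const long long MOD = 1000000007LL;

              // Pick the hash base at random on every run, so an anti-hash test
              // built for one particular base almost surely misses this one.
              long long pick_random_base() {
                  std::mt19937_64 rng(std::chrono::steady_clock::now().time_since_epoch().count());
                  return std::uniform_int_distribution<long long>(256, MOD - 2)(rng);
              }

              // Prefix hashes of s: h[i] = hash of s[0..i-1].
              std::vector<long long> prefix_hashes(const std::string& s, long long base) {
                  std::vector<long long> h(s.size() + 1, 0);
                  for (size_t i = 0; i < s.size(); i++)
                      h[i + 1] = (h[i] * base + s[i]) % MOD;
                  return h;
              }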

          Modulo 2^64 is different because it's not "equally bad", I would say. But I don't think it's obvious that it should be failed by organizers. And maybe the sorting anti-test is similar to the 2^64 thing?

          On the other hand, maybe organizers should be as malicious as possible and should force participants to defend against somebody who tries to predict their exact solution and break it (that's how hacks work, for sure). Maybe it's good as preparation for real life, but it isn't pleasant for beginners. I wouldn't have wanted to care about it in high school, when I started CP. If possible, problems should be about thinking about algorithms instead of guessing what the organizers did. This is not a strong argument, though.

»
5 years ago, # |
  Vote: I like it +24 Vote: I do not like it

There is one important thing to note: algorithms that fail with some probability p are not that big of a problem. Suppose a randomized algorithm returns the wrong answer with probability p; then you can just run it a couple of times. Running it 3 times and taking the majority answer makes it fail with probability only about 3p^2 (for small p), and if that isn't enough, you can run it 5 times and fail with probability about 10p^3.
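
A short derivation of that amplification, assuming each run fails independently with probability p < 1/2:

    % majority of 3 runs is wrong only if at least 2 runs are wrong
    \Pr[\text{majority of 3 wrong}] = 3p^2(1-p) + p^3 = 3p^2 - 2p^3 \approx 3p^2
    % majority of 5 runs is wrong only if at least 3 runs are wrong
    \Pr[\text{majority of 5 wrong}] = \binom{5}{3}p^3(1-p)^2 + \binom{5}{4}p^4(1-p) + p^5 \approx 10p^3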

  • »
    »
    5 years ago, # ^ |
      Vote: I like it +24 Vote: I do not like it

    I don't think that's often possible. Usually you either have problems with the TL, or it's an interactive problem with a limit on the number of queries that you can't spend twice.

    • »
      »
      »
      5 years ago, # ^ |
      Rev. 3   Vote: I like it -10 Vote: I do not like it

      My point was just that algorithms failing with some probability p are not a big problem. Let's assume you are trying to minimize some function and each run has probability p of failing; then simply running it twice and keeping the better result lowers that to p^2.

      Also, because of this kind of exponential behavior of probabilities, you very seldom have a small but non-negligible probability of failing. You usually fail either very often or pretty much never.

      • »
        »
        »
        »
        5 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        If it's something like checking whether a matrix product holds (A*B = C), then it's possible. However, for solutions like hashing (with a random hash) it is sometimes hard to run the check twice (or you will end up with TLE because of the high constant factor).

        Before anyone says it: I know there is something called double hashing, but hashing already has a big constant factor (because of the mods and other things), and double hashing is not possible if the running time is tight.
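
        For the matrix example above, the standard randomized check is Freivalds' algorithm: compare A*(B*r) with C*r for a random 0/1 vector r, which costs O(n^2) per trial instead of O(n^3). A hedged sketch (the function name and the plain long long arithmetic are just for illustration; with large entries you would do this modulo a prime):

            #include <random>
            #include <vector>

            using Matrix = std::vector<std::vector<long long>>;

            // Freivalds' check: a wrong C is accepted by one trial with
            // probability at most 1/2, so the error after `trials` independent
            // trials is at most 2^(-trials).
            bool probably_equal(const Matrix& A, const Matrix& B, const Matrix& C, int trials = 30) {
                int n = A.size();
                std::mt19937 rng(std::random_device{}());
                std::uniform_int_distribution<int> coin(0, 1);
                for (int t = 0; t < trials; t++) {
                    std::vector<long long> r(n), Br(n, 0), ABr(n, 0), Cr(n, 0);
                    for (int i = 0; i < n; i++) r[i] = coin(rng);
                    for (int i = 0; i < n; i++)
                        for (int j = 0; j < n; j++)
                            Br[i] += B[i][j] * r[j];
                    for (int i = 0; i < n; i++)
                        for (int j = 0; j < n; j++)
                            ABr[i] += A[i][j] * Br[j];
                    for (int i = 0; i < n; i++)
                        for (int j = 0; j < n; j++)
                            Cr[i] += C[i][j] * r[j];
                    if (ABr != Cr) return false;  // definitely A*B != C
                }
                return true;  // A*B == C with high probability
            }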

      • »
        »
        »
        »
        5 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Yes, I understand your point. My point is that it's not often feasible. E.g., if your solution takes 3/4 of the TL, you can't run it twice.