Sport programming and incorrect accepted solutions

I didn't help but notice that almost each contest there's at least some solutions that are clearly incorrect but get accepted nevertheless. For example, discussions for last 3 numbered CF rounds:

Sometimes problemsetters/testers could have done better job, sometimes it's seems close to impossible to fail specific solutions in advance without actually coming up with this solution.

So, here is an idea which I both thought of myself and also heard from several other people: What if after the contest and system testing phase there was open hacking phase, which doesn't give any points for hackers but helps to find solutions that should not be accepted.

It seems that it should be easily implementable, considering it already works somewhat like this in Educational rounds, but MikeMirzayanov may comment on that.

I see, however, several possible issues with that:

It will require to wait for round results longer.
It will not work very well for onsite competitions when results should be declared shortly after the contest (But in these cases open hacking phase could be cancelled or shortened(there's normally much less solutions to check after all))
Author may get more lazy with creating tests (i.e saying "ok, contestants will hack that anyway"), which may reduce overall quality of the resulting testsets
There will be people who are more targeted by hacks then others, which may be viewed as not fair. E.g if you are tourist , your bug in problem A will probably be found because your code is reviewed by a lot of participants who know about you personally (or see you in the first line of scoreboard), but a lot less people will read few thousands of (preliminarily) accepted solutions.
For solutions that involve random number generators and/or are close to TL balance may be somewhat changed. E.g if I know, that mnbvmar's solution worked in 998ms out of 1s, I'll probably try to "hack" using (almost) same tests few times. Again, I will care less about people who are lower in the scoreboard because hacking them will not get me closer to the hoodie.

However, I think positive impact of this feature would be less important then negative impact.

What do you think? Any other issues you can think of or any other comments are welcome.

Комментарии (30)

Написать комментарий?

RedNextCentury

6 лет назад, # |

For #4 it shouldnt be a problem if hacks are added to system tests, isnt that how it works in educational rounds?

→ Ответить

bazsi700

6 лет назад, # ^ |

+18

But each submission may have a different counterexample. Thus, if someone with ~500th place has a wrong solution different from everyone else's then he probably won't get hacked, and testing on other hacks may not fail his solution.

Isn't it the same for hacks during the contest? You might end up in a room where there's a good hacker and you might not

It is, but virtually there is nothing to do about it.

riadwaw

Well, yes, but getting in one room with good hacker is equiprobable, while being target after the contest is biased)

flashmt

Also, if your solution get hacked during contest, you might have a chance to fix it.

shahidul_brur

← Rev. 2 →

-40

right :)

otaliptus

+10

There will be people who are more targeted by hacks then others, which may be viewed as not fair. E.g if you are tourist , your bug in problem A will probably be found because your code is reviewed by a lot of participants who know about you personally (or see you in the first line of scoreboard), but a lot less people will read few thousands of (preliminarily) accepted solutions.

Not sure if it is a weird idea but is it possible to give some random codes for hacking? I mean, if I want to hack someone's code, then of course I have to check some codes, and some of them are gonna be written some users that I don't know. So, system can give some codes of some users to me, and then I can try to hack, and that's not unfair, is that?

It's not a bad idea, but I'm curious how many people would spend plenty of their time trying to hack random codes. Especially on harder tasks, it may take a long time to find a test which fails the code.

You're right. Also, this might be a #7, number of users who would like to hack someone may not be enough for the aimed quality.

allllekssssa

Honestly I wouldn't like contests like Educational round — it takes too long time and it is not interesting enough. I want to finish everything about contest that day, otherwise I would get up 5-6 times during the night to check my status.

I would like idea to allow some users (like top 200-300 on the contest) to create some hacky testcase 20 minutes after round. After it setter can choose some cases (or run 10-20 random codes on them) and add the most tricky cases to final tests. Testing would be longer, but I believe everything can be finished in 1h and 30 minutes after round. Normally my idea is not so precise and certainly it can be done on better way, but I am for adding cases short time after round and testing that day.

I don't think such a short time makes much sense, one will not be able to do any nontrivial hacks in that time

majk

+132

I would strongly dislike such change. The pause between the end of coding phase and the end of system test is already way too long!

Letting people hack submissions in Archive, without retroactively changing the standings of a particular round, and perhaps displaying the number of such successful hacks somewhere in the user's profile would be a good change in my opinion.

brdy

Would such a hack change later tests on a problem?

I already know of a problem where many n^2 solution pass but n = 1e5, but there is no system to allow me to submit the relevant test case (generator).

-18

I expect system tests to run right after the contest so you'll get "your solutions are probably correct" (and expected rating changes) in the same time (not sure how it's done in educational rounds).

Even now, I think removing of cheaters happens after the rating change (and rating changes may change). It will be like removing cheaters and incorrect ACs.

+21

It ain't over till it's over. I'll still be scared shitless 'bout my hoodie.

rotavirus

+68

everyone can hack my hash solution? no, thanks

you can use random modulo and hack hash solutions of people above you in the scoreboard instead

radoslav11

-12

But it's inconvenient to do during rounds.

Wild_Hamster

-9

Rip problems with random-based solutions then. Nobody wants to learn about mt1337 stuff.

I think, the better idea would be stress-testing all the solutions on the small tests created by some random generator by the author(I believe, that's how most hacks in educational round are done). Most solutions that should get WA will get WA then.

Bur it's not really clear what to do with TLE solutions(stress-testing on big tests can work for eternity for incorrect TLE solution)

You already may be hacked if you don't use proper random.

The probability of being hacked is much smaller then(submitting 5 minutes before the end already gives you probability close to 0).

Also you can do something like http://codeforces.com/contest/1039/submission/42518451

Giving the possibility to hack after contest is equivalent to forcing most participants to learn about mt1337 or forcing authors not to give random-based/string problems.

That's certainly true (but I find it funny that one will write more code(calculating hash) and/or spend points waiting till the end of the contest instead of using proper random)

← Rev. 6 →

Stress testing is interesting idea that worth investigating (but I'm not sure whether it's different then just generating more tests with the same generators in advance).

But I was thinking mostly of cases where you read solution and it's obvious it's not what intended, but it's hard to hack it without seeing it. e.g 44797129 is a solution for Multi hedgehog, which after finding center goes recursively instead of checking that this center is correct. It's pretty trivial to break* once you know the solution, but I'd argue if you don't get the same wrong idea as a tester, it's hard to make such a test in advance (or to make generator in advance)

I'm not sure how often this is actually done in educational rounds (I mean actually reading solutions and thinking on them instead of bruteforce). If it doesn't happen a lot, then maybe adding open hacks phase doesn't make much sense

*test where it gives Yes while it shouldn't

kalimm

+16

"Nobody wants to learn about mt1337 stuff."

My regards to that.

teja349

+33

My opinion is currently the format is enjoyable. I enjoy regular rounds more than educational rounds.Though the newly proposed is not exactly educational kindof. I also feel fine with sometimes tests being weak as long as there is proper attempt put in preparing them.

Also I feel, since there is no huge loss for me like when my solution fails systests its a huge loss but some 3 people squeezed under bad tests is okay unless tourist is losing rank 1 because of the issue :(.

Also currently if systests take long time its already frustating. Though maybe your model is better its not interesting at first thought.

Xellos

E.g if I know, that mnbvmar's solution worked in 998ms out of 1s, I'll probably try to run (almost) same tests few times.

A dick move. CF running times fluctuate. You can have a solution that takes 950 ms on average, it will run in 998 ms once and 1024 ms when you rerun it, resulting in TLE. There's a low chance of that, sure, but given the volume of submissions, you're going to end up biasing failing submissions you have no reason to if only one TLE out of two runs is necessary to fail.

The scheme "one run and good luck if your code barely fits in TL" works fine. Averaging several runs would be better, but ain't nobody got time for that (you'd have to do that for TLE solutions to be fair).

As far as I know, if solution gets TL it is run several times (and if at least on of them is OK then it's considered OK) so it's not THAT bad. But of course that's why I listed it in "possible issues" section. It may be needed to be addressed

Ah yeah, there was that thing. Then yeah, running close-to-TL solutions several times too and deciding somehow if it seems like TL would be good. Not through worst-case time, of course, that should be just one factor.

mango_lassi

+36

Open hacking could maybe overlap with system tests. That way it wouldn't delay getting results (assuming it's short enough)

№	Пользователь	Рейтинг
1	tourist	3757
2	jiangly	3647
3	Benq	3581
4	orzdevinwang	3570
5	Geothermal	3569
5	cnnfls_csy	3569
7	Radewoosh	3509
8	ecnerwala	3486
9	jqdai0815	3474
10	gyh20	3447

№	Пользователь	Вклад
1	maomao90	171
2	adamant	164
3	awoo	163
4	TheScrasse	159
5	nor	156
6	maroonrk	154
7	-is-this-fft-	152
8	Petr	147
9	orz	145
9	pajenegod	145

Блог пользователя riadwaw