kozliklekarsky's blog

By kozliklekarsky, 16 months ago, In English

Hello, dear Codeforces,

Today, someone dug up an old debate for me, namely "Well, I think pretests should not be a guarantee" and whatnot. So I think there should finally be a clear blog post where we can debate specifically which is "better*": weak pretests or strong pretests.

I look forward to seeing how each side is argued; hopefully I and others will have something to learn from this,

Francisc

*: from a contest quality and overall quality perspective

+90

»
16 months ago, # |
+10

My opinion in this matter is that pretests should be strong (and as such I would really like to see a 'reasonable' argument for weak pretests).

My argument is the following: generally, in problemsetting, there is a well-known tradeoff of "allowing those with slightly slow, technically optimal solutions to pass, while also allowing those with brutally optimised, technically suboptimal solutions". The general idea behind it is that those who have found the right path to the problem should always be encouraged by getting as many points as possible (and if one can also push that person to optimise further and cut off the "technically suboptimal" solutions, e.g. by having one subtask with n = 1e5 and another with n = 5e5, even better). I have seen this principle applied in the preparation of very many problems, from unimportant ones to highly important ones (e.g. TST), by plenty of highly experienced problemsetters.

Now, how does this principle apply to the matter at hand? Well, why should anyone discourage further progress on the contestant's part with weak pretests? What happens when you add weak pretests (intentionally) falls into two cases:

First, the omitted case was intended to break a certain foreseeable bug, which will inevitably lead to the discouragement of the contestant. Although one should always be aware of what mistakes they can make, it is not fair to imply that everyone should keep looking out for bugs after they get Pretests Accepted. It then becomes an unnecessary gamble of time, where you have to weigh the chance that you do have a bug against the chance that you are wasting your time and should move on (checking later is never an option, since Codeforces takes the time penalty of the last submission). This leads to a very depressing moment for the one who didn't gamble their time to find the bug, a frustrating one for the one who gambled too much time and found nothing, and another for the one who gambled time only to see that the pretests didn't even cover the case they found.

Second, the missing case is a very minor thing that breaks something by pure chance, with nothing intentional about it. In this case, I find we can give the problemsetter the benefit of the doubt, whether it is a small thing (like a random hash-collision coincidence that could never have been anticipated) or a big thing (like not including some god-forsaken test at the maximal limit). Of course, this is subjective, but the mere fact that it was not done with (possibly ill) intent means it can be forgiven as a mistake and we should carry on.

Another reason why we should enforce good pretests is that even if we don't land in the previous case (where by some pure chance our code fails), we open the gate to hacking, which is one of the most evil things I find imaginable. The reason for this is that when a solution is hacked, the test stays contained between the hacker and the hacked, giving only the hacked the opportunity to reflect and to submit again. This creates a significantly large disparity in subsequent performance based only on factors that are not under our control but down to luck (which room you are assigned to). This is yet another gamble which has nothing to do with the essence and true meaning of this sport.

TL;DR, FSTs are the most depressing thing that can happen to anybody. If you fail pretests and subsequently don't solve the problem, as opposed to FSTing it, at least you aren't lied to that you did a great job only to be gut-punched at the last moment. The most tragic case is when you realise your mistake only upon seeing the nature of the counter-example, and the fix turns out to be trivial. Getting WA puts anyone I've ever met into a more cautious and alert state than a (maybe fake) AC does.

  • »
    »
    16 months ago, # ^ |
      +6

    AC'ing a problem with a wrong solution is even worse, because you haven't done a good job and you don't even know it until, maybe, you explain your solution to someone who knows better. This happens a lot more than you might think.

    TL;DR, don't use AC as a measure of you "doing great"

»
16 months ago, # |
  +73

Yes, pretests should be strong, to minimize differences between the verdicts on pretests and systests.

  • »
    »
    16 months ago, # ^ |
      +14

    Yeah, I agree.

    I always thought of pretests as a system to minimize the load on the judges, instead of running the billion systests all at once.

»
16 months ago, # |
+87

IMO, pretests shouldn't be too strong. Why? If the pretests are strong, there is more incentive to just try until you pass the tests. You still get the $$$-50$$$ for every wrong submission, but there is less incentive to try to prove that your logic is correct. With weaker pretests, you have to verify that your solution always works to minimize the risk of getting hacked or FSTing. I also believe that having to prove that the solution works will make everyone a better problem solver.

In today's round, the pretests of B were very weak. As controversial as it is, I don't actually think they were necessarily too weak. Yes, they definitely were on the weaker side, but personally I think that the pretests of other problems are in general just too strong.

But there is a different issue that arises with very weak pretests: hacking. For example, today the winner of the contest was decided by the number of hacks #1 and #2 made. #1 was able to make 17 successful hacks while #2 only made 11 successful hacks; #2 would've had around 200 more points than #1 if no hacks had been made at all. Why is this unfair? The number of hacks you were able to make was almost completely determined by how many pretests-passed but incorrect submissions for B ended up in your room, and this was completely down to luck, since you couldn't affect which people you were in a room with. I think we need to change the way hacks give bonus points. I have seen some suggestions for formulas calculating how much score you should gain from hacks. My formula is a little different, but the main idea is the same: if you make a lot of successful hacks, the later ones should give a smaller score boost.

My suggestion is the following. Every unsuccessful hack is $$$-50$$$, and for successful hacks:

The first successful hack for every problem is $$$+100$$$

The second successful hack for every problem is $$$+100/2 = +50$$$

The third successful hack for every problem is $$$+100/3 = +33$$$

The fourth successful hack for every problem is $$$+100/4 = +25$$$

The fifth successful hack for every problem is $$$+100/5 = +20$$$

So, suppose you made three successful hacks in problem B and one successful hack in problem C. You also made one unsuccessful hack in problem B. Your hacking score would be $$$100 + 50 + 33 + 100 - 50 = 233$$$ points.

If you made 15 successful hacks in the same problem, your score would be $$$332$$$ points (assuming all values are rounded to the nearest integer before the addition).
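
For concreteness, here is a minimal C++ sketch of this proposed scoring rule (the hackScore function and the per-problem input format are just illustrative, not an existing Codeforces mechanism), assuming "rounded to the nearest integer" rounds halves away from zero; it reproduces both examples above:

```cpp
#include <cmath>
#include <iostream>
#include <utility>
#include <vector>

// Proposed rule: on each problem, the k-th successful hack is worth
// round(100 / k) points; every unsuccessful hack costs 50 points.
int hackScore(const std::vector<std::pair<int, int>>& perProblem) {
    // perProblem[i] = {successful hacks, unsuccessful hacks} on problem i
    int total = 0;
    for (auto [ok, fail] : perProblem) {
        for (int k = 1; k <= ok; ++k)
            total += (int)std::lround(100.0 / k);  // +100, +50, +33, +25, +20, ...
        total -= 50 * fail;
    }
    return total;
}

int main() {
    // 3 successful + 1 unsuccessful hack on B, 1 successful hack on C:
    // 100 + 50 + 33 - 50 + 100 = 233
    std::cout << hackScore({{3, 1}, {1, 0}}) << '\n';
    // 15 successful hacks on one problem: 332
    std::cout << hackScore({{15, 0}}) << '\n';
}
```

Because the per-problem bonus decays harmonically, the total you can earn from one problem grows only logarithmically with the number of hacks, so farming a single weak problem stops paying off quickly.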

UPD: If we decide to keep having strong pretests, I am not completely against that. In any case, I think my proposed change to hacking would be good.

  • »
    »
    16 months ago, # ^ |
      +3

    Honestly, I think things would get funny if the setters made the pretests intentionally weak and then a cheater leaked code that eventually fails the systests.

»
16 months ago, # |
  +6

Pretests should be strong.

The point of pretests is to let people know when they have an incorrect solution, without allowing them to "cheat" their way to a correct answer based on which test case they got wrong. If it's OK to have weak pretests that don't truly judge the contestant's solution, there shouldn't be pretests at all.

»
16 months ago, # |
  -29

My opinion is that pretests don't have to be strong, because they should not be a guarantee. Contestants should pay more attention to proving their logic and examining their implementation, rather than guessing some random conclusion and trying to see if it passes the pretests.

»
16 months ago, # |
  +111

Personally, I think pretests should be either extremely strong or extremely weak. Half-way strong is the worst, because participants have to guess how strong the pretests are, and some wrong solutions get WA while other, equally wrong solutions pass the pretests.

»
16 months ago, # |
  +3

I am in favour of strong pretests. It is quite infuriating to have a good solution which gets AC during the contest but fails on some random corner case during systests. Sure, strong pretests encourage some "guessing" of the solution and may make you a weaker CP-er, but I think the purpose of Codeforces is not to make you find every way your solution could fail, but rather to provide fun algorithmic problems while minimizing frustrations like FSTs. Anyway, if you have the tendency to "guess" solutions to problems, you will eventually get punished at other, more important contests, such as OIs or ACM-ICPC, where you may lose a lot of time because your guessed solutions were wrong.

  • »
    »
    16 months ago, # ^ |
      0

    So you prefer getting punished when it matters instead of getting punished all the time? Personally, that sounds way more frustrating.

    • »
      »
      »
      16 months ago, # ^ |
      0

      What I meant was that important contests are a much stronger incentive to prove your solution than weak Codeforces pretests, and I think that this incentive is enough (at least for me, personally). If you want to get rid of the habit of guessing solutions, you could just try to get AC on the first try when solving a problem on CF.

»
16 months ago, # |
  +15

Would love to see a contest without pretests at all :D

»
16 months ago, # |
  +79

What's the point of weak pretests? Just provide samples and make it a no-feedback contest then.

»
16 months ago, # |
  0

Yes, in my opinion the system tests should not act like roulette.

»
16 months ago, # |
  +35

I don't care much about systests. Getting WA in them due to a stupid bug like forgetting to write "+1" somewhere has happened to me before but I never blamed anything/anyone other than my own implementation skill for that. I wouldn't expect the testers/setters to know exactly which case could break or not break my solution because they don't know it before the contest starts.

Sometimes it happens that I guess a problem, and then it's mostly a matter of "welp, I hope it doesn't WA on systests" instead of "it gets AC on pretests so it should be good". In other words: being aware that something looks fishy in your idea is a good skill to have, and if you go through with that fishy idea, you shouldn't go into systests expecting AC even if it passes the pretests.

»
16 months ago, # |
+8

Every time I see such a question, my answer is: surely pretests must not be strong. To me this is natural, and when I started thinking about why, I came back to my own experience.

I started CP in my last years of school and participated in ROI. In my year, the final and the 1/4-final had offline system testing, i.e. I didn't know my score until the end.

After school I frequently participated in the then brand-new Codeforces rounds and the old Topcoder ones. Topcoder had only a bunch of samples; Codeforces had samples + pretests. This was a test of your skill as a careful programmer and a confident thinker. If you got an FST, you would be more focused in the next contest. If you got accepted, you felt cool (these days you only care about the rating change).

I must say that in the past a usual CF round had no more than 5 problems, and still 2 hours to code them. And debug. Now you have at least 6 and no time to debug. I see a cycle here: more problems in a contest lead to strong pretests, and strong pretests lead to more problems (no time to test — go solve more).

And finally, I will never love strong pretests until the CF rules change (i.e. remove pretests and hacks and become like AtCoder).

»
16 months ago, # |
  +90

I can't say that I'm in favour of weak pretests, because hacking is random and I personally don't enjoy the process of hacking. But I don't understand the need to fix everything to your liking. Codeforces has a ruleset for rounds. There are different platforms with different rulesets. Why are you so against diversity?

In my opinion, pretests should check that you understand the statement correctly and, in most cases, that you didn't just type your solution by pressing keys randomly. Adding at least one (not necessarily strong, usually random) maxtest is a must for problems with a tight TL, when it is not clear how optimized your solution needs to be or even what complexity is expected. Everything else is on you. Proving that your solution works is on you. That includes correctness, corner cases, the right complexity (which is not necessarily tested by a random maxtest in pretests), and everything else. If you are in the "Proof by AC" camp — that's on you, and I hate you.

  • »
    »
    16 months ago, # ^ |
    +1

    Wow, I can't believe I have the chance to argue with Um_nik! This is truly a once-in-a-lifetime experience.

    But I don't understand the need to fix everything to your liking.

    Please read my post carefully. I believe I have expressed my point of view pretty clearly: I look forward to seeing how each side is argued; hopefully I and others will have something to learn from this. I never argued that either way is correct. If anything, I have merely expressed my point of view on why weak pretests should not be preferred over trying to build strong pretests, or rather over reserving the strong tests solely for systests.

    Why are you so against diversity?

    Although many have argued that strong pretests encourage gambling via "Proof by AC", I have counter-argued that weak pretests encourage gambling of the form "How much time am I willing to allocate to this problem?". Although this is a skill that needs a lot of attention in something like an OI, on Codeforces you are given the order of the problems, which reduces the time gamble to verifying in the dark whether your solution could have either flaw (assuming it is correctable). And I for one find that competitive programming should be more about solving problems, not correcting them. Correcting solutions has an entire industry dedicated to it, namely Software Engineering (I admit, this is a stretch). Finally, I am against such diversity because it prevents creating a clear "deterministic" strategy, if you will. I find that damaging to the whole problemsetting experience. For that reason I think there should be a concrete and universal target when building tests.

    • »
      »
      »
      16 months ago, # ^ |
        +4

      In the end, my only problem is with maintaining the idea that one's contest performance reflects one's ability to solve problems, or to do anything in particular, when it is also encouraged that corner cases be left to the systests. Your problem-solving skills (I find) should not be reduced to how many small mistakes you make, but to how much you can build.

      • »
        »
        »
        »
        16 months ago, # ^ |
        +11

        Problem solving includes corner cases. If you have solved a problem under extra constraints, you haven't solved the whole problem, have you? Also, if you solved it under those constraints while understanding what you did, surely you would've also known that there's the other part of the problem (when your assumptions aren't true) that you haven't solved, so go solve that.

        I disagree that competitive programming is only about problem solving; implementation is equally or even more important in my opinion, but people who started in this trend of "all idea, no code" problems might not agree with me on this one.

  • »
    »
    16 months ago, # ^ |
      +23

    There are different platforms with different rulesets. Why are you so against diversity?

    Because of my timezone, Codeforces is the only platform where I can realistically participate regularly.

»
16 months ago, # |
  0

It's better to make the samples weak instead of the pretests if there's a chance of guessed solutions getting AC.

»
16 months ago, # |
  -8

I think that pretests should be strong, by default.
However, if the authors want a lively hacking phase, like in Topcoder, they should add a note in the problem statement informing the users.

»
16 months ago, # |
  0

My opinion is that pretests should be neither too weak nor too strong. As in real life, whenever we encounter a new problem, the test cases will not be pre-written; it will be our responsibility to do exhaustive testing or to prove that our solution works. If we deploy an unproven solution, then we will (and should) face consequences in real life (contests).

»
16 months ago, # |
  -17

I think we shouldn't have pretests or example cases. This is because both weaken the mind, and someone with a strong mentality will use complex mathematics to prove their code works, even if they don't have example cases. This would truly make Codeforces contests a lot better.