aberent's blog

By aberent, history, 5 weeks ago, In English

Is the checker occasionally failing to detect that a test has completed, and so giving a TLE?

In Codeforces round 727 problem C, when the system tests were run, I got TLE on my submission on test 18. Looking at the results my code printed the result before getting the TLE, and my code should exit immediately after printing the result. At first I assumed that I was just unlucky, and my code had taken precisely one second to run, so had been stopped just before exiting.

After the contest I tried modifying my code in various ways to speed it up, for example https://codeforces.com/contest/1539/submission/120193428 and https://codeforces.com/contest/1539/submission/120195587. In all cases I got exactly the same result; my code printed an answer but still got a TLE. It seems extremely unlikely that every single version of my code would take precisely 1 second to run.

To confirm this, I tried running my most recent version under Python3 instead of PyPy3. This is normally slower, but on this occasion passed all tests without any TLEs. See https://codeforces.com/contest/1539/submission/120195637

 
 
 
 
  • Vote: I like it
  • +18
  • Vote: I do not like it

»
5 weeks ago, # |
  Vote: I like it +8 Vote: I do not like it

Auto comment: topic has been updated by aberent (previous revision, new revision, compare).

»
5 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

Exactly same happened to me :(

»
5 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

It's possible that your code finished running, but it was measured to take more than 1 second of CPU time to complete the job.

Killing processes that are running too long can't be done immediately and with perfect timing accuracy, so there may be some safety headroom (for example, forced killing may kick in after some delay). And if your code finished on its own without getting killed, then its time may be still compared with the time limit for deciding whether it was AC or TLE. Wallclock time vs. CPU time is another thing that may make everything a little bit more complicated.

  • »
    »
    5 weeks ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    See my reply to @bjy. What is surprising is not that this happened once, but that it kept happening identically even with significant changes to my code. These changes should have significantly changed the performance for either better or worse.

»
5 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

To your question about 'precisely 1 second' -- all that indicates is that submitted programs get cut off once they hit/exceed the time limit ("go home, yer drunk, closing time is NOW" etc.)

Otherwise, there are already a number of individual posts and comments in the contest threads about the cause of the TLE itself concluding that the 32-bit version of pypy here at codeforces gets slow with large numbers (and sorting piles of them, for instance)... see https://codeforces.com/blog/entry/90184 -- there are some other posts/comments about i/o too, but my extremely limited (local, lazy) testing hit a good-enough-for-me point of diminishing returns whenever I switched to sys.stdin.readline a while back... ymmv! (or at least, there's the custom invocation feature to do better tests)

Fwiw, searching a few related terms pulls up posts from about 2 years ago referencing their approach to sandboxing user-submitted code on windows: said approach works on 32-bit but the technique/vector they use to alter/block system calls is itself blocked/unavailable on 64-bit.

tl;dr it's related to a site-specific system thing, and without any grand announcements of roadmaps and timelines (or linux boxes? or maybe a centralized faq of versions and intended upgrades...), I don't suspect it'll change any time soon, which is a bummer, but understandable.

  • »
    »
    5 weeks ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Yes, the TLE could be caused by the installed version of PyPy being 32 bit, and "precisely one second" might not actually be 1000ms. My impression in the past, however, has been execution is normally cut off within a few ms of the time limit (otherwise one would see printed results combined with a TLE far more often).

    What was surprising is that every single version I tried gave the same behaviour of printing the result and then failing with a TLE. There were some significant differences between the different versions (e.g. changing between loops and list comprehension and switching to using itertools.accumulate and bisect to find the final answer). These changes should have significantly altered the performance, and did on the previous tests. Given this I would have expected at least one of my 6 versions to either pass test 18 or fail with a TLE before printing the result.

    • »
      »
      »
      5 weeks ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      ack, sorry, didn't process the successful output yet TLE... yeah, kinda weird I suppose.

      I think it's just a crappy coincidence though, ie many of the non-heroic-measures pypy solves were extremely close to 1000ms, so lucky us, we happened to hit that rare category of output but also TLE hit during aftermath of cleanup/teardown/exit or whatever.

      Even if approaches feel diverse, they might have enough in common to hit the same rare issue. There was a passing solve that, like me, only subbed in readline, but, unlike us, they used heapq rather than a second sort. Others, I reckon got their milliseconds out of the bytesIO template or even stdout.write...?

      I suppose a next step could be auditing how different i/o methods play together, maybe some weird sync/flush behavior...? (buuut that's where I nope)