IngaleAnkur10's blog

By IngaleAnkur10, history, 13 days ago, In English,

I tried this problem: https://codeforces.com/contest/1136/problem/D.
And came up with a solution similar to the solution in the editorial.
But I first submitted my Python code in both Python3 and PyPy3 and had TLE.
But when I submitted an ALMOST equivalent code in C++, it got accepted.
I wonder what's wrong in my python code or is it the compiler's/interpreter's doing?

Python Code 1 : https://codeforces.com/contest/1136/submission/51256343.
Python Code 2 : https://codeforces.com/contest/1136/submission/51262124.
C++ code: https://codeforces.com/contest/1136/submission/51257855.

Thanks in advance.

 
 
 
 
  • Vote: I like it  
  • -6
  • Vote: I do not like it  

»
13 days ago, # |
  Vote: I like it +36 Vote: I do not like it

Erm. Didn't you know that python runs slower than my grandma can crawl?

  • »
    »
    13 days ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    Yeah, you are right! But I have been using Python for a long time and feel that this problem is more than just a comparison between the speeds of two.

    • »
      »
      »
      13 days ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      If you want to deeply understand the reason behind the empirical speed of both languages, you are gonna have to study compilers and programming languages.

»
13 days ago, # |
  Vote: I like it 0 Vote: I do not like it

C++ is a lot faster than Python, so there are some problems which you can't solve in Python because it's not fast enough. I think that's the problem.

  • »
    »
    13 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Are you sure that's the only problem? Because I usually solve problems in PyPy3/Python and never had such problem.

    • »
      »
      »
      13 days ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I'm not sure, but I think Python can do about 10^6 operations in a second. I saw your Python code and, correct me if I'm wrong, but I think the time complexity of your code is not good enough if you use Python.

      • »
        »
        »
        »
        13 days ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Can you explain how the time complexity is not good enough?

    • »
      »
      »
      12 days ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      I'm just saying, you seem really stubborn about using Python/PyPy for your solutions. This is really going to be a problem for harder problems, when the model solution uses about $$$10^8$$$ operations and no amount of PyPy optimization will help. The setters don't even guarantee that all problems can be solved in TL with Python.

      If you are serious about solving more problems, you are going to have to switch to C++ or Java eventually.

»
13 days ago, # |
  Vote: I like it +37 Vote: I do not like it

Firstly pretty much always use pypy, it is a lot quicker than cpython in pretty much every way.

But the actual reason you get TLE is that the built in IO is very slow, especially when you have to read in multiple lines. One way to speed up the input is to do

input = io.BytesIO(os.read(0, os.fstat(0).st_size)).readline

This reads in the input once, and then stores it in a stream. This one-liner will speed up your code immensely. With it your code runs in 1044 ms. There are also similar tricks you can do to speed up the output, but that wasn't necessary in this case.

  • »
    »
    12 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Thanks a lot! Also, can you give me some resources to other tricks to speed up io?

    • »
      »
      »
      12 days ago, # ^ |
      Rev. 2   Vote: I like it +8 Vote: I do not like it

      Most of the tricks I've been using just comes from me and c1729 trying out different methods. He has got a guide but I think it might be a bit outdated at this time.

      About IO, as a general rule of thumb, read the entire input in one go, write the entire output in one go. IO in python doesn't have to be slow, it is just the built in stuff that is slow. For example take this huge IO SPOJ problem. I've been able to do it in 0.10 s on pypy, while the quickest Java solution takes 0.17 s.

      I think that for input using

      input = io.BytesIO(os.read(0, os.fstat(0).st_size)).readline

      will be good enough for most problems. And for output I think that using the pypy feature __pypy__.builders.StringBuilder is very good. Here is an example of using a string builder for output:

      import os,sys,__pypy__
      
      out = __pypy__.builders.StringBuilder()
      A = [137]*10000000
      for a in A:
          out.append(str(a))
          out.append(' ')
      
      if sys.version_info[0] < 3:
          os.write(1,out.build())
      else:
          os.write(1,out.build().encode())
      

      This takes around $$$420 \mathrm{ms}$$$ in pypy3 and $$$200 \mathrm{ms}$$$ in pypy2. (this is including the upstart time of around 100 ms that pypy got, so the print especially in pypy2 is super quick).

      The reason why pypy2 is quicker is that python 2 uses bytestrings but in python 3 they've switched from using bytestrings to using unicode-strings by default, this drastically slows down the IO. So in python3 you have to manually switch to using bytestrings in order to get fast IO (this is done with .encode(), but unfortunately encode is pretty slow). Bytestrings are also one of the reasons os.read is so quick.

      Using these methods you should see drastically better speeds in heavy IO problems.

      • »
        »
        »
        »
        12 days ago, # ^ |
          Vote: I like it +11 Vote: I do not like it

        Thanks, updated the blog post with our most recent io techniques and some explanations.

      • »
        »
        »
        »
        12 days ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        That was really helpful. Thank you so much.