Please subscribe to the official Codeforces channel in Telegram via the link https://t.me/codeforces_official. ×

IngaleAnkur10's blog

By IngaleAnkur10, history, 3 years ago,

I tried this problem: https://codeforces.com/contest/1136/problem/D.
And came up with a solution similar to the solution in the editorial.
But I first submitted my Python code in both Python3 and PyPy3 and had TLE.
But when I submitted an ALMOST equivalent code in C++, it got accepted.
I wonder what's wrong in my python code or is it the compiler's/interpreter's doing?

Python Code 1 : https://codeforces.com/contest/1136/submission/51256343.
Python Code 2 : https://codeforces.com/contest/1136/submission/51262124.
C++ code: https://codeforces.com/contest/1136/submission/51257855.

• -6

 » 3 years ago, # |   +36 Erm. Didn't you know that python runs slower than my grandma can crawl?
•  » » 3 years ago, # ^ | ← Rev. 2 →   0 Yeah, you are right! But I have been using Python for a long time and feel that this problem is more than just a comparison between the speeds of two.
•  » » » 3 years ago, # ^ |   0 If you want to deeply understand the reason behind the empirical speed of both languages, you are gonna have to study compilers and programming languages.
 » 3 years ago, # |   0 C++ is a lot faster than Python, so there are some problems which you can't solve in Python because it's not fast enough. I think that's the problem.
•  » » 3 years ago, # ^ |   0 Are you sure that's the only problem? Because I usually solve problems in PyPy3/Python and never had such problem.
•  » » » 3 years ago, # ^ |   0 I'm not sure, but I think Python can do about 10^6 operations in a second. I saw your Python code and, correct me if I'm wrong, but I think the time complexity of your code is not good enough if you use Python.
•  » » » » 3 years ago, # ^ |   0 Can you explain how the time complexity is not good enough?
•  » » » 3 years ago, # ^ |   +8 I'm just saying, you seem really stubborn about using Python/PyPy for your solutions. This is really going to be a problem for harder problems, when the model solution uses about $10^8$ operations and no amount of PyPy optimization will help. The setters don't even guarantee that all problems can be solved in TL with Python.If you are serious about solving more problems, you are going to have to switch to C++ or Java eventually.
 » 3 years ago, # |   +37 Firstly pretty much always use pypy, it is a lot quicker than cpython in pretty much every way. But the actual reason you get TLE is that the built in IO is very slow, especially when you have to read in multiple lines. One way to speed up the input is to doinput = io.BytesIO(os.read(0, os.fstat(0).st_size)).readlineThis reads in the input once, and then stores it in a stream. This one-liner will speed up your code immensely. With it your code runs in 1044 ms. There are also similar tricks you can do to speed up the output, but that wasn't necessary in this case.
•  » » 3 years ago, # ^ |   0 Thanks a lot! Also, can you give me some resources to other tricks to speed up io?
•  » » » 3 years ago, # ^ | ← Rev. 2 →   +8 Most of the tricks I've been using just comes from me and c1729 trying out different methods. He has got a guide but I think it might be a bit outdated at this time.About IO, as a general rule of thumb, read the entire input in one go, write the entire output in one go. IO in python doesn't have to be slow, it is just the built in stuff that is slow. For example take this huge IO SPOJ problem. I've been able to do it in 0.10 s on pypy, while the quickest Java solution takes 0.17 s. I think that for input usinginput = io.BytesIO(os.read(0, os.fstat(0).st_size)).readlinewill be good enough for most problems. And for output I think that using the pypy feature __pypy__.builders.StringBuilder is very good. Here is an example of using a string builder for output: import os,sys,__pypy__ out = __pypy__.builders.StringBuilder() A = [137]*10000000 for a in A: out.append(str(a)) out.append(' ') if sys.version_info[0] < 3: os.write(1,out.build()) else: os.write(1,out.build().encode()) This takes around $420 \mathrm{ms}$ in pypy3 and $200 \mathrm{ms}$ in pypy2. (this is including the upstart time of around 100 ms that pypy got, so the print especially in pypy2 is super quick).The reason why pypy2 is quicker is that python 2 uses bytestrings but in python 3 they've switched from using bytestrings to using unicode-strings by default, this drastically slows down the IO. So in python3 you have to manually switch to using bytestrings in order to get fast IO (this is done with .encode(), but unfortunately encode is pretty slow). Bytestrings are also one of the reasons os.read is so quick.Using these methods you should see drastically better speeds in heavy IO problems.
•  » » » » 3 years ago, # ^ |   +11 Thanks, updated the blog post with our most recent io techniques and some explanations.
•  » » » » 3 years ago, # ^ |   0 That was really helpful. Thank you so much.
•  » » » » 16 months ago, # ^ |   0 Hey, I use the following template (was copied from someone's solutions and is probably owned by you). Can you tell me if the following one is faster or the one you mentioned above ? Спойлер# Template: https://codeforces.com/profile/kabeer27 from __future__ import division, print_function def solve(): pass def main(): tests = 1 tests = int(input().strip()) for test in range(tests): solve() ######## Python 2 and 3 footer by Pajenegod and c1729 # Note because cf runs old PyPy3 version which doesn't have the sped up # unicode strings, PyPy3 strings will many times be slower than pypy2. # There is a way to get around this by using binary strings in PyPy3 # but its syntax is different which makes it kind of a mess to use. # So on cf, use PyPy2 for best string performance. py2 = round(0.5) if py2: from future_builtins import ascii, filter, hex, map, oct, zip range = xrange import os, sys from io import IOBase, BytesIO BUFSIZE = 8192 class FastIO(BytesIO): newlines = 0 def __init__(self, file): self._file = file self._fd = file.fileno() self.writable = "x" in file.mode or "w" in file.mode self.write = super(FastIO, self).write if self.writable else None def _fill(self): s = os.read(self._fd, max(os.fstat(self._fd).st_size, BUFSIZE)) self.seek((self.tell(), self.seek(0,2), super(FastIO, self).write(s))[0]) return s def read(self): while self._fill(): pass return super(FastIO,self).read() def readline(self): while self.newlines == 0: s = self._fill(); self.newlines = s.count(b"\n") + (not s) self.newlines -= 1 return super(FastIO, self).readline() def flush(self): if self.writable: os.write(self._fd, self.getvalue()) self.truncate(0), self.seek(0) class IOWrapper(IOBase): def __init__(self, file): self.buffer = FastIO(file) self.flush = self.buffer.flush self.writable = self.buffer.writable if py2: self.write = self.buffer.write self.read = self.buffer.read self.readline = self.buffer.readline else: self.write = lambda s:self.buffer.write(s.encode('ascii')) self.read = lambda:self.buffer.read().decode('ascii') self.readline = lambda:self.buffer.readline().decode('ascii') sys.stdin, sys.stdout = IOWrapper(sys.stdin), IOWrapper(sys.stdout) input = lambda: sys.stdin.readline().rstrip('\r\n') # Cout implemented in Python import sys class ostream: def __lshift__(self,a): sys.stdout.write(str(a)) return self cout = ostream() endl = '\n' # Read all remaining integers in stdin, type is given by optional argument, this is fast def readnumbers(zero = 0): conv = ord if py2 else lambda x:x A = []; numb = zero; sign = 1; i = 0; s = sys.stdin.buffer.read() try: while True: if s[i] >= b'0' [0]: numb = 10 * numb + conv(s[i]) - 48 elif s[i] == b'-' [0]: sign = -1 elif s[i] != b'\r' [0]: A.append(sign*numb) numb = zero; sign = 1 i += 1 except:pass if s and s[-1] >= b'0' [0]: A.append(sign*numb) return A if __name__== "__main__": main() 
•  » » » » » 16 months ago, # ^ |   +1 What you got there is my general template. It should be both about as fast as possible while being very general. It works both in python2 and 3, and it even works well on interactive problems.
•  » » » » » » 16 months ago, # ^ | ← Rev. 2 →   0 Thanks, One more question. Can you reason out why exactly same code gives runtime error on some cases when using pypy while passess the same testcase when using python3 ? I mean, in general what could be the possible reason ?
•  » » » » » » » 16 months ago, # ^ |   +1 You should just avoid doing any type of deep recursion in Python in general. Python was not made with deep recursion in mind.There are ways to solve graph problems iteratively, without ever calling a function. If you want to code in Python you should probably learn to do it that way.If you really want to do recursion then there is still a way to do it that works fairly well. A while back I made https://github.com/cheran-senthil/PyRival/blob/master/pyrival/misc/bootstrap.py, this thing might look weird, but with it I can make your code AC 80648471. This is what I would use if I need to do deep recursion.