I wish to continue the discussion on C++ input/output performance, which was started by freopen long ago. freopen compared the speed of the two common input/output methods in C++: the stdio library, inherited from C (<stdio.h>), and the newer iostreams library (<iostream>/…). However, these tests did not take into account several important iostreams optimizations. They have been mentioned more than once on Codeforces (first, second, third). I have written a program that compares the performance of the stdio and iostreams libraries with these optimizations turned on.
UPD1: Added information on scanf() calls on strings for completeness.
What are the optimizations?
The first one is enabled by placing this line at the beginning of the program, before any input/output:
ios_base::sync_with_stdio(false);
This command turns off iostreams and stdio synchronization (description). It is on by default, which means that calls to iostreams and stdio functions can be freely interleaved even for the same underlying stream. When synchronization is turned off, mixing calls is no longer allowed, but iostreams can potentially operate faster.
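As a minimal sketch of how this could be wired into a solution: the helper names `disable_stdio_sync` and `sum_ints` below are made up for illustration and are not taken from the benchmark program.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>

// Hypothetical helper: call once at the very start of main(), before any I/O.
void disable_stdio_sync() {
    std::ios_base::sync_with_stdio(false);
}

// Hypothetical workload: reads a count n, then sums the next n integers.
// It takes any std::istream, so it works with std::cin or a string stream.
long long sum_ints(std::istream& in) {
    int n = 0;
    in >> n;
    long long total = 0;
    for (int i = 0; i < n; ++i) {
        int x = 0;
        in >> x;
        total += x;
    }
    return total;
}
```

In a real solution, main() would call disable_stdio_sync() first and then something like std::cout << sum_ints(std::cin) << "\n"; from that point on, printf()/scanf() should not be mixed with cin/cout.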
The second optimization is about untying cin from cout: cin.tie(NULL). By default, cin is tied to cout, which means that cout is flushed before any operation on cin (description). Turning this feature off allows iostreams, again, to operate faster. One should be careful with this optimization in interactive problems: either do not use it, or issue an explicit flush after each output.
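A small sketch of untying plus the explicit flush needed in interactive problems. The helper names and the "? <number>" query format are assumptions invented for this example.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>

// Hypothetical helper: call once at the start of main(), before any I/O.
void untie_cin() {
    std::cin.tie(nullptr);  // cout is no longer flushed before each read from cin
}

// Hypothetical interactive exchange: prints a query, flushes explicitly,
// then reads the reply. The "? " prefix is a made-up protocol.
int ask(std::istream& in, std::ostream& out, int guess) {
    out << "? " << guess << "\n";
    out.flush();  // without this, an untied stream may keep the query in its buffer
    int reply = 0;
    in >> reply;
    return reply;
}
```

In a non-interactive problem the flush can simply be dropped; the output is written when the buffer fills up and at program exit.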
I should also note that frequent use of endl negatively affects iostreams performance too, because endl not only outputs a newline character, but also flushes the stream's buffer (description). You can simply output "\n" instead of endl.
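To make the difference visible, here is a sketch that counts flushes with a custom stream buffer. `CountingBuf` and both functions are hypothetical names; the buffer simply counts how often sync() is invoked, which is what a flush triggers.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical buffer that counts how many times the stream is flushed.
struct CountingBuf : std::stringbuf {
    int flushes = 0;
protected:
    int sync() override {
        ++flushes;
        return std::stringbuf::sync();
    }
};

// Writes n lines terminated by std::endl and returns the number of flushes.
int flushes_with_endl(int n) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < n; ++i) out << i << std::endl;
    return buf.flushes;
}

// Writes n lines terminated by "\n" and returns the number of flushes.
int flushes_with_newline(int n) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < n; ++i) out << i << "\n";
    return buf.flushes;
}
```

With a real file or console behind the stream, each of those flushes is a potential system call, which is where the slowdown comes from.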
What tests are included in the program?
I have tried to reproduce the most typical cases that occur when solving problems.
- int input/output with stdio, iostreams and, for comparison, custom functions
- double input/output with stdio and iostreams
- Character input/output
- String input/output: both char * and std::string
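The custom int functions are not shown in this post. As a rough idea of what such a reader might look like, here is a sketch built on getc(); the name `read_int` and its interface are assumptions, and it does no overflow checking.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical custom reader: parses one decimal int from `in`.
// Skips leading whitespace, accepts an optional '-', and returns false
// if no digits are found (including at end of file).
// Assumes the value fits in an int; there is no overflow checking.
bool read_int(std::FILE* in, int& out) {
    int c = std::getc(in);
    while (c == ' ' || c == '\n' || c == '\r' || c == '\t') c = std::getc(in);
    bool negative = false;
    if (c == '-') {
        negative = true;
        c = std::getc(in);
    }
    if (c < '0' || c > '9') return false;
    long long value = 0;  // wide enough to hold the magnitude of INT_MIN
    while (c >= '0' && c <= '9') {
        value = value * 10 + (c - '0');
        c = std::getc(in);
    }
    if (c != EOF) std::ungetc(c, in);  // leave the delimiter for the next read
    out = static_cast<int>(negative ? -value : value);
    return true;
}
```

A solution would call it as read_int(stdin, x). It skips everything scanf("%d") has to do around format parsing, which is the usual reason such functions are written.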
What tests are not included in the program?
- long long: I estimate it to give roughly the same relative results as int
- Manual conversion of int to a custom char buffer of a fairly large size, which would then be output directly with cout.write() (and the same for input)
- Rather unusual character input/output method with
- Any tests that change the stream buffer size (it seems that in GCC iostreams is unaffected by user settings for standard streams). This can be potentially explored more thoroughly.
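For readers curious about the manual-conversion approach mentioned in the list above (which the program does not test), a sketch could look like this. The function names are hypothetical, and std::vector<char> stands in for the "fairly large" char buffer.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <vector>

// Appends the decimal representation of x to buf.
void append_int(std::vector<char>& buf, int x) {
    char digits[11];   // a 32-bit int has at most 10 decimal digits
    int len = 0;
    long long v = x;   // widened so that negating INT_MIN is safe
    if (v < 0) {
        buf.push_back('-');
        v = -v;
    }
    do {
        digits[len++] = static_cast<char>('0' + v % 10);
        v /= 10;
    } while (v != 0);
    while (len > 0) buf.push_back(digits[--len]);  // digits were produced in reverse
}

// Formats every value on its own line into one buffer, then writes it in a single call.
void write_all(std::ostream& out, const std::vector<int>& values) {
    std::vector<char> buf;
    buf.reserve(values.size() * 12);
    for (int x : values) {
        append_int(buf, x);
        buf.push_back('\n');
    }
    out.write(buf.data(), static_cast<std::streamsize>(buf.size()));
}
```

Called as write_all(std::cout, answers), this replaces many formatted insertions with one unformatted write.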
How do I run this?
Compile the program, not forgetting optimization (-O2/Release), then run it with the working directory set to the directory containing the program binary. If you get an Access denied message on Windows, running the program with elevated privileges could help. The program will need about two hundred megabytes of free space in that directory for temporary files.
- Why does each test need a separate process?
ios_base::sync_with_stdio(false) disallows combined stdio and iostreams usage, and also, theoretically, prohibits using freopen() to redirect
- Why is it needed to remove the test file before each new test?
To have equal conditions for all runs. However, this is debatable: maybe it would be better to overwrite the file instead?
- Why does the child process measure the time, and not the parent process?
To exclude process creation/destruction time from the results.
- Why can't you use something more precise like
I can. That is, when I understand how to do it in Windows :-)
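The timing code of the program is not reproduced here. One portable way to time a piece of work from inside the child process is std::chrono::steady_clock from C++11; the helper below is an illustrative assumption, not the method the program actually uses.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
template <typename F>
double seconds_taken(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Because the clock is started inside the process that performs the I/O, process creation and destruction are excluded from the result, as described above.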
I ran the tests on a PC with a Pentium 4, so the figures may look rather large.
- For Visual C++ 2010: http://pastie.org/4680309
int, printf         9.45    9.48    9.44
int, cout          22.03   22.01   22.21
int, custom/out    11.17   11.06   11.20
int, scanf          5.04    4.77    4.82
int, cin           20.26   20.16   20.16
int, custom/in     10.25   10.25   10.25
double, printf     19.23   18.98   18.95
double, cout       37.49   37.52   37.44
double, scanf      12.11   11.75   11.73
double, cin        26.88   26.57   26.57
char, putchar      13.29   13.76   13.48
char, cout         23.52   24.15   23.41
char, getchar      12.87   12.82   12.74
char, cin          16.13   16.22   16.50
char *, printf      6.88    6.74    6.57
char *, puts        3.95    3.82    3.95
char *, cout        6.36    6.32    6.43
string, cout        6.40    6.40    6.61
char *, scanf       6.16    6.10    6.13
char *, gets        3.98    3.96    3.96
char *, cin         8.72    8.91    8.85
string, getline    11.70   11.47   11.53
Here the picture is clear: stdio is a lot faster than iostreams. Notably, printf() and scanf() are even faster than the custom-written functions for int (but see the addendum below). puts() and gets() are faster than printf() and scanf() on strings, which is understandable, since they do not have to parse a format string. Writing a std::string takes the same time as writing a char *, but reading into a std::string is slower, most likely because of the need to allocate memory dynamically.
- For MinGW (GCC 4.7.0): http://pastie.org/4680314
int, printf         9.72    9.61    9.61
int, cout           6.08    6.05    6.10
int, custom/out     2.73    2.75    2.76
int, scanf          5.01    5.01    5.01
int, cin            3.99    4.04    4.04
int, custom/in      0.86    0.86    0.87
double, printf     22.51   22.40   22.42
double, cout      110.98  111.77  111.01
double, scanf      12.18   12.20   12.17
double, cin       118.87  118.84  118.87
char, putchar       1.67    1.65    1.64
char, cout          3.93    3.87    3.85
char, getchar       0.78    0.80    0.80
char, cin           3.29    3.31    3.29
char *, printf      5.55    5.47    5.49
char *, puts        5.37    5.32    5.41
char *, cout        8.72    8.72    8.78
string, cout        8.74    8.71    9.06
char *, scanf       7.07    7.04    7.02
char *, gets        3.84    3.79    3.77
char *, cin         5.30    5.38    5.35
string, getline    14.15   14.12   14.16
This one is not so one-sided. Quite unexpectedly, iostreams is faster than stdio for int: cin takes about 20% less time than scanf(), and cout about 37% less than printf(). The custom int functions beat both by a significant margin, though. For double it's reversed: iostreams is very slow, roughly 5 to 10 times slower than stdio. putchar() and getchar() work about 2-4 times faster than cout and cin for character input/output. String input/output does not differ as much, but here too stdio is faster. puts() and gets() are again faster than printf() and scanf() on strings. As in the previous case, std::string takes the same time as char * to be output, but more time to be input.
I leave it up to the readers to draw conclusions and decide what to use.
Constructive discussion is welcome.
For Visual C++, there is a method to significantly speed up basic operations on stdio streams by turning off stream locking for getchar(), putchar() and some other functions. To do this, add this line before any
(description). This will work only if the following conditions are also met:
- The program must be statically linked with the standard library (/MT; Codeforces seems to do this)
- The program can include <stdio.h>, but must not include <cstdio> or any of the iostreams headers (
With this optimization character input/output speed increases nearly ninefold (!), and so does the speed of the custom int functions:
int, custom/out     1.70    1.70    1.72
int, custom/in      1.28    1.26    1.28
char, putchar       1.72    1.62    1.61
char, getchar       1.36    1.34    1.36
MinGW does this by default and is not subject to the aforementioned restrictions.
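The exact line for Visual C++ did not survive in the text above. A likely candidate, offered here strictly as an assumption, is Microsoft's documented _CRT_DISABLE_PERFCRIT_LOCKS macro, which makes the statically linked CRT use the _nolock forms of the stdio calls. The sketch below follows the stated restrictions (only <stdio.h>, no iostreams); `copy_stream` is a hypothetical helper.

```cpp
// Assumption: this is the macro the addendum refers to. It only has an effect
// with Visual C++ and a statically linked CRT (/MT); other compilers ignore it.
#define _CRT_DISABLE_PERFCRIT_LOCKS
#include <assert.h>
#include <stdio.h>

// Hypothetical helper: copies bytes from `in` to `out` until end of file
// and returns the number of bytes copied.
long copy_stream(FILE* in, FILE* out) {
    long copied = 0;
    int c;
    while ((c = getc(in)) != EOF) {
        putc(c, out);
        ++copied;
    }
    return copied;
}
```

The define has to appear before the first #include, since it changes how the header declares the I/O functions. Unlocked I/O is only safe when a single thread uses the streams, which is the normal case in contest solutions.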