andreyv's blog

By andreyv, 5 years ago, translation, In English,

Hello!

I wish to continue the discussion on C++ input/output performance, which was started by freopen long ago. freopen compared the speed of the two common input/output methods in C++: the stdio library, inherited from C (<stdio.h>), and the newer iostreams library (<iostream>/…). However, those tests did not take into account several important iostreams optimizations. They have been mentioned more than once on Codeforces (first, second, third). I have written a program that compares the performance of the stdio and iostreams libraries with these optimizations turned on.

UPD1: Added information on _CRT_DISABLE_PERFCRIT_LOCKS.
UPD2: Added printf()/scanf() calls on strings for completeness.

What are the optimizations?

The first one is enabled by placing this line at the beginning of the program, before any input/output:

ios_base::sync_with_stdio(false);

This command turns off iostreams and stdio synchronization (description). It is on by default, which means that calls to iostreams and stdio functions can be freely interleaved even for the same underlying stream. When synchronization is turned off, mixing calls is no longer allowed, but iostreams can potentially operate faster.

The second optimization is about untying cin from cout:

cin.tie(NULL);

By default, cin is tied to cout, which means that cout is flushed before any operation on cin (description). Turning this feature off allows iostreams, again, to operate faster. One should be careful with this optimization in interactive problems: it should either not be used, or an explicit flush should be issued each time.

I should also note that frequent use of endl also negatively affects iostreams performance, because endl not only outputs a newline character, but also flushes the stream's buffer (description). You can simply output '\n' or "\n" instead of endl.

What tests are included in the program?

I have tried to reproduce the most typical cases that occur when solving problems.

  • int input/output with stdio, iostreams and, for comparison, custom functions
  • double input/output
  • Character input/output
  • String input/output: both char * and std::string
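The post does not show the custom int functions themselves. As a rough idea of what such functions usually look like, here is a sketch that reads and writes decimal digits one character at a time. The names read_int() and write_int() are hypothetical, and the FILE * parameter is an assumption made so that the code can be tried on any stream; the functions in the test program may well differ.

```cpp
#include <cstdio>

// Read one decimal int from f, skipping leading whitespace.
// Assumes well-formed input that fits in int; there is no error handling.
// The character right after the number is consumed as well.
int read_int(std::FILE* f) {
    int c = std::getc(f);
    while (c == ' ' || c == '\n' || c == '\r' || c == '\t') c = std::getc(f);
    bool negative = false;
    if (c == '-') { negative = true; c = std::getc(f); }
    int value = 0;                        // accumulated as a negative number
    while (c >= '0' && c <= '9') {        // so that INT_MIN does not overflow
        value = value * 10 - (c - '0');
        c = std::getc(f);
    }
    return negative ? value : -value;
}

// Write x in decimal to f, without any separator.
void write_int(int x, std::FILE* f) {
    unsigned int u = static_cast<unsigned int>(x);
    if (x < 0) { std::putc('-', f); u = 0u - u; }   // magnitude, safe for INT_MIN
    char digits[16];
    int len = 0;
    do { digits[len++] = static_cast<char>('0' + u % 10); u /= 10; } while (u != 0);
    while (len > 0) std::putc(digits[--len], f);     // digits were produced in reverse
}
```

With stdin and stdout as arguments this corresponds to a getchar()/putchar() based implementation.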

What tests are not included in the program?

  • long long — I estimate it to give roughly the same relative results as int
  • Manual conversion of int into a fairly large custom char buffer, which is then written out directly with fwrite()/cout.write() (and the same for input)
  • The rather unusual character input/output method with cin.rdbuf()->sgetc() and cout.rdbuf()->sputc()
  • Any tests that change the stream buffer size (it seems that in GCC iostreams is unaffected by user settings for standard streams). This can be potentially explored more thoroughly.

How do I run this?

Compile the program with optimization enabled (-O2/Release), then run it with the working directory set to the directory that contains the program binary. If you get an "Access denied" message on Windows, running the program with elevated privileges could help. The program will need about two hundred megabytes of free space in that directory for temporary files.

Additional notes

  • Why does each test need a separate process?

Because ios_base::sync_with_stdio(false) disallows combined stdio and iostreams usage, and also, theoretically, prohibits using freopen() to redirect cin/cout.

  • Why does the test file need to be removed before each new test?

To have equal conditions for all runs. However, this is debatable. Maybe it would be better to overwrite the file?

  • Why does the child process measure the time, and not the parent process?

To exclude process creation/destruction time from the results.

  • Why can't you use something more precise like getrusage() instead of clock()?

I can. That is, when I understand how to do it in Windows :-)
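For reference, timing a piece of work with clock() inside the measured process could be sketched as follows. The helper name cpu_seconds() is hypothetical and this is not the code of the test program. Note also that what clock() counts can differ between C libraries, so the result is only as meaningful as the platform's clock().

```cpp
#include <ctime>

// Run work() and return the time it took according to clock(), in seconds.
// Process start-up and tear-down are not included, because both readings
// are taken inside the process that does the work.
template <class Work>
double cpu_seconds(Work work) {
    const std::clock_t start = std::clock();
    work();
    const std::clock_t finish = std::clock();
    return static_cast<double>(finish - start) / CLOCKS_PER_SEC;
}
```

It accepts any callable, for example a lambda that performs one of the input/output loops.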

The results

I ran the tests on a PC with Pentium 4, so the figures might look a bit big.

int, printf        9.45   9.48   9.44
int, cout         22.03  22.01  22.21
int, custom/out   11.17  11.06  11.20
int, scanf         5.04   4.77   4.82
int, cin          20.26  20.16  20.16
int, custom/in    10.25  10.25  10.25
double, printf    19.23  18.98  18.95
double, cout      37.49  37.52  37.44
double, scanf     12.11  11.75  11.73
double, cin       26.88  26.57  26.57
char, putchar     13.29  13.76  13.48
char, cout        23.52  24.15  23.41
char, getchar     12.87  12.82  12.74
char, cin         16.13  16.22  16.50
char *, printf     6.88   6.74   6.57
char *, puts       3.95   3.82   3.95
char *, cout       6.36   6.32   6.43
string, cout       6.40   6.40   6.61
char *, scanf      6.16   6.10   6.13
char *, gets       3.98   3.96   3.96
char *, cin        8.72   8.91   8.85
string, getline   11.70  11.47  11.53

In this first set of results the picture is clear: stdio is a lot faster than iostreams. Notably, printf()/scanf() are even faster than the custom-written functions for int (but see the addendum below). puts()/gets() are faster than printf()/scanf() on strings, which is understandable since they do not have to parse a format string. Writing a std::string takes the same time as writing a char *, but reading into a std::string is slower, most likely because of the need to allocate memory dynamically.

int, printf        9.72   9.61   9.61
int, cout          6.08   6.05   6.10
int, custom/out    2.73   2.75   2.76
int, scanf         5.01   5.01   5.01
int, cin           3.99   4.04   4.04
int, custom/in     0.86   0.86   0.87
double, printf    22.51  22.40  22.42
double, cout     110.98 111.77 111.01
double, scanf     12.18  12.20  12.17
double, cin      118.87 118.84 118.87
char, putchar      1.67   1.65   1.64
char, cout         3.93   3.87   3.85
char, getchar      0.78   0.80   0.80
char, cin          3.29   3.31   3.29
char *, printf     5.55   5.47   5.49
char *, puts       5.37   5.32   5.41
char *, cout       8.72   8.72   8.78
string, cout       8.74   8.71   9.06
char *, scanf      7.07   7.04   7.02
char *, gets       3.84   3.79   3.77
char *, cin        5.30   5.38   5.35
string, getline   14.15  14.12  14.16

The second set of results is not so one-sided. Quite unexpectedly, iostreams turns out to be about 20-30% faster than stdio for int. The custom int functions beat both by a significant margin, though. For double it is the reverse: iostreams is very slow. putchar()/getchar() work about 2-3 times faster than cout/cin for character input/output. String input/output does not differ as much, but stdio is faster here as well. puts()/gets() are again faster than printf()/scanf() on string input/output. As in the previous case, std::string takes the same time as char * to be output, but more time to be input.

I leave it up to the readers to draw conclusions and decide what to use. Constructive discussion is welcome.

Addendum

For Visual C++, there is a method to significantly speed up basic operations on stdio streams by turning off stream locking for the getchar(), putchar() and some other functions. To do this, add this line before any #includes:

#define _CRT_DISABLE_PERFCRIT_LOCKS

(description). This will work only if the following conditions are also met:

  • The program must be statically linked with the standard library (/MT; Codeforces seems to do this)
  • The program can include <stdio.h>, but must not include <cstdio> or any of the iostreams headers (<iostream>/…)

As an alternative to the magic above, you can simply call _putchar_nolock()/_getchar_nolock() instead of putchar()/getchar(). Linux has similar functions, such as putchar_unlocked()/getchar_unlocked(): link.
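A portable wrapper around the non-locking functions could be sketched like this. The names fast_getc() and fast_putc() are hypothetical. The sketch uses the FILE * variants (_getc_nolock()/_putc_nolock() in the Visual C++ runtime, getc_unlocked()/putc_unlocked() on POSIX systems) instead of the stdin/stdout-only ones named above, purely so that it works on any stream. These functions skip locking, so they are only appropriate when a single thread uses the stream.

```cpp
#include <cstdio>

// Pick the non-locking character functions where the platform offers them,
// and fall back to the ordinary locking ones elsewhere.
#if defined(_MSC_VER)
inline int fast_getc(std::FILE* f) { return _getc_nolock(f); }
inline int fast_putc(int c, std::FILE* f) { return _putc_nolock(c, f); }
#elif defined(__unix__) || defined(__APPLE__)
inline int fast_getc(std::FILE* f) { return getc_unlocked(f); }
inline int fast_putc(int c, std::FILE* f) { return putc_unlocked(c, f); }
#else
inline int fast_getc(std::FILE* f) { return std::getc(f); }
inline int fast_putc(int c, std::FILE* f) { return std::putc(c, f); }
#endif
```

Custom int functions built on top of such wrappers would benefit in the same way as the getchar()/putchar() loops do.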

With this optimization character input/output speed increases nearly ninefold (!), and so does the speed of custom int functions:

int, custom/out    1.70   1.70   1.72
int, custom/in     1.28   1.26   1.28
char, putchar      1.72   1.62   1.61
char, getchar      1.36   1.34   1.36

MinGW does this by default and is not subject to the aforementioned restrictions.

 
 
 
 

»
2 years ago

Can you tell me about custom/in and custom/out? What do they mean? Thank you.

»
2 years ago

(I see that this is a rather old entry, but since it was brought up in recent actions, I will add my input.)

I think it would also be interesting to have results for different precision settings: "not set" (which is probably always a bad idea at contests) vs. small (e.g. 2) vs. large (e.g. 10).

»
12 months ago

Is ios::sync_with_stdio(0); the same as ios_base::sync_with_stdio(0)?

  • »
    12 months ago

    Do you mean std::ios, i.e. std::basic_ios<char>? Yes, it is the same function: std::basic_ios inherits from the std::ios_base class.