dario2994's blog

By dario2994, history, 4 years ago, In English

While trying to squeeze my solution of 1336E2 - Chiori and Doll Picking (hard version) into the time limit, I have encountered an unexpectedly large difference in the execution times among the various C++ compilers offered by Codeforces. Since I think it is something worth knowing, I am sharing this discovery.

Consider the following minimal working example (it generates $$$2N=30$$$ random $$$56$$$-bits numbers and computes the sum of the bits of the xors of all the subsets of the $$$30$$$ numbers).

Code

The important lines are the following ones, where the functions __builtin_popcountll and xor are called $$$2^{30}$$$ times.

 ULL res = 0;
 for (int i = 0; i < (1<<N); i++) for (int j = 0; j < (1<<N); j++) {
     res += __builtin_popcountll(c[i]^d[j]);
 }

Executing the above program in Codeforces custom invocation yields these execution times:

 Compiler                         Execution Time
 GNU GCC C11 5.1.0                4040 ms
 GNU G++11 5.1.0                  4102 ms
 GNU G++14 6.4.0                  1123 ms
 GNU G++17 7.3.0                  1107 ms
 GNU G++17 9.2.0 (64bit, msys 2)  374 ms

Notice that the 64bit-native compiler produces a much faster executable (and notice also that among the other compilers there is quite a difference). Hence, next time you have to optimize a solution with a lot of bit-operations on 64bit integers (and, in fact, this situation is not so uncommon), consider using the compiler GNU G++17 9.2.0 (64bit, msys 2).

It might be that the differences among the execution times are due to the way I have written the program (maybe the wrong PRAGMAS? Maybe preventing some compilers from optimizing because of a certain idiom? Maybe something else?), if this is the case, please enlighten me!

  • Vote: I like it
  • +33
  • Vote: I do not like it