### dario2994's blog

By dario2994, history, 15 months ago,

While trying to squeeze my solution of 1336E2 - Chiori and Doll Picking (hard version) into the time limit, I have encountered an unexpectedly large difference in the execution times among the various C++ compilers offered by Codeforces. Since I think it is something worth knowing, I am sharing this discovery.

Consider the following minimal working example (it generates $2N=30$ random $56$-bits numbers and computes the sum of the bits of the xors of all the subsets of the $30$ numbers).

Code

The important lines are the following ones, where the functions __builtin_popcountll and xor are called $2^{30}$ times.

 ULL res = 0;
for (int i = 0; i < (1<<N); i++) for (int j = 0; j < (1<<N); j++) {
res += __builtin_popcountll(c[i]^d[j]);
}


Executing the above program in Codeforces custom invocation yields these execution times:

 Compiler                         Execution Time
GNU GCC C11 5.1.0                4040 ms
GNU G++11 5.1.0                  4102 ms
GNU G++14 6.4.0                  1123 ms
GNU G++17 7.3.0                  1107 ms
GNU G++17 9.2.0 (64bit, msys 2)  374 ms


Notice that the 64bit-native compiler produces a much faster executable (and notice also that among the other compilers there is quite a difference). Hence, next time you have to optimize a solution with a lot of bit-operations on 64bit integers (and, in fact, this situation is not so uncommon), consider using the compiler GNU G++17 9.2.0 (64bit, msys 2).

It might be that the differences among the execution times are due to the way I have written the program (maybe the wrong PRAGMAS? Maybe preventing some compilers from optimizing because of a certain idiom? Maybe something else?), if this is the case, please enlighten me!

• +33