By Qualified, 4 weeks ago

KACTL's ModMul says that it runs around 2x faster than the naive

(__int128_t)a * b % M


When I benchmarked both with -O2, their running times were similar. Am I mistaken?


 » 4 weeks ago: This is only one mod operation. Try benchmarking many more operations.
 » » 4 weeks ago: I did that. I did 1e5 runs and their running times were basically the same.
 » » » 4 weeks ago: 1e5 operations is not very many; that should take around 1 millisecond. Try something like 1e10 of them to spot a consistent difference. Also make sure the compiler can't optimize the computation away.
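
A rough sketch of what such a benchmark could look like, assuming GCC or Clang on x86-64 (for `__int128_t` and an 80-bit `long double`). `modmul_kactl` follows the KACTL ModMul routine, which requires 0 ≤ a, b ≤ M ≤ 7.2e18. The names `bench` and `BenchResult` are made up for this example. The inputs are read through `volatile` variables so the compiler cannot treat them as constants, and each result feeds the next call, so the loop cannot be dropped as long as the final value is used.

```cpp
#include <cassert>
#include <chrono>

typedef unsigned long long ull;
typedef long long ll;

// Naive version from the post: 128-bit product, then a 128-bit remainder.
ull modmul_naive(ull a, ull b, ull M) {
    return (ull)((__int128_t)a * b % M);
}

// KACTL-style ModMul: estimate the quotient in long double, then fix it up.
// Relies on unsigned wraparound and on long double having a 64-bit mantissa.
ull modmul_kactl(ull a, ull b, ull M) {
    ll ret = a * b - M * ull(1.L / M * a * b);
    return ret + M * (ret < 0) - M * (ret >= (ll)M);
}

// Hypothetical helper type for this sketch: elapsed time plus final value.
struct BenchResult {
    double seconds;
    ull checksum;
};

// Runs x = f(x, b, M) `iters` times as one dependent chain and times it.
template <class F>
BenchResult bench(F f, ull M, ll iters) {
    assert(M > 1);
    // volatile reads hide the actual values from the optimizer
    volatile ull vm = M, va = 123456789ULL % M, vb = 987654321ULL % M;
    ull m = vm, x = va, b = vb;
    auto t0 = std::chrono::steady_clock::now();
    for (ll i = 0; i < iters; i++) x = f(x, b, m);
    auto t1 = std::chrono::steady_clock::now();
    return {std::chrono::duration<double>(t1 - t0).count(), x};
}
```

To use it, call `bench(modmul_naive, M, iters)` and `bench(modmul_kactl, M, iters)` from `main` with the same modulus and a large iteration count, then print both `seconds` and `checksum`. Printing the checksum keeps the loop live, and the two checksums should match. Because each call depends on the previous result, this measures latency rather than throughput, and the gap may depend on how large the modulus is.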