### TheScrasse's blog

By TheScrasse, history, 3 years ago,

Hi everyone, I have just tried to solve the problem 161D.
If I use a matrix dp[50010][510], I get a tle verdict, even if the time complexity of the solution is $O(nk)$, $nk < 10^8$ and the constant factors are quite small. But if I use a matrix dp[510][50010] and I swap the indices, I get ac with a time of 498 ms (much less than the time limit).
Why does it happen?
Thanks

Submission with tle verdict: 73781168
Submission with ac verdict: 73781989

• +17

 » 3 years ago, # |   +5 When the inner dimension of the array is close to a large power of two, accessing a[i][j] for different i but same j plays badly with processor cache. Oversimplifying a bit, cache lines are addressed by the last few bits of the actual memory address, so a[i][j] for different i but same j compete for the same slots in the cache. As a result, the program needs accessing the actual RAM instead of the processor cache too often, which is a few times slower.
 » 3 years ago, # |   +4 There's an excellent talk by Scott Meyers that discusses this particular issue.