Блог пользователя TheScrasse

Автор TheScrasse, история, 4 года назад, По-английски

Hi everyone, I have just tried to solve the problem 161D.
If I use a matrix dp[50010][510], I get a tle verdict, even if the time complexity of the solution is $$$O(nk)$$$, $$$nk < 10^8$$$ and the constant factors are quite small. But if I use a matrix dp[510][50010] and I swap the indices, I get ac with a time of 498 ms (much less than the time limit).
Why does it happen?
Thanks

Submission with tle verdict: 73781168
Submission with ac verdict: 73781989

  • Проголосовать: нравится
  • +17
  • Проголосовать: не нравится

»
4 года назад, # |
  Проголосовать: нравится +5 Проголосовать: не нравится

When the inner dimension of the array is close to a large power of two, accessing a[i][j] for different i but same j plays badly with processor cache. Oversimplifying a bit, cache lines are addressed by the last few bits of the actual memory address, so a[i][j] for different i but same j compete for the same slots in the cache. As a result, the program needs accessing the actual RAM instead of the processor cache too often, which is a few times slower.

»
4 года назад, # |
  Проголосовать: нравится +4 Проголосовать: не нравится