Блог пользователя minimario

Автор minimario, история, 8 лет назад, По-английски

Here's the problem: here.

My submission got an AC, sure, but it ran in O(N^2 log(10^6)), and it took >2000 ms to run (even with optimised MOD). But there are so many submissions, which I think are the same complexity, which run so fast!

For example, just look at the first 2 here!

The first one, I don't understand, I think it uses some magic instructions (wtf, 100'000 instead of 100000 and mod_t!!?) (by ordcoder). The second one, by mmaxio is even in Java, and I'm almost completely sure it's the same O(N^2 log(10^6)) algorithm as everyone else.

So how did these people (or anyone else with some fast submission) get their runtime down so low?

Can someone elaborate?

Edit 1: Thanks to ned_chu, I've replaced the inverse modding to be linear. My new submission is here

Thanks,

minimario

  • Проголосовать: нравится
  • +15
  • Проголосовать: не нравится

»
8 лет назад, # |
Rev. 2   Проголосовать: нравится +16 Проголосовать: не нравится

Initializing mod inverse in you solution cost , but it is linear in Java solution. Actually, my teammate did things like this 2 days ago and got TLE and get AC after change mod inverse to online version. You can calculate mod inverse online instead of initializing it. Or you can initial it in linear time like that Java solution. Considering upper bound of mod inverse usage, it is always faster to calculate inverse online on single-case OJ like CF if you have N of 105 or 106. If it's multi-case OJ, it may be better to initialize mod inverse. You can learn how to initialize mod inverse in linear time. Maybe you need this, just skip all Chinese and go to about 60% this article, there some formulas about linear time mod inverse initialization.

  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Thanks, it shaved about 100 ms. off my time. But I guess it doesn't really optimise it that much. New submission link here.

»
8 лет назад, # |
  Проголосовать: нравится +29 Проголосовать: не нравится

mod operations are extremely costly. Inside your bottleneck loop, you have four (!) of them; that Java solution has zero.

(Incidentally, the new fastest solution is just your code with the modulos removed. :P)

»
8 лет назад, # |
  Проголосовать: нравится +24 Проголосовать: не нравится

The first one, I don't understand, I think it uses some magic instructions (wtf, 100'000 instead of 100000 and mod_t!!?)

"100'000" is one of C++14 features. https://en.wikipedia.org/wiki/C%2B%2B14#Digit_separators

  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится +21 Проголосовать: не нравится

    And "using mod_t = uint64_t" is a C++11 synonym for "typedef uint64_t mod_t".