Блог пользователя Radewoosh

Автор Radewoosh, история, 5 лет назад, По-английски

Once, during one training, my team and I noticed something strange. I'll write a little bit later what it was exactly. The point is that after the contest we decided to investigate how c++ rand() behaves on our machines (we use Linux). So, we used a Berlekamp–Massey algorithm on the results of rand() taken modulo 2.

The result is shocking — the length of the linear recurrence is only 527! No, the results aren't cyclic, just the recurrence is short.

"Cool, but who cares? Just some random fact which I wouldn't notice and it wouldn't have any impact on me." — wrong!

We discovered it while solving a task which required calculating the rank of a binary matrix. We wanted to check something, so we just generated big matrix using rand(), which turned out to be a bad decision. As only the first 527 columns were, let's say, independent, the rest were determined by first 527 ones, so the rank didn't exceed 527, and we lost much time just wondering what's happening.

Moreover, if the number of columns is even, the rank is only 496 (for matrices big enough of course). Guess what's the length of the recurrence if we look only at every second value of rand().

Yep, you're right — 496. One can notice that 527 - 496 = 31 and that 31 divides 527! Illuminati!

Here are some useful links: 1 2.

If anybody can explain precisely what's happening for even number of columns, it would be great. Also, if anybody knows how does rand() look inside and can say a few words about it, it also would be great.

  • Проголосовать: нравится
  • +119
  • Проголосовать: не нравится

»
5 лет назад, # |
  Проголосовать: нравится +12 Проголосовать: не нравится

Where's Blogwoosh??? xD

»
5 лет назад, # |
  Проголосовать: нравится +63 Проголосовать: не нравится

Have noone ever told you to never use rand()?

»
5 лет назад, # |
  Проголосовать: нравится +8 Проголосовать: не нравится

Wow, rand() is getting worse ...

»
5 лет назад, # |
  Проголосовать: нравится +20 Проголосовать: не нравится
  • »
    »
    5 лет назад, # ^ |
    Rev. 2   Проголосовать: нравится -12 Проголосовать: не нравится

    This has nothing to do with the blog.

    EDIT: it does, my bad.

    • »
      »
      »
      5 лет назад, # ^ |
      Rev. 2   Проголосовать: нравится +40 Проголосовать: не нравится

      It does. The post contains the exact implementation of rand() that Codeforces uses, which is a linear congruential generator.

      The compiler Radewoosh is using has a better implementation of rand() than Codeforces, but it's most likely still a linear congruential generator, which means rand() % 2 has a relatively low period.

      Try mt19937 instead and see what you get.

      EDIT: this one might actually be a linear-feedback shift register rather than a linear congruential generator, but the point about rand() % 2 having low-quality randomness is still true.

»
5 лет назад, # |
Rev. 2   Проголосовать: нравится +200 Проголосовать: не нравится

Yep, you're right — 496. One can notice that 527 - 496 = 31 and that 31 divides 527! Illuminati!

Not just that, but 31 also divides 496! Now that cannot be a coincidence.

»
5 лет назад, # |
Rev. 2   Проголосовать: нравится -31 Проголосовать: не нравится

Here is a possible simple fix for this problem using the standard library function random_shuffle(). 1

»
5 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

The GLIBC random number generator: https://www.mathstat.dal.ca/~selinger/random/

»
5 лет назад, # |
  Проголосовать: нравится -18 Проголосовать: не нравится

I've learned to use a linear congruential generator, with the rand() as a seed. Seems to work pretty well.

»
5 лет назад, # |
  Проголосовать: нравится +3 Проголосовать: не нравится

The length of the linear recurrence of Xorshift64 is 64, I surprised. It seems to that calculating linear recurrence is weak points of some RNGs. At least it of mt19937 isn't short.

https://ideone.com/UxRMg8 (xorshift) https://ideone.com/8ueYut (mt19937) https://en.wikipedia.org/wiki/Xorshift (the code of xorshift64)

»
5 лет назад, # |
Rev. 2   Проголосовать: нравится +18 Проголосовать: не нравится

It's more likely to be a feature, but on my machine this code outputs only zeros (on Codeforces it doesn't):

for (int i = 0; i < 100; i++) {
    srand(i);
    cout << rand() % 7 << " ";
}

And 0 is the only seed among all seeds from 0 to 100000, such that the first number generated by rand() isn't a multiple of 16807 = 75.
This caused me a lot of problems when I was stressing my solution once and wrote something like this:

srand(some number);
int n = rand() % 7 + 1;