By Radewoosh, history, 5 years ago,

Once, during one training, my team and I noticed something strange. I'll write a little bit later what it was exactly. The point is that after the contest we decided to investigate how c++ rand() behaves on our machines (we use Linux). So, we used a Berlekamp–Massey algorithm on the results of rand() taken modulo 2.

The result is shocking — the length of the linear recurrence is only 527! No, the results aren't cyclic, just the recurrence is short.

"Cool, but who cares? Just some random fact which I wouldn't notice and it wouldn't have any impact on me." — wrong!

We discovered it while solving a task which required calculating the rank of a binary matrix. We wanted to check something, so we just generated big matrix using rand(), which turned out to be a bad decision. As only the first 527 columns were, let's say, independent, the rest were determined by first 527 ones, so the rank didn't exceed 527, and we lost much time just wondering what's happening.

Moreover, if the number of columns is even, the rank is only 496 (for matrices big enough of course). Guess what's the length of the recurrence if we look only at every second value of rand().

Yep, you're right — 496. One can notice that 527 - 496 = 31 and that 31 divides 527! Illuminati!

Here are some useful links: 1 2.

If anybody can explain precisely what's happening for even number of columns, it would be great. Also, if anybody knows how does rand() look inside and can say a few words about it, it also would be great.

• +119

| Write comment?
 » 5 years ago, # |   +12 Where's Blogwoosh??? xD
•  » » 5 years ago, # ^ |   +60 Ask Radwoosh.
•  » » 5 years ago, # ^ | ← Rev. 2 →   +175 Probably gone, just like the black R.
 » 5 years ago, # |   +63 Have noone ever told you to never use rand()?
 » 5 years ago, # |   +8 Wow, rand() is getting worse ...
•  » » 5 years ago, # ^ |   +74 rand() is not getting worse, Berlekamp-Massey is getting better!
 » 5 years ago, # |   +20
•  » » 5 years ago, # ^ | ← Rev. 2 →   -12 This has nothing to do with the blog.EDIT: it does, my bad.
•  » » » 5 years ago, # ^ | ← Rev. 2 →   +40 It does. The post contains the exact implementation of rand() that Codeforces uses, which is a linear congruential generator.The compiler Radewoosh is using has a better implementation of rand() than Codeforces, but it's most likely still a linear congruential generator, which means rand() % 2 has a relatively low period.Try mt19937 instead and see what you get.EDIT: this one might actually be a linear-feedback shift register rather than a linear congruential generator, but the point about rand() % 2 having low-quality randomness is still true.
 » 5 years ago, # | ← Rev. 2 →   +200 Yep, you're right — 496. One can notice that 527 - 496 = 31 and that 31 divides 527! Illuminati! Not just that, but 31 also divides 496! Now that cannot be a coincidence.
 » 5 years ago, # | ← Rev. 2 →   -31 Here is a possible simple fix for this problem using the standard library function random_shuffle(). 1
•  » » 5 years ago, # ^ |   0 Easier fix: use rand() % 1000007 % 2
•  » » » 5 years ago, # ^ |   -11 Thanks for the constructive comment; an interesting and much simpler fix, indeed. The previous link includes a pefroamance comparison between both fixes.
 » 5 years ago, # |   0 The GLIBC random number generator: https://www.mathstat.dal.ca/~selinger/random/
 » 5 years ago, # |   -18 I've learned to use a linear congruential generator, with the rand() as a seed. Seems to work pretty well.
 » 5 years ago, # |   +3 The length of the linear recurrence of Xorshift64 is 64, I surprised. It seems to that calculating linear recurrence is weak points of some RNGs. At least it of mt19937 isn't short.https://ideone.com/UxRMg8 (xorshift) https://ideone.com/8ueYut (mt19937) https://en.wikipedia.org/wiki/Xorshift (the code of xorshift64)
 » 5 years ago, # | ← Rev. 2 →   +18 It's more likely to be a feature, but on my machine this code outputs only zeros (on Codeforces it doesn't): for (int i = 0; i < 100; i++) { srand(i); cout << rand() % 7 << " "; } And 0 is the only seed among all seeds from 0 to 100000, such that the first number generated by rand() isn't a multiple of 16807 = 75.This caused me a lot of problems when I was stressing my solution once and wrote something like this: srand(some number); int n = rand() % 7 + 1;