So, I ended up making a talk on Sprague-Grundy as part of a club at my school, and thought that the denizens of codeforces would appreciate it. As follows is roughly what was on the handout, which goes over Sprague-Grundy along with the application of its theory to Nim. As this is my first blog post on Codeforces, please do let me know if there are any mistakes in formatting.

I hope you find this interesting, as writing and explaining this helped me understand how to apply it to problems. This handout was written as more math-oriented, so it may be a bit heavy. The exercises are left unsolved, as they are intended to be worked through by the reader.

If you just want to get to what the Sprague-Grundy is defined as, search for "The Sprague-Grundy function of a game" after reading how games were defined.

# Way too much theory

## A necessary definition of games

For this talk, a game will be a two-player sequential game of perfect information, where a player who cannot move loses. Any position in the game can be represented as the set of positions to which a player can move. An empty set is thus a losing position, and a losing game is a game wherein regardless of what the current player does, they will lose. A winning game is a game that is not a losing game.

### A casual definition of a Nim heap

A heap in nim is a game where any positive number of stones can be removed from the heap on any turn.

### Formal definition of a Nim heap

We want to represent a Nim heap as a game by the definition above. Let us define *Nim*_{k} as the game representation. Since we cannot make a move on an empty heap, *Nim*_{0} = {}.

Note that

Examine why the following recursive definition meets the definitions defined above:

While this definition isn't apparently useful, it shows some of how the structure of the set notation of games can get really large for relatively simple games.

## Definition of XOR

Some more definitions need to be examined before we can get to the guts of this theorem. The exclusive or, *xor*, takes two nonnegative integers *a* and *b* and results in the nonnegative integer *n* where the *k*-th bit in the binary representation of *n* is 1 if and only if it is 1 in exactly one of *a*, *b*. An important thing to notice is that , where & is defined similar to , except the bit is 1 if and only if both bits are 1. We can see this by noting that shifting left in binary is equivalent to multiplying by 2, and in this way we have performed the traditional addition algorithm, as the *a*&*b* represents the locations where carries occur, and represents where carries do not need to occur.

## The Mex function

Let us define the Mex, minimum excludant, function as , where *N* is the natural numbers (nonnegative integers). The definition is as follows: Equivalently stated, *M*(*S*) is the least nonnegative integer *x* such that *x* is not in *S*.

## Properties of the mex function, with XOR

Let be the scalar XOR operator, which takes a set and XORs every element of it with an integer to obtain another set. Additionally, let . Then, let *C* be .

Lemma: .

Proof: We shall prove this by first proving that cannot be in *C*, then we shall prove that every nonnegative integer less than is in *C*.

Assume for contradiction that is in *C*, then or . Assume WLOG, that it is at least in , then for some . Thus, , which is a contradiction. Therefore, is not in *C*.

Now, we must prove that all nonnegative integers less than are in *C*. Let . Let . *x* cannot be zero as . Consider the largest bit that is set in *x*. Give it value *k*. Assume for contradiction that this bit is set in *z*. Then, , as the most significant bit that differs is larger in *z*. This is a contradiction. Thus, this bit must be set in either *M*(*A*) or *M*(*B*). Without loss of generality, assume *k* is set in *M*(*A*). Then, since *k* is *x*'s most significant bit, 2*k* > *x*. Hence, , so . Let . Then, , as desired.

Therefore, since every nonnegative integer less than is in *C*, and is not in *C*, is thus the mex of *C*.

## The Sprague-Grundy function of a game

### Definition

Now, consider a state *S* in the set notation of a game. Then, define the Sprague-Grundy function as follows:

### Sprague-Grundy of Nim

That can be proven through strong induction. Since *Nim*_{0} is the empty game, . Now, if we assume that for all *j* < *k*, then This completes the inductive step, so

### Losing states

Let be the maximum number of moves a game *S* can take. I shall prove inductively on that a game *S* is losing if and only if . Note that if , then there are no moves, so we have a lost game and , so the base case is finished.

The inductive hypothesis assumes that for all , it is true. Consider a game *S* with . If , then, for all possible next states , , from the definition of the Sprague-Grundy function. Thus, since , all turns in this game lead to a winning games for the next player by the inductive hypothesis. Thus, *S* is a losing game.

Now, if , then there is a game *s*' in *S* such that is 0, and since , *s*' is a losing game, and thus *S* is a winning game.

## Combining games

It is necessary to consider how simultaneous play of multiple games work to apply this to NIM, and other NIM-like games.

### Combining two games

Consider two games, *A* and *B*. Let *C*(*A*, *B*) be the combination of these two games, where a move consists of a move in either game. since no moves can be made from either game. We can recursively define the game *C*(*A*, *B*): Theorem: . Proof omitted for brevity, but it can be proven inductively on the value of from the property of the MEX function with respect to .

### Combining more than two games

Let *C*(*a*_{1}, *a*_{2}, ... , *a*_{n}) denote the combination of games *a*_{1}, *a*_{2}, ... , *a*_{n}. Then *C*(*a*_{1}, ... *a*_{n}) = *C*(*C*(... *C*(*C*(*a*_{1}, *a*_{2}), *a*_{3}), ... ), *a*_{n}). The Sprague-Grundy of the combination of these games is thus

## Back to Nim

Typically, a game of Nim is defined as a multiset of heaps of a given size. Combining previous statements, a game of Nim is losing if and only if the XOR of its stack sizes is 0.

## Some bounds on

It may become the case that we want to know the maximum value of the Sprague-Grundy function for the case when studying a game. There are a few easy bounds. We shall mention two. is obvious from the definition of Mex. In addition, , as from any state with Sprague-Grundy value of *x*, there exists a sequence of reachable games Sprague-Grundy values *x*, *x* - 1, *x* - 2, ..., 0, thus , as desired.

# Let's solve some problems

Now that we have a working definition of Sprague-Grundy, let's solve some games with it!

## Simple Exercises

### The simplest subtraction game

Consider the subtraction game, where on any given turn, a player takes away a positive integer that is at most *k* stones from a single heap. Calculate, in closed form, the Sprague-Grundy function of a heap of *n* stones in this game.

### Slightly more complicated

Consider a game where a player is able to replace any heap of size *k* with up to *k* - 1 heaps of any positive size strictly less than *k*. Prove that in this game, the Sprague-Grundy function of a heap of size *k* is 0 if *k* = 1, and 2^{k - 2} otherwise

### One more for the road

Consider a game on heaps, where a turn consists of taking a heap of size *n*, and a positive integer *d* that divides *n*, where 1 ≤ *d* < *n*, and splits it into *n* / *d* heaps of size *d*.

Analyze the Sprague-Grundy function of this game. (Try doing it with just the odd heap sizes first)

## Counting losing positions

Sometimes, it is useful to count the number of losing positions given a bound on the size of a stack. Assume we've already computed the Sprague-Grundy function for all stack sizes that we are allowed.

### 1 stack

If there is only one stack, then the number of losing positions within the bounds is the number of losing positions.

### 2 stacks, order matters

If there are two stacks, then the number of positions gets more complicated. From the theorems proved earlier, a combined position is losing if and only if its *XOR* is 0. Thus, we can just find all pairs of stack sizes where the XOR of their Sprague-Grundys is 0. We can do this by counting how many stacks within the bounds have a Sprague-Grundy value of *x* for all *x*, and summing the square of that count across all *x* will give us the answer.

### 2 stacks, order does not matter

All stack pairs are double counted except for the pairs of two identical stacks, which are all valid, so the answer in this case is half the sum of the previous case and the number of stacks.

### 3 stacks

Sum across all possible set of 3 Sprague-Grundy values that XOR to 0. The third can be computed in terms of the first and second. In the instance where the stacks are ordered by size, i.e., we are asked to count multisets, not ordered tuples, the principle of inclusion-exclusion can be used to finish counting.

## Harder counting (But they need way more stacks!)

Sometimes, one may happen upon a problem asking to count the number of losing positions for a given game with a large number of stacks (10^{12}), modulo some odd number *o*. The number of stacks is too many for a computer to handle by generating all the combinations. For this, we should attempt to calculate the counts of how many games with *k* stacks have any given Sprague-Grundy function. The number with 1 stack is fairly easy to calculate from the Sprague-Grundy function of all stack sizes. In addition, let *F* = 2^{d} be the smallest power of 2 greater than the Sprague-Grundy function of the stacks. It is then the case that the Sprague-Grundy function of any combinations of stacks cannot be at least *F*, as no bit with value at least *F* can be set.

Note that a game with *k* stacks is just a game with *k* - 1 stacks appended to a 1-stack game. Let be the number of valid lists of *k* stacks where the XOR of their Sprague-Grundy is *x*. Then,

.

*i*, *j* are bounded within [0, *F*). This is just XOR-convolution of and , which we will denote as . Since XOR-convolution is associative, we want to actually compute . The number of losing positions is . can thus be calculated in XOR-convolutions by binary exponentiation.

However, a naive implementation of XOR-Convolution takes *O*(*F*^{2}) time per convolution. This could be fast enough for some problems, but when the Sprague-Grundy value gets large, this isn't enough.

A simple example of a game where this becomes a problem is a subtraction game where we can subtract between 1 and 1000000 from any stack, and we are given stack sizes between 1 and 12983567378251, and want exactly 98670634123 stacks.

Calculating the Sprague-Grundy is fast because it is periodic, and even calculating the values within can be done relatively quickly *O*(1000000) due to this periodicity.

However, it is not really fast to perform (2^{20})^{2} = 2^{40} ≈ 10^{12} operations per convolution, which we to perform on the order of tens of times.

We thus need a faster way to perform XOR-convolutions. Note that the space of numbers on which the values operate is equivalent to , where XOR becomes bitwise addition. Thus convolution can be performed with a normal *d*-dimensional FFT. However, adding the dimensionality is unnecessary, as we can perform the FFT along each dimension in linear time with respect to the size of the signal. Convolving Fourier Transformed values takes linear time with respect to the size of the signal, as it just becomes element-wise multiplication.

The following pseudo-code performs the FFT in *O*(*dF*) time. Inverse is, as usual with FFTs, the same, except dividing by 2^{d} afterward, or by 2 on the inside at every step. This reverse step requires an odd modulus, which is why it was mentioned earlier.

```
Begin with D[i] being an array of F elements representing
the values of a function to be transformed.
for i = 0..d:
for j = 0..F:
if the bit with value 2^i is not set in j:
D[j],D[j+2^i] = D[j]+D[j+2^i],D[j]-D[j+2^i]
```

Each convolution then takes *O*(*F*) time, and so we only need time to finish it overall. This is much smaller than , and should be enough to solve it for most problems we would come across, if we can calculate the counts of each value of the Sprague-Grundy quickly.

Note that if the totient function of the modulus is easily calculated, the power to which we take the convolution can be reduced modulo the totient function.

Is it just me or anyone else is not able to make sense of it ?

Sprague and Grundy are turning in their graves, after reading this tutorial.

This topic is difficult, become stable violet to understand it

the topic is not difficult, become stable yellow and it will be easy for you

Getting contribution 101 :

Step 1 : Be >= yellow.

Step 2 : Write a shitty lengthy tutorial which no one will bother to read.

Step 3 : Sit back and let the greens upvote it without even reading it.

I don't think the tutorial was particularly shitty. It definitely is very terse and math-heavy compared to what you normally expect from a Codeforces blog. And I suppose it could have been written in a better way.

But at a certain level you can't just expect everything to be spoon-fed to you. You need to work through not-so-simple texts to learn. I think the blog is valuable as it goes to more theory than just a typical "Here is how to use Spargue-Grundy to solve Nim-like problems. Magic!".

You think it is an easy read beacuse you are already well versed in the topic.

The blog seems more like he's telling himself about the topic rather than explaining it to someone else.

No, I am not well-versed in the topic. Please do not make assumptions.

Also I am not saying (and have never said) that it is an easy read.

However, a mathematical background and being used to reading texts like this is immensely helpful for understanding this blog. And frankly, I think you ought to have some mathematical maturity to understand topics like this.

Also, consider the context. He says this was given as a handout during a talk. He probably took a bit more time, gave more examples etc. when explaining this in person.

"...mathematical maturity" are you serious? I think this topic is affordable to anyone. I can understand it, and I'm not a mathematician.

Seems like I offended a lot of yellows :P

You offend me by going here to complain and not to solve any problems, for example.

Eh, you don't need to be a mathematician. But by "mathematical maturity" I mean you are able to parse mathematical text like this without huge confusion, understand what a definition is, understand what a proof is etc.

You may consider these to be trivial but they are not. A lot of people get very confused and have a lot of difficulty trying to read texts that don't spoon-feed everything.

A winning game is a game that is not a losing game.Wow.

When I wrote this, it was intended as an introduction to the theory of Sprague-Grundy, rather than the application of it, so I feel that this is not really a tutorial. I am currently working on another blog post that should have more code, as well as examples of actually using it, that can act as a stronger introduction into usage of Sprague-Grundy.

The version posted here is an edited version of an accompaniment to a talk to my school's Recreational Math Club, where members research into a topic they are not strongly familiar with, and then deliver a presentation or talk on it to other members. During the talk, we ran out of time before I could get to the section on counting losing positions.

A few questions asked during the talk were as follows:

An example is as follows. Consider two games of the form {{}},{{}}. Their combination is as follows.

C({{}}, {{}}) = {C({}, {{}}),C({{}}, {})} = {{C({}, {})}, {C({}, {})}} = {{{}}, {{}}} = {{{}}}M({0, 1, 2, 4}) = 3M({1, 2, 7}) = 0M({0, 2, 3, 4, 5}) = 1Please let me know if you think any wording is unclear, or would like clarification on any aspects of it.

I am familiar with Sprague-Grundy Theorem and this blog post really helped me with some extra stuff, I don't know why people are offending it. It could be improved by providing extra examples and problems to solve, but overall it's good, thank you Sir for your efforts.

Hi! I thoroughly read your article and here are some of my remarks. I write them on fly, so I may refer to things which will be explained later in the article, but I will leave remarks as is to show what I thought at the moment of reading that particular part.

This is mostly my opinion and I don't claim it to be absolutely objective.

That's not really a formal definition of a

game, since it relies on the termgameitself. What do you mean bygame of perfect informationis somewhat unclear to me. I would suggest to seek for simpler definitions here, based on yours I suppose you're not considering games in which positions may be repeated, thus you may say, thatgamemay be represented by directed acyclic graph in which players move alternatively such that the player whose turn happens in sink looses.Also it is better to say

positionor evenstateinstead ofgamewhen you saywinning gameorlosing game. Or make any reasonable disambiguation betweengameandpositionand not mess them up.You're messing up formal and informal here. What are stones? What do you mean by removal? It probably would be better to say that you're considering the game in which set of states is and you can move from to any such that .

OR if you only use casual terms, it's still doesn't seem to be complete definition, it should be more verbose.

Okay, now I see that

gameisexactlyset ofgameswhich you may pass to the next player. It was not clear at first because there was much of unnecessary stuff in your definition of the game which distracted from the most important part.And I still think it's not the best way to make a formal definition of the game, since it doesn't allow you to define games with possible cycles due to axiom of regularity, even if they don't have draws.

I think is more common for what you defined here, but that's up to you.

You're using symbol without defining it. Although it can be understood from context and is common notation for xor, you should explicitly state that you mean xor by it if you really decided to define xor.

What the hell is ? It's the symbol of Weierstrass's elliptic functions, you clearly meant "Set of all subsets of ", so why not to write here? Okay, you also may mean set of finite subsets of , but it's in turn should probably be written as . Okay, this notation is not very common and maybe vague, but you should at least explain what does mean! Also you should be careful with including 0 in , but at least you explicitly said that it's included.

No! No! No! Why would you go to things like ? It is really common in mathematical texts to define some obscure lemmas before proving the theorem, but please, you're not writing mathematical article, so kindly explain why do you want to prove anything before doing so.

And the way you prove it itself is terrible. You're just checking god-given fact via some technicalities, it's not how you should proof things in tutorials. You could left it as exercise and it still would be better than that.

Okay, I'm not going to deeply dive in what's going on here and I will just believe you that the lemma is true, for now.

This paragraph is concise and simple, I liked it.

For real? I assume that's where you need to use that lemma and you... Proved the lemma and just skipped the proof of the theorem? I don't even have right words to describe how bad it is. I expected from this article to get some nice intuitive notion of why and I didn't get one! Boo! BOO!

Nah, I'm too lazy, not going into it now.

Umgh, what the stack is? I guess, it's the same as nim heap, but you didn't define it earlier.

Wtf I just read... Like, you're saying that

n=n, thanks!Okay, this one seems completely unrelatable to game theory :(

Nope, you can calculate one WHT, raise values in frequency domain to the power and calculate inverse WHT.

Did you mean ?..

To summarize: I like the idea of providing mathematically strict and concise theory, but, in my opinion, you didn't succeed in it. Furthermore, this article is hard to comprehend by ones not familiar with the theory because it's hard to pick out real key ideas:Ais losingAnd it, in my opinion, doesn't contain any new and useful information for experienced ones.

Anyway, I don't want to, in any way, discourage you from writing articles since I understand that your intentions are good and writing is not an easy thing and I eagerly wait for the next article you've promised! :)

P. S.

If the topic is really new to you, I must say bravo for your enthusiasm and bravery. You did good job diving into it and I hope that no criticism will prevent you from continuing doing the right thing!

Thank you for the feedback!

The topic was one among many I crammed before my ICPC regional, so I had an idea of how to use the mex function, but I wanted to help my team understand it, and to further my understanding of it. I still don't have an intuitive understanding of why it works, further than the fact that the mex function holds that property, but I think I've managed to internalize its process and application. I'll try to include the proof that brings it all together in the next post. It was omitted in the handout, but outlined during the talk before we started talking about the problems themselves.

My math background is from Olympiad, and I often just tried to prove things in a manner where I didn't have to worry about moving in and out of large sections of proof. As a result, I have, at times, written proofs in an order such that everything is ready when it's needed. But, I do agree that the ordering was pretty weird. I was just trying to get a proof onto paper, or in this case, into

The was because Wikipedia's Power set article, which I looked at because I forgot the notation, had 5 different notations for it, and I just picked one of them.

Thanks for the notes on places where my wording was ambiguous, or not defined. I'll try to improve upon that for the next one.

The counting section was motivated by a few problems on ProjectEuler.

The very last part was something I found interesting that came from usage of the results of the Sprague-Grundy, where we could speed up finding the number of losing games, when someone was feeling sadistic with the problem limits. I did manage to find a problem or two where I could apply this technique when I was either testing my understanding, and trying to bring some problems together for the next post.

I think there was a problem forum on ProjectEuler where the message board for solvers had a challenge to find the answer (modulo 1000000007) when heap sizes were up to 10

^{1234321}, and there were 10^{123456789987654321}heaps, for a specific heap-based nimmish game, where we could count how many heap sizes in a given range had a certain Sprague-Grundy very rapidly.I didn't know about the Walsh-Hadamard transform at the time of writing the talk, and so I only knew how to think about it as a discrete FFT on a 2x2x2x2x2x2x...x2 dimensional tensor. These are indeed equivalent though.

I was trying to motivate the transform in a similar manner to how I learned normal FFT, which comes from turning convolution into element-wise multiplication.

What I meant by each convolution taking

O(F) was that after the signals have been put through WHT, convolution only takesO(F). I was thinking in terms of exponentiation's cost per multiplication, wherein we only need to apply WHT once going in, and once going out.I'm hoping the next one will be more interesting from the code side of things, but I'm trying to find more problems to include. I'm currently looking for some that involve more complicated states of play, or where the Sprague-Grundy is an important aspect of the solution, but is only a part of a harder whole.