[Educational] Combinatorics Study Notes (2)

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

#	User	Contrib.
1	awoo	161
2	maomao90	160
3	adamant	156
4	maroonrk	153
5	atcoder_official	148
5	-is-this-fft-	148
5	SecondThread	148
8	Petr	147
9	nor	144
10	TheScrasse	142

Hello Codeforces!

Today I'll be writing about what I have learnt about combinatorics, which played, and, in my opinion, will still be playing a important role in both Codeforces and CP (short for competitive programming).

However, combinatorics is such a great subject that made that impossible for me to write it all in one blog. So, this is just the second blog, which is friendly to beginners. If you are interested, please, pay attention to this account and I'll give posts as series for a long term.

If you have found some mistakes in the text, or if you have some questions about the topic, please, leave a comment, and I'll check it weekly and reply. Also, if you find some grammar mistakes, a kind comment will be also welcomed.

Previous Blogs

Combinatorics (1)

Blogs in the future

If finished, I'll link them.

Combinatorics (3)
Bitmask
Probabilities

Content

Quick power
Fermat's little theorem
extend-gcd
The multiplicative inverse of an integer
Prework optimize
homework

Part 0 — Pre-knowledge

In order not to make this blog too long (it's already very long now xD), I won't explain these things anymore. You can find them on wikis or google them.

What Combinatorics (1) mentioned.
CP basics.
Reading simple C++ code.

Part 1 — Quick power

How to calculate the value of $$$a^b\bmod p$$$?

You may write out this code easily:

long long power(long long a, long long b, long long p) {
	long long res = 1;
	while(b--) res = res * a % p;
	return res;
}

Seems easy, right? This code has $$$\mathcal O(b)$$$ time complexity. But what if $$$1\leq a,b\leq 10^9$$$? Can you solve it?

Here, we use divide ans conquer to solve this:

$$$a^b=\begin{cases}a^{\left\lfloor \frac{b}{2} \right\rfloor} \times a^{\left\lfloor \frac{b}{2} \right\rfloor} & n \text{ is even} \\ a^{\left\lfloor \frac{b}{2} \right\rfloor} \times a^{\left\lfloor \frac{b}{2} \right\rfloor} \times a & n \text{ is odd}\end{cases}$$$

According to this, we can write out a code easily:

long long powpow(long long a, long long b, long long p) {
	if(b % 2 == 1) {
		long long r = powpow(a, b / 2, p);
		return r * r % p * a % p;
	} else {
		long long r = powpow(a, b / 2, p);
		return r * r % p;
	}
}

Now certainly the time complexity is $$$\mathcal O(\log b)$$$. But this code contains a recursion. Since the recursion is simple, we can get rid of recursion:

long long quickpow(long long a, long long b, long long p) {
	long long res = 1, nowa = 1;
	while(b) {
		if(b % 2 == 1) res = res * nowa % p;
		nowa = a * a % p;
		a = nowa;
		b /= 2;
	}
	return res;
}

Now I'll give a fast version of it without explaining, you can use it in the later implementaions.

long long quickpow(long long a, long long b, long long p) {
	long long res = 1;
	while(b) {
		if(b & 1) res = res * a % p;
		a = a * a % p;
		b >>= 1;
	}
	return res;
}

The time complexity remains the same, but the constant is smaller now. If you still wonders why, you can wait for my bitmask post.

Exercise: Calculate $$$(a\times b) \bmod p$$$, $$$10^9 \leq p\leq 10^{18}$$$.

In C++, long long varible is between $$$-2^{63}$$$ and $$$2^{63}-1$$$, which is about $$$10^{18}$$$. When you do a * b % p, then things below will happen:

$$$a\times b$$$ is calculated first, it can overflow and lead to error.
$$$a\times b\bmod p$$$ is calculated, the result is already wrong.

So multiply it with brute force is impossible, can you think of an $$$\mathcal O(\log b)$$$ solution?

Solution

First, $$$a^b=a\times a\times a\times ... \times a$$$, it can be solved with divide and conquer.

Then, $$$a\times b=a+a+a+\dots +a$$$, it's similar. So, it can be also solved with divide and conquer.

Just change the quick power code:

long long mul(long long a, long long b, long long p) {
	long long res = 0;
	while(b) {
		if(b & 1) res = (res + a) % p;
		a = (a + a) % p;
		b >>= 1;
	}
	return res;
}

You may wonder, if we divide it into $$$2$$$ parts, why don't we divide it into $$$k$$$ parts?

Actually, the time complexity is $$$\mathcal O(k\times\log_{k} n)$$$.

Look at the image we draw out $$$y=k+k\times \log_{k}n$$$:

And we draw out $$$\color{green} {x=2}$$$ and $$$\color{red}{x=3}$$$ when $$$n$$$ is very large:

So we find that $$$k=2,3$$$ is optimal. But $$$k=3$$$ uses $$$3$$$ if sentences, this lead to larger constant than expected, so we use $$$k=2$$$ instead.

Part 2 — Fermat's little theorem

Here we start with a simple lemma, that is given here without a proof:

$$$\textbf{Lemma: } a^{p-1}\equiv 1\pmod p, p \text{ is a prime}$$$

Then we have $$$a^{p-2}\equiv \frac{1}{p}\pmod p$$$, so we can do modulus operation to all rational numbers, since $$$\frac{p}{q}=\frac 1q \times p$$$.

Then you may find that we can do division operation modulus a number in $$$\mathcal O(\log p)$$$ time with quick power!

long long div(long long d, long long p) {
	return quickpower(d, p - 2, p);
}

This is very important, since many problems requires you to output the result modulus $$$998244353$$$ or $$$10^9+7$$$, and they are all primes.

Now you can try to calculate single combination in $$$\mathcal O(n\log p)$$$ time.

Part 3 — Extend-gcd

First, let's review the Euclidean algorithm to calculate gcd:

int gcd(int a, int b) { if(a == 1) return b; return gcd(b % a, a); }

For example:

$$$\gcd(66,26)$$$
- $$$66\div 26=2\cdots\cdots 14$$$
- $$$26\div14=1\cdots\cdots 12$$$
- $$$14\div12=1\cdots\cdots 2$$$
- $$$12\div2=6\cdots\cdots 0$$$
- $$$\therefore\gcd(66,26)=2$$$

Let's undo the progress:

$$$\gcd(66,26)=2$$$
- $$$14=66-26\times 2$$$
- $$$12=26-14\times 1$$$
- $$$2=14-12\times 1$$$
- $$$0=12-2\times 6$$$

So easily we get:

$$$2=14-12\times 1$$$
$$$\;\;\;=14-(26-14\times 1)\times 1$$$
$$$\;\;\;=(66-26\times 2)-(26-(66-26\times 2)\times 1)\times 1$$$
$$$\;\;\;=2\times 66+(-5)\times 26$$$

Then you find that, we get a particular solution of the equation $$$ax+by=\gcd(a,b)$$$ when $$$a,b$$$ is given. This progress is called Extend Euclidean algorithm, also Extend GCD. For example, when $$$a=66,b=26$$$, one of the possible solutions is $$$(x,y)=(2,-5)$$$.

Ok, then we can write out the algorithm:

long long exgcd(long long a, long long b, long long &x, long long &y){
	if(b == 0){
		x = 1;
		y = 0;
		/*
		* When b = 0, ax = gcd(a, 0) = a. So x = 1.
		* The func returns the value of gcd, since a, b is not connected to x, y.
		* But x, y is connected to a, b.
		* Some also writes exgcd in the type of void,
		* But in both versions,
		* your varible x and y will be turned into one of the solutions of the equation.
		*/
		return a;
	}
	long long res = exgcd(b, a % b, y, x);
	y -= (a / b * x);
	return res;
}

But what's this used for? Let's return to the question we discussed in Part 2, to solve the congruence equation $$$x\equiv \frac{1}{a}\pmod p$$$ when $$$p$$$ is a prime, is to solve the equation $$$ax \equiv 1 \pmod p$$$, which means $$$ax + py = 1$$$. Because $$$p$$$ is a prime, therefore, $$$\gcd(p,a)=1$$$. So $$$ax+py=\gcd (a,p)$$$, then you only need to do exgcd(a, p, x, y).

Show you the code to do division operation in another way:

int div(int x, int p) {
	exgcd(x, p, x, *new int);
	return(x % p + p) % p;
}

Time complexity proof:

First the time complexity of Exgcd is the same as Euclidean algorithm, it's trivial.

Then we find two situations when we do Euclidean algorithm:

$$$a<b$$$, and we have $$$\gcd(a,b)=\gcd(b,a)$$$.
$$$a\ge b$$$, and we have $$$\gcd(a,b)=\gcd(b, a\bmod b)$$$, and when doing modulus, the new $$$a$$$ will be at most $$$\left\lfloor\frac{a}{2}\right\rfloor$$$ now. If the worst, every time exactly $$$\left\lfloor\frac{a}{2}\right\rfloor$$$, and the time complexity is $$$\mathcal O(\log a)$$$.

We find that the two conditions take turns to appear, so finally the complexity is $$$\mathcal O(\log a)$$$. What's more, when $$$a,b$$$ is adjacent in a fibonacci sequence for example $$$8,13$$$. it reaches the limit.

Note:

There is another way to implement exgcd (but not very common) without proof getting rid of recursion.

int exgcd(int a, int b, int &x, int &y) {
	int x1 = 1, x2 = 0, x3 = 0, x4 = 1;
	while (b != 0) {
		int c = a / b;
		x1 = x3, x2 = x4, x3 = x1 - x3 * c, x4 = x2 - x4 * c, a = b, b = a - b * c;
	}
	x = x1, y = x2;
	return a;
}

Part 4 — The multiplicative inverse of an integer

The inverse $$$e$$$ of an integer $$$a$$$ under the meaning of operation $$$*$$$, is defined as:

We note $$$o$$$ as the identity element.
- The identity element is defined as $$$\forall a, a*o=a$$$. When $$$*=+$$$, $$$o=0$$$, when $$$*=\times$$$, $$$o=1$$$, and so on.
- This does not stop here, but if we have to discuss it more formally, it will be related to the group theory, which is another tremendous branch of the combinatorics.
$$$e*a=o$$$.

For example, the inverse of an integer $$$7$$$ under the meaning of operation addition is $$$-7$$$.

Also, the inverse of an integer $$$7$$$ under the meaning of operation multiplication is $$$\frac{1}{7}$$$.

So, the inverse of an integer $$$7$$$ under the meaning of operation multiplication modulus $$$5$$$ is $$$7^{5-2}=343$$$.

According to the definition, the inverse of an integer $$$x$$$ under the meaning of operation multiplication modulus $$$p$$$ when $$$p$$$ is a prime, is to calculate $$$a$$$ what we've talked about in the last two parts: $$$a\equiv \frac{1}{x}\pmod p$$$.

To divide a number by $$$x$$$ under the meaning of operation multiplication modulus $$$p$$$, is to multiply the inverse of an integer $$$x$$$ under the meaning of operation multiplication modulus $$$p$$$.

Let's go to the revision straightly:

const int mod = 998244353;
int pw(int a, int b, int p = mod) {
	int res = 1;
	while(b){
		if(b & 1) res = res * a % p;
		a = a * a % p;
		b >>= 1;
	}
	return res;
}
int gcd(int x,int y){return!y?x:gcd(y,x%y);}
int exgcd(int a, int b, int &x, int &y) {
	int x1 = 1, x2 = 0, x3 = 0, x4 = 1;
	while (b != 0) {
		int c = a / b;
		x1 = x3, x2 = x4, x3 = x1 - x3 * c, x4 = x2 - x4 * c, a = b, b = a - b * c;
	}
	x = x1, y = x2;
	return a;
}
int inv(int x, int p = mod) {
	exgcd(x, p, x, *new int);
	return(x % p + p) % p;
	//or pw(x, p - 2, p);
}

Part 5 — Prework optimize

Last post, I talked about the $$$\mathcal O(n^2)$$$ initiation of the combination. Now we can do it in $$$\mathcal O(n+\log p)$$$ time.

Step 1:

We talked about $$$n!=(n-1)!\times n$$$ last time, so we can initiate factorial in $$$\mathcal O(n)$$$ time.

const int N = 10000005, mod = 998244353;
long long fac[N];
void init() {
	fac[0] = 1;
	for(int i = 1; i < N; i++) fac[i] = fac[i - 1] * i % mod;
}

Then call to fac[i] you will get the value of $$$i!$$$ modulus $$$998244353$$$.

Step 2:

Since $$$n!=(n-1)!\times n$$$, so $$$\frac{1}{n!}=\frac{1}{(n+1)!}\times (n+1)$$$. We define ifac[i] as the value of $$$\frac{1}{i!}$$$ modulus $$$998244353$$$.

Then we can dp it but in a reversing way:

const int N = 10000005, mod = 998244353;
long long fac[N], ifac[N];
void init() {
	ifac[0] = fac[0] = 1;
	for(int i = 1; i < N; i++) fac[i] = fac[i - 1] * i % mod;
	ifac[N - 1] = inv(fac[N - 1]);
	for(int i = N - 2; i; i--) ifac[i] = ifac[i + 1] * (i + 1) % mod;
}

Step 3:

$$$\binom{n}{k}=\frac{n!}{k!(n-k)!}$$$, so:

long long binom(int n, int m) {
	return fac[n] * ifac[n - m] % mod * ifac[m] % mod;
}

But to prevent conditions like $$$n<0,n-m<0$$$ or $$$m<0$$$, instead of pending it when calling to the function, we contain it in the function and return value $$$0$$$, it's okay.

So, the final code:

const int N = 10000005, mod = 998244353;
long long fac[N], ifac[N];
void init() {
	ifac[0] = fac[0] = 1;
	for(int i = 1; i < N; i++) fac[i] = fac[i - 1] * i % mod;
	ifac[N - 1] = pw(fac[N - 1], mod - 2);
	for(int i = N - 2; i; i--) ifac[i] = ifac[i + 1] * (i + 1) % mod;
}
long long binom(int n, int m) {
	if(n < 0 || n < m || m < 0) return 0;
	return fac[n] * ifac[n - m] % mod * ifac[m] % mod;
}

Calling to binom(n, m) to get the value of $$$\binom{n}{m}$$$.

Part 6 — Homework

Task A

Complete the implementation of Task A, B of last time's homework.

Task B

Solve Gym409982D in $$$\mathcal O(n)$$$ time.

Maybe you have enter this first.

Task C

Initiate the array inv[], when $$$inv[i]$$$ denotes to the inverse of the integer $$$i$$$ under the meaning of operation multiplication modulus $$$998244353$$$ in $$$\mathcal O(n)$$$ time.

Task D

We have a string $$$S$$$ with $$$\tt ($$$ and $$$\tt )$$$.

A valid string $$$S$$$ is:

If $$$S$$$ is empty, $$$S$$$ is valid.
If $$$T$$$ is valid, $$$S=\texttt ( T \texttt )$$$ is valid.
If $$$s,t$$$ are valid, $$$S=st$$$ is valid.

For example $$$\texttt{(()(())), ()()()(), ()}$$$ are all valid while $$$\texttt{)(, ()(()}$$$ aren't.

Count the number of valid strings when the length of $$$S$$$ is given.

At last

The second blog is completed, it discussed the implementation. Next blog we will discuss things about maths again, get ready. If you are puzzled with something in the text, you can comment under the blog!

It really takes time, please, give your valuable comments and advise!

Thanks for reading! This series will be written on.

Last but not least: Happy new year!