Boring backstory of this blog
In 300IQ contest 3, there was a problem where you had to count the number of permutations with two disjoint longest increasing subsequences. We VCd this contest while practising for ICPC, and didn't solve this problem during the contest, so I was very interested to see how it could be solved. Turns out the solution uses something called Young diagrams, and unless you already know what they are, the editorial is impossible to understand.
I asked if someone knew about Young diagrams on the competitive programming Discord, and was linked a paper from the Chinese IOI selection camp, written by yfzcsc. If you can read Chinese, you should just read the paper; it covers a lot more than I do here. With the help of my teammate YaoBIG I believe I understand enough to write an English-language blog on this topic for the rest of us :)
Definitions
We call a sequence $$$\lambda$$$ of positive integers a partition of $$$n$$$ if $$$\lambda_i \geq \lambda_{i + 1}$$$ for all $$$i$$$ and $$$\sum_{i = 1}^{|\lambda|} \lambda_i = n$$$. We denote by $$$P_n$$$ the set of partitions of $$$n$$$.
For a partition $$$\lambda$$$, we define the Young diagram $$$Y(\lambda)$$$ as the set of pairs $$$(i, j)$$$ of positive integers such that $$$i \leq |\lambda|$$$ and $$$j \leq \lambda_i$$$. We call the pairs $$$s = (i, j) \in Y(\lambda)$$$ the squares of the diagram.
A normalised Young tableau $$$P$$$ of size $$$n$$$ is a pair $$$(\lambda, p)$$$ of a partition $$$\lambda$$$ of $$$n$$$, and a permutation $$$p$$$ indexed by $$$Y(\lambda)$$$, such that $$$p_{i, j} < \min(p_{i + 1, j}, p_{i, j + 1})$$$ whenever those cells lie in $$$Y(\lambda)$$$.
We use the natural definitions of transposes: we denote by $$$\lambda^T$$$ the transpose partition, that is $$$\lambda^T_j = |\{i : \lambda_i \geq j\}|$$$, let $$$Y^T(\lambda) = Y(\lambda^T)$$$, and define the transpose $$$P^T = (\lambda^T, p^T)$$$ of a normalised Young tableau by $$$p_{i, j} = p^T_{j, i}$$$.
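As a quick sanity check of the definition, here is a minimal sketch (the function name is my own) computing the transpose partition directly from $$$\lambda^T_j = |\{i : \lambda_i \geq j\}|$$$:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Computes the transpose partition lambda^T, where lambda^T_j = |{i : lambda_i >= j}|
// (columns are 0-indexed here, so column j corresponds to j + 1 in the definition)
vector<int> transposePartition(const vector<int>& lambda) {
    vector<int> t(lambda.empty() ? 0 : lambda[0]);
    for (int j = 0; j < (int)t.size(); ++j)
        for (int x : lambda)
            if (x > j) ++t[j]; // row of length x covers 0-indexed columns 0..x-1
    return t;
}
```

Transposing twice gives back the original partition, which is an easy way to test it.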
What are Young tableaus and why are they interesting
Why is it interesting to define Young tableaus? It turns out that there is a one-to-one mapping $$$f$$$ from permutations $$$\pi$$$ of length $$$n$$$ to pairs $$$(P, Q) = ((\lambda, p), (\lambda, q))$$$ of normalised Young tableaus of size $$$n$$$ that share the same partition. The mapping has the following properties:
- $$$\sum_{i \leq m} \lambda_i$$$ is the maximum total size of $$$m$$$ disjoint increasing subsequences of the permutation
- $$$\sum_{j \leq m} \lambda_j^T$$$ is the maximum total size of $$$m$$$ disjoint decreasing subsequences of the permutation
- If $$$\pi^{-1}$$$ is the inverse permutation, then $$$f(\pi^{-1}) = (Q, P)$$$
- If $$$rev(\pi)$$$ is the reversed permutation, then $$$f(rev(\pi)) = (P^T, ??)$$$ (actually, the second component is the normalised result of applying the Schützenberger algorithm to $$$Q$$$, but we don't care about that)
- The mapping can be computed in $$$\mathcal{O}(n \sqrt{n} \log n)$$$ time. The inverse can be computed in $$$\mathcal{O}(n^2)$$$ time
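To illustrate property $$$1$$$ for $$$m = 1$$$, here is a small self-contained sketch (function names are mine) that computes the shape $$$\lambda$$$ by the row insertion described in detail further below, and checks that $$$\lambda_1$$$ equals the length of the longest increasing subsequence:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Computes the shape of the RSK tableau of pi by repeated row insertion,
// keeping only the values (a sketch; the tableau Q is not needed for the shape)
vector<int> rskShape(const vector<int>& pi) {
    vector<vector<int>> rows;
    for (int v : pi) {
        int cur = v;
        for (int i = 0; ; ++i) {
            if (i == (int)rows.size()) { rows.push_back({cur}); break; }
            auto it = upper_bound(rows[i].begin(), rows[i].end(), cur);
            if (it == rows[i].end()) { rows[i].push_back(cur); break; }
            swap(*it, cur); // bump the displaced value to the next row
        }
    }
    vector<int> shape;
    for (auto& r : rows) shape.push_back(r.size());
    return shape;
}

// Brute-force longest increasing subsequence length, O(n^2)
int lisLength(const vector<int>& pi) {
    int n = pi.size(), best = 0;
    vector<int> dp(n, 1);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < i; ++j)
            if (pi[j] < pi[i]) dp[i] = max(dp[i], dp[j] + 1);
        best = max(best, dp[i]);
    }
    return best;
}
```

For example, the identity permutation gives a single row, and a strictly decreasing permutation gives a single column.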
If $$$c(s)$$$ is the number of valid $$$p$$$ for the partition $$$s$$$, the existence of the mapping shows that $$$n! = \sum_{s \in P_n} c(s)^2$$$. By property 3, the number of permutations $$$\pi$$$ such that $$$\pi = \pi^{-1}$$$ is $$$\sum_{s \in P_n} c(s)$$$. The value we want to compute in the problem presented at the start of the blog is $$$\sum_{s \in P_n : s_1 = s_2} c(s)^2$$$.
The number of partitions of $$$75$$$ is small, so we can loop over all of them. It remains to compute $$$c(s)$$$ fast. For this, we can use the hook-length formula, which we can compute in $$$\mathcal{O}(n)$$$ time.
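To make these identities concrete, here is a small brute-force sketch (function names are mine) that enumerates all partitions of a small $$$n$$$, computes $$$c(s)$$$ exactly with the hook-length formula discussed below, and verifies both $$$n! = \sum_{s \in P_n} c(s)^2$$$ and that $$$\sum_{s \in P_n} c(s)$$$ counts involutions:

```cpp
#include <bits/stdc++.h>
using namespace std;
using ll = long long;

// Enumerates all partitions of n with parts at most maxPart, in decreasing order
void genPartitions(int n, int maxPart, vector<int>& cur, vector<vector<int>>& out) {
    if (n == 0) { out.push_back(cur); return; }
    for (int part = min(n, maxPart); part >= 1; --part) {
        cur.push_back(part);
        genPartitions(n - part, part, cur, out);
        cur.pop_back();
    }
}

// c(lambda) via the hook-length formula, exact in long long for small n
ll countTableaus(const vector<int>& lambda) {
    int n = accumulate(lambda.begin(), lambda.end(), 0);
    ll fact = 1, hooks = 1;
    for (int i = 1; i <= n; ++i) fact *= i;
    for (int i = 0; i < (int)lambda.size(); ++i)
        for (int j = 0; j < lambda[i]; ++j) {
            int below = 0; // cells strictly below (i, j)
            for (int k = i + 1; k < (int)lambda.size(); ++k)
                if (lambda[k] > j) ++below;
            hooks *= (lambda[i] - j) + below; // hook length of cell (i, j)
        }
    return fact / hooks; // the product of hook lengths divides n!
}
```

Running this for $$$n = 6$$$ should give $$$\sum c(s)^2 = 720 = 6!$$$ and $$$\sum c(s) = 76$$$, the number of involutions of length $$$6$$$.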
The mapping: Robinson–Schensted–Knuth correspondence
The name is fancy, but the mapping isn't hard (though proofs of its properties are).
Let's first describe how the first tableau $$$P$$$ is computed. To insert a value, we loop over the rows from first to last, and in each row, find the first element that is larger than the value we are inserting. If we found such a value, we swap it with the value we are inserting and proceed to the next row, now inserting the displaced value. Otherwise, we add the value we are inserting to the end of the row and are done.
In every step, the length of exactly one row of the tableau $$$P$$$ increases by $$$1$$$. In step $$$i$$$, we insert $$$i$$$ into tableau $$$Q$$$ at the end of the row whose size increased in the first tableau. This way, both tableaus always have the same shape. By a straightforward induction, the values in the cells of every row and column in both tableaus are strictly increasing. Thus, the mapping is an injection.
The naive way to perform the insertion takes $$$\mathcal{O}(n)$$$ time, thus we can compute the mapping in $$$\mathcal{O}(n^2)$$$ time.
To compute the inverse, we undo the insertions one by one, starting from the last. Let $$$i$$$ be the row such that $$$q_{i, \lambda_i} = n$$$, i.e. the row whose size increased by $$$1$$$ when the last value was inserted. Let $$$v \leftarrow p_{i, \lambda_i}$$$, then delete the cell $$$(i, \lambda_i)$$$ from both tableaus.
Now, if $$$i = 1$$$, we have $$$\pi_n = v$$$ and can recurse to find the rest of $$$\pi$$$. Otherwise, the reason $$$v$$$ entered row $$$i$$$ is that some smaller value displaced it from row $$$i - 1$$$: namely, the largest value in row $$$i - 1$$$ smaller than $$$v$$$. Swap that value with $$$v$$$ and subtract $$$1$$$ from $$$i$$$.
O(n^2) code for the mapping and its inverse
#include <bits/stdc++.h>
using namespace std;
using Tableau = vector<vector<int>>;
// Calculates the Robinson–Schensted–Knuth mapping in O(n^2)
pair<Tableau, Tableau> rskMapping(const vector<int>& pi) {
    Tableau p, q;
    for (int ind = 0; ind < (int)pi.size(); ++ind) {
        int cur = pi[ind];
        bool found = false;
        for (int i = 0; i < (int)p.size() && !found; ++i) {
            // Find the first value in the row larger than cur
            int j = upper_bound(p[i].begin(), p[i].end(), cur) - p[i].begin();
            if (j == (int)p[i].size()) {
                // cur is larger than everything in the row: append it
                p[i].push_back(cur);
                q[i].push_back(ind);
                found = true;
            } else {
                // Otherwise bump the displaced value to the next row
                swap(p[i][j], cur);
            }
        }
        if (!found) {
            p.push_back({cur});
            q.push_back({ind});
        }
    }
    return {p, q};
}
// Calculates the inverse of the Robinson–Schensted–Knuth mapping in O(n^2)
vector<int> rskInverse(int n, Tableau p, Tableau q) {
    vector<int> pi(n);
    for (int ind = n - 1; ind >= 0; --ind) {
        // Find the cell of q containing ind; it is the last cell of its row
        int i = 0, j = 0;
        for (; i < (int)q.size(); ++i) {
            for (j = 0; j < (int)q[i].size() && q[i][j] != ind; ++j) {}
            if (j < (int)q[i].size()) break;
        }
        int cur = p[i][j];
        p[i].pop_back();
        q[i].pop_back();
        // Undo the bumps row by row: cur displaced the largest value smaller than it
        for (--i; i >= 0; --i) {
            j = upper_bound(p[i].begin(), p[i].end(), cur) - p[i].begin();
            swap(p[i][j - 1], cur);
        }
        pi[ind] = cur;
    }
    return pi;
}
For the faster algorithm for computing the mapping, note that computing just the first $$$\lceil \sqrt{n} \rceil$$$ rows of $$$P$$$ can be done in $$$\mathcal{O}(n \sqrt{n} \log n)$$$ time. Using property $$$4$$$, we can compute the first $$$\lceil \sqrt{n} \rceil$$$ columns of $$$P$$$ in the same time. Since there cannot be more than $$$\lceil \sqrt{n} \rceil$$$ rows with at least $$$\lceil \sqrt{n} \rceil$$$ cells, this way we find the entire tableau in $$$\mathcal{O}(n \sqrt{n} \log n)$$$ time. To find $$$Q$$$, we use property $$$3$$$ and then repeat the above.
I am not aware of any way to calculate the inverse mapping in $$$\mathcal{O}(n \sqrt{n} \log n)$$$ time.
O(n sqrt(n) log(n)) code for the mapping
#include <bits/stdc++.h>
using namespace std;
using Tableau = vector<vector<int>>;
// Calculates the first k rows of P in the Robinson-Schensted-Knuth mapping in O(nk log n)
Tableau partialRSK(const vector<int>& pi, int k) {
    Tableau p(k);
    for (int ind = 0; ind < (int)pi.size(); ++ind) {
        int cur = pi[ind];
        for (int i = 0; i < k; ++i) {
            int j = upper_bound(p[i].begin(), p[i].end(), cur) - p[i].begin();
            if (j < (int)p[i].size()) swap(p[i][j], cur); // bump to the next row
            else {
                p[i].push_back(cur);
                break;
            }
        }
        // Values bumped past row k are discarded
    }
    while (!p.empty() && p.back().empty()) p.pop_back();
    return p;
}
// Calculates the Robinson-Schensted-Knuth mapping in O(n sqrt(n) log(n)) time
pair<Tableau, Tableau> fastRSKMapping(vector<int> pi) {
    int n = pi.size(), k = 1;
    while (k * k < n) ++k;
    vector<int> inv_pi(n);
    for (int i = 0; i < n; ++i) inv_pi[pi[i]] = i;
    // First k rows of P and Q (property 3: Q(pi) = P(pi^{-1}))
    Tableau p = partialRSK(pi, k), q = partialRSK(inv_pi, k);
    // First k rows of P(rev(pi)) = P^T (property 4), i.e. the first k columns of P
    reverse(pi.begin(), pi.end());
    reverse(inv_pi.begin(), inv_pi.end());
    Tableau p_cols = partialRSK(pi, k), q_cols = partialRSK(inv_pi, k);
    // Rows below the first k have fewer than k cells, so the columns cover them
    p.resize(p_cols[0].size());
    q.resize(q_cols[0].size());
    for (int i = k; i < (int)p.size(); ++i) {
        for (int j = 0; j < (int)p_cols.size() && (int)p_cols[j].size() > i; ++j) {
            p[i].push_back(p_cols[j][i]);
            q[i].push_back(q_cols[j][i]);
        }
    }
    return {p, q};
}
Unfortunately, the proofs for properties $$$1$$$ to $$$4$$$ are long, involved and very hard to understand (at least for my counting-challenged brain), so I won't put them here. The proof for properties $$$1$$$ and $$$2$$$ for $$$m > 1$$$ is due to Greene. Properties $$$3$$$ and $$$4$$$ appear as theorems 3.2.1 and 4.1.1 here.
Hook-Length Formula
For $$$\lambda \in P_n$$$ and $$$(i, j) \in Y(\lambda)$$$, we define $$$h_{\lambda}(i, j)$$$ as the number of squares in the diagram directly below or to the right of square $$$(i, j)$$$, including the square itself. Formally written, $$$h_{\lambda}(i, j) = (\lambda^T_j - i) + (\lambda_i - j) + 1$$$. The hook-length theorem states that $$$t(\lambda) = c(\lambda)$$$ for
\begin{equation} t(\lambda) = \frac{n!}{\prod_{(i, j) \in Y(\lambda)} h_\lambda(i, j)} \end{equation}
Computing $$$t(\lambda)$$$ is indeed easy to implement in $$$\mathcal{O}(n)$$$ time, and unlike many other results stated in this blog, the hook-length formula's proof is not hard and very beautiful. The below proof is from here.
Code for the hook-length formula
#include <bits/stdc++.h>
using namespace std;
using Tableau = vector<vector<int>>;
using ll = long long;
const ll MOD = (ll)1e9 + 7;
ll modPow(ll a, ll b) { return (b == 0) ? 1 : ((b & 1) ? a * modPow(a, b ^ 1) % MOD : modPow(a*a % MOD, b >> 1)); }
// Calculates the number of standard Young tableau of shape lambda in O(n) using the hook-length formula
ll hookLength(int n, const vector<int>& lambda) {
    // lambda_t[j] = number of rows i (0-indexed) with lambda[i] > j, i.e. the transpose partition
    vector<int> lambda_t(lambda[0]);
    for (int i = 0, j = lambda[0] - 1; i < (int)lambda.size(); ++i) {
        while (j >= 0 && (i + 1 == (int)lambda.size() || lambda[i + 1] <= j)) lambda_t[j--] = i + 1;
    }
    ll num = 1, div = 1;
    for (int i = 1; i <= n; ++i) num = num * i % MOD; // n!
    for (int i = 0; i < (int)lambda.size(); ++i) {
        for (int j = 0; j < lambda[i]; ++j) {
            ll mult = 1 + (lambda[i] - j - 1) + (lambda_t[j] - i - 1); // hook length of (i, j)
            div = div * mult % MOD;
        }
    }
    return num * modPow(div, MOD - 2) % MOD;
}
Proof of the hook-length formula
Denote by $$$\lambda^-$$$ the set of partitions $$$\mu$$$ of $$$n-1$$$ such that $$$Y(\mu) \subseteq Y(\lambda)$$$. These partitions have the same Young diagram as $$$\lambda$$$ with one corner square removed.
Note that $$$c(\lambda)$$$ satisfies the recurrence $$$c(\lambda) = \sum_{\mu \in \lambda^-} c(\mu)$$$ as the maximum value has to occur in some corner cell.
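This recurrence directly gives a (slow) way to compute $$$c(\lambda)$$$, which is handy for cross-checking the hook-length formula on small shapes. A memoised sketch (the function name is mine):

```cpp
#include <bits/stdc++.h>
using namespace std;
using ll = long long;

map<vector<int>, ll> memo;

// c(lambda) via the corner-removal recurrence c(lambda) = sum over mu in lambda^-
ll countRec(const vector<int>& lambda) {
    if (lambda.empty()) return 1;
    auto it = memo.find(lambda);
    if (it != memo.end()) return it->second;
    ll res = 0;
    for (int i = 0; i < (int)lambda.size(); ++i) {
        // The last cell of row i is a corner iff the next row is strictly shorter
        if (i + 1 == (int)lambda.size() || lambda[i + 1] < lambda[i]) {
            vector<int> mu = lambda;
            if (--mu[i] == 0) mu.pop_back();
            res += countRec(mu);
        }
    }
    return memo[lambda] = res;
}
```

For instance, $$$c((2, 1)) = 2$$$, matching the two tableaus $$$\{\{1,2\},\{3\}\}$$$ and $$$\{\{1,3\},\{2\}\}$$$.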
We define a random walk on the Young diagram $$$Y(\lambda)$$$ of the partition. The random walk starts at a uniformly random square of $$$Y(\lambda)$$$, and in every step moves from $$$(i, j)$$$ to a uniformly random square in $$$H_\lambda(i, j) \setminus \{(i, j)\}$$$ where $$$H_\lambda(i, j)$$$ is the set of squares in $$$Y(\lambda)$$$ directly below or to the right of $$$(i, j)$$$. Thus the probability of ending up in square $$$(i', j') \in H_\lambda(i, j) \setminus \{(i, j)\}$$$ is $$$\frac{1}{h_\lambda(i, j) - 1}$$$. When $$$h_\lambda(i, j) = 1$$$, the random walk stops.
After the random walk stops, remove the square it ended up at from $$$Y(\lambda)$$$. We'll show that for every $$$\mu \in \lambda^-$$$, the partition corresponding to the diagram after removing the square is $$$\mu$$$ with probability $$$t(\mu) / t(\lambda)$$$. Since the process always results in some $$$\mu \in \lambda^-$$$, the probabilities must sum to one, so we must then have $$$t(\lambda) = \sum_{\mu \in \lambda^-} t(\mu)$$$. Since $$$t(\mu) = c(\mu) = 1$$$ for the only partition $$$\mu$$$ of $$$1$$$, and $$$t$$$ and $$$c$$$ satisfy the same recurrence, the result follows by induction.
First, note that if $$$Y(\lambda) = Y(\mu) + (i, j)$$$, we have \begin{equation} \frac{t(\mu)}{t(\lambda)} = \frac{1}{n} \prod_{k < i} \frac{h_\lambda(k, j)}{h_\lambda(k, j) - 1} \prod_{k < j} \frac{h_\lambda(i, k)}{h_\lambda(i, k) - 1} = \frac{1}{n} \prod_{k < i} \left(1 + \frac{1}{h_\lambda(k, j) - 1}\right) \prod_{k < j} \left(1 + \frac{1}{h_\lambda(i, k) - 1}\right) \end{equation}
For a random walk $$$((a_1, b_1), \dots, (a_k, b_k))$$$, let $$$A = \{a_1, \dots, a_k\}$$$ be the set of visited row indices, and $$$B = \{b_1, \dots, b_k\}$$$ be the set of visited column indices. Since the square the walk ends up in is $$$(a_k, b_k) = (\max A, \max B)$$$, the pair $$$(A, B)$$$ uniquely determines where the walk ends up (though it does not uniquely determine the walk).
By $$$p(A, B | a, b)$$$, denote the probability that a random walk starting at $$$(a, b)$$$ has the given sets $$$A$$$ and $$$B$$$. If $$$\alpha = \max A$$$ and $$$\beta = \max B$$$, we claim that \begin{equation} p(A, B\ | a, b) = \prod_{i \in A \setminus \{\alpha\}} \frac{1}{h_\lambda(i, \beta) — 1} \prod_{j \in B \setminus \{\beta\}} \frac{1}{h_\lambda(\alpha, j) — 1} \end{equation} this is a easy proof by induction on $$$|A| + |B|$$$: if $$$A = \{a_1, \dots, a_k\}$$$ and $$$b = \{b_1, \dots, b_k\}$$$, $$$a_1 = a, b_1 = b$$$ and $$$a_k = \alpha, b_k = \beta$$$, \begin{equation} p(A, B\ | a, b) = \frac{1}{h_\lambda(a, b) — 1} \left(p(A \setminus \{a\}, B | a_2, b) + p(A, B \setminus \{b\} | a, b_2)\right) \end{equation} which by induction equals \begin{equation} \frac{1}{h_\lambda(a, b) — 1} \left((h_\lambda(a, \beta) — 1) + (h_\lambda(\alpha, b) — 1)\right)\prod_{i \in A \setminus \{\alpha\}} \frac{1}{h_\lambda(i, \beta) — 1} \prod_{j \in B \setminus \{\beta\}} \frac{1}{h_\lambda(\alpha, j) — 1} \end{equation} We have $$$h_\lambda(a, b) - 1 = (h_\lambda(a, \beta) - 1) + (h_\lambda(\alpha, b) - 1)$$$, thus the claimed equation for $$$p(A, B\ | a, b)$$$ holds.
Now, we are done, as \begin{equation} \sum_{a \leq \alpha, b \leq \beta} \frac{1}{n} \sum_{\substack{A \subseteq [n] \\ \min A = a \\ \max A = \alpha}} \sum_{\substack{B \subseteq [n] \\ \min B = b \\ \max B = \beta}} p(A, B \mid a, b) = \frac{1}{n} \prod_{k < \alpha} \left(1 + \frac{1}{h_\lambda(k, \beta) - 1}\right) \prod_{k < \beta} \left(1 + \frac{1}{h_\lambda(\alpha, k) - 1}\right) \end{equation} You can see this by looping $$$y$$$ from $$$\alpha$$$ to $$$1$$$ and deciding whether $$$y$$$ should be in $$$A$$$, then doing the same for $$$x$$$, $$$\beta$$$ and $$$B$$$. The $$$\frac{1}{n}$$$ factor comes from the random selection of the starting square.
The above proof of the hook-length formula also gives a way to sample a uniformly random normalised Young tableau of a given shape $$$\lambda$$$, though I doubt that will ever be useful in competitive programming.
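For completeness, here is a sketch of that sampler (the hook walk from the proof, run once for each value from $$$n$$$ down to $$$1$$$; the function name is mine):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Samples a uniformly random normalised Young tableau of shape lambda
// using the hook walk: the walk stops at a corner distributed as t(mu)/t(lambda),
// so placing n, n-1, ... at the removed corners gives a uniform tableau
vector<vector<int>> sampleTableau(vector<int> lambda, mt19937& rng) {
    int n = accumulate(lambda.begin(), lambda.end(), 0);
    vector<vector<int>> p(lambda.size());
    for (int i = 0; i < (int)lambda.size(); ++i) p[i].assign(lambda[i], 0);
    for (int v = n; v >= 1; --v) {
        // Start the walk at a uniformly random square of the current diagram
        int m = accumulate(lambda.begin(), lambda.end(), 0);
        int t = uniform_int_distribution<int>(0, m - 1)(rng);
        int i = 0;
        while (t >= lambda[i]) t -= lambda[i++];
        int j = t;
        // Hook walk: move to a uniformly random square strictly below or to the right
        while (true) {
            int right = lambda[i] - j - 1, below = 0;
            for (int k = i + 1; k < (int)lambda.size() && lambda[k] > j; ++k) ++below;
            if (right + below == 0) break; // hook length 1: the walk stops at a corner
            int s = uniform_int_distribution<int>(0, right + below - 1)(rng);
            if (s < right) j += 1 + s;
            else i += 1 + (s - right);
        }
        // Remove the corner and place the largest value not yet placed there
        p[i][j] = v;
        if (--lambda[i] == 0) lambda.pop_back();
    }
    return p;
}
```

For shape $$$(2, 1)$$$ this should output each of the two tableaus with probability $$$\frac{1}{2}$$$.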
Problems
300IQ contest 3 problem D
GP of Southeastern Europe 2021 problem D
If you know any more problems where this technique can be applied, please share a link with me. In particular, there probably exists a problem where you have to find for every $$$m$$$ the maximum total size of $$$m$$$ disjoint increasing subsequences.