zscoder's blog

By zscoder, history, 6 weeks ago, In English

Hi everyone,

April Fools is near and as usual we have an April Fools Day Contest on Codeforces this year. In addition to that, I usually try to host some form of mini April Fools Contest for my friends every year (or almost every year).

This year, I am trying to host a bigger April Fools Day contest than usual and I invite everyone to participate! In view of the April Fools contest on Codeforces, the round will begin at 31 March 10pm (GMT+8) and lasts for exactly $$$24$$$ hours (and thus it will end ~35 minutes before the CF April Fools round).

The contest will consist of several unusual tasks, and I hope that everyone will at least find something interesting. The problems will not be sorted by increasing order of difficulty (if the word difficulty is even applicable), so it is highly advisable to read (and try!) all tasks. This round is not an offical Codeforces round and thus is obviously unrated.

There is a Discord server for this contest (you can join it using this link). The Discord server might be required for some tasks. Hope to see you there!

Contest Link: here

UPD: The contest is open to both individuals and teams. However, if you decide to join as part of a team, I encourage you to use the team registration feature on CF or at least submit from just one account.

I would also like to thank gamegame and DreamingLeaf for their help in testing and feedback (which changed one of the tasks).

The contest scoring will be IOI-styled, and there are no time penalties. Most of the tasks have some form of partial scoring. There will be $$$7$$$ tasks in total, with scores $$$100-100-100-100-100-100-200$$$.

UPD 2: Registration should be open now. The contest starts in ~6 hours. Good luck and have fun! (If you haven't joined the discord server, I encourage you to do so :) ).

UPD 3: The contest has just begun! Good luck and have fun. :)

UPD 4:

Here are some statistics.

First to AC:

A: I_love_Hoang_Yen (00:07)

B: googol_S0 (03:42)

C: kefaa2 + antontrygubO_o (00:46)

D: conqueror_of_tourist (00:29)

E: conqueror_of_tourist (01:58)

F: ksun48 + tourist (11:05)

G: N/A

Number of ACs for each problem

A: 45

B: 3

C: 13

D: 27

E: 17

F: 2

G: 0

Number of ACs for Hololive Subtasks

Watame Janken: 10

Friend Fubuki: 33

Mio Yacht: 2

Yandere Rushia: 15

Apex Aqua: 8

Suisei Tetris: 10

All-Rejecting Okayu: 0 (shoutout to tourist+ksun48 and Ari for being the only ones to get $$$>4$$$ points!)

Gambling Pekora: 7

Korone Game: 6

Untitled Subaru Game: 2 (madlads AMO5 and Ari)

Hall of Fame

Okayu Top Scorers

Yacht Top Scorers

Tetris Top Scorers

Rushia Affection Leaderboard

Aqua Points Leaderboard

Okayu Stats

Global Stats

Top 10 Overall in Hololive Task

Top Scorers (Full Score = 800)

  1. ksun48 + tourist: 784 (only team to AC all A-F!)

  2. Golovanov399 AndreySergunin amethyst0: 679

  3. conqueror_of_tourist: 669

  4. kclee2172: 618

  5. googol_S0: 592

  6. Maksim1744: 592

  7. I_love_Hoang_Yen: 556

  8. dorijanlendvaj: 500

  9. Parsa84 Arian.Eft: 500

  10. Ari: 499 (highest score in G!)

Short Hints for all problems

A. Pacman and Power Pellet

B. As Easy As ABC

C. Two Hashes

D. Math Homework

E. Simple Exercise

F. Mystery

G. Can you do the hololive?

Read more »

  • Vote: I like it
  • +247
  • Vote: I do not like it

By zscoder, history, 12 months ago, In English

Welcome to Part 2 of my tutorial on generating functions. The first part focused on introducing generating functions to those without any background in generating functions. In this post, I will demonstrate a few applications of generating functions in CP problems. Let us start with some relatively straightforward examples.

Note: Unless stated otherwise, all computations are done modulo a convenient prime (usually $$$998244353$$$). Also, $$$[n]$$$ denotes the set $$$\{1,2,...,n\}$$$.

Blatant Applications in Counting Problems

Problem. AGC 005 Problem F You have a tree $$$T$$$ with $$$n$$$ vertices. For a subset $$$S$$$ of vertices, let $$$f(S)$$$ denote the minimum number of vertices in a subtree of $$$T$$$ which contains all vertices in $$$S$$$. For all $$$1 \le k \le n$$$, find the sum of $$$f(S)$$$ over all subsets $$$S$$$ with $$$|S| = k$$$.

Constraints: $$$n \le 2 \cdot 10^{5}$$$.


Convolutions of this type appear in countless FFT problems and usually the hard part is to reduce your sums into a form that can be seen as the convolution of two polynomials. However, generating functions are much more powerful than this, as we shall see the next examples.

Problem. The Child and Binary Tree You have a set of positive integers $$$C = \{c_1, c_2, ..., c_{n}\}$$$. A vertex-weighted rooted binary tree is called good if all the vertex weights are in $$$C$$$. The weight of such a tree is the sum of weights of all vertices. Count the number of distinct good vertex-weighted rooted binary trees with weight $$$s$$$ for all $$$1 \le s \le m$$$. Note that the left child is distinguished from the right child in this problem (see the samples for more info).

Constraints: $$$n, m \le 10^{5}$$$


Problem. Descents of Permutations (from Enumerative Combinatorics 1) Call a permutation $$$p_1, p_2, ..., p_n$$$ of $$$[n]$$$ $$$k$$$-good if $$$p_{i}>p_{i+1}$$$ iff $$$k \mid i$$$. Find the number of $$$k$$$-good permutations of length $$$n$$$.

Constraints: $$$n, k \le 5 \cdot 10^{5}, n \ge 3, k \ge 2$$$


Expected Value of Stopping Time

I think this is also a beautiful application of generating functions. The recent problem Slime and Biscuits can be solved using the trick I will demonstrate here (there is a short editorial using this method here). Let's look at a different example.

Problem. Switches There are $$$n$$$ switches, each of which can be on or off initially. Every second, there is a probability of $$$\frac{p_{i}}{S}$$$ that you will flip the state of the $$$i$$$-th switch. The game ends when all switches are off. What is the expected number of seconds the game will last?

Constraints: $$$n \le 100$$$, $$$\sum p_{i} = S$$$, $$$p_{i}>0$$$, $$$S \le 50000$$$


In general, the trick of introducing $$$A$$$ and $$$B$$$ can be used to solve other problems that asks for the first stopping time of some system if you have multiple possible ending states and the time taken to reach from one ending state to another is equal, and the probability to reach an ending state in a fixed amount of moves is easy to compute.

Roots of Unity Filter

Next, we introduce a trick that is more rarely used in competitive programming but nevertheless interesting to learn. The motivation is the following classic problem.

Problem. Roots of Unity Filter Find the sum $$$\displaystyle\sum_{i \equiv r \pmod{m}}\binom{n}{i}$$$ modulo an arbitrary $$$MOD$$$.

Constraints: $$$1 \le n \le 10^{18}, 2 \le m \le 2000, 0 \le r \le n - 1, 10^{8} \le MOD \le 10^{9}$$$


Next, we look at a nice harder example.

Problem. Rhyme Compute the sum of $$$\frac{n!}{x_{1}!x_{2}!...x_{k}!}$$$ over all tuples of positive integers $$$(x_{1},x_{2},...,x_{k})$$$ such that $$$d|x_{i}$$$ and $$$\sum x_{i} = n$$$, modulo $$$19491001$$$ (a prime).

Constraints: $$$n \le 10^{9}, k \le 2000, d \in \{4, 6\}$$$.


Generating Functions that you can't compute "naively"

In some problems, the constraints might be too large to even compute the generating functions you found with FFT or algorithms involving polynomial operations. In this case, you usually need to analyze some special properties of the generating function you are dealing with (and thus it is helpful to recognize some common generating functions).

We start with a relatively easy example.

Problem. Perfect Permutations Find the number of permutations of length $$$n$$$ with exactly $$$k$$$ inversions, modulo $$$10^{9}+7$$$.

Constraints: $$$1 \le n \le 10^{9}$$$, $$$0 \le k \le 10^{5}$$$, $$$n \ge k$$$. There are $$$100$$$ testcases per input file.


Next, we look at a problem that has a natural statement in generating functions, but it turns out that the computation in generating functions is quite tricky. The official editorial has a nice and simpler solution using a magical observation, but to demonstrate the power of generating functions I will show an alternative method (which seems more straightforward and generalizable).

Problem. Sum of Fibonacci Sequence Let $$$d_{n,k}$$$ be defined by the recurrence $$$d_{1,1}=d_{1,2}=1$$$, $$$d_{1,k}=d_{1,k-1}+d_{1,k-2}$$$ for $$$k \ge 3$$$, and $$$d_{n,k}=\displaystyle\sum_{i=1}^{k}d_{n-1,i}$$$ for $$$n \ge 2$$$, $$$k \ge 1$$$.

Compute $$$d_{n,m}$$$ modulo $$$998244353$$$.

Constraints: $$$1 \le n \le 200000$$$, $$$1 \le m \le 10^{18}$$$


To end this section, we conclude with a problem that heavily relies on linear recurrences. Actually, it might be a stretch to call this a generating function problem but I just want to demonstrate the trick of using generating functions to compute linear recurrences which is basically the same as the one shown here.

Problem. Sum Modulo You have a number $$$x$$$ which is initially $$$K$$$. Every second, for $$$1 \le i \le n$$$, there is a probability $$$p_{i}$$$ that you will replace $$$x$$$ with $$$(x - i) \pmod{M}$$$. Find the expected number of moves before the counter goes to $$$0$$$. $$$p_{i}$$$ are given as $$$\frac{A_{i}}{\sum A_{i}}$$$ for some positive integers $$$A_{1}, A_{2}, ..., A_{n}$$$ and your output should be modulo $$$998244353$$$ (and it is guaranteed that you don't have to worry about division by $$$0$$$).

Constraints: $$$1 \le n \le \min(500,M-1)$$$, $$$2 \le M \le 10^{18}$$$, $$$1 \le A_{i} \le 100$$$


Lagrange Inversion Formula

Finally, inspired by this Div. 1 F problem, I looked up on some applications on Lagrange Inversion Formula and found a few examples on it. I would like to end this article by demonstrating a few applications of it.

Some examples here are from Enumerative Combinatorics Volume 2 and some are from jcvb's Chinese paper on generating functions.

The idea of the Lagrange Inversion Formula is that sometimes we want to find the compositional inverse of a function but it is difficult to find. However, the coefficients of this inverse function might have a simpler formula, which we can obtain from Lagrange Inversion Formula.

There are many variants of stating the Lagrange Inversion Formula, so I will show what I think is the most helpful version of it (also given in this comment).

Theorem. Let $$$F(x), G(x)$$$ be formal power series which are compositional inverses (i.e. $$$F(G(x)) = x$$$). Suppose $$$F(0)=G(0)=0$$$, $$$[x^{1}]F(x) \neq 0$$$, $$$[x^{1}]G(x) \neq 0$$$, then

$$$[x^{n}]G(x) = \frac{1}{n}[x^{-1}]\frac{1}{F(x)^{n}}$$$

Also, for any power (or Laurent) series $$$H(x)$$$, we have

$$$[x^{n}]H(G(x)) = \frac{1}{n}[x^{-1}]H'(x)\frac{1}{F(x)^{n}}$$$

Note: Laurent Series can be intuitively seen as the generalization of power series where the powers can go negative.

Intuitively, if you "know" how to compute $$$F(x)$$$, then you can also get the coefficients of the compositional inverse of $$$F(x)$$$. Let's go through a few examples.

Tree Enumeration

Problem. Count the number of labelled trees on $$$n$$$ vertices (number of trees where vertices are labelled).


Number of $$$2$$$-edge connected graphs

Problem. Find the number of labelled $$$2$$$-edge connected graphs on $$$n$$$ vertices. A graph is $$$2$$$-edge connected graphs if it has no bridges, i.e. removing any edge does not disconnect the graph.

Constraints: $$$n \le 3 \cdot 10^{5}$$$


Coefficient of fixed $$$x^{k}$$$ in $$$f(x)^{i}$$$

This is a more of a trick than a specific problem. Let $$$f(x)$$$ be a power series with a compositional inverse ($$$[x^{0}]f(x) = 0$$$, $$$[x^{1}]f(x) \neq 0$$$). We can find the coefficient of $$$x^{k}$$$ (assume $$$k \ge 1$$$) in $$$f(x)^{i}$$$ for all $$$1 \le i \le n$$$ in $$$O(n\log n)$$$ time (assume $$$k = O(n)$$$).

Let $$$ans(i)$$$ denote the answer for fixed $$$i$$$. Instead of looking at $$$ans(i)$$$ as a sequence, let's introduce a new variable $$$u$$$ and consider the OGF

$$$A(u) = ans(0) + ans(1)u + ans(2)u^{2} + ... = \displaystyle\sum_{n \ge 0}[x^{k}]f(x)^{n}u^{n} = [x^{k}]\displaystyle\sum_{n \ge 0}(f(x)u)^{n} = [x^{k}]\frac{1}{1 - uf(x)}$$$.

Since $$$f(x)$$$ has a compositional inverse (say $$$g(f(x)) = x$$$), by Lagrange Inversion formula (with $$$H(x) = \frac{1}{1 - ux}$$$), we obtain

$$$[x^{k}]\frac{1}{1-uf(x)} = \frac{1}{k}[x^{-1}]\left(\frac{1}{1-ux}\right)'\left(\frac{1}{g(x)^{k}}\right) = \frac{1}{k}[x^{k-1}]\left(\frac{1}{1-ux}\right)'\left(\frac{1}{\left(\frac{g(x)}{x}\right)^{k}}\right)$$$.

Note that by Quotient Rule, $$$\left(\frac{1}{1-ux}\right)' = \frac{u}{(1-ux)^{2}}$$$.

Our goal is to rewrite our sum in a clear manner so that we can "read off" the coefficients of $$$u^{i}$$$. We try to change the problem of finding the coefficients of $$$u^{i}$$$ into a problem about purely finding the coefficients of $$$x^{j}$$$ of some function.

The idea is to expand the series

$$$\frac{u}{(1-ux)^{2}} = u(1-ux)^{-2} = u\displaystyle\sum_{i \ge 0}\binom{i+1}{1}(ux)^{i}$$$ (recall how to expand $$$(1-x)^{-2}$$$)

$$$= u^{i+1}\displaystyle\sum_{i \ge 0}(n+1)x^{i}$$$, thus

$$$[x^{k}]\frac{1}{1-uf(x)} = \frac{1}{k}[x^{k-1}]\displaystyle\sum_{i \ge 0}(i+1)x^{i}u^{i+1} \left(\frac{1}{\left(\frac{g(x)}{x}\right)^{k}}\right)$$$

Let's look at $$$ans(i+1)$$$, the coefficient of $$$u^{i+1}$$$. We have

$$$ans(i+1) = [u^{i+1}]\frac{1}{k}[x^{k-1}]\displaystyle\sum_{i \ge 0}(i+1)x^{i}u^{i+1} \left(\frac{1}{\left(\frac{g(x)}{x}\right)^{k}}\right) = \frac{i+1}{k}[x^{k-i-1}]\frac{1}{\left(\frac{g(x)}{x}\right)^{k}}$$$.

Now, our problem reduces to computing the coefficients of one fixed function $$$P(x) = \frac{1}{\left(\frac{g(x)}{x}\right)^{k}}$$$, which we can compute the first $$$n$$$ terms of using the usual polynomial operations! Thus, we can compute $$$ans(i)$$$ for all $$$i$$$ in $$$O(n\log n)$$$ time!

If $$$f(x)$$$ does not have a compositional inverse, it is possible to "adjust" our function $$$f$$$ (create a new function related to $$$f$$$) so that it has a compositional inverse. I leave this as an exercise.

Final Boss: Div 1 F — Slime and Sequences

As the grand finale of this 2-part article, I would like to discuss the recent very difficult Div. 1 F problem which was the inspiration of the entire blog in the first place.

Problem. Slime and Sequences A sequence of positive integers $$$s$$$ is called good if for each $$$k>1$$$ that is present in $$$s$$$, the first occurrence of $$$k-1$$$ (which must exist) must occur before the last occurrence of $$$k$$$. Count the number of times each integer $$$1 \le i \le n$$$ appears over all good sequences of length $$$n$$$.

Constraints: $$$1 \le n \le 100000$$$


If you have any questions or spotted any errors, please tell me in the comments. There are probably more cool applications of generating functions in CP that I am not aware of, so feel free to share them in the comments too. :)

Read more »

  • Vote: I like it
  • +518
  • Vote: I do not like it

By zscoder, history, 12 months ago, In English

Hi everyone! Inspired by the recent Codeforces Round 641, I decided to write an introductory tutorial on generating functions here. I am by no means an expert in generating functions so I will write about what I currently know about them. Retired_MiFaFaOvO has written a really interesting blog here on Codeforces about more advanced applications of generating functions, but I think there is no English tutorial on the basics of this topic yet (or at least on CP sites). Thus, I would like to share about this topic here.

I plan to split this tutorial into two parts. The first part (this post) will be an introduction to generating functions for those who have never learned about them at all, and some standard examples and showcases of generating functions. The second part will be a collection of several applications of generating functions in CP-style problems. If you are already familiar with generating functions, you may just skip to Part 2 directly.

In this part, many examples are from the book generatingfunctionology which I think is also a great introductory material to generating functions.

What are Generating Functions?

Let's say we have a sequence $$$a_0, a_1, a_2, ...$$$. We associate with $$$a$$$ a series $$$A$$$ which "encodes" the terms in $$$a$$$ with its coefficients.

Formally, for a sequence of numbers $$$\{a_{i}\}_{i=0}^{\infty}$$$, we define the ordinary generating function (OGF) of $$$a$$$ to be $$$A(x) = \displaystyle\sum_{i=0}^{\infty}a_{i}x^{i}$$$.

For example, consider the Fibonacci sequence $$$f$$$ with the terms $$$0, 1, 1, 2, 3, 5, 8, …$$$. Then, $$$F(x) = 0 + x + x^{2} + 2x^{3} + 3x^{4} + 5x^{5} + 8x^{6} + …$$$.

You may imagine that you are putting the (infinitely many) terms of the sequence in a line, and assigning a power of $$$x$$$ to each term of the sequence in order. Adding them up, you get an “infinite polynomial” which somewhat encodes the sequence. The nice thing about generating functions is that sometimes the series is nicer to play around with which will sometimes uncover surprising properties of the sequence.

There are other types of generating functions, such as the Exponential Generating Function (EGF) and Dirichlet Generating Function. We will look at some examples of EGF later in this post, but for the next few examples we will focus on OGFs.

Before that, we introduce a simple notation. For a series $$$A(x) = \displaystyle\sum_{n \ge 0}a_{n}x^{n}$$$, we let $$$[x^{n}]A(x) = a_{n}$$$ (i.e. the coefficient of $$$x^n$$$ in $$$A$$$).

Simple Examples of OGFs

Let’s start with a very simple example. What is the OGF of the sequence $$$1,1,1,...,1$$$? By definition, we have $$$A(x) = 1+x+x^{2}+x^{3}+...$$$. Does this series look familiar?

$$$A(x)$$$ is actually a geometric series with common ratio $$$x$$$! Thus, we can use the geometric series formula to write $$$A(x) = \frac{1}{1-x}$$$.

Note: Don’t we need to care about convergence issues like $$$|x|<1$$$? Well, it depends on what you want to do with your series. We will work on formal power series most of the time, which allows us to ignore the issue of convergence. However, this also means that we can’t “substitute” $$$x$$$ as some fixed value without care. For example, when $$$x=2$$$, the series $$$A(x)$$$ diverges but for say $$$x=-\frac{1}{2}$$$, $$$A(x)$$$ converges. In this post, we won’t deal with the analytic properties of the series most of the time. If you really need to substitute values though, the general rule of thumb is that you can do it if the series converges.

We can manipulate generating functions very much like how we would manipulate other algebraic expressions. Here is a classic example.

Example. Consider the sequence $$$f_{n}$$$ defined by $$$f_{0}=0$$$, $$$f_{1}=1$$$ and $$$f_{n}=f_{n-1}+f_{n-2}$$$ for $$$n \ge 2$$$. Find the OGF of $$$f$$$ (we usually denote it with a capital letter, say $$$F$$$).

Clearly, $$$f_{n}$$$ is the $$$n$$$-th Fibonacci number. We will use the recurrence relation to find the OGF of $$$f_{n}$$$.

Firstly, we need to make the terms of the series appear. The easiest way to do this is to multiply the recurrence relation by $$$x^{n}$$$ to obtain $$$f_{n}x^{n} = f_{n-1}x^{n} + f_{n-2}x^{n}$$$.

Next, we sum up the terms on both sides over all valid $$$n$$$ (in this case $$$n \ge 2$$$) to obtain:

$$$\displaystyle\sum_{n=2}^{\infty}f_{n}x^{n} = x\displaystyle\sum_{n=2}^{\infty}f_{n-1}x^{n-1} + x^{2}\displaystyle\sum_{n=2}^{\infty}f_{n-2}x^{n-2}$$$.

This is equivalent to:

$$$F(x) - f_{0}x^{0} - f_{1}x^{1} = x(F(x) - f_{0}x^{0}) + x^{2}F(x)$$$.

$$$\Rightarrow F(x) - x = (x+x^2)F(x)$$$

$$$\Rightarrow F(x)(1-x-x^2)=x$$$

$$$\Rightarrow F(x) = \frac{x}{1-x-x^2}$$$

Thus, we obtain the OGF of Fibonacci numbers.

Let’s see another example of OGF of a common sequence.

Example. The Catalan numbers $$$c_{n}$$$ are defined by $$$c_{0} = 1$$$ and $$$c_{n+1} = \displaystyle\sum_{i=0}^{n}c_{i}c_{n-i}$$$ for $$$n \ge 0$$$. Find the OGF of $$$c_{n}$$$.

Again, our strategy is to multiply a power of $$$x$$$ to both sides and summing up for all $$$n$$$. We obtain:

$$$\displaystyle\sum_{n=0}^{\infty}c_{n+1}x^{n+1} = \displaystyle\sum_{n=0}^{\infty}\sum_{i=0}^{n}c_{i}c_{n-i}x^{n+1} = \displaystyle x\sum_{n=0}^{\infty}\sum_{i=0}^{n}c_{i}x^{i}c_{n-i}x^{n-i}$$$

The LHS is easy to intepret: it is just $$$C(x) - 1$$$.

How do we interpret the RHS? We claim that it is $$$xC(x)^{2}$$$. Consider the expansion of $$$C(x)^2$$$. Which terms contribute to the coefficient of $$$x^{n}$$$? If we look at $$$C(x)^2 = (c_{0}+c_{1}x+c_{2}x^2+...)(c_{0}+c_{1}x+c_{2}x^2+...)$$$, we see that we can only obtain $$$x^{n}$$$ by picking $$$c_{i}x^{i}$$$ from the first bracket and $$$c_{n-i}x^{n-i}$$$ from the second bracket. Hence, the coefficient of $$$x^{n}$$$ in $$$C(x)^2$$$ is $$$\displaystyle\sum_{i=0}^{n}c_{i}c_{n-i}$$$, as desired.

Hence, we have $$$C(x)-1=xC(x)^{2}$$$, which is a quadratic equation in $$$C(x)$$$! Using the quadratic formula, we can obtain $$$C(x) = \frac{1 \pm \sqrt{1 - 4x}}{2x}$$$. Which sign should we choose? If we choose the + sign, then the numerator $$$\rightarrow 2$$$ as $$$x \rightarrow 0$$$, while the denominator $$$\rightarrow 0$$$, so the ratio will become infinite at $$$0$$$. However, $$$C(x)$$$ can be expanded as a power series at $$$x=0$$$, so $$$C(x)$$$ should converge at $$$x=0$$$. Thus, we should choose the minus sign to obtain $$$C(x) = \frac{1 - \sqrt{1-4x}}{2x}$$$ (indeed, by L'Hopital Rule it converges to $$$c_{0}=1$$$ at $$$x=0$$$).

Tip: Try looking for common sequences and see if you can derive the formula for their OGFs from scratch. It is really helpful if you can derive the intuition where you can see the functional equation directly from the recurrence.

OGFs in more than one variable

We don’t have to limit ourselves to one variable. We can have multivariable OGFs. Let’s look at the following simple example.

Example. The binomial coefficients $$$c(n,k)$$$ is defined by the recurrences $$$f(n,0)=1$$$ for $$$n \ge 0$$$, $$$f(0,n)=0$$$ for $$$n \ge 1$$$ and $$$f(n,k)=f(n-1,k)+f(n-1,k-1)$$$ for $$$n,k \ge 1$$$. Find the OGF of $$$f(n,k)$$$.

We define the OGF $$$F(x,y) = \displaystyle\sum_{n \ge 0}\sum_{k \ge 0}f(n,k)x^{n}y^{k}$$$. As usual, we try to relate $$$F$$$ with itself using the recurrence. We have

$$$\displaystyle\sum_{n \ge 1}\sum_{k \ge 1}f(n,k)x^{n}y^{k} = x\displaystyle\sum_{n \ge 1}\sum_{k \ge 1}f(n-1,k)x^{n-1}y^{k} + xy\displaystyle\sum_{n \ge 1}\sum_{k \ge 1}f(n-1,k-1)x^{n-1}y^{k-1}$$$.

Hence, we have

$$$F(x,y) - \displaystyle\sum_{n \ge 0}x^{n} = x(F(x,y) - \displaystyle\sum_{n \ge 0}x^{n}) + xyF(x,y)$$$

$$$\Rightarrow F(x,y) - \frac{1}{1-x} = (x+xy)F(x,y) - \frac{x}{1-x}$$$

$$$\Rightarrow (1-x-xy)F(x,y) = 1$$$

$$$\Rightarrow F(x,y) = \frac{1}{1-x-xy}$$$

From the bivariate OGF, we can deduce some interesting identities in one-variable. For example, we have

$$$F(x,y) = \frac{1}{1-x-xy} = \frac{1}{1 - x(y+1)} = \displaystyle\sum_{k \ge 0}(y+1)^{k}x^{k}$$$

Hence, $$$[x^{n}]F(x,y) = (y+1)^{n}$$$. However, $$$[x^{n}]F(x,y) = \displaystyle\sum_{k=0}^{\infty}f(n,k)y^{k}$$$, so $$$\displaystyle\sum_{k=0}^{\infty}f(n,k)y^{k} = (y+1)^{n}$$$. Note that this gives the same result as binomial theorem on $$$(y+1)^{n} = \binom{n}{0}y^{0}+\binom{n}{1}y^{1}+...+\binom{n}{n}y^{n}$$$.

It is more interesting to look at $$$[y^{k}]F(x,y)$$$ in terms of an OGF in $$$x$$$. We have

$$$F(x,y) = \frac{1}{(1-x)-xy} = \frac{\frac{1}{1-x}}{1-\frac{x}{1-x}y} = \frac{1}{1-x}(1 + \frac{x}{1-x}y + (\frac{x}{1-x})^2y^2 + …)$$$

Comparing coefficients, we obtain $$$[y^{k}]F(x,y) = \frac{x^{k}}{(1-x)^{k+1}}$$$, so using the same argument as before, we have

$$$\displaystyle\sum_{n=0}^{\infty}f(n,k)x^{n} = \frac{x^{k}}{(1-x)^{k+1}}$$$.

This identity is interesting because it allows us to “expand” $$$\frac{1}{(1-x)^{k+1}}$$$ in terms of the OGF of binomial coefficients! In particular, we have $$$[x^{n-k}]\frac{1}{(1-x)^{k+1}} = \binom{n}{k}$$$, so $$$[x^{n}]\frac{1}{(1-x)^{k}} = \binom{n+k-1}{k-1}$$$. This identity is very useful especially when dealing with sums involving binomial coefficients where $$$k$$$ is fixed and $$$n$$$ varies.

Exponential Generating Function

So far we have looked at ordinary generating functions. Now, let me introduce a new type of generating functions called the exponential generating function.

Definition: Let $$$a_{0},a_{1},a_{2},...$$$ be a sequence of numbers. Then, the EGF of $$$a$$$ (say $$$A(x)$$$ is defined as $$$\displaystyle\sum_{i=0}^{\infty}\frac{a_{i}}{i!}x^{i}$$$.

In other words, the EGF is just the OGF but every term $$$a_{i}$$$ is now divided by $$$i!$$$. Why the weird choice of division by $$$i!$$$? The next example will shed some light on this choice.

Example. Let $$$b_{n}$$$ denote the $$$n$$$-th Bell number, which counts the number of ways to partition $$$\{1,2,...,n\}$$$ into disjoint sets. For example, $$$b_{3}=5$$$, because there are $$$5$$$ ways to partition $$$[3]$$$ into sets: $$$123$$$; $$$12$$$, $$$3$$$; $$$13$$$, $$$2$$$; $$$1$$$, $$$23$$$; $$$1$$$, $$$2$$$, $$$3$$$. Find the EGF of $$$b_{n}$$$.

Our first step is to look for a recurrence relation. Suppose you have this as a Div. 2 C problem. What is the simplest dp you can come up with?

We can fix the size of the set containing the element $$$1$$$, say $$$i$$$. Then, there are $$$\binom{n-1}{i-1}$$$ ways to choose the other $$$i-1$$$ elements of the set, and $$$b_{n-i}$$$ ways to partition the remaining $$$n-i$$$ elements. Hence, we obtain the simple recurrence formula

$$$b_{n} = \displaystyle\sum_{i=1}^{n}\binom{n-1}{i-1}b_{n-i} = \displaystyle\sum_{i=0}^{n-1}\binom{n-1}{i}b_{i}$$$ for $$$n \ge 1$$$.

By precomputing binomial coefficients, this is an $$$O(n^2)$$$ dp, which should be sufficient for a Div. 2 C. However, why stop here? Suppose the problem asks you to find $$$b_{n}$$$ for $$$n \le 3 \cdot 10^{5}$$$. Can you still solve this?

The answer is yes. Let’s use our recurrence to find the EGF of $$$b_{n}$$$. Note that

$$$b_{n} = \displaystyle\sum_{i=0}^{n-1}\binom{n-1}{i}b_{i} = \displaystyle\sum_{i=0}^{n-1}\frac{(n-1)!}{i!(n-1-i)!}b_{i}$$$

$$$\Rightarrow n\frac{b_{n}}{n!} = \displaystyle\sum_{i=0}^{n-1}\frac{b_i}{i!}\frac{1}{(n-1-i)!}$$$

$$$\Rightarrow \displaystyle\sum_{n \ge 1}n\frac{b_{n}}{n!}x^{n} = \displaystyle\sum_{n \ge 1} x\displaystyle\sum_{i=0}^{n-1}\frac{b_{i}x^{i}}{i!}\frac{x^{n-1-i}}{(n-1-i)!}$$$

Now we see why EGFs are convenient for us. If our convolutions involve binomial coefficients (which is often the case when we deal with combinatorial objects), then multiplying EGFs kind of “automatically” helps us multiply our terms by a suitable binomial coefficient (more details later).

Back to our problem, we want to write everything in terms of $$$B(x)$$$. RHS is easy, since it is just $$$xB(x)e^{x}$$$ (Recall that $$$\displaystyle\sum_{n \ge 0}\frac{x^{n}}{n!}$$$ is the Maclaurin series of $$$e^x$$$). However, the LHS needs a bit of work, since we have the unfortunate $$$nb_{n}$$$ term instead of the $$$b_{n}$$$ term. To deal with this obstacle, we use a common trick when dealing with formal power series. Let us differentiate B(x), then multiply by $$$x$$$. Verify that if $$$A(x)$$$ is a formal power series with $$$A(x) = a_{0}+a_{1}x^{1}+a_{2}x^{2}+... = \displaystyle\sum_{n \ge 0}a_{n}x^{n}$$$ then $$$xA’(x) = a_{1}x^{1} + 2a_{2}x^{2} + 3a_{3}x^{3} + … = \displaystyle\sum_{n \ge 0}na_{n}x^{n}$$$.

Thus, looking back at our equation, we have

$$$xB’(x) = xB(x)e^{x}$$$, which implies $$$\frac{B'(x)}{B(x)} = e^{x}$$$. If you are familiar with calculus, you will recognize that if we integrate both sides, we get $$$\ln B(x) = e^{x} + c$$$. Since $$$b_{0}=1$$$, $$$B(0)=1$$$ and we have $$$c = -1$$$. Thus, $$$B(x) = e^{e^{x}-1}$$$ is our desired EGF.

So, how to find $$$b_{n}$$$ in faster than $$$O(n^2)$$$ time. The idea is that we can find the first $$$n$$$ terms of $$$e^{P(x)}$$$ in $$$O(n\log n)$$$ time, so we just need to compute the first few terms of our EGF and read off the answer! In this 2-part article, I will omit explaining how to do certain well-known polynomial operations in $$$O(n\log n)$$$ time or $$$O(n\log^{2} n)$$$ time like $$$\sqrt{P(x)}$$$, $$$\ln(P(x))$$$ etc. There are already tutorials written for them (for example cp-algorithms). Hence, I will just quote that we can do those polynomial operations since that is not the main focus of this article.

Algebraic Manipulation of Generating Functions

Here are some common ways to manipulate generating functions and how they change the sequence they are representing. In this section, $$$a_{i}$$$, $$$b_{i}$$$ will represent sequences and $$$A(x)$$$ and $$$B(x)$$$ are their corresponding generating functions (OGF or EGF depending on context which will be stated clearly). As an exercise, verify these statements.


For both OGF and EGF, $$$C(x)=A(x)+B(x)$$$ generates the sequence $$$c_{n}=a_{n}+b_{n}$$$.


For OGF, $$$C(x) = x^{k}A(x)$$$ generates the sequence $$$c_{n}=a_{n-k}$$$ where $$$a_{i}=0$$$ for $$$i<0$$$. For EGF, you need to intergrate the series $$$A(x)$$$ $$$k$$$ times to get the same effect.

For OGF, $$$C(x) = \frac{A(x) - (a_{0} + a_{1}x + a_{2}x^2 + ... + a_{k-1}x^{k-1})}{x^{k}}$$$ generates the sequence $$$c_{n} = a_{n+k}$$$.

For EGF, $$$C(x) = A^{(k)}(x)$$$ generates the sequence $$$c_{n} = a_{n+k}$$$, where $$$A^{(k)}(x)$$$ denotes $$$A$$$ differentiated $$$k$$$ times.

Multiplication by $$$n$$$

For both OGF and EGF, $$$C(x) = xC'(x)$$$ generates the sequence $$$c_{n}=na_{n}$$$.

In general, you can get the new generating function when you multiply each term of the original sequence by a polynomial in $$$n$$$ by iterating this operations (but I do not include the general form here to avoid confusion).


This is really the most important operation on generating functions.

For OGF, $$$C(x)=A(x)B(x)$$$ generates the sequence $$$c_{n} = \displaystyle\sum_{k=0}^{n}a_{k}b_{n-k}$$$.

For EGF, $$$C(x)=A(x)B(x)$$$ generates the sequence $$$c_{n} = \displaystyle\sum_{k=0}^{n}\binom{n}{k}a_{k}b_{n-k}$$$ (verify this!). This is also why EGF is useful in dealing with recurrences involving binomial coefficients or factorials.

Power of Generating Function

This is just a direct consequence of convolution, but I include it here because it is so commonly used.

For OGF, $$$C(x)=A(x)^{k}$$$ generates the sequence $$$c_{n} = \displaystyle\sum_{i_{1}+i_{2}+...+i_{k}=n}a_{i_{1}}a_{i_{2}}...a_{i_{k}}$$$

For EGF, $$$C(x)=A(x)^{k}$$$ generates the sequence $$$c_{n} = \displaystyle\sum_{i_{1}+i_{2}+...+i_{k}=n}\frac{n!}{i_{1}!i_{2}!...i_{k}!}a_{i_{1}}a_{i_{2}}...a_{i_{k}}$$$

Prefix Sum Trick

This only works for OGF, but is useful to know. Suppose want to generate the sequence $$$c_{n} = a_{0}+a_{1}+...+a_{n}$$$. Then, we can take $$$C(x) = \frac{1}{1-x}A(x)$$$.

Why does this work? If we expand the RHS, we get $$$(1+x+x^{2}+...)A(x)$$$. To obtain the coefficient of $$$x^n$$$ which is $$$c_{n}$$$, we need to choose $$$x^{i}$$$ from the first bracket and $$$a_{n-i}x^{n-i}$$$ from $$$A(x)$$$, so summing over all $$$i$$$ gives us $$$c_{n} = \displaystyle\sum_{i=0}^{n}a_{i}$$$.

List of Common Series

Before we delve into applications, I want to compile a short list of series that we will use frequently below. They are

$$$\frac{1}{1-x} = 1 + x + x^{2} + ... = \displaystyle\sum_{n \ge 0}x^{n}$$$

$$$-\ln (1-x) = x + \frac{x^2}{2} + \frac{x^3}{3} + ... = \displaystyle\sum_{n \ge 1}\frac{x^{n}}{n}$$$

$$$e^{x} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + ... = \displaystyle\sum_{n \ge 0}\frac{x^{n}}{n!}$$$

$$$(1-x)^{-k} = \binom{k-1}{0}x^{0} + \binom{k}{1}x^{1} + \binom{k+1}{2}x^{2} + ... = \displaystyle\sum_{n}\binom{n+k-1}{n}x^{n}$$$

Our goal in many problems will be to reduce the EGF or OGF involved in the problem into some composition of functions that we know above.

You can find a more complete list on Page 57 on generatingfunctionology.

Generating Functions in Counting Problems

Generating functions is a powerful tool in enumerative combinatorics. There are so many applications that I can only cover a small fraction of them here. If you are interested in more examples of counting using generating functions, you can try the books generatingfunctionology and Enumerative Combinatorics.

Here, I will show some classical examples of counting problems involving generating functions. In the next post, I will focus on CP problems which utilizes generating functions.

Catalan Numbers, revisited

We have shown before that the OGF of the Catalan numbers is $$$C(x) = \frac{1 - \sqrt{1-4x}}{2x}$$$. Suppose we want to find a closed-form formula for $$$c_{n}$$$. Of course, it is well-known that $$$c_{n} = \frac{1}{n+1}\binom{2n}{n}$$$, but let's pretend we don't know that yet. We want to "expand" our generating function $$$C(x)$$$, but there is a troublesome square root in our way.

This is where the generalized binomial theorem comes to our rescue. Before that, we need to define generalized binomial coefficients.

Definition. Let $$$r$$$ be any complex number and $$$n$$$ be a nonnegative integer. Then, $$$\binom{r}{n} = \frac{r(r-1)...(r-(n-1))}{n!}$$$.

This is the same as the usual binomial coefficients, but now we no longer require the first term to be a nonnegative integer.

Next, we show a special case of the theorem.

Theorem. Let $$$r$$$ be a real number and $$$n$$$ be a nonnegative integer, then

$$$(1+x)^{r} = \displaystyle\sum_{n \ge 0}\binom{r}{n}x^{n}$$$.

The proof is just differentiating the left side $$$n$$$ times and compare the constant term. I leave this as an exercise.

In particular, our mysterious function $$$\sqrt{1-4x} = (1-4x)^{\frac{1}{2}} = \displaystyle\sum_{n \ge 0}\binom{\frac{1}{2}}{n}(-4x)^{n}$$$

$$$= \displaystyle\sum_{n \ge 0}\frac{1}{2} \cdot \frac{-1}{2} \cdot \frac{-3}{2} \cdot ... \cdot \frac{-(2n-3)}{2} \cdot \frac{1}{n!} \cdot (-4)^{n}x^{n}$$$

$$$= 1 + \displaystyle\sum_{n \ge 1}\frac{(-1)^{n-1}(1 \cdot 3 \cdot ... \cdot (2n-3))}{2^{n}} \cdot \frac{(-4)^{n}}{n!} x^{n}$$$

$$$= 1 + \displaystyle\sum_{n \ge 1}-2^{n} \cdot \frac{(2n-2)!}{2^{n-1}(n-1)!}\cdot \frac{1}{n!}x^{n}$$$

$$$= 1 + \displaystyle\sum_{n \ge 1}\frac{-2 \cdot (2n-2)!}{(n-1)!n!}x^{n}$$$

$$$= 1 + \displaystyle\sum_{n \ge 1}-\frac{2}{n} \cdot \binom{2n-2}{n-1}x^{n}$$$.

Hence, $$$C(x) = \frac{1-\sqrt{1-4x}}{2x} = \frac{1}{2x}\left[1 - 1 - \displaystyle\sum_{n \ge 1}-\frac{2}{n} \cdot \binom{2n-2}{n-1}x^{n}\right]$$$

$$$= \displaystyle\sum_{n \ge 1}\frac{1}{n} \cdot \binom{2n-2}{n-1}x^{n-1}$$$

$$$= \displaystyle\sum_{n \ge 0}\frac{1}{n+1}\binom{2n}{n}x^{n}$$$, as desired.

Some problems involving permutations

For a permutation $$$p = (p_{1},p_{2},...,p_{n})$$$, consider the graph formed by the edges $$$i \rightarrow p_{i}$$$. It is well-known that the graph is a union of several disjoint cycles.

Problem. Count the number of permutations of length $$$n$$$ with $$$k$$$ cycles.

These numbers are also called Stirling numbers of the first kind.

Let $$$c_{n} = (n-1)!$$$ be the number of permutations of length $$$n$$$ which is a cycle. Let $$$C(x) = \displaystyle\sum_{n \ge 0}\frac{c_{n}}{n!}x^{n}$$$ denote the EGF of $$$c$$$. Let $$$f_{n}$$$ be our answer and $$$F(x)$$$ be its EGF. The key observation here is that $$$F(x) = \frac{1}{k!}C(x)^{k}$$$.

Suppose for a moment our cycles are labelled from $$$1$$$ to $$$k$$$. For every permutation, label each element with the label of the cycle it is in. Let's fix the length of cycle $$$i$$$ to be $$$a_{i}$$$ (so $$$\sum a_{i} = n$$$). Then, there are $$$c_{a_{i}}$$$ ways to permute the elements in the $$$i$$$-th cycle and $$$\frac{n!}{a_{1}!a_{2}!...a_{k}!}$$$ ways to assign cycle labels to the elements of the permutation. Finally, in our actual problem, the order of cycles doesn't matter, so we need to divide by $$$k!$$$ in the end.

To summarize, the answer is $$$\frac{n!}{k!}\displaystyle\sum_{a_{1}+a_{2}+...+a_{k}=n}\frac{c_{a_{1}}c_{a_{2}}...c_{a_{k}}}{a_{1}!a_{2}!...a_{k}!}$$$. Verify that $$$\displaystyle\sum_{a_{1}+a_{2}+...+a_{k}=n}\frac{c_{a_{1}}c_{a_{2}}...c_{a_{k}}}{a_{1}!a_{2}!...a_{k}!}$$$ is $$$[x^{n}]C(x)^{k}$$$, so $$$F(x) = \frac{1}{k!}C(x)^{k}$$$ (the $$$n!$$$ disappears into $$$F(x)$$$ because we are dealing with EGFs).

Let's assume this is a CP problem and we are asked to find the answer for $$$(n,k)$$$. Then, we can calculate the answer directly using generating functions in $$$O(n\log n)$$$ since we can do exponentiation in $$$O(n\log n)$$$ ($$$P(x)^{k} = \exp(k\ln(P(x)))$$$).

Actually, what we just did is a special case of the more general Exponential Formula. However, I feel that it is easier to understand the Exponential Formula from these specific examples and you should try to understand it intuitively until it becomes common sense.

Problem. Count the number of permutations of length $$$n$$$ such that all cycle lengths are in a fixed set of positive integers $$$S$$$.

We use the same trick as the previous problem, but let $$$c_{i}=0$$$ if $$$i$$$ is not in $$$S$$$.

This time, we need to find $$$[x^{n}]\displaystyle\sum_{k \ge 0}\frac{1}{k!}C(x)^{k} = [x^{n}]\exp(C(x))$$$ because we need to sum over all values of $$$k$$$ (number of cycles), which can also be computed easily.

Problem. Find the expected number of cycles of a permutation of length $$$n$$$.

To compute the expected number of cycles, we count the sum of number of cycles over all permutations of length $$$n$$$. Let $$$g_{n}$$$ denote the sum of number of cycles over all permutations of length $$$n$$$ and $$$G(x)$$$ as the EGF of $$$g$$$. Using the same function $$$C$$$ in the previous problems, we need to find (note the extra factor $$$k$$$ which is the difference between this and the previous examples) $$$[x^{n}]G(x) = [x^{n}]\displaystyle\sum_{k \ge 0}\frac{k}{k!}C(x)^{k} = [x^{n}]C(x)\displaystyle\sum_{k \ge 1}\frac{1}{(k-1)!}C(x)^{k-1} = [x^{n}]C(x)\exp(C(x))$$$

However, $$$C(x) = \displaystyle\sum_{k \ge 1}\frac{(k-1)!}{k!}x^{k} = \displaystyle\sum_{k \ge 1}\frac{x^{k}}{k} = -\ln(1 - x)$$$. Hence, $$$C(x)\exp(C(x)) = -\frac{\ln(1-x)}{(1-x)}$$$.

For $$$n \ge 1$$$, $$$[x^{n}](-\ln(1-x)) = \frac{1}{n}$$$. By the Prefix Sum trick, $$$[x^{n}]\frac{-\ln(1-x)}{1-x} = 1+\frac{1}{2}+...+\frac{1}{n}$$$.

Thus, $$$[x^{n}]G(x) = 1+\frac{1}{2}+...+\frac{1}{n}$$$. Since $$$\frac{g_n}{n!}$$$ is the expected number of cycles of a permutation of length $$$n$$$, our answer is $$$1 + \frac{1}{2}+... +\frac{1}{n}$$$, the $$$n$$$-th Harmonic number!

We see that the exponential trick is viable when we are dealing with items that are made up from smaller pieces.

Stirling Numbers of the Second Kind

Problem. Find the number of ways to partition the set $$$\{1,2,...,n\}$$$ into $$$k$$$ subsets.

These numbers are also called Stirling numbers of the second kind.

Denote the answer by $$$f(n,k)$$$. The trick is to consider the polynomial (also known as deck enumerator) $$$D(x) = \displaystyle\sum_{n \ge 1}\frac{x^{n}}{n!}$$$. What is $$$D(x)^{k}$$$? We have $$$[x^{n}]D(x)^{k} = \displaystyle\sum_{a_{1}+a_{2}+...+a_{k}=n, a_{i} \ge 1}\frac{1}{a_{1}!a_{2}!...a_{k}!}$$$. This sum has a similar combinatorial interpretation as the ones in the previous problems. Let's assume the partition sets are labelled from $$$1$$$ to $$$k$$$. Then, $$$a_{i}$$$ denotes the size of the $$$i$$$-th set and there are $$$\frac{n!}{a_{1}!a_{2}!...a_{k}!}$$$ ways to assign a set to each element (by the multinomial theorem). However, we have counted each partition $$$k!$$$ times, since in our final answer the sets shouldn't be ordered. Thus, $$$k!f(n,k) = n![x^{n}]D(x)^{k}$$$.

Rearranging gives us $$$\frac{f(n,k)}{n!} = \frac{[x^{n}]D(x)^{k}}{k!}$$$. Hence, $$$\displaystyle\sum_{n \ge 0}\frac{f(n,k)}{n!}x^{n} = \frac{D(x)^{k}}{k!}$$$. Introducing the variable $$$y$$$ to correspond to the variable $$$k$$$, we have $$$\displaystyle\sum_{k \ge 0}\displaystyle\sum_{n \ge 0}\frac{f(n,k)}{n!}x^{n}y^{k} = \displaystyle\sum_{k \ge 0}\frac{[D(x)y]^{k}}{k!} = \exp(D(x)y)$$$.

Note: The polynomial $$$H(x,y) = \displaystyle\sum_{k \ge 0}\displaystyle\sum_{n \ge 0}f(n,k)\frac{x^{n}}{n!}y^{k}$$$ is also known as a hand enumerator.

Thus, we have the simple formula $$$H(x,y) = \exp(D(x)y)$$$. Note that $$$D(x) = \displaystyle\sum_{n \ge 1}\frac{x^{n}}{n!} = e^{x} - 1$$$, so we have $$$H(x,y) = e^{(e^{x}-1)y}$$$ (note how similar this is to the EGF of Bell numbers! In fact $$$H(x,1)$$$ is the EGF of Bell numbers (can you see why?)).

To get the answer, we just need to find $$$n![x^{n}y^{k}]H(x,y) = n![x^{n}]\frac{(e^{x}-1)^{k}}{k!}$$$ which you can compute using polynomial operations efficiently.

Graph Counting

Problem. Find the number of vertex-labeled undirected graphs with $$$n$$$ vertices so that each vertex has degree $$$2$$$.

Every such graph must be a union of disjoint cycles (why?). As usual, we start by considering the generating function for one "component" in the item we need to count. Let $$$d_{n}$$$ denote the number of undirected cycles of length $$$n$$$ and $$$D(x)$$$ denote its EGF. Then, $$$D(x) = \displaystyle\sum_{n \ge 3}\frac{(n-1)!}{2n!}x^{n} = \frac{1}{2}\displaystyle\sum_{n \ge 3}\frac{x^{n}}{n} = \frac{1}{2}\left(-\ln(1-x) - x - \frac{x^2}{2}\right)$$$.

Let $$$G(x)$$$ denote the EGF of our answer. Using the same argument as before, we find that $$$G(x) = \exp(D(x))$$$, so we get the formula $$$G(x) = \exp\left(\ln\left(\sqrt{\frac{1}{1-x}}\right) - \frac{x}{2} - \frac{x^2}{4}\right) = \frac{e^{-\frac{x}{2}-\frac{x^2}{4}}}{\sqrt{1-x}}$$$, and you can compute the coefficient of $$$x^{n}$$$ to obtain the answer.

Let's look at a trickier example.

Problem. Find the number of bipartite vertex-labeled graphs with $$$n$$$ vertices.

It is tempting to try a similar approach as the previous problem. We can relate the number of bipartite graphs with the number of connected bipartite graphs. Can we count the number of connected bipartite graphs easily? Unfortunately, it does not seem to be too easy to count.

Instead, let us color each vertex of the graph with red or blue, and count the number of colored bipartite graphs (not necessarily connected). Suppose we choose $$$k$$$ vertices to color red and $$$n-k$$$ to color blue. Then, there are $$$\binom{n}{k}$$$ ways to choose the coloring and $$$2^{k(n-k)}$$$ to choose the edges (since each edge must be between $$$2$$$ components). Thus, the number of colored bipartite graphs on $$$n$$$ vertices is $$$\displaystyle\sum_{k \ge 0}\binom{n}{k}2^{k(n-k)}$$$. Call this number $$$a_{n}$$$ and its EGF as $$$A(x)$$$.

The next step is to relate the number of colored bipartite graphs with the number of colored connected bipartite graphs. Let $$$b_{n}$$$ denote the number of colored connected bipartite graphs on $$$n$$$ vertices and $$$B(x)$$$ be its EGF. Using a similar argument as before, we have the relation $$$A(x) = \exp(B(x))$$$, and thus $$$B(x) = \ln(A(x))$$$.

Returning to our original problem, our next step is to count the number of connected bipartite graphs on $$$n$$$ vertices (call the count $$$c_{n}$$$ and EGF $$$C(x)$$$). However this is easy, since each connected bipartite graph can be colored in exactly two ways (the coloring is fixed once we choose the color of a vertex). Hence, $$$C(x) = \frac{B(x)}{2}$$$.

Finally, let $$$d_{n}$$$ be the number of bipartite graphs on $$$n$$$ vertices and $$$D(x)$$$ be its EGF. Then, we have $$$D(x) = \exp(C(x))$$$ using the exponential argument. Thus, $$$D(x) = \exp(C(x)) = \exp\left(\frac{B(x)}{2}\right) = \exp\left(\frac{\ln(A(x))}{2}\right) = \sqrt{A(x)}$$$, which is a nice formula!

Placing Rooks and PIE

Let's look at a last example which demonstrates the use of the Inclusion-Exclusion Principle (PIE).

Consider a $$$n \times n$$$ chessboard where some cells are colored black and others are colored white. Suppose we magically know the sequence $$$r_{k}$$$, the number of ways to place $$$k$$$ non-attacking rooks on white cells of the chessboard (i.e. no two are on the same row or column, no rooks are on black cells). Let $$$e_{k}$$$ denote the number of ways to place $$$n$$$ non-attacking rooks on the chessboard so that exactly $$$k$$$ of the rooks are on white squares. Can we find $$$e_{k}$$$ in terms of $$$r_{k}$$$?

The trick is that exact conditions are usually harder to count while "at least" conditions are easier to count. For a fixed subset of white cells $$$S$$$, denote $$$N(S)$$$ as the number of ways to place $$$n$$$ non-attacking rooks on the chessboard such that there is at least one rook on each cell in $$$S$$$. Let $$$n_{k} = \displaystyle\sum_{|S| = k}N(S)$$$.

We relate $$$n_{k}$$$ with $$$e_{k}$$$. Consider a subset $$$T$$$ of size $$$t$$$ and a way to place $$$n$$$ non-attacking rooks so that the white cells they occupy is exactly $$$T$$$. Every $$$k$$$-element subset of $$$T$$$ contributes to the sum $$$n_{k}$$$. Thus, we obtain the recurrence $$$n_{k} = \displaystyle\sum_{t \ge 0}\binom{t}{k}e_{t}$$$.

Let $$$N(x)$$$ and $$$E(x)$$$ be the OGFs of $$$n_{k}$$$ and $$$e_{k}$$$. We can derive a simple relation between $$$N(x)$$$ and $$$E(x)$$$. Indeed, we have

$$$N(x) = \displaystyle\sum_{k \ge 0}x^{k}\displaystyle\sum_{t \ge 0}\binom{t}{k}e_{t} = \displaystyle\sum_{t \ge 0}e_{t}\displaystyle\sum_{k \ge 0}\binom{t}{k}x^{k} = \displaystyle\sum_{t \ge 0}e_{t}(x+1)^{t} = E(x+1)$$$. Thus, we have the simple relation $$$E(x) = N(x-1)$$$.

It turns out that $$$n_{k}$$$ is usually much easier to find. In our problem, $$$n_{k} = r_{k}(n-k)!$$$, since we can first choose our set $$$S$$$ as any set of $$$k$$$ non-attacking rooks on white cells, then place the other $$$n-k$$$ rooks in $$$(n-k)!$$$ ways. Thus, we obtain $$$N(x) = \displaystyle\sum_{k \ge 0}r_{k}(n-k)!x^{k}$$$ and $$$E(x) = \displaystyle\sum_{k \ge 0}r_{k}(n-k)!(x-1)^{k}$$$. Hence, we can read $$$e_{j}$$$ from the coefficients of $$$E(x)$$$.

Proving Some Interesting Theorems via Generating Functions

This is not entirely CP related but here are some cool theorems you can prove easily with generating functions.

Partition in odd parts = Partition in distinct parts

A partition of $$$n$$$ into $$$k$$$ parts is a multiset of positive integers of size $$$k$$$ which sum up to $$$n$$$. For example, $$$\{3,1,1\}$$$ is a partition of $$$5$$$ into $$$3$$$ parts. Note that the order of elements do not matter, so $$$\{3,1,1\}$$$ and $$$\{1,3,1\}$$$ are the same partition.

You might have heard of the well-known problem of proving that the number of partitions of $$$n$$$ into odd parts is equal to the number of partitions of $$$n$$$ into distinct parts. Here, we prove a generalization of it.

Problem. Prove that the number of partitions of $$$n$$$ into parts of size not divisible by $$$k+1$$$ is equal to the number of partitions of $$$n$$$ into parts such that there are at most $$$k$$$ parts of each size.

Note that when $$$k=1$$$ we reduce this to the standard problem.

Fix $$$k$$$ and let $$$A(x)$$$ be the OGF of the first object and $$$B(x)$$$ be the OGF of the second object. Observe that choosing a partition is the same as choosing the number of times we use each integer in our multiset, so

$$$B(x) = \displaystyle\prod_{r \ge 1}(1 + x^{r} + x^{2r} + ... + x^{kr})$$$

$$$= \displaystyle\prod_{r \ge 1}\left(\frac{1 - x^{r(k+1)}}{1 - x^{r}}\right)$$$

$$$= \displaystyle\prod_{r \ge 1, (k+1) \nmid r}\left(\frac{1}{1 - x^{r}}\right)$$$

$$$= \displaystyle\prod_{r \ge 1, (k+1) \nmid r}(1 + x^{r} + x^{2r} + ...) = A(x)$$$

Binet's Formula (and solving Linear Recurrences)

Let $$$f_{n}$$$ denote the $$$n$$$-th Fibonacci number (with $$$f_{0}=0$$$, $$$f_{1}=1$$$, $$$f_{n}=f_{n-1}+f_{n-2}$$$). You may have heard of Binet's Formula, which states that $$$f_{n} = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^{n} - \left(\frac{1-\sqrt{5}}{2}\right)^{n}\right]$$$.

This might look very random, but it actually comes directly from the generating function. Recall that $$$F(x) = \frac{x}{1-x-x^2}$$$ is the OGF of $$$f$$$. The trick here is to use partial fractions (we will explore another example in the next part). Let $$$-\gamma_{1}, -\gamma_{2}$$$ be the roots of the equation $$$1-x-x^{2}=0$$$ (where $$$\gamma_{1} = \frac{1+\sqrt{5}}{2}$$$, $$$\gamma_{2} = \frac{1 - \sqrt{5}}{2}$$$). Then, we can write $$$F(x) = \frac{A}{x + \gamma_{1}} + \frac{B}{x + \gamma_{2}}$$$. With some calculation, we can obtain $$$F(x) = \frac{1}{\gamma_{1} - \gamma_{2}}\left(\frac{1}{1 - \gamma_{1}x} - \frac{1}{1 - \gamma_{2}x}\right) = \frac{1}{\sqrt{5}}\left(\displaystyle\sum_{j \ge 0}\gamma_{1}^{j}x^{j} - \displaystyle\sum_{j \ge 0}\gamma_{2}^{j}x^{j}\right)$$$.

Comparing coefficients, we get $$$f_{n} = \frac{1}{\sqrt{5}}(\gamma_{1}^{n} - \gamma_{2}^{n})$$$. Recalling that $$$\gamma_{1} = \frac{1+\sqrt{5}}{2}$$$ and $$$\gamma_{2} = \frac{1 - \sqrt{5}}{2}$$$ gives us Binet's Formula.

Note that this method is generalizable for general linear recurrences.

Probability that a random permutation has no cycle length which is the square of an integer

Problem. What is the probability that a random permutation has no cycle length which is the square of an integer?

This problem seems pretty random (no pun intended), but I include it here to show the power of generating functions.

Firstly, suppose we know that our permutation is of length $$$n$$$ and there are $$$a_{i}$$$ cycles of length $$$i$$$ (so $$$\sum_{i \ge 1} ia_{i} = n$$$). How many such permutations are there? With some simple counting, we can obtain the formula $$$\frac{n!\displaystyle\prod_{i\ge 1}((i-1)!)^{a_{i}}}{\displaystyle\prod_{i\ge 1}(i!)^{a_{i}} \cdot \displaystyle\prod_{i\ge 1}(a_{i}!)} = \frac{n!}{\displaystyle\prod_{i\ge 1}i^{a_{i}} \cdot \displaystyle\prod_{i\ge 1}(a_{i}!)}$$$ (Hint: Assume the cycles are labelled first, and assign the elements into cycles, then arrange the elements within each cycle. Divide some factorials to handle overcounts).

The sequence $$$a = (a_{1},a_{2},...)$$$ defined above is also called the cycle type of a permutation.

Let $$$c(a)$$$ denote the number of permutations of length $$$n = a_{1}+2a_{2}+...$$$ with cycle type $$$a$$$ and $$$p(a)$$$ denote the probability that a permutation of length $$$n = a_{1}+2a_{2}+...$$$ has cycle type $$$a$$$. Hence, $$$c(a) = \frac{n!}{\displaystyle\prod_{i\ge 1}i^{a_{i}} \cdot \displaystyle\prod_{i\ge 1}(a_{i}!)}$$$ and $$$p(a) = \frac{c(a)}{n!}$$$

Now, the trick is to consider the infinite-variable generating function in $$$x_{1},x_{2},...$$$:

$$$C(x,y) = \displaystyle\sum_{n \ge 0}\frac{y^{n}}{n!}\displaystyle\sum_{a_{1}+2a_{2}+...=n,a_{i} \ge 0}c(a)x_{1}^{a_{1}}x_{2}^{a_{2}}...$$$

From our discussion above, we know how to find $$$c(a)$$$, thus we can write $$$C(x,y)$$$ as

$$$C(x,y) = \left(\displaystyle\sum_{a_{1} \ge 0}\frac{(yx_{1})^{a_{1}}}{a_{1}!1^{a_{1}}}\right)\left(\displaystyle\sum_{a_{2} \ge 0}\frac{(y^{2}x_{2})^{a_{2}}}{a_{2}!2^{a_{2}}}\right)... = \exp(yx_{1})\exp\left(y^{2}\frac{x_{2}}{2}\right)... = \exp\left(\displaystyle\sum_{i \ge 1}\frac{y^{i}x_{i}}{i}\right)$$$.

Hence, for a fixed cycle type $$$a = (a_1,a_2,...)$$$, $$$p(a) = [x_{1}^{a_{1}}x_{2}^{a_{2}}...]\exp\left(\displaystyle\sum_{i \ge 1}\frac{y^{i}x_{i}}{i}\right)$$$.

Let us return to our problem. Call a cycle type $$$a$$$ good if $$$a_{j}=0$$$ for all perfect squares $$$j$$$. We want to find $$$\displaystyle\lim_{n \rightarrow \infty}\displaystyle\sum_{a \text{ good}}[y^{n}x_{1}^{a_{1}}x_{2}^{a_{2}}...]\exp\left(\displaystyle\sum_{i \ge 1}\frac{y^{i}x_{i}}{i}\right)$$$. We can "substitute" $$$x_{j}=1$$$ for all non-perfect squares $$$j$$$ to indicate that we don't care about the power of $$$x_{j}$$$ if $$$j$$$ is not a perfect square. so we reduce our problem to finding (noting that $$$a_{j}=0$$$ for all perfect squares $$$j$$$)

$$$\displaystyle\lim_{n \rightarrow \infty}[y^{n}]\exp\left(\displaystyle\sum_{i=z^{2} }\frac{y^{i}x_{i}}{i} + \displaystyle\sum_{i \neq z^{2}}\frac{y^{i}}{i}\right)$$$

$$$= \displaystyle\lim_{n \rightarrow \infty}[y^{n}]\exp\left(\displaystyle\sum_{i=z^{2} }\frac{y^{i}(x_{i}-1)}{i} + \displaystyle\sum_{i \ge 1}\frac{y^{i}}{i}\right)$$$

$$$= \displaystyle\lim_{n \rightarrow \infty}[y^{n}]\exp\left(\displaystyle\sum_{i=z^{2} }\frac{y^{i}(x_{i}-1)}{i} - \ln(1-y)\right)$$$

$$$= \displaystyle\lim_{n \rightarrow \infty}[y^{n}]\frac{1}{1-y}\exp\left(\displaystyle\sum_{i=z^{2}}\frac{y^{i}(x_{i}-1)}{i}\right)$$$

If we let $$$A(y)$$$ be the OGF of $$$a_{n} = [y^{n}]\exp\left(\displaystyle\sum_{i=z^{2}}\frac{y^{i}(x_{i}-1)}{i}\right)$$$, then by the Prefix Sum trick, our limit is equal to $$$\displaystyle\lim_{n \rightarrow \infty}\sum_{i \ge 0}a_{i}$$$ (assuming the sum converges).

Intuitively, we can get the "sum to infinity" by substituting $$$y=1$$$ into $$$\exp\left(\displaystyle\sum_{i=z^{2}}\frac{y^{i}(x_{i}-1)}{i}\right)$$$, and we are only interested in the terms without $$$x_{i}$$$ (for $$$i$$$ a perfect square), so we let these $$$x_{i}=0$$$, to obtain

$$$\displaystyle\lim_{n \rightarrow \infty}[y^{n}]\frac{1}{1-y}\exp\left(\displaystyle\sum_{i=z^{2}}\frac{y^{i}(x_{i}-1)}{i}\right) = \exp\left(\displaystyle\sum_{i = z^{2}}-\frac{1}{i}\right) = e^{-\frac{\pi^{2}}{6}}$$$, which is actually our answer (recall that $$$\displaystyle\sum_{i \ge 1}\frac{1}{i^2} = \frac{\pi^{2}}{6}$$$).

Snake Oil Trick in Proving (or Finding) Combinatorial Identities

To end the first part of this tutorial, I will briefly introduce a trick to simplify combinatorial identities using generating functions. The idea is that instead of dealing with the sum directly, it is easier to deal with the series obtained from the generating functions.

Problem. Find the sum $$$\displaystyle\sum_{k \ge 0}\binom{k}{n-k}$$$ for a fixed positive integer $$$n$$$.

Suppose the answer to our problem is $$$f(n)$$$. The idea is that it is easier to consider the OGF of $$$f$$$, which is $$$F(x) = \displaystyle\sum_{n \ge 0}f(n)x^{n}$$$. We have

$$$F(x) = \displaystyle\sum_{n \ge 0}f(n)x^{n}$$$

$$$= \displaystyle\sum_{n \ge 0}\displaystyle\sum_{k \ge 0}\binom{k}{n-k}x^{n}$$$

Now, we switch summation signs to obtain

$$$= \displaystyle\sum_{k \ge 0}\displaystyle\sum_{n \ge 0}\binom{k}{n-k}x^{n}$$$

The key idea is to make the inner sum easy to compute, and we know how to compute $$$\displaystyle\sum_{n \ge 0}\binom{k}{n-k}x^{n-k}$$$, since it is just $$$\displaystyle\sum_{r \ge 0}\binom{k}{r}x^r$$$ in disguise with $$$r=n-k$$$!

Thus, we factor out $$$x^{k}$$$ to obtain

$$$= \displaystyle\sum_{k \ge 0}x^{k}\displaystyle\sum_{n \ge 0}\binom{k}{n-k}x^{n-k}$$$

$$$= \displaystyle\sum_{k \ge 0}x^{k}(1+x)^{k}$$$

$$$= \displaystyle\sum_{k \ge 0}(x(1+x))^{k}$$$

$$$= \frac{1}{1 - x - x^2}$$$

Do you recognize the last expression? It is actually $$$\frac{1}{x}F(x)$$$ where $$$F(x)$$$ is the OGF of the Fibonacci numbers! Thus, $$$\frac{1}{1-x-x^2} = \displaystyle\sum_{n \ge 0}f_{n+1}x^{n}$$$, and by comparing coefficients we get $$$f(n) = f_{n+1}$$$, the $$$(n+1)$$$-th Fibonacci number!

There are many other similar applications of the Snake Oil Method but I won't go into detail here. In general, this method might be useful in CP if you encounter some math problems and reduce the problem into double or triple summation of binomial coefficients but you need an $$$O(n)$$$ solution. Sometimes, you can forcefully simplify your summations using the Snake Oil method. We will use the trick of introducing a power series and swapping summation signs again to simplify some expressions in the next part of this article.

As an exercise, try to prove the following identity with the Snake Oil method:

$$$\displaystyle\sum_{r=0}^{k}\binom{m}{r}\binom{n}{k-r} = \binom{n+m}{k}$$$ (there is an obvious bijective proof, can you see why this is true?). This identity is very useful in simplifying sums involving binomial coefficients.

This ends the first part of my tutorial on generating functions. The next part will focus more on applications on GFs in CP problems, so stay tuned!

UPD: Part 2 is now available here.

P.S. Let me know if I made any errors or typos in the post (which is likely to happen).

Read more »

  • Vote: I like it
  • +1297
  • Vote: I do not like it

By zscoder, history, 15 months ago, In English

I hope you enjoyed the contest!

Expected problem difficulty is F < A < (G ~ D) < (C ~ E) < B (though it might be different for different people).

I will mainly focus on explaining the full solution to the problems but I will briefly mention how to pass certain subtasks.

Problem A — Leakage

Code (zscoder)
Code (tmwilliamlin168)

Problem B — Confession

Code (zscoder)

Problem C — Isolation

Code (zscoder)

Problem D — Equality

Code (zscoder)

Problem E — Valentine

Code (zscoder)
Code, short version (tmwilliamlin168)

Problem F — Opposition

Code (zscoder)

Problem G — Honeymoon

Code (zscoder)


The characters in the problem statements come from different anime/manga/light novels. Anime/Manga/LN fans of the romance genre (though some of them contain different elements) are recommended to check them out.

Problem A: School Days

Problem B: Tsurezure Children

Problem C: The Empty Box and Zeroth Maria (personal favourite, Light Novel only)

Problem D: Tsuki ga Kirei

Problem E: A Certain Magical Index (good luck watching/reading this series ^_^)

Problem F: Love, Chunibyo & Other Delusions (my favourite Kyoani anime)

Problem G: Yamada-kun and the Seven Witches

Read more »

  • Vote: I like it
  • +175
  • Vote: I do not like it

By zscoder, history, 15 months ago, In English

Hello everyone!

Will you be single and bored during Valentine's Day? Never fear, as zscoder is here to cure your boredom.

I would like to invite you to Valentine's Day Contest 2020, which will take place on Friday, February 14, 2020 at 12:30 GMT. The contest is unofficial and unrated, but the quality of most (if not all) of the problems are comparable to problems from a Codeforces round. I am the author of all problems.

The contest format will be IOI format, which means that each problem is worth $$$100$$$ points, and there are subtasks for each problem. There will be no time penalty. The problems are not sorted in increasing order of difficulty. Unlike IOI, you are allowed to use any templates or notes you have.

There are 7 problems to be solved in 3.5 hours. There is an interactive problem, so feel free to learn about them here.

There will be a special shoutout to the first person to AC for each problem (and also the first person to get all 7 ACs >_<).

The difficulty of the contest is aimed at higher-rated Div. 2 (Expert) to mid-red (low International Grandmaster) level participants but everyone is welcome to join the contest. Of course, if you are not single and are still free to join the contest, you are welcome to join as well. XD

Thanks to the testers Kuroni, tmwilliamlin168, duckmoon99, gamegame, ToxicPie9, dorijanlendvaj, kostia244 and alimq for testing the problems and MikeMirzayanov for the wonderful Codeforces and Polygon systems that made this contest possible.

The contest will be held (tentatively) within a Codeforces group and the link will be posted later.

UPD: The contest will be held as a training contest on Gym. (which will appear later) The contest is now available on Gym. Registration opens $$$6$$$ hours before contest starts.

If you are a coach in Gym, remember to disable coach mode before joining the contest. ^_^

I will be on the AC Discord server to discuss the contest after it ends.

Hope to see you in the contest!

UPD 2: Contest is over! Thanks to everyone who participated and made this Valentine's Day less lonely for me. Congratulations to the top 10:

Rank 1: Radewoosh (with 577 points)

Rank 2: jiangly and kefaa2 (tied with 500 points)

Rank 4: NoLongerRed (with 426 points)

Rank 5: sigma425 (with 409 points)

Rank 6: noneTP (with 351 points)

Rank 7: wygzgyw (with 345 points)

Rank 8: chocorusk (with 340 points)

Rank 9: BigBag (with 334 points)

Rank 10: waynetuinfor (with 326 points)

Also, here is a shoutout to all the "first to AC"s:

Problem A: sigma425 at 00:16

Problem B: Unsolved during contest time :(

Problem C: kefaa2 at 00:25

Problem D: TLE at 00:44

Problem E: TLE at 02:38 (and only AC for E during contest!)

Problem F: csy.wendy.1314 at 00:12

Problem G: shirakami.rin at 00:24

UPD 3: The editorial is here!

Read more »

Announcement of Valentines Day Contest 2020
  • Vote: I like it
  • +564
  • Vote: I do not like it

By zscoder, history, 21 month(s) ago, In English

According to the official site, GCJ finals starts today at 19:30 UTC but the live stream starts at 22:00 UTC. Is 19:30 UTC the start of the practice round or the actual round? Does anyone know about it?

Read more »

  • Vote: I like it
  • +67
  • Vote: I do not like it

By zscoder, history, 3 years ago, In English

Atcoder Grand Contest 022 will be held on Saturday (or Sunday depending on your timezone). However, the time of this contest will be three hours later than the usual time for Atcoder contests, i.e. 12am JST instead of 9pm JST. Thus, please check the contest time carefully here.

I (zscoder) am the writer of this contest and this contest counts for GP30 scores.

Contest Link

Contest Announcement

Duration : 150 minutes (2 hours and 30 minutes, 40 minutes longer than usual)

Scoring Distribution : 300-600-700-1600(1000)-1600-1600

I hope you will enjoy the problems. Although the date of the contest is 1 April 12am JST, it is assured that this contest is not an April Fool joke :)

Let's discuss problems after the contest.

Read more »

  • Vote: I like it
  • +165
  • Vote: I do not like it

By zscoder, history, 4 years ago, In English
Bank Robbery
Cutting Carrot
Naming Company
Labelling Cities
Choosing Carrot
Leha and security system
Replace All

Read more »

  • Vote: I like it
  • +126
  • Vote: I do not like it

By zscoder, 4 years ago, In English

Hi all!

On May 13, 12:35 MSK, Tinkoff Challenge — Final Round will be held. Standings of the official finalists are availiable here.

The authors of the round are me (zscoder, Zi Song Yeoh), AnonymousBunny (Sreejato Kishor Bhattacharya), hloya_ygrt (Yury Shilyaev).

Special thanks to KAN (Nikolay Kalinin) for coordinating the round, winger (Vladislav Isenbaev) and AlexFetisov (Alex Fetisov) for testing the problems. Also, thanks to MikeMirzayanov (Mike Mirzayanov) for the Codeforces and Polygon system.

There are seven problems and the duration is two hours. Scoring will be announced before the round.

Top 20 participants of the Elimination Round will compete in the Tinkoff Office.

The round is rated. Division 1 and Division 2 will have the same problemset with seven problems.

We hope everyone will find interesting problems and get high rating!

UPD : Scoring Distribution : 500 — 1000 — 1750 — 2000 — 2500 — 2750 — 3500

UPD2 : The editorial is out!

UPD3 : Congratulations to the top 10 :

  1. V--o_o--V

  2. LHiC

  3. Um_nik

  4. Petr

  5. Shik

  6. Syloviaely

  7. Seter

  8. SirShokoladina

  9. matthew99

  10. DBradac

Read more »

  • Vote: I like it
  • +404
  • Vote: I do not like it

By zscoder, history, 4 years ago, In English

Hi everyone!

Malaysian Computing Olympiad 2017 (also known as MCO 2017) has just ended a few days ago. You can find the problems in this group.

There are 6 problems and each problem is divided into several subtasks.

Read more »

  • Vote: I like it
  • +86
  • Vote: I do not like it

By zscoder, history, 4 years ago, In English

Weekly Training Farm 22 is over. Congratulations to the winners :

  1. W4yneb0t (perfect score in < 1 hour!)

  2. aaaaajack (perfect score)

  3. eddy1021

Here is the editorial :

Problem A

This problem can be solved by greedy. We list down the positive integers one by one. We keep a pointer that initially points to the first letter of s. Whenever the pointed character in the string s matches the corresponding digit of the integer, we move the pointer one step to the right and continue. Repeat this process until the pointer reaches the end.

However, we still need to know whether the answer can be large. The key is to note that the answer will never exceed 106, because after writing down 10 consecutive numbers, at least one of them has last digit equals to the current digit, so the pointer will move to the right at least once when we write down 10 consecutive numbers. Thus, in the worse case, we'll only list down the numbers from 1 to 106, which is definitely fast enough.


Problem B

This problem can be solved using dynamic programming. Firstly, observe that if we already determine which set of problems to solve, then it's best to solve the problem in increasing order of time needed to solve in order to minimize the time penalty. Thus, we can first sort the problems in increasing order of time needed, breaking ties arbitarily.

Let dp[i][j] denote the maximum number of problems solved and minimum time penalty acquired when doing so by using exactly j minutes and only solving problems among the first i ones. dp[0][0] = (0, 0) (the first integer denotes the number of problems solved and the second integer denotes the time penalty in order to do so). The transitions can be handled easily by simply considering whether to solve the i-th problem or not. The time complexity of this solution is O(nT) (T is the duration of the contest)


Problem C

This is an ad hoc problem. Firstly, we can use two moves to determine what the value of the first bit is. (simply flipping it twice will tell you its value. Now, if the bit is 1, you don't need to flip it anymore. If it's 0, you'll need to flip it. In any case, we'll flip the second bit as well. (if the first bit needs to be flipped, we'll flip [1, 2] and flip [2, 2] otherwise) After flipping the second bit, we can determine whether it's a 1 or 0 by calculating from the total number of 1s of the string before the flip and after the flip. We can repeat this for every 2 consecutive bits until we arrive at the last two bits. At this point, we know what the second last bit is, and we also know the total number of 1 bits. So, we can easily deduce the value of the last bit from the information as well. Now, we just need to perform one last flip to make the last 2 bits become 1. The total number of moves made is n + 1.


Problem D1

First, we can use 18 moves to determine the value of a, by asking 2 to 19 in increasing order and the first yes answer will be the value of a. If there're no "yes" answers, then the value of a is 20.

Call a number good if it can be represented as the sum of nonnegative multiples of as and b. Note that if x is good, then x + a, x + b are both good.

Now that we have the value of a, let's think about what b is. Consider the numbers ka + 1, ka + 2, ..., ka + (a - 1) for a fixed k. If none of these numbers are good, we can immediately say that b is larger than (k + 1)a. Why? Suppose b = qa + r. Clearly, r ≠ 0 since a and b are coprime. Note that xa + r for all x ≥ q will be the good, since xa + r = (qa + r) + (x - q)a = b + (x - q)a. So, b cannot be less than any of the numbers ka + 1, ka + 2, ..., ka + (a - 1), or else one of these numbers would've been good, a contradiction. Note that this also means that if y is the smallest integer such that ya + 1, ya + 2, ..., ya + (a - 1) are not all bad, then there will be exactly one good number, which will be b. Also note that for all integers k > y, there will have at least one good number among ka + 1, ka + 2, ..., ka + (a - 1). Thus, we can now binary search for the value of y. In each iteration of the binary search, we need to ask at most a - 1 ≤ 19 questions, and there are at most iterations, so the maximum number of operations needed is 19·19 + 18 = 379 < 380.


Problem D2

This problem is the same as D1, but with higher constraints. Firstly, we find the value of a in 18 moves as in problem D. To proceed, we need to think about this problem from another angle. Suppose we know a number N that is good and not a multiple of a, and we can find the maximum number k such that N - ka is good, then what does this tell us? This means that N - ka is a multiple of b. Why? We know that N - ka = ax + by for some nonnegative integers x and y since N - ka is good. If x > 0, then N - (k + 1)a = a(x - 1) + by is also good, contradicting the maximality of k. Thus, x = 0 and so N - ka = by. Note that b > 0 since we choose N so that it's not a multiple of a.

To find a value of N such that N is good and not a multiple of a, it is sufficient to take 500000a - 1, since any number greater than ab - a - b is guaranteed to be good. (this is a well-known fact)

We can find the largest k such that N - ka is good via binary search, because if N - ma is not good then N - (m + 1)a can't be good. (or else if N - (m + 1)a = ax + by, then N - ma = a(x + 1) + by) This takes at most 19 questions.

What to do after finding a value which is a multiple of b? Let C = N - ka. We consider the prime factorization of C. The main claim is that if is good, then x must be a multiple of b. The reasoning is the same as what we did before. So, we can find the prime factorization of C, and divide the prime factors one by one. If the number becomes bad, we know that the prime factor cannot be removed, and proceed to the next prime factor. Since a number less than 10000000 can have at most 23 prime factors (maximum is 223), so this takes another 23 questions.

Thus, we only used at most 18 + 19 + 23 = 60 questions to find the values of a and b.


Problem E

Firstly, note that a connected graph on n vertices with n edges contains exactly 1 cycle. Call the vertices on the cycle the cycle vertices. From each cycle vertex, there's a tree rooted at it. Thus, call the remaining vertices the tree vertices. Note that the number of useless edges is equal to the length of the cycle.

Now, we do some casework :

  • u is equal to a tree vertex

Note that this will not change the length of the cycle. Thus, we just have to count how many ways are there to change the value of au such that the graph remains connected. The observation is that for each tree node u, the only possible values of au are the nodes which are not in the subtree of u in the tree u belongs to. Thus, the number of possibilities can be calculated with a tree dp. For each tree, we calculate the subtree size of each node and add all these subtree sizes and subtract this from the total number of ways to choose a non-tree vertex u and choosing the value of au. This part can be done in O(n) time.

  • u is equal to a cycle vertex

For two cycle vertices u and v, let d(u, v) be the directed distance from u to v (We consider the distance from u to v in the functional graph for all 1 ≤ i ≤ n). Note that if we change au to x, and the root of the tree x is in is v (x = v is x is a cycle vertex), then the length of the cycle after the change will be d(v, u) + 1 + h[x], where h[x] is the height of x in its tree. The key is instead of fixing u and iterate through all other nodes x, we iterate through all endpoints x and see how it changes our answer. Note that if x is fixed, which also means that v is fixed, then we just have to add 1 to the answer for c = d(v, u) + 1 + h[x] for all cycle vertices u. However, note that d(v, u) ranges from 0 to C - 1 (where C denotes the length of the original cycle), so this is equivalent to adding 1 to the answer for c = h[x] + 1, h[x] + 2, ..., h[x] + C. Now, we can iterate through all vertices x and add 1 to the answer for c = h[x] + 1, h[x] + 2, ..., h[x] + C. To do this quickly, we can employ the "+1, -1" method. Whenever we want to add 1 to a range [l, r], we add 1 to ansl and subtract 1 from ansr + 1. Then, to find the actual values of the ans array, we just have to take the prefix sum of the ans array.

Finally, do not forget to subtract the cases where v = au from the answer. The total complexity is O(n).


Read more »

  • Vote: I like it
  • +45
  • Vote: I do not like it

By zscoder, history, 4 years ago, In English

Hi everyone!

I would like to invite you to the Weekly Training Farm 22 ! The problemsetter is me (zscoder) and the tester and quality controller is dreamoon_love_AA.

It will be a contest in ACM-ICPC style and contains 6 problems. The difficulty is around 500-1500-1500-1750-2500-2500 (compared to Div. 2 Contests)

The contest begins at 19:30 UTC+8 and lasts for two hours.

To join the contest, join this group (as participant) first and find Weekly Training Farm 22 on the Group Contest tab.

In addition, there will be a few interactive problems in this round. Please check the Interactive Problems Guide if you're not familiar with interactive problems.

Good luck and hope you enjoy the problems!

UPD : Contest starts in around 4.5 hours.

UPD : You can find the editorial here

UPD : Since next week will be the lunar new year, there'll be no Weekly Training Farm next week. It will resume on February.

Read more »

  • Vote: I like it
  • +68
  • Vote: I do not like it

By zscoder, history, 4 years ago, In English

Congratulations to the winners!

  1. spentplaying

  2. 0w1

  3. lawfung

Also special props to biGinNer for solving the last 3 problems (and the only one to solve F during contest)

Here are the editorials :

Problem A.

This is a simple problem. First, we calculate the position Harry ends in after making the moves 1 time. This can be done by directly simulating the moves Harry make. Now, suppose Harry is at (x, y) after 1 iteration. Note that after every iteration, Harry will move x units to the right and y units up, so after n moves he will end up in (nx, ny). The complexity of the solution is O(|s|).

Problem B.

This is a dp problem. Let dp[i] be the maximum possible sum of the remaining numbers in the range [1..i]. For 1 ≤ i ≤ k - 1 the value is just the sum of the numbers in the range. Let dp[0] = 0. For i ≥ k, we may choose to keep the element ai or remove a subsegment of length k which ends at ai. Thus, we arrive at the recurrence dp[i] = max(dp[i - 1] + ai, dp[i - k]). We can calculate the dp values in O(n).

Problem C.

Observe that we can consider each prime factor separately. For each prime p that appears in N, let's see what prime power pki should we pick from each number ai so that the sum of ki is equal to the power of p in the prime factorization of N. Firstly, we need to prime factorize all the numbers ai. We can use Sieve to find the primes and the factorization can be done in . From now on, we'll focus on a specific prime p. Now, we know the maximum prime power mi we can take from each number ai (so ki ≤ mi). From here, we can use a greedy method to decide what to take from each number ai. Note that mi ≤ 20 because 220 = 1048576 > 106. So, for each number ai, we know the cost needed if we take 1, 2, ..., mi factors of p from ai. We can store a vector and for each ai, we push wip, wi(p2 - p), wi(p3 - p2), ..., wi(pmi - pmi - 1) into the vector. Now, we sort the vector and take the first x elements, where x is the power of prime p in the prime factorization of N. If we can't take x elements, the answer is  - 1. We can repeat this for all primes and solve the problem in time.

Problem D.

To solve this problem, you need to know a bit about permutations. First, we need to determine how to find the minimum number of swaps to sort a permutation. This is a well-known problem. Let the permutation be P = p1, p2, ..., pn. Construct a graph by drawing edges from i to pi for all 1 ≤ i ≤ n. Note that the graph is formed by disjoint cycles. You can easily see swapping two elements can either split a cycle into two smaller cycles, or merge two cycles into one cycle. Since the identity permutation is formed by n cycles, the optimal way is to keep splitting cycles into two and increase the total number of cycles by 1 each step. Thus, if we denote c as the number of cycles in the current permutation, the number of moves needed to sort the permutation is n - c. Harry wins if and only if n - c is odd.

The key observation is that whenever there are exactly two question marks left, the first player will always win. Why? Consider how the current graph of the permutation looks like. It will be a union of few cycles and 2 chains (we consider the singleton, a component formed by a single vertex, as a chain). Now, the first player can either choose to close off one of the chains, or join the two chains together. The latter will leave exactly 1 less number of cycles than the former. So, one of them will guarantee the value of n - c to be odd. Thus, the first player only have to choose the correct move. This implies that if the number of question marks m is at least 2, then Harry wins if m is even and loses otherwise.

Now, the only case left is when there're only 1 question mark in the beginning. This means that Harry only have 1 possible move and we're left with the problem of deciding whether the final permutation have n - c odd. Thus, it is enough to count the number of cycles in the formed graph. This can be done by dfs. The complexity of the solution is O(n).

Problem E.

First, we form a trie of the given words. Now, the game is equivalent to the following :

  1. Start from the root of the trie.

  2. Each player can either move to one of the children of the current node, or delete one edge connecting the current node to one of the children.

The one who can't move loses. This reduced game can be solved with Tree DP. Let dp[u] denote the winner of the game if the first player starts at node u. The leaves have dp[u] = 2. Our goal is to find dp[0] (where 0 is the root). The recurrence is simple. Suppose we're finding dp[u] and the children of u are v1, v2, ..., vk. If one of the children has dp value of 2, then Player 1 can just move to that children and win. Otherwise, all children have dp value of 1. Thus, both players will try not to move down unless forced to. So, they'll keep deleting edges. If there are an even number of children, Player 2 will win, as he will either delete all edges or force Player 1 to move down. Otherwise, Player 1 wins. This gives a simple O(n) tree dp solution.

Problem F.

Firstly, we make the same observations as problem C. Swapping two elements will either split a cycle into two or merge two cycles. Note that if we swap two elements from the same cycle, the cycle will split into two. If we swap two elements from different cycles, the two cycles will combine. Also note that for a cycle of size c, we can always split it into two cycles a and b with a, b > 0 and a + b = c by choosing the appropriate two elements to swap from the cycle. Now, the game reduces to choose 2 possibly equal elements from one cycle, swap them, and delete one of the resulting cycles. So, for a given permutation, if the cycle sizes are c1, c2, ..., ck, then each move we can choose one of the sizes and the operation is equivalent to changing the size into any nonnegative number strictly smaller than it. Thus, we have reduced the problem to playing a game of Nim on c1, c2, ..., ck. Since Harry goes second, he wins if and only if the xor values of all the cycle sizes is 0. (This is a well-known fact)

Thus, we've reduced the problem into finding the number of permutations of length n which have the xor of all cycle sizes equal to 0. To do so, let dp[i][j] denote the number of permutations with length i and xor of all cycle sizes equal j. The dp transitions can be done by iterating through all possible sizes s of the cycle containing i. For each s, there are ways to choose the remaining elements of the cycle containing i and (s - 1)! ways to permute them. Thus, we can sum up the values of for all 1 ≤ s ≤ i. The whole solution works in O(n3) time.

Read more »

  • Vote: I like it
  • +44
  • Vote: I do not like it

By zscoder, history, 4 years ago, In English

Hi everyone!

I would like to invite you to the Weekly Training Farm 20 ! The problemsetter is me (zscoder) and the tester and quality controllers are dreamoon_love_AA and drazil.

It will be a contest in ACM-ICPC style and contains 6 problems. The difficulty is around 500-1250-1750-2000-2250-2250 (compared to Div. 2 Contests)

The contest begins at 20:00 UTC+8 and lasts for two hours.

To join the contest, join this group (as participant) first and find Weekly Training Farm 20 on the Group Contest tab.

Reminder : The contest will start in around 5 hours from now.

Update : Less than 1 hour before start. Good luck!

Here's the editorial.

Read more »

  • Vote: I like it
  • +58
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

Codechef October Challenge has just ended few hours ago. Every time I find that my weakest spot is in solving those approximation problems. How do you start solving them? There are people who get very high points and I'm curious how they manage to do that.

Read more »

  • Vote: I like it
  • +60
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

Hi everyone! Following my last article, today I'm writing about a not-so-common trick that has nevertheless appeared in some problems before and might be helpful to you. I'm not sure if this trick has been given a name yet so I'd refer to it as "Slope Trick" here.

Disclaimer : It would be helpful to have a pen and paper with you to sketch the graphs so that you can visualize these claims easier.

Example Problem 1 : 713C - Sonya and Problem Wihtout a Legend

This solution originated from ko_osaga's comment in the editorial post here.

The solution below will solve this problem in , wheareas the intended solution is O(n2).

So, the first step is to get rid of the strictly increasing condition. To do so, we apply a[i] -= i for all i and thus we just have to find the minimum number of moves to change it to a non-decreasing sequence.

Define fi(x) as the minimum number of moves to change the first i elements into a non-decreasing sequence such that ai ≤ x.

It is easy to see that by definition we have the recurrences

fi(X) = minY ≤ X(|ai - Y|) when i = 1


fi(X) = minY ≤ X(fi - 1(Y) + |ai - Y|}.

Now, note that fi(X) is non-increasing, since it is at most the minimum among all the values of f for smaller X by definition. We store a set of integers that denotes where the function fi change slopes. More formally, we consider the function gi(X) = fi(X + 1) - fi(X). The last element of the set will be the smallest j such that gi(j) = 0, the second last element will be the smallest j such that gi(j) =  - 1, and so on. (note that the set of slope changing points is bounded)

Let Opt(i) denote a position where fi(X) achieves its minimum. (i.e. gi(Opt(i)) = 0) The desired answer will be fn(Opt(n)). We'll see how to update these values quickly.

Now, suppose we already have everything for fi - 1. Now, we want to update the data for fi. First, note that all the values x < ai will have its slope decreased by 1. Also, every value with x ≥ ai will have its slope increased by 1 unless we have reached the slope = 0 point, in which the graph never goes up again.

There are two cases to consider :

Case 1 : Opt(i - 1) ≤ ai

Here, the slope at every point before ai decreases by 1. Thus, we push ai into the slope array as this indicates that we decreases the slope at all the slope changing points by 1, and the slope changing point for slope = 0 is ai, i.e. Opt(i) = ai. Thus, this case is settled.

Case 2 : Opt(i - 1) > ai

Now, we insert ai into the set, since it decreases the slope at all the slope changing points before ai by 1. Furthermore, we insert ai again because it increases the slope at the slope changing points between ai and Opt(i - 1) by 1. Now, we can just take Opt(i) = Opt(i - 1) since the slope at Opt(i - 1) is still 0. Finally, we remove Opt(i - 1) from the set because it's no longer the first point where the slope changes to 0. (it's the previous point where the slope changes to  - 1 and the slope now becomes 0 because of the addition of ai) Thus, the set of slope changing points is maintained. We have fi(Opt(i)) = fi - 1(Opt(i - 1)) + |Opt(i - 1) - ai|.

Thus, we can just use a priority queue to store the slope changing points and it is easy to see that the priority queue can handle all these operations efficiently (in time).

Here's the implementation of this idea by ko_osaga : 20623607

This trick is called the "Slope Trick" because we're considering the general function and analyzing how its slope changes at different points to find the minimum or maximum value.

The next example is APIO 2016 P2 — Fireworks

This problem was the "killer" problem of APIO 2016, and was solved by merely 4 contestants in the actual contest.

I'll explain the solution, which is relatively simple and demonstrates the idea of slope trick.

So, the idea is similar to the last problem. For each node u, we store a function f(x) which denotes the minimum cost to change the weights on edges in the entire subtree rooted at u including the parent edge of u such that the sum of weights on each path from u to leaves are equal to x. We'll store the slope changing points of the function in a container (which we'll determine later) again. In addition, we store two integers a, b, which denotes that for all x ≥ X, where X is the largest slope changing point, the value of the function is aX + b. (clearly this function exists, since when X increases one can always increase the parent node by 1)

Now, for the child nodes i, it is clear that a = 1, b =  - ci, where ci is the cost of the parent edge of i, and the slope changing points are {ci, ci}.

For a non-leaf node u, we have to combine the functions from its children first. Firstly, we set the function as the sum of all functions of its child, and we'll correct it later. We set the value a of this node as the sum of all as of its children, and similarly for b. Also, we combine all the slope-changing points together. It is important that we merge the smaller sets into the larger set. (see dsu on tree, a.k.a. small-to-large technique)

Now, the function is still incorrect. Firstly, note that all the slope-changing points that have slope  > 1 is meaningless, because we can just increase the parent edge by 1 to increase the sum of the whole subtree, so we can remove these slope-changing points while updating the values of a, b. Suppose we remove a slope-changing point x with slope a, then we decrement a, increase b by x, and remove x from the set. (this is because ax + b = (a - 1)x + (b + x)) Repeat this till a becomes at most 1.

Next, since the cost of the parent edge is ci, we have to shift the slope 0 and 1 changing points to the right by ci. Note that the slope  - 1 changing point doesn't change, because we can just reduce the weight of ci until it reaches 0. (note that the condition that the weights can be reduced to 0 helped here)

Finally, we have to decrease b by ci, since we shifted the points to the right by ci. Thus, the function for this node is now complete.

Thus, we can do a dfs and keep merging functions until we get the function for the root node. Then, we just have to find the value of the function when a = 0. (using the same method by we decrease a until it reaches 0) Finally, the answer will be the updated value of b at the root node, and we're done.

We'll use a priority queue to store the slope changing points as it is the most convenient option.

Official Solution

Beyond APIO 2016 Fireworks

Now, the next example is the generalization of this problem. It has came from Codechef October Challenge — Tree Balancing. We'll solve this using the slope trick as well.

The Codechef problem is the same as the last problem, except :

  1. The weights of the edges can be changed to negative values

  2. You must output a possible construction aside from the minimum cost needed

  3. The edges now have a cost wi, and when you change the value of an edge by 1, your total cost increases by wi.

However, it is still possible to solve this using Slope Trick.

Firstly, we suppose that wi = 1, to simplify the problem. Now, since the edges can be changed to negative values, at each node there is no point with slope that has absolute value greater than 1, since changing the parent edge will yield a better result. Thus, each node actually have only 2 slope-changing points, the point where the slope changes from  - 1 to 0 and the point where the slope changes from 0 to 1. Thus, this means that we have to pop slope-changing points from the front as well as the back of the set. The best way to store the data is to use a multiset.

With this modification, we can find the minimum cost needed like before. Now, the second part of the question is, how to reconstruct the answer? This part is not hard if you understand what we're doing here. The problem reduces to solving for each node u, if I need to make the sum of weights from the parent of u to all leaves equal to x, what should the parent edge weight be, where x is given. We start from the childrens of the root, with value x which is equal to the point where the slope changes from 0 to 1. (i.e. the point that yields minimum value)

For each node we store the 2 slope-changing points li, ri in an array while we find the minimum cost. Now, if li ≤ x ≤ ri, then the best thing to do is not change the parent edge. If x > ri, then we should increase the parent edge value by x - ri. Otherwise, we should decrease the parent edge value by li - x.

Thus, we can find the required weights for the parent nodes and it remains to push the remaining sum of weights needed to its children and recurse until we get all the weights of the edges. The time complexity is the same.

My submission for this case, which gives 20 points

To get the full AC, we need to solve the cost-weighted case. It is actually similar to this case, but we have to modify the solution a bit.

The idea is still the same. However, the slope changing points has increased by a lot. To efficiently store these slope points, we will store the compressed form of the set. For example, the set {3, 4, 5, 5, 5, 5, 6, 6} will be stored as {(3, 1), (4, 1), (5, 4), (6, 2)}. Basically, we store the number of occurences of the integers instead of storing it one by one. We can use a map to handle this.

The base case is a bit different now. Suppose the leaf node is u and the cost of its parent edge is du. Then, a = du, b =  - cu × du, where cu is the weight of its parent edge. The slope changing points is {(cu, 2du)}.

Merging the functions to its parent will be the same. Now, we have to update the slope changing points and the function ax + b. First, we remove all points with slope  > du and  <  - du, as we can just change the parent edge. Then, we have to shift every slope changing point by cu. However, shifting the whole map naively is inefficient. The trick here is to store a counter shift for each node that denotes the amount to add for each slope changing point. Now, the shifting part is equivalent to just adding cu to the counter shift. Finally, we update a and b as before.

To recover the solution, we use the same method as above, with some changes. Firstly, l and r will be the minimum slope changing point of the function and maximum slope changing point of the function respectively. Secondly, if the sum of di of all children is less than the di of the parent edge, then we do not change the weight of the parent edge, as it is sufficient to just update all the children edges.

My implementation of this solution (100 points)

That's it for this post. If you know any other application of this trick, feel free to post them in the comments.

Read more »

  • Vote: I like it
  • +192
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

Hi everyone! Today I want to share some DP tricks and techniques that I have seen from some problems. I think this will be helpful for those who just started doing DP. Sometimes the tutorials are very brief and assumes the reader already understand the technique so it will be hard for people who are new to the technique to understand it.

Note : You should know how to do basic DP before reading the post

DP + Bitmasks

This is actually a very well-known technique and most people should already know this. This trick is usually used when one of the variables have very small constraints that can allow exponential solutions. The classic example is applying it to solve the Travelling Salesman Problem in O(n2·2n) time. We let dp[i][j] be the minimum time needed to visit the vertices in the set denoted by i and ending at vertex j. Note that i will iterate through all possible subsets of the vertices and thus the number of states is O(2n·n). We can go from every state to the next states in O(n) by considering all possible next vertex to go to. Thus, the time complexity is O(2n·n2).

Usually, when doing DP + Bitmasks problems, we store the subsets as an integer from 0 to 2n - 1. How do we know which elements belong to a subset denoted by i? We write i in its binary representation and for each bit j that is 1, the j-th element is included in the set. For example, the set 35 = 1000112 denotes the set {0, 4, 5} (the bits are 0-indexed from left to right). Thus, to test if the j-th element is in the subset denoted by j, we can test if i & (1<<j) is positive. (Why? Recall that (1<<j) is 2j and how the & operator works.)

Now, we look at an example problem : 453B - Little Pony and Harmony Chest

So, the first step is to establish a maximum bound for the bi. We prove that bi < 2ai. Assume otherwise, then we can replace bi with 1 and get a smaller answer (and clearly it preserves the coprime property). Thus, bi < 60. Note that there are 17 primes less than 60, which prompts us to apply dp + bitmask here. Note that for any pair bi, bj with i ≠ j, their set of prime factors must be disjoint since they're coprime.

Now, we let dp[i][j] be the minimum answer one can get by changing the first i elements such that the set of primes used (i.e. the set of prime factors of the numbers b1, b2, ..., bi) is equal to the subset denoted by j. Let f[x] denote the set of prime factors of x. Since bi ≤ 60, we iterate through all possible values of bi, and for a fixed bi, let F = f[bi]. Then, let x be the complement of the set F, i.e. the set of primes not used by bi. We iterate through all subsets of x. (see here for how to iterate through all subsets of a subset x) For each s which is a subset of x, we want dp[i][s|F] = min(dp[i][s|F], dp[i - 1][s] + abs(a[i] - b[i])). This completes the dp. We can reconstruct the solution by storing the position where the dp achieves its minimum value for each state as usual. This solution is enough to pass the time limits.

Here are some other problems that uses bitmask dp :

678E - Another Sith Tournament

662C - Binary Table

Do we really need to visit all the states?

Sometimes, the naive dp solution to a problem might take too long and too much memory. However, sometimes it is worth noting that most of the states can be ignored because they will never be reached and this can reduce your time complexity and memory complexity.

Example Problem : 505C - Mr. Kitayuta, the Treasure Hunter

So, the most direct way of doing dp would be let dp[i][j] be the number of gems Mr. Kitayuta can collect after he jumps to island i, while the length of his last jump is equal to j. Then, the dp transitions are quite obvious, because we only need to test all possible jumps and take the one that yields maximum results. If you have trouble with the naive dp, you can read the original editorial.

However, the naive method is too slow, because it would take O(m2) time and memory. The key observation here is that most of the states will never be visited, more precisiely j can only be in a certain range. These bounds can be obtained by greedily trying to maximize j and minimize j and we can see that their values will always be in the order of from the initial length of jump. This type of intuition might come in handy to optimize your dp and turn the naive dp into an AC solution.

Change the object to dp

Example Problem : 559C - Gerald and Giant Chess

This is a classic example. If the board was smaller, say 3000 × 3000, then the normal 2D dp would work. However, the dimensions of the grid is too large here.

Note that the number of blocked cells is not too large though, so we can try to dp on them. Let S be the set of blocked cells. We add the ending cell to S for convenience. We sort S in increasing order of x-coordinate, and break ties by increasing order of y-coordinate. As a result, the ending cell will always be the last element of S.

Now, let dp[i] be the number of ways to reach the i-th blocked cell (assuming it is not blocked). Our goal is to find dp[s], where s = |S|.

Note that since we have sort S by increasing order, the j-th blocked cell will not affect the number of ways to reach the i-th blocked cell if i < j. (There is no path that visits the j-th blocked cell first before visiting the i-th blocked cell)

The number of ways from square (x1, y1) to (x2, y2) without any blocked cells is . (if x2 > x1, y2 > y1. The case when some two are equal can be handled trivially). Let f(P, Q) denote the number of ways to reach Q from P. We can calculate f(P, Q) in O(1) by precomputing factorials and its inverse like above.

The base case, dp[1] can be calculated as the number of ways to reach S1 from the starting square. Similarly, we initialize all dp[i] as the number of ways to reach Si from the starting square.

Now, we have to subtract the number of paths that reach some of the blocked cells. Assume we already fixed the values of dp[1], dp[2], ..., dp[i - 1]. For a fix blocked cell Si, we'll do so by dividing the paths into groups according to the first blocked cell it encounters. The number of ways for each possible first blocked cell j is equal to dp[jf(Sj, Si), so we can subtract this from dp[i]. Thus, this dp works in O(n2).

Another problem using this idea : 722E - Research Rover

Open and Close Interval Trick

Example Problem : 626F - Group Projects

First, note that the order doesn't matter so we can sort the ai in non-decreasing order. Now, note that every interval's imbalance can be calculated with its largest and smallest value. We start adding the elements to sets from smallest to largest in order. Suppose we're adding the i-th element. Some of the current sets are open, i.e. has a minimum value but is not complete yet (does not have a maximum). Suppose there are j open sets. When we add ai, the sum ai - ai - 1 will contribute to each of the j open sets, so we increase the current imbalance by j(ai - ai - 1).

Let dp[i][j][k] be the number of ways such that when we inserted the first i elements, there are j open sets and the total imbalance till now is k. Now, we see how to do the state transitions. Let v = dp[i - 1][j][k]. We analyze which states involves v.

Firstly, the imbalance of the new state must be val = k + j(ai - ai - 1), as noted above. Now, there are a few cases :

  1. We place the current number ai in its own group : Then, dp[i][j][val] +  = v.

  2. We place the current number ai in one of the open groups, but not close it : Then, dp[i][j][val] +  = j·v (we choose one of the open groups to add ai.

  3. Open a new group with minimum = ai : Then, dp[i][j + 1][val] +  = v.

  4. Close an open group by inserting ai in one of them and close it : Then, dp[i][j - 1][val] +  = j·v.

The answer can be found as dp[n][0][0] + dp[n][0][1] + ... + dp[n][0][k].

Related Problems :

466D - Increase Sequence

367E - Sereja and Intervals

"Connected Component" DP

Example Problem : JOI 2016 Open Contest — Skyscrapers

Previously, I've made a blog post here asking for a more detailed solution. With some hints from Reyna, I finally figured it out and I've seen this trick appeared some number of times.

Abridged Statement : Given a1, a2, ..., an, find the number of permutations of these numbers such that |a1 - a2| + |a2 - a3| + ... + |an - 1 - an| ≤ L where L is a given integer.

Constraints : n ≤ 100, L ≤ 1000, ai ≤ 1000

Now, we sort the values ai and add them into the permutation one by one. At each point, we will have some connected components of values (for example it will be something like 2, ?, 1, 5, ?, ?, 3, ?, 4)

Now, suppose we already added ai - 1. We treat the ? as ai and calculate the cost. When we add a new number we increase the values of the ? and update the cost accordingly.

Let dp[i][j][k][l] be the number of ways to insert the first i elements such that :

  • There are j connected components

  • The total cost is k (assuming the ? are ai + 1)

  • l of the ends of the permutations has been filled. (So, 0 ≤ l ≤ 2)

I will not describe the entire state transitions here as it will be very long. If you want the complete transitions you can view the code below, where I commented what each transition means.

Some key points to note :

  • Each time you add a new element, you have to update the total cost by ai + 1 - ai times the number of filled spaces adjacent to an empty space.

  • When you add a new element, it can either combine 2 connected components, create a new connected components, or be appended to the front or end of one of the connected components.

Code with comments

A problem that uses this idea can be seen here : 704B - Ant Man

 × 2,  + 1 trick

This might not be a very common trick, and indeed I've only seen it once and applied it myself once. This is a special case of the "Do we really need to visit all the states" example.

Example 1 : Perfect Permutations, Subtask 4

My solution only works up to Subtask 4. The official solution uses a different method but the point here is to demonstrate this trick.

Abridged Statement : Find the number of permutations of length N with exactly K inversions. (K ≤ N, N ≤ 109, K ≤ 1000 (for subtask 4))

You might be wondering : How can we apply dp when N is as huge as 109? We'll show how to apply it below. The trick is to skip the unused states.

First, we look at how to solve this when N, K are small.

Let dp[i][j] be the number of permutations of length i with j inversions. Then, dp[i][j] = dp[i - 1][j] + dp[i - 1][j - 1] + ... + dp[i - 1][j - (i - 1)]. Why? Again we consider the permutation by adding the numbers from 1 to i in this order. When we add the element i, adding it before k of the current elements will increase the number of inversions by k. So, we sum over all possibilities for all 0 ≤ k ≤ i - 1. We can calculate this in O(N2) by sliding window/computing prefix sums.

How do we get read of the N factor and replace it with K instead? We will use the following trick :

Suppose we calculated dp[i][j] for all 0 ≤ j ≤ K. We have already figured out how to calculate dp[i + 1][j] for all 0 ≤ j ≤ K in O(K). The trick here is we can calculate dp[2i][j] from dp[i][j] for all j in O(K2).

How? We will find the number of permutations using 1, 2, ..., n and n + 1, n + 2, ..., 2n and combine them together. Suppose the first permutation has x inversions and the second permutation has y inversions. How will the total number of inversions when we merge them? Clearly, there'll be at least x + y inversions.

Now, we call the numbers from 1 to n small and n + 1 to 2n large. Suppose we already fixed the permutation of the small and large numbers. Thus, we can replace the small numbers with the letter 'S' and large numbers with the letter 'L'. For each L, it increases the number of inversions by the number of Ss at the right of it. Thus, if we want to find the number of ways that this can increase the number of inversions by k, we just have to find the number of unordered tuples of nonnegative integers (a1, a2, ..., an) such that they sum up to k (we can view ai as the number of Ss at the back of the i-th L)

How do we count this value? We'll count the number of such tuples where each element is positive and at most k and the elements sum up to k instead, regardless of its length. This value will be precisely what we want for large enough n because there can be at most k positive elements and thus the length will not exceed n when n > k. We can handle the values for small n with the naive O(n2) dp manually so there's no need to worry about it.

Thus, it remains to count the number of such tuples where each element is positive and at most k and sums up to S = k. Denote this value by f(S, k). We want to find S(k, k). We can derive the recurrence f(S, k) = f(S, k - 1) + f(S - k, k), denoting whether we use k or not in the sum. Thus, we can precompute these values in O(K2).

Now, let g0, g1, g2, ..., gK be the number of permutations of length n with number of inversions equal to 0, 1, 2, ..., K.

To complete this step, we can multiply the polynomial g0 + g1x + ... + gKxK by itself (in O(K2) or with FFT, but that doesn't really change the complexity since the precomputation already takes O(K2)), to obtain the number of pairs of permutations of {1, 2, ..., n} and {n + 1, n + 2, ..., 2n} with total number of inversions i for all 0 ≤ i ≤ K.

Next, we just have to multiply this with f(0, 0) + f(1, 1)x + ... + f(K, K)xK and we get the desired answer for permutations of length 2n, as noted above.

Thus, we have found a way to obtain dp[2i][·] from dp[i][·] in O(K2).

To complete the solution, we first write N in its binary representation and compute the dp values for the number formed from the first 10 bits (until the number is greater than K). Then, we can update the dp values when N is multiplied by 2 or increased by 1 in O(K2) time, so we can find the value dp[N][K] in , which fits in the time limit for this subtask.

My code.

Example 2 : Problem Statement in Mandarin

This solution originated from the comment from YuukaKazami here

Problem Statement : A sequence a1, a2, ..., an is valid if all its elements are pairwise distinct and for all i. We define value(S) of a valid sequence S as the product of its elements. Find the sum of value(S) for all possible valid sequences S, modulo p where p is a prime.

Constraints : A, p ≤ 109, n ≤ 500, p > A > n + 1

Firstly, we can ignore the order of the sequence and multiply the answer by n! in the end because the numbers are distinct.

First, we look at the naive solution :

Now, let dp[i][j] be the sum of values of all valid sequences of length j where values from 1 to i inclusive are used.

The recurrence is dp[i][j] = dp[i - 1][j] + i·dp[i - 1][j - 1], depending on whether i is used.

This will give us a complexity of O(An), which is clearly insufficient.

Now, we'll use the idea from the last example. We already know how to calculate dp[i + 1][·] from dp[i][·] in O(n) time. Now, we just have to calculate dp[2i][·] from dp[i][·] fast.

Suppose we want to calculate dp[2A][n]. Then, we consider for all possible a the sum of the values of all sequences where a of the elements are selected from 1, 2, ..., A and the remaining n - a are from i + 1, i + 2, ..., 2A.

Firstly, note that .

Now, let ai denote the sum of all values of sequences of length i where elements are chosen from {1, 2, ..., A}, i.e. dp[A][i].

Let bi denote the same value, but the elements are chosen from {A + 1, A + 2, ..., 2A}.

Now, we claim that . Indeed, this is just a result of the formula above, where we iterate through all possible subset sizes. Note that the term is the number of sets of size i which contains a given subset of size j and all elements are chosen from 1, 2, ..., A. (take a moment to convince yourself about this formula)

Now, computing the value of isn't hard (you can write out the binomial coefficient and multiply its term one by one with some precomputation, see the formula in the original pdf if you're stuck), and once you have that, you can calculate the values of bi in O(n2).

Finally, with the values of bi, we can calculate dp[2A][·] the same way as the last example, as dp[2A][n] is just and we can calculate this by multiplying the two polynomials formed by [ai] and [bi]. Thus, the entire step can be done in O(n2).

Thus, we can calculate dp[2i][·] and dp[i + 1][·] in O(n2) and O(n) respectively from dp[i][·]. Thus, we can write A in binary as in the last example and compute the answers step by step, using at most steps. Thus, the total time complexity is , which can pass.

This is the end of this post. I hope you benefited from it and please share your own dp tricks in the comments with us.

Read more »

  • Vote: I like it
  • +612
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

Contest link


Start time : 21:00 JST as usual

Reminder that this contest actually exists on Atcoder :)

Let's discuss the problem after contest.

Read more »

  • Vote: I like it
  • +53
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

I don't know about others, but recently I've been getting quite a number of private messages on CF and Hackerrank (well basically anywhere with a PM system) that sounds like this :

"Hi, regarding codechef long challenge october.. How to do that power sum problwm.. did u get any idea.. if so. then please drop me a hint.. thanks"


I was trying POWSUMS in this month's codechef long challenge. Can I get a hint for that problem?


"Can you send me the code for Simplified Chess engine or give me how to solve it ?"

"Hi Zi,

Any hints for Shashank and the Palindromic Strings


and more (FYI "POWSUMS", "Simplified Chess engine" and "Shashank and the Palindromic Strings" are live contest problems)

Is anyone else getting these PMs too? I find them annoying like it when you see "You received 2 new messages" and all of them are asking for hints/sols/code for a live contest problem. Why do people do this? It's not like anyone is going to tell them the solution anyway.

Read more »

  • Vote: I like it
  • +77
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

We hope everyone enjoyed the problems. Here is the editorial for the problems. I tried to make it more detailed but there might be some parts that might not be explained clearly.

Div. 2 A — Crazy Computer

Prerequisites : None

This is a straightforward implementation problem. Iterate through the times in order, keeping track of when is the last time a word is typed, keeping a counter for the number of words appearing on the screen. Increment the counter by 1 whenever you process a new time. Whenever the difference between the time for two consecutive words is greater than c, reset the counter to 0. After that, increment it by 1.

Time Complexity : O(n), since the times are already sorted.

Code (O(n))

Div. 2 B — Complete The Word

Prerequisites : None

Firstly, if the length of the string is less than 26, output  - 1 immediately.

We want to make a substring of length 26 have all the letters of the alphabet. Thus, the simplest way is to iterate through all substrings of length 26 (there are O(n) such substrings), then for each substring count the number of occurrences of each alphabet, ignoring the question marks. After that, if there exist a letter that occurs twice or more, this substring cannot contain all letters of the alphabet, and we process the next substring. Otherwise, we can fill in the question marks with the letters that have not appeared in the substring and obtain a substring of length 26 which contains all letters of the alphabet. After iterating through all substrings, either there is no solution, or we already created a nice substring. If the former case appears, output  - 1. Otherwise, fill in the remaining question marks with random letters and output the string.

Note that one can optimize the solution above by noting that we don't need to iterate through all 26 letters of each substring we consider, but we can iterate through the substrings from left to right and when we move to the next substring, remove the front letter of the current substring and add the last letter of the next substring. This optimization is not required to pass.

We can still optimize it further and make the complexity purely O(|s|). We use the same trick as above, when we move to the next substring, we remove the previous letter and add the new letter. We store a frequency array counting how many times each letter appear in the current substring. Additionally, store a counter which we will use to detect whether the current substring can contain all the letters of the alphabet in O(1). When a letter first appear in the frequency array, increment the counter by 1. If a letter disappears (is removed) in the frequency array, decrement the counter by 1. When we add a new question mark, increment the counter by 1. When we remove a question mark, decrement the counter by 1. To check whether a substring can work, we just have to check whether the counter is equal to 26. This solution works in O(|s|).

Time Complexity : O(|s|·262), O(|s|·26) or O(|s|)

Code (O(26^2*|s|)
Code (O(26*|s|)
Code (O(|s|)

Div. 2 C/Div. 1 A — Plus and Square Root

Prerequisites : None

Firstly, let ai(1 ≤ i ≤ n) be the number on the screen before we level up from level i to i + 1. Thus, we require all the ais to be perfect square and additionally to reach the next ai via pressing the plus button, we require and for all 1 ≤ i < n. Additionally, we also require ai to be a multiple of i. Thus, we just need to construct a sequence of such integers so that the output numbers does not exceed the limit 1018.

There are many ways to do this. The third sample actually gave a large hint on my approach. If you were to find the values of ai from the second sample, you'll realize that it is equal to 4, 36, 144, 400. You can try to find the pattern from here. My approach is to use ai = [i(i + 1)]2. Clearly, it is a perfect square for all 1 ≤ i ≤ n and when n = 100000, the output values can be checked to be less than 1018

Unable to parse markup [type=CF_TEX]

which is a multiple of i + 1, and is also a multiple of i + 1.

The constraints ai must be a multiple of i was added to make the problem easier for Div. 1 A.

Time Complexity : O(n)

Code (O(n))

Div. 2 D/Div. 1 B — Complete The Graph

Prerequisites : Dijkstra's Algorithm

This problem is actually quite simple if you rule out the impossible conditions. Call the edges that does not have fixed weight variable edges. First, we'll determine when a solution exists.

Firstly, we ignore the variable edges. Now, find the length of the shortest path from s to e. If this length is  < L, there is no solution, since even if we replace the 0 weights with any positive weight the shortest path will never exceed this shortest path. Thus, if the length of this shortest path is  < L, there is no solution. (If no path exists we treat the length as .)

Next, we replace the edges with 0 weight with weight 1. Clearly, among all the possible graphs you can generate by replacing the weights, this graph will give the minimum possible shortest path from s to e, since increasing any weight will not decrease the length of the shortest path. Thus, if the shortest path of this graph is  > L, there is no solution, since the shortest path will always be  > L. If no path exists we treat the length as .

Other than these two conditions, there will always be a way to assign the weights so that the shortest path from s to e is exactly L! How do we prove this? First, consider all paths from s to e that has at least one 0 weight edge, as changing weights won't affect the other paths. Now, we repeat this algorithm. Initially, assign all the weights as 1. Then, sort the paths in increasing order of length. If the length of the shortest path is equal to L, we're done. Otherwise, increase the weight of one of the variable edges on the shortest path by 1. Note that this will increase the lengths of some of the paths by 1. It is not hard to see that by repeating these operations the shortest path will eventually have length L, so an assignment indeed exists.

Now, we still have to find a valid assignment of weights. We can use a similar algorithm as our proof above. Assign 1 to all variable edges first. Next, we first find and keep track of the shortest path from s to e. Note that if this path has no variable edges it must have length exactly L or strictly more than L, so either we're already done or the shortest path contains variable edges and the length is strictly less than L. (otherwise we're done)

From now on, whenever we assign weight to a variable edge (after assigning 1 to every variable edge), we call the edge assigned.

Now, mark all variable edges not on the shortest path we found as weight. (we can choose any number greater than L as ) Next, we will find the shortest path from s to e, and replace the weight of an unassigned variable edge such that the length of the path becomes equal to L. Now, we don't touch the assigned edges again. While the shortest path from s to e is still strictly less than L, we repeat the process and replace a variable edge that is not assigned such that the path length is equal to L. Note that this is always possible, since otherwise this would've been the shortest path in one of the previous steps. Eventually, the shortest path from s to e will have length exactly L. It is easy to see that we can repeat this process at most n times because we are only replacing the edges which are on the initial shortest path we found and there are less than n edges to replace (we only touch each edge at most once). Thus, we can find a solution after less than n iterations. So, the complexity becomes . This is sufficient to pass all tests.

What if the constraints were n, m ≤ 105? Can we do better?

Yes! Thanks to HellKitsune who found this solution during testing. First, we rule out the impossible conditions like we did above. Then, we assign all the variable edges with weight. We enumerate the variable edges arbitarily. Now, we binary search to find the minimal value p such that if we make all the variable edges numbered from 1 to p have weight 1 and the rest , then the shortest path from s to e has length  ≤ L. Now, note that if we change the weight of p to the length of shortest path will be more than L. (if p equals the number of variable edges, the length of the shortest path is still more than L or it will contradict the impossible conditions) If the weight is 1, the length of the shortest path is  ≤ L. So, if we increase the weight of edge p by 1 repeatedly, the length of the shortest path from s to e will eventually reach L, since this length can increase by at most 1 in each move. So, since the length of shortest path is non-decreasing when we increase the weight of this edge, we can binary search for the correct weight. This gives an solution.

Time Complexity : or

Code (O(mnlogn))
Code (O(mlogn(logm+logL))

Div. 2 E/Div. 1 C — Digit Tree

Prerequisites : Tree DP, Centroid Decomposition, Math

Compared to the other problems, this one is more standard. The trick is to first solve the problem if we have a fixed vertex r as root and we want to find the number of paths passing through r that works. This can be done with a simple tree dp. For each node u, compute the number obtained when going from r down to u and the number obtained when going from u up to r, where each number is taken modulo M. This can be done with a simple dfs. To calculate the down value, just multiply the value of the parent node by 10 and add the value on the edge to it. To calculate the up value, we also need to calculate the height of the node. (i.e. the distance from u to r) Then, if we let h be the height of u, d be the digit on the edge connecting u to its parent and val be the up value of the parent of u, then the up value for u is equal to 10h - 1·d + val. Thus, we can calculate the up and down value for each node with a single dfs.

Next, we have to figure out how to combine the up values and down values to find the number of paths passing through r that are divisible by M. For this, note that each path is the concatenation of a path from u to r and r to v, where u and v are pairs of vertices from different subtrees, and the paths that start from r and end at r. For the paths that start and end at r the answer can be easily calculated with the up and down values (just iterate through all nodes as the other endpoint). For the other paths, we iterate through all possible v, and find the number of vertices u such that going from u to v will give a multiple of M. Since v is fixed, we know its height and down value, which we denote as h and d respectively. So, if the up value of u is equal to up, then up·10h + d must be a multiple of M. So, we can solve for up to be  - d·10 - h modulo M. Note that in this case the multiplicative inverse of 10 modulo M is well-defined, as we have the condition . To find the multiplicative inverse of 10, we can find φ(M) and since by Euler's Formula we have xφ(M) ≡ 1(modM) if , we have xφ(M) - 1 ≡ x - 1(modM), which is the multiplicative inverse of x (in this case we have x = 10) modulo M. After that, finding the up value can be done by binary exponentiation.

Thus, we can find the unique value of up such that the path from u to v is a multiple of M. This means that we can just use a map to store the up values of all nodes and also the up values for each subtree. Then, to find the number of viable nodes u, find the required value of up and subtract the number of suitable nodes that are in the same subtree as v from the total number of suitable nodes. Thus, for each node v, we can find the number of suitable nodes u in time.

Now, we have to generalize this for the whole tree. We can use centroid decomposition. We pick the centroid as the root r and find the number of paths passing through r as above. Then, the other paths won't pass through r, so we can remove r and split the tree into more subtrees, and recursively solve for each subtree as well. Since each subtree is at most half the size of the original tree, and the time taken to solve the problem where the path must pass through the root for a single tree takes time proportional to the size of the tree, this solution works in time, where the other comes from using maps.

Time Complexity :


Div. 1 D — Create a Maze

Prerequisites : None

The solution to this problem is quite simple, if you get the idea. Thanks to danilka.pro for improving the solution to the current constraints which is much harder than my original proposal.

Note that to calculate the difficulty of a given maze, we can just use dp. We write on each square (room) the number of ways to get from the starting square to it, and the number written on (i, j) will be the sum of the numbers written on (i - 1, j) and (i, j - 1), and the edge between (i - 1, j) and (i, j) is blocked, we don't add the number written on (i - 1, j) and similarly for (i, j - 1). We'll call the rooms squares and the doors as edges. We'll call locking doors as edge deletions.

First, we look at several attempts that do not work.

Write t in its binary representation. To solve the problem, we just need to know how to construct a maze with difficulty 2x and x + 1 from a given maze with difficulty x. The most direct way to get from x to 2x is to increase both dimensions of the maze by 1. Let's say the bottom right square of the grid was (n, n) and increased to (n + 1, n + 1). So, the number x is written at (n, n). Then, we can block off the edge to the left of (n + 1, n) and above (n, n + 1). This will make the numbers in these two squares equal to x, so the number in square (n + 1, n + 1) would be 2x, as desired. To create x + 1 from x, we can increase both dimensions by 1, remove edges such that (n + 1, n) contains x while (n, n + 1) contains 1 (this requires deleting most of the edges joining the n-th column and (n + 1)-th column. Thus, the number in (n, n) would be x + 1. This would've used way too many edge deletions and the size of the grid would be too large. This was the original proposal.

There's another way to do it with binary representation. We construct a grid with difficulty 2x and 2x + 1 from a grid with difficulty x. The key idea is to make use of surrounding 1s and maintaining it with some walls so that 2x + 1 can be easily constructed. This method is shown in the picture below. This method would've used around 120 × 120 grid and 480 edge deletions, which is too large to pass.

Now, what follows is the AC solution. Since it's quite easy once you get the idea, I recommend you to try again after reading the hint. To read the full solution, click on the spoiler tag.

Hint : Binary can't work since there can be up to 60 binary digits for t and our grid size can be at most 50. In our binary solution we used a 2 × 2 grid to multiply the number of ways by 2. What about using other grid sizes instead?

Full Solution

Of course, this might not be the only way to solve this problem. Can you come up with other ways of solving this or reducing the constraints even further? (Open Question)

Time Complexity :


Div. 1 E — Complete The Permutations

Prerequisites : Math, Graph Theory, DP, Any fast multiplication algorithm

We'll slowly unwind the problem and reduce it to something easier to count. First, we need to determine a way to tell when the distance between p and q is exactly k. This is a classic problem but I'll include it here for completeness.

Let f denote the inverse permutation of q. So, the minimum number of swaps to transform p into q is the minimum number of swaps to transform pfi into the identity permutation. Construct the graph where the edges are for all 1 ≤ i ≤ n. Now, note that the graph is equivalent to and is composed of disjoint cycles after qi and pi are filled completely. Note that the direction of the edges doesn't matter so we consider the edges to be for all 1 ≤ i ≤ n. Note that if the number of cycles of the graph is t, then the minimum number of swaps needed to transform p into q would be n - t. (Each swap can break one cycle into two) This means we just need to find the number of ways to fill in the empty spaces such that the number of cycles is exactly i for all 1 ≤ i ≤ n.

Now, some of the values pi and qi are known. The edges can be classified into four types :

A-type : The edges of the form , i.e. pi is known, qi isn't.

B-type : The edges of the form , i.e. qi is known, pi isn't.

C-type : The edges of the form , i.e. both pi and qi are known.

D-type : The edges of the form , i.e. both pi and qi are unknown.

Now, the problem reduces to finding the number of ways to assign values to the question marks such that the number of cycles of the graph is exactly i for all 1 ≤ i ≤ n. First, we'll simplify the graph slightly. While there exists a number x appears twice (clearly it can't appear more than twice) among the edges, we will combine the edges with x together to simplify the graph. If there's an edge , then we increment the total number of cycles by 1 and remove this edge from the graph. If there is an edge and , where a and b might be some given numbers or question marks, then we can merge them together to form the edge . Clearly, these are the only cases for x to appear twice. Hence, after doing all the reductions, we're reduced to edges where each known number appears at most once, i.e. all the known numbers are distinct. We'll do this step in O(n2). For each number x, store the position i such that pi = x and also the position j such that qj = x, if it has already been given and  - 1 otherwise. So, we need to remove a number when the i and j stored are both positive. We iterate through the numbers from 1 to n. If we need to remove a number, we go to the two positions where it occur and replace the two edges with the new merged one. Then, recompute the positions for all numbers (takes O(n) time). So, for each number, we used O(n) time. (to remove naively and update positions) Thus, the whole complexity for this part is O(n2). (It is possible to do it in O(n) with a simple dfs as well. Basically almost any correct way of doing this part that is at most O(n3) works, since the constraints for n is low)

Now, suppose there are m edges left and p known numbers remain. Note that in the end when we form the graph we might join edges of the form and (where a and b are either fixed numbers or question marks) together. So, the choice for the ? can be any of the m - p remaining unused numbers. Note that there will be always m - p such pairs so we need to multiply our answer by (m - p)! in the end. Also, note that the ? are distinguishable, and order is important when filling in the blanks.

So, we can actually reduce the problem to the following : Given integers a, b, c, d denoting the number of A-type, B-type, C-type, D-type edges respectively. Find the number of ways to create k cycles using them, for all 1 ≤ k ≤ n. Note that the answer is only dependent on the values of a, b, c, d as the numbers are all distinct after the reduction.

First, we'll look at how to solve the problem for k = 1. We need to fit all the edges in a single cycle. First, we investigate what happens when d = 0. Note that we cannot have a B-type and C-type edge before an A-type or C-type edge, since all numbers are distinct so these edges can't be joined together. Similarly, an A or C-type edge cannot be directly after a B or C-type edge. Thus, with these restrictions, it is easy to see that the cycle must contain either all A-type edges or B-type edges. So, the answer can be easily calculated. It is also important to note that if we ignore the cyclic property then a contiguous string of edges without D must be of the form AA...BB.. or AA...CBB..., where there is only one C, and zero or more As and Bs.

Now, if d ≥ 1, we can fix one of the D-type edges as the front of the cycle. This helps a lot because now we can ignore the cyclic properties. (we can place anything at the end of the cycle because D-type edges can connect with any type of edges) So, we just need to find the number of ways to make a length n - 1 string with a As, b Bs, c Cs and d - 1 Ds. In fact, we can ignore the fact that the A-type edges, B-type edges, C-type edges and D-type edges are distinguishable and after that multiply the answer by a!b!c!(d - 1)!.

We can easily find the number of valid strings we can make. First, place all the Ds. Now, we're trying to insert the As, Bs and Cs into the d empty spaces between, after and before the Ds. The key is that by our observation above, we only care about how many As, Bs and Cs we insert in each space since after that the way to put that in is uniquely determined. So, to place the As and Bs, we can use the balls in urns formula to find that the number of ways to place the As is and the number of ways to place the Bs is . The number of ways to place the Cs is , since we choose where the Cs should go.

Thus, it turns out that we can find the answer in O(1) (with precomputing binomial coefficients and factorials) when k = 1. We'll use this to find the answer for all k. In the general case, there might be cycles that consists entirely of As and entirely of Bs, and those that contains at least one D. We call them the A-cycle, B-cycle and D-cycles respectively.

Now, we precompute f(n, k), the number of ways to form k cycles using n distinguishable As. This can be done with a simple dp in O(n3). We iterate through the number of As we're using for the first cycle. Then, suppose we use m As. The number of ways to choose which of the m As to use is and we can permute them in (m - 1)! ways inside the cycle. (not m! because we have to account for all the cyclic permutations) Also, after summing this for all m, we have to divide the answer by k, to account for overcounting the candidates for the first cycle (the order of the k cycles are not important)

Thus, f(n, k) can be computed in O(n3). First, we see how to compute the answer for a single k. Fix x, y, e, f, the number of A-cycles, B-cycles, number of As in total among the A-cycles and number of Bs in total among the B-cycles. Then, since k is fixed, we know that the number of D-cycles is k - x - y. Now, we can find the answer in O(1). First, we can use the values of f(e, x), f(f, y), f(d, k - x - y) to determine the number of ways to place the Ds, and the As, Bs that are in the A-cycles and B-cycles. Then, to place the remaining As, Bs and Cs, we can use the same method as we did for k = 1 in O(1), since the number of spaces to place them is still the same. (You can think of it as each D leaves an empty space to place As, Bs and Cs to the right of it) After that, we multiply the answer by to account for the choice of the set of As and Bs used in the A-only and B-only cycles. Thus, the complexity of this method is O(n4) for each k and O(n5) in total, which is clearly too slow.

We can improve this by iterating through all x + y, e, f instead. So, for this to work we need to precompute f(e, 0)f(f, x + y) + f(e, 1)f(f, x + y - 1) + ... + f(e, x + y)f(f, 0), which we can write as g(x + y, e, f). Naively doing this precomputation gives O(n4). Then, we can calculate the answer by iterating through all x + y, e, f and thus getting O(n3) per query and O(n4) for all k. This is still too slow to pass n = 250.

We should take a closer look of what we're actually calculating. Note that for a fixed pair e, f, the values of g(x + y, e, f) can be calculated for all possible x + y in or O(n1.58) by using Number Theoretic Transform or Karatsuba's Algorithm respectively. (note that the modulus has been chosen for NFT to work) This is because if we fix e, f, then we're precisely finding the coefficients of the polynomial (f(e, 0)x0 + f(e, 1)x1 + ... + f(e, n)xn)(f(f, 0)x0 + f(f, 1)x1 + ... + f(f, n)xn), so this can be handled with NFT/Karatsuba.

Thus, the precomputation of g(x + y, e, f) can be done in or O(n3.58).

Next, suppose we fixed e and f. We will calculate the answer for all possible k in similar to how we calculated g(x + y, e, f). This time, we're multiplying the following two polynomials : f(d, 0)x0 + f(d, 1)x1 + ... + f(d, n)xn and g(0, e, f)x0 + g(1, e, f)x1 + ... + g(n, e, f)xn. Again, we can calculate this using any fast multiplication method, so the entire solution takes or O(n3.58), depending on which algorithm is used to multiply polynomials.

Note that if you're using NFT/FFT, there is a small trick that can save some time. When we precompute the values of g(x + y, e, f), we don't need to do inverse FFT on the result and leave it in the FFTed form. After that, when we want to find the convolution of f(d, i) and g(i, e, f), we just need to apply FFT to the first polynomial and multiply them. This reduces the number of FFTs and it reduced my solution runtime by half.

Time Complexity : or O(n3.58), depending on whether NFT or Karatsuba is used.

Code (NFT)
Code (Karatsuba)

Read more »

  • Vote: I like it
  • +173
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

Hi everyone, it's me again!

Codeforces Round #372 (Div. 1 + Div. 2) will take place on 17 September 2016 at 16:35 MSK,

After my last round, this will be my second round on Codeforces. I believe you'll find the problems interesting and I hope you'll enjoy the round.

This round would not be possible without danilka.pro who improved one of the problems that made this round possible, and also helped in preparing and testing the round. Also, thanks to all the testers, IlyaLos, HellKitsune and phobos and thanks to MikeMirzayanov for the awesome Codeforces and Polygon platforms.

ZS the Coder and Chris the Baboon's trip in Udayland is over. In this round, you'll help ZS the Coder solve the problems he have randomly came up with. Do you have what it takes to solve them all?

The problems are sorted by difficulty but as always it's recommended to read all the problems.

We wish you'll have many Accepted solutions and enjoy the problems. :)

As usual, the scoring will be published right before the contest.

UPD : There will be 5 problems in both division as usual.

Scoring :

Div. 2 : 500 — 1000 — 1500 — 2000 — 2500

Div. 1 : 500 — 1000 — 1500 — 25002750

Good luck and I hope you enjoy the problems!

UPD : Contest is over. I hope you enjoyed the contest and problems :) I'm sure some of you wants to see the editorial now, so here it is while we wait for System Test to start.

UPD : System tests is over. Here're the winners :

Division 1 :

  1. Retired_MiFaFaOvO

  2. matthew99

  3. jerry

  4. Burunduk1

  5. JoeyWheeler

  6. LHiC

  7. SanSiroWaltz

  8. dotorya

  9. Arterm

  10. zeliboba

Division 2 :

  1. amethyst0


  3. huzzah

  4. Vax

  5. KyleChen

  6. LeiQ

  7. Tomer.Adar

  8. creatnx

  9. OrangePlus

  10. Yukikaze

Congratulations to them!

Read more »

  • Vote: I like it
  • +423
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

Here are the editorials for all the problems. Hope you enjoyed them and found them interesting!

Tutorial is loading...
Tutorial is loading...
Tutorial is loading...
Code (O(nkm^2))
Code (O(nkm))
Tutorial is loading...
Tutorial is loading...

Read more »

  • Vote: I like it
  • +100
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

Important Update: Our friends have noticed that the upcoming round collides with their contest and also weekend is full of many another contests, so the round is now moved to Monday, 29 August 2016 15:05 MSK. We are sorry for the inconvenience caused and hope that you'll understand us.

Hi everyone!

Codeforces Round #369 (Div. 2) will take place on 27 August 2016 at 16:05 MSK. As usual, Div.1 participants can join out of competition.

I would like to thank danilka.pro for helping me with the preparation of the round, MikeMirzayanov for the amazing Codeforces and Polygon platforms and also Phyto for testing the problems.

I am the author of all the problems, and danilka.pro also helped making one of the problems harder. This is my first round on Codeforces! Hope everyone will enjoy the problems and find them interesting. It is advisable to read all the problems ;)

In this round, you will help ZS the Coder and Chris the Baboon while they are on an adventure in Udayland. Can you help them solve their problems? :)

Good luck, have fun, and wish everyone many Accepted Solutions. :)

UPD : Also thanks to IlyaLos and HellKitsune for testing the problems too.

UPD 2 : There will be 5 problems and the scoring is standard : 500-1000-1500-2000-2500.

UPD 3 : Editorial

UPD 4 :

Congratulations to the winners :

Div. 1 winners :

  1. matthew99

  2. uwi

  3. Egor

  4. Um_nik

  5. kmjp

Div. 2 Winners :

  1. zhabo

  2. ysy_ioi_pengbei

  3. YangJunzhao

  4. lych_cys

  5. Zharaskhan

Read more »

  • Vote: I like it
  • +286
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

Hi everyone! I created a small group here which is open for public. There will be 3 5-hour contests held there featuring 3 problems each and the problems are taken from olympiads of different countries as well as problems from other sites (though these are rare) The contests will be in ACM-ICPC mode. (since this is the default CF mode)

Since almost all of the problems are unoriginal, it is very likely that you might have seen some of the problems before. Everyone is welcome to join the group and participate in any contest anytime.

The schedule of the contests have been posted in the group. Additionally, Silver_ told me he have uploaded some Croatian OI problems before, so he might also add it to the group as well.

Read more »

  • Vote: I like it
  • +38
  • Vote: I do not like it

By zscoder, history, 5 years ago, In English

Problem statement :

There are N planets in the galaxy. For each planet i, it has two possible states : empty, or it is connected to some other planet j ≠ i with a one-way path. Initially, each planet is in the empty state (i.e. there are no paths between any pair of planets)

There are three types of operations :

  1. If planet i is in empty state, connect it with planet j with a one-way path (i and j are distinct)

  2. If planet i is currently connected to planet j with a one-way path, remove that path. Consequently, planet i is now in empty state again

  3. For a pair of planets u, v, find a planet x such that one can travel from u to x and v to x. If there are multiple such planets, find the one where the total distance travelled by u and v is minimized (distance is the number of paths it passes through). If there are no solutions, output  - 1.

Q, N ≤ 106. Time limit is 10 seconds.

One of the official solutions uses splay tree to solve this problem, but I have no idea how it works (I haven't use splay trees before). Can anyone tell me how to use splay tree on this problem? Thanks.

Read more »

  • Vote: I like it
  • +6
  • Vote: I do not like it