This editorial corresponds to 2018 Chinese Multi-University Training, BeihangU Contest (stage 5), which was held on Aug 6th, 2018. Moreover, this problem set was also used as Jingzhe Tang Contest 1 in Petrozavodsk Winter Camp on Jan 30th, 2019.
This post will try to elaborate on notes, solutions, and maybe data generation. Until that is finished, you can refer to the previously published material.
102114A - Always Online
This problem asks you to calculate the $$$s$$$-$$$t$$$ min cut between every pair of vertices of a weighted cactus graph with $$$n$$$ vertices, denoted by $$$\mathrm{flow}(s, t)$$$. You need to report $$$\sum_{s < t}{(s \oplus t \oplus \mathrm{flow}(s, t))}$$$.
$$$n \leq 10^5$$$, $$$\sum{n} \leq 10^6$$$, weights are $$$\leq 10^9$$$.
Try to find some feature of this graph.
solutionBy contradiction, we can prove that for a connected undirected graph, each edge belongs to at most one cycle if and only if there are at most two non-intersecting paths between any two vertices. In addition, the maximum flow problem and the minimum cut problem are equivalent, so we only need to consider how to cut a set of edges with the lowest total capacity so that the source vertex $$$s$$$ is disconnected from the sink vertex $$$t$$$.
Let's take a look at the $$$s$$$-$$$t$$$ cut problem. If at least two edges of one cycle are cut, an exchange argument shows that cutting at most two of them suffices, since there are at most two non-intersecting paths between $$$s$$$ and $$$t$$$. Likewise, if edges from several cycles are cut, adjustment and contradiction show that cutting edges of at most one of those cycles suffices.
One observation is that if at least one edge of some cycle is cut in an optimal solution of the minimum cut problem, then exactly two edges of that cycle are cut, one of which is the edge with the smallest capacity in the cycle. Hence, we can remove the edge with the smallest capacity in each cycle and add its capacity to the other edges of the cycle, and the maximum flow between any two vertices will not change in value.
The modified graph forms a tree. If we add its edges to an empty graph in descending order of capacity, we can determine that $$$\mathrm{flow}(s, t) = w$$$ $$$(s \in S, t \in T)$$$ for the two sets of vertices $$$S$$$ and $$$T$$$ that first become connected after joining the edge of capacity $$$w$$$. Just maintain these sets of vertices with a disjoint set union, count in each set the number of vertices whose bit at each fixed position is $$$1$$$, and then the answer is easy to calculate by enumerating bits. Time complexity is $$$\mathcal{O}(m \log n)$$$.
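For concreteness, here is a minimal sketch of the counting phase (identifiers and the assumed edge-list interface are ours). It assumes the cycle reduction has already produced the tree, and it simply enumerates all $$$32$$$ bits, i.e. $$$\mathcal{O}(m \log w)$$$; the note below explains how to refine this to $$$\mathcal{O}(m \log n)$$$. Per the same note, the answer can exceed $$$2^{63}$$$, so it is accumulated in an unsigned $$$64$$$-bit integer.

```cpp
#include <bits/stdc++.h>
using namespace std;

const int B = 32;                 // labels < 2^17, adjusted capacities < 2^31

unsigned long long solve(int n, vector<tuple<int,int,long long>> edges) {
    sort(edges.begin(), edges.end(), [](const auto& a, const auto& b) {
        return get<2>(a) > get<2>(b);             // descending capacity
    });
    vector<int> par(n + 1), sz(n + 1, 1);
    iota(par.begin(), par.end(), 0);
    vector<array<long long, B>> cnt(n + 1);       // cnt[root][b]: #vertices with bit b set
    for (int v = 1; v <= n; ++v)
        for (int b = 0; b < B; ++b) cnt[v][b] = v >> b & 1;
    function<int(int)> find = [&](int v) { return par[v] == v ? v : par[v] = find(par[v]); };

    unsigned long long ans = 0;
    for (auto& [u, v, w] : edges) {
        int a = find(u), b = find(v);
        unsigned long long pairs = (unsigned long long)sz[a] * sz[b];
        for (int bit = 0; bit < B; ++bit) {
            // pairs (s, t) across the two components with this bit set in s xor t
            unsigned long long one =
                (unsigned long long)cnt[a][bit] * (sz[b] - cnt[b][bit]) +
                (unsigned long long)cnt[b][bit] * (sz[a] - cnt[a][bit]);
            if (w >> bit & 1) one = pairs - one;  // xor in the flow's bit
            ans += one << bit;
        }
        if (sz[a] < sz[b]) swap(a, b);            // union by size
        par[b] = a, sz[a] += sz[b];
        for (int bit = 0; bit < B; ++bit) cnt[a][bit] += cnt[b][bit];
    }
    return ans;
}
```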
noteNote that $$$\mathrm{flow}(s, t)$$$ can be $$$2 \times 10^9$$$, which means the answer may exceed $$$2^{63}$$$. To reach the peak value, we may just construct a cycle whose edges all have the same weight $$$10^9$$$.
When enumerating bits, we can treat the lowest $$$(\log n)$$$ bits and other bits differently, which explains why the time complexity is $$$\mathcal{O}(m \log n)$$$ rather than $$$\mathcal{O}(m \log w)$$$.
102114B - Beautiful Now
Given an integer $$$n$$$, you are asked to swap its digits in exactly $$$k$$$ turns. In each turn, you can swap two digits (which may be equal), but after each turn, $$$n$$$ must not have a leading zero. Calculate the maximal and the minimal values you can get in the end.
$$$100$$$ tests, $$$n, k \leq 10^9$$$.
This problem is yet another problem related to swapping. Can you solve it simply and elegantly?
solution 1For each permutation of positions, the minimum number of swaps needed to reach it equals the number of positions minus the number of disjoint cycles in the permutation.
We can solve the problem by enumerating all permutations of the digit positions. For each permutation, we only need to check whether it can be reached within $$$k$$$ swaps, since a swap that leaves the number unchanged is permitted and spare turns can be wasted.
Time complexity is expected to be $$$\mathcal{O}(m!)$$$, where $$$m = \left \lfloor \log_{10} n \right \rfloor + 1$$$. In order to achieve that, we can maintain some information about chains and cycles so that we can update and retrieve efficiently during searching and backtracking.
solution 2For each permutation of positions, the minimum number of swaps needed to reach it equals the number of positions minus the number of disjoint cycles in the permutation.
We can precalculate the minimum number of steps to obtain a permutation from the original one, and then enumerate and check. Time complexity is $$$\mathcal{O}\left(\sum_{i = 1}^{9}{(i \cdot i!)} + T m!\right)$$$, where $$$m = \left \lfloor \log_{10} n \right \rfloor + 1$$$.
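A minimal sketch of the enumerate-and-check idea follows (the I/O format and output order are assumed; for brevity the cycle count is recomputed per permutation instead of being precalculated as in solution 2):

```cpp
#include <bits/stdc++.h>
using namespace std;

// A permutation of the m digit positions is reachable in exactly k turns
// iff its minimum number of swaps, i.e. m minus its number of disjoint
// cycles, is at most k, because a turn may leave the number unchanged.
int main() {
    long long n; int k;
    while (cin >> n >> k) {
        string s = to_string(n);
        int m = s.size();
        vector<int> p(m);
        iota(p.begin(), p.end(), 0);
        long long lo = LLONG_MAX, hi = LLONG_MIN;
        do {
            if (m > 1 && s[p[0]] == '0') continue;   // no leading zero
            vector<bool> seen(m, false);
            int cycles = 0;
            for (int i = 0; i < m; ++i)
                if (!seen[i]) {
                    ++cycles;
                    for (int j = i; !seen[j]; j = p[j]) seen[j] = true;
                }
            if (m - cycles > k) continue;            // needs too many swaps
            long long v = 0;
            for (int i = 0; i < m; ++i) v = v * 10 + (s[p[i]] - '0');
            lo = min(lo, v), hi = max(hi, v);
        } while (next_permutation(p.begin(), p.end()));
        cout << lo << ' ' << hi << '\n';
    }
}
```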
noteSome solutions cannot handle the case $$$n = 10^9$$$, but that case can be solved by hand.
Many greedy solutions can be hacked even by random tests, so... the dataset is almost randomized. :P
Wait, wait, wait... Doesn't it seem like a notorious coincidence with this problem? What? That problem has an incredible data range... ($$$n < 10^{1000}$$$) Is it really solvable??? Oh, I can't believe it!!! >_<
102114C - Call It What You Want
This problem asks you to factorize the polynomial $$$(x^n - 1)$$$ over the integers. Besides, the statement contains some mathematical formulas you may need to apply. Maybe you just need some observations to solve it.
$$$n \leq 10^5$$$, $$$\sum{n} \leq 5 \times 10^6$$$.
solutionAs we know, the Möbius inversion formula can be expressed as
$$$f(n) = \sum_{d | n}{g(d)} \Leftrightarrow g(n) = \sum_{d | n}{\mu(n / d) f(d)}\text{,}$$$where $$$\mu$$$ is the Möbius function and the sums extend to all positive divisors $$$d$$$ of $$$n$$$.
Similarly, we can obtain the following formula.
$$$f(n) = \prod_{d | n}{g(d)} \Leftrightarrow g(n) = \prod_{d | n}{f(d)^{\mu(n / d)}}$$$If we apply this formula, the problem can be solved in $$$\mathcal{O}(2^{\omega(n)} n)$$$, where $$$\omega(n)$$$ denotes the number of distinct prime factors of $$$n$$$. To prove that, you need to know what the Möbius function is, and observe that $$$f(n)$$$ in this problem is a polynomial with only two non-zero terms. In addition, $$$\sum_{d | n}{\varphi(d)} = n$$$.
By the way, $$$\omega(n) \leq 6$$$ when $$$n \leq 10^5$$$, because $$$2 \times 3 \times 5 \times 7 \times 11 \times 13 \times 17 = 510510 > 10^5$$$.
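To make the sparse-product idea concrete, here is a minimal sketch (identifiers are ours) computing a single cyclotomic polynomial $$$\Phi_n(x) = \prod_{d | n}{(x^{n / d} - 1)^{\mu(d)}}$$$; the full factorization $$$x^n - 1 = \prod_{d | n}{\Phi_d(x)}$$$ applies this to every divisor of $$$n$$$. Each factor has only two non-zero terms, so every multiplication or exact division is a single linear pass.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Doing all multiplications before the divisions keeps every division exact.
vector<long long> cyclotomic(int n) {
    int m = n;
    vector<int> primes;
    for (int p = 2; (long long)p * p <= m; ++p)
        if (m % p == 0) { primes.push_back(p); while (m % p == 0) m /= p; }
    if (m > 1) primes.push_back(m);
    vector<int> mul, divi;                    // degrees n/d with mu(d) = +1 / -1
    int k = primes.size();
    for (int mask = 0; mask < (1 << k); ++mask) {
        int d = 1;                            // d: squarefree divisor of n
        for (int i = 0; i < k; ++i) if (mask >> i & 1) d *= primes[i];
        (__builtin_popcount(mask) % 2 == 0 ? mul : divi).push_back(n / d);
    }
    vector<long long> P{1};
    for (int d : mul) {                       // multiply by (x^d - 1)
        vector<long long> Q(P.size() + d, 0);
        for (int i = 0; i < (int)P.size(); ++i) Q[i + d] += P[i], Q[i] -= P[i];
        P = move(Q);
    }
    for (int d : divi) {                      // exact division by (x^d - 1)
        vector<long long> Q(P.size() - d, 0);
        for (int i = (int)P.size() - 1; i >= d; --i)
            Q[i - d] = P[i] + (i < (int)Q.size() ? Q[i] : 0);
        P = move(Q);
    }
    return P;                                 // coefficients, low to high
}
```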
noteQ: Why does Fast Fourier Transform get a verdict of TLE? It seems $$$\log n \sim 2^{\omega(n)}$$$ when $$$n \leq 10^5$$$.
A: FFT has a large constant factor. Also, the solution above requires no floating-point operations.
102114D - Daylight
This problem can be restated as follows: given an unweighted tree with $$$n$$$ vertices, you need to determine the union of two sets and report its size for each of $$$m$$$ queries, where each set consists of the vertices whose distances to a given vertex are no more than a fixed value, and this value is the same for both sets. However, queries are encrypted, so you need to handle them one by one (online).
$$$10$$$ large tests, $$$n, m \leq 10^5$$$.
Emmm... A typical data structure problem, right?
Certainly I've found it on CodeChef. What? Why TLE???
solutionThe union is difficult to compute but the intersection is easy, so let's apply $$$|S_u \cup S_v| = |S_u| + |S_v| - |S_u \cap S_v|$$$, where $$$S_x$$$ is the set of vertices whose distances to $$$x$$$ are no more than $$$w$$$.
Note that the tree is not changed, so you can precalculate the structure of the tree's centroid decomposition, in order to calculate $$$|S_u|$$$ online.
Since the restricted distances are the same, when $$$S_u$$$ and $$$S_v$$$ have a non-empty intersection, their intersection equals the set of vertices whose distance to the midpoint of the path between them is no more than $$$\left(w - \frac{dist(u, v)}{2}\right)$$$, which can be counted similarly. To avoid a lot of case analysis, you can subdivide each edge of the tree with an extra vertex.
Time complexity and space complexity are both $$$\mathcal{O}(n \log n)$$$. However, some solutions may lose out due to cache-unfriendly behavior. One way to ease that is to skip the subdivision and do the case analysis instead.
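Below is a minimal sketch (identifiers are ours, recursive DFS kept for brevity) of the centroid-decomposition counter used for $$$|S_u|$$$; for the intersection, subdivide the edges beforehand and ask the same query at the midpoint vertex of the $$$u$$$-$$$v$$$ path with the adjusted radius.

```cpp
#include <bits/stdc++.h>
using namespace std;

// After building once in O(n log n), query(u, w) = #{v : dist(u, v) <= w}
// runs in O(log^2 n), so it can be answered online.
struct CentroidCounter {
    vector<vector<int>> g;
    vector<int> sz;
    vector<bool> removed;
    vector<vector<pair<int,int>>> anc; // per vertex: (centroid ancestor, dist), top-down
    vector<vector<int>> self_d;        // sorted dists from centroid c within its component
    vector<vector<int>> par_d;         // sorted dists from parent(c) to c's component

    CentroidCounter(const vector<vector<int>>& adj)
        : g(adj), sz(g.size()), removed(g.size(), false),
          anc(g.size()), self_d(g.size()), par_d(g.size()) { decompose(0); }

    int calc_size(int v, int p) {
        sz[v] = 1;
        for (int to : g[v]) if (to != p && !removed[to]) sz[v] += calc_size(to, v);
        return sz[v];
    }
    int find_centroid(int v, int p, int total) {
        for (int to : g[v])
            if (to != p && !removed[to] && 2 * sz[to] > total)
                return find_centroid(to, v, total);
        return v;
    }
    void gather(int v, int p, int d, int c, vector<int>& out) {
        out.push_back(d);
        if (c >= 0) anc[v].push_back({c, d});
        for (int to : g[v]) if (to != p && !removed[to]) gather(to, v, d + 1, c, out);
    }
    int decompose(int entry) {
        int c = find_centroid(entry, -1, calc_size(entry, -1));
        gather(c, -1, 0, c, self_d[c]);
        sort(self_d[c].begin(), self_d[c].end());
        removed[c] = true;
        for (int to : g[c]) if (!removed[to]) {
            vector<int> from_c;
            gather(to, -1, 1, -1, from_c);      // distances measured from c
            int cc = decompose(to);             // centroid of that child component
            sort(from_c.begin(), from_c.end());
            par_d[cc] = move(from_c);
        }
        return c;
    }
    long long query(int u, int w) const {       // #{v : dist(u, v) <= w}
        long long res = 0;
        int child = -1;
        for (auto it = anc[u].rbegin(); it != anc[u].rend(); ++it) {
            auto [c, d] = *it;                  // deepest ancestor first
            int rem = w - d;
            if (rem >= 0) {
                res += upper_bound(self_d[c].begin(), self_d[c].end(), rem)
                       - self_d[c].begin();
                if (child >= 0)                 // already counted at a deeper level
                    res -= upper_bound(par_d[child].begin(), par_d[child].end(), rem)
                           - par_d[child].begin();
            }
            child = c;
        }
        return res;
    }
};
```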
102114E - Everything Has Changed
A geometry problem to make sure you can check in during this contest. Read the statement for more details.
solutionNote that the cutting areas do not intersect, which implies the different cuts do not affect each other, so we can reduce the problem to calculating angles of triangles. The law of cosines would be helpful.
Anyway, you can paste your geometry coding template to solve it. It seems there are many problems similar to this one.
noteThough the data range is small, we should be careful calling functions like acos(x), whose argument $$$x$$$ must lie in $$$[-1, 1]$$$, or otherwise NaN will give you a lesson.
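A tiny guard of this kind is enough (a helper of ours, not from any reference solution):

```cpp
#include <algorithm>
#include <cmath>

// Rounding can push a cosine slightly outside [-1, 1], e.g.
// 1.0000000000000002, and acos would then return NaN. Clamp first.
double safe_acos(double x) {
    return std::acos(std::max(-1.0, std::min(1.0, x)));
}
```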
102114F - Fireflies
Consider all the lattice points in a hypercube $$$\lbrace (x_1, x_2, \ldots, x_n) | 1 \leq x_i \leq p_i \rbrace$$$. Find a maximum-size subset containing no two points $$$(x_1, x_2, \ldots, x_n)$$$, $$$(y_1, y_2, \ldots, y_n)$$$ with $$$x_i \leq y_i$$$ for all $$$i$$$. Report its size modulo $$$(10^9 + 7)$$$.
$$$n \leq 32$$$, $$$p_i \leq 10^9$$$.
How fast can you achieve to solve it?
solutionThe statement is a bit different from the above; however, they are equivalent by applying Dilworth's theorem.
We may be familiar with the case $$$p_i = 2$$$ for all $$$i$$$, which is equivalent to finding a largest subset of the power set of some set $$$S$$$ such that none of the sets is a strict subset of another. The conclusion, also known as Sperner's theorem, is that the size of such a largest subset equals $$${|S| \choose \left \lfloor |S| / 2 \right \rfloor}$$$.
A generalization of Sperner's theorem shows that the answer to this problem is the number of points satisfying $$$\sum_{i}{x_i} = M$$$, where $$$M = \left \lfloor \frac{1}{2} \sum_{i}{(p_i + 1)} \right \rfloor$$$. The proof is similar, so we omit it.
Then, as in many counting problems of the form $$$\sum_{i}{x_i} = M~(1 \leq x_i \leq p_i)$$$, we can use the inclusion-exclusion principle to get the answer
$$$\sum_{I \subseteq J}{(-1)^{|I|}{M - 1 - \sum_{i \in I}{p_i} \choose n - 1}}\text{,}$$$where $$$J = \lbrace 1, 2, \ldots, n \rbrace$$$.
The remaining part is straightforward. We can regard $$${M - 1 - \sum_{i \in I}{p_i} \choose n - 1}$$$ as a low-degree polynomial of $$$\sum_{i \in I}{p_i}$$$ and use the meet-in-the-middle approach to split $$$I$$$ into two halves almost evenly, as detailed below. Time complexity is $$$\mathcal{O}(2^{n / 2} n^2)$$$.
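To make the meet-in-the-middle step concrete (the notation here is ours, not necessarily the intended implementation), split $$$J$$$ into halves $$$A$$$ and $$$B$$$, write $$$S_I = \sum_{i \in I}{p_i}$$$, and expand the binomial coefficient as a polynomial, $$${M - 1 - S \choose n - 1} = \sum_{k = 0}^{n - 1}{a_k S^k}$$$. Since this polynomial vanishes at integer points with $$$0 \leq M - 1 - S \leq n - 2$$$ but not at negative ones, only the pairs with $$$S_{I_A} + S_{I_B} \leq M - 1$$$ should be kept:

$$$\sum_{I \subseteq J}{(-1)^{|I|}{M - 1 - S_I \choose n - 1}} = \sum_{I_A \subseteq A}{(-1)^{|I_A|} \sum_{k = 0}^{n - 1}{a_k \sum_{j = 0}^{k}{{k \choose j} S_{I_A}^{k - j} P_B(j, M - 1 - S_{I_A})}}}\text{,}$$$where $$$P_B(j, t) = \sum_{I_B \subseteq B, S_{I_B} \leq t}{(-1)^{|I_B|} S_{I_B}^j}$$$ can be read off from prefix sums over the subsets of $$$B$$$ sorted by $$$S_{I_B}$$$. Processing the subsets $$$I_A$$$ in decreasing order of $$$S_{I_A}$$$ makes the thresholds increase monotonically, and the double loop over $$$(k, j)$$$ gives the $$$\mathcal{O}(2^{n / 2} n^2)$$$ bound.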
Here is an old problem with smaller data range.
102114G - Glad You Came
There are $$$m$$$ operations over an array $$$a_1, a_2, \ldots, a_n$$$, each operation of which is to update $$$a_i$$$ $$$(l \leq i \leq r)$$$ by $$$\max(a_i, v)$$$. You need to determine the array after all the operations.
$$$n \leq 10^5$$$, $$$\sum{n} \leq 10^6$$$, $$$m \leq 5 \times 10^6$$$, $$$\sum{m} \leq 5 \times 10^7$$$. Besides, $$$l, r, v$$$ are randomly selected.
solution 1Updates and queries are not interleaved, so we can solve the problem offline. We can build a sparse table to postpone the updates, like lazy propagation on a segment tree.
For each operation $$$(l, r, v)$$$, let $$$d$$$ be $$$\left \lfloor \log_2(r - l + 1) \right \rfloor$$$, and replace this operation by two operations $$$(l, l + 2^d - 1, v)$$$ and $$$(r - 2^d + 1, r, v)$$$. Then each operation can be recorded in the sparse table. Conveniently, when two operations cover the same interval, we can just keep the one with the higher value. After that, we push each recorded operation down into two half-length ones, processing lengths in decreasing order.
Time complexity is $$$\mathcal{O}(m + n \log n)$$$ and space complexity is $$$\mathcal{O}(n \log n)$$$.
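A minimal sketch of solution 1 on a toy array (identifiers and the hard-coded operations are ours): each operation stamps the two canonical power-of-two intervals covering $$$[l, r]$$$, and the table is then pushed down level by level.

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n = 8;                                     // toy size
    vector<array<int, 3>> ops = {{0, 5, 3}, {2, 7, 5}};   // (l, r, v), 0-indexed
    int LOG = 1; while ((1 << LOG) < n) ++LOG; ++LOG;
    vector<vector<int>> t(LOG, vector<int>(n, 0)); // t[j][i] covers [i, i + 2^j)
    for (auto [l, r, v] : ops) {
        int d = 31 - __builtin_clz(r - l + 1);     // floor(log2(length))
        t[d][l] = max(t[d][l], v);
        t[d][r - (1 << d) + 1] = max(t[d][r - (1 << d) + 1], v);
    }
    for (int j = LOG - 1; j >= 1; --j)             // push down, long to short
        for (int i = 0; i + (1 << j) <= n; ++i) {
            t[j - 1][i] = max(t[j - 1][i], t[j][i]);
            t[j - 1][i + (1 << (j - 1))] = max(t[j - 1][i + (1 << (j - 1))], t[j][i]);
        }
    for (int i = 0; i < n; ++i)                    // final array: 3 3 5 5 5 5 5 5
        cout << t[0][i] << " \n"[i + 1 == n];
}
```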
solution 2We can implement radix sort carefully, in a cache-friendly way, so that we are able to enumerate operations in decreasing order of value. Then using some trick to touch each position roughly once may lead to success.
However, due to cache-unfriendly behavior, only one team achieved that.
solution 3Due to the randomness of operations, there are several approaches to pass.
For example, if we assume the number of useful operations is about $$$\min(m, 2 n)$$$, we can use a linear approach to select the $$$\min(m, 2 n)$$$ operations with the largest values, and then do whatever you want to solve this problem.
Another example is using the trick mentioned in Segment tree beats, created by Syloviaely. Amazingly, during the contest, a tremendous number of Chinese teams passed with this approach, even though many of them don't understand it.
102114H - Hills And Valleys
Given an array of length $$$n$$$, consisting of digits $$$0, 1, \ldots, 9$$$, your task is to reverse exactly one interval so as to make the longest non-decreasing subsequence of the whole array as long as possible. You are also required to report the reversed interval.
$$$n \leq 10^5$$$, $$$\sum{n} \leq 2 \times 10^5$$$.
solutionYou can write a typical DP to solve it, but notice that the more complex your solution is, the more tiresome the retrieval will be.
One elegant way is to enumerate the value range $$$[x, y]$$$ that needs to be reversed. In doing so, we only need to find the longest subsequence of the form $$$0^* 1^* \ldots x^* y^* (y - 1)^* \ldots x^* y^* (y + 1)^* \ldots 9^*$$$, where $$$k^*$$$ represents any non-negative number of occurrences of the integer $$$k$$$. It also makes the retrieval easy to implement.
This idea was mentioned in the tutorial for 933A - A Twisty Movement, created by visitWorld. Time complexity is $$$\mathcal{O}(10^3 n)$$$.
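A minimal sketch of the counting part (identifiers are ours; retrieval of the actual interval, which is easy in this formulation, is omitted):

```cpp
#include <bits/stdc++.h>
using namespace std;

// For each value range [x, y], match the stage pattern
// 0*1*...x* y*(y-1)*...x* y*(y+1)*...9* as a longest-subsequence DP.
int longest_after_reverse(const string& a) {
    int ans = 0;
    for (int x = 0; x <= 9; ++x)
        for (int y = x; y <= 9; ++y) {
            vector<int> st;                        // chain of accepted digits
            for (int v = 0; v <= x; ++v) st.push_back(v);
            for (int v = y; v >= x; --v) st.push_back(v);  // the reversed block
            for (int v = y; v <= 9; ++v) st.push_back(v);
            vector<int> f(st.size(), 0);  // f[j]: best length ending in stage <= j
            for (char ch : a) {
                int c = ch - '0', run = 0;
                for (size_t j = 0; j < st.size(); ++j) {
                    run = max(run, f[j] + (st[j] == c ? 1 : 0));
                    f[j] = run;           // keep f non-decreasing in j
                }
            }
            ans = max(ans, f.back());
        }
    return ans;
}
```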
102114I - Innocence
Count the number of solutions for the equation $$$x_1 \oplus x_2 \oplus \cdots \oplus x_N = K$$$, where $$$x_i$$$ can be any integer in $$$[L, R]$$$. There are $$$Q$$$ queries with the same $$$N, L, R$$$ but different $$$K$$$.
$$$100$$$ large tests, $$$N \leq 10^9$$$, $$$0 \leq L, R, K < 2^{30}$$$, $$$Q \leq 100$$$.
solutionLet's solve an easier version first. In this version, we are given $$$N, \lbrace R_i \rbrace, K$$$ and have to choose $$$N$$$ integers $$$x_1, x_2, \ldots, x_N$$$ such that $$$x_i \in [0, R_i]$$$ $$$(i = 1, 2, \ldots, N)$$$ and their bitwise XOR equals $$$K$$$. Similar problems can be found at Crystals — POI 2006 and Stone Game — Hackerrank.
In the case $$$x_i \in [0, 2^m - 1]$$$, we know the problem is related to an XOR equation on bits. Furthermore, when $$$(N - 1)$$$ of these variables are fixed, the remaining one is determined. So if any solution exists, the number of solutions is always $$$2^{m (N - 1)}$$$, no matter what $$$K$$$ is.
If we apply digit DP over bits, we can classify all the situations into groups by the highest bit at which at least one integer $$$x_j$$$ is strictly less than its upper bound $$$R_j$$$; then the lower bits of at least one integer $$$x_j$$$ can be arbitrary, so we can leave it aside first and determine it from the equation after the other integers are determined. It is not difficult to design a digit DP to count. Time complexity for this part is $$$\mathcal{O}(N \log R)$$$.
Now let's turn back to the harder version, where $$$N$$$ becomes bigger and the lower bound of $$$x_i$$$ is present. First, we can apply the inclusion-exclusion principle to eliminate the extra lower bound; in other words, we reduce the problem to many states in which $$$x_i \in [0, R]$$$ or $$$x_i \in [0, L - 1]$$$. Then, let's take a look at the digit DP again. We can record the sign caused by the inclusion-exclusion principle, and thus boost the identical transitions by binary exponentiation of the transition matrix. Time complexity is $$$\mathcal{O}((8^3 \log N + Q) \log R)$$$.
Oops... It seems I've missed an important part — why matrix multiplication can reduce the $$$2^N$$$ states to $$$2$$$. At each bit, we keep the higher bits equal to the upper bounds, use DP to determine the XOR sum at this bit, and determine the lower bits from the XOR equation. The only difference between states comes from which upper bounds the higher bits follow when we apply the inclusion-exclusion principle. Fortunately, their bitwise XOR sums differ only when the parities of the number of upper bounds equal to $$$R$$$ differ, so we can record just that parity, which coincides with the sign caused by the inclusion-exclusion principle, and implement it easily.
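For reference, a generic modular matrix-exponentiation helper of the kind used here (this is not the actual $$$8$$$-state transition, which depends on the digit-DP details above):

```cpp
#include <bits/stdc++.h>
using namespace std;

using Mat = vector<vector<long long>>;

// c = a * b modulo mod; mod is assumed small enough that
// (mod - 1)^2 fits in a 64-bit integer.
Mat mul(const Mat& a, const Mat& b, long long mod) {
    int s = a.size();
    Mat c(s, vector<long long>(s, 0));
    for (int i = 0; i < s; ++i)
        for (int k = 0; k < s; ++k) if (a[i][k])
            for (int j = 0; j < s; ++j)
                c[i][j] = (c[i][j] + a[i][k] * b[k][j]) % mod;
    return c;
}

// m^e in O(s^3 log e): applies e identical DP transitions at once.
Mat matpow(Mat m, long long e, long long mod) {
    int s = m.size();
    Mat r(s, vector<long long>(s, 0));
    for (int i = 0; i < s; ++i) r[i][i] = 1;   // identity
    for (; e; e >>= 1, m = mul(m, m, mod))
        if (e & 1) r = mul(r, m, mod);
    return r;
}
```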
noteBased on the fact about the case $$$x_i \in [0, 2^m - 1]$$$, many solutions can get accepted.
For example, let $$$dp(i, j)$$$ be the number of sequences $$$x_1, x_2, \ldots, x_i$$$ such that their XOR sum is $$$j$$$. We can observe that for each $$$i$$$, the values of $$$dp(i, j)$$$ split into several intervals with respect to $$$j$$$ such that the values within each interval are equal. Maintaining these intervals can get accepted.
Tester quailty: If we exploit this property more deeply, we can get the exact formula and solve the problem in time complexity $$$\mathcal{O}((10 \log N + Q) \log R)$$$.
102114J - Just So You Know
A non-empty substring $$$B$$$ is selected from a given string $$$A$$$ of length $$$n$$$ with equal probability among all the possible substrings. You are given the string $$$A$$$ and asked to determine what the substring $$$B$$$ is.
You can claim conjectures, and the jury will tell you whether each is true or false. You need to find a strategy that minimizes the expected number of claims and report that expectation as an irreducible fraction.
$$$n \leq 10^6$$$, $$$\sum{n} \leq 10^7$$$, character set size $$$\Sigma \leq 100$$$.
solutionA simpler version can be found at GuessTheSubstring -- TCO 2011 Semifinal 1 500, created by misof. The harder version increases the upper bound of $$$n$$$ and requires high precision.
Before introducing our linear solution, we have to confess that it is too hard to create a dataset against solutions in time complexity $$$\mathcal{O}(n \log n)$$$. Why are judging machines so fast?
The process of guessing essentially constructs a decision tree, which is equivalent to building the Huffman code over all substrings $$$B$$$ of $$$A$$$. What makes things difficult is that there are $$$\mathcal{O}(n^2)$$$ distinct substrings, so let's first reduce the number of distinct initial states. Note that we only care about each substring's number of occurrences rather than what it is, and that number is at most $$$n$$$. If we categorize substrings by their frequency, we can simulate the Huffman tree's construction quickly.
There are several approaches to count the frequencies of all substrings in linear time. Our approach is to build the suffix array with the SA-IS algorithm and then build the compressed suffix tree from this suffix array; counting on the suffix tree is easy. However, due to the character set size, a suffix automaton with space complexity $$$\mathcal{O}(n \Sigma)$$$ may fall down.
After counting, let's try to calculate the cost of building the Huffman tree in linear time. In this process, we need to repeatedly merge the two states with the lowest frequencies into a new state, whose number of occurrences is the sum of the two old ones, until only one state remains.
Denote by $$$c_i$$$ the number of states that occur exactly $$$i$$$ times $$$(i = 1, 2, \ldots, n)$$$. For the states with the number of occurrences $$$\leq n$$$, we can batch consecutive identical actions, for example merging up to $$$\left \lfloor \frac{c_{i}}{2} \right \rfloor$$$ pairs of states that occur $$$i$$$ times at once. For the states with the number of occurrences $$$> n$$$, we can conclude that there are fewer than $$$\frac{n}{2}$$$ of them, as $$$\sum_{i}{(i \cdot c_i)}$$$ is always equal to $$$\frac{n (n + 1)}{2}$$$. We can use a queue to maintain and merge these states so that this part finishes in linear time.
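To illustrate the queue-based merging, here is the classic two-queue Huffman computation on an explicitly sorted list (identifiers are ours; the batching over $$$c_i$$$ above is a further refinement of the same idea). Newly merged weights come out in non-decreasing order, so a second queue replaces the heap.

```cpp
#include <bits/stdc++.h>
using namespace std;

long long huffman_cost(const vector<long long>& freq /* sorted ascending */) {
    queue<long long> leaves, merged;
    for (long long f : freq) leaves.push(f);
    auto pop_min = [&]() {
        long long v;
        if (merged.empty() ||
            (!leaves.empty() && leaves.front() <= merged.front()))
            v = leaves.front(), leaves.pop();
        else
            v = merged.front(), merged.pop();
        return v;
    };
    long long cost = 0;                       // weighted external path length
    while (leaves.size() + merged.size() > 1) {
        long long a = pop_min(), b = pop_min();
        cost += a + b;
        merged.push(a + b);
    }
    return cost;    // expected #claims = cost / total frequency, reduced
}
```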
102114K - Kaleidoscope
Count the number of non-isomorphic rhombic hexecontahedra such that each face of each polyhedron is colored with one of $$$n$$$ given colors and the $$$i$$$-th color occurs on at least $$$c_i$$$ faces of each polyhedron. Report the number modulo $$$p$$$.
Two states are isomorphic if one can transform into another by 3D space rotation.
In geometry, a rhombic hexecontahedron is a nonconvex polyhedron with $$$60$$$ golden rhombic faces and icosahedral symmetry.
$$$100$$$ large tests, $$$n \leq 60$$$, $$$1 \leq p < 2^{30}$$$.
solutionThis is a classic problem on the Pólya enumeration theorem, similar to Dodecahedron -- Petrozavodsk Winter Camp 2008, Ural SU Contest. Neither is hard after applying the theorem.
By applying it, we only need to focus on the number of ways to color the orbits while meeting the conditions on the number of faces of each color. For each situation where the faces split into orbits of the same size $$$d$$$, we can count $$$dp(i, j)$$$, the number of ways such that the first $$$i$$$ colors have been used to color $$$j$$$ faces while meeting the conditions individually.
There are only $$$4$$$ types of equivalence classes, so the time complexity is $$$\mathcal{O}(4 \cdot 60^2 n)$$$.
By the way, the Pólya enumeration theorem requires a division. In order to do it, and do it only once, modulo $$$p$$$, you can calculate the other parts modulo $$$|G| \cdot p$$$, where $$$|G| = 60$$$ is the order of the rotation group, and then divide by $$$|G|$$$ directly. During that process, you should be careful with $$$64$$$-bit integer overflow.
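A tiny sketch of that trick (the helper is ours): since $$$p < 2^{30}$$$, residues modulo $$$60 p$$$ are below $$$2^{36}$$$, so a product of two residues can exceed $$$2^{64}$$$ and needs a $$$128$$$-bit intermediate.

```cpp
#include <cstdint>

// Accumulate the Burnside sum of fixed colorings modulo 60 * p with this,
// then the answer is (sum / 60) % p, where the division is exact.
uint64_t mulmod(uint64_t a, uint64_t b, uint64_t mod) {
    return (uint64_t)((__uint128_t)a * b % mod);
}
```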
102114L - Lost In The Echo
Calculate the $$$n$$$-th term of sequence A140606.
$$$n = 1, 2, \ldots, 6 \times 10^4$$$.
The solution will be added soon...
Which problem do you prefer or hate most? Feel free to share your thoughts in comments! ^_^