Problem analysis for Yandex.Algorithm 2018 final round

Smart Vending

Problem idea and development: GlebsHP.

First we observe that Arcadiy can't spend more money than he has, so the number of bottles he can buy doesn't exceed some bound z: his total amount of money divided by the price of one bottle, rounded down. However, he might not be able to buy exactly this many bottles: at some point he may not have enough coins to buy the next bottle without change, while the machine doesn't have enough coins to give change if he pays with banknotes only.

Note that the total number of coins doesn't change and is always c + d. Moreover, if c + d ≥ 999 999, then at any moment either Arcadiy has enough coins to buy a bottle without change, or the machine has enough coins to give change. In this case the answer is z.

If c + d < 999 999, then at any moment Arcadiy may have enough coins to buy a bottle without change, or the machine may have enough coins to give change, but never both. That means the process is uniquely determined and we can simulate it, but a naive simulation might take too long.

Denote by a_i the number of coins Arcadiy has after the first i moves (a_0 = c). Note that if a_i = a_j, then a_{i+1} = a_{j+1} (indeed, the way to buy the next bottle is uniquely defined by the value a_i), so the sequence repeats itself. We simulate buying Kvas-Klass bottle by bottle until we either meet a value of a_i that we have already seen or find that the next purchase is not possible. If a_i repeats, the answer is z; otherwise the answer is the number of purchases made before the first impossible one.
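The simulation with repeat detection can be sketched as follows. This is only an illustration: `count_purchases`, `make_step` and the toy purchase rule inside `make_step` (pay the coin part of the price exactly if possible, otherwise pay with banknotes and take the change in coins) are assumptions made for the example, not the exact rules of the problem statement.

```python
from typing import Callable, Optional


def count_purchases(start: int, step: Callable[[int], Optional[int]], z: int) -> int:
    """Follow a_0, a_1, ... where a_{i+1} = step(a_i).

    step returns None when the next purchase is impossible.
    z is the upper bound on purchases implied by the total money.
    """
    seen = set()
    a = start
    bought = 0
    while bought < z:
        if a in seen:
            # The coin count repeated, so the process cycles and never gets stuck.
            return z
        seen.add(a)
        nxt = step(a)
        if nxt is None:
            return bought
        a = nxt
        bought += 1
    return z


def make_step(total_coins: int, banknote: int, remainder: int):
    """Hypothetical purchase rule: `remainder` is the coin part of the price."""
    def step(a: int) -> Optional[int]:
        if a >= remainder:
            return a - remainder          # pay the coin part exactly
        change = banknote - remainder
        if total_coins - a >= change:     # machine holds total_coins - a coins
            return a + change             # overpay with a banknote, take change
        return None                       # neither option works
    return step
```

For instance, `count_purchases(5, make_step(5, 10, 3), 100)` stops after a single purchase, while with 12 coins in total the coin count eventually repeats and the function returns z. Since a_i takes at most c + d + 1 distinct values, the loop runs at most that many iterations.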

Exercise: solve the task if one banknote is 10¹² rubles.

LIS vs. LDS

Problem idea and development: Endagorion.

Let us draw n points p_i = (i, a_i) in the plane. Let I = (i₁, ..., i_k) and J = (j₁, ..., j_l) be a pair of non-intersecting LIS and LDS. From maximality of I, there can be no points strictly inside the following rectangles (described by their opposite corners):

A_0 = ((0, 0), p_{i_1});
A_x = (p_{i_x}, p_{i_{x+1}}) for all 1 ≤ x < k;
A_k = (p_{i_k}, (n + 1, n + 1)).

Similarly, since J is maximal, there can be no points inside these rectangles:

B_0 = ((0, n + 1), p_{j_1});
B_y = (p_{j_y}, p_{j_{y+1}}) for all 1 ≤ y < l;
B_l = (p_{j_l}, (n + 1, 0)).

These two chains of rectangles connect the opposite corners of the square ((0, 0), (n + 1, n + 1)). Assuming that I and J do not intersect, there must be an intersecting pair of rectangles A_x and B_y. Neither A_x nor B_y contains any points, hence there are only two ways they can intersect:

1. i_x < j_y < j_{y+1} < i_{x+1}, a_{j_{y+1}} < a_{i_x} < a_{i_{x+1}} < a_{j_y} (as in the first sample);
2. j_y < i_x < i_{x+1} < j_{y+1}, a_{i_x} < a_{j_{y+1}} < a_{j_y} < a_{i_{x+1}} (the mirror image of the previous option).

We will look for intersections of type 1 (the other type can be found similarly after reversing the whole permutation). Let's say that (i, i') is an LIS-pair if i and i' are consecutive elements of some LIS (an LDS-pair is defined similarly). Suppose that there is an LIS-pair (i, i') and an LDS-pair (j, j') such that i < j < j' < i' and a_{j'} < a_i < a_{i'} < a_j, that is, there is an intersection of type 1 between a certain LIS and LDS. This means that we can find an answer by constructing all LIS- and LDS-pairs and looking for such an intersection.
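To make the definition of an LIS-pair concrete, here is a quadratic sketch that lists all of them. It relies on a standard fact rather than on anything specific to this editorial: with `end[i]` the length of the longest increasing subsequence ending at i and `start[j]` the length of the longest one starting at j, the pair (i, j) is consecutive in some LIS exactly when a_i < a_j and `end[i] + start[j]` equals the LIS length. The function name `lis_pairs` is made up for the example, and the O(n²) loops are for illustration only.

```python
def lis_pairs(a):
    """Return all (i, j), i < j, that are consecutive elements of some LIS of a."""
    n = len(a)
    # end[i]: length of the longest increasing subsequence ending at index i
    end = [1] * n
    for i in range(n):
        for j in range(i):
            if a[j] < a[i]:
                end[i] = max(end[i], end[j] + 1)
    # start[i]: length of the longest increasing subsequence starting at index i
    start = [1] * n
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            if a[i] < a[j]:
                start[i] = max(start[i], start[j] + 1)
    best = max(end, default=0)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if a[i] < a[j] and end[i] + start[j] == best
    ]
```

LDS-pairs can be obtained from the same function by negating all values. The number of pairs can be quadratic, which is why the pairs have to be restricted before any intersection search.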

Of course, there are too many possible pairs. However, notice that in the situation above we can replace i' with $\text{[math]}$ , and j' with $\text{[math]}$ . This means that we only need to consider O(n) pairs. Among these, an intersection can be found with a sweep-line approach that uses a segment tree.

Eat And Walk

Problem idea and development: GlebsHP.

To start with, consider a slightly modified problem: moves made before eating anything are free, and after eating x units of food each move left or right costs x. One can show that the optimal strategy is to walk to some position and then move only towards position 0, visiting some of the restaurants along the way. Visiting restaurant i increases the weight by a_i, so every subsequent move costs a_i more. Since we only move left after visiting any restaurant, visiting restaurant i costs a_i·i units of energy in total.

Now we have a knapsack problem with n items, the i-th of which has weight a_i·i and cost a_i; the goal is to find a subset of maximum total cost whose total weight doesn't exceed e. The standard solution is to compute dynamic programming d(i, x): the maximum possible total cost if we select among the first i items and the total weight is equal to x.

Now we swap the parameter and the value of the dynamic programming and compute d(i, c): the minimum possible total weight of a subset of cost c selected among the first i items. We then add items in order of descending i, so the value d(i, c) is composed of elements from the i-th to the n-th. Which values of c should we consider? Each unit of cost (food in the original terms) that comes from restaurants with indices i and greater requires at least i units of energy, so we should only consider values of $c \leq \frac{e}{i}$. Using the estimate for the harmonic series, $\sum_{i=1}^{n} \frac{e}{i} = O(e \log n)$, so we only need to compute $O(e \log n)$ elements of the dynamic programming.

To finish the solution we should account for the initial cost of moving to the maximum-index restaurant we visit (and back). If this index is k, the energy spent increases by 2k. This can be handled by adding the extra cost in the dynamic programming on transitions from the state d(i, 0).
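A minimal sketch of this dynamic programming, under the model described above (restaurant i costs a_i·i energy, plus 2k for the largest visited index k). The function name `max_food` and the 0-indexed input list are choices made for the example; restaurant i of the text is `a[i - 1]` here.

```python
def max_food(a, e):
    """Maximum total food reachable with energy e.

    d[c] is the minimum energy needed to eat exactly c units of food using
    only the restaurants processed so far (indices i..n), including the 2k
    charge for the largest visited index k.
    """
    INF = float("inf")
    d = [INF] * (e + 1)
    d[0] = 0
    n = len(a)
    for i in range(n, 0, -1):          # add restaurants in descending index order
        gain = a[i - 1]
        if gain == 0:
            continue
        cap = e // i                   # each unit of food costs at least i energy
        for c in range(cap, gain - 1, -1):
            prev = d[c - gain]
            if prev == INF:
                continue
            cost = prev + gain * i
            if c == gain:
                # Transition from the empty state: i is the largest visited
                # index, so pay for walking there and back.
                cost += 2 * i
            if cost <= e and cost < d[c]:
                d[c] = cost
    return max(c for c in range(e + 1) if d[c] <= e)
```

With `a = [3, 2]` the function returns 5 for e = 11 (both restaurants, cost 4 + 4 + 3) and 3 for e = 10. The inner loop runs at most e / i times for each i, which gives the harmonic-sum bound on the total number of states touched.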

Search Engine

Problem idea: GlebsHP. Development: cdkrot.

Notice that in the end, after all letters have been prepended and appended, Alice will have exactly the string s.

Suppose that we were adding letters while keeping at least one occurrence inside s, and then added one symbol after which all the occurrences disappeared. In that case all the following additions have zero occurrences, so instead we should have stopped before that bad step, selected one of the occurrences and then grown it to the left and to the right up to the bounds of s.

Because of this observation we can reverse the process: initially we have the string s, and we are allowed to delete letters from the left or right side, with the same goal of maximizing the total number of occurrences.
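For small strings the reversed process can be checked directly with a brute-force dynamic programming over all substrings. This sketch assumes the score is the sum, over every nonempty intermediate string, of its number of (possibly overlapping) occurrences in s; the name `best_total` is made up for the example, and the code is far too slow for real limits.

```python
from functools import lru_cache


def best_total(s):
    """Best total occurrence count when shrinking s from either end."""
    n = len(s)

    def occurrences(l, r):
        # Number of (possibly overlapping) occurrences of s[l:r] in s.
        t = s[l:r]
        return sum(1 for i in range(n - len(t) + 1) if s.startswith(t, i))

    @lru_cache(maxsize=None)
    def dp(l, r):
        if l >= r:
            return 0
        # Count the current substring, then delete the first or the last letter.
        return occurrences(l, r) + max(dp(l + 1, r), dp(l, r - 1))

    return dp(0, n)
```

For `"aba"` this gives 4: the whole string occurs once, `"ab"` once and `"a"` twice. Such a brute force is useful mainly as a reference for testing a faster solution.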

Let's build a suffix tree on s, which is a trie containing all suffixes of s.

Then we can do dynamic programming on the vertices of the suffix tree. Denote by dp[v] the maximum sum we can achieve if we start with the string that corresponds to the path from the root to v and repeatedly delete letters from the left or right, counting the total number of occurrences in the whole string s.

This dynamic programming has only two transitions: we can go to the parent vertex (which corresponds to deleting the last letter), or follow the suffix link (which corresponds to deleting the first letter). We can calculate this dynamic programming lazily, and the number of occurrences of a string is simply the number of terminal vertices in its subtree (this can be calculated for all vertices beforehand).

However, this way we have n² states in our dynamic programming, so we have to use the compressed suffix tree.

In the compressed suffix tree, a transition to the parent may mean deleting several symbols from the end of the string. In this case the number of occurrences of every string along the edge is equal to the number of occurrences of its lower end (so you need to multiply it by the edge length). The same also holds for suffix link transitions.

The only thing we missed is that the optimal answer could interleave transitions up by one symbol with suffix link transitions, and we can't simulate this with the compressed suffix tree. However, one never needs to interleave them. Assume that in the optimal answer we were going up along an edge, then decided to follow a suffix link and finish the edge ascent after it. In that case we can simply follow the suffix link first and traverse the whole edge after that: this way the total number of occurrences can only increase.

The complexity is $\text{[math]}$ . The problem can also be solved with a suffix automaton and dynamic programming on it. In that case the transitions of the dynamic programming are suffix links and reverse edges.

Exercise: solve the problem if the game starts from some substring (defined by l_i and r_i) and you need to process up to 100 000 queries.

Guess Me If You Can

Problem idea and development: GlebsHP.

Note that if we name all elements in some order, at the end we will have a position similar to the initial one but shifted by 1. This shift affects neither the answer nor the intermediate values returned by the interactor.

Consider a triple of elements i, j and k such that p_i + 1 = p_j = p_k - 1. If we list the elements in an order where j goes before i and k, then at the moment p_j is increased by 1 the number of distinct elements decreases, and we can conclude that position j doesn't contain the maximum.

What if we select a random permutation and name all elements in this order? For a triple i, j, k the chance that j goes before both i and k is $\frac{1}{3}$. Thus, if we use 50 random permutations, the probability of failing to find out that j is not a maximum is only $\left(\frac{2}{3}\right)^{50} \leq 2 \cdot 10^{-9}$. The events corresponding to separate triples might not be independent, so we bound the probability of making at least one mistake by $2 \cdot 10^{-9} \cdot n \leq 2 \cdot 10^{-6}$.
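The numbers in this estimate are easy to verify. The snippet below is a plain arithmetic check; the helper name `failure_bounds` is made up, and the value 1000 for the number of triples is taken from the bound in the text, which implies n ≤ 1000.

```python
def failure_bounds(rounds, triples):
    """Return (per-triple failure probability, union bound over all triples)."""
    # In one random order, j precedes both i and k with probability 1/3,
    # so a single round misses the triple with probability 2/3.
    per_triple = (2 / 3) ** rounds
    # Union bound: no independence between triples is required.
    return per_triple, per_triple * triples


per_triple, overall = failure_bounds(50, 1000)
```

Here `per_triple` is below 2·10⁻⁹ and `overall` is below 2·10⁻⁶. The union bound is what allows the dependence between triples to be ignored.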

Lazy Hash Table

Problem idea: GlebsHP. Development: Arterm.

Note that $a_i \bmod m = a_j \bmod m$ if and only if m | a_i - a_j. So we have to find the minimal m that doesn't divide any of the differences a_i - a_j.

First find all possible values of d = a_i - a_j. Set

$P(x) = \sum_{i} x^{a_i}$, $Q(x) = \sum_{j} x^{M - a_j}$,

where M = max a_i. Find the product T(x) = P(x)·Q(x) using the Fast Fourier Transform. As

$T(x) = \sum_{i, j} x^{M + a_i - a_j}$,

the coefficient of x^{M + d} is nonzero if and only if there are i, j with a_i - a_j = d.

Now for each m in the range [1, M] check whether there is a difference divisible by m. We can do this by checking $\left\lfloor \frac{M}{m} \right\rfloor$ numbers: $m, 2m, \ldots, \left\lfloor \frac{M}{m} \right\rfloor m$. So the total complexity of this part is $O(M \log M)$.

The overall complexity is $O(M \log M)$, where M = max a_i ≤ 2 000 000.
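The second phase (scanning the multiples of each m) can be sketched as below. To keep the example self-contained, the set of differences is built with a plain double loop instead of the FFT product described above, so this version is quadratic in the number of elements; it also assumes the a_i are distinct non-negative integers. The function name `min_table_size` is made up for the example.

```python
def min_table_size(a):
    """Smallest m such that no positive difference a_i - a_j is divisible by m."""
    M = max(a)
    # has_diff[d] is True when some pair has a_i - a_j = d > 0.
    # (Brute-force stand-in for reading the coefficients of T(x).)
    has_diff = [False] * (M + 1)
    for x in a:
        for y in a:
            if x > y:
                has_diff[x - y] = True
    for m in range(1, M + 1):
        # Only the multiples m, 2m, ... up to M need to be checked.
        if not any(has_diff[k] for k in range(m, M + 1, m)):
            return m
    # All differences are at most M, so M + 1 divides none of them.
    return M + 1
```

For `[0, 1, 3]` the differences are 1, 2 and 3, so the answer is 4; for `[1, 4]` the only difference is 3 and the answer is 2. The loop over multiples performs about M/1 + M/2 + ... + M/M steps in total, which is the harmonic sum behind the complexity estimate.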

Rev.	By	When	Δ	Comment
en5	GlebsHP	2018-05-29 10:42:06	986
ru5	GlebsHP	2018-05-29 10:41:03	1066
ru4	GlebsHP	2018-05-29 02:33:37	2208
en4	GlebsHP	2018-05-29 02:31:53	2289
en3	GlebsHP	2018-05-29 00:45:20	2	Tiny change: 'ac{2}{3}}^50 \leq 2 \c' -> 'ac{2}{3}}^{50} \leq 2 \c'
ru3	GlebsHP	2018-05-29 00:44:59	2	Tiny change: 'ac{2}{3}}^50 \leq 2 \c' -> 'ac{2}{3}}^{50} \leq 2 \c'
en2	GlebsHP	2018-05-29 00:35:09	254
ru2	GlebsHP	2018-05-29 00:33:32	277	(published)
en1	GlebsHP	2018-05-29 00:28:51	7640	Initial revision for English translation
ru1	GlebsHP	2018-05-29 00:22:54	8136	First revision (saved in drafts)
