As usual, a challenge comes with every problem. I tried not to repeat the mistakes of my previous editorials and made sure that all challenges have a solution =) (except for the italics parts that are open questions, at least for me). Go ahead and discuss them in the comments! General questions about problems and clarification requests are welcomed too.
UPD: I added codes of my solutions for all the problems. I didn't try to make them readable, but I believe most part of them should be clear. Feel free to ask questions.
Let me first clarify the statement (I really wish I didn't have to do that but it seems many participants had trouble with the correct understanding). You had to erase exactly one substring from the given string so that the rest part would form the word
CODEFORCES. The (somewhat vague) wording
some substring in the English translation may be the case many people thought that many substrings can be erased; still, it is beyond my understanding how to interpret that as 'more than one substring'. Anyway, I'm sorry for the inconvenience.
Right, back to the problem. The most straightforward approach is to try over all substrings (i.e. all starting and ending positions) to erase them and check if the rest is the wanted word. When doing this, you have to be careful not to forget any corner cases, such as: erase few first letters, erase few last letters, erase a single letter, and so on. A popular question was if an empty substring may be erased or not. While it is not clarified explicitly in the statement, the question is irrelevant to the solution, for it is guaranteed in the statement that the initial string is not
CODEFORCES, so erasing nothing will not make us happy. From the technical point of view, you could erase a substring from the string using standard functions like
substr in C++ or similar, or do some bare-hands work and perform conditional iterating over all symbols. Depending on the implementation, this would be either O(n 2) or O(n 3) solution; both of these fit nicely.
One way of solving this in linear time is to compute the longest common prefix and suffix for the given string and the string
CODEFORCES. If their total length is at least 10 (the length of
CODEFORCES), it is possible to leave only some parts of the common prefix and suffix, thus the rest part (being a substring, of course) may be removed for good. If the total length is less than 10, no such way exists. This is clearly O(n) solution (rather O(n) for reading the input, and O(|t|) for comparisons where t is
CODEFORCES in our case).
Sample solution: 10973831
Challenge (easy). A somewhat traditional question: how many (modulo some prime number) large Latin letter strings of length n have the property that a (non-empty) substring may be cut out to leave a given string t? Can you solve it in O(n + |t|2) time? In O(n + |t|) time? Maybe even faster? =)
n is up to 106. We may note that there are only 26 + 1 = 65 quasi-binary numbers not exceeding 106, so we could find them all and implement a DP solution that counts the optimal representation for all numbers up to n, or even a brute-force recursive solution (which is not guaranteed to pass, but has a good odds).
Are there more effective solutions? Sure enough. First of all, one can notice that the number of summands in a representation can not be less than d — the largest digit in decimal representation of n. That is true because upon adding a quasi-binary number to any number the largest digit may not increase by more than 1 (easy enough to prove using the standard algorithm for adding numbers). On the other hand, d quasi-binary numbers are always enough. To see that, construct a number m as follows: for every digit of n that is not 0, place 1 in the corresponding digit of m, and for all the other digits place 0. Clearly, m is quasi-binary. If we subtract m from n, all non-zero digits will decrease by 1 (clearly, no carrying will take place), thus the largest digit of n - m will be equal to d - 1. Proceeding this way, we end up with the representation of n as a sum of d quasi-binary numbers. This solution is good for every numeric base, and works in where d is the base.
Sample solution: 10973842
Challenge (easy). Let us call a number pseudo-binary if its decimal representation contains at most two different digits (e.g., 1, 555, 23, 9099 are pseudo-binary, while 103, 908 and 12345 are not). Represent an integer n as a sum of pseudo-binary numbers; minimize the number of summands. n ≤ 1018.
We want to make the maximum height as large as possible. Consider the part of the chain that was travelled between d i and d i + 1; we can arrange it in any valid way independently of any other parts of the chain, thus we consider all these parts separately. There also parts before d 1 and after d n, but it is fairly easy to analyze them: make them monotonously decreasing (respectively, increasing), as this maximizes the top point.
Without the loss of generality consider d i = 0 and d i + 1 = t (they may be increased of decreased simultaneously without changing the answer), and h d i = a, h d i + 1 = b. Clearly, in consistent data |a - b| ≤ t, so if this condition fails for a single pair of adjacent entries, we conclude the data is flawed.
If the condition holds, it is fairly easy to construct a valid way to move between the days under the |h i - h i + 1| ≤ 1 condition: increase or decrease the height while it differs from b, than stay on the same height. That does not make the optimal way, but at least we are sure that the data is not inconsistent.
How to construct the optimal arrangement? From the adjacent difference inequality if follows that for any i between 0 and t the inequalities h i ≤ a + i and h i ≤ b + (t - i) hold. Let h i = min(a + i, b + (t - i)) on the [0; t ] segment; clearly, every h i accomodates the largest possible value, therefore the value of maximum is also the largest possible. It suffices to show that these h i satisfy the difference condition. Basically, two cases should be considered: if for h i = a + i and h i + 1 = a + i + 1, or h i = b + (t - i) and h i + 1 = b + (t - i - 1), the statement is obvious. Else, h i = a + i but h i < b + (t - i) = h i + 1 + 1, and h i + 1 = b - (t - i - 1) but h i + 1 < a + (i + 1) = h i + 1. Thus, |h i - h i + 1| < 1, and h i = h i + 1.
To find the maximum value of maximum height (I really struggle not to use 'maximum maximum') we may either use ternary search on the h function, or find the point where lines a + i and b + (t - i) intersect and try integer points besides the intersection. If we use this approach analytically, we arrive at the formula (t + a + b) / 2 (try to prove that yourself!).
Sample solution: 10973854
Challenge (medium). Given the same data (that is, a subsequence h d i for a sequence h i), determine how many (modulo a prime number) integer sequences of length n with the property |h i - h i + 1| ≤ 1 agree with the subsequence and have global maximum equal to H? Can you solve the problem in O(n 2) time? In time? Maybe even faster?
Instead of trying to find out where the piece may go, let's try to find out where it can not go. Initially mark all the moves as possible; if there is a field ( x 1, y 1) containing a piece, and a field ( x 2, y 2) not containing a piece and not being attacked, clearly a move ( x 2 - x 1, y 2 - y 1) is not possible. Let us iterate over all pieces and over all non-attacked fields and mark the corresponding moves as impossible.
Suppose we let our piece make all the rest moves (that are not yet marked as impossible), and recreate the position with all the pieces in the same places. If a field was not attacked in the initial position, it will not be attacked in the newly-crafted position: indeed, we have carefully removed all the moves that could take a piece to this field. Thus, the only possible problem with the new position is that some field that was attacked before is not attacked now. But our set of moves is maximal in the sense that adding any other move to it will cause the position to be incorrect. Thus, if the new position doesn't coincide with the initial position, the reconstruction is impossible. Else, we have already obtained a correct set of moves. This solution has complexity of O(n 4) for iterating over all pieces and non-attacked fields. No optimizations were needed to make solution this pass.
Sample solution: 10973859
Challenge (medium). Solve the same problem in time.
With such large constraints our only hope is the subtree dynamic programming. Let us analyze the situation and how the subtrees are involved.
Denote w(v) the number of leaves in the subtree of v. Suppose that a non-leaf vertex v has children u 1, ..., u k, and the numbers to arrange in the leaves are 1, ..., w(v). We are not yet sure how to arrange the numbers but we assume for now that we know everything we need about the children's subtrees.
Okay, what is the maximal number we can achieve if the maximizing player moves first? Clearly, he will choose the subtree optimally for himself, and we are eager to help him. Thus, it makes sense to put all the maximal numbers in a single subtree; indeed, if any of the maximal numbers is not in the subtree where the first player will go, we swap it with some of the not-so-maximal numbers and make the situation even better. If we place w(u i) maximal numbers (that is, w(v) - w(u i) + 1, ..., w(v)) in the subtree of w(u i), we must also arrange them optimally; this task is basically the same as arranging the numbers from 1 to w(u i) in the subtree of w(u i), but now the minimizing player goes first. Introduce the notation for the maximal possible result if the maximizing/minimizing (depending on the lower index) player starts. From the previous discussion we obtain . Thus, if we know for all children, the value of can be determined.
How does the situation change when the minimizing player goes first? Suppose that for each i we assign numbers n 1, 1, ..., n 1, w(u i) to the leaves of the subtree of u i in some order; the numbers in the subtree of u i will be arranged so that the result is maximal when the maximizing player starts in u i. Suppose that numbers n i, j are sorted by increasing of j for every i; the minimizing player will then choose the subtree u i in such a way that is minimal. For every arrangement, the minimizing player can guarantee himself the result of at most . Indeed, if all the numbers are greater than r, all the numbers n i, j for should also be greater than r; but there are numbers n i, j that should be greater than r, while there are only w(v) - r possible numbers from 1 to w(v) to place; a contradiction (pigeonhole principle). On the other hand, the value of r is easily reachable: place all the numbers less than r as n i, j with , and r as, say, n 1, dp max(u 1); the first player will have to move to u 1 to achieve r. Thus, .
The previous, rather formal argument can be intuitively restated as follows: suppose we put the numbers from 1 to w(v) in that order to different subtrees of v. Once a subtree of u i contains dp max(u i) numbers, the minimizing player can go to u i and grab the current result. It follows that we may safely put dp max(u i) - 1 numbers to the subtree of u(i) for each i, and the next number (exactly r) will be grabbed regardless of what we do (if we do not fail and let the minimizing player grab a smaller number).
That DP scheme makes for an O(n) solution, as processing the k children of each node is done in O(k) (provided their results are already there). As an easy exercise, think about how the optimal arrangement of number in the leaves can be constructed; try to make implementation as simple as possible.
Sample solution: 10973864
Challenge (medium). Suppose that we are given numbers n, a, b, and we want to construct a tree with n leaves such that dp max(root) = a and dp min(root) = b. For which numbers n, a, b is this possible? (I'm sure you will like the answer for this one. =)) Can you propose an algorithm that constructs such a tree?
The first approach. For a given k and an element v, how do we count the number of children of v that violate the property? This is basically a range query 'how many numbers in the range are greater than v' (because, evidently, children of any element occupy a subsegment of the array); the answers for every k are exactly the sums of results for queries at all non-leaf vertices. Online data structures for this query type are rather involved; however, we may process the queries offline by decreasing v, with a structure that is able to support an array, change its elements and take sum over a range (e.g., Fenwick tree or segment tree). This can be done as follows: for every element of the initial array store 1 in the same place of the structure array if the element has already been processed, and 0 otherwise. Now, if we sum over the range for the element v, only processed elements will have impact on the sum, and the result of the query will be exactly the number of elements greater than v. After all the queries for v, we put 1 in the corresponding element so that queries for smaller elements would take it into account. That makes for an solution. Estimate q: notice that for a given k there are only non-leaf vertices, thus the total number of queries will be (harmonic sum estimation). To sum up, this solution works in time.
Sample solution (first approach): 10973867
The second approach. Let us index the elements of the array starting from 0. It is easy to check that for a given k the parent of the element a v is the element . One can show that there are only different elements that can be the parent of a v for some k. Indeed, if , the index of the parent is less that , and all produce no more than different parents too. Moreover, each possible parent corresponds to a range of values of k. To show that, solve the equality for k. Transform: , pk ≤ v - 1 < (p + 1)k, , . For every k in the range above the property is either violated or not (that depends only on a v and a p); if it's violated we should add 1 to all the answers for k's in the range. That can be done in O(1) offline using delta-encoding (storing differences between adjacent elements in the process and prefix-summing them in the end). There will be only queries to the delta array (as this is the number of different child-parent pairs for all k). This makes for a simple solution which barely uses any heavy algorithmic knowledge at all.
Sample solution (second approach): 10973868
Challenge 1 (medium). Denote c k the minimal number of elements that should be changed (each to a value of your choice) so that the array becomes a valid k-ary heap. Can you find a single c k (for a given k) in time? Can you find all c k (for 1 ≤ k ≤ n - 1) at once in O(n 2) time? Can you do better than these estimates?
Challenge 2 (hard). Solve the problem from Challenge 1 if an arbitrary rooted tree with numbers in vertices is given (that is, change the minimal number of elements so that no element is greater than its parent). Can you do it in O(n 2)? In ? In ? (I'm pretty certain my approach should work, but I would be glad if anyone could check me on this one. That being said, I'm eagerly waiting for your comments.) Not likely, but maybe you could do even better?
First of all, we'll simplify the problem a bit. Note that after every command the values of x + y and x - y are altered by ± 1 independently. Suppose we have a one-dimensional problem: given a sequence of x's and t's, provide a looped program of length l with commands ± 1 which agrees with the data. If we are able to solve this problem for numbers x i + y i and x i - y i separately, we can combine the answers to obtain a correct program for the original problem; if one of the subproblems fails, no answer exists. (Most — if not all — participants who solved this problem during the contest did not use this trick and went straight ahead to the two-dimensional problem. While the idea is basically the same, I'm not going into details for their approach, but you can view the submitted codes of contestants for more info on this one.)
Ok, now to solve the one-dimensional problem. Let us change the command set from ± 1 to + 0 / + 1: set . If the division fails to produce an integer for some entry, we must conclude that the data is inconsistent (because x i and t i should have the same parity). Now it is clear to see that the operation - 1 becomes operation 0, and the operation + 1 stays as it is.
A program now is a string of length l that consists of 0's and 1's. Denote s i the number of 1's among the first i commands, and s = s l for simplicity. Evidently, an equation holds, because the full cycle is executed ⌊ t i / l⌋ times, and after that more first commands. From this, we deduce .
Suppose that we know what s is equal to. Using this, we can compute all ; they are fixed from now on. One more important fixed value is s l = s. In any correct program s i ≤ s i + 1 ≤ s i + 1, but not all values of s i are known to us. When is it possible to fill out the rest of s i to match a correct program? If s a and s b are adjacent entries that are fixed (that is, every s c under a < c < b is not fixed), the inequality 0 ≤ s b - s a ≤ b - a must hold ( a and b may coincide if for different i several values of coincide). Furthermore, if the inequality holds for every pair of adjacent fixed entries, a correct program can be restored easily: move over the fixed values, and place s b - s a 1's between positions a and b in any possible way, fill with 0's all the other positions in between.
The trouble is that we don't know s in advance. However, we know the positions and the order in which fixed values of s a come! Sort them by non-decreasing of a. All fixed s a can be expressed as linear functions of s; if we substitute these expressions in the 0 ≤ s b - s a ≤ b - a, from each pair of adjacent fixed values we obtain an inequality of general form 0 ≤ p·s + q ≤ d, where p, q, d are known values. If the obtained system of inequalities has a solution, we can get ourselves a valid s and restore the program as discussed above.
It suffices to notice that every inequality of the system has a set of solutions of general form l ≤ s ≤ r (if the set is not empty), where l and r should be calculated carefully depending on the sign of p. All the intervals should be intersected, and the resulting interval provides a range of valid values of s.
Overall, the solution works in , or even in O(n + l) if we use bucketing instead of sorting. Note that the l summand in the complexity is only there for the actual program reconstruction; if we were only to check the existence of a program, an O(n) solution would be possible.
Sample solution: 10973870
Challenge (kinda hard). Under the same statement, how many (modulo a prime number) different programs agree with the given data? Assume that all elementary modulo operations (including division) take O(1) time. Can you solve this problem in O(nl)? In O(n + l)? Maybe even better (in , for example?)
The problem has several possible approaches.
The first approach. More popular one. Forget about t and T for a moment; we have to separate teachers into two groups so that no conflicting teachers are in the same group, and the number of students in each group can be chosen to satisfy all the teachers.
Consider a connected component via the edges which correspong to conflicting pairs. If the component is not bipartite, there is clearly no valid distribution. Else, the teachers in the component can be separated into two sets such that each set should be in the same group, and the groups for the sets should be different. The teachers in the same set will always go together in the same group, so we may as well make them into a single teacher whose interval is the intersection of all the intervals for the teachers we just compressed. Now, the graph is the set of disjoint edges (for simplicity, if a teacher does not conflict with anyone, connect him with a 'fake' teacher whose interval is [0;∞]).
Consider all possible distributions of students; they are given by a pair (n 1, n 2). Provided this distribution, in what cases a pair of conflicting teachers can be arranged correctly? If the teachers' segments are [l 1, r 1] and [l 2, r 2], either l 1 ≤ n 1 ≤ r 1 and l 2 ≤ n 2 ≤ r 2, or l 2 ≤ n 1 ≤ r 2 and l 1 ≤ n 2 ≤ r 1 must hold. Consider a coordinate plane, where a point (x, y) corresponds to a possible distribution of students. For a pair of conflicting teachers the valid configurations lie in the union of two rectangles which are given by the inequalities above. Valid configurations that satisfy all pairs of teachers lie exactly in the intersection of all these figures. Thus, the problem transformed to a (kinda) geometrical one.
A classical approach to this kind of problems is to perform line-sweeping. Note that any 'union of two rectangles' figure (we'll shorten it to UOTR) is symmetrical with respect to the diagonal line x = y. It follows that for any x the intersection of the vertical line given by x with any UOTR is a subsegment of y's. When x sweeps from left to right, for any UOTR there are O(1) events when a subsegment changes. Sort the events altogether and perform the sweeping while updating the sets of subsegments' left and right ends and the subsegments intersection (which is easy to find, given the sets). Once the intersection becomes non-empty, we obtain a pair (x, y) that satisfies all the pairs of teachers; to restore the distribution is now fairly easy (don't forget that every teacher may actually be a compressed set of teachers!).
Didn't we forget something? Right, there are bounds t and T to consider! Consider an adjacent set of events which occurs when x = x 1 and x = x 2 respectively. The intersection of subsegments for UOTRs obtained after the first event will stay the same while x 1 ≤ x < x 2. Suppose the y's subsegmen intersection is equal to [l;r]. If we stay within x 1 ≤ x < x 2, for a satisfying pair of (x, y) the minimal value of x + y is equal to x 1 + l, and the maximal value is x 2 + r - 1. If this range does not intersect with [t;T], no answer is produced this turn. In the other case, choose x and y while satisfying all boundaries upon x, y and x + y (consider all cases the rectangle can intersect with a 45-angle diagonal strip). Thus, the requirement of t and T does not make our life much harder.
This solution can be implemented in using an efficient data structure like
std::set or any self-balancing BST for sets of subsegments' ends. The very same solution can be implemented in the flavour of rectangles union problem canonical solution: represent a query 'add 1 to all the points inside UORT' with queries 'add x to all the points inside a rectangle', and find a point with the value m.
Sample solution (first approach): 10973887
The second approach. Less popular, and probably much more surprising.
Imagine that the values of t and T are small. Introduce the set of boolean variables z i, j which correspond to the event ' n i does not exceed j' ( i is either 1 or 2, j ranges from 0 to T). There are fairly obvious implication relations between them: . As t ≤ n 1 + n 2 ≤ T, we must also introduce implications (here i' is 1 or 2 not equal to i) because if a + b ≥ t and a ≤ j, b must be at least t - j, and for a similar reason. In this, z i, j for j < 0 clearly must be considered automatically false, and z i, j for j ≥ T must be considered automatically true (to avoid boundary fails).
The last thing to consider is the teachers. For every teacher introduce a binary variable w j which corresponds to the event 'teacher j tutors the first group'. The implications and are pretty much self-explanating. A conflicting pair of teachers j and k is resolved in a straightforward way: , .
If a set of values for all the boolean variables described satisfies all the restrictions, a valid distribution can be restored explicitly: n 1 and n 2 are maximal so that z 1, n 1 and z 2, n 2 hold, and the teachers are distributed unequivocally by values of w j. It suffices to notice that the boolean system is a 2-SAT instance, and can be solved in linear time. If we count carefully, we obtain that the whole solution has linear complexity as well: O(n + m + T).
Didn't we forget something? Right! The value of T may be too much to handle Ω(T) variables explicitly. To avoid that, one may notice that the set of possible values of n 1 and n 2 may be reduced to 0, t, l i, t - l i, r i, t - r i. We can prove that by starting from any valid values of n 1 and n 2 and trying to make them as small as possible; the listed values are the ones we may end up with. Thus, we can only use O(n) variables instead of Ω(T). The implications can be built similarily, but using lower/upper bound on the list of possible values instead of exact values (much care is advised!). Finally, this solution can be made to work in , with the logarithmic factor from all the sorting and lower/upperbounding.
Sample solution (second approach): 10973881
Challenge (easy, for a change) Don't you think it's wrong that a group may be without a teacher altogether? Come up with an algorithm that finds a distribution that places at least one teacher in each group. The complexity should not become worse. How about at least k teachers in each group?
Whew, wasn't it a long run! I tried to be verbose and elaborate where it was possible, hope it was worth the wait. Let me know what you think of this write-up!