### halyavin's blog

By halyavin, history, 3 years ago, ,

The problem D in the latest Open Cup involves function f(n) which is defined as the minimum sum of sequence b1, ..., bk such that any sequence a1, ..., al with sum less than or equal to n can be dominated by some subsequence of b. One sequence dominates the other, if they have the same length any every element of the first sequence greater than or equal to corresponding element of the second sequence. It turns out that f(n) = n + f(k) + f(n − 1 − k) where k = [(n − 1) / 2]. But why? If this equation still keeps you up at night, you can finally sleep well now. I have found a wonderful proof of this statement which fits the bounds of this site.

• +169

By halyavin, 4 years ago, ,

# I know what is the problem

Where we are going we need all the courage we can get

When I first read the comment that our shiny new gcc 6.2 C++14 compiler has slow printf and scanf I immediately knew what I would do the next weekend. The new MinGW switched to custom implementation and I didn't need anything else to tell what the problem is. Of course, their format string parsing is too slow, they copy data around countless times, no file buffering is used and they have enough levels of indirection to make finding the real code feeling like a treasure hunt. What else could be the problem after all?

Keeping this thoughts in mind I patiently waited for a weekend. When I finally opened MinGW source and searched for printf function, I was pleasantly surprised. The real implementation was only a couple hops away and soon enough I was looking at the heart of printf code. MinGW did try to conceal it a bit though by naming it mingw_pformat.c. But a ~100Kb source file was a very clear hint where all the action is. The format string parsing was straightforward and located in a single big function. Everything was passed by pointer and there were hardly any levels of indirection. But the data was clear — MinGW printf is twice as slow for integer numbers compared with msvcrt.dll and scanf is even worse. Increasing the file buffer size didn't help at all. Reluctantly, I had to admit that the problem is somewhere else. But what could possibly be the problem?

• +280

By halyavin, 4 years ago, ,

## 652A — Gabriel and Caterpillar

I have lost count of number of different challenges in this problem. Only 14 author tests for such a tricky problem were clearly not enough. There were 2 types of challenges.

The first type use integer overflow. When I saw that both h1, h2 and a, b are bounded by 105 I suspected that overflow could be a problem. But my solution divided heights by speeds to get the number of days. Surely there are no need to multiply these quantities I thought. Boy I was wrong. A lot of solutions used a simulation of the process. As long as a > b and the caterpillar is going up, there is no problem. But if the caterpillar is going down, you need to give up eventually and return -1. The largest positive answer is about 105/12 but the largest lose of height per day is about 12·105. The product is about 1010 > 231 and the caterpillar mistakenly finds an apple at the high earth orbit as a result. The challenge test looks like this:

1 100000
1 100000


The second type of challenges were small tests with the answer ≤2. The most popular challenge were

1 9
1 1


where the caterpillar reaches the apple at exactly the end of the day. I think the absence of such edge-case test was an oversight. The next most popular bug was trying to calculate the number of hours to reach the apple instead of calculating the caterpillar position after given amount of hours. This could backfire when the number of hours is not a whole number and rounding in integer division is not working in the direction you need. Such bugs were challenged by tests like these:

1 18
2 1

1 18
2 2


Another popular problem is using division to calculate the number of days but getting negative number. It can happen in the test

1 2
4 3


The caterpillar reaches apple in the first day, but if you continue its trajectory backwards, it will also reach apple in the day before.

A few people misunderstood the problem and thought that they need to calculate the number of days since the initial observation. They were challenged with test case where the caterpillar reaches apple in the beginning of the day:

1 90
10 1


The rest of the challenges are too diverse. I will only highlight submission 16927302 where I had to use h1 not equal to 1. It was a case of confusion between the marker value and a normal value.

## 652B — z-sort

There were 5 challenges for this problem. The author test set contained only single test with odd n greater than one. So it was not surprising that 3 out of 5 challenges used the first non-trivial odd case of n equal to 3. The other 2 challenges exploited reads just beyond the end of the vector. I used custom invocation tab to print the value beyond the end of the vector and construct a test where this value confuses program.

## 653C — Foe Pairs

There were 2 self-challenges that revealed worst cases for O(n2) solutions. Both used the same test:

300000 299999
1 2 ... 300000
1 300000
2 300000
...
299999 300000


I wonder how many more solution failed it after the challenge phase. One more challenge was a random test for solution which had correct time complexity but was too close to time limit (998ms). My first challenge used a bug in removing same consecutive foe pairs from the input. Can you see it?

    if(count && u[count-1] == u[count] && v[count] == v[count]);
else count++;


The worst part is that deduplication was not needed. It wouldn't work anyway since same foe pairs may not be adjacent in the input. Deduplication also played a fatal role for my second challenge. This time, it was done correctly but solution didn't take into account that it got less than m pairs. So the program read foe pairs beyond the end of the vector and produced wrong result.

## 654D — Nested Segments

If you used unordered_map to compress coordinates in this problem, you had a bad time... It took me quite a lot of attempts to construct an anti-hash test for this problem but all hash solutions failed on it. There were no other hacks. The author test set is very good for this problem.

## 654E & 654F — Pursuit For Artifacts & Ants on a Circle

I didn't have time to challenge solutions of these problems and no one else been able to do that too. But it wasn't all for nothing — stoyan_malinin was able to hack one of the author solutions for F. I don't know whether there were a particular idea for that test or it happened by accident though.

• +58

By halyavin, 4 years ago, ,

## 632A — Grandma Laura and Apples

There were no successful challenges for this problem. Thumbs up to everyone! I did find a couple of C/C++ solutions with uninitialized local variables though. Fortunately for them, the top of the stack is filled with zeros in the current testing system and compiler didn't decide to place uninitialized local variables in the register.

## 632B — Alice, Bob, Two Teams

There was quite a variety of off-by-one errors in this problem. My most successful test was

2
1 1
BB


Some solutions just have to flip something. The were also 2 challenges where solution got TL due to slow input.

The most crazy uninitialized variable prize for this problem goes to 16447719:

ll n;
v b(n);

cin >> n;

forn(i, n)
{
cin >> b[i];
}


What is the size of vector b? Definitely not n. Fortunately for the author, it can't be challenged. Undefined behavior can work in your favor too. BTW, thanks to Codeforces for custom invocation feature. This is my go to tool when I need to know the value of an uninitialized variable.

## 632C — The Smallest String Concatenation

Most challenges in this problem were anti-quicksort tests for Free Pascal solutions. One person used normal sorting with lexicographical order followed by bubble sort using the correct order. Turned out this is not fast enough on certain tests. Understandably, not everyone knows how to use custom comparator. You can learn C++98 and C++11 way of doing this in the example section here: std::sort. I tried to challenge 2 C/C++ solutions with qsort() but both of them fit in the time limit despite spending O(n2) time for sorting.

## 632D — Longest Subsequence

Most challenges in this problem used test with the zero answer like:

1 1
2


I am surprised that this test was missing in the author test set. The other half of the challenges were caused by m being bounded by only 438056 instead of 1000000 in the author test set. Some solutions were not fast enough and exceeded time limit for the largest value of m. Also, some solutions didn't merge equal numbers in the input and were challenged by the test with all ones. Spending O(m) time for each one in the input doesn't fit in the time limit. There were also a couple challenges were authors initialized answer (kmax) with 1 instead of 0 while calculating LCM with the maximum answer. So in the case of single number in the input, they used default value of LCM equal to 1. This fails for the test with the different answer like this:

1 3
2


## 632E — Thief in a Shop

Let us denote a the maximum number in the input. One solution tried to used normal O(k2an) dynamic programming approach (calculate the set of all sums of k numbers from a set of sums of k — 1 numbers) to this problem but with the twist: consecutive sums were compressed to a single interval. This worked very well on the author test set until it was challenged (not by me) with the test where all numbers are divisible by 3. One more solution were too close to the time limit and were hacked by random large test.

I challenged all solutions which used FFT over remainders by modulo p with a single power step. All I needed is to construct a test where the number of ways to get the same sum is divisible by p. I used the test of the form 1, 2, N-i1, N-i2, ... , N-i2D. The sum N + k in this test can be represented in Ci1k+Ci2k+...+Ci2Dk ways as a sum of k+1 numbers where C is binomial coefficient. So I just need to find the set of binomial coefficients which sum is divisible by p. How did I do that? Why I use even number of binomial coefficients? Meet me in the middle here, I can't tell you everything.

Unfortunately, there were no way to challenge dynamic programming solutions. These solutions first subtract the minimum number from the rest of them. This step reduces the problem to a case where the first number is zero. Since zeros doesn't influence sums, we now need to calculate all the sums of no more than k of non-zero numbers. Our dynamic programming will store for each sum the least number of non-zero numbers needed to get it. We will add non-zero numbers one by one. Each time we need O(ak) time to recalculate dynamic programming values. Recalculation should be done from small sums to large so that the current non-zero number could be used multiple times. After n-1 steps we choose all sums with values less than or equal to k. The time complexity of this algorithm is O(akn).

## 632F — Magic Matrix

A lot of solutions for this problem just randomly choose triplets i, j, k and check condition from the problem statement. They can be challenged by the graph where all edges have length 2 except a single triangle which contains 2 edges of length 0 and 1 edge of length 1. This triangle is the only triplet which doesn't satisfy magic matrix conditions in this graph. A couple solutions tried to optimize the full search by excluding some triplets using various conditions but these optimizations doesn't improve the worst time complexity of O(n3). A couple solutions only checked triplets with i = 0 and a couple more only checked some naive properties like all maximum values in rows are equal to each other.

The prize for the most confusing undefined behavior goes to 16449526. What the first solve function returns at the end? Does it even matter? Experimentally, I figured out that it is simply missing return true. I couldn't challenge it but it failed system tests. I don't know whether it was caused by undefined behavior or some other error though.

PS If I missed some interesting challenge, post it in the comments.