neal's blog

By neal, 4 weeks ago, In English

Since there was a fair amount of interest, I'll explain here how I optimized my solution to 1249F - Maximum Weight Subset from $$$\mathcal{O}(n^4)$$$ to $$$\mathcal{O}(n)$$$. One cool feature of this problem is that it's possible to get from $$$\mathcal{O}(n^4)$$$ all the way down to $$$\mathcal{O}(n)$$$ with fundamentally the same solution idea. This is unlike many other problems, where if you want to improve the runtime further you have to completely rethink the solution from scratch (e.g., switch to greedy rather than DP). Thanks MikeMirzayanov for writing a nice problem.

$$$\mathcal{O}(n^4)$$$

I'll briefly go over the inefficient solutions first. My initial idea was to compute a two-dimensional DP for each node: dp[n][d] = the maximum weight achievable by validly choosing n nodes within our subtree such that the chosen node closest to the root of the subtree has depth d. Then when combining two subtrees, we need to add the node counts together and take the minimum of the two depths to get our new DP state. Code here: 63153425. Pay attention to the attach function in particular.

This looks like $$$\mathcal{O}(n^5)$$$, but it's actually $$$\mathcal{O}(n^4)$$$ due to the classic observation about tree DP merges: when the loop bounds of each merge are limited by the sizes of the two subtrees being combined, the total cost over the whole tree loses a factor of $$$n$$$ compared to the naive bound. I wasn't sure at first if this would manage to pass the time limit, but it turns out the worst case is only around $$$100^4$$$ with good constant factor, rather than $$$200^4$$$.

$$$\mathcal{O}(n^3)$$$

To optimize the above solution, notice that the nested inner loops on x and y can be optimized to linear time by computing suffix maximums of the original DP arrays. We consider two cases for min(x, y): one where x = min(x, y) (in which case we need both y >= x and x + y >= K, meaning y >= max(K - x, x)) and another where y = min(x, y) (with similar constraints on x). Code: 63200545. Note that the innermost loops perform the equivalent of the commented-out loops on x and y, with complexity improved by a factor of $$$n$$$.
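To illustrate the shape of this trick outside the full solution, here is a minimal sketch (my own code, not the submission): precompute suffix maximums once, and a constraint like y >= max(K - x, x) becomes a single array lookup.

#include <algorithm>
#include <vector>
using namespace std;

const long long NEG_INF = -1e18;

// suf[t] = max(dp[t], dp[t + 1], ..., dp.back()), so the best dp[y] over
// all y >= t is a single lookup instead of an inner loop.
vector<long long> suffix_max(const vector<long long> &dp) {
    vector<long long> suf(dp.size() + 1, NEG_INF);

    for (int y = (int) dp.size() - 1; y >= 0; y--)
        suf[y] = max(suf[y + 1], dp[y]);

    return suf;
}

Then, for a fixed x, the best dp_b[y] subject to y >= max(K - x, x) is simply suf_b[max(K - x, x)], after clamping the index to the array bounds.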

A better $$$\mathcal{O}(n^3)$$$

How can we do better? Well, it would be helpful if we realized that the n for node count in our DP state is completely useless. Let's get rid of it. Now our solution looks like this instead: 63200548. A much cleaner $$$\mathcal{O}(n^3)$$$.

[Correction: this is actually $$$\mathcal{O}(n^2)$$$ due to the same observation as above.]

$$$\mathcal{O}(n^2)$$$

Here we apply the same suffix max optimization again. Code: 63200549. We've now reached the best solution described in the editorial.

$$$\mathcal{O}(n \log n)$$$

How can we improve further? Let's take a look at the quadratic solution again and analyze which parts are causing it to be quadratic. First, the algorithm requires inserting at the beginning of a vector; this is easy to fix by swapping out std::vector for a std::deque and using push_front(). On top of that, the inner loops to combine two subtrees clearly cause the algorithm to be quadratic.

However, note that we can rewrite the attach function to run in time proportional to min(root.size(), child.size()) rather than root.size() + child.size(). In particular, the code to combine the two subtrees is completely symmetric with respect to root and child, so we can swap if necessary in order to assume root.size() >= child.size(). We can then combine the two subtrees in O(child.size()) by

  1. saving the suffix maxes in the DP state so we don't have to recompute them and
  2. realizing that we can limit the loops on both x and y to go up to child.size() only, since any indices larger than that will not get modified for our combined DP state.

Thus, after being careful to pass everything by reference to avoid redundant copying, our new algorithm is $$$\mathcal{O}(n \log n)$$$, since every DP state can only be merged from smaller into larger $$$\log n$$$ times each. Code: 63206899. For an easier implementation, note that we can just use the suffix maxes as the DP state itself; in other words, we redefine the DP state to be dp[d] = the maximum weight we can get from validly choosing nodes in our subtree such that all chosen nodes are *at least* depth d from the subtree root.
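To make the shape of that merge concrete, here is a stripped-down sketch (mine, not the submission); the real combine rule also has to respect the distance constraint between the two subtrees via the suffix-max lookup described above, which I've replaced with a plain max to keep the sketch short:

#include <algorithm>
#include <deque>
using namespace std;

// dp[d] = best weight using only nodes at depth >= d, so dp is already its
// own suffix max and is nonincreasing in d.
void attach(deque<long long> &root, deque<long long> &child) {
    child.push_front(child.front());       // child's nodes are one level deeper now

    if (root.size() < child.size())
        swap(root, child);                 // always iterate over the smaller state

    for (size_t d = 0; d < child.size(); d++)
        root[d] = max(root[d], child[d]);  // placeholder combine; see the post
    // indices >= child.size() are untouched, so this costs O(child.size())
}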

$$$\mathcal{O}(n)$$$

But the time bound on our algorithm can be improved -- it's actually $$$\mathcal{O}(n)$$$. How come? This is due to the fact that instead of having one DP state per node in our subtree, we have one DP state per distinct depth in our subtree. In particular, this means that merging a smaller state into a larger state results in an output of the same size as the larger state, rather than the sum of the two sizes (as in the usual smaller-to-larger case). So when we merge smaller into larger, you can think of it instead as all of the smaller state's values getting 'eaten' by the larger state, in O(1) time per value. Since each value can only get eaten once, we can finally conclude that our algorithm is $$$\mathcal{O}(n)$$$.


By neal, 4 months ago, In English

When trying to generate some of my own test cases for 1190E - Tokitsukaze and Explosion, I noticed that it was hard to make cases where even slightly large values of $$$m$$$ didn't all give the same answer. After some thinking this makes perfect sense, because with larger values of $$$m$$$, points that are very far away from the origin become almost freebies; only the few closest points to the origin should really matter.

So on a whim, I tried submitting some 'incorrect' code to the problem. After several rounds of experimentation (in other words, even more binary search), I found that you can add this line to your code:

M = min(M, 4);

and still get accepted. Turns out you don't need all the $$$2^k$$$ jump pointers which were mentioned in the editorial; you can just iterate each element four times instead. ;)

I'll try to generate some stronger test cases. Should be a good chance to try out the new uphacking feature.

EDIT: I generated three strong cases with M = 18, M = 200, and M = 1025 ("strong" meaning that increasing M further still changes the answer). They're now all in the system tests. You'll notice them if you get WA on test 149 or later.


By neal, 6 months ago, In English

Did anyone else notice this issue come up today? When trying to hack in my room (via Chrome on Linux), I got a message saying that I needed to update Flash in order to continue. I clicked the update button, which refreshed the page and showed me the code in an almost unreadable font: non-monospaced, letters were all squished together, and l looked exactly like 1.

I just tried installing more fonts which I think might fix it. Does anyone know a way I can check on my hacking font again? Now that the contest is over, when I open code in my room it just shows the normal copy-pasteable code window instead.


By neal, 10 months ago, In English

Continuing with the theme from last time, here's a quick writeup of my approaches:

1107A - Digits Sequence Dividing

Since we just want to make two numbers such that the first number is smaller than the second, our best bet is to use only the first digit for the first number and the rest of the digits for the second number. Note that since the numbers can have up to 300 digits we shouldn't actually evaluate the second number. Instead, since the digits only include 1 through 9, we can handle that case by checking the number of digits. Code: 49002957

1107B - Digital root

The key observation is that the digital root of a positive integer $$$k$$$ is the single-digit number $$$1 \le d \le 9$$$ such that $$$d \equiv k \pmod 9$$$. You can prove this by noticing that $$$10^p \equiv 1 \pmod 9$$$ for all $$$p$$$. Once we observe this, finding the k-th number is very simple; see the code: 48993705
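Concretely, the k-th positive integer with digital root x is 9(k - 1) + x. A minimal sketch of that one-liner (mine, not the linked submission; the real problem wraps this in multiple queries):

#include <iostream>
using namespace std;

int main() {
    long long k, x;  // find the k-th positive integer whose digital root is x
    cin >> k >> x;
    // the numbers with digital root x are exactly x, x + 9, x + 18, ...
    cout << 9 * (k - 1) + x << '\n';
}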

1107C - Brutality

Since we are only allowed to push the same button at most k times in a row, let's do a two-pointer sweep to find all maximal segments consisting of just one button. Within each segment, we'll sort and take the k highest values. See the code for details on the two-pointer sweep: 48994498

1107D - Compression

After some tinkering with the given condition, we notice that an x-compression is possible iff x divides n and the n × n matrix can be divided into x × x blocks such that each block is either all 1s or all 0s. We can loop over all such x and check the condition in $$$\mathcal{O}(n^2)$$$ time per x, but this is potentially too slow.

To speed this up, we can precompute rectangle sums for every rectangle containing the upper-left corner, which enables us to compute the sum of any rectangle in $$$\mathcal{O}(1)$$$. This improves our time complexity to $$$\mathcal{O}\left(n^2 \sum_{x \mid n} \frac{1}{x^2}\right)$$$. Since $$$\sum_{x \mid n} \frac{1}{x^2} < \frac{\pi^2}{6}$$$ (really!), this means our solution is $$$\mathcal{O}(n^2)$$$. Code: 49028814
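For reference, a sketch of the standard precomputation (function names are mine):

#include <vector>
using namespace std;

// pre[i][j] = sum of all cells in rows [0, i) x columns [0, j)
vector<vector<long long>> build_prefix(const vector<vector<int>> &mat) {
    int n = mat.size(), m = mat[0].size();
    vector<vector<long long>> pre(n + 1, vector<long long>(m + 1, 0));

    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            pre[i + 1][j + 1] = mat[i][j] + pre[i][j + 1] + pre[i + 1][j] - pre[i][j];

    return pre;
}

// sum of rows [r1, r2) x columns [c1, c2), in O(1)
long long rect_sum(const vector<vector<long long>> &pre, int r1, int c1, int r2, int c2) {
    return pre[r2][c2] - pre[r1][c2] - pre[r2][c1] + pre[r1][c1];
}

An x × x block is then all 0s or all 1s exactly when its sum is 0 or x².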

1107E - Vasya and Binary String

We set up a DP on (start index, end index, number of consecutive digits matching our start index). In other words, the current string we are solving is the substring from start index to end index, plus some number of additional digits (all equal to S[start]) added as a prefix to our substring.

We then have two choices from any given state:

  • Cash in on our consecutive digits at the start and recurse on $$$(start + 1, end, 0)$$$.
  • Pick an index i such that S[start] = S[i], and collapse everything between those two indices in order to merge them together for an even larger prefix. This results in a score of $$$dp(start + 1, i - 1, 0) + dp(i, end, count + 1)$$$, and we can loop over all i to take the maximum.

The runtime is $$$\mathcal{O}(n^4)$$$, with a very good constant factor. Code: 49036191

Does anybody have $$$\mathcal{O}(n^3)$$$?

1107F - Vasya and Endless Credits

Notice that if we take offer i exactly m months before we buy the car, it will provide us with $$$a_i - m \cdot b_i$$$ money at the time of the car purchase. Moreover, the only values of m that make sense are 0 ≤ m ≤ n - 1. This means we can immediately solve the problem via an algorithm for the assignment problem, such as min-cost flow or the Hungarian algorithm. This has a runtime of $$$\mathcal{O}(n^3)$$$ or $$$\mathcal{O}(n^3 \log n)$$$ depending on the algorithm, which manages to fit under the time limit with a good implementation. Code: 49033783

The better solution is to notice that for all offers i where we don't use up all $$$k_i$$$ months, it's best to sort them by $$$b_i$$$ (so that the highest values of $$$b_i$$$ are assigned the lowest values of m). This leads to a very nice DP solution: 49035446

1107G - Vasya and Maximum Profit

We can first compute all values of $$$(d_{i+1} - d_i)^2$$$. Since we only care about the maximum such value within our segment, we can use div-conquer to solve every segment. In particular, if we know the index of the maximum value in a range, we know that any segment crossing this index has this value as its maximum. We can thus solve all segments crossing this maximum and recurse on the left and right sides.

To find the best crossing segment, note that each problem contributes a value of $$$a - c_i$$$. We can independently find the largest sum starting from our crossing index going left and the largest sum going right, and add these two together for the best overall crossing segment.

Unfortunately, since we can't guarantee that the maximum indices will divide our interval nicely in half, this does not lead to the usual $$$\mathcal{O}(n \log n)$$$ runtime of div-conquer but is instead $$$\mathcal{O}(n^2)$$$ in the worst case. To improve on this, we can precompute partial sums of $$$a - c_i$$$ and use RMQ to find the minimum sum left of the crossing index and the maximum sum right of the crossing index. This reduces the crossing computation from $$$\mathcal{O}(n)$$$ per index to $$$\mathcal{O}(\log n)$$$ or $$$\mathcal{O}(1)$$$, giving an overall runtime of $$$\mathcal{O}(n \log n)$$$. Code: 49036431


By neal, 10 months ago, In English

Just finished this contest in virtual mode and noticed the editorial is missing, so I thought I would share my approaches.

1100A - Roman and Browser

Loop over all values of b and take the sum of all indices that are not equivalent to $$$b \pmod k$$$. Remember to take the absolute value at the end. Runtime is $$$\mathcal{O}(nk)$$$. Code: 48374675

1100B - Build a Contest

Maintain a frequency array of size n and track the number of distinct values seen so far as we loop through the array. Once the number of distinct values reaches n, we hold a round and decrement each index of the frequency array. Note that this takes n time, but this is fine since we can only hold at most $$$\frac{m}{n}$$$ rounds. Runtime is $$$\mathcal{O}(n + m)$$$. Code: 48374608. Note that x++ means use the value of x and then increment, whereas ++x means increment and then use. Similar for x-- and --x.
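A minimal sketch of that loop (mine, not the linked code; I'm assuming the output format of one digit per problem from memory, so double-check the statement):

#include <iostream>
#include <vector>
using namespace std;

int main() {
    int n, m;
    cin >> n >> m;
    vector<int> freq(n + 1, 0);
    int distinct = 0;

    for (int j = 0; j < m; j++) {
        int a;
        cin >> a;

        if (freq[a]++ == 0)  // x++: use the old value, then increment
            distinct++;

        if (distinct == n) {  // every difficulty 1..n is available: hold a round
            for (int d = 1; d <= n; d++)
                if (--freq[d] == 0)
                    distinct--;
            cout << 1;
        } else {
            cout << 0;
        }
    }

    cout << '\n';
}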

1100C - NN and the Optical Illusion

Draw the line segments between the center of the inner circle and the centers of two adjacent outer circles. These three line segments form a triangle. If the inner radius is r and the outer radius is R, then the sides of the triangle are r + R, r + R, and 2R. We also know that the angle of the triangle at the center of the inner circle is $$$\frac{2\pi}{n}$$$. Now we can apply the Law of Cosines and some algebra to solve for R. Code: 48374823

Update: for an even easier method, we can realize that we can cut the angle above in half, which results in $$$\sin\left(\frac{\pi}{n}\right) = \frac{R}{r + R}$$$.
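Solving that equation for R gives R = r·s / (1 - s) with s = sin(π/n); a minimal sketch (my own code):

#include <cmath>
#include <iostream>
using namespace std;

int main() {
    int n;
    double r;
    cin >> n >> r;

    double s = sin(acos(-1.0) / n);  // s = sin(pi / n)
    cout.precision(10);
    cout << fixed << r * s / (1 - s) << '\n';  // R = r * s / (1 - s)
}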

1100D - Dasha and Chess

This problem has a very clean mathematical solution. It turns out 666 rooks is just enough for us to solve the problem.

First, move to the center of the board, (500, 500). We can do this using at most 499 moves. If any rook is on our row or column we're done, so we can assume that's not the case. Then every rook is in one of the four quadrants surrounding us. Notice that by moving diagonally 499 times in a row, we can fully 'sweep' any of the quadrants we choose.

In addition, every rook is accessible by either row or column in exactly three of the four quadrants. Since 3 × 666 = 1998 > 4 × 499, there must be some quadrant where at least 500 rooks are accessible. We can simply sweep this quadrant fully in 499 moves, and there is no way for our opponent to remove all the rooks fast enough. Code: 48378396.

Make sure to handle this case correctly to avoid Wrong Answer: "The king cannot move onto the square already occupied by a rook." (In this case we should easily be able to find a move that immediately wins.)

1100E - Andrew and Taxi

First, struggle to solve the problem for a while. Then realize that you didn't pay enough attention to this sentence: "It is allowed to change the directions of roads one by one, meaning that each traffic controller can participate in reversing two or more roads." So in particular, we are not looking for the sum of modified edges' weights but instead the maximum edge weight we need to modify in order to remove all cycles from the graph.

When we want to optimize the maximum of a set of things, it's a good idea to binary search. Let's say we are binary searching and we have a number C. Then we want to know if it's possible to remove all cycles by only allowing ourselves to reverse edges with c ≤ C.

This turns out to have an efficient solution. First, consider all of the edges with c > C. These edges must not contain a cycle, or else we can immediately decide that C is not good enough. If they don't contain a cycle then they form a directed acyclic graph (or DAG), and we can perform a topological sort of this graph. Then for every remaining edge (c ≤ C), since we have the option to reverse it if we like, we can choose to point the edge in the same direction as the topological sort. Since all edges are then ordered according to the topological sort, there cannot be any cycles. Runtime is $$$\mathcal{O}\left((n + m) \log (\max c)\right)$$$, which can be improved to $$$\mathcal{O}\left((n + m) \log m\right)$$$ if desired by binary searching on the sorted list of edge weights instead. Code: 48376981
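A sketch of the check's core (my own code): keep only the edges with c > C and look for a topological order with Kahn's algorithm; an empty result means those edges already contain a cycle.

#include <queue>
#include <utility>
#include <vector>
using namespace std;

// returns a topological order of the graph, or an empty vector if it has a cycle
vector<int> topo_order(int n, const vector<pair<int, int>> &edges) {
    vector<vector<int>> adj(n);
    vector<int> in_degree(n, 0), order;

    for (auto &[u, v] : edges) {
        adj[u].push_back(v);
        in_degree[v]++;
    }

    queue<int> q;
    for (int v = 0; v < n; v++)
        if (in_degree[v] == 0)
            q.push(v);

    while (!q.empty()) {
        int u = q.front(); q.pop();
        order.push_back(u);

        for (int v : adj[u])
            if (--in_degree[v] == 0)
                q.push(v);
    }

    if ((int) order.size() != n)
        order.clear();  // a cycle among the c > C edges: this C is too small

    return order;
}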

1100F - Ivan and Burgers

Except for the very weird cover story, this problem is similar to 1101G - (Zero XOR Subset)-less (but harder). I recommend solving that problem first before attempting this one. Both problems are made easier with some knowledge of linear algebra.

For this problem, we want to find the maximum number attainable via XOR within subarrays of our given array. The best way to do this is to treat the numbers as vectors in $$$\mathbb{Z}_2^{20}$$$ and compute a basis (https://en.wikipedia.org/wiki/Basis_(linear_algebra)) of the vector space. In other words, we want to find a minimal set of numbers that we can combine together to generate the full vector space. Since the numbers only have 20 bits, one can prove that the basis will have a size of at most 20.
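For reference, the standard insertion into such a basis (my sketch, not the solution code); each inserted number either extends the basis or is reduced to 0 by it:

#include <cstdint>

const int BITS = 20;
uint32_t basis[BITS];  // basis[b] = a vector whose highest set bit is b, or 0

void insert(uint32_t x) {
    for (int b = BITS - 1; b >= 0; b--) {
        if (!(x >> b & 1))
            continue;  // keep reducing until x's highest bit is b

        if (!basis[b]) {
            basis[b] = x;  // x extends the basis
            return;
        }

        x ^= basis[b];
    }
}

// the maximum XOR attainable from all inserted numbers
uint32_t max_xor() {
    uint32_t result = 0;

    for (int b = BITS - 1; b >= 0; b--)
        if ((result ^ basis[b]) > result)
            result ^= basis[b];

    return result;
}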

My approach was then to create a seg tree on the array where each node stores a basis of its subarray. Note that it helps to store the basis in strictly decreasing order of highest bit. I join two bases in $$$\mathcal{O}(20^2)$$$ time, for an overall runtime of $$$\mathcal{O}(20^2 \log n)$$$ per query, which barely fits in the time limit at 2.6s. Code for reference: 48396593. Watch out that $$$c_i = 0$$$ is allowed!

It is also possible to solve the problem in $$$\mathcal{O}(20 \log n)$$$ or even $$$\mathcal{O}(20)$$$ time per query. One key idea is to compute for each L the at most 20 different R values that increase the size of the basis for the subarray from L to R. See the comment section below for discussion.


By neal, 13 months ago, In English

1043D - Mysterious Crime requires you to input 1 million integers and ultimately process them in either $$$nm$$$ or $$$nm \log n$$$ time. But the time limit was set to only 1 second. This is very unfriendly to any language that isn't C++, and it even puts C++ solutions at risk of failing.

Case in point: tourist's contest submission uses an STL map and exceeded the time limit, causing him to drop over 100 places. Meanwhile, ksun48 and mnbvmar had almost exactly the same solution and barely squeezed by the time limit in 998 ms. tourist was also able to pass with the same code in practice mode by switching from C++11 to C++14, which shows how close it was.

I admit there are certain problems where it is interesting to distinguish $$$\mathcal{O}(n)$$$ and $$$\mathcal{O}(n \log n)$$$ solutions, but this doesn't fit that case, as it doesn't take any novel insight to remove the log from the runtime here. Overall it would have been much better with a time limit of 2 or 3 seconds; or alternatively, with m ≤ 5 instead of 10.


By neal, 13 months ago, In English

For today's Div 1 problem D, 1067D - Computer Game, once you succeed at a quest, it's clear you should upgrade and play the quest that maximizes your expected reward p_i × b_i for the remainder of the game.

However, before you succeed at a quest, you have a difficult tradeoff to make: should you take a quest with a high probability of succeeding, to maximize your chances of upgrading and gaining optimal points for the rest of the game? Or should you take a quest that gives you more expected reward p_i × a_i right now?

It turns out the answer to this tradeoff depends on T, the number of seconds left. As T gets smaller, we should switch to quests with lower p_i but higher immediate reward p_i × a_i. This is because the long-term value of upgrading and maximizing future rewards becomes less relevant compared to getting more reward now. In particular, the optimal quest to pick (when we have not yet succeeded at any) can change multiple times as T gets smaller and smaller. My solution 44819755 uses multiple binary searches along with a stack to determine the optimal switching points.

Unfortunately, the system tests are extremely weak at testing this logic. Many competitors submitted solutions that never even choose to switch quests. This actually passes all of test cases 1 through 104. It will only fail on test case 105, the very last test case, where ironically N = 2:

2 10
1000 1001 0.1
10 1000 0.5

In this case, except when T = 1 you should always perform the second quest, as this gives you a 50% chance of upgrading and getting 0.5 × 1000 = 500 expected reward every turn thereafter. However, when T = 1, upgrading has no more value, so you should perform the first quest instead for an expected reward of 0.1 × 1000 = 100, as opposed to the second which only gives 0.5 × 10 = 5 immediate reward. So this test case requires just one quest switch at the very end.

This case was enough to catch the majority of wrong submissions. However, since it's very simple, it was still possible for incorrect solutions to get through. In-contest, I believe only one did: 44804123. In particular here is a case where two switches are required:

3 3
3 4 0.5
6 25 0.4
15 16 0.2

The maximum p_i × b_i here is 0.4 × 25 = 10, which is the amount we will get each turn after succeeding and upgrading.

When T = 1, we should choose quest 3 for an immediate reward of 0.2 × 15 = 3, the maximum possible.

However when T = 2, we should choose quest 2, which gives us an expected value of 0.4 × (6 + 10) + 0.6 × 3 = 8.2. Choosing quest 3 only gives us 0.2 × (15 + 10) + 0.8 × 3 = 7.4. Similarly, choosing quest 1 only gives us 8 points.

When T = 3, we should actually choose quest 1, which gives us an EV of 0.5 × (3 + 2·10) + 0.5 × 8.2 = 15.6. This is higher than choosing quest 2, which gives us 0.4 × (6 + 2·10) + 0.6 × 8.2 = 15.32.

So the answer for this test case is 15.6. However bazsi700's Accepted solution above outputs 15.5 because it never uses quest 2.
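To sanity-check numbers like these, here's a small brute-force of the EV recurrence (my own sketch). It assumes, as argued above, that after the first success we upgrade and play the max p_i × b_i quest for every remaining turn:

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

int main() {
    // the three quests from the test case above: a_i, b_i, p_i
    vector<double> a = {3, 6, 15}, b = {4, 25, 16}, p = {0.5, 0.4, 0.2};
    int T = 3;

    double best_upgraded = 0;  // max p_i * b_i, earned every turn after a success
    for (int i = 0; i < 3; i++)
        best_upgraded = max(best_upgraded, p[i] * b[i]);

    // EV[t] = optimal expected reward with t seconds left and no success yet
    vector<double> EV(T + 1, 0.0);

    for (int t = 1; t <= T; t++)
        for (int i = 0; i < 3; i++)
            EV[t] = max(EV[t], p[i] * (a[i] + (t - 1) * best_upgraded)
                             + (1 - p[i]) * EV[t - 1]);

    cout << EV[T] << '\n';  // prints 15.6 for this test case
}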

If you're working on this problem in practice mode and want to know whether you actually solved it, make sure to test the case above and a few others like it!


By neal, 13 months ago, In English

C++ has always had the convenient data structures std::set and std::map, which are tree data structures whose operations take $$$\mathcal{O}(\log n)$$$ time. With C++11, we finally received a hash set and hash map in std::unordered_set and std::unordered_map. Unfortunately, I've seen a lot of people on Codeforces get hacked or fail system tests when using these. In this post I'll explain how it's possible to break these data structures and what you can do in order to continue using your favorite hash maps without worrying about being hacked.


By neal, 14 months ago, In English

After Round 512 finished, I took a look at the rating changes from Div 1 and ended up calculating their average.

To my surprise, the sum of the rating changes was -5383 (anyone want to help double check the math?). This means that for the 500 participants the average rating change was -10.77. I know this contest is probably an outlier, but this seems way too extreme to be reasonable.

If anything, I think the average rating change ought to be slightly positive, in order to reward participation over time. For example if the average rating change per contest is +0.5, then if someone participates in 100 contests over two years (which is some serious dedication), the most this could contribute to their rating is +50, which seems perfectly fine. This also serves as encouragement for relatively inactive people to be more active.

However, averaging more than a 10-point loss in a round is unreasonable and likely to discourage people over time from participating if it keeps happening; i.e., people who maintain a similar level of performance will see their rating go down over time, and people who improve slightly will see their rating stay flat. If my calculations all check out, the rating algorithm likely deserves some reconsideration.

EDIT: After some more observation, it looks like what's happening is that new accounts who do just a few contests, lose rating to everyone else, and go inactive are potentially offsetting this effect overall. It's hard to say for sure without more specific data.


By neal, 14 months ago, In English

Hi Codeforces! In Round 507 earlier today, a large number of "mostly correct" randomized solutions on 1039B - Subway Pursuit were hacked. I wanted to write a quick explanation of how it's possible to hack these, as well as a guide on how to write your code so that you aren't vulnerable to getting hacked.

First to go over the solution quickly (skip down a few paragraphs if you already know the solution): one can use binary search to reduce the range of possible locations for the train to a constant multiple of K, such as 6K. Just make sure to expand out both ends of the range by K after every iteration. Once the range is at most 6K, one can repeat the following:

1) Make a random guess out of the remaining numbers. The probability of success is at least $$$\frac{1}{6K}$$$.

2) Binary search again until the range is at most 6K again. It turns out this only takes one query, since after our previous guess our 6K range will become an 8K range, and a single query will reduce this to at most 6K again (half of 8K is 4K, and the range then expands by K on each end before our next guess).

Since K ≤ 10 and we have 4500 queries, this works because our probability of failure on any test case is at most $$$\left(1 - \frac{1}{60}\right)^{2000} \approx 10^{-15}$$$, extremely low.

However, this makes a key assumption that the train locations are independent of your queries. The main problem that most hacked solutions had was being completely deterministic; i.e., the programs query the same sequence of numbers on every run, given the same output. Most people had this issue due to using rand() without ever calling srand(). There were a few other causes as well: random_device is oddly deterministic on Codeforces, as is mt19937 without a seed. Calling srand() with a fixed value is no good either, as the program is still totally predictable.

Because of the predictability, these programs are quite easy to hack: simply generate the same sequence of queries that the program would make, and set up the train to always be at a different location from each query. To make this even easier for yourself when hacking, you can choose N = 2 and K = 1, which skips the initial binary search phase, and then literally move the train to the non-queried option between 1 and 2 every time.

Workaround?

To get around this, many competitors seeded their generators with the current time by calling srand(time(NULL)). This stops the code from being deterministic and makes it less likely you will get hacked, but it turns out this is still very possible to hack. How? The main problem here is that time(NULL) is only precise to one second. So if I'm able to guess the second that your program gets run, your program is effectively deterministic.

It turns out I don't even need to guess. If I set up N = 11 and K = 10, I can produce all the different query sequences you might generate in the next 10 seconds, since your code will almost certainly be run a few seconds after my generator. Then for every query, at least one of the 11 positions will not be chosen by any of the 10 sequences, and I can move the train to that position each time. Unfortunately -- or fortunately for the people in my room ;) -- I didn't have time to do this during the contest.

Solution

The fix is quite easy. time(NULL) has the right idea, but we should use a much more high-precision clock:

mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());

Then generate random numbers by calling rng(). Remember to include <chrono> and <random> for this.

Another option is to use the address of a newly allocated heap variable:

mt19937 rng((uint64_t) new char);

This should be different every run. Note this creates a tiny memory leak for the life of the program since we never call delete, but since it's just one variable it's not a big deal.

I personally like to combine both of these methods, but it isn't necessary. Either one will give your program much more unpredictability, making it effectively impossible to hack.
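For instance, a combined seed might look like this (my own habit, sketched here; the same <chrono> and <random> includes apply):

mt19937 rng((uint64_t) chrono::steady_clock::now().time_since_epoch().count() ^ (uint64_t) new char);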

Thanks for reading! This will likely lead to me getting fewer hacks in the future, but I thought the community should have a guide explaining this unusual situation.

If you want to read more, take a look at dacin21's great post on anti-hash tests.


By neal, 15 months ago, In English

Don't use rand(). Why? Let's jump right into some code. What value will the following code print, approximately?

#include <cstdlib>
#include <iostream>
using namespace std;

const int ITERATIONS = 1e7;

int main() {
    double sum = 0;

    for (int i = 0; i < ITERATIONS; i++)
        sum += rand() % 1000000;

    cout << "Average value: " << sum / ITERATIONS << '\n';
}

Should be about 500,000, right? Turns out it depends on the compiler, and on Codeforces it prints 16382, which isn't even close. Try it out yourself.

What's happening here?

If you look up C++ documentation on rand(), you'll see that it returns "a pseudo-random integral value between 0 and RAND_MAX." Click again on RAND_MAX and you'll see that "This value is implementation dependent. It's guaranteed that this value is at least 32767." On the Codeforces machines, it turns out RAND_MAX is exactly 32767. That's so small!

It doesn't stop there though; random_shuffle() also uses rand(). Recall that in order to perform a random shuffle, we need to generate random indices up to n, the size of the array. But if rand() only goes up to 32767, what happens if we call random_shuffle() on an array with significantly more elements than that? Time for some more code. What would you expect the following code to print?

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

const int N = 3000000;

double average_distance(const vector<int> &permutation) {
    double distance_sum = 0;

    for (int i = 0; i < N; i++)
        distance_sum += abs(permutation[i] - i);

    return distance_sum / N;
}

int main() {
    vector<int> permutation(N);

    for (int i = 0; i < N; i++)
        permutation[i] = i;

    random_shuffle(permutation.begin(), permutation.end());
    cout << average_distance(permutation) << '\n';
}

This computes the average distance that each value moves in the random shuffle. If you work out a bit of math, you'll find that the answer on a perfectly random shuffle should be N / 3 = 1,000,000. Even if you don't want to do the math, you can observe that the answer is between N / 2 = 1,500,000, the average distance for index 0, and N / 4 = 750,000, the average distance for index N / 2.

Well, once again the code above disappoints; it prints out 64463. Try it yourself. In other words, random_shuffle() moved each element a distance of 2% of the length of the array on average. Based on my testing, the implementation of random_shuffle() on Codeforces matches the following exactly:

    for (int i = 1; i < N; i++)
        swap(permutation[i], permutation[rand() % (i + 1)]);

So naturally if RAND_MAX is much less than N, this shuffle will be problematic.

rand() itself has more quality problems than just RAND_MAX being small though; it is typically implemented as a relatively simple linear congruential generator. On the Codeforces compiler, it looks like this:

static long holdrand = 1L;

void srand(unsigned int seed) {
  holdrand = (long) seed;
}

int rand() {
  return (((holdrand = holdrand * 214013L + 2531011L) >> 16) & 0x7fff);
}

In particular, linear congruential generators (LCGs) suffer from extreme predictability in the lower bits. The k-th bit (starting from k = 0, the lowest bit) has a period of at most 2^(k+1) (i.e., how long the sequence takes to repeat). So the lowest bit has a period of just 2, the second lowest a period of 4, etc. This is why the function above discards the lowest 16 bits, and the resulting output is at most 32767.

What's the solution?

Don't worry, as of C++11 there are much better random number generators available in C++. The only thing you need to remember is to use mt19937, included in the <random> header. This is a Mersenne Twister based on the prime 2^19937 - 1, which also happens to be its period. It's a much higher-quality RNG than rand(), in addition to being much faster (389 ms to generate and add 10^8 numbers from mt19937 in Custom Invocation, vs. 1170 ms for rand()). It also produces full 32-bit unsigned outputs between 0 and 2^32 - 1 = 4294967295, rather than maxing out at a measly 32767.

To replace random_shuffle(), you can now call shuffle() and pass in your mt19937 as the third argument; the shuffle algorithm will use your provided generator for shuffling.

C++11 also gives you some nifty distributions. uniform_int_distribution gives you perfectly uniform numbers, without the bias of mod -- i.e., rand() % 10000 is more likely to give you a number between 0 and 999 than a number between 9000 and 9999, since 32767 is not a perfect multiple of 10000. There are many other fun distributions as well including normal_distribution and exponential_distribution.

To give you a more concrete idea, here's some code using several of the tools mentioned above. Note that the code seeds the random number generator using a high-precision clock. This is important for avoiding hacks specifically tailored to your code, since using a fixed seed means that anyone can determine what your RNG will output. For more details, see How randomized solutions can be hacked, and how to make your solution unhackable.

One last thing: if you want 64-bit random numbers, just use mt19937_64 instead.

#include <algorithm>
#include <chrono>
#include <iostream>
#include <random>
#include <vector>
using namespace std;

const int N = 3000000;

double average_distance(const vector<int> &permutation) {
    double distance_sum = 0;

    for (int i = 0; i < N; i++)
        distance_sum += abs(permutation[i] - i);

    return distance_sum / N;
}

int main() {
    mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
    vector<int> permutation(N);

    for (int i = 0; i < N; i++)
        permutation[i] = i;

    shuffle(permutation.begin(), permutation.end(), rng);
    cout << average_distance(permutation) << '\n';

    for (int i = 0; i < N; i++)
        permutation[i] = i;

    for (int i = 1; i < N; i++)
        swap(permutation[i], permutation[uniform_int_distribution<int>(0, i)(rng)]);

    cout << average_distance(permutation) << '\n';
}

Both shuffles result in almost exactly 10^6 average distance, like we originally expected.

Additional References

This post was inspired in part by Stephan T. Lavavej's talk "rand() Considered Harmful": https://channel9.msdn.com/Events/GoingNative/2013/rand-Considered-Harmful

If you want even faster, higher-quality random number generators, take a look at this site by Sebastiano Vigna.


By neal, 15 months ago, In English

Thanks to the many authors of this contest (all eight of them) for a great set of problems! I'll sketch out the solution ideas for a few of them here:

1019A - Elections

Let's assume our goal is to end the election with v votes and determine the minimum cost to do so.

First, we need to make sure every other party ends up with fewer than v votes. To do this, we should bribe enough people in every party to get them down to v - 1 votes or fewer. We can do this greedily by bribing the cheapest people in the party.

After that if we have v votes (or more), we are done. Otherwise, we should repeatedly bribe the cheapest voters remaining until we get to v. Note that we can do this in O(n) time, rather than O(n log n), by using an O(n) selection algorithm such as nth_element, though this wasn't necessary to pass the problem.
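For example, a sketch of that selection step (mine, not the linked code):

#include <algorithm>
#include <numeric>
#include <vector>
using namespace std;

// total cost of bribing the k cheapest remaining voters, in O(n) expected time
long long cheapest_k_sum(vector<long long> costs, int k) {
    nth_element(costs.begin(), costs.begin() + k, costs.end());
    return accumulate(costs.begin(), costs.begin() + k, 0LL);
}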

Finally, simply loop over all possible values of v and take the minimum cost across all of them for an O(n^2) solution.

Code: 41475525

In fact it is even possible to solve the problem in O(n log n). Here is a solution using ternary search on v: 41534204

1019B - The hat

Let m = n / 2. Let's define the function d(i) = a_i - a_{i+m}. We are looking for i such that d(i) = 0. Notice that since |a_i - a_{i+1}| = 1, d(i) - d(i+1) must be -2, 0, or 2.

First query for the value of d(1). If it is odd, then every d(i) is odd due to our property above, so the answer is -1. Otherwise every value d(i) is even. Notice that d(m+1) = a_{m+1} - a_1 = -d(1). This means that one of d(1) and d(m+1) is a positive even value and the other is a negative even value (unless d(1) = 0, in which case we're done immediately). Since consecutive values can only differ by -2, 0, or 2, there must be an index i between 1 and m+1 that satisfies d(i) = 0, and we can use binary search to find this index.
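A sketch of that binary search (my own code; I'm writing the interaction as "? i" / "! i" from memory, so verify against the statement; each evaluation of d costs two queries):

#include <iostream>
using namespace std;

int n, m;

long long ask(int i) {   // query the value a_i (format assumed, see above)
    cout << "? " << i << endl;
    long long x;
    cin >> x;
    return x;
}

long long d(int i) {     // d(i) = a_i - a_{i+m}, wrapping around the circle
    return ask(i) - ask((i + m - 1) % n + 1);
}

int main() {
    cin >> n;
    m = n / 2;
    long long d1 = d(1);

    if (d1 % 2 != 0) { cout << "! -1" << endl; return 0; }
    if (d1 == 0)     { cout << "! 1" << endl; return 0; }

    // invariant: d(lo) has the same sign as d1 and d(hi) the opposite sign;
    // since d changes in steps of -2 / 0 / +2, a zero must lie in between
    int lo = 1, hi = m + 1;

    while (true) {
        int mid = (lo + hi) / 2;
        long long dmid = d(mid);

        if (dmid == 0) { cout << "! " << mid << endl; return 0; }

        if ((dmid > 0) == (d1 > 0)) lo = mid;
        else hi = mid;
    }
}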

Code: 41502076

1019D - Large Triangle

The obvious solution for this problem is O(n^3), which can actually be squeezed into the time limit if you're careful; e.g., 41488961 and 41491248. However the real solution is O(n^2 log n), which I'll describe here:

Since the area of a triangle is 1/2 × base × height, if we picked two of the n points as our base, we would know exactly what height we needed in order to make a triangle with the goal area. In particular, if we had our points sorted by distance to this segment, we could use binary search to determine whether or not there was such a point in O(log n) time.

We can perform a rotational sweep on the set of points, which lets us set up this height-sorted list for every segment in order to solve the problem. (Imagine smoothly rotating the points clockwise, stopping whenever any two points are perfectly lined up horizontally.)

To do this we can take all n^2 segments and sort them by angle. We also initialize our sorted list of points by y, breaking ties by x. We then iterate through our sorted segment list, swapping two points each time in order to maintain the height-sorted list. Finally we do two binary searches per segment to find if there are any points at the appropriate height (one above and one below).

Note that we can perform all these computations in integers only by using cross products. See the code below for details. Also, kudos to the problem writers for adding the guarantee that "no three cities lie on the same line," which helps keep the code cleaner without changing the key idea of the problem.

Code: 41502172

Unfortunately I don't know the solutions for C or E. If you do, leave a comment!


By neal, 15 months ago, In English

I've run the following test case on several of the accepted submissions for 1017E - The Supersonic Rocket, and more than a third of them get the answer wrong:

4 4
0 0
1 0
1 1
2 1
0 1
1 1
1 0
2 0


The answer is NO since the two polygons are rotationally distinct (they are reflections of each other), but many of the accepted submissions output YES--for example:

ko_osaga https://codeforces.com/contest/1017/submission/41350783

geniucos https://codeforces.com/contest/1017/submission/41375712

Benq https://codeforces.com/contest/1017/submission/41354131

Swistakk https://codeforces.com/contest/1017/submission/41351667

(Try running them yourself using Custom Invocation)

Here's why they get it wrong:

After computing the convex hulls of the two sets of points, the problem boils down to determining if two convex polygons are isomorphic under rotation and translation (that is, whether they can be made the same after rotating and translating).

One strategy for this is building up the string "edge length, angle, edge length, angle, ..." for each polygon and seeing if the two strings are identical after some cyclic rotation. However since computing floating-point angle values is error-prone, one idea is to keep all computations in integers by using the cross product of the two edges that make the angle instead of the angle itself. This serves as a proxy for the angle θ since the cross product of edge vectors a and b equals |a| |b| sin θ, and the solutions above use this idea.

Unfortunately, sin θ doesn't quite uniquely identify θ, since sin θ = sin (π - θ). The case above makes use of this fact by alternating angles of π/4 and 3π/4, making all cross products the same despite angles being different.

Now what?

Fortunately there is a nice fix; instead of the cross product we can use the dot product, which does the trick because a · b = |a| |b| cos θ, and cos θ uniquely identifies θ for 0 ≤ θ ≤ π (our polygons are convex). A similar idea that also works is to use the distance between vertex i and vertex i + 2, which also uniquely identifies the angle when convex (thanks scott_wu and ecnerwala for discussing).

For non-convex polygons we can use the combination of edge length, cross product, and dot product (try to convince yourself why this is sufficient), which again enables us to specify the polygon precisely using only integers.
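As an illustration, here's a sketch (struct and helper names are mine) of an all-integer signature along those lines; two polygons are then identical up to rotation and translation iff one signature sequence is a cyclic shift of the other, which can be checked by searching for one sequence inside the other concatenated with itself:

#include <vector>
using namespace std;

struct Pt { long long x, y; };

long long cross(Pt a, Pt b) { return a.x * b.y - a.y * b.x; }
long long dot(Pt a, Pt b)   { return a.x * b.x + a.y * b.y; }

struct Sig {
    long long len2, crs, dt;  // squared edge length + cross/dot of adjacent edges

    bool operator==(const Sig &other) const {
        return len2 == other.len2 && crs == other.crs && dt == other.dt;
    }
};

vector<Sig> signature(const vector<Pt> &poly) {
    int n = poly.size();
    vector<Sig> sig(n);

    for (int i = 0; i < n; i++) {
        Pt prev = poly[(i + n - 1) % n], cur = poly[i], next = poly[(i + 1) % n];
        Pt e1 = {cur.x - prev.x, cur.y - prev.y};  // incoming edge at vertex i
        Pt e2 = {next.x - cur.x, next.y - cur.y};  // outgoing edge at vertex i
        sig[i] = {dot(e1, e1), cross(e1, e2), dot(e1, e2)};
    }

    return sig;
}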


By neal, 8 years ago, In English

I'm unable to do weekday rounds all summer, and there are probably others with a similar situation.

