### jeroenodb's blog

By jeroenodb, 2 months ago,

If you're looking for some binary search practice, this blog might be for you. I have compiled a list (I've heard lists are very popular on Codeforces) of some problems I really liked. The only criterion I used is that the solution to the problem has to be vaguely related to binary search. I also tried to have some variety in the problems I chose, so they are not all the same type of application of binary search.

Here are 12 problems, in a roughly increasing order of difficulty, that are vaguely (or less vaguely) related to binary search:

1. 1672E - блокнот.exe
2. Median of two sorted arrays. Although the time limit doesn't enforce it, try to make an $O(\log(n+m))$ solution.
3. IOI 2013 Cave
4. IOI 2020 routers
5. IX'th Problem from NWERC 2021
6. Worm Worries BOI 2018
7. IOI 2015 Scales
8. Peer Streaming Spotify 2011
9. 1028G - Угадай число
10. Secret Sequence Swedish Coding Cup 2017
11. IOI 2017 Prize
12. 1483E - Ва-банк

It turns out the coolest uses (or variations) of binary search I know appeared in interactive problems, who would have guessed :)

• +111

By jeroenodb, history, 7 months ago,

While solving random problems on DMOJ I found this problem: Problem, Contest the problem appeared in. I only read the statement and decided it was not for me, so I moved on. Today, after the round, I remembered this problem because it seemed similar to 1726F - Late For Work (submissions are not allowed). Not only are the problems exactly the same, they even share the same sample, and the explanation of the sample is also the same:

Together with the fact that little_angel has solved this problem on DMOJ:

I am convinced that the authors of Codeforces Round 819 (Div. 1 + Div. 2) and Grimoire of Code Annual Contest 2022 have purposefully copied the problem and didn't even try to hide it. I don't think the contest should be unrated, because the impact was not that big. But I still believe that this is unacceptable for Codeforces contests. You are supposed to create original problems.

• +1149

By jeroenodb, 8 months ago,

Inspired by an old Adamant blog, I reinvented a variation of the technique for calculating the cost of the minimum spanning tree offline, when edge weight updates are allowed. This technique is not new and already appeared in the work of David Eppstein. This was also in Adamant's blog, but I missed it completely :).

So the problem statement is: you're given an undirected, connected, weighted graph with $n$ nodes and $m$ edges. There are $q$ updates of the form: set the weight of the $k$-th edge in the input to $x$. After each update you should report the total weight of a minimum spanning tree. Notice that this is strong enough to simulate edge deletions and insertions, by just giving an edge a weight of infinity when it is "deleted".

This is difficult to do if you were forced to do all updates online. But offline, we can use divide and conquer on time to achieve a decent time complexity. We will aim for a complexity of $O( (q+m) \log(q))$.

Let's start with the groundwork. We will number each update from $1$ to $q$. Each edge's weight changes at certain points in time. If we focus on one edge, this splits the time interval $[0,q]$ into smaller intervals $[0,x_1), [x_1,x_2), \dots, [x_k,q]$, where within each interval the weight of the edge stays the same. The total number of intervals created this way is $m+q$. Let's store all these intervals in structs:

```cpp
struct intervaledge {
    int l, r;         // time interval for when this edge is active
    int u, v, weight; // the edge itself
};
```


To find the MST cost for each time from $1$ to $q$ we will make a recursive function

```cpp
solve(int l, int r, vector<intervaledge> es)
```


that will find the MST costs for all times inside $[l,r)$, recursively calling itself on smaller intervals. To make the height of the recursion tree small, we will recurse on $[l, mid)$ and $[mid,r)$, until we arrive at unit length intervals. The height of the recursion tree will be $\log(q)$.

The smart trick that will make everything fast is that we reduce the size of the graph in each recursive call, such that the number of edges passed into the recursive function is always $O(\text{# of active intervals})$.

We call an interval $[x,y)$ active if it partially overlaps with the interval $[l,r)$, but it does not fully enclose the interval $[l,r)$, with $[l,r)$ being the interval of the current recursive call. I will refer to these intervals and corresponding edges with either active intervals, or active edges.

From the complexity analysis of a segment tree we know that each interval is only active in $O(\log(q))$ recursive calls. If we only do work linear in the number of edges that we pass to each recursive function, we would get a complexity of $O((q+m) \log(q))$ overall. Because we will be using a disjoint set union structure, the actual complexity is slightly worse.

### Compressing the graph

Now comes the hard part: compressing the graph so that we get our nice complexity. In our recursive function we will get two types of intervals: some that fully overlap $[l,r)$, and some that only partially overlap it (the active intervals).

We will artificially lower the cost of the edges belonging to the active intervals to $-\infty$ for a moment. Then we run Kruskal's algorithm for finding the MST of our current graph. Kruskal's algorithm will first consider all active edges, and then the edges of the fully overlapping intervals.

Now it turns out that any fully overlapping edge that belongs to this special MST has to appear in any MST where the active edges have arbitrary weights. This is because whatever the eventual weights of the active edges will be, their place in the sorted list of edges can only move later. These "certainly good" edges form some connected components. Now we can compress those connected components into single vertices, and relabel all edges. This reduces the number of vertices in the new graph to at most $\text{# of active edges}+1$. This is easy to prove: the special MST we created is one component, and all its edges besides the active ones are certainly good. So if we remove all the active edges from the MST, we create at most $\text{# of active edges}$ new components.

### Reducing the number of edges

So our graph already has few vertices; now we try to reduce the number of edges. Let's build another MST, now only from the edges that belong to fully overlapping intervals. We can do this again with Kruskal's algorithm. Any edge that is not used in this MST will certainly not be used when we add the extra edges back to the graph (the active edges that we ignored in this MST). So those edges can be removed without affecting the MSTs we will obtain further on in the recursion.

Because we ignored some edges, it could be that we calculated a minimum spanning forest instead of a tree. What we know is that a forest with $n$ nodes has at most $n-1$ edges. And because the number of nodes is small, the number of remaining edges is also small.

So in our recursive function we run these two procedures to reduce the graph size, and pass the new graph to the left and right smaller intervals. If we used a naive implementation of Kruskal's algorithm, we would get an extra log factor, because we need to sort the edges. Luckily, we need to sort the edges only once, before the recursion begins, and then the runtime of Kruskal's algorithm becomes $O(n + m\,\alpha(n))$.

So the actual time complexity becomes $O( (q+m) \log(q) \alpha(n))$, where the extra factor $\alpha(n)$ is the inverse Ackermann function, due to the use of a DSU.
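To make the two reduction steps concrete, here is a self-contained sketch of the whole recursion. This is a minimal illustration of the technique described above, not the linked solution: the names (`IntervalEdge`, `offlineDynamicMST`, the update convention) are my own, and all error handling is omitted.

```cpp
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;

struct DSU {
    vector<int> p;
    DSU(int n) : p(n) { iota(p.begin(), p.end(), 0); }
    int find(int x) { return p[x] == x ? x : p[x] = find(p[x]); }
    bool unite(int a, int b) {
        a = find(a), b = find(b);
        if (a == b) return false;
        p[a] = b; return true;
    }
};

struct IntervalEdge { int l, r, u, v, weight; }; // alive during times [l, r)

vector<ll> answers; // answers[t] = MST cost at time t

// es must be sorted by weight; base = total weight of already-contracted edges.
void solve(int l, int r, vector<IntervalEdge> es, int n, ll base) {
    if (r - l == 1) { // leaf: every surviving edge is alive here, plain Kruskal
        DSU d(n);
        ll cost = base;
        for (auto& e : es) if (d.unite(e.u, e.v)) cost += e.weight;
        answers[l] = cost;
        return;
    }
    auto active = [&](const IntervalEdge& e) { return e.l > l || e.r < r; };
    // Step 1: pretend active edges have weight -inf (union them first); any
    // fully overlapping edge Kruskal still picks is in every future MST,
    // so contract it and add its weight to the running total.
    DSU d1(n), contract(n);
    for (auto& e : es) if (active(e)) d1.unite(e.u, e.v);
    for (auto& e : es)
        if (!active(e) && d1.unite(e.u, e.v)) {
            base += e.weight;
            contract.unite(e.u, e.v);
        }
    // Step 2: MST of the fully overlapping edges alone; a fully overlapping
    // edge outside it can never be used deeper in the recursion, so drop it.
    DSU d2(n);
    vector<IntervalEdge> kept;
    for (auto& e : es) {
        int u = contract.find(e.u), v = contract.find(e.v);
        if (u == v) continue; // self-loop after contraction
        if (active(e) || d2.unite(u, v)) kept.push_back({e.l, e.r, u, v, e.weight});
    }
    map<int, int> id; // component representative -> small new label
    for (auto& e : kept)
        for (int* x : {&e.u, &e.v})
            *x = id.emplace(*x, (int)id.size()).first->second;
    int nv = max<int>(1, id.size()), mid = (l + r) / 2;
    auto part = [&](int a, int b) { // only pass down edges overlapping [a,b)
        vector<IntervalEdge> res;
        for (auto& e : kept) if (e.l < b && e.r > a) res.push_back(e);
        return res;
    };
    solve(l, mid, part(l, mid), nv, base);
    solve(mid, r, part(mid, r), nv, base);
}

// edges[k] = {u, v, w}; upd[t] = {k, new weight}, taking effect at time t.
// Returns the MST cost after each of the q updates.
vector<ll> offlineDynamicMST(int n, vector<array<int,3>> edges,
                             vector<array<int,2>> upd) {
    int q = upd.size(), m = edges.size();
    vector<int> lastT(m, 0), lastW(m);
    for (int k = 0; k < m; k++) lastW[k] = edges[k][2];
    vector<IntervalEdge> es;
    for (int t = 0; t < q; t++) {
        int k = upd[t][0];
        if (lastT[k] < t) es.push_back({lastT[k], t, edges[k][0], edges[k][1], lastW[k]});
        lastT[k] = t, lastW[k] = upd[t][1];
    }
    for (int k = 0; k < m; k++)
        if (lastT[k] < q) es.push_back({lastT[k], q, edges[k][0], edges[k][1], lastW[k]});
    sort(es.begin(), es.end(),
         [](auto& a, auto& b) { return a.weight < b.weight; });
    answers.assign(q, 0);
    if (q) solve(0, q, es, n, 0);
    return answers;
}
```

On a triangle on $3$ nodes with edges $(0,1,1)$, $(1,2,2)$, $(0,2,3)$ and updates "edge 1 to weight 10", then "edge 0 to weight 5", this returns MST costs $4$ and then $8$.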

Code for the DMOJ problem linked

I hope you enjoyed this blog. If you want to test your own implementation, you can do so in this DMOJ problem, although it has really small constraints.

• +169

By jeroenodb, 8 months ago,

Today, I solved a beautiful problem from the NEERC Subregional Contest: Problem M. Since I found neither an editorial nor a solution in the comments, I decided to write this blog about my solution. For a visualization of my solution, see: Visualization of the program

The problem boils down to constructing a spanning tree of a 2D point set with certain constraints. No three points in the point set are collinear. The points have two colors: red and blue. There is at least one blue and one red point. The constraints on the spanning tree are:

• Edges of the spanning tree can only intersect at the ends, if they are drawn as line segments.
• Edges of the spanning tree can only connect points with different colors.
• For each blue point, the degree of the point in the spanning tree should be equal to $r_i$.

The constraints on $r_i$ are nice: $1 \leq r_i \leq \text{# of red points}$. And $\sum r_i = n-1$, which is of course needed if the final graph needs to be a tree. It turns out that these conditions are strong enough that it is always possible to construct a spanning tree!

Let's prove this by induction on the size of the point set, $n$.

### Base case $n=2$

Because there is at least one red and one blue point, the smallest case is $n=2$: one red and one blue point. Because of the constraint $\sum r_i = n-1 = 1$, the degree of the only blue point must be $1$. The trivial solution is to connect the blue and the red point.

### Induction step $n>2$

Now the hard part: let's assume it is possible to make a spanning tree for all $2 \leq k < n$. Given a point set with $n$ points, how do we construct a spanning tree for it? First, let's sort the points on their x-coordinate, using the y-coordinate as a tiebreaker. Let's do casework on the colors of the leftmost and rightmost points:

#### Case 1: extreme points have different colors

Let's analyze the convex hull of the point set. As these two points both lie on the convex hull, the convex hull contains at least one red and one blue point. So there must be an adjacent pair of points on the hull with different colors. Let's connect these with an edge. If the $r_i$ of the blue point is $1$, we delete this point; else we delete the red point and lower $r_i$ by one. Notice that both $\sum r_i$ and $n$ decreased by one, so the resulting $r_i$ satisfy all constraints. It can also be verified, with some casework, that at least one red and one blue point remain. So we can use the induction hypothesis on this smaller point set, which gives us a spanning tree. Could our newly added edge intersect an existing edge? No, because the edge lies on the convex hull. So in this case we are done.

For the last two cases, where both extreme points have the same color, we will use a different technique. Let's give each point a weight: for blue points, we set $w_i = r_i - 1$, and for red points we set $w_i = -1$. The sum of all weights is $\sum_\text{all points} (-1) + \sum_\text{blue points} r_i = -n + (n-1) = -1$.

So if we could somehow split the point set with a vertical line through some point, such that the left and right halves both have a sum of weights equal to $-1$, we could use the induction hypothesis. For this to work, we need both sides to have at least one point of each color.

#### Case 2: extreme points are both blue

Let's look at the prefix sums of the weights of the point set in sorted order. So the prefix sum is $p_k = \sum_{i<k} w_i$, where the points are labeled from left to right. Because the total sum is $-1$, and the prefix sum starts at $0$, it must cross from nonnegative to negative at some point. Because the weights of all blue points are nonnegative, the only place this can happen is at a red point. Because the weight of a red point is $-1$, the prefix sum will cross from $0$ to $-1$, which is exactly what we wanted: a sum of weights equal to $-1$. Now we can draw a vertical line through this red point, and by the induction hypothesis construct two spanning trees: one containing all points to the left plus the middle point, and one containing all points to the right plus the middle red point. Note that because the red point can't be one of the extreme points, the two point sets are definitely smaller than the initial point set. Since these spanning trees exist in two disjoint regions, they don't intersect. And they connect at the midpoint, so their union is a valid spanning tree of the whole point set! To ensure the regions are disjoint even if there are ties in x-coordinates, we tilt the splitting line slightly.

#### Case 3: extreme points are both red

We do exactly the same, looking for a prefix sum equal to $-1$. Now it turns out this is sometimes not possible. But by similar (but inverted) logic to Case 2, there exists a blue point such that the prefix sum goes from negative to nonnegative. We again make a vertical splitting line through this point, but now we modify the $r_i$ we will use for this blue midpoint in the left and right subproblems. We choose these new $r_i$'s such that the sums of weights of the left and right point sets are again $-1$. Then we apply the induction hypothesis to get two valid spanning trees of the left and right sides, and the union of these spanning trees is a valid spanning tree of the entire point set. This concludes the proof.
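For concreteness, the split-point search of cases 2 and 3 can be sketched as follows. This is a hypothetical helper (the name and interface are mine): points are assumed pre-sorted by x-coordinate, `blue[i]` marks blue points, and `r[i]` is the required degree of blue point $i$ (ignored for red points). In case 3 the returned midpoint's degree is divided between the sides as $-p$ and $r_i + p$, where $p$ is the prefix sum just before the midpoint.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Returns the index of the midpoint to split at, for cases 2 and 3.
int findSplit(const vector<bool>& blue, const vector<int>& r) {
    int n = blue.size();
    long long pref = 0; // p_i = sum of weights of points 0..i-1
    for (int i = 1; i + 1 < n; i++) { // an extreme point can't be the midpoint
        pref += blue[i-1] ? r[i-1] - 1 : -1;
        long long w = blue[i] ? r[i] - 1 : -1;
        if (blue[0]) {
            // Case 2 (both extremes blue): prefix sum crosses 0 -> -1 at a red point.
            if (!blue[i] && pref == 0) return i;
        } else {
            // Case 3 (both extremes red): prefix sum crosses from negative to
            // nonnegative at a blue point; its degree splits as -pref and r[i]+pref.
            if (blue[i] && pref < 0 && pref + w >= 0) return i;
        }
    }
    return -1; // cannot happen when the degree constraints of the problem hold
}
```

For example, on colors B R R B with $r = (1, \cdot, \cdot, 2)$ the weights are $(0,-1,-1,1)$, and the first red point already gives a left half of weight $-1$.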

This proof is constructive, so for actually generating a spanning tree we can follow the induction logic backwards until the spanning tree is fully constructed.

The easiest way to code this is to make a recursive function that constructs a spanning tree for the point set it's given. Based on the colors of the outermost points we either calculate a convex hull or prefix sums from left to right and call the function recursively. We only recurse $n$ times, and the complexity per call can be made $O(n \log n)$, so this gives $O(n^2 \log n)$.

But if we additionally presort the points and ensure the point set stays sorted in the recursive calls, the convex hull calculation can also be done in linear time, resulting in $O(n^2)$ running time.

Although this is not needed, I think this approach can be optimized further by using Smart divide and conquer and Decremental convex hull, and some modifications to quickly find two oppositely colored adjacent points on the convex hull. This would give a recursive formula for the time complexity of $T(n) = \max_{1\leq k < n} T(k)+T(n-k) + O(\min(k,n-k) \log (n) ) = O(n \log^2 n)$, but this is definitely not needed to get AC.

My code can be found here: submission

• +102

By jeroenodb, 19 months ago,

I am currently using the Discord Bot, TLE Bot. This bot calls the Codeforces API's user.status, to request my submissions.

However, it seems that whenever https://codeforces.com/contest/1524/submission/117415184 is included in this list, for example with This call to user.status, Codeforces gives back an Internal Server Error. Now the bot will not serve me problems, because it needs to check which problems I already solved. Could you please look into this?

• +42

By jeroenodb, history, 19 months ago,

I recently solved the problem Playlist on Kattis: https://open.kattis.com/problems/playlist. In brief the statement is:

Given a directed graph with $n \leq 100$ nodes, where the outdegree of each node is less than $40$. All the vertices are colored. Find a path of length $9$ in the graph such that no pair of nodes in the path has the same color, and output this path. If there are multiple solutions, output any; output "fail" if there is no such path. The time limit is very large (12 seconds).

My solution involved meet-in-the-middle:

Split the path into a middle node and two paths of length 4 connected to this middle node.

Now we first calculate, for each node, all subsets of 4 colors such that there exists a path with these four colors going into this node, and one going out of this node. This can be done by a DFS with extra state. The state is of the form (current node, subset of colors we have visited so far). A state can be extended by putting the color of the current node into the subset and going to a neighbor whose color isn't in our subset. With hash tables I ensure that no state is processed twice (like normal DFS). We do this for the given graph and for the reversed graph. A simple bound on the number of states is $\binom{100}{4} \cdot 100 \approx 4 \cdot 10^8$, and for each state we loop through all the out-edges, so the number of operations of this step is about $40 \cdot |\mathrm{States}|$.

In the combining step we loop through all candidate middle nodes. Now we have two collections of subsets (one for the ingoing paths, one for the outgoing). We want to find two sets, one from each collection, that are disjoint. I used inclusion-exclusion to count, for a given set in the first collection, how many sets in the second collection have at least one element in common with it; with this we can detect disjoint sets. This adds $2^4 \cdot |\mathrm{States}|$ to the runtime.
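The inclusion-exclusion step could look roughly like this. The names and interface are my own, and I use a plain `map` keyed on color tuples where my real solution uses hash tables; colors inside each set are assumed sorted and distinct.

```cpp
#include <bits/stdc++.h>
using namespace std;
using Colors = array<int,4>; // 4 distinct colors, sorted

// Is some set of A disjoint from some set of B?
bool existsDisjointPair(const vector<Colors>& A, const vector<Colors>& B) {
    // cnt[sub] = how many sets of B contain every color of sub
    map<vector<int>, long long> cnt;
    for (auto& s : B)
        for (int mask = 1; mask < 16; mask++) {
            vector<int> sub;
            for (int j = 0; j < 4; j++) if (mask >> j & 1) sub.push_back(s[j]);
            cnt[sub]++;
        }
    for (auto& s : A) {
        // Inclusion-exclusion: # of sets of B sharing >= 1 color with s.
        long long intersecting = 0;
        for (int mask = 1; mask < 16; mask++) {
            vector<int> sub;
            for (int j = 0; j < 4; j++) if (mask >> j & 1) sub.push_back(s[j]);
            auto it = cnt.find(sub);
            long long c = (it == cnt.end()) ? 0 : it->second;
            intersecting += (__builtin_popcount(mask) & 1) ? c : -c;
        }
        if (intersecting < (long long)B.size()) return true; // some set of B avoids s
    }
    return false;
}
```

The check `intersecting < |B|` is the key: if fewer than $|B|$ sets intersect $s$, some set of $B$ must be disjoint from it.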

Because of all the hash tables and the huge number of states, my solution ran in 11 seconds.

Here's my code: https://ideone.com/JEZoNF

Is there a simpler or faster solution? I couldn't find an editorial from the contest (Spotify Challenge 2011).

• +22

By jeroenodb, history, 2 years ago,

Hey codeforces,

I found an old blog about an interesting problem on persistent lists:
Old blog Problem and submit on e-olymp

In this problem you start with a bunch of lists of integers (all with length 1 at the start). There are three operations:

• merge i j: Make a brand new list that is the concatenation of list i and list j. (i and j could be the same)

• head i: Make two new lists: One with the first element (the head) of list i and one with all the other values

• tail i: Make two new lists: One with all the elements of list i except the last, and one with just the last element

After each operation you have to report the sum of numbers in all the newly created lists modulo $10^9 + 7$.

Constraints are: the number of starting lists (each with one number inside) is $\leq 10^5$, and the number of operations is $\leq 10^5$.

The difficulty of this problem lies in the fact that with the merge operation you can double the size of the largest list every time. This means that a list can have a size of $2^{10^5}$. Another difficulty is that you have to remember all the previous lists: operations create new lists and never modify old ones.

I tried my hand at it and came up with a solution, inspired by the old blog.

For each list we keep track of its sum, and of its first $10^5$ and last $10^5$ elements. This is all we need, because the head and tail operations only remove one element from the front or back at a time. For fast splitting and concatenation of the stored head and tail elements I used persistent treaps. There is also a special case for when the size of a list is still smaller than $2 \cdot 10^5$: then I store the whole list in one treap.

With my persistent treaps, split and merge can be done in $O(\log n)$ expected time, with $O(\log n)$ extra nodes created per operation. With persistence, all previous versions of the data structure stay intact, which makes it possible to do the merge, head and tail operations. (head and tail can be simulated by splitting the treap into pieces of length $1$ and $n-1$.) Because there are $m$ operations and each operation does a constant number of splits and merges on a persistent treap of size at most $2m$, the total time and space complexity becomes $O(n + m \log m)$, where $n$ is the number of initial lists. You can check out my code here: https://ideone.com/GVwY1W.
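To illustrate the path-copying idea, here is a minimal sketch of a persistent implicit treap maintaining subtree sums. It is a simplified stand-in for my actual implementation (names are my own): every split and merge copies only the $O(\log n)$ nodes on its path, so all old versions remain valid.

```cpp
#include <bits/stdc++.h>
using namespace std;

struct Node { Node *l, *r; int pri; long long val, sum; int sz; };

Node* make(long long v) { return new Node{nullptr, nullptr, rand(), v, v, 1}; }
int sz(Node* t) { return t ? t->sz : 0; }
long long sum(Node* t) { return t ? t->sum : 0; }
Node* copy(Node* t) { return new Node(*t); } // persistence: copy before mutating
Node* pull(Node* t) { // recompute size and sum from children
    t->sz = sz(t->l) + sz(t->r) + 1;
    t->sum = sum(t->l) + sum(t->r) + t->val;
    return t;
}
// Concatenate sequences a and b; old versions of a and b stay intact.
Node* merge(Node* a, Node* b) {
    if (!a || !b) return a ? a : b;
    if (a->pri > b->pri) { a = copy(a); a->r = merge(a->r, b); return pull(a); }
    b = copy(b); b->l = merge(a, b->l); return pull(b);
}
// Split into (first k elements, the rest); the old version of t stays intact.
pair<Node*, Node*> split(Node* t, int k) {
    if (!t) return {nullptr, nullptr};
    t = copy(t);
    if (sz(t->l) >= k) {
        auto [a, b] = split(t->l, k);
        t->l = b; return {a, pull(t)};
    }
    auto [a, b] = split(t->r, k - sz(t->l) - 1);
    t->r = a; return {pull(t), b};
}
```

For example, after building the list $1,2,3,4,5$ and splitting off the first two elements, the two new versions have sums $3$ and $12$, while the old version still reports sum $15$.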

• +21

By jeroenodb, history, 2 years ago,

Dear Codeforces,

I made a small blog on different visualizations I made for geometry problems I solved. I think the coolest one to look at is the Voronoi diagram construction with Fortune's sweep. You can check it out here: Blog.

• +140