Hi everyone!
Previously, I wrote a general introduction to linear programming duality. In this blog, I would like to go over several problems that can be solved with this technique. Familiarity with the first blog, or general knowledge of dual problems and how to construct them, is expected to navigate this one.
Thanks to brunovsky and Golovanov399 for problem suggestions!
And particularly special thanks to WeakestTopology for problem suggestions and all insightful discussions on the topic!
Dual construction mnemonics
To simplify the construction of dual problems, let's recall the correspondence between constraints/variables in primal and dual problems.
**LP duality mnemonics.** The standard definition of the LP dual problem looks like this:
$$$\begin{gather} \max\limits_x & c^\top x & \color{red}{\text{(maximization)}} & \overset{\text{dual}}{\iff} & \min\limits_\lambda & b^\top \lambda & \color{red}{\text{(minimization)}} \\ s.t. & Ax \leq b & \color{red}{\text{(constraint }\leq\text{)}} & & s.t. & \lambda \geq 0 & \color{red}{\text{(variable }\geq\text{)}} \\ & x \geq 0 & \color{red}{\text{(variable }\geq\text{)}} & & & A^\top \lambda \geq c & \color{red}{\text{(constraint }\geq\text{)}} \end{gather}$$$

Considering some more generic forms, it may be summarized in a table:
| Maximization | Minimization |
|---|---|
| Inequality constraint $$$\leq$$$ | Nonnegative variable $$$\geq$$$ |
| Inequality constraint $$$\geq$$$ | Nonpositive variable $$$\leq$$$ |
| Equality constraint $$$=$$$ | Free variable |
| Nonnegative variable $$$\geq$$$ | Inequality constraint $$$\geq$$$ |
| Nonpositive variable $$$\leq$$$ | Inequality constraint $$$\leq$$$ |
| Free variable | Equality constraint $$$=$$$ |
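This table can even be applied mechanically. Below is a small sketch (the representation and function name are our own, not from any library) that dualizes an LP given in a general form by coefficient lists plus constraint/variable senses:

```python
def dualize(sense, c, A, b, con_ops, var_ops):
    """Mechanical dualization following the table above.

    sense: 'max' or 'min'; c, b: objective and right-hand side;
    A: list of constraint rows; con_ops[i] in {'<=', '>=', '='};
    var_ops[j] in {'>=', '<=', 'free'}.
    Returns the dual in the same representation."""
    if sense == 'max':
        con2var = {'<=': '>=', '>=': '<=', '=': 'free'}
        var2con = {'>=': '>=', '<=': '<=', 'free': '='}
    else:
        con2var = {'>=': '>=', '<=': '<=', '=': 'free'}
        var2con = {'>=': '<=', '<=': '>=', 'free': '='}
    At = [list(col) for col in zip(*A)]  # transpose the matrix
    return ('min' if sense == 'max' else 'max',
            b, At, c,
            [var2con[op] for op in var_ops],   # one dual constraint per primal variable
            [con2var[op] for op in con_ops])   # one dual variable per primal constraint
```

Dualizing twice returns the original program, which is a convenient sanity check of the table.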
Or, keeping it more visual, we may write it as follows:
$$$\begin{gather} \max\limits_{x,y,z} & c^\top x + d^\top y + f^\top z & \overset{\text{dual}}{\iff} & \min\limits_{\lambda,\eta,\mu} & p^\top \lambda + q^\top \eta + r^\top \mu \\ s.t. & Ax+By+Cz \leq p & & s.t. & \lambda \geq 0 \\ & Dx + Ey + Fz \geq q & & & \eta \leq 0 \\ & Gx + Hy + Jz = r & & & \mu\text{ free} \\ & x \geq 0 & & & A^\top \lambda + D^\top \eta + G^\top \mu \geq c \\ & y \leq 0 & & & B^\top \lambda + E^\top \eta + H^\top \mu \leq d \\ & z\text{ free} & & & C^\top \lambda + F^\top \eta + J^\top \mu = f \end{gather}$$$

Swapping variables and constraints
Let's continue from where we left in the previous article:
605C - Freelancer's Dreams. There are $$$n$$$ jobs. The $$$i$$$-th job gets you $$$a_i$$$ experience and $$$b_i$$$ dollars per second. You want to gain at least $$$p$$$ experience and at least $$$q$$$ money overall, while spending as little time overall as possible. How much time would it take?
**Primal formulation.** The statement boils down to the following LP problem, given $$$a_1, \dots, a_n, b_1, \dots, b_n, p, q$$$:
$$$\boxed{\begin{gather} \min\limits_{t_k \in \mathbb R} & t_1 + \dots + t_n, \\ s.t. & t_1 a_1 + \dots + t_n a_n \geq p, \\ & t_1 b_1 + \dots + t_n b_n \geq q, \\ & t_i \geq 0. \end{gather}}$$$

**Dual formulation.** Its dual is
$$$\boxed{\begin{gather} \max\limits_{\lambda,\mu \in \mathbb R} & \lambda p + \mu q, \\ s.t. &\lambda a_i + \mu b_i \leq 1, &(\forall i) \\ &\lambda, \mu \geq 0. \end{gather}}$$$

**Solution.** The optimal solution to the dual problem can be found in $$$O(\log C)$$$ with ternary search.
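A minimal sketch of this ternary search (the function name is ours; it assumes all $$$a_i, b_i > 0$$$, as guaranteed by the problem). For a fixed $$$\lambda$$$, the best $$$\mu$$$ is $$$\min_i (1 - \lambda a_i)/b_i$$$, and the resulting objective is concave in $$$\lambda$$$:

```python
def min_total_time(a, b, p, q, iters=200):
    """Dual of 605C: maximize lam*p + mu*q subject to
    lam*a[i] + mu*b[i] <= 1 and lam, mu >= 0.
    By strong duality, the optimum equals the minimum total time."""
    def g(lam):
        # best feasible mu for this fixed lam
        mu = min((1.0 - lam * ai) / bi for ai, bi in zip(a, b))
        return lam * p + max(mu, 0.0) * q

    lo, hi = 0.0, min(1.0 / ai for ai in a)  # keeps every 1 - lam*a[i] >= 0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):   # g is concave, so shrink the worse side
            lo = m1
        else:
            hi = m2
    return g((lo + hi) / 2)
```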
So, the first nice property of LP duality in competitive programming is that it allows us to swap variables and constraints, effectively reducing the dimensions of the problem when there are very few constraints.
Dual of minimum cost flow
Library Checker — Minimum cost b-flow. Given a flow network, find a minimum cost $$$\mathbf b$$$-flow $$$\mathbf f$$$ and its dual $$$\mathbf \pi$$$.
**Primal formulation.** Minimum cost $$$b$$$-flow is formulated as an LP problem in the following way:
$$$\boxed{\begin{gather} \min\limits_{f_{ij} \in \mathbb R} & \sum\limits_{i,j} f_{ij} c_{ij}, \\ s.t. & 0 \leq f_{ij} \leq q_{ij}, &(\forall i,j) \\ & \sum\limits_{j=1}^n f_{kj} - \sum\limits_{i=1}^n f_{ik} = b_k, &(\forall k) \\ & b_s = b, b_t = -b, \\ & b_k = 0. & (\forall k \not \in \{s, t\}) \end{gather}}$$$

Note that Library Checker's formulation is a bit more generic: it uses lower constraints $$$l_{ij} \leq f_{ij}$$$ instead of just $$$0 \leq f_{ij}$$$, and $$$b_1, \dots, b_n$$$ are given in the input rather than fixed with a source $$$s$$$ and a sink $$$t$$$. Worst of all, it seemingly allows negative cost cycles, so one would need an algorithm that finds a minimum cost circulation after finding some $$$b$$$-flow.
We will only consider the simplified version above to share the core idea.
In terms of the primal problem, the dual variables are defined by complementary slackness:
- If $$$f_{ij} > 0$$$, then $$$c_{ij}^\pi \leq 0$$$;
- If $$$f_{ij} < q_{ij}$$$, then $$$c_{ij}^\pi \geq 0$$$.
Here $$$c_{ij}^\pi = c_{ij} + \pi_j - \pi_i$$$ is the adjusted cost of the edge $$$i \to j$$$.
For further detailed explanation, please refer to the minimum cost flow part of my previous article.
**Dual formulation.** The dual of the minimum cost $$$b$$$-flow problem may be formulated directly as
$$$\boxed{\begin{gather} \max\limits_{\pi_k \in \mathbb R} & \sum\limits_{i,j} \min(0, c_{ij} - \pi_i + \pi_j) q_{ij} - b(\pi_t - \pi_s) \end{gather}}$$$

**Solution.** One possible way to find $$$\pi_k$$$, knowing the residual network of a minimum cost $$$b$$$-flow, is to assign to $$$\pi_k$$$ the negated shortest path length from $$$s$$$ to $$$k$$$ in the residual network. Alternatively, $$$\pi_k$$$ may be perceived as the shortest path length from $$$k$$$ to $$$t$$$. Knowing the residual network of a minimum cost $$$b$$$-flow, it's possible to compute the potentials in $$$O(nm)$$$ with any shortest path algorithm that allows negative edges.
Again, for further detailed explanation, please refer to the minimum cost flow part of my previous article.
In this way, the dual solution may be found even if you didn't use Dijkstra with potentials and used e.g. SPFA instead.
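For instance, a bare-bones edge relaxation (Bellman-Ford) over the residual edges; the function name and the edge-list representation are ours, and it assumes the flow is already optimal, so the residual network has no negative cycles:

```python
def potentials_from_residual(n, res_edges, s):
    """pi[k] = negated shortest-path length from s to k in the
    residual network, found by n-1 rounds of edge relaxation.
    res_edges: list of (u, v, cost) for every edge that still has
    residual capacity (including reverse edges with negated cost)."""
    INF = float('inf')
    dist = [INF] * n
    dist[s] = 0
    for _ in range(n - 1):
        for u, v, c in res_edges:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
    # unreachable vertices get potential 0; their value is irrelevant
    return [-d if d < INF else 0 for d in dist]
```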
Inverse MST
acmsguru — 206. Roads. Given a weighted graph and its spanning tree $$$T$$$, you need to adjust weights of graph edges from $$$c_i$$$ to $$$d_i$$$ in such way that the sum of $$$|c_i - d_i|$$$ is minimum possible and $$$T$$$ is a minimum spanning tree of the graph with new edges.
**Primal formulation.** It is a known property that a spanning tree is minimum if and only if the cost of any external edge $$$j$$$ is not smaller than the cost of any edge on the tree path $$$p_j$$$ that connects the endpoints of $$$j$$$. In other words, $$$d_i \leq d_j (\forall i \in p_j)$$$.
The statement above, using $$$d_i = c_i - x_i$$$ for $$$i \in T$$$ and $$$d_j = c_j + x_j$$$ for $$$j \not \in T$$$, is formulated as an LP problem in the following way:
$$$\boxed{\begin{gather} \min\limits_{x_k \in \mathbb R} & x_1 + \dots + x_m \\ s.t.& c_i - x_i \leq c_j + x_j, & (\forall j \not \in T, \forall i \in p_j) \\ & x_i \geq 0. & (\forall i) \end{gather}}$$$

**Dual formulation.** Solving the original problem is not a trivial task; however, we may formulate its dual:
$$$\boxed{\begin{gather} \max\limits_{\lambda_{ij} \in \mathbb R} & \sum\limits_j \sum\limits_{i \in p_j} \lambda_{ji} (c_i - c_j) \\ s.t.& \sum\limits_{i \in p_j} \lambda_{ji} \leq 1, & (\forall j \not \in T) \\ & \sum\limits_{j : i \in p_j} \lambda_{ji} \leq 1, & (\forall i \in T) \\ & \lambda_{ji} \geq 0. & (\forall j \not \in T, \forall i \in p_j) \end{gather}}$$$

**Solution.** It's not hard to see that the dual is, in fact, an assignment problem on a bipartite graph.
Edges $$$i \in T$$$ form one part of the graph, while edges $$$j \not\in T$$$ form another part of the graph, and the edge between $$$i$$$ and $$$j$$$ costs $$$c_{i} - c_j$$$ and has the capacity $$$1$$$. This already allows us to recover the minimum sum $$$x_1 + \dots + x_m$$$, but the problem asks us to find optimal $$$d_1, \dots, d_m$$$, that is to also find the actual values of $$$x_1,\dots, x_m$$$.
For convenience, let's change edge weights from $$$c_i - c_j$$$ to $$$c_j - c_i$$$ and work with minimum cost flow instead, as we're more familiar with its potentials and their properties than with maximum cost flow.
Let's see how minimum cost flow potentials relate with this problem. Adjusted weights of edges in minimum cost flow are defined as $$$c_{ij}^\pi = c_{ij} + \pi_j - \pi_i$$$. And we know that $$$c_{ij}^\pi \geq 0$$$ for non-saturated edges, while $$$c_{ij}^\pi \leq 0$$$ for edges with non-zero flow. What does it mean for edges in the residual network? The edge $$$i \to j$$$ of cost $$$c_j - c_i$$$ would change to
$$$ c_j - c_i + \pi_j - \pi_i = (c_j + \pi_j) - (c_i + \pi_i). $$$

Then, making the edge $$$i \to j$$$ of infinite capacity, we will guarantee that $$$c_{ij}^\pi = 0$$$ for edges that participate in the assignment and $$$c_{ij}^\pi \geq 0$$$ for all other edges, hence $$$c_j + \pi_j \geq c_i + \pi_i$$$ for any edge in the bipartite graph. This means that
$$$ x_k = \begin{cases} -\pi_k, & k < n, \\ \pi_k, & k \geq n \end{cases} $$$

This defines feasible, though not necessarily optimal, values of $$$x_1, \dots, x_m$$$. What is the total cost of the assignment defined by the residual network with potentials $$$\pi_1, \dots, \pi_m$$$? Any edge participating in the assignment has zero adjusted cost and doesn't contribute to the assignment value. This only leaves us with the edges $$$s \to i$$$, having costs $$$\pi_i-\pi_s$$$, and the edges $$$j \to t$$$, having costs $$$\pi_t-\pi_j$$$. Here, $$$\pi_t$$$ and $$$\pi_s$$$ summed up together give an extra term of $$$b(\pi_t - \pi_s)$$$ in the minimum cost dual.
In this term we should note that $$$\pi_s=0$$$ by definition when using shortest-path-from-$$$s$$$ potentials, and $$$\pi_t \geq 0$$$ at the moment when we're no longer able to use augmenting paths of negative cost. To avoid considering the $$$\pi_t > 0$$$ case separately, it is convenient to adjust the flow network by adding a direct edge $$$s \to t$$$ of cost $$$0$$$ and positive capacity. This way we guarantee that when all negative-cost paths are exhausted, we will have $$$\pi_t=0$$$, and thus we may ignore it.
The property above means that the optimal cost is obtained if we only use potentials of the vertices that actually participate in the assignment. It hints that we may just change potentials of all the other vertices to $$$0$$$ and it would still be a feasible solution.
Why is it true? Consider a vertex $$$i$$$ in the first part of the bipartite graph that wasn't taken in the assignment. Its potential (negated shortest path from $$$s$$$) must be non-negative, as there is still an edge $$$s \to i$$$ of the cost $$$0$$$. Consider a vertex $$$j$$$ in the second part that wasn't taken in the assignment. Its potential must be non-positive, as there is still an edge $$$j \to t$$$.
If we used such potentials in $$$x_i$$$ and $$$x_j$$$, we would increase the cost of edges in $$$T$$$ and decrease the cost of edges not in $$$T$$$. We still get a feasible solution after doing it, meaning we were fine before doing it to begin with, so changing the potentials to $$$0$$$ is OK.
Summarizing everything above, the algorithm is as follows:
- Construct the flow network for the assignment problem;
- Add an extra edge $$$s \to t$$$ with cost $$$0$$$ and positive capacity;
- Find minimum-cost flow $$$f$$$;
- Find the potentials, $$$\pi_k$$$ for vertex $$$k$$$ is the negated length of a shortest path from $$$s$$$ to $$$k$$$ in the residual network;
- For every $$$k$$$ that participates in the resulting assignment (is incident to saturated edge), add $$$\pi_k$$$ to $$$c_k$$$ to get $$$d_k$$$.
Possible implementation.
Duality... On segments?
1696G - Fishingprince Plays With Array Again. You may do the following operations with the array $$$a_1, \dots, a_n$$$:
- Pick $$$1 \leq i < n$$$, then decrease $$$a_i$$$ by $$$x$$$ per second and decrease $$$a_{i+1}$$$ by $$$y$$$ per second;
- Pick $$$1 \leq i < n$$$, then decrease $$$a_i$$$ by $$$y$$$ per second and decrease $$$a_{i+1}$$$ by $$$x$$$ per second.
Let $$$f(a_1, \dots, a_n)$$$ be the minimum time needed to make all $$$a_k$$$ less than or equal to zero. Process $$$q$$$ queries of the following kind:
- Change $$$a_k$$$ to $$$v$$$;
- Given $$$l$$$ and $$$r$$$, print $$$f(a_l, \dots, a_r)$$$.
In this problem, it always holds that $$$a_k \geq 1$$$ for all $$$k$$$.
**Primal formulation.** The problem is quite similar to 605C explained above. Its LP formulation is as follows:
$$$\boxed{\begin{gather} \min\limits_{p_i, q_j \in \mathbb R} & (p_1 + \dots + p_{n-1})+(q_1 + \dots + q_{n-1}) \\ s.t. & x(p_i + q_{i-1}) + y(q_i + p_{i-1}) \geq a_i, & (\forall i > 1) \\ & x p_1 + y q_1 \geq a_1, & \\ & p_i, q_j \geq 0. & (\forall i, j) \end{gather}}$$$

**Dual formulation.** For each of the $$$n$$$ constraints, we'll have a dual variable $$$\pi_k$$$, and each of the $$$2n-2$$$ primal variables will generate a constraint.
To get the coefficient of the $$$i$$$-th dual variable in the $$$j$$$-th constraint, we take the coefficient of the $$$j$$$-th primal variable in the $$$i$$$-th primal constraint. E.g. the variable $$$p_k$$$ occurs in the $$$k$$$-th constraint with the coefficient $$$x$$$ and in the $$$(k+1)$$$-th constraint with the coefficient $$$y$$$:
$$$\boxed{\begin{gather} \max\limits_{\pi_k \in \mathbb R} & \pi_1 a_1 + \dots + \pi_n a_n, \\ s.t. & x \pi_k + y \pi_{k+1} \leq 1, & (\forall k < n) \\ & y \pi_k + x \pi_{k+1} \leq 1, & (\forall k < n) \\ & \pi_k \geq 0. & (\forall k) \end{gather}}$$$

**Solution.** Consider the case of only two variables:
$$$\begin{gather} \max\limits_{\pi_1,\pi_2 \in \mathbb R} & \pi_1 a_1 + \pi_2 a_2, \\ s.t. & x \pi_1 + y \pi_{2} \leq 1, \\ & y \pi_1 + x \pi_{2} \leq 1, \\ & \pi_1,\pi_2 \geq 0. \end{gather}$$$

The constraints define a quadrilateral with four vertices:
*The quadrilateral defined by the inequalities and its vertices*

Thus the space of allowed $$$\pi_1, \dots, \pi_n$$$ may be interpreted in the following way: there is a convex quadrilateral defined by the inequalities and $$$(n-1)$$$ points, the $$$k$$$-th point having coordinates $$$(\pi_k, \pi_{k+1})$$$. You may, any number of times, add the same value to the $$$x$$$ coordinate of the $$$k$$$-th point and the $$$y$$$ coordinate of the $$$(k-1)$$$-th point, while making sure that all points stay within the quadrilateral.
Besides that, you may freely change the $$$x$$$ coordinate of the first point and the $$$y$$$ coordinate of the last point, obviously always making them as large as possible, as $$$a_1,\dots, a_n \geq 0$$$.
The next step in the solution is to notice that it's always possible to pick a solution of the LP which is a vertex of the polytope defined by the constraints (as long as the polytope is bounded). That is, it's always possible to pick a subset of $$$n$$$ inequalities that will be "saturated". In this particular problem, however, all inequalities concern either only $$$1$$$ or only $$$2$$$ variables, so we may construct a graph in which there is an edge between $$$i$$$ and $$$j$$$ whenever there is an inequality that concerns both $$$\pi_i$$$ and $$$\pi_j$$$.
In such a graph, an edge $$$(u, v)$$$ means that we're able to recover $$$u$$$ if we know $$$v$$$ and vice versa. Besides that, it is also possible to resolve any cycle, as substituting variables along the cycle allows us to return to the first variable of the cycle with an equation of the form $$$ax=b$$$. For the set of $$$n$$$ equations to have a unique solution, it is sufficient that each connected component has a cycle. Moreover, since we only use $$$n$$$ edges for $$$n$$$ vertices, the cycle of each component must be unique.
Another particularity of this specific problem is that all cycles, if they're unique in their connected component, will be either self-loops (if we take the equation $$$\pi_k = 0$$$) or between only two vertices (if we take both equations $$$x\pi_k + y \pi_{k+1} = 1$$$ and $$$x \pi_{k+1} + y \pi_k = 1$$$).
In the first case, vertices with even distance from the root will hold $$$0$$$.
And the vertices with odd distance from the root will hold $$$\frac{1}{\max(x, y)}$$$, because $$$\frac{1}{\min(x, y)}$$$ would violate the constraints.
In the second case, all vertices in the connected component will hold the value of $$$\frac{1}{x+y}$$$.
Duality in bipartite graphs
ARC 125 — Snack. There are $$$m$$$ kids and $$$n$$$ kinds of snacks. Distribute the maximum total amount of snacks among the kids in such a way that
- Total distributed amount of $$$j$$$-th snack is at most $$$A_j$$$;
- Total amount of each snack type distributed to $$$i$$$-th child is at most $$$B_i$$$;
- Total amount of snacks distributed to $$$i$$$-th child is at most $$$C_i$$$.
**Primal formulation.** The primal problem is formulated in a straightforward manner. Let $$$x_{ij}$$$ be the amount of snack of type $$$j$$$ distributed to the $$$i$$$-th child:
$$$\boxed{\begin{gather} \max\limits_{x_{ij} \in \mathbb R} & \sum\limits_{i,j} x_{ij}, \\ s.t. & \sum\limits_i x_{ij} \leq A_j, & (\forall j) \\ & \sum\limits_j x_{ij} \leq C_i, & (\forall i) \\ & 0 \leq x_{ij} \leq B_i. & (\forall i, j) \end{gather}}$$$

It can as well be perceived as a maximum flow problem: from the source $$$s$$$ to the $$$j$$$-th snack there is an edge of capacity $$$A_j$$$, from the $$$j$$$-th snack to the $$$i$$$-th child there is an edge of capacity $$$B_i$$$, and from the $$$i$$$-th child to the sink $$$t$$$ there is an edge of capacity $$$C_i$$$.
**Dual formulation.** The network is too huge to compute maximum flow on it directly, so let's write the dual problem. There will be $$$n+m+nm$$$ dual variables: let $$$\lambda_j$$$ be the variable for the $$$\leq A_j$$$ constraint, $$$\mu_i$$$ the variable for the $$$\leq C_i$$$ constraint, and $$$\nu_{ij}$$$ the variable for the $$$\leq B_i$$$ constraint.
Then, the dual problem is written as
$$$\boxed{\begin{gather} \min\limits_{\lambda_j, \mu_i, \nu_{ij} \in \mathbb R} & \sum\limits_j A_j \lambda_j + \sum\limits_i C_i \mu_i + \sum\limits_{i,j} B_i \nu_{ij}, \\ s.t. & \lambda_j + \mu_i + \nu_{ij} \geq 1, & (\forall i, j) \\ & \lambda_j, \mu_i, \nu_{ij} \geq 0. & (\forall i, j) \end{gather}}$$$

**Solution.** Since the primal problem is a maximum flow in the bipartite graph, its dual may be interpreted as a minimum cut in the graph. To be more specific, each of the variables $$$\lambda_j$$$, $$$\mu_i$$$ and $$$\nu_{ij}$$$ will be either $$$0$$$ or $$$1$$$, depending on whether the corresponding edges are taken into the cut.
It means that for each pair $$$(j, i)$$$, we either delete the $$$j$$$-th vertex in the first part ($$$\lambda_j = 1$$$) and pay $$$A_j$$$, or delete the $$$i$$$-th vertex in the second part ($$$\mu_i = 1$$$) and pay $$$C_i$$$, or delete the edge from the $$$j$$$-th vertex in the first part to the $$$i$$$-th vertex in the second part and pay $$$B_i$$$.
Let $$$X_0$$$ and $$$Y_0$$$ be the sets of removed vertices in the first and the second parts correspondingly, and let $$$X_1$$$ and $$$Y_1$$$ be the sets of non-removed vertices in the parts. Then the value of the target function is expressed as
$$$ \sum\limits_{j \in X_0} A_j + \sum\limits_{i \in Y_0} C_i + \sum\limits_{\substack{j \in X_1 \\ i \in Y_1}} B_i. $$$

Using the fact that the third sum does not depend on $$$X_1$$$ itself, but only on its size, we further rewrite it as
$$$ \sum\limits_{j \in X_0} A_j + \sum\limits_{i \in Y_0} C_i + |X_1|\sum\limits_{\substack{i \in Y_1}} B_i. $$$

Let $$$f(k)$$$ be the minimum value of the target function with $$$|X_1| = k$$$; then
$$$ f(k) = \min\limits_{|X_0| = n - k} \sum\limits_{j \in X_0} A_j + \sum\limits_{i} \min(C_i, k B_i). $$$

For each specific $$$k$$$, it is always optimal to pick $$$X_0$$$ as a prefix of all snacks sorted in increasing order of $$$A_j$$$. At the same time, for each $$$i$$$ we know the minimum $$$k$$$ starting from which it is better to use $$$C_i$$$ rather than $$$k B_i$$$. Together, this allows computing $$$f(k)$$$ for any fixed $$$k$$$ in $$$O(\log n)$$$ amortized time, e.g. using prefix sums and binary search. Then, you just pick $$$k$$$ such that $$$f(k)$$$ is minimum possible.
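A plain $$$O(nm)$$$ sketch of this minimization (the function name is ours; the prefix-sums and binary-search speedup is omitted to keep the idea explicit):

```python
def min_cut_value(A, B, C):
    """min over k = |X_1| of
    f(k) = (sum of the n-k smallest A_j) + sum_i min(C_i, k * B_i)."""
    n, m = len(A), len(B)
    pref = [0]
    for x in sorted(A):          # prefix sums of sorted A
        pref.append(pref[-1] + x)
    best = float('inf')
    for k in range(n + 1):
        # delete the n-k cheapest snacks, then per kid pay min(C_i, k*B_i)
        cost = pref[n - k] + sum(min(C[i], k * B[i]) for i in range(m))
        best = min(best, cost)
    return best
```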
The problem could as well be solved with minimum cut directly, but perhaps duality makes it more evident.
Besides, it sheds some light on how flow LP looks like in bipartite graphs, which is particularly useful in the following task.
Duality in bipartite graphs 2
ABC 224 — Security Camera 2. You have a bipartite graph $$$(L, R)$$$. You may pay $$$A_i$$$ to put $$$1$$$ camera at vertex $$$i \in L$$$ or pay $$$B_j$$$ to put $$$1$$$ camera at vertex $$$j \in R$$$. For every pair $$$(i, j)$$$ you want the total number of cameras installed in $$$i \in L$$$ and $$$j \in R$$$ to be at least $$$C_{ij}$$$. What is the least amount you need to pay to satisfy this condition?
**Primal formulation.** Let $$$\alpha_i$$$ be the number of cameras in $$$i \in L$$$ and $$$\beta_j$$$ be the number of cameras in $$$j \in R$$$. Then the primal LP is as follows:
$$$\boxed{\begin{gather} \min\limits_{\alpha_i, \beta_j \in \mathbb R} & \sum\limits_{i \in L} A_i \alpha_i + \sum\limits_{j \in R} B_j \beta_j, \\ s.t. & \alpha_i + \beta_j \geq C_{ij} & (\forall i, j) \\ & \alpha_i, \beta_j \geq 0. & (\forall i, j) \end{gather}}$$$

**Dual formulation.** The dual problem introduces a dual variable $$$\lambda_{ij}$$$ for each constraint. Then it is formulated as
$$$\boxed{\begin{gather} \max\limits_{\lambda_{ij} \in \mathbb R} & \sum\limits_{i,j} C_{ij} \lambda_{ij}, \\ s.t. & \sum\limits_{j \in R} \lambda_{ij} \leq A_i, & (\forall i) \\ & \sum\limits_{i \in L} \lambda_{ij} \leq B_j, & (\forall j) \\ & \lambda_{ij} \geq 0. & (\forall i, j) \end{gather}}$$$

**Solution.** The formulation is quite similar to ARC 125 Snack. Indeed, it's nothing but a maximum cost flow in the network constructed as follows:
- There is an edge of cost $$$0$$$ and capacity $$$A_i$$$ from $$$s$$$ to every $$$i \in L$$$;
- There is an edge of cost $$$0$$$ and capacity $$$B_j$$$ from every $$$j \in R$$$ to $$$t$$$;
- There is an edge of cost $$$C_{ij}$$$ and capacity $$$+\infty$$$ from every $$$i \in L$$$ to every $$$j \in R$$$.
Bonus: How to recover the answer (amount of cameras in each vertex)?
Is this aliens trick?
XIX Open Cup, Grand Prix of Korea — Utilitarianism. Given a weighted tree, find a maximum-weight matching of exactly $$$k$$$ edges.
**Primal formulation.** Note that any tree is a bipartite graph, and maximum-cost $$$k$$$-matching can be formulated as follows:
$$$\boxed{\begin{gather} \max\limits_{x_{ij} \in \mathbb R} & \sum\limits_{(i, j) \in E} C_{ij} x_{ij}, \\ s.t. & \sum\limits_{j : \{i, j\} \in E} x_{ij} \leq 1, & (\forall i \in V) \\ & \sum\limits_{\{i, j\} \in E} x_{ij} = k,\\ & x_{ij} \geq 0. & (\forall i, j) \end{gather}}$$$

**Dual formulation.** Let $$$\nu_i$$$ be the dual variable for the $$$i \in V$$$ constraint, and $$$\lambda$$$ be the dual variable for the $$$=k$$$ constraint. The dual problem is formulated as
$$$\boxed{\begin{gather} \min\limits_{\nu_i, \lambda \in \mathbb R} & \sum\limits_{i \in V} \nu_i + \lambda k, \\ s.t. & \nu_i + \nu_j + \lambda \geq C_{ij}, & (\forall \{i, j\} \in E) \\ & \nu_i \geq 0. & (\forall i) \end{gather}}$$$

**Solution.** Let $$$t(\lambda)$$$ be the minimum of the dual objective with a fixed $$$\lambda$$$. Then the value $$$t(\lambda) - \lambda k$$$ is the solution to the minimization problem
$$$\begin{gather} \min\limits_{\nu_i \in \mathbb R} & \sum\limits_{i \in V} \nu_i, \\ s.t. & \nu_i + \nu_j \geq C_{ij}^\lambda, & (\forall \{i, j\} \in E) \\ & \nu_i \geq 0. & (\forall i) \end{gather}$$$

Here $$$C_{ij}^\lambda = C_{ij} - \lambda$$$. If we're able to compute $$$t(\lambda)-\lambda k$$$ for any $$$\lambda$$$, we may note that the function is convex and therefore its minimum can be found by ternary search. But how to compute it for a fixed $$$\lambda$$$?
There are two ways here. First of all, you can take the second dual and obtain
$$$\begin{gather} \max\limits_{x_{ij} \in \mathbb R} & \sum\limits_{(i, j) \in E} C_{ij}^\lambda x_{ij}, \\ s.t. & \sum\limits_{j : \{i, j\} \in E} x_{ij} \leq 1, & (\forall i \in V) \\ & x_{ij} \geq 0. & (\forall i, j) \end{gather}$$$

This is essentially the same problem as the initial one, but this time without the $$$=k$$$ constraint. It can be solved in $$$O(n)$$$ with DP on the tree.
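A sketch of the inner DP (the function name is ours). For a fixed $$$\lambda$$$ it returns the maximum of $$$\sum (C_{ij} - \lambda) x_{ij}$$$ over matchings of the tree, with edges allowed to be skipped; this value is then plugged into the ternary search over $$$\lambda$$$:

```python
def max_adjusted_matching(n, edges, lam):
    """Tree DP: maximum total (w - lam) over a matching in a tree on
    n vertices; the empty matching (value 0) is always available.
    edges: list of (u, v, w) with 0-based vertices."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # iterative DFS from vertex 0 to get a parent array and an order
    parent = [-1] * n
    seen = [False] * n
    order, stack = [], [0]
    seen[0] = True
    while stack:
        v = stack.pop()
        order.append(v)
        for u, _ in adj[v]:
            if not seen[u]:
                seen[u] = True
                parent[u] = v
                stack.append(u)
    NEG = float('-inf')
    dp0 = [0] * n    # dp0[v]: best in v's subtree, v not matched to a child
    dp1 = [NEG] * n  # dp1[v]: best in v's subtree, v matched to a child
    for v in reversed(order):  # children are processed before parents
        base = 0
        for u, _ in adj[v]:
            if parent[u] == v:
                base += max(dp0[u], dp1[u])
        dp0[v] = base
        for u, w in adj[v]:
            if parent[u] == v:
                # match v with child u: u must then be unmatched below
                cand = base - max(dp0[u], dp1[u]) + dp0[u] + (w - lam)
                dp1[v] = max(dp1[v], cand)
    return max(dp0[0], dp1[0])
```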
The other option is to solve for $$$t(\lambda) - \lambda k$$$ directly. This option is left to the curious reader as an exercise.
Yes, this is essentially the aliens trick. Note that the aliens trick is a bit more general than LP duality and may work for non-linear problems. However, LP problems form quite a large class of cases for which the aliens trick consistently works, without the need to prove it every time.
Bonus: Solve the dual problem directly, without taking a second dual of it.
Duality... On subsets?!
1430G - Yet Another DAG Problem. You're given a weighted DAG on $$$n \leq 18$$$ vertices. You need to assign each vertex an integer $$$a_i$$$, so that for each edge $$$i \to j$$$ we have $$$a_i - a_j > 0$$$, and the total sum of $$$w_{ij} (a_i - a_j)$$$ is minimized.
**Primal formulation.** The problem is formulated as follows:
$$$\boxed{\begin{gather} \min\limits_{a_i \in \mathbb R} & \sum\limits_{(i, j) \in E} w_{ij} (a_i - a_j), \\ s.t. & a_i - a_j \geq 1. & (\forall (i, j) \in E) \end{gather}}$$$

Here we used the $$$\geq 1$$$ constraint, as this is the minimum possible positive difference between two integers.
**Dual formulation.** We will have a dual variable $$$\lambda_{ij}$$$ for each edge $$$(i,j) \in E$$$:
$$$\boxed{\begin{gather} \max\limits_{\lambda_{ij} \in \mathbb R} & \sum\limits_{(i, j) \in E} \lambda_{ij}, \\ s.t. & \sum\limits_{(i,j) \in E} \lambda_{ij} - \sum\limits_{(j,i) \in E}\lambda_{ji} = b_i, & (\forall i \in V) \\ &\lambda_{ij} \geq 0. & (\forall i, j) \end{gather}}$$$

Here $$$b_i$$$ is the "balance" of each vertex, defined as
$$$ b_i = \sum\limits_{(i,j) \in E} w_{ij} - \sum\limits_{(j,i) \in E}w_{ji}. $$$

**Solution.** Essentially, the dual problem tells us that the $$$i$$$-th vertex should create (if $$$b_i > 0$$$) or absorb (if $$$b_i < 0$$$) the amount of flow equal to $$$|b_i|$$$. At the same time, each edge has cost $$$1$$$ and capacity $$$+\infty$$$, and we want to compute a flow of maximum cost.
To reduce flow creation and absorption to only two vertices, we will say that when $$$b_i > 0$$$, there is an edge of cost $$$0$$$ and capacity $$$b_i$$$ from $$$s$$$ to $$$i$$$. Otherwise, we make an edge from $$$i$$$ to $$$t$$$ of cost $$$0$$$ and capacity $$$-b_i$$$.
Note that we need to not only find the size of the answer, but also to recover it. As with inverse MST, we will once again use Johnson's potentials and the complementary slackness.
Let $$$c_{ij}^\pi = c_{ij}+\pi_j - \pi_i$$$ be the adjusted cost of the edge. As we remember, for non-saturated edges it holds that $$$c_{ij}^\pi \geq 0$$$ and for edges with non-zero flow it holds that $$$c_{ij}^\pi \leq 0$$$.
At the same time, we look for a flow of maximum cost, so the signs flip: non-saturated edges have $$$c_{ij}^\pi \leq 0$$$ and edges with non-zero flow have $$$c_{ij}^\pi \geq 0$$$. The edges of the initial graph have infinite capacity and can't be saturated, so the used ones satisfy $$$c_{ij}^\pi = 0$$$ and the unused ones $$$c_{ij}^\pi \leq 0$$$. In other words, for any edge $$$i \to j$$$ it holds that $$$c_{ij} + \pi_j - \pi_i = c_{ij}^\pi \leq 0$$$. Using $$$c_{ij}=1$$$, this means that $$$\pi_i - \pi_j \geq 1$$$ for any edge $$$i \to j$$$, which is exactly what the statement asks for.
On the other hand, the total cost of the flow is determined solely by the edges from $$$s$$$ to $$$i$$$, which contribute $$$b_i\pi_i$$$ to the cost of the flow, and the edges from $$$j$$$ to $$$t$$$, which contribute $$$b_j(\pi_t-\pi_j)$$$. Summed up altogether, this is equal to the sum of $$$w_{ij} (\pi_i - \pi_j)$$$, as the primal problem asks for, plus an excess of $$$b \pi_t$$$. But this excess is exactly the difference between the min-cost flow on the initial and adjusted edges, as this difference is generally equal to $$$b(\pi_t - \pi_s)$$$ and $$$\pi_s=0$$$ from the way we compute potentials.
That being said, the solution is $$$a_k = \pi_k$$$, where $$$\pi_k$$$ is the negated length of the shortest path from $$$s$$$ to $$$k$$$ in the residual network of the maximum flow of maximum cost. Possible implementation.
Nope, no subsets or bitmasks here. The intended solution uses them, hence $$$n \leq 18$$$, but the solution with LP duality is polynomial, so it provides a significant improvement in the complexity of the algorithm.
Inefficient slope trick
713C - Sonya and Problem Without a Legend. You're given the array $$$a_1, \dots, a_n$$$. You may increase or decrease elements of the array by $$$1$$$, an arbitrary number of times. What is the smallest number of such operations you need to make the array non-decreasing?
The problem is famous for being an exemplar of the "slope trick"; still, it can also be solved, less efficiently, with duality.
**Primal formulation.** Let's say that we add $$$\alpha_i$$$ to and subtract $$$\beta_i$$$ from $$$a_i$$$. Then the problem is formulated as
$$$\boxed{\begin{gather} \min\limits_{\alpha_i, \beta_i \in \mathbb R} & \sum\limits_i (\alpha_i + \beta_i), \\ s.t. & a_i + \alpha_i - \beta_i \leq a_{i+1} + \alpha_{i+1}-\beta_{i+1}, & (\forall i < n), \\ & \alpha_i, \beta_i \geq 0. & (\forall i) \end{gather}}$$$

**Dual formulation.** In the dual formulation, we will have $$$n-1$$$ dual variables $$$\lambda_1, \dots, \lambda_{n-1}$$$, so the dual is formulated as
$$$\boxed{\begin{gather} \max\limits_{\lambda_i \in \mathbb R} & \sum\limits_i \lambda_i(a_i - a_{i+1}), \\ s.t. & |\lambda_i - \lambda_{i-1}| \leq 1, & (\forall i > 1), \\ & \lambda_1 \leq 1, \\ & \lambda_i \geq 0. & (\forall i) \end{gather}}$$$

**Solution.** In other words, you start with $$$\lambda_1=0$$$ or $$$\lambda_1=1$$$, then each subsequent variable either stays the same as the previous one or deviates from it by $$$1$$$. Besides that, $$$\lambda_i$$$ is not allowed to go negative. You need to pick them in a way that maximizes the scalar product of $$$\lambda$$$ with the difference array $$$a_{i} - a_{i+1}$$$. This can be done in $$$O(n^2)$$$ with dynamic programming of the kind: $$$dp[i][j]$$$ is the maximum possible prefix sum of such products up to $$$i$$$, while using $$$\lambda_i=j$$$.
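A sketch of this $$$O(n^2)$$$ DP (the function name is ours); by duality its result equals the answer to the primal problem:

```python
def dual_dp(a):
    """Pick integers lam[1..n-1] with lam[1] <= 1, lam[i] >= 0 and
    |lam[i] - lam[i-1]| <= 1, maximizing sum lam[i] * (a[i] - a[i+1]).
    By LP duality this equals the minimum number of operations."""
    n = len(a)
    if n <= 1:
        return 0
    d = [a[i] - a[i + 1] for i in range(n - 1)]
    NEG = float('-inf')
    dp = [NEG] * (n + 1)     # dp[j]: best value so far with current lam = j
    dp[0], dp[1] = 0, d[0]   # lam_1 is 0 or 1
    for i in range(1, n - 1):
        ndp = [NEG] * (n + 1)
        for j in range(n + 1):
            best = dp[j]                      # stay at the same value
            if j > 0:
                best = max(best, dp[j - 1])   # step up by 1
            if j < n:
                best = max(best, dp[j + 1])   # step down by 1
            if best > NEG:
                ndp[j] = best + j * d[i]
        dp = ndp
    return max(dp)
```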
Bonus: Can you solve the dual problem, i.e. find $$$\lambda_1, \dots, \lambda_{n-1}$$$, faster than $$$O(n^2)$$$?
Further exercises
I heard that the following problems can also be solved by LP duality, so you may practice on them if you want more.