### ko_osaga's blog

By ko_osaga, history, 3 years ago, You are given a graph $G$, and for each vertex $v$ you have to assign a positive integer color such that every adjacent pair of vertices (vertices directly connected by edge) have different color assigned. You have to minimize the maximum color assigned: In other words, you have to minimize the number of distinct assigned colors.

But that's graph coloring (vertex coloring). It's hard. Ok, one more time:

You are given a graph $G$, and for each edge $e$ you have to assign a positive integer color such that every adjacent pair of edges (edges sharing same endpoints) have different color assigned. You have to minimize the maximum color assigned: In other words, you have to minimize the number of distinct assigned colors.

Note that this can be interpreted as "graph coloring (vertex coloring) of line graphs". In that sense, you can consider in a similar spirit to "graph coloring of interval graphs", "graph coloring of permutation graphs", blah-blah.

## General graph

Let $D = max_{v \in V(G)}deg(v)$ be the maximum degree of a graph. Since every vertex should be incident to edges with different colors, the edges can't be colored with less than $D$ colors.

So if we somehow find the way to color the edges with $D$ colors, the case is closed. This is obviously not true: Consider a 3-cycle, then $D = 2$, but you need 3 colors.

Still, Vizing discovered that this is approximately true:

• Theorem (Vizing, 1964). Every simple graph can be edge-colored with $D+1$ colors.
• Proof: If you are interested then see here.

Note that:

• Theorem doesn't hold if the graph has multiple edges connecting same vertices.
• It's NP-hard to determine if the graph can be colored with $D$ colors, so optimal coloring is still NP-hard.

Wikipedia discusses the $O(nm)$ algorithm to find a $D+1$ edge coloring of simple graph, which you can find the implementation here. Sometimes, people asks you to solve just that exact same problem, and sometimes you can just cheese problems without thinking too much. Anyway, this is not our main point. let's jump to more interesting part.

## Bipartite Graphs

To find an optimal edge coloring, we have to prove that the edges can be colored with $D$ colors. In general graph this was a little (one color!) bit of fail, but as the countercase had an odd cycle, maybe in bipartite graph this is true?

• Theorem. Every bipartite graph can be edge-colored with $D$ colors.

• Proof. We use induction on $D$.

• If $D = 0$, the proposition is trivial.

• Otherwise, we will partition the graph $G$ with a matching $M$ and a graph $G\setminus M$ with maximum degree $D - 1$. We transform the graph $G$ in such a way that the left/right bipartition have equal number of vertices (just add dummy nodes of degree 0), and every node have degree $D$ (repeatedly pick two nodes with degree $<D$ and add a dummy edge. If this procedure fails, then two bipartition have different sum of degrees, which is impossible).

Let $L, R$ be a bipartition of this new graph. Now we assume every node of $G$ have degree $D$. Suppose there is no perfect matching in $G$. Then by Hall's theorem, there exists a subset of vertices $S \subseteq L$ such that $|N(S)| < |S|$. Then there exists $|S| \times D$ edges connecting between $N(S)$ and $S$. On the other hand, there can't exist more than $|N(S)| \times D$ edges incident to $N(S)$. Contradiction. Thus the perfect matching exists.

Now remove the dummy edges from the perfect matching. Note that the dummy edges don't connect any node which originally had degree $D$. So the edges left from the matching still covers all nodes with degree $D$. Denote this matching $M$. Then $G \setminus M$ have maximum degree $D - 1$. Color the edges in $M$ with color $D$, and color $G \setminus M$ by inductive hypothesis.

• Remark. Theorem still holds when the graph has multiple edges.

This also gives an algorithm to find a edge coloring of bipartite graphs. We just want to find any matching that covers all nodes with maximum degree $D$. If all nodes have degree $D$, then simple bipartite matching works. In general case, since we have to cover some nodes, we have to model this as flow and enforce some edges to have capacity 1. This can be modeled with maximum flow with demands. In any case, if you use Dinic's algorithm or Hopcroft/Karp, time complexity is $O(D \times MN^{0.5})$.

So that's great, we can optimally solve edge coloring of bipartite graph in polynomial time. On the other hand, using matching or even flow with demands is pretty tiring and complicated, can we get rid of flows at all?

There is an implementation of edge coloring in $O(NM)$ with short code and good constant factor, which doesn't rely on flow argument. You can read about this algorithm here (problem F). Here is the implementation I use, which is copied from waterfalls' submission here.

## Faster

Can we do this faster? Let's go back to Dinic's algorithm. Here's a simple yet elegant trick to reduce the time complexity from $O(D \times MN^{0.5})$ to $O(MN^{0.5} \log D)$.

The idea is, as you can guess from the $\log$ factor, divide-and-conquer. If $D$ is odd, you can just use the naive method to reduce the degree to $D - 1$. If $D$ is even, it partitions the graph into two pieces with maximum degree $\frac{D}{2}$. But how?

Let's revisit the following well-known fact related to even degrees:

• Lemma (Euler Circuit). For connected graphs where every node has even degree, there is a circuit which visits every edges exactly once.

And also, there exists an algorithm to compute the Euler circuits in linear time.

So now, let's add some dummy edges again to connect the vertices with odd degree. Obviously, there exists even number of vertices with odd degree, so we can add any kind of matchings between them. Now, every connected component has an Euler circuit, so we can compute them for linear time.

Now, note that the Euler circuit obviously have even length in bipartite graphs, let's denote the circuit as $C = {e_1, e_2, \ldots, e_{2k}}$. Let's color the odd-numbered edges ($e_1, e_3, e_5 \ldots$) as blue, and even-numbered edges as red. Then, for each vertex $v$, we can see it is adjacent to $\frac{deg(v)}{2}$ blue nodes and $\frac{deg(v)}{2}$ red nodes! This is because, if the circuit entered the vertex with some color, it leaves the vertex with different colors.

Thus, just take the Euler circuit for each component, and divide the edges into blue and red sets. Remove dummy edges, and you still have max degree $\frac{D}{2}$ for each color. So you can recurse from that point! Since every edges are considered at at most $O(\log D)$ recursion stage, and they contribute to at most $N^{0.5}$ overhead, the time complexity is $O(MN^{0.5} \log D)$.

## More faster!!

Now let's look for something genuinely fast: Fast enough that it doesn't involve any flow and have $O(M\log M \log D)$ solution!

Going back to the original proof, we know that it is easier to assume that the graph is regular: Every node has degree of $D$. Naive strategy of adding edges, as described in proof, adds too much edges and is inefficient. But we can improve it by "merging" small-degree nodes: If two node have sum of degree at most $D$, then we can merge(contract) two nodes, and any coloring in this new graph still works in the original one.

So, for each side of biparitition, while there is at least 2 nodes with $deg(v) \le \frac{D}{2}$, find them and contract it. After the end of this procedure, we are left with at most 2 nodes (one for each bipartition) with $deg(v) \le \frac{D}{2}$. Now, even if we apply the naive way of adding dummy nodes and edges, we end up adding at most $O(cM)$ edges where $c \le 3$ in my naive calculation.

Now let's assume the graph is regular. Thanks to regularity, we don't have to find flows anymore but can simply calculate a perfect matching in this regular graph. It doesn't look easy to find this faster than $O(M^{1.5})$, but there is an algorithm to compute a perfect matching of regular graph in $O(M \log M)$, which was discovered in 2010: https://arxiv.org/pdf/0909.3346.pdf

To briefly explain what's going on, the algorithm is actually not very different from finding a bipartite matching with flows. We construct a usual flow graph with source and sinks, and find an augmenting path. There are just two notable difference:

• If the edge is in matching, rather than reversing it, we just contract two endpoint of such edge. You can see this doesn't make the situation different.
• We don't use DFS to find path: We use random walk from source, which means we will sample any outgoing edges with random probability, and halt until it reaches the sink.

Intuitively, the second heuristic will run very fast in the initial stage of augmenting path finding. At the first stage the length of path will be length 3, and it will remain so until a certain period. Random walk procedure is much faster than DFS in such case, because it's time complexity is just it's walk length. Now, the paper claims the following:

• Lemma 6 (from paper). If we found $m < n$ matchings in a current graph, then the expected number of steps taken by the random walk is at most $2 + \frac{n}{n - m}$.

Then, we can simply do this for all $0 \le m \le n - 1$, and get $2n + n \sum_{i=1}^{n}\frac{1}{i} = O(n \log n)$!

Lastly, note that this problem can be solved in $O(M \log M)$ time. Interested readers can find it in the second reference link.

## Challenge

The $O(mn)$ algorithm is very simple to implement and it usually don't find long Kempe chains. I doubt that is true for any fast algorithm I mentioned. Plus, it won't be cache oblivious. Can anyone implement any $o(mn)$ algorithm for bipartite edge coloring which runs faster than $O(mn)$ algorithm?

## Practice problem

I added some practice problem for this topic. Some of them are directly related to my article, some of them checks your creativity at reducing different problems to bipartite edge coloring.

## References

Also special thanks to 300iq for introducing me the paper for perfect matching.  Comments (7)
 » Cool story bro.
 » I'm weak at constructive algorithms, so adding additional vertices and edges to make graph regular seems very hard to me. It is much easier to prove that chromatic index is $D$ without changing the graph. By Hall's theorem, you can cover all vertices of degree $D$ in the left part. You can also cover all such vertices in the right part. Now if you look at those 2 matchings and apply the usual symmetric difference idea, you can notice that we can actually cover all the vertices of degree $D$.The algorithm also becomes simpler: we just run Kuhn from all the vertices of degree $D$. We only need one modification: we may need to use an alternating path of even length if its end has degree less than $D$. Since we don't add any new edges the constant is pretty small. It passes the problem from educational round easily: 75006778. Maybe with that cool Euler cycle idea it can even work on big graphs.
•  » » Indeed, it sounds much nicer than flow with demands.
 » 3 years ago, # | ← Rev. 2 →   The divide and conquer by Euler loop is beautiful! Didn't know that one. Was really interesting :)! Thank you! To fully understand the last algorithm complexity I will need to read the paper probably THO haha.
 »
 » 3 years ago, # | ← Rev. 3 →   I think Wikipedia proof of Vizing's Theorem by Diestel is incorrect in reasoning. It should not be "Note that β cannot be missing in x, otherwise we could easily extend ck". We cannot extend ${c_k}$. If β is missing in x, then same color β is missing on either sides of edge ${xy_k}$, thus we can now paint ${xy_k}$ as β. This free up the original color z = ${c_o(xy_k)}$. But by definition of ${y_i}$'s, ${y_{k - 1}}$ was already missing z. So now we can paint ${c_o(xy_{k-1})}$ by z. And now a new color is freed. So doing this recursively we can paint ${c_o(xy_0)}$ using original ${c_o(xy_{1})}$. Thus a proper (Δ+1)-edge-coloring of G would have existed. So we assume that β is guaranteed to exist in x. Now from wiki, "there exists 1 ≤ i < k such that ${c_o(xy_i)}$ = β". This is true because otherwise, now we could have extended ck/yk.ko_osaga what do you think?
 » Can you share us the code for that O(Mlog(M)) complexity.