Fastest algorithm to find Bipartite matching

→ Обратите внимание

До соревнования
Codeforces Round 940 (Div. 2) and CodeCraft-23
2 дня
Зарегистрироваться »

*есть доп. регистрация

→ Лидеры (рейтинг)

№	Пользователь	Рейтинг
1	ecnerwala	3648
2	Benq	3580
3	orzdevinwang	3570
4	cnnfls_csy	3569
5	Geothermal	3568
6	tourist	3565
7	maroonrk	3530
8	Radewoosh	3520
9	Um_nik	3481
10	jiangly	3467

Страны | Города | Организации

Всё →

→ Лидеры (вклад)

№	Пользователь	Вклад
1	maomao90	174
2	awoo	164
2	adamant	164
4	TheScrasse	159
4	nor	159
6	maroonrk	156
7	-is-this-fft-	150
8	SecondThread	147
9	orz	146
10	pajenegod	145

Всё →

→ Найти пользователя

→ Прямой эфир

Детальнее →

Блог пользователя learnig

Fastest algorithm to find Bipartite matching

Автор learnig, история, 6 лет назад, По-английски

What is the fastest algorithm to compute maximal bipartite matching. What is its worst case run time complexity. How to construct the worst case for such algorithm. Thanks in advance.

learnig
6 лет назад
13

Комментарии (13)

Написать комментарий?

Tanzir5

6 лет назад, # |

The fastest is Hopcroft Karp BPM. It runs in O(sqrt(v)*E). You can go through Wikipedia to learn more.

→ Ответить

learnig

6 лет назад, # ^ |

Do you know In what case it is slow? I think on random graphs it runs pretty fast

→ Ответить

Lelby

6 лет назад, # ^ |

I bet this guy knows something about it: Burunduk1

→ Ответить

Burunduk1

6 лет назад, # ^ |

In d-regular graph (all degrees are equal to d) there is O(nlogn) algorithm Goel, Kapralov, Khanna. Notice, asymptotic is independent of number of edges. The algorithm: consider graph, on which Kuhn runs dfs, and instead of dfs run "random walk" (choose random edge and go throw it).

On random but not regular graphs Kuhn with optimizations is better than Hopcroft.

→ Ответить

triveni

6 лет назад, # ^ |

+13

Bipartite Matching problem is equivalent to max flow in a modified graph.

In graphs which are dense locally at some places and mostly sparse overall, Push Relabel Algorithm with gap Relabel heuristic runs very very fast. Much faster compared to Dinic's (which is equivalent to Hopcroft Karp).

Asymptotic complexity of such algorithm is O(|v|^3) though for general graphs. But on random graphs it runs much more fast. To give you some numbers — my implementation runs within seconds for random graphs with n = 10000 nodes, m = 1000000 edges.

→ Ответить

teja349

6 лет назад, # ^ |

it would be great if you could share your implementation code.

→ Ответить

triveni

6 лет назад, # ^ |

← Rev. 2 →

Code

#include <bits/stdc++.h>

using namespace std;

/*
	Push Relabel O(n^3) implimentation using FIFO method to chose push vertex.
	This uses gapRelabel heuristic to fasten the process even further. If only 
	the maxFlow value is required then the algo can be stopped as soon as the
	gap relabel method is called. However, to get the actual flow values in the
	edges, we need to let the algo terminate itself.
	
	This implimentation assumes zero based vertex indexing. Edges to the graph 
	can be added using the addEdge method only. capacity for residual edges is
	set to be zero. To get the actual flow values iterate through the edges and
	check for flow for an edge with cap > 0.

	This implimentaion is superior over dinic's for graphs where graph is dense  
	locally at some places and mostly sparse. For randomly generated graphs, this
	implimentation gives results within seconds for n = 10000 nodes, m = 1000000 
	edges.

	Code Tested on : SPOJ FASTFLOW
	@author : triveni
*/

typedef int fType;

struct edge
{
	int from, to;
	fType cap, flow;
	edge(int from, int to, fType cap, fType flow = 0) : from(from), to(to), cap(cap), flow(flow) {}	
};

struct PushRelabel
{
	int N;
	vector<edge> edges;
	vector<vector<int> > G;
	vector<int> h, inQ, count;
	vector<fType> excess;
	queue<int> Q;
	PushRelabel(int N) : N(N), count(N<<1), G(N), h(N), inQ(N), excess(N) {}

	void addEdge(int from, int to, int cap) {
		G[from].push_back(edges.size());
		edges.push_back(edge(from, to, cap));
		G[to].push_back(edges.size());
		edges.push_back(edge(to, from, 0));
	}

	void enQueue(int u) {
		if(!inQ[u] && excess[u] > 0) Q.push(u), inQ[u] = true;
	}

	void Push(int edgeIdx) {
		edge & e = edges[edgeIdx];
		int toPush = min<fType>(e.cap - e.flow, excess[e.from]);
		if(toPush > 0 && h[e.from] > h[e.to]) {
			e.flow += toPush;
			excess[e.to] += toPush;
			excess[e.from] -= toPush;
			edges[edgeIdx^1].flow -= toPush;
			enQueue(e.to);
		}
	}

	void Relabel(int u) {
		count[h[u]] -= 1; h[u] = 2*N-2;
		for (int i = 0; i < G[u].size(); ++i) {
			edge & e = edges[G[u][i]];
			if(e.cap > e.flow) h[u] = min(h[u], h[e.to]);
		}
		count[++h[u]] += 1;
	}

	void gapRelabel(int height) {
		for (int u = 0; u < N; ++u) if(h[u] >= height && h[u] < N) {
			count[h[u]] -= 1;
			count[h[u] = N] += 1;
			enQueue(u);
		}
	}

	void Discharge(int u) {
		for (int i = 0; excess[u] > 0 && i < G[u].size(); ++i) {
			Push(G[u][i]);
		}
		if(excess[u] > 0) {
			if(h[u] < N && count[h[u]] < 2) gapRelabel(h[u]);
			else Relabel(u);
		}
		else if(!Q.empty()) { // dequeue
			Q.pop();
			inQ[u] = false;
		}
	}

	fType getFlow(int src, int snk) {
		h[src] = N; inQ[src] = inQ[snk] = true;
		count[0] = N - (count[N] = 1);
		for (int i = 0; i < G[src].size(); ++i) {
			excess[src] += edges[G[src][i]].cap;
			Push(G[src][i]);
		}
		while (!Q.empty()) {
			Discharge(Q.front());
		}
		return excess[snk];
	}
};

int main()
{
	int n, m;
	scanf("%d %d", &n, &m);
	PushRelabel df(n);
	while(m--) {
		int x, y, c;
		// cin >> x >> y >> c; // 0- based index
		scanf("%d%d%d", &x, &y, &c);
		--x, --y;
		if(x != y){
			df.addEdge(x, y, c);
			df.addEdge(y, x, c);
		}
	}
	cout << df.getFlow(0, n-1) << "\n";
	return 0;
}

...

→ Ответить

dacin21

6 лет назад, # ^ |

If you use the lowest-label version and use global relabeling, the runtime can be bounded by $\text{[math]}$ (see Robert J. Kennedy, Jr. Solving unweighted and weighted bipartite matching problems in theory and practice, chapter 3 (The paper does not treat gap relabeling.)).

Even without global relabeling, the lowest label version is usually the fastest one on bipartite matching graphs.

→ Ответить

triveni

6 лет назад, # ^ |

Thanks for sharing. Not so obvious at the moment though, I will ponder on this :)

→ Ответить

helThazar

6 лет назад, # |

+25

One can also use Dinic algo and reduce problem to max flow, that will be as fast as Hopcroft Karp on bipartite graphs.

→ Ответить

bicsi

6 лет назад, # |

+12

You might want to check out this variation of the alternating paths bipartite match algorithm. It is easy to code and very quick for most networks that have particularities (it can do ~1e6 vertices and edges in considerably less than 1s), even quicker than the regular Hopcroft-Karp algorithm (because it doesn't do any bfs beforehand).

Code: https://pastebin.com/q12aBwya

Note that the key optimization here is in the separation of the two for loops.

I have yet to find an example where it does quadratic time (and I think it's very non-trivial). I however suspect that the proof of Edmonds Karp upper bound cannot be applied here, as what the algorithm is doing is not equivalent to a BFS (the alternating paths aren't shortest, per se).

→ Ответить

dacin21

6 лет назад, # ^ |

+24

Your implementation has got an Ω(V²) worst case on a sparse graph (E = Θ(V)).

generator

The input is approximately n, the output is n m followed by m edges. The vertices on either side are zero-based.

n = int(input())
k = n//8*2
ed = []
for i in range(k):
    ed.append((i, i))
for i in range(k-1):
    ed.append((i+1, i))
for i in range(k//2):
    ed.append((i, k+2*i))
    ed.append((2*k+2*i, k+2*i))
    ed.append((2*k+2*i, 3*k+2*i))
    ed.append((k+2*i, i))
    ed.append((k+2*i, 2*k+2*i))
    ed.append((3*k+2*i, 2*k+2*i))
for j in range(k//2):
    i = k//2-1-j
    ed.append((k//2+j, k+2*i+1))
    ed.append((2*k+2*i+1, k+2*i+1))
    ed.append((2*k+2*i+1, 3*k+2*i+1))
    ed.append((k+2*i+1, k//2+j))
    ed.append((k+2*i+1, 2*k+2*i+1))
    ed.append((3*k+2*i+1, 2*k+2*i+1))

print("{} {}".format(n, len(ed)))
for e in ed:
    print("{} {}".format(e[0], e[1]))

The input is a tree consisting of a long path of length $\text{[math]}$ and paths of length 3 hanging from every vertex in the long path. Vertex labels and edge order are chosen in a way such that initially every other edge in the long path is matched and every middle edge in a short path is matched. In a single phase, you'll find a single long augmenting path, alternating left → right and right → left. (The paths of length 3 break your heuristic.) This results in $\text{[math]}$ phases.

The lower bound can be extended to Ω(VE) by adding a dense part K_2x, x with $\text{[math]}$ where you'll scan x² ≈ E edges every phase.

You might be able to fix this by randomizing the vertex and edge ordering.

→ Ответить

bicsi

6 лет назад, # ^ |

← Rev. 3 →

Thank you for your counterexample!

As weird as this sounds, this implementation is the one that basically every high-school student in our country learns about as being the Hopcroft-Karp algorithm, and nobody here has managed to find a counterexample to break the heuristic (I think mostly because I am not very familiar with how to generate maximal test cases for matching in general).

I will try to add this example to our most popular online judge archive. This should be fun.

It seems like randomizing the order of vertices drops down the number of iterations to 8 in this case, although it's (very slightly) slower because of the shuffle algorithm. It's good to note though.

→ Ответить

Соревнования по программированию 2.0

Время на сервере: 19.04.2024 13:37:34 (h2).

Десктопная версия, переключиться на мобильную.

При поддержке