Heuristic algorithm for Hamiltonian path in directed graphs

#	User	Rating
1	ecnerwala	3649
2	Benq	3581
3	orzdevinwang	3570
4	Geothermal	3569
4	cnnfls_csy	3569
6	tourist	3565
7	maroonrk	3531
8	Radewoosh	3521
9	Um_nik	3482
10	jiangly	3468

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	161
4	TheScrasse	159
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	SecondThread	147
9	orz	146
10	pajenegod	145

(It's a translated version of my blog at https://peehs-moorhsum.blog.uoj.ac/blog/6384. )

Finding a Hamiltonian path in a directed graph is a well-known NP problem.

However, about a year ago, I came up with the following heuristic algorithm which has GREAT performance on random graphs(by first generating a hamiltonian path, adding random edges, then randomly permuting indices) and many CP problems.

Here is the idea of the algorithm:

We maintain a subset $$$S$$$ of edges, ensuring there's no cycle and each vertex has in-degree and out-degree at most 1. Whenever $$$|S|$$$ reaches $$$n-1$$$, we find a hamiltonian path.

We start with $$$S$$$ empty. Each time we pick a random edge $$$e \notin S$$$. If we can add $$$e$$$ to $$$S$$$ without violating any rule, we simply add it to $$$S$$$. Otherwise, if $$$e$$$ won't create a cycle and violate with exactly one edge e', we replace e' with $$$e$$$ with 50% probability. Repeat until $$$|S|$$$ reaches $$$n-1$$$.

We can use LCT to check cycles. Here's my code, which takes less than 1 second on most random graphs with $$$|V|=100000, |E|=500000$$$:

namespace hamil {
    template <typename T> bool chkmax(T &x,T y){return x<y?x=y,true:false;}
    template <typename T> bool chkmin(T &x,T y){return x>y?x=y,true:false;}
    #define vi vector<int>
    #define pb push_back
    #define mp make_pair
    #define pi pair<int, int>
    #define fi first
    #define se second
    #define ll long long
    namespace LCT {
        vector<vi> ch;
        vi fa, rev;
        void init(int n) {
            ch.resize(n + 1);
            fa.resize(n + 1);
            rev.resize(n + 1);
            for (int i = 0; i <= n; i++)
                ch[i].resize(2), 
                ch[i][0] = ch[i][1] = fa[i] = rev[i] = 0;
        }
        bool isr(int a)
        {
            return !(ch[fa[a]][0] == a || ch[fa[a]][1] == a);
        } 
        void pushdown(int a)
        {
            if(rev[a])
            {
                rev[ch[a][0]] ^= 1, rev[ch[a][1]] ^= 1;
                swap(ch[a][0], ch[a][1]);
                rev[a] = 0;
            }
        }
        void push(int a)
        {
            if(!isr(a)) push(fa[a]);
            pushdown(a); 
        }
        void rotate(int a)
        {
            int f = fa[a], gf = fa[f];
            int tp = ch[f][1] == a;
            int son = ch[a][tp ^ 1];
            if(!isr(f)) 
                ch[gf][ch[gf][1] == f] = a;    
            fa[a] = gf;

            ch[f][tp] = son;
            if(son) fa[son] = f;

            ch[a][tp ^ 1] = f, fa[f] = a;
        }
        void splay(int a)
        {
            push(a);
            while(!isr(a))
            {
                int f = fa[a], gf = fa[f];
                if(isr(f)) rotate(a);
                else
                {
                    int t1 = ch[gf][1] == f, t2 = ch[f][1] == a;
                    if(t1 == t2) rotate(f), rotate(a);
                    else rotate(a), rotate(a);    
                } 
            } 
        }
        void access(int a)
        {
            int pr = a;
            splay(a);
            ch[a][1] = 0;
            while(1)
            {
                if(!fa[a]) break; 
                int u = fa[a];
                splay(u);
                ch[u][1] = a;
                a = u;
            }
            splay(pr);
        }
        void makeroot(int a)
        {
            access(a);
            rev[a] ^= 1;
        }
        void link(int a, int b)
        {
            makeroot(a);
            fa[a] = b;
        }
        void cut(int a, int b)
        {
            makeroot(a);
            access(b);
            fa[a] = 0, ch[b][0] = 0;
        }
        int fdr(int a)
        {
            access(a);
            while(1)
            {
                pushdown(a);
                if (ch[a][0]) a = ch[a][0];
                else {
                    splay(a);
                    return a;
                }
            }
        }
    }
    vi out, in;
    vi work(int n, vector<pi> eg, ll mx_ch = -1) { 
        // mx_ch : max number of adding/replacing  default is (n + 100) * (n + 50) 
        // n : number of vertices. 1-indexed. 
        // eg: vector<pair<int, int> > storing all the edges. 
        // return a vector<int> consists of all indices of vertices on the path. return empty list if failed to find one.  
        out.resize(n + 1), in.resize(n + 1);
        LCT::init(n);
        for (int i = 0; i <= n; i++) in[i] = out[i] = 0;
        if (mx_ch == -1) mx_ch = 1ll * (n + 100) * (n + 50); //default
        vector<vi> from(n + 1), to(n + 1);
        for (auto v : eg)
            from[v.fi].pb(v.se), 
            to[v.se].pb(v.fi);
        unordered_set<int> canin, canout;
        for (int i = 1; i <= n; i++)
            canin.insert(i), 
            canout.insert(i); 
        mt19937 x(chrono::steady_clock::now().time_since_epoch().count());
        int tot = 0;
        while (mx_ch >= 0) {
        //    cout << tot << ' ' << mx_ch << endl;
            vector<pi> eg;
            for (auto v : canout)
                for (auto s : from[v])
                    if (in[s] == 0) {
                        assert(canin.count(s));
                        continue;
                    }
                    else eg.pb(mp(v, s));
            for (auto v : canin)
                for (auto s : to[v])
                    eg.pb(mp(s, v));
            shuffle(eg.begin(), eg.end(), x);
            if (eg.size() == 0) break;
            for (auto v : eg) {
                mx_ch--;
                if (in[v.se] && out[v.fi]) continue;
                if (LCT::fdr(v.fi) == LCT::fdr(v.se)) continue;
                if (in[v.se] || out[v.fi]) 
                    if (x() & 1) continue;
                if (!in[v.se] && !out[v.fi]) 
                    tot++;
                if (in[v.se]) {
                    LCT::cut(in[v.se], v.se);
                    canin.insert(v.se);
                    canout.insert(in[v.se]);
                    out[in[v.se]] = 0;
                    in[v.se] = 0;
                }
                if (out[v.fi]) {
                    LCT::cut(v.fi, out[v.fi]);
                    canin.insert(out[v.fi]);
                    canout.insert(v.fi);
                    in[out[v.fi]] = 0;
                    out[v.fi] = 0;
                }
                LCT::link(v.fi, v.se);
                canin.erase(v.se);
                canout.erase(v.fi);
                in[v.se] = v.fi;
                out[v.fi] = v.se;
            }
            if (tot == n - 1) {
                vi cur;
                for (int i = 1; i <= n; i++) 
                    if (!in[i]) {
                        int pl = i;
                        while (pl) {
                            cur.pb(pl), 
                            pl = out[pl];
                        }
                        break;
                    } 
                return cur;
            }
        }
        //failed to find a path
        return vi();
    }
}

Note that if the start vertex $$$S$$$ and/or end vertex $$$T$$$ of the path is given, we simply create two new vertices $$$A, B$$$ and add edges from $$$A$$$ to $$$S$$$, from $$$T$$$ to $$$B$$$. Any Hamiltonian path in the new graph automatically has $$$A\to S$$$ as the first edge and $$$T\to B$$$ as the last edge. If we want to find a Hamiltonian cycle, we simply enumerate an edge, then convert the problem to finding a path given start/end vertices. For bidirectional case, we can simply add each edge twice.

Comments (11)

Write comment?

Rewinding

3 years ago, # |

Auto comment: topic has been updated by Rewinding (previous revision, new revision, compare).

→ Reply

Benq

+11

I assume you meant $$$|V|=10^4$$$? With $$$|V|=10^5$$$ and $$$|E|=5\cdot 10^5$$$ directed edges the expected number of vertices with out-degree 0 is quite large ... ($$$\approx |V|\cdot e^{-5}\approx 674$$$)

geniucos

3 years ago, # ^ |

+23

He did say "by first generating a hamiltonian path, adding random edges, then randomly permuting indices" which I guess means he ensures there are no such vertices (and I guess there are ~674 vertices with degree 1 then?)

← Rev. 3 →

+14

aha, can't read

Well assuming $$$4\cdot 10^5$$$ edges in addition to the hamiltonian path then that's $$$|V|\cdot e^{-4}\approx 1832$$$ with out-degree 1.

Yeah, that's true. I also tried many other random graphs, and it turned out that the performance is approximate $$$O(|E|\log|V|)$$$, not depends on $$$|E|/|V|$$$ very much (but it sometimes stuck in cases $$$|E|~3|V|$$$). I guess there're many paths in random graphs, so random guesses work most of the time?

wery0

Very cool approach!

Instead of time(0) it's better to use chrono::steady_clock::now().time_since_epoch().count() inside mt19937, because we can call function work() several times for better probabilities. If we call this function with the same arguments several times per second, time(0) returns the same value, which is bad.

Or we can move mt19937(time(0)) out of work()'s body.

Yeah I see. Updated! >w<

← Rev. 2 →

You can test any input with the code above! Just generate graphs (store edges as vector<pair<int, int>>, where (u, v) denotes an edge from $$$u$$$ to $$$v$$$), and put them into hamil::work. Initialization is done automatically, so you can test graphs in series.

Elegia

+24

This idea is also useful in some problems which are related to Hamiltonian path! In JOISC 2021 Day1 T1 Aerobatics, applying this idea will get >80 points easily.

SoMuchDrama

+20

I tried to apply this idea here and looks like it doesn't perform great for the bidirectional case, where not so many random edges added, so it won't work for very simple undirected graphs. For example, it fails to find the path for 5000 vertices and undirected edges $$$1-2-3-\ldots-N$$$ in 2 seconds.

+40

I see. Seems like the algorithm works bad when there are lots of directional paths but few Hamiltonian paths(which is often true when bidirectional).

(It also depends on seed a lot. For 1-2-3-...-N with N = 5000, it runs rather fast with seed = 377921677545000 (the 1 out of 100 seeds that success at mx_ch = 1e6 lol))

But bidirectional case should be easier, as we can only restrict vertices to have $$$deg \le 2$$$ rather than $$$out \le 1$$$ and $$$in \le 1$$$. I'll write another code for bidirectional later. :)

Rewinding's blog