Suffix Automaton Implementation

Revision en2, by Meaf, 2023-03-23 17:26:01

I've recently been working on suffix automaton code and I have an implementation that I believe to be more intuitive and easier to type (certainly shorter) than the implementation I've seen circulating around (mainly adapted from cp-algorithms). It also appears to benchmark a little faster than the cp-algorithms implementation as well, which is a nice perk. Please, let me know if you can find some improvements to make, either making the code shorter, faster, etc.

Edit: The primary purpose of this code is for ICPC-style contests which limit printed material and have no digital materials (unlike codeforces which allows copy/paste during contests). Thus, my primary focus is brevity and speed, not necessarily readability. There's always tradeoffs; for example, if 20 characters would be added to make the code much more readable, than that's a valid consideration, but I want the number of lines and number of characters to be as small as reasonable without making the code completely arcane.

// len is the longest-length substring ending here
// pos is the first index in the string matching here
// term is whether this node is a terminal (aka a suffix)
struct st { int len, pos, term; st *link; map<char, st*> next; };
st *suffixAutomaton(string &str) {
    st *last = new st(), *root = last;
    for(auto c : str) {
        st *p = last, *cur = last = new st{last->len + 1, last->len};
        while(p && !p->next.count(c))
            p->next[c] = cur, p = p->link;
        if (!p) cur->link = root;
        else {
            st *q = p->next[c];
            if (p->len + 1 == q->len) cur->link = q;
            else {
                st *clone = new st{p->len+1, q->pos, 0, q->link, q->next};
                for (; p && p->next[c] == q; p = p->link)
                    p->next[c] = clone;
                q->link = cur->link = clone;
            }
        }
    }
    while(last) last->term = 1, last = last->link;
    return root;
}
Tags strings, suffix automata, implementation

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en2 English Meaf 2023-03-23 17:26:01 533
en1 English Meaf 2023-03-21 19:07:14 1494 Initial revision (published)