Permutation group basis construction (Schreier–Sims algorithm)

Hi everyone!

I had this blog drafter for quite a while, but the $300 contest motivated me to actually finish and publish it. I don't care about the prize that much as I think many other blogs published recently are talking about something even cooler. But I like to make people suffer, so I will apply anyway to make peltorator read even more stuff >:)

Besides, the contest makes it harder to stay on contribution top, so I got to do something to stay afloat. That being said...

Today I'd like to write about an algorithm that solves the following problem:

You're given a set of permutations $$$G = \{g_1, \dots, g_m\}$$$. Find the size $$$|\langle g_1, \dots, g_m \rangle|$$$ of the group generated by $$$G$$$.

At first, I wanted to get into a lot of group-theoretic details with proofs and all, but then I felt like it would make the article harder to comprehend, so I will try to keep group-theoretic stuff to minimum, while telling more about the high-level idea.

Prerequisites

It is strongly recommended to be familiar with some simpler basis constructing algorithms, e.g. Gauss elimination or XOR basis.

Familiarity with basic group theory (e.g. definitions) will also be beneficial.

What does the statement mean?

In this article, we will treat permutations as a bijective functions from $$$\{1, \dots, n\}$$$ to itself.

For example, the permutation $$$\sigma = \begin{pmatrix}1 & 2 & 3 \newline 2 & 3 & 1\end{pmatrix}$$$ is perceived as a function $$$\sigma$$$, such that $$$\sigma(1) = 2$$$, $$$\sigma(2) = 3$$$ and $$$\sigma(3) = 1$$$.

Then, we may as well define the composition $$$\sigma_1 \sigma_2$$$ of permutations $$$\sigma_1$$$ and $$$\sigma_2$$$ as $$$(\sigma_1 \sigma_2)(k) = \sigma_1(\sigma_2(k))$$$.

By the group $$$\langle G \rangle$$$ generated by $$$G$$$ we mean the set of permutation that we will get by repeatedly composing permutations with each other in all possible ways, until no new permutations may be produced. Further we will rely on the following properties of permutations:

There is an identity permutation $$$e(k) = k$$$, such that $$$e \sigma = \sigma e = \sigma$$$ for any $$$\sigma$$$;
For every permutation $$$\sigma$$$, there is unique $$$\sigma^{-1}$$$ such that $$$\sigma(\sigma^{-1}(k)) = \sigma^{-1}(\sigma(k)) = k$$$, that is $$$\sigma \sigma^{-1} = \sigma^{-1} \sigma = e$$$.

Together, the properties above just say that permutations form a group.

High-level idea

General idea of the algorithm is to find $$$k$$$ sets $$$G_1, \dots, G_k \subset G$$$ such that every element $$$g \in G$$$ may be uniquely represented as $$$g = g_1 g_2 \dots g_k$$$, where $$$g_1 \in G_1, \dots, g_k \in G_k$$$. Then it will also hold that $$$|G| = |G_1| |G_2| \dots |G_k|$$$.

It would be even nicer to make $$$G_1, \dots, G_k$$$ such that finding $$$g_1 g_2 \dots g_k$$$ for any given $$$g$$$ is a reasonably simple task.

Which $$$G_1, \dots, G_k$$$ to use?

This idea above is quite similar to representing any vector $$$r$$$ of a linear space as a linear combination $$$r = a_1 r_1 + \dots + a_k r_k$$$ of basis vectors. The basis $$$r_1, \dots, r_k$$$ is usually constructed in a row-echelon form, meaning that each vector has a specific coordinate associated with it, so that all basis vectors with higher indices have zero coefficient in this coordinate. In this way, the coefficient $$$a_i$$$ may be found just by looking on the corresponding coordinate of $$$r-a_1 r_1 - \dots - a_{i-1} r_{i-1}$$$.

Generally, in linear spaces we may allow zero vectors in the basis and choose $$$r_1, \dots, r_n$$$ in such way that $$$r_i$$$ has zero value in all coordinates $$$j < i$$$. Then, the vector $$$a_i r_i$$$ is uniquely determined by the $$$i$$$-th coefficient of $$$r - a_1 r_1 - \dots - a_{i-1} r_{i-1}$$$.

Getting back to permutations, $$$G_i$$$ may be interpreted as a set of all possible values of $$$a_i r_i$$$. So instead of representing the vector $$$r$$$ as a linear combination of $$$a_i r_i$$$, we represent the permutation $$$g$$$ as a composition of $$$g_i$$$. In linear spaces, we wanted $$$a_i r_i$$$ to have the value $$$0$$$ in coordinates $$$j < i$$$. And also for $$$a_i r_i$$$ to be uniquely determined by the $$$i$$$-th coordinate of $$$r - a_1 r_1 - \dots - a_{i-1} r_{i-1}$$$.

What would it mean for us in permutation context?

What is the equivalent of $$$r - a_1 r_1 - \dots - a_{i-1} r_{i-1}$$$ in permutations?

We ultimately want $$$g = g_1 \dots g_n$$$. Cancelling first $$$i-1$$$ permutations on both sides, we get to $$$g' = g_{i-1}^{-1} \dots g_1^{-1}g$$$.

What is the equivalent of coordinates in permutations?

It's the values $$$\sigma(1), \sigma(2), \dots, \sigma(n)$$$.

What is the equivalent of $$$j$$$-th coordinate being $$$0$$$ for $$$j < i$$$?

We needed it, so that the composition of $$$g_{i}^{-1}$$$ and $$$g'$$$ always preserved $$$g'(j) = j$$$ for $$$j < i$$$.

By definition it means that $$$g_i^{-1}(g'(j)) = g'(j) = j$$$, which only happens when $$$g_i^{-1}(j) = g_i(j) = j$$$.

What is the equivalent of "$$$a_i r_i$$$ is uniquely determined by the $$$i$$$-th coordinate of $$$r - a_1 r_1 - \dots - a_{i-1} r_{i-1}$$$"?

In the very end we ultimately want $$$g_n^{-n} \dots g_1^{-1} g = e$$$, where $$$e$$$ is the identity permutation $$$e(k) = k$$$.

To achieve this, we must pick $$$g_i$$$ in such way that $$$g_i^{-1}$$$ will turn $$$g'(i)$$$ into $$$i$$$.

By definition, it means that $$$g_i^{-1}(g'(i)) = i$$$, or equivalently $$$g_i(i) = g'(i)$$$.

In other words, we must have $$$a(i) \neq b(i)$$$ for any $$$a, b \in G_i$$$.

That being said, Schreier-Sims algorithm constructs $$$n$$$ sets $$$G_1, \dots, G_n$$$ such that

every element $$$g_i \in G_i$$$ has $$$g_i(j) = j$$$ for $$$j < i$$$;
all elements $$$g_i \in G_i$$$ have distinct $$$g_i(i)$$$;
The sets $$$G$$$ and $$$G_1 \cup G_2 \cup \dots \cup G_n$$$ generate the same permutation group.

I will use Python in my code examples below. Let schreier_sims(S) be the function that takes a list of permutations $$$G$$$ and returns $$$G_1, \dots, G_n$$$. Each $$$G_i$$$ is also represented as a list of permutations. Then checking that an element belongs to $$$G$$$ may be done as follows:

# compute the composition p[i] = p1[p2[i]]
def apply(p1, p2):
    return tuple(p1[p2[i]] for i in range(len(p2)))

# compute the inverse permutation ans[p[i]] = i
def inverse(p):
    ans = [0] * len(p)
    for i in range(len(p)):
        ans[p[i]] = i
    return tuple(ans)

# compute the sets G1, ..., Gn
def schreier_sims(S):
    ...

# G is the list of Schreier-Sims basis sets,
# p is the permutation that we need to check
def belongs(G, p):
    for i in range(len(G)):
        for g in G[i]:
            if p[i] == g[i]:
                p = apply(inverse(g), p)
    return all(p[j] == j for j in range(len(p)))

belongs(schreier_sims([[1, 2, 0]]), [2, 0, 1]) # True
belongs(schreier_sims([[1, 2, 0]]), [2, 1, 0]) # False

Is this enough?

No! There is an additional implicit assumption that we have made here. Specifically, that every permutation $$$g \in G$$$ such that $$$g(j) = j$$$ for $$$j < i$$$ can be generated just by permutations from the sets $$$G_i, G_{i+1}, \dots, G_n$$$. While this is a somewhat natural assumption if you're familiar with Gauss algorithm, it isn't necessarily true.

It could be that combining permutations from $$$G_i$$$, you will get a permutation $$$g_i$$$ such that $$$g_i(i)=i$$$ and then you will have to apply it to make $$$g$$$ an identity permutation in the end. Therefore, we need a stronger set of constraints:

every element $$$g_i \in G_i$$$ has $$$g_i(j) = j$$$ for $$$j < i$$$;
all elements $$$g_i \in G_i$$$ have distinct $$$g_i(i)$$$;
For any permutation $$$g \in \langle G \rangle$$$ such that $$$g(j) = j$$$ for $$$j < i$$$, it is also true that $$$g \in \langle G_i, G_{i+1}, \dots, G_n\rangle$$$.

In group theory terms, the last condition means that the set of permutations $$$G_1 \cup \dots \cup G_n$$$ is a strong generating set of $$$\langle G \rangle$$$.

Note that this is very similar to how we construct the basis for vectors over remainders modulo possibly composite number $$$m$$$ (see e.g. the article by errorgorn). Key difference with vectors over fields here is the possible existence of torsions, when it is possible to annihilate the $$$i$$$-th coordinate of a vector by multiplying the vector with a non-zero constant (thus, possibly not annihilating the remaining coordinates).

Similarly with permutations, it may be possible to annihilate the $$$i$$$-th term of a permutation from $$$G_i$$$ by applying other permutations from $$$G_i$$$ to it, without making the permutation as a whole the identity permutation, hence it might be still needed in the decomposition unless $$$G_1, \dots, G_n$$$ constitutes a strong generating set.

How to construct $$$G_1, \dots, G_n$$$?

Schreier-Sims algorithms construct a strong generating set iteratively. Let $$$G^{(i)}$$$ be the set of permutation $$$g \in \langle G \rangle$$$, such that $$$g(j) = j$$$ for $$$j < i$$$. In particular, $$$G^{(1)}= \langle G\rangle$$$. Note that $$$G^{(i)}$$$ is closed under composition, that is $$$g_1 g_2 \in G^{(i)}$$$ for any $$$g_1, g_2 \in G^{(i)}$$$, because

$$$ g_1(g_2(j)) = g_1(j) = j $$$

for any $$$j < k$$$. In group-theoretic terms, $$$G^{(i)}$$$ is called a stabilizer subgroup of $$$1, 2, \dots, i-1$$$.

From the definition of a strong generating set above it follows that

$$$ G^{(i)} = \langle G_i \cup G_{i+1} \cup \dots \cup G_n \rangle. $$$

The iteration of the Schreier-Sims algorithm assumes that we have a generating set of permutation $$$S$$$ such that $$$G^{(i)} = \langle S \rangle$$$, and from that it finds a set $$$G_i$$$ and a new generating set $$$S'$$$ such that $$$G^{(i)} = \langle G_i \cup S' \rangle$$$, and $$$G^{(i+1)} = \langle S' \rangle$$$. In this way, it is possible to apply similar steps iteratively to construct $$$G_{i+1}, G_{i+2}, \dots, G_n$$$.

Finding $$$G_i$$$

To find $$$G_i$$$, we should find a representative for every possible value of $$$g(i)$$$ for $$$g \in G^{(i)}$$$. In other words, for every obtainable $$$j \neq i$$$, we should find a permutation $$$g_j \in G^{(i)}$$$ such that $$$g_j(i) = j$$$. Once we apply $$$g^{-1}_j$$$, to the permutation $$$g$$$ having $$$g(i)=j$$$, we would make its $$$i$$$-th term being equal to $$$i$$$ and therefore it will move to $$$G^{(i+1)}$$$, which is dealt with later on.

That being said, $$$G_i$$$ will consist exactly of $$$g_j$$$ for all obtainable $$$j$$$. The set of all possible $$$g(i)$$$ is called the orbit of the number $$$i$$$. We can find and organize the orbit by building a so-called Schreier tree. The tree consists of all possible values of $$$g(i)$$$, and edges of the tree correspond to permutations that should be applied to change one element of the orbit to another.

Now, let $$$g_j \in G^{(i)}$$$ be such that $$$g_j(i) = j$$$. Then for any $$$g \in S$$$ it holds that $$$(g g_j)(i) = g(j)$$$. This fact allows to construct the Schreier tree with a simple depth-first search. We will store actual permutations $$$g_j$$$ in an array orbit:

# orbit[j] = permutation g in <S> s.t. g[i] = j
def build_schreier_tree(i, S, orbit):
    for g in S:
        if g[i] not in orbit:
            orbit[g[i]] = apply(g, orbit[i])
            # g[i] = j
            build_schreier_tree(g[i], S, orbit)

The running time of the construction above is $$$O(n|S|+n^2)$$$.

Finding $$$S'$$$

Assume that we have $$$S$$$ s.t. $$$\langle S \rangle = G^{(i)}$$$ and $$$G_i$$$, represented by the elements of the orbit of $$$i$$$. Then the generating set $$$S'$$$ such that $$$\langle S' \rangle = G^{(i+1)}$$$ can be found from what is called the Schreier's lemma:

$$$ S' = \{ (g_{s_u})^{-1} s g_u | u \in \texttt{orbit}_i, s \in S\} $$$

Here, $$$\texttt{orbit}_i$$$ are the possible values of $$$g(i)$$$ for $$$g \in G^{(i)}$$$ and $$$g_j \in G_i$$$ is a representative having $$$g_j(i)=j$$$.

Indeed, consider an element $$$s' = s_1 s_2 \dots s_k \in G^{(i+1)}$$$. Let $$$r_i = s_i s_{i+1} \dots s_k$$$, then we may rewrite it as

$$$ s' = s_1 (g_{r_2(i)} g_{r_2(i)}^{-1}) s_2 (g_{r_3(i)} g_{r_3(i)}^{-1}) \dots (g_{r_k} g_{r_k}^{-1}) s_k, $$$

which, using that $$$s'(i)=i$$$ and $$$r_k(i) = s_k(r_{k+1}(i)) = s_k(u)$$$ for $$$r_{k+1}(i) = u$$$, in turn further rewrites as

$$$ s' = (g_i^{-1} s_1 g_{r_2(i)}) (g_{r_2(i)}^{-1} s_2 g_{r_3(i)}) \dots (g_{r_k(i)}^{-1} s_k g_i) = s'_1 s'_2 \dots s'_k. $$$

The process of generating such set is straight-forward:

# Given <S> = G^(i) and
# the orbit representatives of i,
# find S' s.t. <S'> = G^(i+1)
def make_gen(S, orbit):
    n = len(next(iter(S)))
    newS = set()
    for s in S:
        for u in orbit:
            newS.add(reduce(apply, [inverse(orbit[s[u]]), s, orbit[u]]))
    return newS

Note that the size of $$$S'$$$ generated in this way may be as high as $$$n |S|$$$, which would allow an exponential growth if not halted.

Making sure $$$S'$$$ has a reasonable size

For any set $$$S'$$$, it is possible to construct another set $$$S' '$$$ such that $$$\langle S' \rangle = \langle S' ' \rangle$$$, but $$$|S' '| \leq n^2$$$. There are several ways to construct such sets, for example Jerrum's filter even guarantees that $$$|S' '| \leq n$$$, but at the cost of higher construction complexity. In Schreier-Sims algorithm, however, since the construction and sifting is repeated several times iteratively, the preferred way is Sims filter. It guarantees that for any pair $$$(i, j)$$$ there is at most one permutation $$$g \in S' '$$$ such that $$$g(k)=k$$$ for $$$k < i$$$ and $$$g(i)=j$$$.

The construction process for it is fairly straightforward and essentially repeats the Gauss method:

# Sifts S and returns S'
# such that <S> = <S'>
# and |S'| <= n^2
def normalize(S):
    n = len(next(iter(S)))
    newS = set()
    base = [{} for i in range(n)]
    for s in S:
        for x in range(0, n):
            if s[x] != x:
                if s[x] in base[x]:
                    s = apply(inverse(s), base[x][s[x]])
                else:
                    base[x][s[x]] = s
                    newS.add(s)
                    break
    return newS

Getting it all together

Gluing all the pieces mentioned above together, we get the following algorithm:

def schreier_sims(S):
    n = len(next(iter(S)))
    ans = []
    for i in range(n):
        orbit = {}
        # initiate with identity permutation
        orbit[i] = tuple(range(n))
        # find G_i = orbit
        build_schreier_tree(i, S, orbit)
        # add G_i to the answer
        ans += [[orbit[j] for j in orbit]]
        # generate S' and make sure it is at most n^2 in size
        S = normalize(make_gen(S, orbit))
    return ans

Complexity analysis here is not very pleasant and yields something around $$$O(n^6)$$$. It is possible to further reduce it by roughly a factor of $$$n$$$ by terminating the sifting process earlier, if sufficient amount of uniformly generated elements of $$$G$$$ are sifted through.

Example problem

There was a problem in Petrozavodsk Summer-2011. Moscow SU Unpredictable Contest that asked specifically to find the size of a permutation group for $$$n \leq 50$$$. If you don't have an access to opentrains, I have uploaded this specific problem to the mashup which you can access by the invitation link, and here is my submission to it using the Schreier-Sims algorithm: [submission:188935679].

A bit of self-advertisement

By the way, I'm organizing a competitive programming camp this February! And you would make me very happy if you participate in it :)

Comments (5)

Write comment?

adamant

15 months ago, # |

+23

By the way, this algorithm was used in a checker for 102354J - Tree Automorphisms, so I feel like it can be even more useful to problemsetters than participants, as it allows to check a lot of cool stuff about groups.

→ Reply

ftiasch

15 months ago, # ^ |

The very specific task also appeared at XVI Open Cup GP of Ekaterinburg — Heimdall (too lazy to find the link).

Yes! I remembered the task too, and the name of the team that made it, but only looked for GP of Ural, while it was GP of Ekb. Thank you! The opentrains login is here, but I don't know if it's available anywhere else. It also has $$$n \leq 15$$$ rather than $$$50$$$.

Hm, this task was also discussed and explained in this comment by Zlobober. In Russian though...

Savior-of-Cross

+26

orz

peltorator

14 months ago, # |

Great blog!

I think that you sometimes use $$$G$$$ and $$$\langle G \rangle$$$ interchangeably. For example, in the high-level idea section, I believe that all $$$G$$$'s should actually be $$$\langle G \rangle$$$'s.