# Oct 18, 2016 - Introduction to Permutation Groups

This post provides an introduction to algorithms on permutation groups, building towards Stabilizer Chains, the most important data structure for computing with permutation groups.

In this post, we will assume we are working with finite permutation groups in GAP. Before reading this post, please go and install GAP so you can play along! You can download all the code from this post here:

## Introduction

Let’s begin with a simple permutation group:
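The original example was not preserved here, so the group below is an illustrative stand-in; any small permutation group will do for following along:

```gap
# An illustrative choice of group -- any small permutation group works
G := Group((1,2,3,4), (1,2));
```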

There are many obvious questions we might want to answer about G. Some simple examples: how large is G? Is a given permutation an element of G? What are the orbits of G?

How does GAP calculate each of these values? If we exhaustively calculated every element of G, then computing each of these results would be fairly trivial, but for larger groups that rapidly becomes impractical.

## Orbit Calculation

We will begin by calculating the orbits. This is the easiest thing to calculate, and a vital building block towards stabilizer chains.

Calculating the orbit of an integer is mathematically simple – keep applying the generators to the point, its image, its image’s images… until no new points are found. This is easier to understand in GAP! This function is a little inefficient, but will serve our early purposes.
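The original listing is missing, so here is a minimal sketch of the orbit function just described; the name SimpleOrbit and its argument layout are my own choices:

```gap
# Sketch: orbit of 'point' under the list of permutations 'gens'.
# Keeps applying generators until no new points are found.
SimpleOrbit := function(gens, point)
    local knownOrbit, p, g;
    knownOrbit := [point];
    # GAP's for-loop sees elements appended during the loop
    for p in knownOrbit do
        for g in gens do
            if not p^g in knownOrbit then
                Add(knownOrbit, p^g);
            fi;
        od;
    od;
    return Set(knownOrbit);
end;
```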

Let’s write a couple of simple tests, to convince ourselves our function works.
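For instance, assuming an orbit function with the shape sketched above (the name SimpleOrbit is an assumption; substitute your own):

```gap
# Hypothetical sanity checks: a 3-cycle's orbit, and a fixed point
if SimpleOrbit([(1,2,3)], 1) <> [1, 2, 3] then Print("test 1 failed\n"); fi;
if SimpleOrbit([(1,2,3)], 4) <> [4] then Print("test 2 failed\n"); fi;
```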

This function has two limitations: Firstly, it is inefficient, because the line p^g in knownOrbit is slow – it has to search through the whole orbit we have seen so far. Secondly, in practice we often want to know the permutation which will get us to each point in the orbit, but we throw that information away!

So, let’s do an improved version. Here we will start with a base value, and then track, whenever we find a new value in the orbit, which permutation got us there. By following the permutations we will be able to get back to the base.

Enough chat, I find this data structure is easier to understand once you have it:
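The original listing is missing; here is a sketch of one possible realisation (all names are my own). The result is a record: base is the starting point, and vector[i] is a generator g such that i^g is one step closer to the base.

```gap
# Sketch of a Schreier vector builder for the orbit of 'base'
SchreierVector := function(gens, base)
    local vec, points, p, g, img;
    vec := rec(base := base, vector := []);
    points := [base];
    for p in points do
        for g in gens do
            img := p/g;    # preimage: the point which g maps to p
            if img <> base and not IsBound(vec.vector[img]) then
                vec.vector[img] := g;
                Add(points, img);
            fi;
        od;
    od;
    return vec;
end;
```

With this layout, a point lies in the orbit exactly when it is the base or has a bound position in the vector.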

Notice the line img := p/g. You might not have seen the / operator before. It returns the preimage of p under g – the value which g maps to p. It’s the same as applying the inverse of g to p, but is more efficient.

Why do we do this? Well, we want to be able to get back to the base. We could instead store the inverse of g, but that would require inverting the permutations.

So, what is this vector useful for? Its main use is to let us find a permutation which maps any integer back to our “base” point. I would play with this function for a while, to be sure you really understand why it works!

Let’s see how this function works in practice. It gives a permutation which maps its first argument back to the base point, or returns fail if no such permutation exists.
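A sketch of such a function, assuming the Schreier vector is stored as a record with fields base and vector, where vector[i] is a generator mapping i one step closer to the base (one possible realisation of the structure described above; the name PermToBase is my own):

```gap
# Returns a permutation q with point^q = sv.base, or fail
PermToBase := function(sv, point)
    local perm;
    perm := ();
    while point <> sv.base do
        if not IsBound(sv.vector[point]) then
            return fail;    # point lies outside the orbit
        fi;
        perm := perm * sv.vector[point];
        point := point ^ sv.vector[point];
    od;
    return perm;
end;
```

The invariant to check while playing with it: at every step, the original point's image under perm is the current value of point, so when the loop stops at the base we are done.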

Given these two together, we can take a permutation $p$ and a group $G$ and find another permutation $q$, where $q$ is in $G$ if and only if $p$ is, and $q$ fixes the base point of our Schreier vector. This function looks like this:

This function takes a permutation and multiplies it by a permutation in our group, to fix the base point (or returns fail if no such permutation exists).
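A self-contained sketch, again assuming the record layout described above (the name FixBasePoint is my own; the walk back to the base is inlined):

```gap
# Multiply p by an element of the group so the result fixes sv.base,
# or return fail if sv.base^p is outside the orbit
FixBasePoint := function(sv, p)
    local point, q;
    point := sv.base ^ p;
    q := p;
    while point <> sv.base do
        if not IsBound(sv.vector[point]) then
            return fail;
        fi;
        q := q * sv.vector[point];
        point := point ^ sv.vector[point];
    od;
    return q;    # sv.base ^ q = sv.base, and q is in G iff p is
end;
```

Note that q is p multiplied by permutations from the Schreier vector, all of which lie in our group, which is exactly why q is in the group if and only if p is.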

So, we have turned the problem of finding whether a permutation is in our group into the problem of finding whether a different permutation is in the stabilizer of a single point in that group.

So how do we solve this problem? Well, build Stabilizer(G,1), build a Schreier vector for that, and recurse until no group is left!
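A sketch of this recursion. Here I cheat and call GAP’s built-in Stabilizer for the recursive step (a real implementation would derive the stabilizer’s generators from the Schreier vector itself); SchreierVector stands for any builder of the structure described earlier, and all names are my own:

```gap
# Build a chain of records, one per level, until the group is trivial
MyStabilizerChain := function(G)
    local base;
    if IsTrivial(G) then
        return rec(isTrivial := true);
    fi;
    base := SmallestMovedPoint(G);
    return rec(isTrivial := false,
               base := base,
               sv := SchreierVector(GeneratorsOfGroup(G), base),
               stab := MyStabilizerChain(Stabilizer(G, base)));
end;
```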

Now we have a stabilizer chain, we are (finally) in a position to implement checking if a permutation is in a group. This function is quite long, but follows a basic recursive design:

• Look where the permutation (p) maps the base point. Is that outside the orbit of the base point? Then it is not in our group (G).
• Calculate a permutation (q) in (G) such that (p.q) maps the base point to itself, then search in the stabilizer.

The final terminating case is to detect when we reach the bottom of the stabilizer chain, in which case we must have the identity permutation left.
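Putting the two cases together, against the hypothetical chain record sketched above (fields isTrivial, sv, stab) and a FixBasePoint-style helper as described in the text:

```gap
# Is permutation p an element of the group described by 'chain'?
InChain := function(chain, p)
    local q;
    if chain.isTrivial then
        return p = ();    # only the identity is left at the bottom
    fi;
    q := FixBasePoint(chain.sv, p);   # fail if base^p is outside the orbit
    if q = fail then
        return false;
    fi;
    return InChain(chain.stab, q);
end;
```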

So, what else can we do with a stabilizer chain? Lots of things – almost anything which explores a group. Here is one easy thing we can do: find the size of the group using the orbit-stabilizer theorem.
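The orbit-stabilizer theorem says $|G|$ equals the orbit length of the base point times the size of its stabilizer, so multiplying the orbit lengths down the chain gives the group order. A sketch, against the same hypothetical chain record as above:

```gap
# |G| = (orbit length of base) * |stabilizer of base|, recursively
ChainSize := function(chain)
    local orbitLen;
    if chain.isTrivial then
        return 1;
    fi;
    # orbit length: the base itself plus every bound vector position
    orbitLen := 1 + Number([1..Length(chain.sv.vector)],
                           i -> IsBound(chain.sv.vector[i]));
    return orbitLen * ChainSize(chain.stab);
end;
```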

We can do much more interesting calculations, like begin to build a group intersection algorithm.

# Jun 1, 2016 - Introduction to Backtrack Search

This is the first in a series of articles which will explore backtracking search in permutation groups. To begin, these articles will concentrate on the theory of backtracking, rather than highly-optimised implementations.

If you have any comments or corrections on these pieces, or just enjoyed reading them and want more, please e-mail me.

I will assume you are familiar at least with permutations and permutation groups. Also all code examples will be given in the GAP language. I strongly recommend installing the latest version of GAP and following along!

## Introduction

Let’s dive straight in, and consider the problem of intersecting two permutation groups $G$ and $H$ on the set $\Omega = \{1..n\}$. We will assume we are given $G$ and $H$ by a set of generators. In GAP, we can give two groups, and find their intersection, as follows:
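In GAP that looks something like this (the concrete groups here are illustrative stand-ins, not the post’s original example):

```gap
G := Group((1,2,3,4), (1,2));;
H := Group((2,3,4), (2,3));;
Intersection(G, H);
```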

How can we implement the Intersection function? Your first assumption might be that there is some clever mathematical proof which quickly generates the intersection, but (as far as we currently know) there is not; we have to search through both groups for the members they have in common. Let’s begin with the simplest possible implementation of intersection: Generate all permutations in $G$, and check if they are in $H$:
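The name IntersectionEnumerate is used later in the post; the body here is my reconstruction of the brute-force approach just described:

```gap
# Brute force: enumerate G1, keep the elements which are also in G2
IntersectionEnumerate := function(G1, G2)
    local common, g;
    common := [];
    for g in G1 do
        if g in G2 then
            Add(common, g);
        fi;
    od;
    return Group(common);
end;
```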

One obvious objection: haven’t we just exchanged one piece of magic (Intersection) for two different pieces of magic (for g in G1 and if g in G2)?

Iterating over the members of a permutation group, checking if a given permutation is in a group, and finding the members of a group which stabilize a single integer can all be efficiently implemented using a base and strong generating set, also known as a stabilizer chain. In a future post we will see how to make stabilizer chains, and how to implement methods like g in G1.

If your only interest is worst-case complexity over all permutation groups, IntersectionEnumerate is surprisingly close to the state of the art! None of the algorithms we will discuss will ever beat this algorithm by very much, in a theoretical sense. However, we can (as you might hope) greatly outperform this algorithm for most real-world problems.

The fundamental idea behind all of our improvements revolves around backtracking search, also known as divide-and-conquer. We will take our problem and split it up into sub-problems, which will be (hopefully) easier to solve, and then stitch our answers back together to form the entire group we are looking for.

## Basic Backtracking

So, how can we split the search for Intersection(G,H) up? One natural method of attack is to consider searching for subgroups and cosets contained inside the group we are interested in. Let’s try that!

## A Brief Aside: Point stabilizer

The point stabilizer of an integer $x$ in a permutation group $G$ is the subgroup of $G$ which fixes $x$. This is often represented as $G_x$. How can we find this subgroup in GAP? Firstly, let’s write a slow function:
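A sketch of the enumeration-based version (the original listing is missing; the name StabPointEnumerate matches the one used below):

```gap
# Slow point stabilizer: keep every element of G which fixes x
StabPointEnumerate := function(G, x)
    return Group(Filtered(G, g -> x^g = x));
end;
```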

However, actually finding the point stabilizer is another thing we can calculate using a Stabilizer Chain. They’re great, aren’t they? In GAP, we do this by:
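For instance (the concrete group is a stand-in for the lost example; StabPointEnumerate is the slow function from above):

```gap
G := SymmetricGroup(4);;
StabPointEnumerate(G, 1);   # listed via every one of its elements
Stabilizer(G, 1);           # the built-in, much shorter answer
```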

Notice how Stabilizer has printed out a shorter answer than StabPointEnumerate. Both functions output the same group, but Stabilizer has produced a set of generators, rather than every element of the group.

## Back to Backtracking

So, we will split the problem of finding $G \cap H$ into pieces, by splitting our problem up into $n$ pieces:

• Find $p \in G \cap H$ where $1^p = 1$.
• Find $p \in G \cap H$ where $1^p = 2$.
• …
• Find $p \in G \cap H$ where $1^p = n$.

Let’s first consider the first sub-problem. A permutation $p$ with $1^p=1$ is in $G \cap H$ if and only if it is contained in both Stabilizer(G,1) and Stabilizer(H,1). Therefore we just need to find Intersection(Stabilizer(G, 1), Stabilizer(H,1)). While this is another intersection problem, it is a smaller problem, and so will (hopefully) be easier to solve. Let’s call this intersection $R$.

Now we are going to apply a little group theory. We know that $R \subseteq G \cap H$. This means we can divide $G \cap H$ into a list of cosets of $R$. Let’s pick a single permutation in some coset of $R$ – for example $q = (1,3)(4,5)$ (which we know is in $G \cap H$, as we found it earlier during our brute force enumeration).

As all $p \in R$ satisfy $1^p=1$, all permutations in the coset $Rq$ will satisfy $1^p=3$ (recall GAP composes permutations left to right, so $1^{rq} = (1^r)^q = 1^q = 3$). In general, in each of our subproblems, we will find either:

• A coset of $R$
• No permutations

(A full proof of this is left to the interested reader.)

Therefore, we don’t need to find all solutions to all of our sub-problems after the first – we only need to find one permutation from each (or prove no permutation exists)! Already a big gain in performance.

How can we find quickly which of these sub-searches contains a permutation from $G \cap H$? That problem has been the subject of a huge amount of research. We will discuss some methods of implementing this in coming articles.

## Making use of partial knowledge of $G \cap H$

We can prove some parts of our search don’t have to be performed. Early in the search we find the permutation $(4,5) \in G \cap H$. Later, we try finding a permutation where $1^p = 4$. After this, there is no need to search for a permutation where $1^p = 5$. Why? There are two cases:

• There is a permutation where $1^p = 4$ (in this problem there is: $(1,4)(2,6)(3,5) \in G \cap H$). We can find a permutation where $1^p = 5$ by multiplying these two permutations together, to get $(1,5,3,4)(2,6)$.

• If we had found no permutation where $1^p = 4$ in $G \cap H$, there is no point searching for one where $1^p=5$: if we found such a permutation, we could multiply it by $(4,5)$ and get a permutation where $1^p = 4$, which can’t exist!

How can we formalise this idea? As we find members of $G \cap H$, we keep track of the orbits of $G \cap H$ and only check one value in each orbit. An exact discussion of this algorithm will appear in a future post!

## Proving parts of the search contain no element of $G \cap H$

This is where the majority of the research into improving backtrack search in permutation groups (and backtrack searches in general) is performed.

Looking at the output of FindExtendingElement, there are a number of easy ways it can be improved. We certainly shouldn’t test mappings like [ 3, 2, 1, 1 ], as permutations are invertible!

More generally, let us consider the second stage of our search, when we are looking for $G_1 \cap H_1$. We can ask GAP to give us the orbits of $G_1$ and $H_1$:

From this, we can deduce the orbit of $2$ in $G_1 \cap H_1$ must be contained in $\{2,3\} \cap \{2,4,5\}$, which means that actually the orbit of $2$ is just $\{2\}$!

By the same reasoning, both $3$ and $6$ are fixed, leaving the only non-trivial orbit as $\{4,5\}$. Of course, this does not mean that $G_1 \cap H_1$ contains a permutation $p$ where $4^p = 5$, this orbit may still split. However, this is the only case which we need to consider. We have reduced our search for $G_1 \cap H_1$ to having to consider a single permutation: $(4,5)$!

## Installing

I will begin by assuming you have installed GAP, either the latest release or a development version.

## Quickstart Guide

GAP’s test file format looks like the result of running GAP. This is intentional. Here is an example test file which tests the + operator. We have purposefully added two tests which are incorrect, 1 + 0 (which isn’t 6) and the final test (which is formatted incorrectly).
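A reconstruction of the kind of file described (the exact original is lost; note the deliberately wrong answer for 1 + 0, and the stray leading space in the last expected output):

```
gap> 1 + 1;
2
gap> 1 + 0;
6
gap> 2 + 2;
 4
```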

If you save this to a file called gapsimpletest.tst, you can run it as follows, and see the failing tests.
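The file can be run with GAP’s built-in Test function (the failure diffs it prints are omitted here):

```gap
Test("gapsimpletest.tst");
```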

The first failing test isn’t surprising, 1+0 isn’t 6 after all. The second test shows an important fact – GAP tests for exact string matching, not object equivalence.

Sometimes we want to write tests which can’t really be written all on one line. In this case, we can use "> " to start each new line. If a line should produce no output (for example if we end it with ;;), then we move straight on to the next gap> statement. We can also start lines with a # to give comments, but only at the beginning of the file, or after a blank line. Let’s put all of these things together:
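A sketch of a test file using all of the features above:

```
# Comments may appear at the start of the file
gap> x := 10;;
gap> y := x + 5;;
gap> x + y;
25

# ...and after a blank line
gap> Sum([1, 2, 3,
> 4, 5, 6]);
21
```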

## Writing stable tests

Earlier we mentioned that GAP’s tests check the output string, not that the actual objects produced are equivalent. In some cases, this is exactly what we want – when we are checking how GAP prints, for instance. In other cases, it can lead to fragile tests.

A good way to see this is by looking at three ways we can get GAP to print out the symmetric group on 50 points:

If we construct the symmetric group with SymmetricGroup, then we can see the group is output as Sym( [ 1 .. 50 ] ). However, if we just give the generators, GAP just remembers it has a <permutation group with 2 generators>. Finally, if we take that group and ask for its size, GAP remembers that size and starts printing that out as well.

This causes a problem for writing tests – you have to know exactly which one GAP is going to output, and also hope no internal change to GAP causes it to learn the size (for example), which doesn’t change the result of your test, but does change the output.

The other possibility is that a test could break without changing the text output. For example, imagine that we recorded that some function should return <permutation group with 2 generators>. That function’s return value could change to any group with two generators, without us ever noticing the error.

The solution? The easiest option is to write tests which explicitly check the correct object is returned, and just print true – for example, to check Intersection:
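One hypothetical test in this style, comparing against a known answer so only true is printed:

```
gap> Intersection(Group((1,2,3), (1,2)), Group((1,2))) = Group((1,2));
true
```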

It is perfectly legal to use Read to read other code while testing. One technique I often use is to write functions which should output nothing if the tests succeed. Firstly, we will write a file with functions which test intersection. If these functions succeed, they will print nothing.
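A sketch of such a file (all names here are my own; call it, say, checkintersect.g): it compares Intersection against brute-force enumeration, printing only on failure.

```gap
# Print a message only if Intersection disagrees with brute force
CheckIntersection := function(G, H)
    local brute;
    brute := Filtered(Elements(G), g -> g in H);
    if Set(Elements(Intersection(G, H))) <> Set(brute) then
        Print("Intersection failed for ", G, " and ", H, "\n");
    fi;
end;

# Run the check on randomly chosen pairs of groups
CheckRandomIntersections := function(count)
    local i, G, H;
    for i in [1..count] do
        G := Group(List([1..2], x -> Random(SymmetricGroup(6))));
        H := Group(List([1..2], x -> Random(SymmetricGroup(6))));
        CheckIntersection(G, H);
    od;
end;
```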

Then we can write a trivial test which makes sure this function runs correctly, and prints nothing.
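Assuming the helper file and function names sketched above (both are assumptions), the test file itself is tiny:

```
gap> Read("checkintersect.g");
gap> CheckRandomIntersections(100);
```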

Is generating many random inputs worthwhile? Yes! Here is a bug which was fixed in GAP – the following code didn’t work:
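A hypothetical reconstruction of the failing call, matching the description below (the exact code from the bug report is not reproduced here):

```gap
G := SymmetricGroup(5);;
# In GAP 4.7.5 this went wrong; note the repeated integer in the tuple
Stabilizer(G, [2, 2], OnTuples);
```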

This code was broken in GAP 4.7.5 when:

• Stabilizing a symmetric or alternating permutation group
• Which GAP knew was a symmetric or alternating group
• Using OnTuples
• With repeated integers in the tuple

It is very unlikely (I think) someone would have manually constructed a test that hit this case – but it is easy to hit it if you just brute force randomly chosen groups and lists!

## Testing projects

There are two useful features which we haven’t covered yet. The first is the function TestDirectory. This function just runs all tests in a directory (including tests in all sub-directories, recursively). This is useful for testing a whole project.

The second feature uses tests for timing, using GAPstones. You will often notice tests start with a line like START_TEST("file.tst"); and end with END_TEST("file.tst", 10000);. These lines are used to record how long the file takes to run. My advice: ignore these lines for now; GAPstones don’t actually turn out to be very useful for measuring the performance of GAP code.

# Mar 12, 2016 - Line by Line Code Coverage in GAP

The purpose of this article is to show how to use GAP’s new line-by-line code coverage. Firstly, you need GAP version 4.8.0 or later, or a development build of GAP.

This guide will just cover how to record the executed lines of code. Do you want to know how much time is being spent on each line? Then switch to the profiling guide (these two guides are very similar!)

## Quickstart Guide

We will start with a quick guide to code coverage, with some brief comments. We will explain later how to do these things in greater depth!

Let’s start with some code we want to profile. Here I am going to calculate code coverage for the function f given below, and use a group from the Atlas.

There are two methods of calculating coverage, each with different trade-offs. Which one you should choose depends on the following rule: profiling can only mark a line as not executed if the file containing it was read while profiling was running.

What does this mean in practice? If you want to know which lines from the standard library were not executed, you need to profile a full GAP session. If you want to only know about your own files and packages, then you must load those files and packages after running CoverageLineByLine.

## Profiling a partial GAP session

Firstly, let’s make a simple function. You can download this file to save having to re-type it into GAP.
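A stand-in for the file (the original proffunction.g was different; any function with a branch that is never taken will do for seeing a missed line):

```gap
# Hypothetical contents of proffunction.g: sum 1..n, with one
# unreachable branch that coverage should flag as never executed
f := function(n)
    local i, total;
    total := 0;
    for i in [1..n] do
        if i < 0 then
            total := total - i;   # never runs: i is always positive
        else
            total := total + i;
        fi;
    od;
    return total;
end;
```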

Firstly, we will record a coverage profile for the function f:
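A sketch of the session (note the file is read after coverage starts, so its unexecuted lines are tracked; CoverageLineByLine and UncoverageLineByLine are the real GAP functions):

```gap
CoverageLineByLine("output.gz");
Read("proffunction.g");
f(100);;
UncoverageLineByLine();
```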

This creates a file called output.gz, which stores all lines which were executed while running f. Now we want to turn that into a nice output. This requires loading the profiling package, like this:
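The conversion step looks like this (OutputAnnotatedCodeCoverageFiles is the profiling package’s function; "outdir" names the directory it creates):

```gap
LoadPackage("profiling");;
OutputAnnotatedCodeCoverageFiles("output.gz", "outdir");
```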

Ignore the scary warning for now, but open the file index.html, in the outdir directory that was created. This gives an overview of all the profiled files, with proffunction.g somewhere in the list (probably at the bottom). Click on that, and you should see a table, like the one below (this is a screenshot, the real version is interactive):

This tells us that every line of our function was executed except line 7, which was missed.

## Profiling a full GAP session

One major limitation of our previous profiling is that it doesn’t show missed lines from the standard library, only executed lines. This is because we need to start profiling before reading lines. The easiest way to accomplish this (and the only way for files in the standard library) is to start coverage when GAP starts, by giving the --cover output.gz flag to GAP when starting. You can still call UncoverageLineByLine when finished as normal.

## Important Limitations

• As already discussed, you will only get missed lines from files read after profiling starts.

• You can only profile once per run of GAP – if you want to profile another function you have to quit and restart GAP (sorry). (If you want to know why: a global list of executed lines is stored, which cannot be cleared.)

• Giving your output file the gz extension makes GAP automatically compress the file. This is a great idea, because the files can get very big otherwise! Even then, the files can grow quite large very quickly, so keep an eye on them.