Cyclically covering subspaces

This blog post is based on joint work with Tom Johnston and James Aaronson.

Suppose a group of n mathematicians need to decide between two different coffee machines for the department.

We ask them to raise their hand if they prefer option A, and record the subset of [n]= \{1,\dots,n\} that raised their hand. For each such subset, we specify whether option A wins or loses. This gives a collection \mathcal{WIN}\subseteq \mathcal{P}([n]) of winning coalitions. Their are a few very natural assumptions for a ‘fair’ way of deciding who won:

  1. If more people vote for option A, then option A is more likely to win. In combinatorics terms, \mathcal{WIN} is an upset.
  2. Option A and B have an equal probability of winning. That is, |\mathcal{WIN}|=\frac12 2^n.

So far what we have defined is equivalent to a maximal intersecting family. We will add one more assumption which is modelled in a way which is perhaps a bit less natural. The automorphism group Aut(\mathcal{WIN}) of our set system \mathcal{WIN} is the set of bijections f:[n]\to[n] for which f(A)\in\mathcal{WIN} for all A\in \mathcal{WIN}. We call \mathcal{WIN} symmetric if the action of its automorphism group is transitive, that is, for all i,j\in [n], there exists a f\in Aut(\mathcal{WIN}) with f(i)=j.

  1. All voters are treated equally. Our set system \mathcal{WIN} is symmetric.

Such a fair game or symmetric maximal intersecting family need not exist. It is easy to find one for n odd: indeed, we can use the voting rule most often used in practice called the majority rule, for which \mathcal{WIN}=\{x\subseteq [n]:|x|>n/2\}. However, the existence for even n is far from understood. Isbell proved the following result in his first Homogenous Games paper.

Lemma (Isbell) Let n\in \mathbb{N}. The following statements are equivalent.

  • There exists a maximal intersecting family on [n] that is symmetric.
  • There exists a transitive subgroup G of S_n for which every s \in G has at least one odd cycle.
  • There exists a transitive subgroupG of S_n for which every 2-power-order element s\in G has a fixed point i\in [n] (that is, s(i)=i).

Isbell conjectured in 1960 that there is a finite bound m(b) for each odd b\in \mathbb{N} such that no symmetric maximal intersecting family exists for n=2^a b for all a\geq m(b). Peter Cameron was interested in this problem, and introduced cyclically covering subspaces (the topic of our paper, defined further below). He noticed that a small cyclically covering subspace in \mathbb{F}_2^b (say codimension a), can be used to construct a large symmetric maximal intersecting family (say for n =2^ab). (This connection that m(b)>h_2(b) is explained in detail by Cameron, Ellis and Raynaud.)

Let \sigma:\mathbb{F}_2^n\to \mathbb{F}_2^n denote the cyclic shift, e.g. \sigma(1,0,0)=(0,1,0) and \sigma(0,0,1)=(1,0,0).

Definition A cyclically covering subpace V\subseteq\mathbb{F}_2^n is a linear subspace with the property that for all x\in \mathbb{F}_2^n, there exists a shift k\in[n] for which \sigma^k(x)\in V.

It is easy to construct such a subspace: we can always take V=\mathbb{F}_2^n. It is also clear the subspace cannot be too small, e.g. the all ones and all zeros vectors always need to be a part of V. How small can such a cyclically covering subspace be (given n)? Define h_2(n) as the largest codimension (n minus dimension) of a cyclically covering subspace in \mathbb{F}_2^n. More than 25 years ago, Peter Cameron posed the question whether h_2(n)\to \infty as n \to \infty over the odd integers (see Problem 190, and his blog that we are proud to be mentioned on). The main result of our paper answers this question negatively (conditional on Artin’s conjecture).

Theorem (Aaronson, G., Johnston) If p is a prime for which the polynomial 1+X+X^2+\dots+X^{p-1} is irreducible over \mathbb{F}_2, then h_2(p)=2.

There are infinitely many such primes if Artin’s conjecture is true. Artin conjectured in 1927 a precise density for such primes p among the primes, which seems about correct for the primes below 300.000, as can be seen below.

The first couple of Artin primes have their own OEIS sequence.

The proof of our result is quite long. Some of the main steps we take are:

  • We reformulate the problem by switching to the orthogonal complement: it suffices to prove that there are no linearly independent u,v,w such that the orthogonal complement of these is cyclically covering.
  • We show we may assume that u=(0,1,1,\dots,1). This is the only step where we use our assumption that 1+X+\dots+X^{p-1} is irreducible, through the fact that \mathbb{F}_2[X]/(X^p-1) splits into the sum of two fields \mathbb{F}_2[X]/(X-1)\oplus \mathbb{F}_2[X]/(1+X+\dots+X^{p-1}). Letting a vector v correspond to the polynomial v_0+v_1X+\dots+v_{p-1}X^{p-1}, shifting corresponds to multiplying by X. This reformulation was introduced by by Cameron, Ellis and Raynaud, who use algebraic arguments to prove several bounds on h_2(n).
  • We say a vector is symmetric if v_0=0 and v_i=v_{-i}. E.g. (0,1,0,0,1) is symmetric and (0,0,1,0,0) is not. We show, for any prime p, that u=(0,1,1,\dots,1) cannot ‘work together’ with two additional vector v,w if either is symmetric.
  • We then try to show our conjecture that there is no non-symmetric vector v such that the orthogonal complement of (0,1,1,\dots,1),v is a cyclically covering subspace of codimension 2. For this we reformulate into Cayley graphs, giving the equivalent statement of our conjecture below.
  • For primes p, by studying additive properties of the set S_v of i for which v_i=1, we try to reduce our conjecture to the case in which S_v is contained in a small arithmetic progression (which we can handle). We are able to do this whenever the Cayley graph we constructed has no directed cycles of length 4 or 6. Whenever this is not the case, we fail to prove our conjecture, but with some additional case analysis work we show there can’t be an additional w, which suffices for our theorem.

Conjecture Let n be an odd integer which is not divisible by 7 and let G be a Cayley digraph on vertex set \mathbb{Z}/n\mathbb{Z}. Then G has no odd-sized induced subgraph with all outdegrees odd if and only if G is a graph.

A Cayley digraph on vertex set \mathbb{Z}/n\mathbb{Z} is a directed graph where a fixed subset S of \mathbb{Z}/n\mathbb{Z} specifies the arcs: i \to j if and only if j-i\in S. This is a graph if and only if 1_S is symmetric, that is, 0\not \in S (no self-loops) and i \in S \iff -i \in S (all edges are bidirectional).

The Cayley graph with generating set S_v for v=(0,1,0,0,0,0,1,0,0). The graph induced on \{0,1,2,4,8\} is of odd size and every vertex has odd outdegree. This implies that the vector (1,1,1,0,1,0,0,0,1) has no shift within the orthogonal complement of (0,1,\dots,1) and v.

One of the directions of our conjecture follows from the handshake lemma. We checked the other direction for all n \leq 43 by computer search.

Cameron, Ellis and Raynaud also introduced the generalisation of h_2(n) to h_q(n) for arbitrary prime powers q. Their problem of determining for which values of n we have h_q(n)=0 remains open. Several other interesting open problems can be found in our paper.

Exceptional graphs for the random walk

This blog post is based on a joint paper with Juhan Aru, Tom Johnston, Bhargav Narayanan, Alex Roberts and Alex Scott

Suppose you are on a grid \mathbb{Z}^2, starting in the origin. At each iteration, you select a direction uniformly at random from the set {North, South, East, West}, after which you set one step in that direction. It is a well-known result that such a walk will visit each point infinitely often in \mathbb{Z} and \mathbb{Z}^2, but not in higher dimensions (see e.g. Section 5.9 of Part A Probability in Oxford). It turns out that this is even the case if you restrict this simple symmetric random walk to a connected subgraph G of the grid, where you `skip’ an instruction if there is no edge present to walk on.

Hence for each subgraph G of the grid, a simple symmetric random walk returns to the origin infinitely often with probability one (since there is always a positive probability you walk from one place to another, it suffices to return infinitely often to 0).  What if we switch around the `for each’ and the `with probability one’ in this sentence? As the countable union of null events is again a null event, for any countable (in particular finite) collection of subgraphs we know that our random walk returns to 0 infinitely often on all those graphs with probability one. However, the collection of all subgraphs is not countable, and it turns out that with probability one there actually is an exceptional graph on which our random walk is not `recurrent’. The following nice result was obtained by Aru and Narayanan (see https://arxiv.org/abs/1805.06277v1) and independently Balister, Bollobás, Leader and Walters.

Theorem If W is a simple random walk on \mathbb{Z}^2, then

\mathbb{P}(\exists H\subseteq \mathbb{Z}^2: W|_H \text{ visits the origin finitely many times}) =1.

SSRWgrid
A slice of the graph H\subseteq \mathbb{Z}^2  between two consecutive vertical lines; the graph has vertical lines present at all x-coordinates of the form 2^n-1.

More information on the construction of H.


A technical note first: we have a random variable X:\Omega \to \{\mathbb{N}\to\{(1,0),(0,1),(-1,0),(0,-1)\}\} that gives us a sequence of instructions for each \omega \in \Omega. We need to show that on a subset of \Omega of measure 1, we can construct a graph H=H_\omega for which the corresponding walk does not return to 0 infinitely often. 

The general idea is the following: we read our random instructions and reveal the graph as we go. Suppose we try to let our random walk “drift” to the East direction. Whenever we have not revealed edges yet, we allow the walk to continue on eastward if it wants to do so, but never reveal any “westward” edges. It turns out one needs to be a bit more clever than this: instead, consider a graph which has all vertical lines present at all x-coordinates of the form 2^n-1, where the segments in between will be revealed by the rules above. All line segments have only a single horizontal line connecting them all the way through; the idea will be to ensure that the probability that you get from one vertical line to the next, is much smaller than the probability you walk back to the previous line. Let E_n be the event we walk back to L_{n-1} before we reach L_n for the first time (after having reaches L_n). If \sum \mathbb{P}(E_n)<\infty, then by Borel-Cantelli only finitely many E_n happen with probability one, and in particular the walk returns to 0 finitely often.

Juhan Aru and Bhargav Narayanan moreover showed that even if you have a finite collection of independent random walks, there still almost surely exists a graph H on which none of them are recurrent. In our joint paper, we provide some answers to the question of what the largest cardinality of such a collection could be. We show that for any countable collection of independent simple random walks, there is, with probability one, a subgraph on which none of these random walks is recurrent. When modelling uncountably infinite collections, it is a bit less clear what the appropriate phrasing of the question is.

One natural formulation is in the language of branching random walks. A branching random walk on \mathbb{Z}^d starts with a single particle at the origin. At each time step, each particle independently generates a number of additional particles at its current location according to some fixed offspring distribution, and we say that this offspring distribution is nontrivial if the number of offspring is nonzero with positive probability. Independently of the other particles and their history, all particles then take a step in a direction chosen uniformly at random, which leads to a random family of dependent simple random walks. It turns out that in this case, with probability 1, for each graph there is some branch of the walk that is recurrent.

Theorem. Fix d \in \mathbb{N} and let (W_i)_{i\in S} be a family of random walks generated by a branching random walk on \mathbb{Z}^d with a nontrivial offspring distribution. Then

\mathbb{P}(\forall H  \subseteq \mathbb{Z}^d~ \exists i \in S: W_i|_{H} \text{ visits the origin infinitely often})=1

The proof uses a rapidly-increasing sequence of times (T_i)_{i\in \mathbb{N}}. At each time T_i, and for every possible finite subgraph G \subseteq \mathbb{Z}^d contained in the box [-T_i, T_i]^d, we choose a representative particle p_G. We then show that, with very high probability, for each such representative particle p_G and every possible extension of the corresponding graph G onto [-T_{i+1}, T_{i+1}]^d, some descendant of p_G visits every vertex of G with distance at most i (in G) to the origin, and ends up exactly at or one step away from the origin at time T_{i+1}.

One of the original motivations for considering the problem of the random walk on the grid comes from a very nice question from Spink on universal traversal sequences.

Problem Does there exist a fixed set of directions, such that the walk which follows the corresponding instructions is recurrent on every subgraph of the grid?

A popular way of constructing such a universal transversal sequence is by choosing the sequence at random; however apparently their exist exceptional graphs for such a random sequence.
For some classes of subgraphs of the grid such a sequence can be constructed, but the general problem is still open.

Workshop for Young Researchers in Combinatorics

Update Applications are now closed. We are having an online preview this July and the workshop will take place summer 2021 (most likely July 12-16).

The Young Researchers in Combinatorics workshop is aimed at PhD students and early-career academics working in Extremal and Probabilistic Combinatorics and related fields. By building an environment of mostly young researchers, we hope the participants will feel free to contribute their ideas openly, thus preparing them to undertake their own research projects.

We aim to host around 10 invited participants and about 25 PhD students for a week between 19 and 25 July 2020 at ICMS Edinburgh. Accommodation and refreshments are provided for all participants, as well as a conference dinner. A registration fee of 100GBP is payable by all non-invited participants. The confirmed invited participants are:

António Girão (University of Birmingham)
Hong Liu (University of Warwick)
Natasha Morrison (University of Cambridge)
Maryam Sharifzadeh (University of Warwick)
Katherine Staden (University of Oxford)
Adam Wagner (ETH)
Alexey Pokrovskiy (Birkbeck, University of London)
Gal Kronenberg (University of Oxford)
Ben Barber (University of Bristol)
Shoham Letzter (ETH Zurich/Cambridge)

The workshop consists mostly of working sessions in which participants collaborate on research problems, and a limited number of talks and lectures. All participants have the chance to contribute a subset of the following items to the workshop:
– research problem with suggested reading;
– lecture on a useful method;
– short talk about their own research.

Applications for PhD students are now open on our our website.

I am organising this workshop together with Shagnik Das, Jon Noel and Yanitsa Pehova.

Intersection sizes of linear subspaces with the hypercube

This blog post is based on joint work with Tom Johnston.

What intersection sizes are possible between a hypercube and a linear subspace? We continue the work of Melo and Winter related to this question.

driepunten
A 2-dimensional linear subspace can have 1, 2, 3 or 4 points in common with a (hyper)cube.

Let H(k,m) denote the set of possible intersection sizes |S\cap \{0,1\}^{k+m}| that a k-dimensional linear subspace S can have with the (k+m)-dimensional hypercube \{0,1\}^{k+m} (in Euclidean space). As the 0-vector will always be present in S, this will be an integer of size at least 1. Moreover, an upper bound of 2^k is immediate from the following reformulation:

H(k,m)= \{|\{0,1\}^k \cap L^{-1}\{0,1\}^m| \text{ for }L:\mathbb{R}^k \to \mathbb{R}^m \text{ a linear map}\}.

(Proof of reformulation).

To see this equivalence, note that we can parametrise any k-dimensional linear subspace S as S=\{v\oplus L(v): v \in \mathbb{R}^k\} for some linear map L:\mathbb{R}^k \to \mathbb{R}^m.

(Proof of reparametrisation).

Let m>0 be an integer. We first show there is an injective linear map from S to a (k+m-1)-dimensional space. As m>0, there must be some basisvector e_i which is not contained in S. In that case, the map

\mathbb{R}^{k+m} \to \mathbb{R}^{k+m-1}:x \mapsto (x_j)_{j\neq i}

is injective on S. Indeed, if x\neq y\in S would satisfy x_j=y_j for all j\neq i, then x_i\neq y_i. Suppose wlog that x_i\neq 0. Properties of a linear subspace now imply that e_i = \frac1{x_i}(x-y)\in S, giving a contradiction.
After renumbering coordinates, we may now assume that the map

\pi: S \to \mathbb{R}^k:x \mapsto (x_j)_{j=1}^k

is an injective linear map, and the image will be a k-dimensional linear subspace, hence \pi is a linear bijection; the inverse L is the linear map we are looking for.

This implies that

|S\cap \{0,1\}^{k+m}| = |\{(v, L(v)): v \in \{0,1\}^k,~ L(v)\in \{0,1\}^m\}| =|\{0,1\}^k\cap L^{-1}\{0,1\}^m|.

Note that the linear map L is determined by the values it takes on the basis vectors e_1,\dots,e_k of \mathbb{R}^k. The reformulation makes it easy to show that

H(k,m)\subseteq H(k',m') whenever k\leq k' and m\leq m'.

Let L be a map from \mathbb{R}^k\to \mathbb{R}^m. We claim that there is a L': \mathbb{R}^{k'}\to \mathbb{R}^{m'} with |\{0,1\}^k\cap L^{-1}\{0,1\}^m|=|\{0,1\}^{k'} \cap (L')^{-1}\{0,1\}^{m'}|. To handle k'>k, define L'(e_i)_1 to be a very large negative value for i>k; in that case, L'(x)_1\in \{0,1\} is only possible if x_i=0 for i>k (in which case L'=L). Set L'(x)=(L(x),0) to extend a map L to \mathbb{R}^m to a map L' to \mathbb{R}^{m'}.

For k=1,2, all intersection sizes are possible already when m=1 (see the figure above). Is H(k,m)=\{1,\dots,2^k\} for m large?

Melo and Winter showed that for a k-dimensional subspace, for m large all intersection sizes 2^{k-1}+2^i for i\in \{0,\dots,k-1\} are possible, whereas an intersection between 2^{k-1}+2^{k-2} and 2^k can never be achieved. They conjectured that no values in the other gaps can be achieved either. Our first result shows this is almost true. Let H^+(k,m)=H(k,m)\setminus \{1,\dots,2^{k-1}\}.

Theorem For k \geq 6, and m \geq k-1,

H^+(k,m)=\{2^{k-1}+2^i:i\in \{0,\dots,k-1\}\}\cup\{2^{k-1}+2^{k-5}+2^{k-6}\}.

In fact, we determine H^+(k,m) exactly for all m. There is one additional value 35 \cdot 2^{k-6}, which appears already for m=1, k=6.

Melo and Winter also conjectured that all intersections below 2^{k-1} can be achieved; we show this is false by proving that slightly below 2^{k-1} in fact most values cannot be achieved.

Theorem For k \geq 8 and any m \geq 1,

H(k,m) \cap \left\{ \tfrac{15}{16} 2^{k-1}, \dots, 2^{k-1}\right\}= \left\{ \tfrac{15}{16} 2^{k-1}, \tfrac{63}{64} 2^{k-1}, 2^{k-1} \right\}.

Our proofs work by first solving the case m=1 and then considering m>1 as the intersection of various m=1 cases. Indeed, we can consider the maps L_i(x)=L(x)_i for i\in \{1,\dots,m\} and then it is not difficult to see that

I(L) = \cap_{i=1}^m I(L_i), where

 

writing I(L) = \{0,1\}^k \cap L^{-1}\{0,1\}^m for the points in \{0,1\}^k whose image under L lies in \{0,1\}^m. We use the following auxiliary results:

As noted by Melo and Winter, L(\{e_1,\dots,e_k\})\subseteq \{0,1\}^m if |I(L)|>2^{k-1}.


We prove the contrapositive. Suppose \alpha= L_1(e_1)\not\in \{-1,0,1\}. For any x\in \{0\}\times \{0,1\}^{k-1}, if L_1(x)\in \{0,1\}, then L_1(x+e_1)=L_1(x)+\alpha\not\in \{0,1\}. Hence either x or x+e_1 is not in I(L_1), thus |I(L)|\leq |I(L_1)|\leq 2^{k-1}.

We may still assume this if |I(L)|>\frac{15}{16}2^{k-1} as long as |I(L)|\neq 2^{k-1}.


In fact, the value 2^{k-1} can be achieved in various ways by setting L(e_1),\dots,L(e_k) equal to \pm \frac12,\pm 2 (for small k).


We prove that a non-trivial intersection with an additional I(L_{m+1}) gives a significant reduction of I(L). The reduction is particularly strong if m=1, hence if wlog L_1 is the part taking a value outside of \{-1,0,1\}, then |I(L_1)|\leq 2^{k-1} by above and it turns out that any strictly smaller I(L_1)\cap I(L_2) lands out of our range.

I(L_i) (and hence I(L)) is too small unless L_i(e_j)=0 for all but at most 9 of the j.

Suppose m=1 and

L(e_1),\dots,L(e_a)=1,
L(e_{a+1}),\dots,L(e_{a+b})=-1,
L(e_{a+b+1} ),\dots,L(e_{k})=0.

Then we prove that

\left| \left\{ x \in \{0,1\}^k : L(x) = j \right\} \right| = 2^{k-a-b}\binom{a + b}{b + j}.

(Proof idea)


The smallest value L(x) can take is -b, which is achieved by setting x_i=1 exactly for i\in\{a+1,\dots,a+b\}. If x_i is changed to 1 for an element i\in \{1,\dots,a\}, then the value of L(x) will increase by 1. In order to get to 0, b such changes have to be made which can be done in exactly \binom{a+b}{b} ways. The coordinates mapping to 0 (the x_i for i \geq a+b+1) have no effect on the value of L(x) so each can individually be included or not; this gives the factor 2^{k-a-b}.

This implies that

I(L)= \left| \left\{ x \in \{0,1\}^k : L(x) = 0 \right\} \right|+ \left| \left\{ x \in \{0,1\}^k : L(x) = 1 \right\} \right|
=2^{k-a-b}\binom{a + b + 1}{b+1}.

The largest binomial coefficient grows as 2^{a+b}/\sqrt{a+b}, and calculation confirms that for any a+b\geq 10 we find that \binom{a + b + 1}{b+1}\leq \frac{15}{32} 2^{a+b}.

If multiple I(L_i) intersect and none of them is redundant, then the sets S(L_i) of values j for which L_i(e_j)\neq 0 must form a particular “shape”.


We show this by generating all cases using computer search, which is computationally feasible using the claims proved above.
We define the shape of a map L to be a hypergraph which has \{1,\dots,k\} as vertices and the sets S(L_i) as edges. We call a map minimal if each S(L_i) has a value in \{1,\dots,k\} for which it is the only one covering it, and the `constraint is not redundant’, i.e. |I(L)|<|I(L_{[m]\setminus i})| where L_{[m]\setminus i}(x)=(L(x)_j)_{j\in [m]\setminus i}. The minimal shapes for which I(L) is in our range of interest, tend to take a particular form: a (3,2)-star or a (2,1)-star.

32star
A (3,2)-star with two edges.


It turns out that for example the following statement holds.
Suppose L:\mathbb{R}^k\to \mathbb{R}^m is a minimal linear map with |S(L_i)|>1 for all i\in \{1,\dots, m\}, for which |\cup_i S(L_i)|\geq 8. Then either the sets S(L_i) form a (2,1)-star or a (3,2)-star, or t(L) \leq 2^{k-1}.

It remains open which intersection sizes are possible below \frac{15}{16} 2^{k-1}, and in particular if all values would have been present with a smaller upper bound. We pose the following conjecture.

Conjecture For every constant M, for k sufficiently large, a positive fraction of the values in [0,2^{k-M}] cannot be achieved as an intersection between a k-dimensional subspace and a hypercube.

Postgraduate Combinatorial Conference 2019

[Update: This event already took place; photos can be found on https://pcc2019.github.io/photos. Next year the PCC is organised by Glasgow on 27-29 April 2020. See https://pcc2020.github.io/ for more information.]

The 26th Postgraduate Combinatorial Conference (PCC) will be held in Oxford on 10th-12th June 2019.

The PCC is an established conference organised by, and for, current research students in all areas of combinatorial and discrete mathematics, under the auspices of the British Combinatorial Committee. The PCC is mainly aimed at UK-based students, but is also open to those from abroad.

The aim of the conference is to allow research students to discuss their research in a relaxed environment, to gain practice at presenting their research outside of their own department, and to meet other young researchers in their area. Each student is encouraged to contribute by giving a talk which will last 25 minutes (including 5 minutes question and answer time).

Registration is available via https://pcc2019.github.io/ and payment via Oxford University Store. The early-bird fee is £30 which increases to the normal registration fee of £40 on 1st May.

The fee includes the conference reception, the conference dinner and refreshments during the breaks. We also have limited funds to cover travel expenses, hopefully up to the value of around £50-60 per person.

We gratefully acknowledge support from the LMS and the BCC.

Size reconstructibility and the graph reconstruction conjecture

[This blog post is based on a joint paper with Hannah Guggiari and Alex Scott]

A graph G consists of a set of vertices V(G) and a set of edges E(G) which may connect two (distinct) vertices. (There are no self-loops or multiple edges.)

A very basic question about graphs is: Is a graph determined by its induced subgraphs? For a graph G and a vertex v, a card of the graph is an induced subgraph G-v obtained by removing the vertex v and all adjacent edges. The deck of the graph is the collection of cards \{G-v:v\in V(G)\}, allowing multiples. An example of a deck of card is given below.

graphdrawings

Below the cards, we see how the cards can be obtained by removing vertices from a single graph, removing one at a time and removing each vertex exactly once.
Can two non-isomorphic graphs have the same deck of cards? Yes, if the graphs have two vertices.

graphdrawings2

If we see a single vertex twice, then we know the original graph had two vertices but there is no way for us to know whether there is an edge between them.

More than 60 years ago, Kelly and Ulam made the beautiful conjecture that if two graphs on at least three vertices have the same deck, they must be isomorphic. In 1977, Bondy and Hemminger wrote the following about this conjecture:

The Reconstruction Conjecture is generally regarded as one of the foremost unsolved problems in graph theory. Indeed, Harary (1969) has even classified it as a “graphical disease” because of its contagious nature. According to reliable sources, it was discovered in Wisconsin in 1941 by Kelly and Ulam, and claimed its first victim (P. J. Kelly) in 1942. There are now more than sixty recorded cases, and relapses occur frequently (this article being a case in point).

There are many subtleties in the problem which might not be apparent at first sight; for example, if two vertices u and v have the same card (i.e. G-u\cong G-v) then they are not necessarily “similar” in the graph, that is, there does not necessarily exist a graph automorphism mapping u to v.

Rather than reconstructing the entire graph, can we at least read off some information about the graph? If a graph has vertices v_1,\dots,v_n then

\sum_{i=1}^n |E(G-v_i)|=(n-2)|E(G)|

since each edge appears in all cards except for the two corresponding to vertices it is adjacent to. So we can read off the number of edges of the graph (the size) if we are given a full deck of cards. What if we have only access to a subset of the cards? In Size reconstructibility of graphs Alex Scott, Hannah Guggiari and I describe a way to reconstruct the size of the graph if at most \frac1{20}\sqrt{n} cards are missing from the deck (for n large). The best previous result in this direction was the case in which two cards are missing.

Our proof works as follows:

  • We first note that if we have been given the cards G-v_1,\dots,G-v_{n-k}, we can still compute \frac{\sum_{i=1}^{n-k} |E(G-v_i)|}{n-2-k}\approx|E(G)|, which allows us to obtain an (over)estimate \widetilde{|E(G)|} on the number of edges.
  • The degree d(v) of a vertex is the number of edges adjacent to it. Note that on the card G-v, exactly the edges adjacent to v are missing. Hence |E(G)|-|E(G-v)|=d(v). We will try to approximate the degree sequence (d_t), where d_t gives the number of vertices of degree t, in two different ways.
  • Firstly, using our estimate on the number of edges, we can still approximate the degree of the vertices of the cards that have been given: \widetilde{d(v)}=\widetilde{|E(G)|}-|E(G-v)|. Since we overestimate the number of edges, we also overestimate d(v) for every vertex, but always by the same amount \alpha = \widetilde{|E(G)|}-|E(G)|. This means our approximated degree sequence (\widetilde{d}_t) is the actual degree sequence (d_t), but shifted to the right, and moreover some values have been underestimated (since some of the cards are missing). If we could recover the shift \alpha, we could find |E(G)| since we know \widetilde{|E(G)|}.
  • Secondly, we discover many small values d_t exactly.  Let d_t(G) denote the number of vertices of degree t in G. We discover many values of d_t=d_t(G) from our magic formula

\sum_{i=1}^n d_t(G-v_i)= (n-1-t)d_t+(t+1)d_{t+1}.

  • We almost know the quantity on the left-hand side. If we knew say d_{t+1} exactly, then we can find d_t exactly if we find an estimate whose error is guaranteed to be <\frac12, as we know that d_t is an integer (we then round this error away). For a similar reason, we are able to “guess” d_t and d_{t+1} if both of these are small, and \frac{t+1}n is “not too near a bad fraction”. Hence we can discover some of the values, and can then use this knowledge to find some of the nearby values as well. In fact, we can either find the adjacent value or discover it is ‘large’.
  • We either reconstruct the entire degree sequence (then we can read off the number of edges from this) or discover some large values. Suppose that we find a segment as in the picture below: one large value, and to one side of it a bunch of known values, many of which are small.

path4522

  • We “match up” the known values to various shifts of (\widetilde{d}_t). For the correct shift, the error will be small, whereas for any other shift the error is lower bounded by the difference between large and small in the figure below. This allows us to recover \alpha and then the number of edges.

Some interesting open problems include:

  • The original graph reconstruction conjecture, for which the edge variant is also unknown. (The directed graph, infinite graph and hypergraph analogues are false.) This is also unknown for “nice” graph classes such as bipartite graphs or planar graphs.
  • Improving our result beyond \sqrt{n} or extending it to recovering different information about the graph, such as the degree sequence or the number of triangles.

3-colouring graphs without long induced paths

[This blog post is based on a joint paper with  Karolina Okrasa, Pawel Rzążewski, Alex Scott, Paul Seymour and Sophie Spirkl]

The decision problem 3-COLOURABILITY asks whether the vertices of a given graph G can be coloured with three colours in such a way that vertices which are connected by an edge receive different colours. This problem is well-known to be NP-hard, meaning it is difficult to solve in general in some sense. What happens if we restrict the class of graphs? Is it due to a particular structure in the graph that graphs are easier or harder to colour?

colouredgraph
This graph can be coloured with 3 colours: the nodes with the same colour are not connected. Such colourings can model various real-life scenario’s, e.g. the colours could represent time-periods and two tasks could have an edge between them if they cannot happen simultaneously. A 3-colouring is then a conflict-free assignment of the tasks to time periods such that at most three periods are required.

A graph G is called F-free if it does not contain the graph F as induced subgraph (meaning that one can remove vertices, but not edges, e.g. a complete graph has only complete graphs as induced subgraphs). It turns out that k-colouring is still NP-hard if you restrict to the class of graphs not containing a cycle or the claw graph K_{1,3} (for all k\geq 3). If we assume F is connected, then the remaining cases are paths P_t on t vertices.

After a series of paper, the complexity status of k-COLOURABILITY of P_t-free graphs has been determined, except for k=3 and t\geq 8. The open question is hence: does colouring become easier if restricted to graphs which do not contain long induced paths? Intuitively, long paths may allow for “long-term interactions”. In H-colouring Pt-free graphs in subexponential time, my co-authors and I show that in some sense this case is easier: there is a subexponential-time algorithm for deciding 3-colourability. Even more, the number of 3-colourings can be counted in subexponential time. Under the Exponential Time Hypothesis (which says that 3-SAT cannot be solved in subexponential time), such an algorithm cannot exist for the other graphs than P_t nor for k>3 (for, say, t\geq 7).

[To be continued: So what makes the difference between 3 and 4?]