Flyspeck

What is Flyspeck?

The Flyspeck project was launched in 2003 to formalise the proof of the Kepler Conjecture, a 400-year old problem in mathematics about the densest possible packing of same-sized spheres. The conjecture was proved in 1998 by Tom Hales in 300 pages of mathematical text plus thousands of lines of computer code, but the complexity of the proof made it too difficult for the mathematical community to fully accept. Hales himself instigated Flyspeck in order to remove any doubts about his proof, and oversaw the project to completion in 2014.

Flyspeck is notable in many respects:

It successfully addressed concerns of mathematicians about the correctness of one of the most important results of the 20th century.
It is the largest formalisation project to date, created by 20 man-years of effort, and producing 500,000 lines of proof script that generate hundreds of billions of primitive proof steps and that take hundreds of hours of computing time to process.
Despite the complexity and sheer size of the original proof and its formalisation, it was carried out mainly in HOL Light, which is one of the simplest theorem provers.
It broke new ground for formal proof projects, being the first to involve large-scale international collaboration, with around 20 contributors spanning 3 continents, and the first to involve major outsourcing, with about 50% of the effort performed by mathematicians who were paid by result.

The Formal Proof

Flyspeck breaks down into 4 parts that correspond to the 4 main parts of the original 1998 proof:

The main text, corresponding to the 300 page traditional mathematical text.
The classification of tame graphs, re-establishing results of a Java program.
The non-linear inequalities, done in HOL Light, re-establishing the results of a C++ program.
The linear inequalities, done in HOL Light, re-establishing results from using Mathematica.

Parts 1 and 4 were proved in the same HOL Light session, but part 2 was proved separately, by extracting a program from Isabelle/HOL and executing it, and part 3 was so large it needed to be split up and was proved in around 600 separate HOL Light sessions.

Trusting Flyspeck

Now that the proof has been formalised, and the mathematical community are convinced, surely this closes the matter? Well, in practice quite a lot can go wrong when formalising a proof. For example, theorem provers have flaws that allow incorrect proof steps, and outsourced workers in projects like Flyspeck might exploit these. And theorem provers sometimes display formulae wrongly, and so the wrong theorem may have been proved. And in multi-part proofs like Flyspeck, the different parts might not align properly.

We believe that important formalisations such as Flyspeck should be audited, to ensure that the formal proof does indeed work as intended. See our paper about proof auditing for detailed discussion of the issues.

Replaying Flyspeck

As part of a full audit of Flyspeck, all its proof steps would ideally be performed in a single session of a highly trustworthy theorem prover. To this endeavour, we are employing Common HOL proof recording to capture the steps so that they can be replayed in HOL Zero. Currently we have managed to record and replay the steps of parts 1 and 4 in the same HOL Zero session.

See our Flyspeck Replay page for instructions and downloads for replaying part 1 of Flyspeck (corresponding to 1.3 billion primitive proof steps in HOL Light).