Abstract
2. Universal properties
3. Products in practice
4. Universal properties in algebraic geometry
5. The problem with Grothendieck’s use of equality.
6. More on “canonical” maps
7. Canonical isomorphisms in more advanced mathematics
8. Summary And References
In the previous section we saw that a simple construction in mathematics (product of two sets) could be defined up to unique isomorphism by a universal property (“a product”), and could also be defined via an explicit model (“the product”). In a perfectly functorial world, mathematicians would only ever use the universal property to do everything. However in practice this is far from the case; we are sometimes forced to use an explicit model of our object. This surprised me. Here is an example of this phenomenon which I learnt from Patrick Massot.
\ Say R is a commutative ring and S is a multiplicatively closed subset (so 1 ∈ S and a, b ∈ S =⇒ ab ∈ S). Informally, the localisation R[1/S] of R at S is the ring generated by R and the inverses of the elements of S. Formally there is a universal property at play here. Let us define a localisation of R at S to be an Ralgebra R[1/S] with the following universal property: given any commutative ring Z and any ring homomorphism R → Z sending each element of S to an invertible element of Z, there is a unique extension of this homomorphism to an R-algebra map R[1/S] → Z.
\ A localisation has a universal property, and so the uniqueness yoga for universal properties above tells us that localisations are unique up to unique isomorphism – if they exist. To work with localisations then, it suffices to find one construction of an object with the universal property, and we could call this construction the localisation.
\ Like the real numbers, there are several ways to construct an explicit model which satisfies the universal property. One is to adjoin formal inverses ys of every s ∈ S to R, making a huge multi-variable polynomial ring, and then to quotient out by the ideal generated by the sys − 1 for s ∈ S. Another is to consider an element of R[1/S] as a fraction with numerator in R and denominator in S, defining R[1/S] to be ordered pairs (r, s) modulo an appropriate equivalence relation (we say (r1, s1) ∼ (r2, s2) if there exists t ∈ S such that r1s2t = r2s1t) and then to define addition and multiplication on the equivalence classes and to verify the commutative ring axioms.
\ In this sense it’s just like the real numbers – we can construct them using Cauchy sequences or Dedekind cuts or Bourbaki uniform space completions, and the details don’t matter. So surely here it is the same thing: now we know the existence of a localisation, the theory should then be developed using only the universal property and we can forget about the construction. But this turns out to be essentially impossible to do in practice. Let me explain an explicit example.
\ It is well-known that the map R → R[1/S] making R[1/S] into an R-algebra has kernel equal to the annihilator ideal of S, that is the elements r ∈ R such that there exists s ∈ S with rs = 0. How does one prove this only from the universal property of localisations? Certainly rs = 0 implies that r = 0 in R[1/S], because 0 = rss−1 = r1 = r in R[1/S]. The harder part is the converse inclusion. Say r ∈ R and its image in R[1/S] is zero. How do we find some s ∈ S such that rs = 0, using just the universal property? Even the following special case seems hard: how does one prove using only the universal property that if s ∈ R is not a zero-divisor then the map R → R[1/s] is injective?
\ One solution to this problem is to solve the contrapositive question. If there is no s ∈ S such that rs = 0, how do we find a cleverly-chosen ring Z equipped with a map from R to Z sending S to units and such that r is sent to something nonzero? The way this is done in the books is as follows: let Z be the model of R[1/S] consisting of equivalence relations of pairs (r, s) ∈ R × S which we used to prove that localisations exist. Now r is sent to the class of (r, 1), and by definition this is equivalent to the class of (0, 1) if and only if there exists t such that r1t = 01t, i.e. such that rt = 0, as required.
\ Another way of thinking about this proof is that we have computed the kernel of the localisation map from R to the localisation R[1/S] (that is, to our explicit model of the localisation as a quotient of R×S), and have then we have implicitly used an unwritten (but easy) diagram chase to deduce that this kernel is equal the kernel of the map to a localisation. In particular, we have violated assertion 1, because we have resorted to an explicit construction of the localisation; this is somewhat like choosing a basis for a vector space in order to reduce a question about linear maps to a question about matrices.
\ Thanks to a referee for pointing out several other examples where it is unreasonable to expect that the universal property should do all the work. For example the claim that if R is an integral domain then so is the polynomial ring R[X] does not appear to immediately follow from the usual universal property describing maps R[X] → Z for other commutative rings Z; here it’s much more natural to use the usual concrete model of R[X] (functions from the naturals to R with finite support, with notation r0 + r1X + · · · + rnXn).
\ Whilst picking a basis is something which sometimes cannot be avoided, my current understanding of the localisation example above is that it is evidence that we have made a poor choice of universal property. Later on we will see another property which also uniquely characterises localisations up to unique isomorphism, and Lean’s mathematics library mathlib (which uses this choice) can be regarded as evidence that the second property is a much better choice if we want to develop the theory of localisations via universal properties and without ever choosing a model.
\ An unwritten but easy diagram chase might not look like a big deal, but here is an example which showed up in the Lean formalisation of algebraic geometry where this sort of thing really was an issue. In this example we only want to invert one element of a ring at a time, so let us introduce the notation R[1/f] for f ∈ R to mean R[1/S] where S = {1, f, f 2 , f 3 , . . .}.
\ In 2017 Kenny Lau (then a 1st year undergraduate at Imperial College London) constructed the localisation of a commutative ring at a multiplicative subset in Lean (that is, he wrote down an explicit model for the localisation and proved that it satisfied the universal property). In 2018 Chris Hughes (also a 1st year undergraduate) proved [Sta18, Tag 00EJ] in Lean. This lemma claims that if R is a commutative ring and if f1, f2, . . . , fn ∈ R generate the unit ideal then the following sequence is exact:
\ 0 → R → Y i Ri → Y i,j Ri,j .
Here i and j (independently) run from 1 to n, Ri := R[1/fi ] and Ri,j := R[1/fifj]. The first nontrivial map is the structure map R → Ri on each component, and the second sends (rk)k to ri − rj on Ri,j .
\ I then tried to apply this lemma to give a formal proof in Lean that the structure presheaf on an affine scheme is a sheaf (which is the content of the assertion that an affine scheme is a scheme). However when I came to apply Hughes’ lemma it would not apply because it only applied to the localisation of a ring at a multiplicative subset, whereas I wanted to apply it to a localisation. More precisely, The result Hughes had formalised was about localisations such as R[1/fg] and I wanted to apply it to rings such as R[1/f][1/g], which are a localisation of R at the submonoid generated by fg, but are unfortunately not equal to the localisation. For example if we use the R×S model for the localisation, an element of R[1/fg] is an equivalence class in R× {1, fg,(fg) 2 , . . .}, and an element of R[1/f][1/g] is an equivalence class in the completely different set R[1/f] × {1, g, g 2 , . . .}, where g is the image of g in R[1/f] (so itself is an equivalence class).
\ The bottom line is that the lemma did not apply. And yet mathematically there should be no issue at all. How do we fix this problem? Let me go over the issue again. The statement of the ring lemma which Hughes had formalised was a theorem about the localisation of a ring at many different multiplicative sets, but in our application we needed the stronger statement that it was true for a localisation. Mathematicians barely notice the difference, but a computer theorem prover was quick to point it out. The cheap solution would be to deduce the stronger statement from the weaker one, and this is what I did at first.
\ This sounds like it should be a relatively straightforward process, and it is arguably what is happening inside the head of a mathematician who is showing this argument to their algebraic geometry class; it involves introducing no new mathematical ideas, it is just a big diagram chase. However, to my dismay, this was in practice extremely tedious. I had to prove lemma after tedious lemma saying that various diagrams commuted, using the universal property; these lemmas were specifically about morphisms which appeared in the statement of the ring lemma, and in particular were not reusable in other situations; they were just annoying boilerplate which needed to be in the code base and furthermore they felt mathematically completely vacuous.
\ However I could see that they were necessary to make this proof strategy watertight. Furthermore we ran into an unexpected twist: the map from Q k Rk to Q i,j Ri,j in the lemma sending (rk) to ri − rj is not an R-algebra homomorphism but only an R-module homomorphism, and there can be many (non-canonical) R-module maps between two localisations of R. It is however the difference of two R-algebra homomorphisms, and applying the universal property to these ring homomorphisms got us through in the end.
\ The major disadvantage of this approach is that the human formaliser has to generate a bunch of one-off lemmas with no applications elsewhere. It took a while for me to understand that in fact formalising the lemma as stated in the literature, for the localisations, is a mistake. One should in fact never prove this lemma at all; one should instead state and prove a version of the lemma which assumes only that the Ri and Ri,j are each a localisation of R.
\ If one likes, one can then deduce the weaker version (which is of no use to us anyway) from the stronger one. In 2018 Amelia Livingston, also then an undergraduate at Imperial College, developed theories of both “the” and “a” localisation of commutative monoids at submonoids, and then a theory of “the” and “a” localisation of commutative rings at submonoids (thus refactoring Lau’s work).
\ Finally in 2019 Ramon Fern´andez Mir, as part of his MSc thesis, came up with a proof of the version of the ring lemma which applied to rings which were merely “a” localisation, and we could use these results to give a new and much cleaner Lean proof that the structure presheaf on an affine scheme was a sheaf. Mir followed a suggestion of Neil Strickland to instead use a different property which also uniquely characterises localisations, and then to find a proof of the ring theory lemma which used only this new property. The property Mir used was the following:
Theorem. The R-algebra A with structure homomorphism f : R → A is a localisation of R at a multiplicative subset S if and only if it satisfies the following three properties:
• f(s) is invertible for all s ∈ S;
• Every element of A can be written as f(r)/f(s) for some r ∈ R and s ∈ S;
• The kernel of f is the annihilator of S.
\ A referee notes that there is an analogous classification of localisations of Rmodules, which we leave as an exercise for the reader. This property is of a very different nature to the previously-mentioned universal property; for example it does not quantify over all rings, meaning that when applying this theorem to prove results about localisations, one never has to pull a magic ring out of a hat (such as an explicit construction of the localisation). It does however uniquely characterise the localisation up to unique isomorphism, so it is doing the job required of it.
\ It is also a mathematical idea which is not present in the original paper presentation of the proof; although it is not difficult to prove, it is in some sense new mathematics which had to be generated to make the formalisation viable and tidy. It also explicitly states the characterisation of the kernel of f which Massot had observed seemed to be difficult to prove from the traditional universal property without resorting to an explicit model. Mir was fortunate in that the proof formalised by Hughes for the localisation was easily adaptable so that it applied to all localisations satisfying Strickland’s property; it was not at all a priori clear (at least to me) that this would be the case.
\ The very fact that we needed to do some mathematical thinking (the invention of a practical property characterising localisations up to unique isomorphism) in order to come up with an acceptable formalisation of this basic result in algebraic geometry made me wonder just how much more mathematical thinking we will need to do in order to formally prove harder results in algebraic geometry.
:::info Author: KEVIN BUZZARD
:::
:::info This paper is available on arxiv under CC BY 4.0 DEED license.
:::
\

