ON THE NATURE OF MATHEMATICAL CONCEPTS: WHY AND HOW DO MATHEMATICIANS JUMP TO CONCLUSIONS?
Notation: x for products: 2 x 3 = 6; ^3 for cubes: 2^3 = 8; ^exponent in general: 2^11 = 2048.
While engaged in the mathematical endeavor, we simply jump, hardly ever asking "why" or "how." It is the only way we know of grappling with the mathematical problem that we are out to understand, to articulate as a question and to answer by a theorem or a whole theory. What drives our curiosity is a question for psychologists. Only after the jump has landed us on a viable branch can the labor of proving the theorem or constructing a coherent theory set in. The record of the end result, usually a presentation at a conference, a paper in a learned journal or a chapter in a book, is laid out in a sequence of rational deductions from clearly stated premises and rarely conveys the process by which it has been arrived at.
The question of why we have no other choice but to jump has received a remarkably precise answer through Gödel's Proof of Incompleteness in 1931 and Tarski's analysis of the concept of Truth in the thirties in Poland. Since then the development of a rigorous concept of an algorithm has led to a proliferation of so-called undecidability and inseparability results underscoring the limitations of the formal method.
The question of how we jump has many aspects. First: What does the jumping consist of? What are we doing when we jump? What is going on in our minds when we are hunting down a mathematical phenomenon? And then: What is guiding us? How come we jump to CORRECT conclusions? Even if the guess was not quite correct, it usually was a good hunch that, properly adjusted, will open up new territory. Where do these hunches come from? Probably the simplest recorded answer to that question goes back to Plato and has spawned a school of thought in the Foundations of Mathematics that bears his name. It puts those hunches on a par with our spontaneous reactions to physical messages—"smell that? someone must be roasting a lamb in the next clearing," "there is a storm brewing in the South West, I can feel it in my bones." According to Plato's view, mathematical objects exist eternally and immutably in a realm of ideas, an abstract reality accessible, if only dimly, to pure reasoning. That is how we discover them and their properties. By now, what with 2000 years of escalating experience with mathematics and painstaking critical analyses of its tenets, Platonism is no longer the accepted view in the Foundations. But, if nothing else, it is a wonderful allegory and an extremely useful working hypothesis.
To put it bluntly, while at work a mathematician is too busy concentrating on deciphering the hints he can gather from the trail he is following to stop and bother asking how the trail got here. It is enough for him to have a good hunch that the trail will lead to the goal.
The following is a slightly polished version of my spontaneous response to the assortment of EDGE comments on Stanislas Dehaene's question "What Are Numbers, Really? A Cerebral Basis For Number Sense" and the subsequent discussion at The Reality Club. After a simple illustration of how we ponder, jump and then fill in the steps I address some general considerations raised on EDGE, which leads me to an exposition of the limitation phenomena.
Although I keep technicalities to a minimum, both conceptually and typographically, I am careful to be precise and correct. In our field the smallest inaccuracy can have disastrous consequences, leading head-on into contradictions.
1729: AN EXAMPLE OF MATHEMATICAL REASONING
Stanislas Dehaene brings up the Ramanujan-G.H.Hardy anecdote concerning the number 1729. The idea of running through the cubes of all integers from 1 to 12 in order to arrive at Ramanujan's spontaneous recognition of 1729 as the smallest positive integer that can be written in two distinct ways as the sum of two integral cubes is inappropriate and obscures the workings of the naive mathematical mind. To be sure, a computer-mind could come up with that list at a wink. But what would induce it to pop it up when faced with the number 1729 if not prompted by some hunch? Here is a more likely account:
Confronted with 1729 you will recognize at a glance that:
i) 1729 = 1000 + (810 - 81) = 10^3 + 81 x (10 - 1) = 10^3 + 9^2 x 9 = 10^3 + 9^3
= (1 + 9)^3 + 9^3 = 1 + 3 x 9 + 3 x 9^2 + 9^3 + 9^3
= 1 + (3^3 + 3 x 3^2 x 9 + 3 x 3 x 9^2 + 9^3) = 1 + (3 + 9)^3 = 1^3 + 12^3
in view of the pattern
ii) (a + b)^3 = a^3 + 3 x a^2 x b + 3 x a x b^2 + b^3.
Now all those 3's in the above expressions spring to attention, you fleetingly call up THE EQUATIONS
iii) (a + b)^3 + d^3 = a^3 + (c + d)^3
a^3 + (3 x a^2 x b + 3 x a x b^2 + b^3) + d^3 = a^3 + (c^3 + 3 x c^2 x d + 3 x c x d^2) + d^3
and JUMP to the conclusion that the choice of (1,9; 3,9) for a,b; c,d will give you the smallest positive integer that can be written as the sum of the cubes of two integers (a+b) and d and also of a different pair a and (c+d). You have a well-trained instinct. But, if called upon, it will be a simple matter to fill in that jump by a proof, the fixed coefficients 3 ruling out smaller choices for b,c,d, once the minimal possible value 1 is chosen for a.
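The conclusion of the jump can also be checked mechanically. The following minimal Python sketch is not the train of thought described above, and not a proof; it merely tabulates sums of two positive cubes up to a bound and reports the least one with two distinct representations. The function names and the bound 2000 are choices made for this illustration only.

```python
from itertools import combinations_with_replacement


def two_cube_sums(limit):
    """Map each n <= limit to all pairs (p, q), 1 <= p <= q, with p^3 + q^3 == n."""
    representations = {}
    # Any base larger than the cube root of limit already overshoots on its own.
    max_base = round(limit ** (1 / 3)) + 1
    for p, q in combinations_with_replacement(range(1, max_base + 1), 2):
        n = p**3 + q**3
        if n <= limit:
            representations.setdefault(n, []).append((p, q))
    return representations


def smallest_with_two_representations(limit):
    """Least n <= limit that is a sum of two positive cubes in two ways, or None."""
    candidates = [n for n, pairs in two_cube_sums(limit).items() if len(pairs) >= 2]
    return min(candidates) if candidates else None


print(smallest_with_two_representations(2000))   # 1729
print(two_cube_sums(2000)[1729])                 # [(1, 12), (9, 10)]
```

The two pairs found correspond to a + b = 10, d = 9 on one side and a = 1, c + d = 12 on the other, that is, to the choice (1,9; 3,9) for a,b; c,d.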
ANALYSIS OF A TRAIN OF THOUGHT
The best way to understand the process encoded above in technical shorthand is via a metaphor, which should be spun out at leisure. Say you are driving into a strange town, and, for some reason or other, a building complex catches your attention. It does not just pop into your field of vision; at first glance you see it as a museum, a villa, a church, or whatever. And then, depending on your particular interests and background, you may recognize its shape, size and purpose, muse over its style, venture a guess as to its vintage, and so forth.
Upon meeting 1729, your first reaction will probably be to break it up into the sum of 1000 and 729, because of our habit of counting in decimal notation. Stop for a moment to consider what would have been facing Ramanujan if taxicab companies favored binary notation! [11011000001 = 11011000000 + 1 = 11^3 x 100^3 + 1^3 = 101^3 x 10^3 + 11^3 x 11^3 = 1111101000 + 1011011001]. On the other hand, if you are one of those people obsessed with prime factorization, you'll "see" the product 7 x 13 x 19 when somebody says "1729" to you, while a before-Thompson-and-Feit but after-Burnside group theorist will say "Aha, that is an interesting number, all groups of order 1729 are solvable," and anyone with engineering experience immediately thinks of the 1728 cubic inches contained in a cubic foot [1]. But a historian of Mathematics will see 1729 as the birth year of Catherine the Great, Euler's friend and benefactress.
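The binary arithmetic in the bracket and the prime factorization are easy to confirm by machine. Here is a small sketch, assuming nothing beyond the Python standard library; the helper names `to_binary`, `b` and `prime_factors` are invented for this illustration.

```python
def to_binary(n):
    """Binary digits of a non-negative integer, as a string."""
    return format(n, "b")


def b(digits):
    """Read a string of binary digits as an integer."""
    return int(digits, 2)


def prime_factors(n):
    """Prime factors of n >= 2 in increasing order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


assert to_binary(1729) == "11011000001"
# 11^3 x 100^3 + 1^3 in binary is 3^3 x 4^3 + 1^3 = 12^3 + 1^3.
assert b("11") ** 3 * b("100") ** 3 + b("1") ** 3 == 1729
# 101^3 x 10^3 + 11^3 x 11^3 in binary is 5^3 x 2^3 + 3^3 x 3^3 = 10^3 + 9^3.
assert b("101") ** 3 * b("10") ** 3 + b("11") ** 3 * b("11") ** 3 == 1729
assert to_binary(1000) == "1111101000" and to_binary(729) == "1011011001"
assert prime_factors(1729) == [7, 13, 19]
```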
Next you decide, more or less deliberately, how to investigate the phenomenon. Do you drive to the nearest kiosk, buy a "Baedeker," search for that building and read through all you can find in there about it before you make up your mind about what you want to know? In other words, assuming you have a kiosk full of lists handy in your own mind, do you run through all the integral cubes smaller than 1729? If so, why cubes?
If you have that kind of mind you probably would first run through the squares before getting to the cubes. The less methodical tourist, eager to enjoy rather than out to complete his (or her) knowledge, may choose to investigate in a haphazard way, spurred on by curiosity, guided by experience, using skills automatically while impulsively following hunches, prowling, sniffing, looking behind bushes, and then jump to rational conclusions.
Now return to Ramanujan and see how the first thing that springs to the naive eye beholding the number 729 is that adding 81 = 9^2 turns it into 810, whereupon 10 drops its disguise, shows one of its true natures as the sum of 1 and 9 and, lo and behold, all those powers of 3 start tumbling in. All the while you are aware of the pattern ii), just below the threshold of consciousness, exactly as a driver is aware of the traffic laws and of the coordinated efforts of his body and his jeep. That is how you find your way through the maze of mathematical possibilities to the "interesting" breakdown of 1729 into two distinct sums of integral cubes.
When you stop to ask yourself what is so great about that, something clicks in your mind: you are facing a positive integer with a certain property, you know that
iv) every collection of positive integers has a least member (in terms of its natural ordering).
That knowledge, always hovering below the threshold of consciousness, prompts the question whether 1729 might in fact be the LEAST positive integer expressible in distinct ways as the sum of two cubes. Having another look at the representation of 1729 as a sum of various powers of 3 as held in your mind's eye and exhibited in the third line of i) above, the more or less conscious awareness of ii) invites you to break up those sums of cubes according to the pattern iii) where you assume, "without loss of generality", that a < d <= a + b, and hence c < b. At this point the solution a = 1, b = c^2 = d and c = 3 surfaces by inspection as "obviously" yielding the minimal value for (a + b)^3 + d^3.
ABOUT MATHEMATICAL ACTIVITY
I have gone through this simple illustrative example at such length in order to underscore a few of my pet contentions:
What we sorely need is a phenomenological study of mathematical practice. Polya and Lakatos had independently started out on that path; I do not know to what extent it has been followed up. Mathematicians are well aware of how they work, whether by themselves or in teams. But their goals are results that must be presented in a conclusive and "clean" form that makes them publicly accessible, at least within the profession, a form that necessarily obscures the path that led to them, just as the most beautiful tombstone will sum up a life but give no inkling of how it really has been lived, to use an observation by Claude Chevalley [2].
a) Much mathematical reasoning is done subconsciously, just as we automatically obey traffic rules and handle our cars, whether we know why and how they work or not. Symbolic notation is an "artificial aid" used to secure a hold like a piton, to survey a situation like a geological map and to encode general patterns for repeated application. But it is not mathematics. Mathematics can be done without symbols by a particularly "gifted" individual, like, e.g., Ramanujan. What that gift consists of is one of the questions raised in the EDGE piece. Obviously we are not all of us born with it. Nor do I believe that all people born as potential mathematicians become actual ones. Tenacity of motivation, an uncluttered and receptive mind, an unerring ability to concentrate the mind's focus on long intricate chains of reasoning and relational structures, the self discipline needed for snatching such a mind out of vicious circles, these are only a few characteristics that spring to mind. They can be cultivated. Experience will train the judgment to distinguish between blind alleys and sound trails and to divine hidden animal paths through the wilderness.
b) Free association plays an important role, an agility of mind that allows reasoning to jump ahead with a sure touch, after which comes the dogged toil of constructing proofs.
c) Conceptual visualization is an indispensable attendant to mathematical thinking. Formalization is only a tool and may encourage lazy thinking! Look at the freshmen who enroll in math because they assume they won't be expected to produce coherent arguments or to write grammatical text, that bureaucratic neatness in "plugging in numbers and turning the crank" will suffice to pass the course.
It is fascinating to browse through some of the essays on the Foundations of Mathematics by the topologist and logician L.E.J. Brouwer, the father of Intuitionism. You will find very few formulas in them, and yet they are rigorously reasoned, tightly and succinctly, more so than many formal texts. [3]
d) That practice, familiarity, experience and experimentation are important prerequisites for successful mathematical activity goes without saying. But less obvious and just as important is a tendency to "day dream," an ability to immerse oneself in contemplation oblivious of all surroundings, the way a very small child will abandon himself to his blocks. Anecdotes bearing witness to the enhancement of creative concentration by total relaxation abound, ranging from Archimedes' inspiration in a bath tub to Alfred Tarski's tales of theorems proved in a dental chair.
THE EVOLUTION OF MATHEMATICAL CONCEPTS
The tenet that MATHEMATICAL OBJECTS ARE MENTAL CONSTRUCTS conceived by the human species for the purpose of forging its way through life and environment is compelling. How could we orient ourselves in space without discerning dimensions and estimating distances, how could we keep track of possessions and offspring without a sense for numbers (cardinals), groupings and hierarchies (ordinals)?
Maybe the prototypical shepherd just kept a heap of pebbles handy by his cave, one for each sheep, to make sure by MATCHING that at the end of the day he had his whole flock together: the first occurrence of the mathematical arrow. The next guy paid attention to the pecking order among his charges and chose his pebbles accordingly. And then, much later, one with a poetic twist of mind gave individual names to his sheep and picked pebbles to match their personalities in looks, color, shape and mood so that, if one went missing, he could tell by looking at the leftover pebble which one of his flock to search for where, according to the culprit's specific idiosyncrasies. Finally, with all that time on their hands, some of the shepherds started creating poetry or inventing music, others projected and extrapolated their minds into higher realms of mathematics, and started wondering.
Here is the beginning of mathematics, not only arithmetic, the whole works, structures (you start grouping your flock, and those groups will interact), mappings and probably even the concept of infinity, "what if those ewes keep lambing and lambing till I run out of pebbles..." Pretty soon these concepts become PHENOMENA and begin evolving in interaction with their creators and with the uses they are put to.
When the ANTHROPOLOGIST has told his story and the PHENOMENOLOGIST has had a look at how a mathematician's mind works, it is for the NEUROPHYSIOLOGISTS to figure out what is going on in the brain of those shepherds and their descendants. The PSYCHOLOGY of mathematical activity (and obsession) also deserves attention and is bound to shed light on the mystery of the prodigy.
The view of mathematical "objects" as mental constructs forever caught up in a dynamic process of evolution was succinctly articulated by L.E.J. Brouwer, the Dutch topologist who, during the first quarter of this century, founded the school of INTUITIONISM as the most compelling alternative to PLATONISM. Occasionally Intuitionism is accused of leading into solipsism. But the understanding of mathematical intuition as a sense for charting one's way around an environment including fellow creatures implies that its tools, the concepts, must be evolving by joint and competing efforts of a community. Very much in keeping with what I understand is Stanislas Dehaene's view. With Brouwer I believe in preverbal mathematical perception, where by perception I mean an activity, a process of "seeing as", picking out of patterns and imposing frames of reference.
Friedrich Wilhelm Nietzsche (1844-1900) had a keen understanding of the anthropological evolution of mathematics and rational thinking. His Der Wille zur Macht (The Will to Power, 1887) contains poignantly expressed insights into the genesis of the laws of Logic, many of them anticipating Intuitionism!
George Lakoff's stress on image schemas and conceptual metaphors is compelling, especially his suggestion of "expansion to abstract mathematics by metaphorical projections from our sensory-motor experience". Yes, we do have mathematical bodies! On a primordially homogeneous environment we impose a grid commensurate in size and compatible in shape with our bodies as we know them from direct experience. One step further, we project our bodies beyond what is immediately perceivable, spurred on by a tenacious intention "to make sense of it all". Have you ever noticed how many mathematicians are rock climbers? The process of mulling over a mathematical problem displays a striking similarity to that of surveying a cliff before the ascent; of visualizing and comparing alternate routes, from the big lines of ridges, ledges and chimneys down to the details of toe and finger holds, and then weighing possibilities of what might be encountered beyond the visible; all in perfectly focused concentration, projecting ahead, extrapolating, performing so-called "Gedankenexperimente" (thought experiments) and sensing them throughout one's bones and muscles. And finally setting off to break trail through the folds of a brain!
Already in the seventeenth century Blaise Pascal (1623-1662) articulated in his Pensées (Thoughts) the observation that the abstract schemata we impose on the world in order to interact meaningfully with it are shaped by the experience of our bodies. [4]
During the last half century the evolution of so-called CATEGORY THEORY out of algebraic topology has developed a dynamic language of diagrams in which the abstract concepts of universal algebra find their natural habitat. [5] "Diagram chasing", a systematic form of hand waving, is a way of making sense of the abstract structural and conceptual underpinnings of mathematics, including Arithmetic and Geometry, Logic and set theory, as well as of the juxtaposition between discrete and continuous phenomena. It turns out that Topoi, a particularly prolific species of categories, have the structure of intuitionistic Logic, an amazing corroboration of INTUITIONISM. F. W. Lawvere at SUNY Buffalo, a pioneer in the field since the early sixties, and his associates are beginning to make significant contributions to cognitive science.
As to PLATONISM, whether deliberately or inadvertently, most mathematicians still act and talk as if they were dealing with objects that are part and parcel of the furniture of their Universe. I do it myself, and so does George Lakoff when he refers to the straight line and the reals. It is such a convenient make-believe stance, not to be confounded, however, with the deep allegorical truths revealed in the poetry of Plato's dialogues.
But there is more to be said when we stop to contemplate what we call REALITY. Think how often a writer will create characters only to find them taking on a life of their own, doing things or getting into trouble that their creator had not intended for them at all. So, the positive integers are mental constructs. They are tools shaped by the use they are intended for. And through that use they take on a patina of reality! Nor do they rattle about in isolation. They interrelate, they pick up individual personalities through interaction, by their position in the natural ordering, by splitting into primes, by what they are good for, in what contexts they play what roles.
And before we know it we have a problem on our hands like Fermat's Last Theorem! Its statement can be explained to every child, using a bit of hand waving and the ever handy dots. Through generations the belief in its truth had grown ever more entrenched. No counterexample was found, but no proof was in sight either until Andrew Wiles [6] succeeded in blazing the final trail to the goal through abstract territory, rugged and disconnected in places and prepared by the toil of his peers in others. To the experts the proof is illuminating, but not to the ordinary mathematician in the street. By now our tools are so highly developed that they bring us information about our own creations that we cannot fathom with the unaided mathematical senses, even though it may concern situations whose meaning we can understand perfectly well. In physics and astronomy we are used to similar situations: our instruments can reach physical phenomena way beyond the reach of our physical bodies. The interpretations of these messages from beyond are encoded in theories of our own construction.
The method of FORMALIZATION is by now widely accepted, used and discussed. But it has limitations and is trailing some baffling "non-standard" phenomena in its wake. In order to put these into proper perspective a technical digression is needed.
FORMAL THEORIES
While mathematics is forging mental tools for charting our way through the world, our brains playing very much the part of our senses, things become so intricate that we need artifacts for keeping track of those constructs. That is where symbols come in (algebraic notation, diagrams, technical languages and so forth) as mechanisms for storing and surveying insights and for communicating about them. Extension of this method to the analysis of mathematical reasoning itself leads to so-called metamathematics and symbolic logic.
Allowing the articulation of "axioms" and of rules of deduction governing their use, the systematic construction of formal languages leads to FORMALIZED THEORIES consisting of theorems, i.e., well formed sentences (wfs' for short) obtained from axioms by chains of deductions according to those rules.
A formal proof is a finite sequence of wfs' starting with axioms, hanging together by the formal rules and ending with the theorem proved by it. The formalized theory itself becomes a topic for theoretical investigation since it is bound to have properties that go beyond what we put into it. Will it be formally consistent in the sense that the negation of a theorem will never show up as a theorem too? Is it formally complete, i.e., does every sentence have a proof unless its negation has one? These are typical problems for the meta-theory.
The choice of axioms is not arbitrary. We are guided by common sense of mathematical perception, by criteria that deserve investigations to which the EDGE group seems to be making valuable contributions. As we acquire and develop intuitive concepts of sets, spaces, geometries, algebraic structures and all the rest, we try to grasp them by characteristic properties and are led to basic postulates.
Occasionally sustained experience reveals that the original construction was not fully determinate, that the axioms are not complete. They don't suffice to pin down the intended concept uniquely. Some sentence A (Euclid's fifth postulate, for instance) is left undecided by what was considered an axiomatic characterization of the concept of, say, a geometry. Both A and its negation not-A are formally consistent with the axioms. Well, for some purposes it is useful to assume Euclid's parallel axiom for geometry, or well-foundedness for sets; at other times it may be handy to deal with bottomless sets or crooked squares. The tools are evolving as we are using, refining and adjusting them. Such experiences that at first look like failures deepen conceptual understanding and expand mathematical horizons.
The situation of the arithmetic N over the natural numbers 0,1,2,3,... and that of the ordered field R of the reals are more subtle. In both cases we "know exactly" what structure we have in mind; there is no question of bifurcation of concepts. Yet in the case of N a complete axiomatization founders on the requirement of effectiveness, while the elementary theory of R, even though completely formalizable, has so-called non-standard models, as does every elementary theory of an infinite structure.
ELEMENTARY THEORIES
These phenomena are a manifestation of the precarious balance between algorithmic precision and expressive power inherent in every formal language and its logic. The most popular, widely taught formalization is the first order predicate calculus, also called elementary logic, a formalization of reasoning in so-called first order predicate languages. That apparatus leads from "elementary axioms" to "elementary" theories.
The important requirement for any formalization is the existence of both a "mechanism" (algorithm) for deciding, given any well formed sentence (wfs) of the language concerned, whether or not that wfs is an axiom, and one for deciding of any given configuration of wfs' whether or not it is an instance of one of the rules. The resulting concept of a formal proof is decidable, i.e., there exists an algorithm which, when fed any finite sequence of wfs', will come up with the answer "yes" (0) or the answer "no" (1) according as that sequence is a formal proof in the system or not. The resulting axiomatizable theory will in general only be effective in the sense that there exists an algorithmic procedure for listing all and only those wfs' that are theorems. That by no means guarantees a decision procedure for theoremhood. In fact most common theories have been proven undecidable.
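To make the difference between checking proofs and listing theorems concrete, here is a toy sketch in Python. It does not use a predicate language at all; as a stand-in it borrows the well-known MIU string-rewriting puzzle from Hofstadter's "Gödel, Escher, Bach" (one axiom, four rules), which is an assumption of this illustration and not a system discussed in the text. The names `successors`, `is_proof` and `theorems` are likewise made up here.

```python
from collections import deque
from itertools import islice

AXIOM = "MI"


def successors(s):
    """All strings obtainable from s by one application of one rule."""
    out = set()
    if s.endswith("I"):
        out.add(s + "U")                        # rule 1: xI  -> xIU
    if s.startswith("M"):
        out.add(s + s[1:])                      # rule 2: Mx  -> Mxx
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])    # rule 3: III -> U
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])          # rule 4: UU  -> (nothing)
    return out


def is_proof(lines):
    """Decide whether a finite sequence of strings is a formal proof."""
    if not lines:
        return False
    for i, line in enumerate(lines):
        if line == AXIOM:
            continue
        # Every other line must follow from some earlier line by one rule.
        if not any(line in successors(prev) for prev in lines[:i]):
            return False
    return True


def theorems():
    """List all and only the theorems, breadth first, without ever stopping."""
    seen = {AXIOM}
    queue = deque([AXIOM])
    while queue:
        s = queue.popleft()
        yield s
        for t in sorted(successors(s)):
            if t not in seen:
                seen.add(t)
                queue.append(t)


print(is_proof(["MI", "MII", "MIIII", "MUI"]))   # True
print(is_proof(["MI", "MU"]))                    # False
print(list(islice(theorems(), 3)))               # ['MI', 'MII', 'MIU']
```

`is_proof` always terminates with a verdict, whereas `theorems()` runs forever: waiting for a string to show up in the list can confirm that it is a theorem but can never, by itself, establish that it is not.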
To start with, a familiar structure like N or R will serve as the so-called STANDARD MODEL or INTENDED INTERPRETATION for the elementary theory meant to describe it. Observe that the notion of a standard model presupposes some basic concept of mathematical reality and truth. Gödel talks of "inhaltliches Denken" (thinking in terms of content) in juxtaposition to "formales Denken" (formal thinking). His translators use the term 'contentual'. 'Intentional' might be just as good a choice.
Of course one might dodge the need for a metaphysical position by using terms like "preverbal" or "informal".
But that does not make the problem go away. If we want to talk about standard models, if we want our theories to describe something, approximately and formally, what is it that we want them to describe? A question that would not disturb a Platonist like Gödel. The formalist's way out is to throw away the ladder once he has arrived at his construction and to concentrate on the questions of formal consistency and formal completeness, purely syntactic notions.
A theory is formally consistent if and only if for no wfs A both A and not-A are theorems and formally complete if and only if for every wfs A either A or not-A is a theorem.
To an extreme formalist the existence of an abstract object coincides with the formal consistency of the properties describing it. If at all, he will draw his models from yet another theory, most likely some, necessarily incomplete, formalization or other of elementary set theory, presumed (but only presumed) to be consistent. An unsatisfactory strategy.
We, however, are left with the conundrum of Mathematical Truth and the semantic notions that depend on a "meaning" attached to the theory. With respect to an interpretation of its language over a structure S a theory, formalized or not, is
sound (semantically consistent) if and only if only wfs' true in S are theorems,
semantically complete if and only if all wfs' true in S are theorems.
Granted a clear and distinct idea of the structures N and R we talk of the sets of all wfs' that are true under the intended interpretations on N and on R as True Elementary Arithmetic, TN, and as the True Theory TR of the Reals.
Consider an EXAMPLE: Leaving aside the question where N comes from, I should think that we all know what we mean by the wfs
(F^3) for ALL positive integers x, y and z: the sum of the cubes of x and y is not equal to the cube of z.
F is short for FERMAT. To explain it to a naive computer mind, we would say: "Make two lists as follows; in the left one, L, write down successively the results of adding the cubes of two positive integers, 1^3 + 1^3, 1^3 + 2^3, 2^3 + 2^3, 1^3 + 3^3, 2^3 + 3^3, 3^3 + 3^3,..., and into the right one, R, put all the cubes 1^3 (1), 2^3 (8), 3^3 (27), 4^3 (64), 5^3 (125) and so on. Now run through both lists comparing the entries. (F^3) claims that you will never find the same number showing up both on the left and on the right". A computer can easily compile these lists in so orderly a fashion and run through them so systematically that, for each bound N, it will, after a computable number of steps, say f(N) of them, have calculated and compared all pairs of numbers in L and in R smaller than N. You will probably agree that this tedious explanation makes it sufficiently clear what we mean here by ALL. You may want to use nicer language like talking about NEVER finding a matching pair. The purpose of symbolization, however, is not only orderliness, but clarification. The dual to the so-called UNIVERSAL QUANTIFIER (for all) is the EXISTENTIAL QUANTIFIER. Just think for a moment, assuming (F^3) were false, how easy it would be for your patient computer to prove that. It would only have to go on long enough until it found a COUNTER EXAMPLE, i.e., a positive integer that shows up in both lists, R and L. Having done so it would have proved
NOT(F^3) there EXIST positive integers x, y and z, such that the sum of the cubes of x and y is equal to the cube of z.
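The two-list procedure can be sketched in a few lines of Python. This is only one possible way of organizing the lists, with a hypothetical function name `common_entries` and an added `exponent` parameter (not in the description above) so that the same search can be seen to succeed where counterexamples do exist, namely for squares.

```python
def common_entries(bound, exponent=3):
    """Numbers below `bound` that occur both in the left list L (sums of two
    positive n-th powers) and in the right list R (n-th powers), n = exponent."""
    powers = []
    z = 1
    while z**exponent < bound:
        powers.append(z**exponent)
        z += 1
    right = set(powers)                                           # list R
    left = {a + b for a in powers for b in powers if a + b < bound}  # list L
    return sorted(left & right)


print(common_entries(10**6))            # []
print(common_entries(30, exponent=2))   # [25]
```

An empty result for a given bound says nothing about larger numbers: no finite run of this search establishes (F^3), whereas a single common entry would establish NOT(F^3). For squares the search finds 25 = 3^2 + 4^2 = 5^2 at once.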
In 1753, using clever transformations of the problem, Euler succeeded in proving the restriction (F^3) of Fermat's theorem to cubes. But Fermat's general Conjecture
(F) for ALL positive integers x, y, z and n, n greater than 2, the sum of the n-th powers of x and y is not equal to the n-th power of z
was proved conclusively only a few years ago by means of techniques way beyond elementary arithmetic. It should be noted here that variable exponentiation is not part of the language of N but can be paraphrased in it. In the above procedure you will have to organize your left list according to an enumeration of triples (x,y,n) and the right one according to pairs (z,n).
The capacity to visualize an ongoing sequence of calculations and comparisons leads to an understanding of what is meant by the truth of (F). Yet, in spite of many efforts, its proof had to wait till algebraic geometry and number theory had achieved the maturity necessary to allow its construction.
We have a pretty good understanding of what we mean when we claim that ALL integers ÷ or all pairs or triples of them ÷ have a certain property, provided that we understand the property itself. The most manageable kind of properties that integers may have are what we call recursive or computable. They are susceptible to a decision procedure as illustrated by the example of checking for fixed n and any given triple of integers x,y,z whether or not the sum of the n-th powers of the first two is equal to that of the third.
A property P of triples of numbers is called recursive if and only if its so-called characteristic function that takes on value 0 at the triple (m,n,q) if that triple has the property and value 1 otherwise (its decision function) is computable by an algorithm like a Turing machine, or, equivalently, is recursive.
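As a small illustration of the convention just stated (value 0 when the triple has the property, 1 otherwise), here is a Python sketch of the decision function for the property "the sum of the n-th powers of the first two numbers equals the n-th power of the third", for a fixed n. The name `fermat_characteristic` is invented for this example.

```python
def fermat_characteristic(n):
    """Return the characteristic function of the property
    x^n + y^n == z^n on triples (x, y, z), for a fixed exponent n."""
    def chi(x, y, z):
        # Convention of the text: 0 means "has the property", 1 means "does not".
        return 0 if x**n + y**n == z**n else 1
    return chi


chi2 = fermat_characteristic(2)
chi3 = fermat_characteristic(3)
print(chi2(3, 4, 5))   # 0, since 9 + 16 == 25
print(chi3(3, 4, 5))   # 1, since 27 + 64 != 125
```

Each call terminates after a fixed handful of arithmetic operations, which is what makes the property recursive; the difficulty of (F) lies entirely in the universal quantifier ranging over all triples and exponents.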
The amazing, often elusive, power of the universal quantifier brought home by Gödel's incompleteness proof, discussed in the next section, is again manifest in the intrinsic difficulties with which the conclusive proof of Fermat's theorem is fraught.
ELEMENTARY ARITHMETIC
Based on a naive concept of Truth, every true theory of a definite structure is complete and consistent in both senses, a pretty useless observation. For, a byproduct of Gödel's Incompleteness proof of 1931[7] is the non-formalizability of elementary arithmetic, TN, and with it of many other theories.
EVERY SOUND AXIOMATIZATION OF ELEMENTARY ARITHMETIC IS INCOMPLETE.
The most natural candidate for axiomatizing TN goes back to Giuseppe Peano (1895) and consists of the recursive rules for addition, multiplication and the natural ordering on the set N of non-negative integers built up from 0 by the successor operation that leads from n to n+1, together with the Principle P of Mathematical Induction, which postulates that every set of numbers containing 0 and closed under the successor operation exhausts all of N, or, equivalently, that every property enjoyed by 0 and inherited by successors is universal. P is a principle that adults may consider a definition of the set N, while children will, in my experience, take it for granted. But if you want to articulate it in the language of the first order predicate calculus you run into trouble. As illustrated in example (F), elementary languages can quantify over individuals. But quantification over so-called HIGHER ORDER items like properties is beyond their scope. P is a typical sentence of second order logic.
PEANO ARITHMETIC, PA, is the first order approximation to second order arithmetic obtained by replacing P with the following schema of infinitely many axioms
(PW) If 0 has the property expressed by the wff W and, whenever a number x has that property, then so does x+1, then all natural numbers have property W.
one for each wff (well formed formula) W of the elementary language of arithmetic.
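In symbols, the contrast between the single second order principle P and the first order schema (PW) might be written as follows. The variable X ranging over sets (or properties) of numbers is notation I am assuming for the illustration; the source states both principles only in words.

```latex
% Second order induction: one sentence, quantifying over all sets X of numbers
\forall X \,\Bigl( \bigl( X(0) \land \forall x\,( X(x) \rightarrow X(x+1) ) \bigr)
   \rightarrow \forall x\, X(x) \Bigr)

% First order schema (PW): one axiom for each wff W of the language of arithmetic
\bigl( W(0) \land \forall x\,( W(x) \rightarrow W(x+1) ) \bigr)
   \rightarrow \forall x\, W(x)
```

The schema covers only those properties that some wff W can express, which is why it is an approximation to P rather than a reformulation of it.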
Reformulating (PW) in terms of proofs rather than truth sheds light on it and illustrates how one might want to go about replacing the basic concept of truth in mathematics by a primitive notion of proof. Writing W(x) for "x has property W" the Principle of Proof by Mathematical Induction reads
(PPW) Given 1) a proof of W(0) and 2) a method for turning any proof of W(x) into a proof of W(x+1) THEN there exists a proof of "for all x: W(x)".
Note how appealing this formulation is: Given any number n, you only have to start with the proof given by 1) and then apply the method of 2) n times to obtain a proof of the sentence W(n). But there is a subtlety here. So understood, the principle only guarantees that, for every number n, a proof of W(n) can be found, a typical "for all ... there exists ..." claim. Its power lies, however, in the "there exists ... for all ..." form of the conclusion as exhibited above.
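The "for all ... there exists ..." reading can be made concrete in a small sketch. Everything here is a toy of my own devising: a proof is modelled as a mere list of text lines, and `toy_step` stands in for the method of 2). The point is only that the procedure returns a separate proof for each n, whose length grows with n.

```python
from typing import Callable, List

# Hypothetical stand-in: a "proof" is just a list of lines of text.
Proof = List[str]


def proof_of_instance(base: Proof,
                      step: Callable[[Proof, int], Proof],
                      n: int) -> Proof:
    """Start from a proof of W(0) and apply the step method n times,
    yielding a proof of the single instance W(n)."""
    proof = list(base)
    for k in range(n):
        proof = step(proof, k)  # turns a proof of W(k) into one of W(k+1)
    return proof


def toy_step(proof: Proof, k: int) -> Proof:
    """Toy version of the method in 2): append one inference line."""
    return proof + [f"W({k}) implies W({k + 1}), hence W({k + 1})"]


toy_base: Proof = ["W(0) holds by direct check"]
p = proof_of_instance(toy_base, toy_step, 3)
print(len(p))  # 4
print(p[-1])   # W(2) implies W(3), hence W(3)
```

No call to `proof_of_instance` ever returns a proof of "for all x: W(x)". Passing from the family of instance proofs to one finite proof of the universal sentence is exactly what the principle (PPW) adds.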
These may sound like nit-picking distinctions, but they are of great proof-theoretic significance. For instance, from the consistency of PA, proved by means transcending PA, follows:
If g is the Gödel number of the Gödel sentence G then:
for each natural number n, the sentence "n is not the Gödel number of a proof of the sentence with Gödel number g"
is a theorem of PA.
However G itself, namely the sentence
"for all x: x is not the Gödel number of a proof of the sentence with Gödel number g"
is not a theorem of PA.
G is the sentence that truthfully claims its own unprovability. Much deep work is required to establish this result rigorously.
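The contrast can be displayed compactly. The predicate symbol Prf(x, y), read "x is the Gödel number of a proof of the sentence with Gödel number y", and the overlines marking numerals are notation I am assuming here; the source states the result in words only.

```latex
% Every single instance is a theorem of PA (given the consistency of PA):
\text{for each natural number } n:\quad
  \mathrm{PA} \vdash \lnot\,\mathrm{Prf}(\overline{n}, \overline{g})

% The universal sentence G itself is not:
\mathrm{PA} \nvdash \forall x\, \lnot\,\mathrm{Prf}(x, \overline{g})
```

This is the "for all ... there exists ..." versus "there exists ... for all ..." distinction again: for every n there is a proof of the n-th instance, yet there is no one proof covering all n.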
Occasionally proofs by mathematical induction are confused with arguments based on so-called 'inductive reasoning', a term used in philosophical discussions of logic and the sciences, yet another reason for all these elaborations.
Axiomatized but undecidable theories are a fortiori incomplete. In 1939 Tarski proved that
THERE IS A FINITELY AXIOMATIZABLE FRAGMENT OF PA ALL OF WHOSE CONSISTENT EXTENSIONS ARE UNDECIDABLE. [8]
And yet, if we accept an intentional concept of truth, we seem to obtain a complete theory from Peano's mere handful of axioms together with that one marvelous second order tool P on which so much of our mathematical thinking hinges. For:
ALL MODELS OF PEANO'S SECOND ORDER AXIOM SYSTEM ARE ISOMORPHIC.
Still, any attempt to formalize second order arithmetic is again doomed to founder on the cliffs identified by Gödel and Tarski. The juxtaposition of these claims is and ought to be baffling. In fact they bring home the discrepancy between the naive and the formalist concept of a model. From the naive point of view they mean that higher order logic cannot be completely formalized. Even so, completeness proofs for it are widely hailed, at the price of allowing all sorts of non-isomorphic models even for second order Peano Arithmetic. Enough of that for now. Fermat's last theorem may well be beyond the scope of elementary Peano Arithmetic. In other words, (F) is presumably left undecided by PA. A few mathematically interesting theorems expressible in the language of PA with that property are already known. They are embeddable in stronger but still convincing first order theories, some elementary set theory or other.
After all that we are faced with the question of where new axioms come from, in other words with THE PROBLEM OF THE NATURE OF MATHEMATICAL TRUTH. To declare "OK, as of October 27, 1995, the day that Wiles was awarded the Prix Fermat by the town of Toulouse, (F) shall be added to the list of axioms for elementary arithmetic" would seem quite inappropriate. We want more intuitively obvious first principles.
NON STANDARD MODELS
In amazing contrast to TN, the first order theory TR of the ordered field of the REALS has been successfully and completely formalized, starting with Euclid's axioms, improved by Hilbert just before the turn of the century and completed as well as proved complete by Tarski about the time of the Second World War. My immediate reaction when I first heard of this feat was shock and distrust of those Berkeley logicians. "How could that be? The reals are so much more complicated than the integers. Aren't the natural numbers defined as the non-negative integral reals?" Well, the solution of that conundrum lies in the
LIMITATION OF EXPRESSIVE POWER INHERENT IN FORMAL LANGUAGES.
As a matter of fact, the natural numbers are not "elementarily definable" among the reals; there is no wff of the language of R that picks out the natural numbers among the reals.
Moreover, in spite of its completeness, TR has non-isomorphic models! It has countable models, uncountable ones, Archimedean as well as non-Archimedean ones; some harbor hyperreals, others only standard reals... What is going on? First the chicken-or-egg question must be faced: what comes first, the model or the theory? Ever since the elaborations by Tarski in 1934 and by Mal'cev in 1936 of the results by Löwenheim of 1915 and by Skolem of 1920 (a brief exposition will follow below) we understand that first order chickens are prone to lay a medley of eggs, some "real" in the Platonic sense of being standard and others weird, artificial, substitutes, freaks, in short non-standard. The Ur-hen, the axiomatization, originated from a standard egg, the "intended interpretation", a natural mathematical construct like our everyday arithmetic of the positive integers, or, more sophisticated, the real number system of the 19th century. After the chicken has grown to maturity it starts laying models, and, roaming through the virtual reality of model theory instead of free ranging in Platonic realms, it comes up with non-standard eggs. The only constraint on those is consistency and the verification of the axioms, i.e., the genetic chicken code. These models are hatched within the confines of some entrenched formalization of set theory.
What really lies at the basis of non-standard objects like hyperreals is, again, the limitation inherent in first order languages. In the elementary language of real number theory we cannot distinguish between Archimedean and non-Archimedean orderings, and that opens the door to constructions that were scorned by my teachers although they might use infinitesimals as a handy figure of speech the way we still talk Platonically. We thought that Cauchy and Weierstrass' arithmetization of analysis had done away with that alleged abuse of language, but now it is back en vogue again and very useful too (see below).
NON STANDARD PHENOMENA are closely connected with the SEMANTIC COMPLETENESS OF ELEMENTARY LOGIC, first proved by Gödel in 1930 [9] and extended in many ways since, in particular by Henkin who also dealt with formalizations of higher order logic. The underlying meta theorem rests on two facts, the first inherent in the finitary nature of a formal deduction, the second involving non-constructive instructions for building a model:
1) WHENEVER ALL FINITE SUBSETS OF A SET OF WFFS ARE CONSISTENT THEN SO IS THE ENTIRE SET and
2) EVERY CONSISTENT SET OF WFFS HAS A MODEL.
By definition Semantic Completeness of a formal calculus means EQUIVALENCE BETWEEN FORMAL DERIVABILITY AND SEMANTIC VALIDITY where validity stands for truth under all interpretations, i.e. in all models.
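In the usual turnstile notation, which I am supplying here since the source states the equivalence only in words, semantic completeness of a calculus reads:

```latex
% Derivability coincides with validity (truth in all models):
\vdash \varphi \quad\Longleftrightarrow\quad \models \varphi
\qquad \text{for every wff } \varphi
```

The left-to-right direction is the soundness of the calculus; the right-to-left direction is the substance of Gödel's theorem of 1930.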
At first this looks like an amazing result especially in view of currently rampant incompleteness. It is unfortunate that popular literature so often fails to make a clear distinction between the two concepts of semantic and of syntactic completeness (pp.10,11). Only the experienced reader will automatically know from the context which notion is at stake.
As a matter of fact the completeness of first order logic is achieved at a price: the expressive poverty of the formal language. Completeness proofs for higher order logic are ensnared in the same kind of bargain. They are based on a concept of model that to the naive mind seems contrived. Elementary languages are incapable of distinguishing between arbitrarily large finiteness and infinity, and so are forced to tolerate the infinitely small. Consider the infinite set of wffs 0 < a < 1, a + a < 1, a + a + a < 1, ..., a + a + a + ... + a < 1, ... and let U be its union with TR, the set of all wffs that are true in the field R of the reals. Every finite subset V of U has a model: just take R and interpret a by 1/n, where n is the number of symbols occurring in that finite set V. By 1) then the whole set U is consistent and so, by 2), it and with it the elementary theory of the reals has a model which harbors an element satisfying all these inequalities, i.e., a non-Archimedean, non-standard, or hyper, real a. It is positive and yet smaller than any fraction 1/n, n a positive integer.
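The finite-subset step of this argument can be checked by machine in a small sketch. I represent a finite set V only by the list of multiples k for which the inequality "k copies of a sum to less than 1" occurs in it, and I pick n as one more than the largest such k rather than as the number of symbols in V; that is a simplification of mine, and the function names are hypothetical.

```python
from fractions import Fraction
from typing import Iterable


def witness_for(multiples: Iterable[int]) -> Fraction:
    """Choose a standard real a = 1/n that satisfies every inequality
    k*a < 1 listed in the finite set, taking n larger than every k."""
    n = max(multiples, default=1) + 1
    return Fraction(1, n)


def satisfies(a: Fraction, multiples: Iterable[int]) -> bool:
    """Check 0 < a < 1 together with k*a < 1 for each listed k."""
    return 0 < a < 1 and all(k * a < 1 for k in multiples)


finite_v = [1, 2, 7]          # stands for a < 1, a + a < 1, and 7 copies of a
a = witness_for(finite_v)
print(a)                       # 1/8
print(satisfies(a, finite_v))  # True

# The same standard value fails as soon as the k = 8 inequality is added:
print(satisfies(a, range(1, 9)))  # False
```

Every finite selection is thus satisfied by some ordinary fraction, while no single ordinary real satisfies all the inequalities at once. Only the model delivered by 1) and 2) contains an element that does.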
Ruled by its logic, the language cannot prohibit such anomalies. But there is a silver lining to this shortcoming: Because of the consistency of infinitesimals with TR, every truth about the reals that can be expressed in the elementary language of R holds for all reals, standard or not, and so, by Gödel's completeness theorem, it has a formal proof. And if the approach via infinitesimals is smoother that is just great. One cannot help but marvel at the native instinct with which the seventeenth century mathematicians went about their work.
Similarly, any first order theory of N, including TN, has models that contain infinitely large integers. The elementary theory of finite groups has infinite models and so in fact does every first order theory with arbitrarily large finite models.
All this is meant to explain that these non-standard phenomena have no bearing on the question whether Platonism is an appropriate view of the origin of Mathematics. I am deliberately not using the word "correct". Whether Platonism is "true" seems an ill posed question, luring into vicious circles. How can we contemplate the truth of this, that or the other "ism" before we have a clear and distinct idea of what, if anything, we mean by the Truth of a theory?
The existence of non-standard models should NOT be confounded with the occurrence of incomplete concepts like that of a geometry or that of a set. In the case of hyperreals we are running into limitations of the formal language while dealing with complete theories; in the second case we are simply facing the fact that the intuitive concept, say of a geometry or a set, that we had in mind when setting up the formalization is not completely fathomed yet, in both senses of completeness. Of course the easiest reaction is to say, "that concept is out there, let us go look more closely and we shall eventually find its complete characterization". In this frame of mind Gödel is reputed to have been convinced that we shall eventually understand enough about sets to come up with new axioms that will decide the continuum hypothesis. But in other cases the expedient policy will allow a concept to bifurcate; sailors have no trouble with non-Euclidean geometries.
The big question is where our standard concepts come from, how do we all know what we mean by the Standard Reals? How can we distinguish between Archimedean orderings and non Archimedean ones, when we cannot make the distinction in first order language? Well we can always resort to hand waving when words fail. We can indeed communicate about them beyond the confines of formalism. They are conceptions, constructions, structures, figments of our imagination, of the human mind that is our common heritage. Other creatures may have other ways of making sense of and finding their way in a Universe that we are sharing with them.
This century has seen the development of a powerful tool, that of formalization, in commerce and daily life as well as in the sciences and mathematics. But we must not forget that it is only a tool. An indiscriminate demand for foolproof rules and dogmatic adherence to universal policies must lead to impasses. The other night, watching a program about the American Civil Liberties Union, I was repeatedly reminded of Gödel's Theorem: every system is bound to encounter cases which it cannot decide, snags that will confront its user with a choice between either running into a contradiction or jumping out of the system. That is when, with moral issues at stake, cases of precedence are decided by thoughtful judgment going back to first principles of ethics, in the sciences alternate hypotheses are formed and in mathematics new axioms crop up.
Returning to my question, think of mathematics as a jungle in which we are trying to find our way. We scramble up trees for lookouts, we jump from one branch to another guided by a good sense of what to expect until we are ready to span tightropes (proofs) between outposts (axioms) chosen judiciously. And when we stop to ask what guides us so remarkably well, the most convincing answer is that the whole jungle is of our own collective making, in the sense of being a selection out of a primeval soup of possibilities. Monkeys are making of their habitat something quite different from what a pedestrian experiences as a jungle.
To sum it all up I see mathematical activity as a jumping ahead and then plodding along to chart a path by rational toil.
The process of plodding is being analyzed by proof theory, a prolific branch of metamathematics. Still riddled with questions is the jumping.