I wish I were a mathematician. There is in the history of the mathematical sciences, as in their substance, something that strangely stirs the imagination even of the most ignorant. Its younger sister, Logic, is as abstract, and its claims are yet wider. But it has never shaken itself free from a certain pretentious futility: it always seems to be telling us, in language quite unnecessarily technical, what we understood much better before it was explained. It never helps to discover, though it may guarantee discovery; it never persuades, though it may show that persuasion has been legitimate; it never aids the work of thought, it only acts as its auditor and accountant-general. I am not referring, of course, to what I see described in recent works as “modern scientific logic.” Of this I do not presume to speak. Still less am I referring to so-called Inductive Logic. Of this it is scarce worth while to speak. I refer to their more famous predecessor, the formal logic of the schools.
But in what different tones must we speak of mathematics! Mill, if I remember rightly, said it was as full of mysteries as theology. But while the value of theology for knowledge is disputed, the value of mathematics for knowledge is indisputable. Its triumphs can be appreciated by the most foolish, they appeal to the most material. If they seem sometimes lost to ordinary view in the realms of abstract infinities, they do not disdain to serve us in the humbler fields of practice. They have helped mankind to all the greatest generalizations about the physical universe: and without them we should still be fumbling over simple problems of practical mechanics, entangled in a costly and ineffectual empiricism.
But while we thank the mathematician for his aid in conquering Nature, we envy him his powers of understanding her. Though he deals, it would seem, entirely with abstractions, they are abstractions which, at his persuasion, supply the key to the profoundest secrets of the physical universe. He holds the clues to mazes where the clearest intellect, unaided, would wander hopelessly astray. He belongs to a privileged caste.
I intend no serious qualification of this high praise when I add that, as regards the immediate subject of this lecture, I mean Probability, mathematicians do not seem to have given ignorant enquirers like myself all the aid which perhaps we have a right to ask. They have treated the subject as a branch of applied mathematics. They have supplied us with much excellent theory. They have exercised admirable skill in the solution of problems. But I own that, when we enquire into the rational basis of all this imposing superstructure, their explanations, from the lay point of view, leave much to be desired.
“Probability,” says an often-quoted phrase of Butler, “is the guide of life.” But the Bishop did not define the term; and he wrote before the theory of probability had attained to all its present dignities. Neither D'Alembert nor Laplace had discussed it. Quetelet had not applied it to sociology, nor Maxwell to physics. Jevons had not described it as the “noblest creation of the intellect.” It is doubtful whether Butler meant by it exactly what the mathematicians mean by it, and certain that he did not suspect any lurking ambiguity in the expression.
Nor, indeed, would the existence of such ambiguity be commonly admitted by any school of thought. The ordinary view is that the theory of probabilities is, as Laplace described it, “common sense reduced to calculation.” That there could be two kinds of probability, only one of which fitted this description, would be generally regarded as a heresy. But it is a heresy in which I myself believe; and which, with much diffidence, I now propose to defend.
The well-known paradox of the theory of probabilities is that, to all seeming, it can extract knowledge from ignorance and certainty from doubt. The point cannot be better put than by Poincaré in discussing the physical theory of gases, where the doctrine of probability finds an important application. Let me give you his view—partly in paraphrase, partly in translation. “For omniscience,” he says in substance, “chance would not exist. It is but the measure of our ignorance. When we describe an event as accidental we mean no more than that we do not fully comprehend the conditions by which it was brought about.
“But is this the full truth of the matter? Are not the laws of chance a source of knowledge? And, stranger still, is it not sometimes easier to generalise (say) about random movements than about movements which obey even a simple law—witness the kinetic theory of gases? And, if this be so, how can chance be the equivalent of ignorance? Ask a physicist to explain what goes on in a gas. He might, perhaps, express his views in some such terms as these: ‘You wish me to tell you about these complex phenomena. If by ill luck I happened to know the laws which govern them, I should be helpless. I should be lost in endless calculations, and could never hope to supply you with an answer to your questions. Fortunately for both of us, I am completely ignorant about the matter; I can, therefore, supply you with an answer at once. This may seem odd. But there is something odder still, namely, that my answer will be right.’”
Now, what are the conditions which make it possible thus to extract a correct answer from material apparently so unpromising? They would seem to be a special combination of ignorance and knowledge, the joint effect of which is to justify us in supposing that the particular collection of facts or events with which we are concerned are happening “at random.” If we could calculate the complex causes which determine the fall of a penny, or the collisions of a molecule, we might conceivably deal with pennies or molecules individually; and the calculus of probability might be dispensed with. But we cannot; ignorance, therefore, real or assumed, is thus one of the conditions required to provide us with the kind of chaos to which the doctrine of chances may most fittingly be applied. But there is another condition not less needful, namely, knowledge—the knowledge that no extraneous cause or internal tendency is infecting our chaotic group with some bias or drift whereby its required randomness would be destroyed. Our penny must be symmetrical, and Maxwell's demons[1] must not meddle with the molecules.
The slow disintegration of radium admirably illustrates the behaviour of a group or collection possessing all the qualities which we require. The myriad atoms of which the minutest visible fragment is composed are numerous enough to neutralise eccentricities such as those which, in the case of a game of chance, we call “runs of luck.” Of these atoms we have no individual knowledge. What we know of one we know of all; and we treat them not only as a collection, but as a collection made at random. Now, physicists tell us that out of any such random collection a certain proportion will disintegrate in a given time; and always the same proportion. But whence comes their confidence in the permanence of this ratio? Why are they so assured of its fixity that these random explosions are thought to provide us with a better time-keeper than the astronomical changes which have served mankind in that capacity through immemorial ages? The reason is that we have here the necessary ignorance and the necessary knowledge in a very complete form. Nothing can well exceed our ignorance of the differences between one individual radium atom and another, though relevant differences there must be. Nothing, again, seems better assured than our knowledge that no special bias or drift will make one collection of these atoms behave differently from another. For the atomic disintegration is due to no external shock or mutual reaction which might affect not one atom only, but the whole group. A milligram of radium is not like a magazine of shells, where if one spontaneously explodes all the rest may follow suit. The disruption of the atom is due to some internal principle of decay whose effects no known external agent can either hasten or retard. Although, therefore, the proportion of atoms which will disintegrate in a given time can only be discovered, like the annual death-rate among men, by observation, yet once discovered it is discovered for ever.
Our human death-rate not only may change, but does change. The death-rate of radium atoms changes not. In the one case, causes are in operation which modify both the organism and the surroundings on which its life depends. In the other case, it would seem that the average of successive generations of atoms does not vary, and that, once brought into existence, they severally run their appointed course unaffected by each other or by the world outside.
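The fixity of this proportion may be illustrated by a small numerical experiment. What follows is only a sketch under stated assumptions: each atom is supposed to disintegrate independently of its neighbours, and the per-step chance `P_DECAY`, the step count `N_STEPS`, and the function name `surviving_fraction` are hypothetical choices made for illustration, not measured properties of radium. The sketch compares small collections with large ones.

```python
import random

def surviving_fraction(n_atoms, p_decay, n_steps, seed):
    """Fraction of atoms left after n_steps, each atom decaying independently."""
    rng = random.Random(seed)
    alive = n_atoms
    for _ in range(n_steps):
        # Every surviving atom faces the same chance, unaffected by the others.
        decayed = sum(1 for _ in range(alive) if rng.random() < p_decay)
        alive -= decayed
    return alive / n_atoms

P_DECAY = 0.05   # hypothetical per-step chance of disintegration (assumption)
N_STEPS = 10     # hypothetical number of time steps (assumption)

# Proportion we should expect to survive if the chance never changes.
expected = (1 - P_DECAY) ** N_STEPS

# Five independent collections of each size, differing only in the random seed.
small = [surviving_fraction(100, P_DECAY, N_STEPS, seed) for seed in range(5)]
large = [surviving_fraction(100_000, P_DECAY, N_STEPS, seed) for seed in range(5)]

print("expected:", round(expected, 4))
print("small collections:", small)
print("large collections:", [round(f, 4) for f in large])
```

Under these assumptions the small collections should show visible "runs of luck," while the large ones should cluster closely about the expected proportion, though of no single atom can anything be foretold.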
So far we have been concerned with groups or collections or series; and about these the doctrine of chances and the theory of error may apparently supply most valuable information. But in practical affairs—nay, even in many questions of scientific speculation—we are yet more concerned about individual happenings. We have, therefore, next to ask how we can infer the probability of a particular event from our knowledge of some group or series to which it belongs.
There seems at first sight no difficulty in this, provided we have sufficient knowledge of the group or series of which the particular event is a member. If we know that a tossed penny will in the long run give heads and tails equally often, we do not hesitate to declare that the chances of a particular throw giving “heads” are even. To expect in any given case heads rather than tails, or tails rather than heads, is inconsistent with the objective knowledge of the series which by hypothesis we actually possess.
But what if our information about the group or series is much less than this? Suppose that, instead of knowing that the two possible alternatives do in fact occur equally often, we are in the less advantageous position of knowing no reason why they should not occur equally often. We ought, I suppose, still to regard the chances of a particular toss as even; although this estimate, expressed by the same fraction (½) and held with the same confidence, is apparently a conclusion based on ignorance, whereas the first conclusion was apparently based on knowledge.
If, for example, we know that a die is fairly made and fairly thrown, we can tell how often a particular number will turn up in a long series of throws, and we can tell what the chances are that it will turn up on the occasion of a single throw. Moreover, the two conclusions seem to be logically connected.
But if we know that the die is loaded we can no longer say how the numbers will be distributed in a series of throws, however long, though we are sure that the distribution will be very different from what it would have been had the die been a fair one. Nevertheless, we can still say (before the event) what the chances are of a particular number turning up on a single throw; and these chances are exactly the same whether the die be loaded or whether it be fair—namely, one to five. Our objective knowledge of the group or series has vanished, but, with the theory of probability to help us, our subjective conviction on this point apparently remains unchanged. There is here, surely, a rather awkward transition from the “objective” to the “subjective” point of view. We were dealing, in the first case, with groups or series of events about which the doctrine of chances enabled us to say something positive, something which experience would always confirm if the groups or series were large enough. A perfect calculator, endowed with complete knowledge of all the separate group members, would have no correction to make in our conclusions. His information would be more complete than our own, but not more accurate. It is true that for him “averages” would have no interest and “chance” no meaning. Nevertheless, he would agree that in a long series of fair throws of a fair die any selected face would turn up one-sixth as often as all the faces taken together. But in the second case this is no longer so. Foresight based on complete knowledge would apparently differ from foresight based on the calculation of chances. Our calculator would be aware of the exact manner in which the die was loaded, and of the exact advantage which this gave to certain numbers. He would, therefore, know that in asserting the chance of any particular number turning up on the first throw to be one to five, we were wrong. In what sense, then, do we deem ourselves to have been right?
The answer, I suppose, is that we were right not about a group of throws made with this loaded die, but about a group of such groups made with dice loaded at random—a group in which “randomness” was so happily preserved amongst its constituent groups that its absence within each of these groups was immaterial, and no one of the six alternative numbers was favoured above another.
A similar reply might be given if we suppose our ignorance carried yet a step further. Instead of knowing that our die was loaded, and being ignorant only of the manner of its loading, we might be entirely ignorant whether it was loaded or not. The chances of a particular number turning up on the first throw would still be one to five. But the series to which this estimate would refer would neither be one composed of fair throws with a fair die, nor one composed of a series of throws with dice loaded at random, but one composed of a series of throws with dice chosen at random from a random collection of dice, loaded and not loaded!
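The notion of a group of groups, made with dice loaded at random, can be made concrete in a rough sketch. The assumptions are mine and should be read as such: a "random loading" is modelled by drawing six arbitrary weights and normalising them, so that no face is favoured in advance; the names `random_loading` and `first_throw_frequencies` are hypothetical. The sketch throws each of many such dice once and tallies the faces.

```python
import random

def random_loading(rng):
    """One hypothetical loaded die: six arbitrary weights, normalised to sum to 1."""
    weights = [rng.random() for _ in range(6)]
    total = sum(weights)
    return [w / total for w in weights]

def first_throw_frequencies(n_dice, rng):
    """Throw each of n_dice randomly loaded dice once; return the frequency of each face."""
    counts = [0] * 6
    for _ in range(n_dice):
        probs = random_loading(rng)
        # A single throw of this particular die, biased as its loading dictates.
        face = rng.choices(range(6), weights=probs)[0]
        counts[face] += 1
    return [c / n_dice for c in counts]

rng = random.Random(0)

# Within one die the faces are far from equally likely.
one_die = random_loading(rng)

# Across many dice loaded at random, no face is favoured above another.
across_dice = first_throw_frequencies(60_000, rng)

print("one loaded die:", [round(p, 3) for p in one_die])
print("first throws across dice:", [round(f, 3) for f in across_dice])
```

On this model each separate die is biased, yet the frequencies taken across the whole collection should lie near one-sixth for every face. That is the sense in which the estimate of one to five refers to the series of series and not to the die in hand.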
It seems plain that we have no experimental knowledge of series piled on series after this fashion. Our conclusions about them are not based on observation, nor collected from statistics. They are arrived at a priori; and when the character of a series is arrived at a priori, the probability of a particular event belonging to it can be arrived at independently by the same method. No reference to the series is required. The reason we estimate the chances against any one of the six possible throws of a die as five to one under each and all of the suppositions we have been discussing is that under none of them have we any ground for thinking any one of the six more probable than another;—even though we may have ground for thinking that in a series of throws, made with that particular die, some number, to us unknown, will in fact turn up with exceptional frequency.
The most characteristic examples, therefore, of problems in probability depend for their solution on a bold use of the “principle of sufficient reason.” We treat alternatives as equally likely when we cannot see any ground for supposing that one is more likely than another. This seems sensible enough; but how far may we carry this process of extracting knowledge from ignorance? An agnostic declines to offer any opinion on the being of God because it is a matter about which he professes to know nothing. But the universe either has a spiritual cause, or it has not. If the agnostic is as ignorant as he supposes, he cannot have any reason for preferring the first alternative to the second, or the second to the first. Must he, therefore, conclude that the chances of Theism are even? The man who knows this knows much. He knows, or may know, that God's existence is slightly more probable than his own chance of winning a coup at Monte Carlo. He knows, or may know, the exact fraction by which the two probabilities differ. How, then, can he call himself an agnostic?
Every one must, I think, feel that such reasoning involves a misuse of the theory of probability. But is that misuse without some justification? The theory, unless I misread it, permits, or rather requires, us to express by the same fraction probabilities based on what is little less than complete knowledge, and probabilities based on what is little more than complete ignorance. To arrive at a clear conclusion, it seems only necessary to apply the “law of sufficient reason” to defined alternatives; and it is apparently a matter of perfect indifference whether we apply this law in its affirmative or its negative shape; whether we say “there is every reason for believing that such and such alternatives happen equally often,” or whether we say “there is no reason for thinking that one alternative happens more often than the other.” I do not criticise this method; still less do I quarrel with it. On the contrary, I am lost in admiration of this instrument of investigation, the quality of whose output seems to depend so little on the sort of raw material with which it is supplied.
My object, indeed, is neither to discuss the basis on which rests the calculus of probabilities—a task for which I own myself totally unfit—nor yet to show that a certain obscurity hangs over the limits within which it may properly be employed. I desire rather to suggest that, wherever those limits are placed, there lies behind them a kind of probability yet more fundamental, about which the mathematical methods can tell us nothing, though it possesses supreme value as a “guide of life.”
Wherein lies the distinction between the two? In this: the doctrine of calculable probability (if I may so call it) has its only application, or its only assured application, within groups whose character is either postulated, or is independently arrived at by inference and observation. These groups, be they natural or conventional, provide a framework, marking out a region wherein prevails the kind of ignorance which is the subjective reflection of objective “randomness.” This is the kind of ignorance which the calculus of probabilities can most successfully transmute into knowledge; and herein lies the reason why the discoverers of the calculus found their original inspiration in the hazards of the gambling-table, and why their successors still find in games of chance its happiest illustrations. For in games of chance the group framework is provided by convention; perfect “randomness” is secured by fitting devices; and he who attempts to modify it is expelled from society as a cheat.
None of these observations apply to the kind of probability on whose importance I am now insisting. If calculable probability be indeed “common sense reduced to calculation,” intuitive probability lies deeper. It supports common sense, and it supplies the ultimate ground—be it secure or insecure—of all work-a-day practice and all scientific theory. It has nothing to do with “randomness”; it knows nothing of averages; it obeys no formal laws; no light is thrown on it by cards or dice; it cannot be reduced to calculation. How, then, is it to be treated? What place is it to occupy in our general scheme?
[1] Maxwell, as all who interest themselves in physics are aware, arrived at very interesting conclusions by considering what would happen if little demons interfered with the random motions of the molecules constituting a gas.