Posit a language over a set of characters $S$ capable of expressing any mathematical proof. Now consider the set $S_N$ of all strings that can be generated by taking any $N$ characters from $S$. For example, $s = (c_1, \ldots, c_N)$, where each $c_i \in S$, would be such a string in $S_N$. The cardinality of $S_N$ is given by $|S|^N$. Now consider the portion of $S_N$ comprised of strings that express a true theorem of mathematics. It is of course going to be extremely small compared to the total cardinality of $S_N$, which grows exponentially as a function of $N$.
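As a rough illustration of how quickly the space of candidate strings grows, here is a minimal Python sketch that enumerates every string of a given length over a toy alphabet and checks the count against the alphabet size raised to the string length. The three-character alphabet and the function names are hypothetical, chosen only to keep the enumeration small.

```python
from itertools import product


def count_strings(alphabet_size: int, length: int) -> int:
    """Number of strings of the given length over an alphabet of the given size."""
    return alphabet_size ** length


def enumerate_strings(alphabet: str, length: int) -> list[str]:
    """Explicitly list every string of the given length over the alphabet."""
    return ["".join(chars) for chars in product(alphabet, repeat=length)]


# Hypothetical toy alphabet; a real proof language would be far larger.
alphabet = "ab+"

for length in range(1, 6):
    strings = enumerate_strings(alphabet, length)
    # The explicit enumeration agrees with the closed-form count.
    assert len(strings) == count_strings(len(alphabet), length)
    print(length, len(strings))
```

Even with three characters the count triples with every added character, so direct enumeration is only feasible for very short strings.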
In my paper, Information, Knowledge, and Uncertainty [1], I showed, both theoretically and empirically, that the knowledge conveyed by an observation is given by,
$K = I - U$,
where $I$ is the information content of the observation, and $U$ is the uncertainty of that observation. The information content of an observation is simply the logarithm of the number of states the observation could have taken on. See [1] generally.
Posit a theorem $T$, and further assume that its shortest proof is of length $N$, and that it is unique (i.e. there is no other proof of length $N$). It follows that the information content is in this case $I = \log(|S|^N) = N \log |S|$, where $|S|$ is the number of characters in the language. The uncertainty is given by $U = \log(1) = 0$, because the state space of the problem is reduced to one possibility (i.e., the proof and theorem sought after). See Section 2 of [1]. It follows that the knowledge conveyed by $T$ is $K = I - U = N \log |S|$. Therefore, the knowledge conveyed by a theorem is a linear function of the length of its shortest proof.
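The calculation above can be sketched numerically. The following Python snippet is an illustration under stated assumptions, not the method of [1]: it takes logarithms base 2, uses a hypothetical alphabet of 64 characters, and treats knowledge as information minus uncertainty, with uncertainty equal to the logarithm of the number of states that remain possible.

```python
import math


def information(alphabet_size: int, proof_length: int) -> float:
    """log2 of the number of strings of the given length over the alphabet."""
    return proof_length * math.log2(alphabet_size)


def knowledge(alphabet_size: int, proof_length: int, remaining_states: int = 1) -> float:
    """Information minus uncertainty, where uncertainty is log2(remaining_states).

    With a unique shortest proof, one state remains, so the uncertainty is zero.
    """
    uncertainty = math.log2(remaining_states)
    return information(alphabet_size, proof_length) - uncertainty


ALPHABET_SIZE = 64  # assumed value, for illustration only

for proof_length in (10, 20, 40):
    print(proof_length, knowledge(ALPHABET_SIZE, proof_length))
```

Doubling the proof length doubles the result, which is the linear relationship described above. Choosing a different logarithm base or alphabet size only rescales the values.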
Now you could of course argue that the calculation of $I$ is unduly generous, because mathematicians don’t churn through all possible strings. As a consequence, perhaps a fairer measure would be the number of valid statements of length $N$. However, this is intuitively still exponential as a function of $N$, though it is an interesting combinatorial problem. Nonetheless, the point is that a standard page contains roughly 2,000 characters, implying that an enormous amount of information is conveyed by theorems. This is in contrast to the amount of information conveyed through observation, which is discussed at length in [1], and is plainly not as useful, because in order to produce an amount of knowledge equivalent to a one-page proof, you would have to produce roughly 2,000 observations. No human being can consciously store 2,000 observations in their memory. In contrast, you can study, understand, and memorize a proof. This implies that mathematical knowledge allows human beings to maximize their measurable knowledge, whereas observation has a low threshold, since you simply can’t keep too much information in memory.
The takeaway is that mathematicians produce knowledge that is more useful than the information produced by ordinary people, which is consistent with intuition.