Measuring the Information Content of a Function

It just dawned on me that my paper, Information, Knowledge, and Uncertainty [1], seems to allow us to measure the amount of information a predictive function provides about a variable. Specifically, assume F: \mathbb{R}^K \rightarrow S \subset \mathbb{R}. Quantize S into M uniform intervals. It follows that any sequence of N predictions can produce any one of M^N possible outcomes. Now assume that the predictions generated by F produce exactly one error out of N predictions. Because the system is perfect but for that one prediction, there is only one unknown prediction, and it can be in any one of M states (i.e., all other predictions are fixed as correct). Therefore,

U = \log(M).
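
Concretely, taking \log base 2 (the choice of base just fixes the unit), quantizing into M = 16 intervals gives U = \log(16) = 4 bits for that single unknown prediction.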

As a general matter, E errors leave E unknown predictions, each of which can be in any one of M states, so U = E \log(M). Since the full sequence carries I = \log(M^N) = N \log(M) of information, our Knowledge, given E errors over N predictions, is given by,

K = I - U = (N - E) \log(M).
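
For example, with N = 100 predictions, M = 16 intervals, and E = 5 errors (again taking \log base 2), we have I = 100 \log(16) = 400 bits, U = 5 \log(16) = 20 bits, and therefore K = 380 bits.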

If we treat \log(M) as a constant and ignore it, we arrive at N - E. This is simply accuracy, (N - E)/N, multiplied by the number of predictions N. The number of predictions is relevant, since a small number of predictions doesn't really tell you much. As a consequence, this is an arguably superior measure of accuracy, one that is rooted in information theory. For the same reasons, it captures the intuitive connection between ordinary accuracy and uncertainty.
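
To make this concrete in code, here is a minimal Python sketch of the calculation, assuming base-2 logarithms, a known prediction range [lo, hi], and that a prediction counts as correct when it lands in the same quantization interval as the actual value; the function name and example data are illustrative, not from the paper.

```python
import math

def knowledge_content(preds, actuals, M, lo, hi):
    """Estimate I, U, and K (in bits) for a sequence of predictions.

    preds, actuals: sequences of real values in [lo, hi]
    M: number of uniform quantization intervals over [lo, hi]
    """
    width = (hi - lo) / M

    def bucket(x):
        # Map x to one of the M uniform intervals; clamp the right endpoint.
        return min(int((x - lo) / width), M - 1)

    N = len(preds)
    # A prediction counts as correct when it lands in the same interval
    # as the actual value.
    E = sum(1 for p, a in zip(preds, actuals) if bucket(p) != bucket(a))

    I = N * math.log2(M)  # total information in the sequence
    U = E * math.log2(M)  # uncertainty contributed by the E errors
    K = I - U             # Knowledge: (N - E) * log2(M)
    return I, U, K, E

# Example: 10 predictions over [0, 1), M = 8 intervals, one deliberate error.
preds   = [0.12, 0.33, 0.48, 0.71, 0.05, 0.91, 0.27, 0.55, 0.63, 0.89]
actuals = [0.11, 0.35, 0.49, 0.70, 0.06, 0.52, 0.26, 0.57, 0.64, 0.88]
I, U, K, E = knowledge_content(preds, actuals, M=8, lo=0.0, hi=1.0)
print(f"N=10, E={E}: I={I:.0f} bits, U={U:.0f} bits, K={K:.0f} bits")
```

Note that K grows with N even at fixed accuracy, which is the point: 95 correct predictions out of 100 carry more Knowledge, under this measure, than 19 correct out of 20.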

