On Distributions of Complexities

Last night, I came to the rather obvious, but in my opinion fascinating conclusion, that there are more binary N \times N matrices than there are graphs with N vertices. Stated differently, not all binary matrices correspond to graphs. This is because of the way graphs are generally represented as matrices, where entry (i,j) is 1 only if vertex i is adjacent to vertex j. In an ordinary, undirected graph, a vertex is never adjacent to itself, and because the edges are undirected, if vertex i is adjacent to vertex j, then vertex j is adjacent to vertex i. As a result, if a given matrix represents a graph, then it must be the case that entry (i,i) = 0, since a vertex is never adjacent to itself, and entry (i,j) = (j,i), since the edges are undirected. Therefore, the set of N \times N matrices that correspond to graphs is smaller than the set of all N \times N matrices.

In fact, you can easily show that the portion of N \times N matrices that correspond to graphs goes to zero as N increases. There are a maximum of N choose 2 edges in a graph of order N. Therefore, the number of labelled graphs on N vertices is 2^{N \choose 2} = 2^{\frac{N(N-1)}{2}} <  A = 2^{\frac{N^2}{2}}. In contrast, the number of N \times N matrices is B = 2^{N^2}. The ratio \frac{A}{B} = 2^{- \frac{N^2}{2}}, which approaches zero as N approaches infinity. Since you can trivially construct a binary string from a binary matrix by appending row i + 1 to the end of row i, it follows that the portion of binary strings of length N that correspond to graphs approaches zero as N approaches infinity.

As a result, graphs, as a class of mathematical objects, are fewer in number than binary strings for any given order.

This might seem like it’s just some high school combinatorics, but it points to something profound, if you consider this fact in the context of complexity. Specifically, you can show that most strings of a given length are random in the sense that their Kolmogorov complexity is approximately equal to their length. Stated mathematically, if x is a string, and x is Kolmogorov-random, then K(x) \approx ||x||. This implies that compressible strings are actually rare, despite the fact that most of what we see in the world is compressible. That is, the everyday objects you see have significant symmetries, which necessarily implies that they are not Kolmogorov-random, using any sensible representation of the object. For example, the laptop I’m using to write this article is symmetrical both horizontally and vertically, down to reasonable levels of measurement. As a result, I can reconstruct a reasonable image of the entire laptop given only a fraction of an image of the laptop.

This is also true of physics itself at our scale of observation, in that the outputs of Newton’s equations are never Kolmogorov-random, since they are by definition computable. That is, because Newtonian physics is written in the language of computable functions, it is necessarily the case that the behavior of Newtonian objects is compressible. For example, if you track the path of a baseball thrown across home plate, you will produce a curve that is Newtonian, and therefore, of a very low complexity, when constructed using any sensible scale of observation. It is, therefore, at least anecdotally fair to conclude that the distribution of complexities of everyday objects differs significantly from the distribution of complexities of binary strings. Specifically, most binary strings are Kolmogorov-random, whereas most everyday objects are not.

This suggests a purely mathematical question about the distribution of complexities over sets of mathematical objects. Specifically, in this case, what is the distribution of the complexities over the set of graphs of a given order?

Intuitively, it seems like graphs should have a distribution of complexities that is similar to the distribution of complexities of binary strings. Let’s imagine a binary string s = a_1, a_2, \ldots, a_N. If each a_i = 0, then the string will have a low complexity. Similarly, if each a_j = 1, then the string will have a low complexity. In fact, the complexity of the compliment of any given string can differ by at most a constant from the original string, since we can write a simple program that can take the compliment of any given string, the complexity of which will not depend upon the string in question. Therefore, the average complexity over all strings with a given number of bits set to one is symmetrical as a function of the number of bits that are set to one.

Now let’s consider graphs using analogous reasoning, and consider a graph as a set of N \choose 2 edges, each of which is either included or not. If we begin with an empty graph, intuitively, this graph should have a low complexity, since it contains no edges. This is supported by the associated graph matrix, which will consist of nothing but zeros. Similarly, if we consider a complete graph, which will include all possible edges, this graph should again have a low complexity, since it has a simple rule of construction: connect every pair of vertices. This is again supported by the associated graph matrix, which will consist of nothing but ones (except along the diagonal). Again, the complexity of a graph and its compliment can differ by at most a constant, since we can write a simple program that calculates the complement of any graph, that will not depend upon the graph in question. Therefore, the average complexity for a graph with a given number of edges will again be symmetrical as a function of the number of edges. Note that we are ignoring yet another detail, which is that we will need a separate program that takes a matrix, and actually displays a graph. But again, you can write a rudimentary program that does exactly this, the complexity of which will not depend upon the graph in question.

As a result, we have established that the distribution of complexities over the set of binary strings necessarily has some structure in common with the distribution of complexities over the set of graphs. At the same time, the portion of strings that correspond to graph matrices approaches zero as the length of the string approaches infinity. This suggests that the set of binary strings contains a subset that is, in terms of the distribution of its complexities, similar to itself.


Discover more from Information Overload

Subscribe to get the latest posts sent to your email.

Leave a comment