I started thinking about correlation again, and it dawned on me that when we do math and science, we’re looking for computable relationships between sets of numbers. The most basic example is a function that describes the behavior of a system. Ideally, we have exact descriptions, but we often settle for probabilistic descriptions, or descriptions that are exact within some tolerance. I’d say that the search for computable relationships between sets of numbers is, in fact, science. But when you consider the full set of relationships between two sets of numbers, the set we’re interested in is minuscule in terms of its density. In the extreme case, the set of computable functions has a density of zero compared to the set of non-computable functions. Yet science and mathematics have come to the point where we can describe the behaviors of faraway worlds and nanoscopic terrestrial systems.
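To make the density claim a bit more precise, here’s a sketch of the standard counting argument (my framing, not part of the original note): every computable function corresponds to at least one finite program, so there are only countably many of them, whereas the set of all functions is uncountable.

```latex
% Sketch of the counting argument (added here for illustration):
% computable functions are countable, all functions are not.
\begin{align*}
\left|\{\text{computable } f : \mathbb{N} \to \mathbb{N}\}\right|
  &\le \left|\{\text{finite programs}\}\right| = \aleph_0,\\
\left|\{\, f : \mathbb{N} \to \{0,1\} \,\}\right|
  &= 2^{\aleph_0} > \aleph_0.
\end{align*}
% So the computable functions form a vanishingly small subset of all functions.
```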
Therefore, it’s tempting to think that the computable is everything, and the balance of the relationships is nonsense. For physical intuition, consider a Picasso painting represented as an RGB image, and compare that to the full set of random RGB images of the same size. What’s strange is that the set in question will contain other known masterpieces, and new unknown masterpieces, if the image is large enough and the pixel size is small enough. And while I haven’t done the math, I’d wager the density of masterpieces is quite small; otherwise we’d all be famous painters, and since we’re not, I don’t think you can gamble your way into a masterpiece.
Similarly, if I have two sets of random numbers, and I simply connect them with strings, you’d probably think I’m a lunatic, though I’ve just defined a function. Whereas if I point out that I can inject the set of integers into the set of even integers, you’d swear I’m a genius. This might seem like unscientific thinking, but it isn’t. It’s arguably all the same, in that humans have a clear preference for computability, and that translates into a preference for symmetry over asymmetry. Personally, I listen to a lot of strange, highly random music, and enjoy Gregory Chaitin’s work as well, but in all seriousness, are we missing valuable information in the form of more complex functions, and perhaps tragically in the form of non-computable functions, assuming no non-computable machine exists?
I’m certainly not a scholar on the topic, but I am interested in the history of Machine Learning, and this morning, I discovered a concept known as the Fisher Information. This is the same Sir Ronald Fisher who introduced the Iris Dataset in 1936, which is most certainly a Machine Learning dataset, though it predates the first true computer, the ENIAC, which was completed in 1945. The point being that the Iris Dataset itself is way ahead of its time, using measurable characteristics of various flowers to determine the species of each flower. This is a deep idea, in that you have the mathematical classification of species, which I would argue goes beyond the anatomical, and brings biology into the mathematical sciences.
But on top of this, and what seem to be many other achievements I don’t know much about, he had a really clever idea about how much information observations carry about a parameter. Specifically, how much does a given probability distribution f(x; θ) change as a function of the parameter θ? His answer was to look at the derivative of f(x; θ) with respect to θ, though the specific formula used is a bit more complicated. Nonetheless, the basic idea is, how sensitive is a distribution to one of its parameters, and what does that tell me?
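For reference, the standard textbook statement of this idea (my addition, not a formula from the original note) measures the expected squared sensitivity of the log-density to the parameter:

```latex
% Fisher information of a parameter theta, given a density f(x; theta):
% the expected squared derivative of the log-likelihood with respect to theta.
I(\theta) = \mathbb{E}\!\left[\left(\frac{\partial}{\partial \theta} \log f(X;\theta)\right)^{2} \,\middle|\, \theta \right]
```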
This is exactly what Machine Learning engineers do all the time, which is to test the relevance of a dimension. Just imagine you had a dataset with dimensions x_1 through x_n, and that you have a prediction function on that dataset, F(x_1, …, x_n). Now imagine you add a set of weights w_i ∈ [0, 1], for i = 1, …, n, so that you instead consider the function F(w_1 x_1, …, w_n x_n). That is, we’ve added weights that will reduce the contribution of each dimension simply by multiplying by a constant in [0, 1]. This is one of the most basic things you’ll learn in Machine Learning, and the rate of change in accuracy as a function of each w_i will provide information about how important each dimension is to the prediction function. This is basically what Fisher did, except almost one hundred years ago, effectively discovering a fundamental tool of Machine Learning.
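Here’s a minimal sketch of that weight-sensitivity test. The dataset and the prediction function F are toy constructions invented purely for illustration; this is not code from any of the papers mentioned here.

```python
import numpy as np

# A minimal sketch of the weight-sensitivity idea described above.
# The data and the prediction function F are toy constructions.

rng = np.random.default_rng(0)
n_samples, n_dims = 500, 4

# Synthetic data: only dimensions 0 and 2 actually matter to the label.
X = rng.normal(size=(n_samples, n_dims))
y = (2.0 * X[:, 0] - 1.5 * X[:, 2] > 0).astype(int)

def F(X_weighted):
    # A fixed prediction function applied to (w_1*x_1, ..., w_n*x_n).
    score = 2.0 * X_weighted[:, 0] - 1.5 * X_weighted[:, 2]
    return (score > 0).astype(int)

def accuracy(w):
    return np.mean(F(X * w) == y)

baseline = accuracy(np.ones(n_dims))
for i in range(n_dims):
    w = np.ones(n_dims)
    w[i] = 0.5  # dampen dimension i by a constant in [0, 1]
    print(f"dim {i}: accuracy change = {accuracy(w) - baseline:+.3f}")
```

Running this, accuracy drops when dimensions 0 or 2 are dampened and is unchanged for dimensions 1 and 3, which is exactly the signal the weights are meant to expose.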
The point is more than just historical: I think Machine Learning is a buzzword used to cover up the fact that a lot of this stuff was known a long time ago, that Artificial Intelligence is, generally speaking, far more advanced than the public realizes, and that, as a matter of logical implication, most of what we believe to be new and exciting breakthroughs are often mundane adaptations of existing methods and technology. The fact that so much money is being poured into the market is disturbing, because I have no idea what these people do all day.
I’ve noticed in the past that Finns have significantly higher IQs than the Swedes and Norwegians. This is, in my opinion, the group of people to study if you’re interested in the nature of intelligence, because they’re all very similar people, from roughly equally rich nations, in the same part of the world, which should allow innate ability to take control. One notable difference is that the Finns speak a Uralic language, whereas the Norwegians and Swedes speak Germanic languages. There could be something to this, but investigating the problem again today led me to what seems an inescapable conclusion: whatever the connection is between mtDNA and intelligence, it simply cannot account for the distribution of IQ as it exists.
Instead, I now believe that brain structure is the most important factor in intelligence, and brain structure simply cannot be controlled by mtDNA in any credible way. Specifically, my thinking is rooted in algorithmic complexity: if you have two equally powered machines, running different algorithms that accomplish the same task, then the machine with the more efficient algorithm will be the more powerful of the two. Translated to biology, if you have two brains that both consume the same amount of power per unit of time, and have the same “clock rate”, one brain could still be vastly more powerful than the other, due simply to different structure. This could explain, e.g., the fact that some birds can talk, whereas some dogs will eat until they vomit, despite the fact that birds have brain volumes that are a small fraction of a dog’s brain volume.
mtDNA and Intelligence
Despite the apparent complexity of the subject, this is going to be a short note, because the idea that mtDNA controls IQ is apparently nonsense, notwithstanding the scholarship on the topic (not picking on anyone, but here’s a decent article that runs through some credible arguments for the role of mtDNA in intelligence). But as you’ll see, whole-genome sequencing throws the argument in the garbage.
There’s no nice way to say this, but the Roma people have pretty low IQs. What’s most interesting about them, however, is that they are basically identical to each other, and to all other people of that maternal line, including about 100% of Papuans, 67% of Russians, and about 30% of Taiwanese people. If you want to test the results yourself, you can see my paper, “A New Model of Computational Genomics” [1], which includes all the software, and a detailed walkthrough to explain how I end up with these numbers. At a high level, the Papuans, Russians, and Taiwanese people in this group of Roma lineage are all a 99% match to the Iberian Roma, with respect to their mtDNA. If mtDNA controlled intelligence, then all of those populations should have similarly low IQs, since they’re basically identical to the Roma. This is just not true: the Taiwanese have roughly the highest or second-highest average IQ on Earth, and the Russians have roughly the same IQ as the Norwegians and Swedes, despite the fact that Russia is, quite frankly, poor and dysfunctional compared to Norway and Sweden.
One important note: though you’ll often hear that “humans are 98% monkey”, or some nonsense like that, the algorithms in [1] use what’s called a global alignment, and as a consequence, they’re extremely sensitive to changes in position, causing, e.g., the Roma to have little more than chance in common with some people (i.e., about 25% of the mtDNA bases). This sensitivity is probably why the software in [1] is so powerful, and is able to predict ethnicity with about 80% accuracy using mtDNA alone (which is pretty amazing). In contrast, NIH’s BLAST algorithm uses a local alignment, and so it deliberately seeks to maximize the number of matching bases by shifting two genomes around, causing everyone to look the same, and therefore throwing away valuable information about the genome.
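To make the global-versus-local distinction concrete, here’s a minimal sketch of a strict, position-by-position comparison. This is my illustration, not the software from [1], and the sequences are toy data.

```python
# A strict position-by-position comparison between two equal-length sequences.
# No shifting or re-alignment is performed, so an insertion or deletion early
# in one sequence penalizes every downstream position -- which is what makes
# this kind of comparison so sensitive to structural changes.
def positional_match(seq_a: str, seq_b: str) -> float:
    """Fraction of positions where the two sequences carry the same base."""
    assert len(seq_a) == len(seq_b), "sequences must be the same length"
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / len(seq_a)

# Toy example: deleting a single base slides everything after it.
ref     = "ACGTACGTACGTACGT"
shifted = "ACGACGTACGTACGTA"  # base 4 deleted, tail slid left, padded
print(positional_match(ref, shifted))  # far below 1.0 despite near-identity
```

A local aligner would slide the second sequence back into register and report the two as nearly identical; the strict comparison keeps the positional information instead.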
Getting back to the core topic, if you pay attention to this limited set of facts, mtDNA is in the garbage as a driver of intelligence. Moreover, the role of poverty is not exactly clear either, since Russia is really poor compared to Norway and Sweden, and yet they have roughly the same IQs. So what is driving this? Cynically, I think IQ testing is really just testing for basic education (when you look at a map), which is absent in the truly poorest countries, but that doesn’t mean we can’t debunk the connection between mtDNA and intelligence. To be clear, I do think intelligence is genetic, and in anomalous cases like Finland, Cambodia, and Suriname, IQ becomes something interesting, because it’s at least a test. I just doubt it’s mtDNA driving the bus.
Some Answers from Computer Science
Even if we posit arguendo (which is not very nice) that there’s something wrong with Roma mtDNA, this would simply imply that they have low energy per unit of time, perhaps as a function of fixed caloric intake and environment. To make this less abstract, let’s fix a Norwegian guy (not Roma) and a Russian guy (Roma), and give them the same food, education, climate, environment, clothes, etc., over a lifetime. Under this assumption, the Russian guy will produce less energy over his lifetime, and therefore, his brain has a lower output. But this is garbage as an argument, for mechanical reasons: if the Russian guy has a more efficient brain, then he doesn’t need as much power to run his brain. As a consequence, his output over a lifetime could in fact be higher.
To make things completely concrete, if you use a brute force method to sort a list of 10 letters, you’ll have to perform 10! = 3,628,800 calculations. If you instead use my parallel method, you’ll have to make between 3 and 4 calculations. As you can plainly see, there is an ocean between these two approaches to solving even the simple problem of sorting a list. As a consequence, the most sensible answer is, in my opinion, that brain structure controls for intelligence, for the simple reason that it encodes the algorithms we use to solve the problems we face every day. Some people have fast ones, some people have dumb ones, and (probably) most people are somewhere in the middle.
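Here’s a rough illustration of that gap, comparing a brute-force permutation search against a standard O(n log n) merge sort. To be clear, this is not the parallel method mentioned above; it’s just a generic comparison of two algorithms doing the same job on the same machine.

```python
import itertools

# Not the parallel method referenced in the text -- just an illustration of
# how two algorithms for the same task can differ enormously in work done.

letters = list("JIHGFEDCBA")  # a worst-case ordering of 10 letters

# Brute force: examine permutations until one happens to be in order.
# (This loop takes a few seconds, since it walks all 10! orderings here.)
checked = 0
for perm in itertools.permutations(letters):
    checked += 1
    if all(perm[i] <= perm[i + 1] for i in range(9)):
        break
print(f"brute force examined {checked:,} permutations")  # 10! = 3,628,800

# A standard merge sort on the same list, counting comparisons.
comparisons = 0
def merge_sort(a):
    global comparisons
    if len(a) <= 1:
        return a
    left, right = merge_sort(a[:len(a) // 2]), merge_sort(a[len(a) // 2:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

merge_sort(letters)
print(f"merge sort used {comparisons} comparisons")  # tens, not millions
```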
Returning to the birds versus dogs analogy, I think it’s not ridiculous to argue that birds have vastly more efficient brains than dogs: something along the lines of computational efficiency is taking place in the brain of a bird, allowing it to perform complex tasks with a smaller, presumably lower-energy brain. For the same reasons, this could explain the obvious fact that some people are wildly more intelligent than others, despite (possibly) having the same maternal line. Because intelligence varies within a given ethnicity, I can tell you that you are, e.g., Norwegian, with high accuracy, using just your mtDNA, but there’s no way of knowing (to my knowledge) whether you’re one of the dumb ones. This doesn’t preclude identifying deficiencies in mtDNA that will make you dangerously ill, and therefore not very bright, but it just doesn’t make sense that the means of power-production controls the most complex structure in the Universe –
It’s a single bean, in an ocean of genetic information.
I came to the conclusion last night that I may have wasted a lot of time thinking about time as an actual dimension of space. In my defense, I’m certainly not the first physicist or philosopher to do so. Specifically, my entire paper, A Computational Model of Time-Dilation [1], describes time as a measurement of physical change, not a dimension. Nonetheless, it produces the correct equations for time-dilation, without treating time as a dimension, though for convenience, in a few places, I do treat it as a dimension, since a debate on the corporeal nature of time is not the subject of the paper; instead, the point is that you can have objective time and still have time-dilation.
As a general matter, my view now is that reality is a three-dimensional canvas that is updated according to the application of a rule, effectively creating a recursive function. See Section 1.4 of [1]. Because [1] is years old at this point, this is obviously not a “new” view, but one that I’ve returned to, after spending a lot of time thinking about time as an independent dimension that could, e.g., store all possible states of the Universe. The quantum vacuum was one of the primary drivers for that dimensional view: specifically, the idea that other realities temporarily cross over into ours, and because that’s presumably a random interaction, you should have net-zero charge (i.e., equal representation from all charges), momentum, etc., on average, creating an otherwise invisible background to reality, save for extremely close inspection.
I’m not aware of any experiment that warrants such an exotic assumption, and I’m not even convinced the quantum vacuum is real. As such, I think it is instead rational to reject the idea of a space of time, until there is an experiment that, e.g., literally looks into the future, as opposed to predicting the future using computation.
I’ll concede the recursive function view of reality has some problems without time as a dimension, because it must be implemented in parallel, everywhere in space; otherwise, e.g., one system would update its states whereas another wouldn’t, creating a single reality with multiple independent timelines. This is not true at our scale, and I don’t think there’s any experiment that shows it’s true at any scale. So if time doesn’t really exist as a dimension, we still need the notion of synchronization, which is, in all fairness, typically rooted in time. But that doesn’t imply time is some form of memory of the past, or some projection of the future.
This is plainly an incomplete note, but the point is to reject the exotic assumptions that are floating around in modern physics, in favor of something that is far simpler, yet works. Reality as a recursive function makes perfect sense: the present moment is transformed everywhere, producing the next moment, which then becomes the present, with no record of the past, other than by inference from the present moment.
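As a toy illustration of that picture (my sketch, not the model from [1]), here’s a three-dimensional grid of state updated everywhere by one fixed local rule, with no history retained:

```python
import numpy as np

# A minimal sketch of the "recursive function" picture described above:
# a 3D grid of state, updated everywhere in parallel by one fixed local rule,
# with no record of past states. The rule here (a toy diffusion-style average)
# is invented purely for illustration; it is not the model in [1].

def step(state: np.ndarray) -> np.ndarray:
    """Apply one global, synchronous update: next = rule(present)."""
    neighbors = sum(np.roll(state, shift, axis)
                    for axis in range(3) for shift in (-1, 1))
    return 0.5 * state + 0.5 * (neighbors / 6.0)

state = np.random.default_rng(1).random((16, 16, 16))  # the present moment
for _ in range(10):
    state = step(state)  # the old present is overwritten; no history is kept
```

The key property is that the rule is applied synchronously everywhere, which is exactly the synchronization requirement conceded above.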
We’re still left with the peculiar fact that all of mathematics seems immutable (e.g., all theorems of combinatorics govern reality, in a manner that is more primary than physics, since they can never change or be wrong), but that doesn’t imply time is a dimension. Instead, my view is that mathematics is beyond causation, and simply an aspect of the fabric of reality, whereas physics is a rule that is applied to the substance contained in reality, specifically energy. Physics doesn’t seem to change, but it could; mathematics, in contrast, will never change, because that’s just not possible.