In a previous article, I introduced an algorithm that partitions data observed over time, as an analog of my core image partitioning algorithm, which operates over space. I’ve since refined the technique, and below is an additional algorithm that partitions a dataset (e.g., an audio wave file) given a fixed compression percentage. It runs in about 0.5 seconds per 1 second of audio (44,100 Hz mono).
Here are the facts:
If you fix compression at about 90%, leaving a file that is 10% of the original size, you get a decent-sounding file that plainly preserves the visual structure of the underlying wave. Compression in this case takes about 0.5 seconds per 1 second of underlying mono audio, which is close to real time (run on an iMac). If you instead want the algorithm to solve for the “ideal” compression percentage, that takes about 1 minute per 1 second of underlying mono audio, which is obviously not real time, but still not that bad.
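To make the fixed-percentage arithmetic concrete, here is a minimal sketch in Python. It is not the attached code: it assumes the simplest possible partition (equal-length blocks, each replaced by its average), and the function name `compress_by_block_average` and the parameter `keep_fraction` are hypothetical. At 44,100 Hz, keeping 10% of the data leaves 4,410 values per second of audio.

```python
import math


def compress_by_block_average(samples, keep_fraction):
    """Replace each block of consecutive samples with its average.

    keep_fraction is the share of the original length that is kept,
    e.g. 0.10 for 90% compression. Equal-length blocks are an
    assumption of this sketch, not the author's partitioning rule.
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    n = len(samples)
    if n == 0:
        return []
    # Number of averages to keep; never fewer than one.
    n_blocks = max(1, round(n * keep_fraction))
    averages = []
    for b in range(n_blocks):
        # Integer arithmetic spreads any remainder evenly across blocks.
        start = b * n // n_blocks
        stop = (b + 1) * n // n_blocks
        block = samples[start:stop]
        averages.append(sum(block) / len(block))
    return averages


# One second of a 440 Hz sine wave sampled at 44,100 Hz (synthetic test data).
sample_rate = 44100
wave = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]

# 90% compression: keep 10% of the original length.
compressed = compress_by_block_average(wave, 0.10)
print(len(wave), len(compressed))
```

Each kept value summarizes about ten original samples, so the coarse outline of the wave survives while the fine detail is averaged away.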
What’s interesting is that not only is the audio quality pretty good, even when, e.g., compressing away 98% of the underlying audio data, but you also preserve the visual shape of the wave (see above). For a simple spoken audio dataset, my algorithm places the “ideal” compression percentage at around 98%. That figure is not keyed off of any normal notions of compression, because it’s not designed for humans, but is instead designed for a machine to be able to make sense of the actual underlying data, despite compression. So, even if you think the ostensibly ideal compression percentage sounds like shit, the reality is that you get a wave file that contains the basic structure of the original underlying wave, with radical compression, which, I’ll admit, I’ve yet to really unpack and apply to any real tasks (e.g., speech recognition, which I’m just starting). However, if your algorithms (e.g., classification or prediction) work on the underlying wave file, then it is at least plausible at this point that they would also work on the compressed wave, and that’s what basically all of my work in A.I. is based upon:
Compression for machines, not people.
And so, any function that turns on macroscopic structure, and not particulars, which is obviously the case for many tasks in A.I., like classification, can probably be informed by a dataset that makes use of far more compression than a human being would like. Moreover, if you have fixed capacity for parallel computing, then these types of algorithms allow you to take an underlying signal and compress it radically, so that you can then run multiple threads that, e.g., apply multiple experiments to the same compressed signal. Note that the algorithm is not yet fully vectorized, though it plainly can be, because it relies upon the calculation of independent averages. So even if, e.g., Octave doesn’t allow for full vectorization in this case, as a practical matter, you can definitely make it happen, because the calculations are independent.
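The independence point can be sketched in a few lines of Python. This is an illustration under assumptions, not the attached code: the blocks are equal-length, and the helper names `block_bounds`, `block_mean`, and `parallel_block_means` are hypothetical. Each average reads only its own slice of the signal, so the averages can be handed to separate workers in any order.

```python
from concurrent.futures import ThreadPoolExecutor


def block_bounds(n, n_blocks):
    """Return (start, stop) index pairs splitting range(n) into n_blocks pieces."""
    return [(b * n // n_blocks, (b + 1) * n // n_blocks) for b in range(n_blocks)]


def block_mean(samples, bounds):
    """Average one block; it touches no data outside its own slice."""
    start, stop = bounds
    return sum(samples[start:stop]) / (stop - start)


def parallel_block_means(samples, n_blocks, max_workers=4):
    """Compute every block average as an independent task."""
    if not 1 <= n_blocks <= len(samples):
        raise ValueError("n_blocks must be between 1 and len(samples)")
    bounds = block_bounds(len(samples), n_blocks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order, whatever order tasks finish in.
        return list(pool.map(lambda b: block_mean(samples, b), bounds))


signal = list(range(8))
means = parallel_block_means(signal, 4)
print(means)
```

In CPython, threads will not speed up this pure-Python arithmetic; the sketch only shows that the per-block work shares no state, which is the property that makes vectorization or true parallel execution possible.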
Attached is the applicable code, together with a simple audio dataset of me counting in English.