Autonomous Noise Filtering

I’ve updated Prometheus to include autonomous noise filtering.

This means that you can give it data that has dimensions that you’re not certain contribute to the classification, and might instead be noise. This allows Prometheus to take datasets that might currently produce very low accuracy classifications, and autonomously eliminate dimensions until it produces accurate classifications.

It can handle significant amounts of noise:

I’ve given it datasets where 50% of the dimensions were noise, and it was able to uncover the actual dataset within a few minutes.

In short, you can give it garbage, and it will turn into gold, on its own.

Since I’ve proven that it’s basically mathematically impossible to beat nearest neighbor using real-world Euclidean data, and I’ve come up with a vectorized implementation of nearest neighbor, this version of Prometheus uses only nearest neighbor-based methods.

As a result, the speed is insane.

If you don’t use filtering, classifications occur basically instantaneously on a cheap laptop. If you do have some noise, it still takes only a few minutes for a dataset of a few hundred vectors to be processed, even on a cheap laptop.

Attached should be all the code you need to run it, together with a command line script that demonstrates how to use it. But, it’s probably easier to simply download the my full library as a zip file from my researchgate blog.

Enjoy, and if you’re interested in a commercial version, please let me know.

calculate_std_dev

CMND_LINE

express_normalize_dataset

find_NN

find_NN_dataset

generate_categories_N

optimize_categories_N