Attached is a completely vectorized, highly efficient version of my original clustering algorithm. Its runtime is drastically shorter, and comparable to my real-time algorithms, though it is based upon my original technique of iterating through different levels of discernment until we find the level that generates the greatest change in the entropy of the categorization. Also attached is a command line script that demonstrates how to apply it to a dataset.
Discover more from Information Overload
Subscribe to get the latest posts sent to your email.