I’ve developed an algorithm that can generate a cluster for a single input vector in a fraction of a second. See “no_model_optimize_cluster” using the link below.
This will allow you to extract items that are similar to a given input vector from a dataset without any prior training, basically instantaneously.
Further, I presented a related hypothesis that there is a single objective value that warrants distinction for any given dataset in this research note:
To test this hypothesis again, I’ve also included a script that repeatedly calls the clustering function over an entire dataset, and measures the norm of the difference between the items in each cluster.
The resulting difference appears to be very close to the value of delta generated by my categorization algorithm, providing further evidence for this hypothesis.
The code is available here: Real-time Clustering
Discover more from Information Overload
Subscribe to get the latest posts sent to your email.