k-Means Algorithm

Oracle Data Mining implements an enhanced version of the k-Means algorithm with the following features:

The clusters discovered by enhanced k-Means are used to generate a Bayesian probability model that is then used during scoring (model apply) for assigning data points to clusters. The k-Means algorithm can be interpreted as a mixture model where the mixture components are spherical multivariate normal distributions with the same variance for all components.


Note:

The k-Means algorithm samples one million rows. You can use the sample to build the model.