URL: http://<db.host>:<db.port>/aggregate/kmeans
This endpoint runs the k-means algorithm, a heuristic that attempts to perform k-means clustering. An ideal k-means clustering selects k points such that the sum of the squared distances from each member of the set to the nearest of the k points is minimized. The k-means algorithm, however, does not necessarily produce such an ideal clustering. It begins with a randomly selected set of k points, then iteratively refines their locations until it settles at a local minimum. Various parameters and options are provided to control the heuristic search. NOTE: The Kinetica instance being accessed must be running a CUDA (GPU-based) build to service this request.
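
To illustrate the heuristic described above, here is a minimal pure-Python sketch of a Lloyd-style k-means loop. It is not the server's GPU implementation. The function name, the `max_iterations` cap, and the `seed` argument are assumptions introduced for illustration only. `k` and `tolerance` play the roles described for the input parameters of the same names.

```python
import random


def kmeans(points, k, tolerance, max_iterations=100, seed=0):
    """Illustrative k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Begin with k randomly selected input points as the initial centers.
    centers = rng.sample(points, k)

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # max_iterations is a hypothetical safety cap, not an endpoint parameter.
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)

        # Update step: move each center to the mean of its members.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centers.append(mean)
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[i])

        # Stop once no center moved by the tolerance or more.
        shift = max(sq_dist(old, new) ** 0.5
                    for old, new in zip(centers, new_centers))
        centers = new_centers
        if shift < tolerance:
            break
    return centers
```

Because the starting points are random, different seeds can settle at different local minima on less well-separated data, which is why the result is not guaranteed to be the ideal clustering.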

Input Parameter Description

table_name
string
Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
column_names
array of strings
List of column names on which the operation will be performed. If n columns are provided, each of the k result points will have n dimensions, one per column.
k
int
The number of mean points to be determined by the algorithm.
tolerance
double
Stop iterating when the distances between successive points are less than the given tolerance.
options
map of string to strings
Optional parameters. The default value is an empty map ( {} ).
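
As a sketch of how the input parameters above fit together, the following builds a JSON request body from them. The helper name and the example table and column names are hypothetical. Sending the body as an HTTP POST with a JSON content type is an assumption not stated in this section, so no network call is shown.

```python
import json


def build_kmeans_request(table_name, column_names, k, tolerance, options=None):
    """Serialize the documented input parameters into a JSON string."""
    body = {
        "table_name": table_name,            # [schema_name.]table_name
        "column_names": list(column_names),  # n columns give n-dimensional points
        "k": int(k),                         # number of mean points
        "tolerance": float(tolerance),       # stopping threshold
        "options": dict(options or {}),      # defaults to an empty map
    }
    return json.dumps(body)


# Hypothetical table and columns, for illustration only.
request_body = build_kmeans_request("example_schema.example_table", ["x", "y"], 3, 0.001)
```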

Output Parameter Description

The Kinetica server embeds the endpoint response inside a standard response structure which contains status information and the actual response to the query. Here is a description of the various fields of the wrapper:
status
String
‘OK’ or ‘ERROR’
message
String
Empty if success or an error message
data_type
String
‘aggregate_k_means_response’ or ‘none’ in case of an error
data
String
Empty string
data_str
JSON or String
This embedded JSON represents the result of the /aggregate/kmeans endpoint. Empty string in case of an error.
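
The wrapper described above can be unpacked along these lines. This is a minimal sketch: the helper name is hypothetical, and because the fields inside data_str are not listed in this section, the decoded result is returned as-is without assuming any field names.

```python
import json


def unwrap_response(raw):
    """Return the decoded endpoint result, or raise if the status is not OK."""
    wrapper = json.loads(raw)
    if wrapper.get("status") != "OK":
        # On failure, message carries the error text and data_str is empty.
        raise RuntimeError(wrapper.get("message") or "request failed")
    payload = wrapper.get("data_str")
    # data_str is documented as "JSON or String": decode it when it arrives
    # as a JSON-encoded string, otherwise pass the embedded object through.
    if isinstance(payload, str):
        return json.loads(payload)
    return payload
```

Checking status before touching data_str avoids trying to decode the empty string that is returned in the error case.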