
# /aggregate/kmeans

```
URL: http://<db.host>:<db.port>/aggregate/kmeans
```

This endpoint runs the k-means algorithm, a heuristic algorithm that attempts
to perform k-means clustering. An ideal k-means clustering algorithm selects k
points such that the sum of the mean squared distances of each member of the
set to the nearest of the k points is minimized. The k-means algorithm, however,
does not necessarily produce such an ideal clustering. It begins with a randomly
selected set of k points, refines the locations of those points iteratively,
and settles at a local minimum. Various parameters and options are
provided to control the heuristic search.
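
The iterative refinement described above can be illustrated with a minimal sketch of a Lloyd-style k-means loop in plain Python. This is not Kinetica's implementation: the function name `kmeans_sketch`, the `seed` argument, and the choice to measure convergence as the largest movement of any mean between successive iterations are all assumptions made for illustration.

```python
import math
import random


def kmeans_sketch(points, k, tolerance, max_iters=10, seed=0):
    """Illustrative Lloyd-style k-means; NOT Kinetica's implementation."""
    rng = random.Random(seed)
    # Begin with a randomly selected set of k points.
    means = [list(p) for p in rng.sample(points, k)]
    shift = float("inf")
    num_iters = 0
    while num_iters < max_iters and shift >= tolerance:
        # Assignment step: attach each point to its nearest mean.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, means[i]))
            clusters[nearest].append(p)
        # Update step: move each mean to the centroid of its members.
        new_means = []
        for i, members in enumerate(clusters):
            if members:
                new_means.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_means.append(means[i])  # an empty cluster keeps its mean
        # Assumed convergence measure: largest movement of any mean.
        shift = max(math.dist(a, b) for a, b in zip(means, new_means))
        means = new_means
        num_iters += 1
    return means, shift, num_iters


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
means, shift, num_iters = kmeans_sketch(points, k=2, tolerance=1e-6)
print(sorted(means), shift, num_iters)
```

Because the starting points are random, a single run can settle at a poor local minimum on harder data sets; repeating the run from different starting points and keeping the best result is the usual mitigation.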

NOTE:  The Kinetica instance being accessed must be running a CUDA (GPU-based)
build to service this request.

## Input Parameter Description

<ParamField body="table_name" type="string">
  Name of the table on which the operation will be performed. Must be an existing table, in \[schema\_name.]table\_name format, using standard [name resolution rules](../../concepts/tables/#table-name-resolution).
</ParamField>

<ParamField body="column_names" type="array of strings">
  List of column names on which the operation will be performed. If n columns are provided, then each of the k result points will have n dimensions, corresponding to the n columns.
</ParamField>

<ParamField body="k" type="int">
  The number of mean points to be determined by the algorithm.
</ParamField>

<ParamField body="tolerance" type="double">
  Stop iterating when the distance between successive points is less than the given tolerance.
</ParamField>

<ParamField body="options" type="map of string to strings">
  Optional parameters.

  The default value is an empty map ( \{} ).

  <Expandable title="options">
    <ParamField body="whiten">
      When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
    </ParamField>

    <ParamField body="max_iters">
      Number of times to try to reach the tolerance limit before giving up. The default is 10.
    </ParamField>

    <ParamField body="num_tries">
      Number of times to run the k-means algorithm, each with a different randomly selected set of starting points. This helps avoid settling at a poor local minimum. The default is 1.
    </ParamField>

    <ParamField body="create_temp_table">
      If *true*, a unique temporary table name will be generated in the sys\_temp schema and used in place of *result\_table*. If *result\_table\_persist* is *false* (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in *qualified\_result\_table\_name*.

      The default value is `false`.

      The supported values are:

      * true
      * false
    </ParamField>

    <ParamField body="result_table">
      The name of a table used to store the results, in \[schema\_name.]table\_name format, using standard [name resolution rules](../../concepts/tables/#table-name-resolution) and meeting [table naming criteria](../../concepts/tables/#table-naming-criteria). If this option is specified, the results are not returned in the response.
    </ParamField>

    <ParamField body="result_table_persist">
      If *true*, then the result table specified in *result\_table* will be persisted and will not expire unless a *ttl* is specified. If *false*, then the result table will be an in-memory table and will expire unless a *ttl* is specified otherwise.

      The default value is `false`.

      The supported values are:

      * true
      * false
    </ParamField>

    <ParamField body="ttl">
      Sets the [TTL](../../concepts/ttl/) of the table specified in *result\_table*.
    </ParamField>
  </Expandable>
</ParamField>
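
As an illustration of how the input parameters above fit together, the following sketch assembles a request body with the Python standard library. The helper name `build_kmeans_request`, the host and port, the table name `example.points`, and the column names are hypothetical. It assumes the endpoint accepts a JSON body via HTTP POST; authentication is omitted. Since *options* is a map of string to strings, every option value is converted to a string.

```python
import json
import urllib.request


def build_kmeans_request(base_url, table_name, column_names, k, tolerance, options=None):
    """Build (but do not send) a POST request for /aggregate/kmeans."""

    def as_option_string(value):
        # Booleans become "true"/"false"; everything else uses str().
        return str(value).lower() if isinstance(value, bool) else str(value)

    payload = {
        "table_name": table_name,
        "column_names": list(column_names),
        "k": int(k),
        "tolerance": float(tolerance),
        "options": {key: as_option_string(v) for key, v in (options or {}).items()},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/aggregate/kmeans",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


# Hypothetical host, port, table, and columns.
request = build_kmeans_request(
    "http://localhost:9191",
    "example.points",
    ["x", "y"],
    k=3,
    tolerance=0.001,
    options={"whiten": 1, "max_iters": 20, "num_tries": 5},
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

The request object could then be passed to `urllib.request.urlopen` against a reachable instance; the sketch stops short of sending it so that it runs without a server.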

## Output Parameter Description

The Kinetica server embeds the endpoint response inside a standard response structure which contains status information and the actual response to the query.  Here is a description of the various fields of the wrapper:

<ResponseField name="status" type="String">
  'OK' or 'ERROR'
</ResponseField>

<ResponseField name="message" type="String">
  Empty on success; otherwise, an error message
</ResponseField>

<ResponseField name="data_type" type="String">
  'aggregate\_k\_means\_response' or 'none' in case of an error
</ResponseField>

<ResponseField name="data" type="String">
  Empty string
</ResponseField>

<ResponseField name="data_str" type="JSON or String">
  This embedded JSON represents the result of the /aggregate/kmeans endpoint:

  <Expandable title="data_str">
    <ResponseField name="means" type="array of arrays of doubles">
      The k-mean values found.
    </ResponseField>

    <ResponseField name="counts" type="array of longs">
      The number of elements in the cluster closest to the corresponding k-means values.
    </ResponseField>

    <ResponseField name="rms_dists" type="array of doubles">
      The root mean squared distance of the elements in the cluster for each of the k-means values.
    </ResponseField>

    <ResponseField name="count" type="long">
      The total count across all the clusters. This will be the size of the input table.
    </ResponseField>

    <ResponseField name="rms_dist" type="double">
      The sum of all the rms\_dists. This is the value the k-means algorithm is attempting to minimize.
    </ResponseField>

    <ResponseField name="tolerance" type="double">
      The distance between the last two iterations of the algorithm before it quit.
    </ResponseField>

    <ResponseField name="num_iters" type="int">
      The number of iterations the algorithm executed before it quit.
    </ResponseField>

    <ResponseField name="info" type="map of string to strings">
      Additional information.

      The default value is an empty map ( \{} ).

      <Expandable title="info">
        <ResponseField name="qualified_result_table_name">
          The fully qualified name of the result table (i.e. including the schema) used to store the results.
        </ResponseField>
      </Expandable>
    </ResponseField>
  </Expandable>

  Empty string in case of an error.
</ResponseField>
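
Because the endpoint result is embedded in the wrapper's *data\_str* field, a client typically needs two decoding steps. The sketch below shows one way to do this with the standard library; the helper name `parse_kmeans_response` is hypothetical, and the sample response contains made-up values shaped after the fields described above, not real server output.

```python
import json


def parse_kmeans_response(raw):
    """Unwrap the standard response structure and decode data_str."""
    wrapper = json.loads(raw)
    if wrapper.get("status") != "OK":
        raise RuntimeError(wrapper.get("message") or "request failed")
    data_str = wrapper.get("data_str")
    # data_str may arrive as a JSON-encoded string or as already-decoded JSON.
    return json.loads(data_str) if isinstance(data_str, str) else data_str


# Made-up sample response for illustration only.
sample_raw = json.dumps({
    "status": "OK",
    "message": "",
    "data_type": "aggregate_k_means_response",
    "data": "",
    "data_str": json.dumps({
        "means": [[0.0, 0.5], [10.0, 10.5]],
        "counts": [2, 2],
        "rms_dists": [0.5, 0.5],
        "count": 4,
        "rms_dist": 1.0,
        "tolerance": 0.0,
        "num_iters": 3,
        "info": {},
    }),
})

result = parse_kmeans_response(sample_raw)
# means, counts, and rms_dists are parallel arrays, one entry per cluster.
for mean, count, rms in zip(result["means"], result["counts"], result["rms_dists"]):
    print(mean, count, rms)
```

If the *result\_table* option was specified, the results are stored in that table rather than returned, so a client would instead read *qualified\_result\_table\_name* from *info*.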
