public class AggregateKMeansRequest extends Object implements org.apache.avro.generic.IndexedRecord
A set of parameters for GPUdb.aggregateKMeans.
This endpoint runs the k-means algorithm, a heuristic that attempts to perform k-means clustering. An ideal k-means clustering algorithm selects k points such that the sum of the mean squared distances of each member of the set to the nearest of the k points is minimized. The k-means algorithm, however, does not necessarily produce such an ideal clustering. It begins with a randomly selected set of k points, refines their locations iteratively, and settles at a local minimum. Various parameters and options are provided to control the heuristic search.
NOTE: The Kinetica instance being accessed must be running a CUDA (GPU-based) build to service this request.
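The iterative refinement described above can be illustrated with a minimal, CPU-only sketch of Lloyd-style k-means. This is not Kinetica's implementation; the class name KMeansSketch, the method names, and the choice to pass explicit starting centers (rather than random ones) are assumptions made for illustration. The tolerance and maxIters arguments play the roles that tolerance and MAX_ITERS play in this request.

```java
import java.util.Arrays;

/** Illustrative Lloyd-style k-means; a sketch, not the endpoint's algorithm. */
final class KMeansSketch {

    /**
     * Refines the starting centers until no center moves by at least
     * {@code tolerance} (Euclidean distance) or {@code maxIters} passes have run.
     */
    static double[][] refine(double[][] points, double[][] start,
                             double tolerance, int maxIters) {
        int k = start.length;
        int dims = start[0].length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = Arrays.copyOf(start[c], dims);
        }

        for (int iter = 0; iter < maxIters; iter++) {
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];

            // Assignment step: attach each point to its nearest center.
            for (double[] p : points) {
                int nearest = 0;
                double best = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double d = squaredDistance(p, centers[c]);
                    if (d < best) {
                        best = d;
                        nearest = c;
                    }
                }
                counts[nearest]++;
                for (int j = 0; j < dims; j++) {
                    sums[nearest][j] += p[j];
                }
            }

            // Update step: move each center to the mean of its members.
            double maxShift = 0.0;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                double[] moved = new double[dims];
                for (int j = 0; j < dims; j++) {
                    moved[j] = sums[c][j] / counts[c];
                }
                maxShift = Math.max(maxShift,
                        Math.sqrt(squaredDistance(moved, centers[c])));
                centers[c] = moved;
            }

            // Stop once successive center positions differ by less than the tolerance.
            if (maxShift < tolerance) {
                break;
            }
        }
        return centers;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            double diff = a[j] - b[j];
            sum += diff * diff;
        }
        return sum;
    }
}
```

Because the result depends on the starting centers, different starts can settle at different local minima, which is why running several tries with different starting points can help.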
Modifier and Type | Class and Description |
---|---|
static class | AggregateKMeansRequest.Options: A set of string constants for the AggregateKMeansRequest parameter options. |
Constructor and Description |
---|
AggregateKMeansRequest(): Constructs an AggregateKMeansRequest object with default parameters. |
AggregateKMeansRequest(String tableName, List<String> columnNames, int k, double tolerance, Map<String,String> options): Constructs an AggregateKMeansRequest object with the specified parameters. |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj) |
Object | get(int index): This method supports the Avro framework and is not intended to be called directly by the user. |
static org.apache.avro.Schema | getClassSchema(): This method supports the Avro framework and is not intended to be called directly by the user. |
List<String> | getColumnNames(): List of column names on which the operation will be performed. |
int | getK(): The number of mean points to be determined by the algorithm. |
Map<String,String> | getOptions(): Optional parameters. |
org.apache.avro.Schema | getSchema(): This method supports the Avro framework and is not intended to be called directly by the user. |
String | getTableName(): Name of the table on which the operation will be performed. |
double | getTolerance(): Stop iterating when the distances between successive points are less than the given tolerance. |
int | hashCode() |
void | put(int index, Object value): This method supports the Avro framework and is not intended to be called directly by the user. |
AggregateKMeansRequest | setColumnNames(List<String> columnNames): List of column names on which the operation will be performed. |
AggregateKMeansRequest | setK(int k): The number of mean points to be determined by the algorithm. |
AggregateKMeansRequest | setOptions(Map<String,String> options): Optional parameters. |
AggregateKMeansRequest | setTableName(String tableName): Name of the table on which the operation will be performed. |
AggregateKMeansRequest | setTolerance(double tolerance): Stop iterating when the distances between successive points are less than the given tolerance. |
String | toString() |
public AggregateKMeansRequest()
public AggregateKMeansRequest(String tableName, List<String> columnNames, int k, double tolerance, Map<String,String> options)
Parameters:
tableName - Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
columnNames - List of column names on which the operation will be performed. If n columns are provided, then each of the k result points will have n dimensions corresponding to the n columns.
k - The number of mean points to be determined by the algorithm.
tolerance - Stop iterating when the distances between successive points are less than the given tolerance.
options - Optional parameters.
  WHITEN: When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
  MAX_ITERS: Number of times to try to hit the tolerance limit before giving up. The default is 10.
  NUM_TRIES: Number of times to run the k-means algorithm with different randomly selected starting points, which helps avoid a poor local minimum. The default is 1.
  CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values: TRUE, FALSE. The default value is FALSE.
  RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
  RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
  TTL: Sets the TTL of the table specified in RESULT_TABLE.
  The default value is an empty Map.
public static org.apache.avro.Schema getClassSchema()
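To make the WHITEN option in the constructor parameters more concrete, the following is a minimal sketch of normalizing each column by its standard deviation before clustering. The class and method names are hypothetical, and the use of the population standard deviation (dividing by n) is an assumption; the documentation above does not say which estimator the server uses or whether columns are also centered.

```java
/** Illustrative column whitening; a sketch under stated assumptions. */
final class WhitenSketch {

    /**
     * Returns a copy of {@code rows} in which every column has been divided
     * by that column's population standard deviation. Columns with zero
     * spread are copied unchanged to avoid dividing by zero.
     */
    static double[][] whiten(double[][] rows) {
        int n = rows.length;
        int dims = rows[0].length;
        double[][] out = new double[n][dims];

        for (int j = 0; j < dims; j++) {
            // Column mean.
            double mean = 0.0;
            for (double[] row : rows) {
                mean += row[j];
            }
            mean /= n;

            // Population variance and standard deviation (assumption: divide by n).
            double variance = 0.0;
            for (double[] row : rows) {
                double diff = row[j] - mean;
                variance += diff * diff;
            }
            double stdv = Math.sqrt(variance / n);

            // Scale the column so that all columns have comparable spread.
            for (int i = 0; i < n; i++) {
                out[i][j] = (stdv == 0.0) ? rows[i][j] : rows[i][j] / stdv;
            }
        }
        return out;
    }
}
```

Scaling matters because k-means relies on distances: without it, a column measured in large units can dominate the clustering.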
public String getTableName()
Returns:
The current value of tableName.
public AggregateKMeansRequest setTableName(String tableName)
Parameters:
tableName - The new value for tableName.
Returns:
this to mimic the builder pattern.
public List<String> getColumnNames()
Returns:
The current value of columnNames.
public AggregateKMeansRequest setColumnNames(List<String> columnNames)
Parameters:
columnNames - The new value for columnNames.
Returns:
this to mimic the builder pattern.
public int getK()
Returns:
The current value of k.
public AggregateKMeansRequest setK(int k)
Parameters:
k - The new value for k.
Returns:
this to mimic the builder pattern.
public double getTolerance()
Returns:
The current value of tolerance.
public AggregateKMeansRequest setTolerance(double tolerance)
Parameters:
tolerance - The new value for tolerance.
Returns:
this to mimic the builder pattern.
public Map<String,String> getOptions()
  WHITEN: When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
  MAX_ITERS: Number of times to try to hit the tolerance limit before giving up. The default is 10.
  NUM_TRIES: Number of times to run the k-means algorithm with different randomly selected starting points, which helps avoid a poor local minimum. The default is 1.
  CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values: TRUE, FALSE. The default value is FALSE.
  RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
  RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
  TTL: Sets the TTL of the table specified in RESULT_TABLE.
  The default value is an empty Map.
Returns:
The current value of options.
public AggregateKMeansRequest setOptions(Map<String,String> options)
  WHITEN: When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
  MAX_ITERS: Number of times to try to hit the tolerance limit before giving up. The default is 10.
  NUM_TRIES: Number of times to run the k-means algorithm with different randomly selected starting points, which helps avoid a poor local minimum. The default is 1.
  CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values: TRUE, FALSE. The default value is FALSE.
  RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
  RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
  TTL: Sets the TTL of the table specified in RESULT_TABLE.
  The default value is an empty Map.
Parameters:
options - The new value for options.
Returns:
this to mimic the builder pattern.
public org.apache.avro.Schema getSchema()
Specified by:
getSchema in interface org.apache.avro.generic.GenericContainer
public Object get(int index)
Specified by:
get in interface org.apache.avro.generic.IndexedRecord
Parameters:
index - the position of the field to get
Throws:
IndexOutOfBoundsException
public void put(int index, Object value)
Specified by:
put in interface org.apache.avro.generic.IndexedRecord
Parameters:
index - the position of the field to set
value - the value to set
Throws:
IndexOutOfBoundsException
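Each setter on this class returns this, so calls can be chained. The sketch below shows that chaining style with a hypothetical stand-in class, ChainedRequestSketch, which mirrors only the setter and getter shapes documented on this page. It is not the real AggregateKMeansRequest, and the table name "example.points" and column names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in that mirrors the documented setter shapes only. */
final class ChainedRequestSketch {
    private String tableName;
    private List<String> columnNames;
    private int k;
    private double tolerance;

    // Every setter stores its argument and returns this to allow chaining.
    ChainedRequestSketch setTableName(String tableName) {
        this.tableName = tableName;
        return this;
    }

    ChainedRequestSketch setColumnNames(List<String> columnNames) {
        this.columnNames = columnNames;
        return this;
    }

    ChainedRequestSketch setK(int k) {
        this.k = k;
        return this;
    }

    ChainedRequestSketch setTolerance(double tolerance) {
        this.tolerance = tolerance;
        return this;
    }

    String getTableName() { return tableName; }
    List<String> getColumnNames() { return columnNames; }
    int getK() { return k; }
    double getTolerance() { return tolerance; }

    /** Builds a request in one expression by chaining the setters. */
    static ChainedRequestSketch example() {
        return new ChainedRequestSketch()
                .setTableName("example.points")          // made-up table name
                .setColumnNames(Arrays.asList("x", "y")) // two columns, so 2-D result points
                .setK(3)
                .setTolerance(0.01);
    }
}
```

With the real class, the same pattern would start from the default constructor and chain setTableName, setColumnNames, setK, setTolerance, and setOptions as documented above.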
Copyright © 2025. All rights reserved.