public class AggregateKMeansRequest extends Object implements org.apache.avro.generic.IndexedRecord
A set of parameters for GPUdb.aggregateKMeans(AggregateKMeansRequest).
This endpoint runs the k-means algorithm, a heuristic that attempts to perform k-means clustering. An ideal k-means clustering algorithm selects k points such that the sum of the mean squared distances of each member of the set to the nearest of the k points is minimized. The k-means algorithm, however, does not necessarily produce such an ideal clustering: it begins with a randomly selected set of k points, refines the locations of those points iteratively, and settles into a local minimum. Various parameters and options are provided to control the heuristic search.
NOTE: The Kinetica instance being accessed must be running a CUDA (GPU-based) build to service this request.
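The iterative refinement described above can be sketched in plain Java. This is a minimal illustration and not Kinetica's GPU implementation: the class name KMeansSketch, its method names, the seed parameter, and the choice to stop once the largest centroid shift falls below the tolerance are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative single-run k-means (Lloyd's iteration); hypothetical, not part of the Kinetica API. */
final class KMeansSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Index of the centroid closest to the given point (ties go to the lower index). */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    /**
     * Runs one k-means search from randomly chosen starting points.
     * Stops when no centroid moves by the tolerance or more, or after maxIters passes.
     */
    static double[][] cluster(double[][] points, int k, double tolerance, int maxIters, long seed) {
        Random random = new Random(seed);
        int dims = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            // Start from randomly selected data points (duplicates are possible in this sketch).
            centroids[c] = points[random.nextInt(points.length)].clone();
        }
        for (int iter = 0; iter < maxIters; iter++) {
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            // Assignment step: add each point to the running sum of its nearest centroid.
            for (double[] p : points) {
                int c = nearest(p, centroids);
                counts[c]++;
                for (int d = 0; d < dims; d++) {
                    sums[c][d] += p[d];
                }
            }
            // Update step: move each centroid to the mean of its assigned points.
            double maxShift = 0.0;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                double[] updated = new double[dims];
                for (int d = 0; d < dims; d++) {
                    updated[d] = sums[c][d] / counts[c];
                }
                maxShift = Math.max(maxShift, Math.sqrt(squaredDistance(updated, centroids[c])));
                centroids[c] = updated;
            }
            if (maxShift < tolerance) {
                break; // successive centroid positions are closer than the tolerance
            }
        }
        return centroids;
    }

    public static void main(String[] args) {
        double[][] points = { {0, 0}, {0, 1}, {1, 0}, {10, 10}, {10, 11}, {11, 10} };
        for (double[] centroid : cluster(points, 2, 1e-6, 10, 42L)) {
            System.out.println(Arrays.toString(centroid));
        }
    }
}
```

Because the starting points are random, a single run on less cleanly separated data may settle into a poor local minimum; running the search several times from different starting points is the usual remedy.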
| Modifier and Type | Class and Description |
|---|---|
| static class | AggregateKMeansRequest.Options: Optional parameters. |
| Constructor and Description |
|---|
| AggregateKMeansRequest(): Constructs an AggregateKMeansRequest object with default parameters. |
| AggregateKMeansRequest(String tableName, List<String> columnNames, int k, double tolerance, Map<String,String> options): Constructs an AggregateKMeansRequest object with the specified parameters. |
| Modifier and Type | Method and Description |
|---|---|
| boolean | equals(Object obj) |
| Object | get(int index): This method supports the Avro framework and is not intended to be called directly by the user. |
| static org.apache.avro.Schema | getClassSchema(): This method supports the Avro framework and is not intended to be called directly by the user. |
| List<String> | getColumnNames() |
| int | getK() |
| Map<String,String> | getOptions() |
| org.apache.avro.Schema | getSchema(): This method supports the Avro framework and is not intended to be called directly by the user. |
| String | getTableName() |
| double | getTolerance() |
| int | hashCode() |
| void | put(int index, Object value): This method supports the Avro framework and is not intended to be called directly by the user. |
| AggregateKMeansRequest | setColumnNames(List<String> columnNames) |
| AggregateKMeansRequest | setK(int k) |
| AggregateKMeansRequest | setOptions(Map<String,String> options) |
| AggregateKMeansRequest | setTableName(String tableName) |
| AggregateKMeansRequest | setTolerance(double tolerance) |
| String | toString() |
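Each setter in the summary above returns AggregateKMeansRequest, so calls can be chained. The sketch below shows that pattern with a hypothetical stand-in class, FluentRequestSketch, so that it compiles without the Kinetica library; its fields and default values are assumptions and do not describe the real class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in illustrating setters that return this; not the real request class. */
final class FluentRequestSketch {
    private String tableName = "";
    private List<String> columnNames = new ArrayList<>();
    private int k;
    private double tolerance;

    String getTableName() { return tableName; }
    List<String> getColumnNames() { return columnNames; }
    int getK() { return k; }
    double getTolerance() { return tolerance; }

    // Each setter stores the value and returns this so that calls can be chained.
    FluentRequestSketch setTableName(String tableName) {
        this.tableName = tableName;
        return this;
    }

    FluentRequestSketch setColumnNames(List<String> columnNames) {
        this.columnNames = columnNames;
        return this;
    }

    FluentRequestSketch setK(int k) {
        this.k = k;
        return this;
    }

    FluentRequestSketch setTolerance(double tolerance) {
        this.tolerance = tolerance;
        return this;
    }

    public static void main(String[] args) {
        // The table and column names are made up for the example.
        FluentRequestSketch request = new FluentRequestSketch()
                .setTableName("example_schema.points")
                .setColumnNames(Arrays.asList("x", "y"))
                .setK(3)
                .setTolerance(0.001);
        System.out.println(request.getTableName() + " k=" + request.getK());
    }
}
```

With the real class, the equivalent chain would presumably start from new AggregateKMeansRequest() and use the same setter names.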
public AggregateKMeansRequest()
public AggregateKMeansRequest(String tableName, List<String> columnNames, int k, double tolerance, Map<String,String> options)
Parameters:
tableName - Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
columnNames - List of column names on which the operation will be performed. If n columns are provided, then each of the k result points will have n dimensions corresponding to the n columns.
k - The number of mean points to be determined by the algorithm.
tolerance - Stop iterating when the distance between successive points is less than the given tolerance.
options - Optional parameters.
WHITEN: When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
MAX_ITERS: Number of times to try to hit the tolerance limit before giving up. The default is 10.
NUM_TRIES: Number of times to run the k-means algorithm, each time with a different set of randomly selected starting points; this helps avoid a poor local minimum. The default is 1.
CREATE_TEMP_TABLE: If true, a unique temporary table name will be generated in the sys_temp schema and used in place of result_table. If result_table_persist is false (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in qualified_result_table_name. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
RESULT_TABLE_PERSIST: If true, then the result table specified in result_table will be persisted and will not expire unless a ttl is specified. If false, then the result table will be an in-memory table and will expire unless a ttl is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
TTL: Sets the TTL of the table specified in result_table.
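As an illustration of the options listed above, the following sketch builds an options map using only java.util. It assumes the map keys are the lowercase forms of the option names: result_table, result_table_persist, and ttl appear that way on this page, while whiten, max_iters, and num_tries are guesses. The class name, table name, and values are hypothetical. In real code the constants in AggregateKMeansRequest.Options would presumably be used in place of string literals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that assembles an options map for a k-means request. */
final class KMeansOptionsSketch {

    static Map<String, String> buildOptions() {
        // All option values are passed as strings, matching Map<String,String>.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("whiten", "1");        // assumed key: normalize each column by its standard deviation
        options.put("max_iters", "25");    // assumed key: allow more than the default 10 iterations
        options.put("num_tries", "5");     // assumed key: restart from 5 different random starting sets
        options.put("result_table", "example_schema.kmeans_centroids"); // hypothetical table name
        options.put("result_table_persist", "false"); // in-memory result table
        options.put("ttl", "20");          // illustrative value; units are not stated on this page
        return options;
    }

    public static void main(String[] args) {
        buildOptions().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

A map like this would be passed as the options argument of the five-argument constructor or to setOptions.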
The default value of options is an empty Map.
public static org.apache.avro.Schema getClassSchema()
public String getTableName()
public AggregateKMeansRequest setTableName(String tableName)
Parameters:
tableName - Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
Returns: this, to mimic the builder pattern.
public List<String> getColumnNames()
public AggregateKMeansRequest setColumnNames(List<String> columnNames)
Parameters:
columnNames - List of column names on which the operation will be performed. If n columns are provided, then each of the k result points will have n dimensions corresponding to the n columns.
Returns: this, to mimic the builder pattern.
public int getK()
public AggregateKMeansRequest setK(int k)
Parameters:
k - The number of mean points to be determined by the algorithm.
Returns: this, to mimic the builder pattern.
public double getTolerance()
public AggregateKMeansRequest setTolerance(double tolerance)
Parameters:
tolerance - Stop iterating when the distance between successive points is less than the given tolerance.
Returns: this, to mimic the builder pattern.
public Map<String,String> getOptions()
Returns: Optional parameters.
WHITEN: When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
MAX_ITERS: Number of times to try to hit the tolerance limit before giving up. The default is 10.
NUM_TRIES: Number of times to run the k-means algorithm, each time with a different set of randomly selected starting points; this helps avoid a poor local minimum. The default is 1.
CREATE_TEMP_TABLE: If true, a unique temporary table name will be generated in the sys_temp schema and used in place of result_table. If result_table_persist is false (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in qualified_result_table_name. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
RESULT_TABLE_PERSIST: If true, then the result table specified in result_table will be persisted and will not expire unless a ttl is specified. If false, then the result table will be an in-memory table and will expire unless a ttl is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
TTL: Sets the TTL of the table specified in result_table.
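The WHITEN option above is described as normalizing each column by its standard deviation. A minimal sketch of that scaling for a single column follows. The class name WhitenSketch is hypothetical, and the use of the population standard deviation (dividing by n rather than n - 1) is an assumption, since this page does not say which form the server uses.

```java
import java.util.Arrays;

/** Hypothetical illustration of scaling a column by its standard deviation. */
final class WhitenSketch {

    /** Population standard deviation of the column values (assumed form). */
    static double standardDeviation(double[] column) {
        double mean = 0.0;
        for (double value : column) {
            mean += value;
        }
        mean /= column.length;
        double sumSquares = 0.0;
        for (double value : column) {
            sumSquares += (value - mean) * (value - mean);
        }
        return Math.sqrt(sumSquares / column.length);
    }

    /** Returns a copy of the column with every value divided by the standard deviation. */
    static double[] whiten(double[] column) {
        double stdDev = standardDeviation(column);
        double[] scaled = column.clone();
        if (stdDev == 0.0) {
            return scaled; // constant column: left unchanged to avoid dividing by zero
        }
        for (int i = 0; i < scaled.length; i++) {
            scaled[i] /= stdDev;
        }
        return scaled;
    }

    public static void main(String[] args) {
        double[] column = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println(Arrays.toString(whiten(column)));
    }
}
```

Scaling in this way keeps a column with a large numeric range from dominating the distance calculation.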
The default value of options is an empty Map.
public AggregateKMeansRequest setOptions(Map<String,String> options)
Parameters:
options - Optional parameters.
WHITEN: When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
MAX_ITERS: Number of times to try to hit the tolerance limit before giving up. The default is 10.
NUM_TRIES: Number of times to run the k-means algorithm, each time with a different set of randomly selected starting points; this helps avoid a poor local minimum. The default is 1.
CREATE_TEMP_TABLE: If true, a unique temporary table name will be generated in the sys_temp schema and used in place of result_table. If result_table_persist is false (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in qualified_result_table_name. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
RESULT_TABLE_PERSIST: If true, then the result table specified in result_table will be persisted and will not expire unless a ttl is specified. If false, then the result table will be an in-memory table and will expire unless a ttl is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
TTL: Sets the TTL of the table specified in result_table.
The default value of options is an empty Map.
Returns: this, to mimic the builder pattern.
public org.apache.avro.Schema getSchema()
Specified by: getSchema in interface org.apache.avro.generic.GenericContainer
public Object get(int index)
Specified by: get in interface org.apache.avro.generic.IndexedRecord
Parameters:
index - the position of the field to get
Throws: IndexOutOfBoundsException
public void put(int index, Object value)
Specified by: put in interface org.apache.avro.generic.IndexedRecord
Parameters:
index - the position of the field to set
value - the value to set
Throws: IndexOutOfBoundsException
Copyright © 2024. All rights reserved.