AggregateKMeansRequest

java.lang.Object

com.gpudb.protocol.AggregateKMeansRequest

All Implemented Interfaces:

org.apache.avro.generic.GenericContainer, org.apache.avro.generic.IndexedRecord

public class AggregateKMeansRequest extends Object implements org.apache.avro.generic.IndexedRecord

A set of parameters for GPUdb.aggregateKMeans.

This endpoint runs the k-means algorithm, a heuristic that attempts to perform k-means clustering. An ideal k-means clustering selects k points such that the sum of the squared distances from each member of the set to the nearest of the k points is minimized. The k-means algorithm, however, does not necessarily produce such an ideal clustering. It begins with a randomly selected set of k points, refines their locations iteratively, and settles into a local minimum. Various parameters and options are provided to control the heuristic search.

NOTE: The Kinetica instance being accessed must be running a CUDA (GPU-based) build to service this request.
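The iterative refinement described above can be illustrated with a minimal, CPU-only sketch of a Lloyd-style k-means loop. This is not Kinetica's implementation; the class name `KMeansSketch`, its method names, the `seed` parameter, and the reading of the tolerance as "largest movement of any center between two iterations" are all assumptions made for illustration.

```java
import java.util.Random;

// Illustrative sketch only; not the server-side algorithm.
final class KMeansSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    // Index of the center closest to p.
    static int nearest(double[][] centers, double[] p) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double dist = squaredDistance(centers[c], p);
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // Sum of squared distances from each point to its nearest center.
    static double cost(double[][] points, double[][] centers) {
        double total = 0.0;
        for (double[] p : points) {
            total += squaredDistance(centers[nearest(centers, p)], p);
        }
        return total;
    }

    // Refines k randomly chosen starting centers for at most maxIters iterations.
    static double[][] cluster(double[][] points, int k, double tolerance,
                              int maxIters, long seed) {
        int dim = points[0].length;
        Random rng = new Random(seed);
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            // Random starting points (duplicates are possible in this sketch).
            centers[c] = points[rng.nextInt(points.length)].clone();
        }
        for (int iter = 0; iter < maxIters; iter++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            // Assignment step: attach each point to its nearest center.
            for (double[] p : points) {
                int best = nearest(centers, p);
                counts[best]++;
                for (int d = 0; d < dim; d++) {
                    sums[best][d] += p[d];
                }
            }
            // Update step: move each center to the mean of its points.
            double maxShift = 0.0;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                double shiftSq = 0.0;
                for (int d = 0; d < dim; d++) {
                    double updated = sums[c][d] / counts[c];
                    double diff = updated - centers[c][d];
                    shiftSq += diff * diff;
                    centers[c][d] = updated;
                }
                maxShift = Math.max(maxShift, Math.sqrt(shiftSq));
            }
            // Assumed stopping rule: no center moved by tolerance or more.
            if (maxShift < tolerance) {
                break;
            }
        }
        return centers;
    }
}
```

Because the starting centers are random, two runs with different seeds can settle into different local minima; comparing `cost` across runs is one way to pick the better result.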

Nested Class Summary
Nested Classes
Modifier and Type
Class
Description
static final class
AggregateKMeansRequest.Options
A set of string constants for the AggregateKMeansRequest parameter options.
Constructor Summary
Constructors
Constructor
Description
AggregateKMeansRequest()
Constructs an AggregateKMeansRequest object with default parameters.
AggregateKMeansRequest(String tableName, List<String> columnNames, int k, double tolerance, Map<String,String> options)
Constructs an AggregateKMeansRequest object with the specified parameters.
Method Summary
Modifier and Type
Method
Description
boolean
equals(Object obj)

Object
get(int index)
This method supports the Avro framework and is not intended to be called directly by the user.
static org.apache.avro.Schema
getClassSchema()
This method supports the Avro framework and is not intended to be called directly by the user.
List<String>
getColumnNames()
List of column names on which the operation will be performed.
int
getK()
The number of mean points to be determined by the algorithm.
Map<String,String>
getOptions()
Optional parameters.
org.apache.avro.Schema
getSchema()
This method supports the Avro framework and is not intended to be called directly by the user.
String
getTableName()
Name of the table on which the operation will be performed.
double
getTolerance()
Stop iterating when the distance between successive points is less than the given tolerance.
int
hashCode()

void
put(int index, Object value)
This method supports the Avro framework and is not intended to be called directly by the user.
AggregateKMeansRequest
setColumnNames(List<String> columnNames)
List of column names on which the operation will be performed.
AggregateKMeansRequest
setK(int k)
The number of mean points to be determined by the algorithm.
AggregateKMeansRequest
setOptions(Map<String,String> options)
Optional parameters.
AggregateKMeansRequest
setTableName(String tableName)
Name of the table on which the operation will be performed.
AggregateKMeansRequest
setTolerance(double tolerance)
Stop iterating when the distance between successive points is less than the given tolerance.
String
toString()

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait

Constructor Details
- AggregateKMeansRequest
  public AggregateKMeansRequest()
  Constructs an AggregateKMeansRequest object with default parameters.
- AggregateKMeansRequest
  public AggregateKMeansRequest(String tableName, List<String> columnNames, int k, double tolerance, Map<String,String> options)
  Constructs an AggregateKMeansRequest object with the specified parameters.
  Parameters:
  tableName - Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
  columnNames - List of column names on which the operation will be performed. If n columns are provided, each of the k result points will have n dimensions, one per column.
  k - The number of mean points to be determined by the algorithm.
  tolerance - Stop iterating when the distance between successive points is less than the given tolerance.
  options - Optional parameters.
  WHITEN: When set to 1, each column is first normalized by its standard deviation. The default is not to whiten.
  MAX_ITERS: Maximum number of iterations to attempt to reach the tolerance limit before giving up. The default is 10.
  NUM_TRIES: Number of times to run the k-means algorithm, each with different randomly selected starting points; this helps avoid a poor local minimum. The default is 1.
  CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
  RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  TTL: Sets the TTL of the table specified in RESULT_TABLE.
  The default value is an empty Map.
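As a hedged illustration of the options described above, the following sketch assembles an options map with plain string keys. The literal key strings (`"whiten"`, `"max_iters"`, and so on) are assumptions based on the option names listed here; in real code the constants in AggregateKMeansRequest.Options are the safer choice. The helper class `KMeansOptionsSketch` and the table name in the trailing comment are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper; the key strings are assumed, not confirmed by this page.
final class KMeansOptionsSketch {

    static Map<String, String> buildOptions(boolean whiten, int maxIters,
                                            int numTries, String resultTable) {
        Map<String, String> options = new LinkedHashMap<>();
        if (whiten) {
            options.put("whiten", "1"); // normalize each column by its standard deviation
        }
        options.put("max_iters", Integer.toString(maxIters)); // documented default: 10
        options.put("num_tries", Integer.toString(numTries)); // documented default: 1
        if (resultTable != null) {
            // When a result table is named, results are not returned in the response.
            options.put("result_table", resultTable);
            options.put("result_table_persist", "false");
        }
        return options;
    }

    // With the Kinetica client on the classpath, the map would be passed as the
    // last constructor argument documented above, for example:
    //   new AggregateKMeansRequest("example_schema.points",
    //           Arrays.asList("x", "y"), 3, 1e-4, options);
}
```

All option values are strings, so numeric settings such as the iteration limit are converted before being stored in the map.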
Method Details
- getClassSchema
  public static org.apache.avro.Schema getClassSchema()
  This method supports the Avro framework and is not intended to be called directly by the user.
  Returns:
  The schema for the class.
- getTableName
  public String getTableName()
  Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
  Returns:
  The current value of tableName.
- setTableName
  public AggregateKMeansRequest setTableName(String tableName)
  Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
  Parameters:
  tableName - The new value for tableName.
  Returns:
  this to mimic the builder pattern.
- getColumnNames
  public List<String> getColumnNames()
  List of column names on which the operation will be performed. If n columns are provided, each of the k result points will have n dimensions, one per column.
  Returns:
  The current value of columnNames.
- setColumnNames
  public AggregateKMeansRequest setColumnNames(List<String> columnNames)
  List of column names on which the operation will be performed. If n columns are provided, each of the k result points will have n dimensions, one per column.
  Parameters:
  columnNames - The new value for columnNames.
  Returns:
  this to mimic the builder pattern.
- getK
  public int getK()
  The number of mean points to be determined by the algorithm.
  Returns:
  The current value of k.
- setK
  public AggregateKMeansRequest setK(int k)
  The number of mean points to be determined by the algorithm.
  Parameters:
  k - The new value for k.
  Returns:
  this to mimic the builder pattern.
- getTolerance
  public double getTolerance()
  Stop iterating when the distance between successive points is less than the given tolerance.
  Returns:
  The current value of tolerance.
- setTolerance
  public AggregateKMeansRequest setTolerance(double tolerance)
  Stop iterating when the distance between successive points is less than the given tolerance.
  Parameters:
  tolerance - The new value for tolerance.
  Returns:
  this to mimic the builder pattern.
- getOptions
  public Map<String,String> getOptions()
  Optional parameters.
  WHITEN: When set to 1, each column is first normalized by its standard deviation. The default is not to whiten.
  MAX_ITERS: Maximum number of iterations to attempt to reach the tolerance limit before giving up. The default is 10.
  NUM_TRIES: Number of times to run the k-means algorithm, each with different randomly selected starting points; this helps avoid a poor local minimum. The default is 1.
  CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
  RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  TTL: Sets the TTL of the table specified in RESULT_TABLE.
  The default value is an empty Map.
  Returns:
  The current value of options.
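To make the WHITEN option concrete, here is a small sketch of normalizing each column by its standard deviation before clustering. It assumes the population standard deviation and no mean-centering; this page does not say which variant the server uses, so treat both as assumptions. The class `WhitenSketch` is hypothetical.

```java
// Illustrative only: divides each column by its (population) standard deviation.
final class WhitenSketch {

    static double[][] whiten(double[][] rows) {
        int n = rows.length;
        int dim = rows[0].length;
        double[][] out = new double[n][dim];
        for (int d = 0; d < dim; d++) {
            double mean = 0.0;
            for (double[] r : rows) {
                mean += r[d];
            }
            mean /= n;
            double variance = 0.0;
            for (double[] r : rows) {
                variance += (r[d] - mean) * (r[d] - mean);
            }
            double std = Math.sqrt(variance / n); // assumption: population formula
            for (int i = 0; i < n; i++) {
                // A constant column (std == 0) is copied unchanged to avoid dividing by zero.
                out[i][d] = (std == 0.0) ? rows[i][d] : rows[i][d] / std;
            }
        }
        return out;
    }
}
```

Whitening keeps a column with a large numeric range from dominating the distance computation, which is why it can change which clusters are found.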
- setOptions
  public AggregateKMeansRequest setOptions(Map<String,String> options)
  Optional parameters.
  WHITEN: When set to 1, each column is first normalized by its standard deviation. The default is not to whiten.
  MAX_ITERS: Maximum number of iterations to attempt to reach the tolerance limit before giving up. The default is 10.
  NUM_TRIES: Number of times to run the k-means algorithm, each with different randomly selected starting points; this helps avoid a poor local minimum. The default is 1.
  CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
  RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  TTL: Sets the TTL of the table specified in RESULT_TABLE.
  The default value is an empty Map.
  Parameters:
  options - The new value for options.
  Returns:
  this to mimic the builder pattern.
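Because each setter returns this, calls can be chained. The sketch below demonstrates the pattern with a hypothetical stand-in class, `Request`, that copies only the setter and getter shapes documented above so the example compiles without the Kinetica client; the table and column names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ChainedSettersSketch {

    // Hypothetical stand-in mirroring the documented setters; not the real class.
    static final class Request {
        private String tableName = "";
        private List<String> columnNames = new ArrayList<>();
        private int k;
        private double tolerance;
        private Map<String, String> options = new LinkedHashMap<>();

        Request setTableName(String tableName) { this.tableName = tableName; return this; }
        Request setColumnNames(List<String> columnNames) { this.columnNames = columnNames; return this; }
        Request setK(int k) { this.k = k; return this; }
        Request setTolerance(double tolerance) { this.tolerance = tolerance; return this; }
        Request setOptions(Map<String, String> options) { this.options = options; return this; }

        String getTableName() { return tableName; }
        List<String> getColumnNames() { return columnNames; }
        int getK() { return k; }
        double getTolerance() { return tolerance; }
        Map<String, String> getOptions() { return options; }
    }

    // Builds a fully populated request in a single chained expression.
    static Request example() {
        return new Request()
                .setTableName("example_schema.points")
                .setColumnNames(Arrays.asList("x", "y"))
                .setK(3)
                .setTolerance(1e-4)
                .setOptions(new LinkedHashMap<>());
    }
}
```

With the real class, the same chain would start from the no-argument AggregateKMeansRequest() constructor listed in the constructor summary.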
- getSchema
  public org.apache.avro.Schema getSchema()
  This method supports the Avro framework and is not intended to be called directly by the user.
  Specified by:
  getSchema in interface org.apache.avro.generic.GenericContainer
  Returns:
  The schema object describing this class.
- get
  public Object get(int index)
  This method supports the Avro framework and is not intended to be called directly by the user.
  Specified by:
  get in interface org.apache.avro.generic.IndexedRecord
  Parameters:
  index - the position of the field to get
  Returns:
  value of the field with the given index.
  Throws:
  IndexOutOfBoundsException
- put
  public void put(int index, Object value)
  This method supports the Avro framework and is not intended to be called directly by the user.
  Specified by:
  put in interface org.apache.avro.generic.IndexedRecord
  Parameters:
  index - the position of the field to set
  value - the value to set
  Throws:
  IndexOutOfBoundsException
- equals
  public boolean equals(Object obj)
  Overrides:
  equals in class Object
- toString
  public String toString()
  Overrides:
  toString in class Object
- hashCode
  public int hashCode()
  Overrides:
  hashCode in class Object
