Package com.gpudb.protocol
Class AggregateKMeansRequest
java.lang.Object
    com.gpudb.protocol.AggregateKMeansRequest
All Implemented Interfaces:
org.apache.avro.generic.GenericContainer, org.apache.avro.generic.IndexedRecord
public class AggregateKMeansRequest extends Object implements org.apache.avro.generic.IndexedRecord
A set of parameters for GPUdb.aggregateKMeans. This endpoint runs the k-means algorithm, a heuristic that attempts to perform k-means clustering. An ideal k-means clustering selects k points such that the sum of the mean squared distances of each member of the set to the nearest of the k points is minimized. The k-means algorithm, however, does not necessarily produce such an ideal clustering. It begins with a randomly selected set of k points, refines their locations iteratively, and settles into a local minimum. Various parameters and options are provided to control the heuristic search.
NOTE: The Kinetica instance being accessed must be running a CUDA (GPU-based) build to service this request.
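The iterative refinement described above can be sketched in plain Java. This is an illustration of the general k-means heuristic only, not Kinetica's GPU implementation: the class KMeansSketch, its method names, the seeding strategy (starting points drawn from the data), and the handling of empty clusters are all assumptions made for the example.

```java
import java.util.Random;

/** Illustrative k-means sketch; all names here are hypothetical, not part of the Kinetica API. */
class KMeansSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Index of the centroid nearest to the given point. */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    /**
     * Runs one k-means attempt from random starting points drawn from the data.
     * Stops once every centroid moves less than the tolerance, or after maxIters passes.
     */
    static double[][] cluster(double[][] points, int k, double tolerance, int maxIters, long seed) {
        Random random = new Random(seed);
        int dims = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[random.nextInt(points.length)].clone();
        }
        for (int iter = 0; iter < maxIters; iter++) {
            // Assignment step: accumulate each point into its nearest centroid's running sum.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (double[] p : points) {
                int c = nearest(p, centroids);
                counts[c]++;
                for (int d = 0; d < dims; d++) {
                    sums[c][d] += p[d];
                }
            }
            // Update step: move each centroid to the mean of its assigned points.
            double maxShift = 0.0;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // assumption: an empty cluster keeps its previous centroid
                }
                double[] updated = new double[dims];
                for (int d = 0; d < dims; d++) {
                    updated[d] = sums[c][d] / counts[c];
                }
                maxShift = Math.max(maxShift, Math.sqrt(squaredDistance(centroids[c], updated)));
                centroids[c] = updated;
            }
            if (maxShift < tolerance) {
                break; // converged to a local minimum for this starting set
            }
        }
        return centroids;
    }
}
```

Because the result depends on the random starting points, a caller would typically repeat cluster with different seeds and keep the best result, which is the role the NUM_TRIES option plays for the endpoint.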
-
-
Nested Class Summary
Nested Classes
static class AggregateKMeansRequest.Options - A set of string constants for the AggregateKMeansRequest parameter options.
-
Constructor Summary
Constructors
AggregateKMeansRequest() - Constructs an AggregateKMeansRequest object with default parameters.
AggregateKMeansRequest(String tableName, List<String> columnNames, int k, double tolerance, Map<String,String> options) - Constructs an AggregateKMeansRequest object with the specified parameters.
-
Method Summary
boolean equals(Object obj)
Object get(int index) - This method supports the Avro framework and is not intended to be called directly by the user.
static org.apache.avro.Schema getClassSchema() - This method supports the Avro framework and is not intended to be called directly by the user.
List<String> getColumnNames() - List of column names on which the operation will be performed.
int getK() - The number of mean points to be determined by the algorithm.
Map<String,String> getOptions() - Optional parameters.
org.apache.avro.Schema getSchema() - This method supports the Avro framework and is not intended to be called directly by the user.
String getTableName() - Name of the table on which the operation will be performed.
double getTolerance() - Stop iterating when the distances between successive points are less than the given tolerance.
int hashCode()
void put(int index, Object value) - This method supports the Avro framework and is not intended to be called directly by the user.
AggregateKMeansRequest setColumnNames(List<String> columnNames) - List of column names on which the operation will be performed.
AggregateKMeansRequest setK(int k) - The number of mean points to be determined by the algorithm.
AggregateKMeansRequest setOptions(Map<String,String> options) - Optional parameters.
AggregateKMeansRequest setTableName(String tableName) - Name of the table on which the operation will be performed.
AggregateKMeansRequest setTolerance(double tolerance) - Stop iterating when the distances between successive points are less than the given tolerance.
String toString()
-
-
-
Constructor Detail
-
AggregateKMeansRequest
public AggregateKMeansRequest()
Constructs an AggregateKMeansRequest object with default parameters.
-
AggregateKMeansRequest
public AggregateKMeansRequest(String tableName, List<String> columnNames, int k, double tolerance, Map<String,String> options)
Constructs an AggregateKMeansRequest object with the specified parameters.
Parameters:
tableName - Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
columnNames - List of column names on which the operation will be performed. If n columns are provided, then each of the k result points will have n dimensions corresponding to the n columns.
k - The number of mean points to be determined by the algorithm.
tolerance - Stop iterating when the distances between successive points are less than the given tolerance.
options - Optional parameters.
- WHITEN: When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
- MAX_ITERS: Number of times to try to hit the tolerance limit before giving up. The default is 10.
- NUM_TRIES: Number of times to run the k-means algorithm with different randomly selected starting points, which helps avoid a poor local minimum. The default is 1.
- CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values: TRUE, FALSE. The default value is FALSE.
- RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
- RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
- TTL: Sets the TTL of the table specified in RESULT_TABLE.
The default value is an empty Map.
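As a rough illustration of how the options map might be assembled before it is passed to the constructor, the sketch below uses only the JDK. The class KMeansOptionsSketch and its method are hypothetical, and the lowercase key strings are assumed stand-ins for the AggregateKMeansRequest.Options constants; with the Kinetica client on the classpath, the constants themselves should be used instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that assembles an options map for an AggregateKMeansRequest. */
class KMeansOptionsSketch {

    /**
     * Builds the options map. The key strings below are ASSUMED values of the
     * AggregateKMeansRequest.Options constants (WHITEN, MAX_ITERS, NUM_TRIES,
     * RESULT_TABLE); they are not confirmed by this page.
     */
    static Map<String, String> buildOptions(boolean whiten, int maxIters, int numTries, String resultTable) {
        if (maxIters < 1 || numTries < 1) {
            throw new IllegalArgumentException("maxIters and numTries must be at least 1");
        }
        Map<String, String> options = new LinkedHashMap<>();
        if (whiten) {
            options.put("whiten", "1"); // normalize each column by its standard deviation
        }
        options.put("max_iters", Integer.toString(maxIters));
        options.put("num_tries", Integer.toString(numTries));
        if (resultTable != null) {
            // When a result table is named, results are stored there, not returned in the response.
            options.put("result_table", resultTable);
        }
        return options;
    }

    // With the Kinetica client available, the map would be passed along these lines
    // (the table and column names are placeholders):
    //   new AggregateKMeansRequest("example.points", java.util.Arrays.asList("x", "y"), 3, 1e-4, options);
}
```

All option values are strings, so numeric settings are converted explicitly. Options left out of the map fall back to the defaults listed above.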
-
-
Method Detail
-
getClassSchema
public static org.apache.avro.Schema getClassSchema()
This method supports the Avro framework and is not intended to be called directly by the user.
Returns:
The schema for the class.
-
getTableName
public String getTableName()
Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
Returns:
The current value of tableName.
-
setTableName
public AggregateKMeansRequest setTableName(String tableName)
Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
Parameters:
tableName - The new value for tableName.
Returns:
this, to mimic the builder pattern.
-
getColumnNames
public List<String> getColumnNames()
List of column names on which the operation will be performed. If n columns are provided, then each of the k result points will have n dimensions corresponding to the n columns.
Returns:
The current value of columnNames.
-
setColumnNames
public AggregateKMeansRequest setColumnNames(List<String> columnNames)
List of column names on which the operation will be performed. If n columns are provided, then each of the k result points will have n dimensions corresponding to the n columns.
Parameters:
columnNames - The new value for columnNames.
Returns:
this, to mimic the builder pattern.
-
getK
public int getK()
The number of mean points to be determined by the algorithm.
Returns:
The current value of k.
-
setK
public AggregateKMeansRequest setK(int k)
The number of mean points to be determined by the algorithm.
Parameters:
k - The new value for k.
Returns:
this, to mimic the builder pattern.
-
getTolerance
public double getTolerance()
Stop iterating when the distances between successive points are less than the given tolerance.
Returns:
The current value of tolerance.
-
setTolerance
public AggregateKMeansRequest setTolerance(double tolerance)
Stop iterating when the distances between successive points are less than the given tolerance.
Parameters:
tolerance - The new value for tolerance.
Returns:
this, to mimic the builder pattern.
-
getOptions
public Map<String,String> getOptions()
Optional parameters.
- WHITEN: When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
- MAX_ITERS: Number of times to try to hit the tolerance limit before giving up. The default is 10.
- NUM_TRIES: Number of times to run the k-means algorithm with different randomly selected starting points, which helps avoid a poor local minimum. The default is 1.
- CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values: TRUE, FALSE. The default value is FALSE.
- RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
- RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
- TTL: Sets the TTL of the table specified in RESULT_TABLE.
The default value is an empty Map.
Returns:
The current value of options.
-
setOptions
public AggregateKMeansRequest setOptions(Map<String,String> options)
Optional parameters.
- WHITEN: When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
- MAX_ITERS: Number of times to try to hit the tolerance limit before giving up. The default is 10.
- NUM_TRIES: Number of times to run the k-means algorithm with different randomly selected starting points, which helps avoid a poor local minimum. The default is 1.
- CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values: TRUE, FALSE. The default value is FALSE.
- RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
- RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
- TTL: Sets the TTL of the table specified in RESULT_TABLE.
The default value is an empty Map.
Parameters:
options - The new value for options.
Returns:
this, to mimic the builder pattern.
-
getSchema
public org.apache.avro.Schema getSchema()
This method supports the Avro framework and is not intended to be called directly by the user.
Specified by:
getSchema in interface org.apache.avro.generic.GenericContainer
Returns:
The schema object describing this class.
-
get
public Object get(int index)
This method supports the Avro framework and is not intended to be called directly by the user.
Specified by:
get in interface org.apache.avro.generic.IndexedRecord
Parameters:
index - the position of the field to get
Returns:
value of the field with the given index.
Throws:
IndexOutOfBoundsException
-
put
public void put(int index, Object value)
This method supports the Avro framework and is not intended to be called directly by the user.
Specified by:
put in interface org.apache.avro.generic.IndexedRecord
Parameters:
index - the position of the field to set
value - the value to set
Throws:
IndexOutOfBoundsException
-
-