A set of parameters for GPUdb::aggregateKMeans.
#include <gpudb/protocol/aggregate_k_means.h>
Public Member Functions
| AggregateKMeansRequest () | Constructs an AggregateKMeansRequest object with default parameters. |
| AggregateKMeansRequest (const std::string &tableName_, const std::vector< std::string > &columnNames_, const int32_t k_, const double tolerance_, const std::map< std::string, std::string > &options_) | Constructs an AggregateKMeansRequest object with the specified parameters. |
Public Attributes
| std::string | tableName | Name of the table on which the operation will be performed. |
| std::vector< std::string > | columnNames | List of column names on which the operation will be performed. |
| int32_t | k | The number of mean points to be determined by the algorithm. |
| double | tolerance | Stop iterating when the distance between successive points is less than the given tolerance. |
| std::map< std::string, std::string > | options | Optional parameters. |
Detailed Description
A set of parameters for GPUdb::aggregateKMeans.
This endpoint runs the k-means algorithm, a heuristic that attempts to perform k-means clustering. An ideal k-means clustering algorithm selects k points such that the sum of the mean squared distances of each member of the set to the nearest of the k points is minimized. The k-means algorithm, however, does not necessarily produce such an ideal clustering. It begins with a randomly selected set of k points, then refines the locations of those points iteratively until it settles into a local minimum. Various parameters and options are provided to control the heuristic search.
NOTE: The Kinetica instance being accessed must be running a CUDA (GPU-based) build to service this request.
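The iterative refinement described above can be sketched in plain C++ on the CPU. This is a generic Lloyd-style illustration of how k, a tolerance, and an iteration cap interact; it is not Kinetica's implementation, which runs server-side on the GPU. The names `Point`, `squaredDistance`, and `refineCenters` are hypothetical, and the stopping rule (largest center movement falls below the tolerance) is one assumed reading of the tolerance parameter. Starting centers are supplied by the caller here rather than chosen at random.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squaredDistance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Refines `centers` in place and returns the number of passes performed.
// Stops once no center moves by `tolerance` or more (assumed stopping rule),
// or after `maxIters` passes, whichever comes first.
int refineCenters(const std::vector<Point>& data, std::vector<Point>& centers,
                  double tolerance, int maxIters) {
    assert(!data.empty() && !centers.empty());
    const std::size_t dims = data.front().size();
    int iter = 0;
    while (iter < maxIters) {
        ++iter;
        std::vector<Point> sums(centers.size(), Point(dims, 0.0));
        std::vector<std::size_t> counts(centers.size(), 0);

        // Assignment step: add each point to the running sum of its nearest center.
        for (const Point& p : data) {
            std::size_t best = 0;
            double bestDist = std::numeric_limits<double>::max();
            for (std::size_t c = 0; c < centers.size(); ++c) {
                const double dist = squaredDistance(p, centers[c]);
                if (dist < bestDist) {
                    bestDist = dist;
                    best = c;
                }
            }
            for (std::size_t d = 0; d < dims; ++d) sums[best][d] += p[d];
            ++counts[best];
        }

        // Update step: move each center to the mean of its assigned points.
        double maxShift = 0.0;
        for (std::size_t c = 0; c < centers.size(); ++c) {
            if (counts[c] == 0) continue;  // an empty cluster keeps its center
            Point updated(dims);
            for (std::size_t d = 0; d < dims; ++d) {
                updated[d] = sums[c][d] / static_cast<double>(counts[c]);
            }
            maxShift = std::max(maxShift, std::sqrt(squaredDistance(updated, centers[c])));
            centers[c] = updated;
        }
        if (maxShift < tolerance) break;
    }
    return iter;
}
```

Because the result depends on the starting centers, a run can settle into a local minimum; repeating the refinement from several different starting sets and keeping the best result is the usual remedy.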
Definition at line 29 of file aggregate_k_means.h.
Constructor & Destructor Documentation
◆ AggregateKMeansRequest() [1/2]
| inline |
Constructs an AggregateKMeansRequest object with default parameters.
Definition at line 34 of file aggregate_k_means.h.
◆ AggregateKMeansRequest() [2/2]
| inline |
Constructs an AggregateKMeansRequest object with the specified parameters.
| [in] | tableName_ | Name of the table on which the operation will be performed. Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules. |
| [in] | columnNames_ | List of column names on which the operation will be performed. If n columns are provided then each of the k result points will have n dimensions corresponding to the n columns. |
| [in] | k_ | The number of mean points to be determined by the algorithm. |
| [in] | tolerance_ | Stop iterating when the distance between successive points is less than the given tolerance. |
| [in] | options_ | Optional parameters. |
Definition at line 162 of file aggregate_k_means.h.
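As a rough illustration of how the five constructor arguments fit together, the sketch below fills a stand-in struct that mirrors the documented fields. `KMeansRequestSketch` and `makeRequest` are hypothetical names used only so the example compiles without the library. With the real library, one would presumably include `<gpudb/protocol/aggregate_k_means.h>` and pass the same values to the `gpudb::AggregateKMeansRequest` constructor instead.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in mirroring the documented public attributes of
// gpudb::AggregateKMeansRequest; it is not the library type itself.
struct KMeansRequestSketch {
    std::string tableName;
    std::vector<std::string> columnNames;
    std::int32_t k;
    double tolerance;
    std::map<std::string, std::string> options;
};

// Builds a request that clusters the given columns into k groups.
// With n columns, each of the k result points has n dimensions.
KMeansRequestSketch makeRequest(const std::string& table,
                                const std::vector<std::string>& columns,
                                std::int32_t k, double tolerance) {
    assert(k > 0 && !columns.empty());
    // An empty options map matches the documented default.
    return KMeansRequestSketch{table, columns, k, tolerance, {}};
}
```

For example, `makeRequest("example_schema.points", {"x", "y"}, 3, 1e-4)` describes a request for three two-dimensional mean points; the table and column names are placeholders.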
Member Data Documentation
◆ columnNames
| std::vector<std::string> gpudb::AggregateKMeansRequest::columnNames |
List of column names on which the operation will be performed.
If n columns are provided then each of the k result points will have n dimensions corresponding to the n columns.
Definition at line 184 of file aggregate_k_means.h.
◆ k
| int32_t gpudb::AggregateKMeansRequest::k |
The number of mean points to be determined by the algorithm.
Definition at line 189 of file aggregate_k_means.h.
◆ options
| std::map<std::string, std::string> gpudb::AggregateKMeansRequest::options |
Optional parameters.
- aggregate_k_means_whiten: When set to 1, each of the columns is first normalized by its standard deviation. The default is not to whiten.
- aggregate_k_means_max_iters: Number of times to try to hit the tolerance limit before giving up. The default is 10.
- aggregate_k_means_num_tries: Number of times to run the k-means algorithm, each with different randomly selected starting points. This helps avoid settling into a poor local minimum. The default is 1.
- aggregate_k_means_create_temp_table: If true, a unique temporary table name will be generated in the sys_temp schema and used in place of result_table. If result_table_persist is false (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in qualified_result_table_name. Supported values: aggregate_k_means_true, aggregate_k_means_false. The default value is aggregate_k_means_false.
- aggregate_k_means_result_table: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. If this option is specified, the results are not returned in the response.
- aggregate_k_means_result_table_persist: If true, then the result table specified in result_table will be persisted and will not expire unless a ttl is specified. If false, then the result table will be an in-memory table and will expire unless a ttl is specified otherwise. Supported values: aggregate_k_means_true, aggregate_k_means_false. The default value is aggregate_k_means_false.
- aggregate_k_means_ttl: Sets the TTL of the table specified in result_table.
The default value is an empty map.
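Since every option is passed as a string-to-string pair, numeric settings have to be converted to text before they go into the map. The helper below is a minimal sketch of that step. The `sketch` namespace, its constants, and `makeKMeansOptions` are hypothetical, and the key strings ("whiten", "max_iters", "num_tries") are an assumption about the values behind the names listed above; with the real library, prefer whatever constants its headers provide over hard-coded strings.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {
// Assumed key strings; these are placeholders, not taken from the library headers.
const std::string kWhiten = "whiten";
const std::string kMaxIters = "max_iters";
const std::string kNumTries = "num_tries";
}  // namespace sketch

// Builds an options map for a k-means request.
// Options left out of the map fall back to the documented server defaults.
std::map<std::string, std::string> makeKMeansOptions(bool whiten, int maxIters,
                                                     int numTries) {
    assert(maxIters > 0 && numTries > 0);
    std::map<std::string, std::string> options;
    if (whiten) {
        options[sketch::kWhiten] = "1";  // documented as: "when set to 1"
    }
    options[sketch::kMaxIters] = std::to_string(maxIters);
    options[sketch::kNumTries] = std::to_string(numTries);
    return options;
}
```

Raising the number of tries trades extra computation for a better chance of escaping a poor local minimum, while the iteration cap bounds the cost of each individual try.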
Definition at line 268 of file aggregate_k_means.h.
◆ tableName
| std::string gpudb::AggregateKMeansRequest::tableName |
Name of the table on which the operation will be performed.
Must be an existing table, in [schema_name.]table_name format, using standard name resolution rules.
Definition at line 177 of file aggregate_k_means.h.
◆ tolerance
| double gpudb::AggregateKMeansRequest::tolerance |
Stop iterating when the distance between successive points is less than the given tolerance.
Definition at line 195 of file aggregate_k_means.h.
The documentation for this struct was generated from the following file:
- gpudb/protocol/aggregate_k_means.h