GPUdb C++ API  Version 5.2.0.0
gpudb::AggregateKMeansRequest Struct Reference

A set of input parameters for aggregateKMeans(const AggregateKMeansRequest&) const.

#include <gpudb/protocol/aggregate_k_means.h>

Public Member Functions

 AggregateKMeansRequest ()
 Constructs an AggregateKMeansRequest object with default parameter values.
 
 AggregateKMeansRequest (const std::string &tableName, const std::vector< std::string > &columnNames, const int32_t k, const double tolerance, const std::map< std::string, std::string > &options)
 Constructs an AggregateKMeansRequest object with the specified parameters.
 

Public Attributes

std::string tableName
 
std::vector< std::string > columnNames
 
int32_t k
 
double tolerance
 
std::map< std::string, std::string > options
 

Detailed Description

A set of input parameters for aggregateKMeans(const AggregateKMeansRequest&) const.

This endpoint runs the k-means algorithm, a heuristic that attempts to perform k-means clustering. An ideal k-means clustering selects k points such that the sum of the squared distances from each member of the set to the nearest of the k points is minimized. The k-means algorithm, however, does not necessarily produce such an ideal clustering. It begins with a randomly selected set of k points, refines their locations iteratively, and settles into a local minimum. Several parameters and options are provided to control the heuristic search.
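The iteration described above can be sketched in plain C++ as follows. This is only an illustrative Lloyd-style loop, not the GPUdb server implementation: the names kMeansSketch, nearestCenter and squaredDistance are hypothetical, the seed parameter is added for reproducibility, and the stopping rule assumes that "tolerance" refers to the largest distance any centre moves between two successive iterations.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <random>
#include <vector>

using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squaredDistance(const Point& a, const Point& b) {
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Index of the centre closest to p.
std::size_t nearestCenter(const Point& p, const std::vector<Point>& centers) {
    std::size_t best = 0;
    double bestDist = std::numeric_limits<double>::max();
    for (std::size_t c = 0; c < centers.size(); ++c) {
        const double dist = squaredDistance(p, centers[c]);
        if (dist < bestDist) {
            bestDist = dist;
            best = c;
        }
    }
    return best;
}

// Illustrative Lloyd-style k-means (hypothetical helper, not GPUdb code).
std::vector<Point> kMeansSketch(const std::vector<Point>& points, std::size_t k,
                                double tolerance, int maxIters, unsigned seed) {
    assert(k > 0 && points.size() >= k);
    const std::size_t dims = points[0].size();

    // Start from k distinct, randomly selected input points.
    std::vector<std::size_t> order(points.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);
    std::vector<Point> centers;
    for (std::size_t i = 0; i < k; ++i) {
        centers.push_back(points[order[i]]);
    }

    for (int iter = 0; iter < maxIters; ++iter) {
        // Assignment step: accumulate each point into its nearest centre.
        std::vector<Point> sums(k, Point(dims, 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (const Point& p : points) {
            const std::size_t c = nearestCenter(p, centers);
            for (std::size_t d = 0; d < dims; ++d) {
                sums[c][d] += p[d];
            }
            ++counts[c];
        }

        // Update step: move each centre to the mean of its members.
        double maxShift = 0.0;
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0) {
                continue;  // an empty cluster keeps its previous centre
            }
            Point updated(dims);
            for (std::size_t d = 0; d < dims; ++d) {
                updated[d] = sums[c][d] / static_cast<double>(counts[c]);
            }
            maxShift = std::max(maxShift,
                                std::sqrt(squaredDistance(centers[c], updated)));
            centers[c] = updated;
        }

        // Assumed stopping rule: no centre moved by the tolerance or more.
        if (maxShift < tolerance) {
            break;
        }
    }
    return centers;
}
```

Because the starting points are random, a different seed can lead to a different local minimum on less well separated data, which is the reason for running the algorithm several times and keeping the best result.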

Definition at line 26 of file aggregate_k_means.h.

Constructor & Destructor Documentation

gpudb::AggregateKMeansRequest::AggregateKMeansRequest ( )
inline

Constructs an AggregateKMeansRequest object with default parameter values.

Definition at line 33 of file aggregate_k_means.h.

gpudb::AggregateKMeansRequest::AggregateKMeansRequest ( const std::string &  tableName,
const std::vector< std::string > &  columnNames,
const int32_t  k,
const double  tolerance,
const std::map< std::string, std::string > &  options 
)
inline

Constructs an AggregateKMeansRequest object with the specified parameters.

Parameters
[in]  tableName    Name of the table on which the operation will be performed. Must be a valid table or collection in GPUdb.
[in]  columnNames  List of column names on which the operation will be performed. If n columns are provided, each of the k result points will have n dimensions corresponding to the n columns.
[in]  k            The number of mean points to be determined by the algorithm.
[in]  tolerance    Stop iterating when the distance between successive points is less than the given tolerance.
[in]  options      Optional parameters.
  • whiten: When set to 1, each of the columns is first normalized by its standard deviation. Default is not to whiten.
  • max_iters: Number of times to try to hit the tolerance limit before giving up. Default is 10.
  • num_tries: Number of times to run the k-means algorithm, each with a different randomly selected set of starting points. Helps avoid poor local minima. Default is 1.
Default value is an empty std::map.
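
As a rough illustration of how the parameters above fit together, the following sketch fills in a request with all three options set. To keep it compilable without the GPUdb headers it uses a stand-in struct, AggregateKMeansRequestSketch, that only mirrors the members and constructor documented on this page; in real code one would include <gpudb/protocol/aggregate_k_means.h> and use gpudb::AggregateKMeansRequest instead. The table name "my_table", the column names "x" and "y", the helper makeExampleRequest and the chosen option values are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Stand-in that mirrors the documented members and constructor.
// It is NOT the GPUdb type; it exists only so this sketch compiles alone.
struct AggregateKMeansRequestSketch {
    AggregateKMeansRequestSketch(const std::string& tableName_,
                                 const std::vector<std::string>& columnNames_,
                                 const int32_t k_,
                                 const double tolerance_,
                                 const std::map<std::string, std::string>& options_)
        : tableName(tableName_),
          columnNames(columnNames_),
          k(k_),
          tolerance(tolerance_),
          options(options_) {}

    std::string tableName;
    std::vector<std::string> columnNames;
    int32_t k;
    double tolerance;
    std::map<std::string, std::string> options;
};

// Hypothetical helper: builds a request for k clusters over two columns.
AggregateKMeansRequestSketch makeExampleRequest(const int32_t k) {
    assert(k > 0);  // at least one mean point must be requested

    // Option values are passed as strings because options is a string map.
    std::map<std::string, std::string> options;
    options["whiten"] = "1";      // normalize each column before clustering
    options["max_iters"] = "25";  // raise the iteration limit above the default of 10
    options["num_tries"] = "3";   // three runs with different random starts

    // Two columns, so each of the k result points has two dimensions.
    const std::vector<std::string> columns = {"x", "y"};

    return AggregateKMeansRequestSketch("my_table", columns, k, 1e-3, options);
}
```

Any option left out of the map keeps its documented default, so passing an empty std::map requests no whitening, 10 iterations and a single try.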

Definition at line 75 of file aggregate_k_means.h.

Member Data Documentation

std::vector<std::string> gpudb::AggregateKMeansRequest::columnNames

Definition at line 85 of file aggregate_k_means.h.

int32_t gpudb::AggregateKMeansRequest::k

Definition at line 86 of file aggregate_k_means.h.

std::map<std::string, std::string> gpudb::AggregateKMeansRequest::options

Definition at line 88 of file aggregate_k_means.h.

std::string gpudb::AggregateKMeansRequest::tableName

Definition at line 84 of file aggregate_k_means.h.

double gpudb::AggregateKMeansRequest::tolerance

Definition at line 87 of file aggregate_k_means.h.


The documentation for this struct was generated from the following file: