Calculates unique combinations (groups) of values for the given columns in a
given table or view and computes aggregates on each unique combination. This is
somewhat analogous to an SQL-style SELECT…GROUP BY.

For aggregation details and examples, see Aggregation. For limitations, see
Aggregation Limitations.

Any column(s) can be grouped on, and all column types except
unrestricted-length strings may be used for computing applicable aggregates.

The results can be paged via the input parameters offset and limit. For
example, to get the 10 groups with the largest counts, the inputs would be:
limit=10, options={"sort_order":"descending", "sort_by":"value"}.

Input parameter options can be used to customize the behavior of this call,
e.g. filtering or sorting the results.

To group by columns 'x' and 'y' and compute the number of objects within each
group, use: column_names=['x','y','count(*)'].

To also compute the sum of 'z' over each group, use:
column_names=['x','y','count(*)','sum(z)'].

Available aggregation functions are: count(*), sum, min, max, avg, mean,
stddev, stddev_pop, stddev_samp, var, var_pop, var_samp, arg_min, arg_max and
count_distinct.

Available grouping functions are Rollup, Cube, and Grouping Sets.

This service also provides support for Pivot operations.

Filtering on aggregates is supported via expressions using aggregation
functions supplied to having.

The response is returned as a dynamic schema. For details see:
dynamic schemas documentation.

If a result_table name is specified in the input parameter options, the
results are stored in a new table with that name and no results are returned
in the response. Both the table name and resulting column names must adhere to
standard naming conventions; column/aggregation expressions will need to be
aliased. If the source table’s shard key is used as the grouping column(s) and
all result records are selected (input parameter offset is 0 and input
parameter limit is -9999), the result table will be sharded; in all other
cases it will be replicated. Sorting will function properly only if the result
table is replicated or if there is only one processing node, and should not be
relied upon in other cases. The result_table option is not available when any
of the columns in input parameter column_names is an unrestricted-length string.
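As a rough illustration of the inputs described above, the following Python sketch assembles them into a plain dictionary. The helper name build_group_by_request, the table_name key and the table name example_table are assumptions made for this example; only column_names, offset, limit, options, sort_order and sort_by come from the description above. The sketch builds and prints the structure and does not contact a server.

```python
import json

END_OF_SET = -9999  # documented sentinel: return as many records as the server allows


def build_group_by_request(table_name, column_names, offset=0,
                           limit=END_OF_SET, options=None):
    """Assemble group-by inputs as a plain dict (hypothetical helper)."""
    if offset < 0:
        # The documented minimum for offset is 0.
        raise ValueError("offset must be 0 or greater")
    return {
        "table_name": table_name,            # assumed key name, not from this page
        "column_names": list(column_names),  # grouping columns and aggregates
        "offset": offset,
        "limit": limit,
        "options": dict(options or {}),
    }


# Group by x and y, count rows and sum z per group, and ask for the
# 10 groups with the largest counts (count(*) is the first aggregate).
request = build_group_by_request(
    "example_table",
    ["x", "y", "count(*)", "sum(z)"],
    limit=10,
    options={"sort_order": "descending", "sort_by": "value"},
)
print(json.dumps(request, indent=2))
```

Keeping the aggregate of interest first in column_names matters here, because sorting by value orders on the first aggregate before any others.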
A non-negative integer indicating the number of initial results to skip (this can be useful for paging through the results). The default value is 0. The minimum allowed value is 0. The maximum allowed value is MAX_INT.
A positive integer indicating the maximum number of results to be returned, or END_OF_SET (-9999) to indicate that the maximum number of results allowed by the server should be returned. The number of records returned will never exceed the server’s own limit, defined by the max_get_records_size parameter in the server configuration. Use output parameter has_more_records to see if more records exist in the result to be fetched, and input parameter offset and input parameter limit to request subsequent pages of results. The default value is -9999.
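The paging behaviour described for offset, limit and has_more_records can be sketched as a simple loop. In this sketch fetch_page is a hypothetical callable standing in for the real call, the records key is an assumed name for the returned rows, and make_fake_endpoint is a small in-memory stand-in for a server that caps each page at its own maximum.

```python
END_OF_SET = -9999  # documented sentinel for "as many as the server allows"


def fetch_all_groups(fetch_page, page_size=100):
    """Collect all records by advancing offset until has_more_records is false."""
    records = []
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=page_size)
        records.extend(page["records"])
        # Stop when the server reports no more data (or returns an empty page).
        if not page["has_more_records"] or not page["records"]:
            break
        # Advance by what was actually returned; the server may cap the page.
        offset += len(page["records"])
    return records


def make_fake_endpoint(all_rows, server_max=3):
    """Build an in-memory stand-in that never returns more than server_max rows."""
    def fetch_page(offset, limit):
        count = server_max if limit == END_OF_SET else min(limit, server_max)
        chunk = all_rows[offset:offset + count]
        return {
            "records": chunk,
            "has_more_records": offset + len(chunk) < len(all_rows),
        }
    return fetch_page


rows = [("a", 5), ("b", 4), ("c", 3), ("d", 2), ("e", 1)]
result = fetch_all_groups(make_fake_endpoint(rows), page_size=2)
print(result)
```

Advancing offset by the number of records actually received, rather than by the requested limit, keeps the loop correct when the server returns fewer records than were asked for.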
If true, a unique temporary table name will be generated in the sys_temp schema and used in place of result_table. If result_table_persist is false (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in qualified_result_table_name. The default value is false. The supported values are true and false.
[DEPRECATED: please specify the containing schema as part of result_table and use /create/schema to create the schema if it does not exist] Name of a schema which is to contain the table specified in result_table. If the schema provided does not exist, it will be automatically created.
[DEPRECATED: use order_by instead] String determining how the results are sorted. The default value is value. The supported values are:
key: Indicates that the returned values should be sorted by key, which corresponds to the grouping columns. If you have multiple grouping columns (and are sorting by key), it will first sort by the first grouping column, then the second grouping column, etc.
value: Indicates that the returned values should be sorted by value, which corresponds to the aggregates. If you have multiple aggregates (and are sorting by value), it will first sort by the first aggregate, then the second aggregate, etc.
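The difference between sorting by key and sorting by value can be emulated locally. The following Python sketch uses made-up rows and a hypothetical sort_groups helper; it only mirrors the ordering rule described above (grouping columns first to last, or aggregates first to last) and makes no claim about how the server breaks ties.

```python
# Each row holds the grouping columns (x, y) followed by the
# aggregates (count(*), sum(z)). The data is invented for illustration.
rows = [
    ("b", 1, 9, 5.0),
    ("a", 2, 7, 30.0),
    ("a", 1, 7, 10.0),
]
NUM_GROUPING_COLUMNS = 2


def sort_groups(rows, sort_by="value", descending=False):
    """Order rows by grouping columns ("key") or by aggregates ("value")."""
    if sort_by == "key":
        def sort_key(row):
            return row[:NUM_GROUPING_COLUMNS]   # first grouping column, then second
    elif sort_by == "value":
        def sort_key(row):
            return row[NUM_GROUPING_COLUMNS:]   # first aggregate, then second
    else:
        raise ValueError("sort_by must be 'key' or 'value'")
    return sorted(rows, key=sort_key, reverse=descending)


by_key = sort_groups(rows, sort_by="key")
by_value_desc = sort_groups(rows, sort_by="value", descending=True)
print(by_key)
print(by_value_desc)
```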
The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. Column names (group-by and aggregate fields) need to be given aliases, e.g. ["FChar256 as fchar256", "sum(FDouble) as sfd"]. If present, no results are returned in the response. This option is not available if one of the grouping attributes is an unrestricted string (i.e., not charN) type.
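Because every entry needs an alias when a result table is requested, a client-side check before sending the request can catch omissions early. The sketch below is one possible check, not part of the service: the helper names are hypothetical, and the alias pattern (a letter or underscore followed by letters, digits or underscores) is an assumption that may be stricter or looser than the actual naming criteria.

```python
import re

# Matches "<expression> as <alias>"; the alias shape is an assumption.
_ALIAS_PATTERN = re.compile(
    r"^\s*(?P<expr>.+?)\s+as\s+(?P<alias>[A-Za-z_][A-Za-z0-9_]*)\s*$",
    re.IGNORECASE,
)


def split_alias(column_expression):
    """Return (expression, alias), with alias None when no alias is present."""
    match = _ALIAS_PATTERN.match(column_expression)
    if match is None:
        return column_expression.strip(), None
    return match.group("expr"), match.group("alias")


def missing_aliases(column_names):
    """List the entries of column_names that carry no alias."""
    return [name for name in column_names if split_alias(name)[1] is None]


aliased = ["FChar256 as fchar256", "sum(FDouble) as sfd"]
unaliased = ["x", "count(*)"]
print(missing_aliases(aliased))
print(missing_aliases(unaliased))
```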
If true, then the result table specified in result_table will be persisted and will not expire unless a ttl is specified. If false, then the result table will be an in-memory table and will expire unless a ttl is specified otherwise. The default value is false. The supported values are true and false.
Force the result table to be replicated (ignores any sharding). Must be used in combination with the result_table option. The default value is false. The supported values are true and false.
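The rule for whether a result table ends up sharded or replicated can be written down as a small decision function. This is only a reading of the rule as stated on this page, under two assumptions: that "the shard key is used as the grouping column(s)" means the grouping columns are exactly the shard key columns, and that no other option (such as an explicit shard key for the result table) is in play. The function name and argument names are hypothetical.

```python
END_OF_SET = -9999  # documented sentinel meaning "all records"


def result_table_distribution(grouping_columns, source_shard_key,
                              offset=0, limit=END_OF_SET,
                              force_replicated=False):
    """Return "sharded" or "replicated" following the rule described above."""
    if force_replicated:
        # Forcing replication ignores any sharding.
        return "replicated"
    # Assumption: the grouping columns must match the shard key exactly.
    groups_on_shard_key = (
        bool(source_shard_key)
        and set(grouping_columns) == set(source_shard_key)
    )
    selects_everything = offset == 0 and limit == END_OF_SET
    if groups_on_shard_key and selects_everything:
        return "sharded"
    return "replicated"


print(result_table_distribution(["x"], ["x"]))
print(result_table_distribution(["x"], ["x"], limit=10))
print(result_table_distribution(["x"], ["x"], force_replicated=True))
```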
If true, then set a primary key for the result table. Must be used in combination with the result_table option. The default value is false. The supported values are true and false.
If true, then set a soft primary key for the result table. Must be used in combination with the result_table option. The default value is false. The supported values are true and false.
Indicates the target maximum data size for each column in a chunk to be used for the result table. Must be used in combination with the result_table option.
Indicates the target maximum data size for all columns in a chunk to be used for the result table. Must be used in combination with the result_table option.
Comma-separated list of partition keys, which are the columns or column expressions by which records will be assigned to partitions defined by partition_definitions.
If true, a new partition will be created for values which don’t fall into an existing partition. Currently only supported for list partitions. The default value is false. The supported values are true and false.
Customize the grouping attribute sets to compute the aggregates. These sets can include ROLLUP or CUBE operators. The attribute sets should be enclosed in parentheses and can include composite attributes. All attributes specified in the grouping sets must be present in the group-by attributes.
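To make the relationship between ROLLUP, CUBE and explicit grouping sets concrete, the sketch below expands each operator into the list of attribute sets it stands for. It follows the standard SQL meaning of the two operators and assumes the service uses the same meaning; the function names are hypothetical and nothing here is sent to a server.

```python
from itertools import combinations


def rollup(columns):
    """Expand ROLLUP(c1, ..., cn) into its prefixes, longest first, ending with ()."""
    cols = tuple(columns)
    return [cols[:n] for n in range(len(cols), -1, -1)]


def cube(columns):
    """Expand CUBE(c1, ..., cn) into every subset of the columns, largest first."""
    cols = tuple(columns)
    return [
        subset
        for n in range(len(cols), -1, -1)
        for subset in combinations(cols, n)
    ]


print(rollup(["x", "y"]))
print(cube(["x", "y"]))
```

Under this reading, a ROLLUP over n attributes yields n + 1 grouping sets and a CUBE yields 2^n, with the empty set corresponding to the grand total.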
Comma-separated list of the columns to be sharded on, e.g. 'column1, column2'. The columns specified must be present in input parameter column_names. If any alias is given for any column name, the alias must be used, rather than the original column name. The default value is an empty string.
The Kinetica server embeds the endpoint response inside a standard response structure which contains status information and the actual response to the query. Here is a description of the various fields of the wrapper:
Total/Filtered number of records. This may be an over-estimate if a limit was applied and there are additional records (i.e., when output parameter has_more_records is true).