URL: http://GPUDB_IP_ADDRESS:GPUDB_PORT/aggregate/groupby
Calculates unique combinations (groups) of values for the given columns in a given table/view/collection and computes aggregates on each unique combination. This is somewhat analogous to an SQL-style SELECT...GROUP BY.
Any column(s) can be grouped on, and all column types except unrestricted-length strings may be used for computing applicable aggregates; columns marked as store-only are unable to be used in grouping or aggregation.
The results can be paged via the input parameter offset and input parameter limit parameters. For example, to get 10 groups with the largest counts the inputs would be: limit=10, options={"sort_order":"descending", "sort_by":"value"}.
Input parameter options can be used to customize behavior of this call e.g. filtering or sorting the results.
To group by columns 'x' and 'y' and compute the number of objects within each group, use: column_names=['x','y','count(*)'].
To also compute the sum of 'z' over each group, use: column_names=['x','y','count(*)','sum(z)'].
Available aggregation functions are: count(*), sum, min, max, avg, mean, stddev, stddev_pop, stddev_samp, var, var_pop, var_samp, arg_min, arg_max and count_distinct.
The response is returned as a dynamic schema. For details see: dynamic schemas documentation.
If a result_table name is specified in the input parameter options, the results are stored in a new table with that name--no results are returned in the response. Both the table name and resulting column names must adhere to standard naming conventions; column/aggregation expressions will need to be aliased. If the source table's shard key is used as the grouping column(s), the result table will be sharded, in all other cases it will be replicated. Sorting will properly function only if the result table is replicated or if there is only one processing node and should not be relied upon in other cases. Not available when any of the values of input parameter column_names is an unrestricted-length string.
Input Parameter Description
Name | Type | Description | ||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
table_name | string | Name of the table on which the operation will be performed. Must be an existing table/view/collection. | ||||||||||||||||||||||||||||||
column_names | array of strings | List of one or more column names, expressions, and aggregate expressions. Must include at least one 'grouping' column or expression. If no aggregate is included, count(*) will be computed as a default. | ||||||||||||||||||||||||||||||
offset | long | A positive integer indicating the number of initial results to skip (this can be useful for paging through the results). The minimum allowed value is 0. The maximum allowed value is MAX_INT. | ||||||||||||||||||||||||||||||
limit | long | A positive integer indicating the maximum number of results to be returned Or END_OF_SET (-9999) to indicate that the max number of results should be returned. Default value is 1000. | ||||||||||||||||||||||||||||||
encoding | string | Specifies the encoding for returned records. Default value is 'binary'.
|
||||||||||||||||||||||||||||||
options | map of strings | Optional parameters. Default value is an empty map ( {} ).
|
Output Parameter Description
The GPUdb server embeds the endpoint response inside a standard response structure which contains status information and the actual response to the query. Here is a description of the various fields of the wrapper:
Name | Type | Description | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
status | String | 'OK' or 'ERROR' | ||||||||||||||||||
message | String | Empty if success or an error message | ||||||||||||||||||
data_type | String | 'aggregate_group_by_request' or 'none' in case of an error | ||||||||||||||||||
data | String | Empty string | ||||||||||||||||||
data_str | JSON or String | This embedded JSON represents the result of the /aggregate/groupby endpoint:
Empty string in case of an error. |