Cube

The cube operation takes n number of columns and produces 2ⁿ aggregates.

For example, given a cube on columns A, B, & C, it computes the requested aggregates for the following combinations of columns:

{ABC} - unique A B C triplets
{AB } - unique A B pairs
{A C} - unique A C pairs
{ BC} - unique B C pairs
{A } - unique A values
{ B } - unique B values
{ C} - unique C values
{ } - all values

The cube operation is an aggregate function that can be invoked natively in the options map of the /aggregate/groupby endpoint. It is also available via the SQL CUBE function. Be mindful that cube is an expensive operation.

Grouping and Nulls

If using the cube operation with a column that contains null values, it may not be apparent which aggregations pertain to the column's null values and which pertain to all the column's values, as both will be represented by null in the result set. To avoid this confusion, it's recommended to use the GROUPING() function to make a distinction between null values in the data and null grouping values generated by the cube operation. There will be an example of this below.

Example

The following request will aggregate the average opening stock price for these groups:

Each market sector & stock symbol pair
Each market sector
Each stock symbol
All sectors and symbols

SQL
Python

SQL
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
SELECT
    CASE
        WHEN (GROUPING(Sector) = 1) THEN '<ALL SECTORS>'
        ELSE NVL(Sector, '<UNKNOWN SECTOR>')
    END AS SectorGroup,
    CASE
        WHEN (GROUPING(Symbol) = 1) THEN '<ALL SYMBOLS>'
        ELSE NVL(Symbol, '<UNKNOWN SYMBOL>')
    END AS SymbolGroup,
    DECIMAL(AVG("Open"), 7, 2) AS AvgOpen
FROM demo.Stocks
GROUP BY CUBE(Sector, Symbol)
ORDER BY SectorGroup, SymbolGroup
Python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
kinetica.aggregate_group_by(
    table_name = "demo.stocks",
    column_names = [
        "Sector",
        "Symbol",
        "IF(Grouping(Sector) = 0, Sector, '<ALL SECTORS>') as SectorGroup",
        "IF(Grouping(Symbol) = 0, Symbol, '<ALL SYMBOLS>') as SymbolGroup",
        "AVG(Open) as AvgOpen"
    ],
    options = {"cube": "(Sector, Symbol)"}
)

Limitations

The maximum number of dimensions that can be computed is 256, the maximum number of columns that can be aggregated is 8
The column(s) used in the cube operation must be listed as columns in the column_names parameter
As the cube operation is executed via /aggregate/groupby, all Aggregation Limitations also apply