An aggregate groupby in-memory table is created by specifying the
result_table option while using the
/aggregate/groupby endpoint. Aggregate groupby
in-memory tables are not persisted by default but can
be persisted (like a table) using the result_table_persist
option. Aggregate groupby in-memory tables can be created from any table or
view (even other aggregate groupby in-memory tables) and will create a
groupby_result type schema specific to the columns used in the call to the
endpoint.
The aggregate groupby in-memory table will be sharded
if the entire shard key is included in the column_names parameter. The
aggregate groupby in-memory table can be replicated if the
result_table_force_replicated option is set to true, regardless if the
source table or view is sharded or not.
Several limitations are discussed in further detail in the Limitations section.
To create an aggregate groupby in-memory table using the /aggregate/groupby endpoint requires six parameters:
Given source table trip_data, an aggregate groupby in-memory table can be
created in Python like so:
gpudb.aggregate_group_by(
  table_name = "trip_data",
  column_names = ["vendor_id", "count(vendor_id) as num_trips", "avg(trip_distance) as avg_trip_dist"],
  offset = 0,
  limit = 10,
  encoding = "json",
  options = {
    "result_table":"vendor_aggregates"
  }
)
Creating a persisted aggregate groupby in-memory table in Python:
gpudb.aggregate_group_by(
  table_name = "pay_data",
  column_names = ["payment_type", "count(payment_type) as num_pay_type", "avg(total_amount) as avg_amt_paid"],
  offset = 0,
  limit = 10,
  encoding = "json",
  options = {
    "result_table":"pay_aggregates",
    "result_table_persist":"true"
  }
)
There are a few limitations and cautions when creating and using aggregate groupby in-memory tables:
sum(store_only_integer))count(column_name) as count_col_records