An aggregate unique in-memory table is created by specifying the result_table option while using the /aggregate/unique endpoint. Aggregate unique in-memory tables are not persisted by default but can be persisted (like a table) using the result_table_persist option. Aggregate unique in-memory tables can be created from any table or view and will create a unique_result type schema specific to the column used in the call to the endpoint.
An aggregate unique in-memory table is replicated by default, but it can be sharded if the shard key is included in the column_name parameter. If the result_table_force_replicated option is set to true, the aggregate unique in-memory table will be replicated regardless of whether the source table or view is sharded.
Several limitations are discussed in further detail in the Limitations section.
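As a minimal sketch of the replication override described above, the options map for a forced-replicated result table might be assembled as follows. The helper name unique_options is hypothetical and not part of the Kinetica API; only the option keys result_table and result_table_force_replicated come from the text above. Passing the value as the string "true" is an assumption that mirrors how result_table_persist is set in the examples in this section. The endpoint call itself is left as a comment because it needs a live connection.

```python
def unique_options(result_table, force_replicated=False):
    """Build an options map for /aggregate/unique (hypothetical helper)."""
    options = {"result_table": result_table}
    if force_replicated:
        # Assumption: option values are strings, as with "result_table_persist": "true"
        options["result_table_force_replicated"] = "true"
    return options


options = unique_options("vendor_unique", force_replicated=True)

# With a connected handle named gpudb, the map would be passed along as:
# gpudb.aggregate_unique(table_name="trip_data", column_name="vendor_id",
#                        offset=0, limit=-9999, encoding="json", options=options)
```

Leaving force_replicated at its default omits the key entirely, so the result table falls back to the default behavior described above.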
Creating an aggregate unique in-memory table using the /aggregate/unique endpoint requires six parameters: table_name, column_name, offset, limit, encoding, and options.
Given source table trip_data, an aggregate unique in-memory table can be created in Python like so:
gpudb.aggregate_unique(
    table_name = "trip_data",
    column_name = "vendor_id",
    offset = 0,
    limit = -9999,
    encoding = "json",
    options = {
        "result_table": "vendor_unique"
    }
)
Creating a persisted aggregate unique table in Python:
gpudb.aggregate_unique(
    table_name = "trip_data",
    column_name = "pickup_datetime",
    offset = 0,
    limit = -9999,
    encoding = "json",
    options = {
        "result_table": "unique_pickup_datetime",
        "result_table_persist": "true"
    }
)
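The two calls above differ only in the column, the result table name, and the result_table_persist option. A small sketch that makes this explicit is shown below; unique_request is a hypothetical helper (not part of the Kinetica Python API) that assembles the same six keyword arguments used in the examples, with the offset, limit, and encoding values copied from them.

```python
def unique_request(table_name, column_name, result_table, persist=False):
    """Assemble the six /aggregate/unique arguments (hypothetical helper)."""
    options = {"result_table": result_table}
    if persist:
        # Same option and value as in the persisted example above
        options["result_table_persist"] = "true"
    return {
        "table_name": table_name,
        "column_name": column_name,
        "offset": 0,
        "limit": -9999,
        "encoding": "json",
        "options": options,
    }


in_memory = unique_request("trip_data", "vendor_id", "vendor_unique")
persisted = unique_request(
    "trip_data", "pickup_datetime", "unique_pickup_datetime", persist=True
)

# With a connected handle named gpudb, either request could be issued as:
# gpudb.aggregate_unique(**persisted)
```

Keeping the request as a plain dictionary means it can be inspected or logged before it is sent to the endpoint.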
There are a few limitations and cautions when creating and using aggregate unique in-memory tables:
(column_name / 2) as halved_column