Aggregate Unique In-Memory Tables¶

An aggregate unique in-memory table is created by specifying the result_table option while using the /aggregate/unique endpoint. Aggregate unique in-memory tables are not persisted by default but can be persisted (like a table) using the result_table_persist option. Aggregate unique in-memory tables can be created from any table or view and will create a unique_result type schema specific to the column used in the call to the endpoint.

An aggregate unique in-memory table is replicated by default, but it can be sharded if the shard key is included in the column_names parameter. If the result_table_force_replicated option is set to true, the aggregate unique in-memory table will be replicated regardless if the source table or view is sharded or not.

Several limitations are discussed in further detail in the Limitations section.

Creating an Aggregate Unique In-Memory Table¶

To create an aggregate unique in-memory table using the /aggregate/unique endpoint requires six parameters:

the name of the data set to create the aggregate unique in-memory table from
the columns from the source data set to use in the new in-memory table
the number of initial results to skip in the source data set
the number of records to limit the in-memory table to
the type of encoding to use
the name of the in-memory table to store the results in

Examples¶

Given source table trip_data, an aggregate unique in-memory table can be created in Python like so:

gpudb.aggregate_unique(
  table_name = "trip_data",
  column_name = "vendor_id",
  offset = 0,
  limit = -9999,
  encoding = "json",
  options = {
    "result_table":"vendor_unique"
  }
)

Creating a persisted aggregate unique table in Python:

gpudb.aggregate_unique(
  table_name = "trip_data",
  column_name = "pickup_datetime",
  offset = 0,
  limit = -9999,
  encoding = "json",
  options = {
    "result_table":"unique_pickup_datetime",
    "result_table_persist":"true"
  }
)

Limitations and Cautions¶

There are a few limitations and cautions when creating and using aggregate unique in-memory tables:

Creating an aggregate unique in-memory table results in an entirely new data set, so be mindful of the memory usage implications
An aggregate unique in-memory table cannot be created using a non-charN string column type
An aggregate unique in-memory table cannot be created using a store-only column type
If an aggregate unique in-memory table is created and its source data set's rows are updated (or removed), the in-memory table will not be updated to reflect the changes in the source data set
An aggregate unique in-memory table cannot be created from a collection
If using column or aggregate functions in the column_name parameter, an alias must be used for the column name, e.g., (column_name / 2) as halved_column
An aggregate unique in-memory table has no permanence by default because it is not protected and will expire after the default TTL setting

Table Of Contents

Aggregate Unique In-Memory Tables¶

Creating an Aggregate Unique In-Memory Table¶

Examples¶

Limitations and Cautions¶