public class AggregateGroupByRequest extends Object implements org.apache.avro.generic.IndexedRecord
A set of parameters for GPUdb.aggregateGroupBy.
Calculates unique combinations (groups) of values for the given columns in a given table or view and computes aggregates on each unique combination. This is somewhat analogous to an SQL-style SELECT...GROUP BY.
For aggregation details and examples, see Aggregation. For limitations, see Aggregation Limitations.
Any column(s) can be grouped on, and all column types except unrestricted-length strings may be used for computing applicable aggregates; columns marked as store-only are unable to be used in grouping or aggregation.
The results can be paged via the offset and limit parameters. For example, to get 10 groups with the largest counts the inputs would be: limit=10, options={"sort_order":"descending", "sort_by":"value"}.
options can be used to customize behavior of this call, e.g. filtering or sorting the results.
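As a rough illustration of paging with offset and limit, the sketch below keeps requesting pages until one reports that no further records exist. The Page and PageSource types and the in-memory source are hypothetical stand-ins so the example runs without a server; with the real API, each fetch would presumably issue a request with the given offset and limit and read hasMoreRecords from the response.

```java
import java.util.ArrayList;
import java.util.List;

class GroupByPagingSketch {
    /** Hypothetical stand-in for one page of group-by results. */
    static final class Page {
        final List<String> records;
        final boolean hasMoreRecords;

        Page(List<String> records, boolean hasMoreRecords) {
            this.records = records;
            this.hasMoreRecords = hasMoreRecords;
        }
    }

    /** Hypothetical stand-in for sending one request with the given offset and limit. */
    interface PageSource {
        Page fetch(long offset, long limit);
    }

    /** Collects every group by advancing offset until no more records are reported. */
    static List<String> fetchAll(PageSource source, long limit) {
        List<String> all = new ArrayList<>();
        long offset = 0;
        while (true) {
            Page page = source.fetch(offset, limit);
            all.addAll(page.records);
            // Stop on the last page; the empty-page check guards against looping forever.
            if (!page.hasMoreRecords || page.records.isEmpty()) {
                break;
            }
            offset += page.records.size();
        }
        return all;
    }

    /** In-memory source used only to make the sketch runnable. */
    static PageSource inMemory(List<String> groups) {
        return (offset, limit) -> {
            int from = (int) Math.min(offset, groups.size());
            int to = (int) Math.min(offset + limit, groups.size());
            return new Page(new ArrayList<>(groups.subList(from, to)), to < groups.size());
        };
    }

    public static void main(String[] args) {
        List<String> groups = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            groups.add("group-" + i);
        }
        List<String> all = fetchAll(inMemory(groups), 10);
        System.out.println(all.size() + " groups fetched");
    }
}
```

Advancing offset by the number of records actually received, rather than by limit, keeps the loop correct if the server returns fewer records than requested.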
To group by columns 'x' and 'y' and compute the number of objects within each group, use: column_names=['x','y','count(*)'].
To also compute the sum of 'z' over each group, use: column_names=['x','y','count(*)','sum(z)'].
Available aggregation functions are: count(*), sum, min, max, avg, mean, stddev, stddev_pop, stddev_samp, var, var_pop, var_samp, arg_min, arg_max and count_distinct.
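The inputs described above can be assembled with plain Java collections. This is a minimal sketch: the class and method names are hypothetical, and the resulting list and map would presumably be passed as the columnNames and options arguments of the request constructor. The option keys are the ones quoted in the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupByInputsSketch {
    /** Grouping columns come first, followed by the aggregate expressions. */
    static List<String> columnNames(List<String> groupColumns, List<String> aggregates) {
        List<String> names = new ArrayList<>(groupColumns);
        names.addAll(aggregates);
        return names;
    }

    /** Options from the "10 groups with the largest counts" example. */
    static Map<String, String> largestCountsFirst() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("sort_order", "descending");
        options.put("sort_by", "value");
        return options;
    }

    public static void main(String[] args) {
        // Group by x and y; count the objects and sum z within each group.
        List<String> names = columnNames(
                Arrays.asList("x", "y"),
                Arrays.asList("count(*)", "sum(z)"));
        System.out.println(names);
        System.out.println(largestCountsFirst());
    }
}
```

Note that sort_order and sort_by are marked as deprecated in the options reference in favor of order_by.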
Available grouping functions are Rollup, Cube, and Grouping Sets.
This service also provides support for Pivot operations.
Filtering on aggregates is supported via expressions using aggregation functions supplied to HAVING.
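To make the difference between the two filters concrete, the sketch below builds an options map with a pre-aggregation filter, a filter on the aggregated results, and an ordering. The lowercase keys "expression" and "having" are assumptions about the string values behind the EXPRESSION and HAVING constants ("order_by" is the spelling used elsewhere on this page); the helper name and the example expressions are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GroupByFilterOptionsSketch {
    /** Adds only the options that were actually supplied. */
    static Map<String, String> filterOptions(String expression, String having, String orderBy) {
        Map<String, String> options = new LinkedHashMap<>();
        if (expression != null && !expression.isEmpty()) {
            // Applied to the source table before grouping (assumed key name).
            options.put("expression", expression);
        }
        if (having != null && !having.isEmpty()) {
            // Applied to the aggregated results (assumed key name).
            options.put("having", having);
        }
        if (orderBy != null && !orderBy.isEmpty()) {
            options.put("order_by", orderBy);
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = filterOptions("z > 0", "sum(z) > 100", "x asc");
        System.out.println(options);
    }
}
```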
The response is returned as a dynamic schema. For details see: dynamic schemas documentation.
If a RESULT_TABLE name is specified in the options, the results are stored in a new table with that name; no results are returned in the response. Both the table name and resulting column names must adhere to standard naming conventions; column/aggregation expressions will need to be aliased. If the source table's shard key is used as the grouping column(s) and all result records are selected (offset is 0 and limit is -9999), the result table will be sharded; in all other cases it will be replicated. Sorting will function properly only if the result table is replicated or if there is only one processing node, and should not be relied upon in other cases. A result table is not available when any of the values of columnNames is an unrestricted-length string.
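The sharding rule and the aliasing requirement for result tables can be written down as two small helpers. This is only a sketch of the rule as stated above, not of server behavior: the helper names are hypothetical, and options such as RESULT_TABLE_FORCE_REPLICATED or SHARD_KEY are deliberately ignored.

```java
class ResultTableRulesSketch {
    /** Value documented for "all results" (END_OF_SET). */
    static final long END_OF_SET = -9999L;

    /**
     * Sharded only when grouping on the source table's shard key and
     * selecting all result records; replicated in every other case.
     */
    static boolean resultTableIsSharded(boolean groupedOnShardKey, long offset, long limit) {
        return groupedOnShardKey && offset == 0 && limit == END_OF_SET;
    }

    /** Builds an aliased column or aggregate expression, e.g. "sum(z) as sum_z". */
    static String aliased(String expression, String alias) {
        return expression + " as " + alias;
    }

    public static void main(String[] args) {
        System.out.println(resultTableIsSharded(true, 0, END_OF_SET));
        System.out.println(resultTableIsSharded(true, 0, 10));
        System.out.println(aliased("sum(z)", "sum_z"));
    }
}
```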
Modifier and Type | Class and Description |
---|---|
static class | AggregateGroupByRequest.Encoding: A set of string constants for the AggregateGroupByRequest parameter encoding. |
static class | AggregateGroupByRequest.Options: A set of string constants for the AggregateGroupByRequest parameter options. |
Constructor and Description |
---|
AggregateGroupByRequest(): Constructs an AggregateGroupByRequest object with default parameters. |
AggregateGroupByRequest(String tableName, List<String> columnNames, long offset, long limit, Map<String,String> options): Constructs an AggregateGroupByRequest object with the specified parameters. |
AggregateGroupByRequest(String tableName, List<String> columnNames, long offset, long limit, String encoding, Map<String,String> options): Constructs an AggregateGroupByRequest object with the specified parameters. |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj) |
Object | get(int index): This method supports the Avro framework and is not intended to be called directly by the user. |
static org.apache.avro.Schema | getClassSchema(): This method supports the Avro framework and is not intended to be called directly by the user. |
List<String> | getColumnNames(): List of one or more column names, expressions, and aggregate expressions. |
String | getEncoding(): Specifies the encoding for returned records. |
long | getLimit(): A positive integer indicating the maximum number of results to be returned, or END_OF_SET (-9999) to indicate that the maximum number of results allowed by the server should be returned. |
long | getOffset(): A positive integer indicating the number of initial results to skip (this can be useful for paging through the results). |
Map<String,String> | getOptions(): Optional parameters. |
org.apache.avro.Schema | getSchema(): This method supports the Avro framework and is not intended to be called directly by the user. |
String | getTableName(): Name of an existing table or view on which the operation will be performed, in [schema_name.]table_name format, using standard name resolution rules. |
int | hashCode() |
void | put(int index, Object value): This method supports the Avro framework and is not intended to be called directly by the user. |
AggregateGroupByRequest | setColumnNames(List<String> columnNames): List of one or more column names, expressions, and aggregate expressions. |
AggregateGroupByRequest | setEncoding(String encoding): Specifies the encoding for returned records. |
AggregateGroupByRequest | setLimit(long limit): A positive integer indicating the maximum number of results to be returned, or END_OF_SET (-9999) to indicate that the maximum number of results allowed by the server should be returned. |
AggregateGroupByRequest | setOffset(long offset): A positive integer indicating the number of initial results to skip (this can be useful for paging through the results). |
AggregateGroupByRequest | setOptions(Map<String,String> options): Optional parameters. |
AggregateGroupByRequest | setTableName(String tableName): Name of an existing table or view on which the operation will be performed, in [schema_name.]table_name format, using standard name resolution rules. |
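Each setter in the table above returns AggregateGroupByRequest, which allows calls to be chained. The sketch below shows that shape with a hypothetical stand-in class, so it compiles without the client library; it mirrors only a few of the setters, and its defaults for offset and limit are taken from the parameter documentation on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in that mirrors only the setter shape of the request class. */
class FluentRequestSketch {
    private String tableName = "";
    private List<String> columnNames = new ArrayList<>();
    private long offset = 0;     // documented default
    private long limit = -9999;  // documented default (END_OF_SET)

    FluentRequestSketch setTableName(String tableName) {
        this.tableName = tableName;
        return this; // returning this is what enables chaining
    }

    FluentRequestSketch setColumnNames(List<String> columnNames) {
        this.columnNames = columnNames;
        return this;
    }

    FluentRequestSketch setOffset(long offset) {
        this.offset = offset;
        return this;
    }

    FluentRequestSketch setLimit(long limit) {
        this.limit = limit;
        return this;
    }

    String getTableName() { return tableName; }
    List<String> getColumnNames() { return columnNames; }
    long getOffset() { return offset; }
    long getLimit() { return limit; }

    public static void main(String[] args) {
        // "example.sales" is a made-up table name used only for illustration.
        FluentRequestSketch request = new FluentRequestSketch()
                .setTableName("example.sales")
                .setColumnNames(Arrays.asList("x", "y", "count(*)"))
                .setLimit(10);
        System.out.println(request.getTableName() + " " + request.getColumnNames()
                + " offset=" + request.getOffset() + " limit=" + request.getLimit());
    }
}
```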
public AggregateGroupByRequest()
public AggregateGroupByRequest(String tableName, List<String> columnNames, long offset, long limit, Map<String,String> options)
tableName - Name of an existing table or view on which the operation will be performed, in [schema_name.]table_name format, using standard name resolution rules.
columnNames - List of one or more column names, expressions, and aggregate expressions.
offset - A positive integer indicating the number of initial results to skip (this can be useful for paging through the results). The default value is 0. The minimum allowed value is 0. The maximum allowed value is MAX_INT.
limit - A positive integer indicating the maximum number of results to be returned, or END_OF_SET (-9999) to indicate that the maximum number of results allowed by the server should be returned. The number of records returned will never exceed the server's own limit, defined by the max_get_records_size parameter in the server configuration. Use hasMoreRecords to see if more records exist in the result to be fetched, and offset & limit to request subsequent pages of results. The default value is -9999.
options - Optional parameters.
CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values: TRUE, FALSE. The default value is FALSE.
COLLECTION_NAME: [DEPRECATED--please specify the containing schema as part of RESULT_TABLE and use GPUdb.createSchema to create the schema if non-existent] Name of a schema which is to contain the table specified in RESULT_TABLE. If the schema provided is non-existent, it will be automatically created.
EXPRESSION: Filter expression to apply to the table prior to computing the aggregate group by.
CHUNKED_EXPRESSION_EVALUATION: Evaluate the filter expression during group-by chunk processing. Supported values: TRUE, FALSE. The default value is FALSE.
HAVING: Filter expression to apply to the aggregated results.
SORT_ORDER: [DEPRECATED--use order_by instead] String indicating how the returned values should be sorted - ascending or descending. Supported values:
ASCENDING: Indicates that the returned values should be sorted in ascending order.
DESCENDING: Indicates that the returned values should be sorted in descending order.
The default value is ASCENDING.
SORT_BY: [DEPRECATED--use order_by instead] String determining how the results are sorted. Supported values:
KEY: Indicates that the returned values should be sorted by key, which corresponds to the grouping columns. If you have multiple grouping columns (and are sorting by key), it will first sort by the first grouping column, then the second grouping column, etc.
VALUE: Indicates that the returned values should be sorted by value, which corresponds to the aggregates. If you have multiple aggregates (and are sorting by value), it will first sort by the first aggregate, then the second aggregate, etc.
The default value is VALUE.
ORDER_BY: Comma-separated list of the columns to be sorted by as well as the sort direction, e.g., 'timestamp asc, x desc'. The default value is ''.
STRATEGY_DEFINITION: The tier strategy for the table and its columns.
RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. Column names (group-by and aggregate fields) need to be given aliases, e.g. ["FChar256 as fchar256", "sum(FDouble) as sfd"]. If present, no results are returned in the response. This option is not available if one of the grouping attributes is an unrestricted string (i.e., not charN) type.
RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE_FORCE_REPLICATED: Force the result table to be replicated (ignores any sharding). Must be used in combination with the RESULT_TABLE option. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE_GENERATE_PK: If TRUE, then set a primary key for the result table. Must be used in combination with the RESULT_TABLE option. Supported values: TRUE, FALSE. The default value is FALSE.
TTL: Sets the TTL of the table specified in RESULT_TABLE.
CHUNK_SIZE: Indicates the number of records per chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CHUNK_COLUMN_MAX_MEMORY: Indicates the target maximum data size for each column in a chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CHUNK_MAX_MEMORY: Indicates the target maximum data size for all columns in a chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CREATE_INDEXES: Comma-separated list of columns on which to create indexes on the result table. Must be used in combination with the RESULT_TABLE option.
VIEW_ID: ID of view of which the result table will be a member. The default value is ''.
PIVOT: The pivot column.
PIVOT_VALUES: The value list provided will become the column headers in the output. Should be the values from the pivot_column.
GROUPING_SETS: Customize the grouping attribute sets to compute the aggregates. These sets can include ROLLUP or CUBE operators. The attribute sets should be enclosed in parentheses and can include composite attributes. All attributes specified in the grouping sets must be present in the group-by attributes.
ROLLUP: This option is used to specify the multilevel aggregates.
CUBE: This option is used to specify the multidimensional aggregates.
SHARD_KEY: Comma-separated list of the columns to be sharded on; e.g. 'column1, column2'. The columns specified must be present in columnNames. If any alias is given for any column name, the alias must be used, rather than the original column name. The default value is ''.
The default value is an empty Map.
public AggregateGroupByRequest(String tableName, List<String> columnNames, long offset, long limit, String encoding, Map<String,String> options)
tableName - Name of an existing table or view on which the operation will be performed, in [schema_name.]table_name format, using standard name resolution rules.
columnNames - List of one or more column names, expressions, and aggregate expressions.
offset - A positive integer indicating the number of initial results to skip (this can be useful for paging through the results). The default value is 0. The minimum allowed value is 0. The maximum allowed value is MAX_INT.
limit - A positive integer indicating the maximum number of results to be returned, or END_OF_SET (-9999) to indicate that the maximum number of results allowed by the server should be returned. The number of records returned will never exceed the server's own limit, defined by the max_get_records_size parameter in the server configuration. Use hasMoreRecords to see if more records exist in the result to be fetched, and offset & limit to request subsequent pages of results. The default value is -9999.
encoding - Specifies the encoding for returned records. Supported values:
BINARY: Indicates that the returned records should be binary encoded.
JSON: Indicates that the returned records should be JSON encoded.
The default value is BINARY.
options - Optional parameters.
CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values: TRUE, FALSE. The default value is FALSE.
COLLECTION_NAME: [DEPRECATED--please specify the containing schema as part of RESULT_TABLE and use GPUdb.createSchema to create the schema if non-existent] Name of a schema which is to contain the table specified in RESULT_TABLE. If the schema provided is non-existent, it will be automatically created.
EXPRESSION: Filter expression to apply to the table prior to computing the aggregate group by.
CHUNKED_EXPRESSION_EVALUATION: Evaluate the filter expression during group-by chunk processing. Supported values: TRUE, FALSE. The default value is FALSE.
HAVING: Filter expression to apply to the aggregated results.
SORT_ORDER: [DEPRECATED--use order_by instead] String indicating how the returned values should be sorted - ascending or descending. Supported values:
ASCENDING: Indicates that the returned values should be sorted in ascending order.
DESCENDING: Indicates that the returned values should be sorted in descending order.
The default value is ASCENDING.
SORT_BY: [DEPRECATED--use order_by instead] String determining how the results are sorted. Supported values:
KEY: Indicates that the returned values should be sorted by key, which corresponds to the grouping columns. If you have multiple grouping columns (and are sorting by key), it will first sort by the first grouping column, then the second grouping column, etc.
VALUE: Indicates that the returned values should be sorted by value, which corresponds to the aggregates. If you have multiple aggregates (and are sorting by value), it will first sort by the first aggregate, then the second aggregate, etc.
The default value is VALUE.
ORDER_BY: Comma-separated list of the columns to be sorted by as well as the sort direction, e.g., 'timestamp asc, x desc'. The default value is ''.
STRATEGY_DEFINITION: The tier strategy for the table and its columns.
RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. Column names (group-by and aggregate fields) need to be given aliases, e.g. ["FChar256 as fchar256", "sum(FDouble) as sfd"]. If present, no results are returned in the response. This option is not available if one of the grouping attributes is an unrestricted string (i.e., not charN) type.
RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE_FORCE_REPLICATED: Force the result table to be replicated (ignores any sharding). Must be used in combination with the RESULT_TABLE option. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE_GENERATE_PK: If TRUE, then set a primary key for the result table. Must be used in combination with the RESULT_TABLE option. Supported values: TRUE, FALSE. The default value is FALSE.
TTL: Sets the TTL of the table specified in RESULT_TABLE.
CHUNK_SIZE: Indicates the number of records per chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CHUNK_COLUMN_MAX_MEMORY: Indicates the target maximum data size for each column in a chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CHUNK_MAX_MEMORY: Indicates the target maximum data size for all columns in a chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CREATE_INDEXES: Comma-separated list of columns on which to create indexes on the result table. Must be used in combination with the RESULT_TABLE option.
VIEW_ID: ID of view of which the result table will be a member. The default value is ''.
PIVOT: The pivot column.
PIVOT_VALUES: The value list provided will become the column headers in the output. Should be the values from the pivot_column.
GROUPING_SETS: Customize the grouping attribute sets to compute the aggregates. These sets can include ROLLUP or CUBE operators. The attribute sets should be enclosed in parentheses and can include composite attributes. All attributes specified in the grouping sets must be present in the group-by attributes.
ROLLUP: This option is used to specify the multilevel aggregates.
CUBE: This option is used to specify the multidimensional aggregates.
SHARD_KEY: Comma-separated list of the columns to be sharded on; e.g. 'column1, column2'. The columns specified must be present in columnNames. If any alias is given for any column name, the alias must be used, rather than the original column name. The default value is ''.
The default value is an empty Map.
public static org.apache.avro.Schema getClassSchema()
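public String getTableName()
Returns the current value of tableName.
public AggregateGroupByRequest setTableName(String tableName)
tableName - The new value for tableName.
Returns this to mimic the builder pattern.
public List<String> getColumnNames()
Returns the current value of columnNames.
public AggregateGroupByRequest setColumnNames(List<String> columnNames)
columnNames - The new value for columnNames.
Returns this to mimic the builder pattern.
public long getOffset()
Returns the current value of offset.
public AggregateGroupByRequest setOffset(long offset)
offset - The new value for offset.
Returns this to mimic the builder pattern.
public long getLimit()
A positive integer indicating the maximum number of results to be returned, or END_OF_SET (-9999) to indicate that the maximum number of results allowed by the server should be returned. The number of records returned will never exceed the server's own limit, defined by the max_get_records_size parameter in the server configuration. Use hasMoreRecords to see if more records exist in the result to be fetched, and offset & limit to request subsequent pages of results. The default value is -9999.
Returns the current value of limit.
public AggregateGroupByRequest setLimit(long limit)
A positive integer indicating the maximum number of results to be returned, or END_OF_SET (-9999) to indicate that the maximum number of results allowed by the server should be returned. The number of records returned will never exceed the server's own limit, defined by the max_get_records_size parameter in the server configuration. Use hasMoreRecords to see if more records exist in the result to be fetched, and offset & limit to request subsequent pages of results. The default value is -9999.
limit - The new value for limit.
Returns this to mimic the builder pattern.
public String getEncoding()
Specifies the encoding for returned records. Supported values:
BINARY: Indicates that the returned records should be binary encoded.
JSON: Indicates that the returned records should be JSON encoded.
The default value is BINARY.
Returns the current value of encoding.
public AggregateGroupByRequest setEncoding(String encoding)
Specifies the encoding for returned records. Supported values:
BINARY: Indicates that the returned records should be binary encoded.
JSON: Indicates that the returned records should be JSON encoded.
The default value is BINARY.
encoding - The new value for encoding.
Returns this to mimic the builder pattern.
public Map<String,String> getOptions()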
Optional parameters.
CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values: TRUE, FALSE. The default value is FALSE.
COLLECTION_NAME: [DEPRECATED--please specify the containing schema as part of RESULT_TABLE and use GPUdb.createSchema to create the schema if non-existent] Name of a schema which is to contain the table specified in RESULT_TABLE. If the schema provided is non-existent, it will be automatically created.
EXPRESSION: Filter expression to apply to the table prior to computing the aggregate group by.
CHUNKED_EXPRESSION_EVALUATION: Evaluate the filter expression during group-by chunk processing. Supported values: TRUE, FALSE. The default value is FALSE.
HAVING: Filter expression to apply to the aggregated results.
SORT_ORDER: [DEPRECATED--use order_by instead] String indicating how the returned values should be sorted - ascending or descending. Supported values:
ASCENDING: Indicates that the returned values should be sorted in ascending order.
DESCENDING: Indicates that the returned values should be sorted in descending order.
The default value is ASCENDING.
SORT_BY: [DEPRECATED--use order_by instead] String determining how the results are sorted. Supported values:
KEY: Indicates that the returned values should be sorted by key, which corresponds to the grouping columns. If you have multiple grouping columns (and are sorting by key), it will first sort by the first grouping column, then the second grouping column, etc.
VALUE: Indicates that the returned values should be sorted by value, which corresponds to the aggregates. If you have multiple aggregates (and are sorting by value), it will first sort by the first aggregate, then the second aggregate, etc.
The default value is VALUE.
ORDER_BY: Comma-separated list of the columns to be sorted by as well as the sort direction, e.g., 'timestamp asc, x desc'. The default value is ''.
STRATEGY_DEFINITION: The tier strategy for the table and its columns.
RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. Column names (group-by and aggregate fields) need to be given aliases, e.g. ["FChar256 as fchar256", "sum(FDouble) as sfd"]. If present, no results are returned in the response. This option is not available if one of the grouping attributes is an unrestricted string (i.e., not charN) type.
RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE_FORCE_REPLICATED: Force the result table to be replicated (ignores any sharding). Must be used in combination with the RESULT_TABLE option. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE_GENERATE_PK: If TRUE, then set a primary key for the result table. Must be used in combination with the RESULT_TABLE option. Supported values: TRUE, FALSE. The default value is FALSE.
TTL: Sets the TTL of the table specified in RESULT_TABLE.
CHUNK_SIZE: Indicates the number of records per chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CHUNK_COLUMN_MAX_MEMORY: Indicates the target maximum data size for each column in a chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CHUNK_MAX_MEMORY: Indicates the target maximum data size for all columns in a chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CREATE_INDEXES: Comma-separated list of columns on which to create indexes on the result table. Must be used in combination with the RESULT_TABLE option.
VIEW_ID: ID of view of which the result table will be a member. The default value is ''.
PIVOT: The pivot column.
PIVOT_VALUES: The value list provided will become the column headers in the output. Should be the values from the pivot_column.
GROUPING_SETS: Customize the grouping attribute sets to compute the aggregates. These sets can include ROLLUP or CUBE operators. The attribute sets should be enclosed in parentheses and can include composite attributes. All attributes specified in the grouping sets must be present in the group-by attributes.
ROLLUP: This option is used to specify the multilevel aggregates.
CUBE: This option is used to specify the multidimensional aggregates.
SHARD_KEY: Comma-separated list of the columns to be sharded on; e.g. 'column1, column2'. The columns specified must be present in columnNames. If any alias is given for any column name, the alias must be used, rather than the original column name. The default value is ''.
The default value is an empty Map.
Returns the current value of options.
public AggregateGroupByRequest setOptions(Map<String,String> options)
Optional parameters.
CREATE_TEMP_TABLE: If TRUE, a unique temporary table name will be generated in the sys_temp schema and used in place of RESULT_TABLE. If RESULT_TABLE_PERSIST is FALSE (or unspecified), then this is always allowed even if the caller does not have permission to create tables. The generated name is returned in QUALIFIED_RESULT_TABLE_NAME. Supported values: TRUE, FALSE. The default value is FALSE.
COLLECTION_NAME: [DEPRECATED--please specify the containing schema as part of RESULT_TABLE and use GPUdb.createSchema to create the schema if non-existent] Name of a schema which is to contain the table specified in RESULT_TABLE. If the schema provided is non-existent, it will be automatically created.
EXPRESSION: Filter expression to apply to the table prior to computing the aggregate group by.
CHUNKED_EXPRESSION_EVALUATION: Evaluate the filter expression during group-by chunk processing. Supported values: TRUE, FALSE. The default value is FALSE.
HAVING: Filter expression to apply to the aggregated results.
SORT_ORDER: [DEPRECATED--use order_by instead] String indicating how the returned values should be sorted - ascending or descending. Supported values:
ASCENDING: Indicates that the returned values should be sorted in ascending order.
DESCENDING: Indicates that the returned values should be sorted in descending order.
The default value is ASCENDING.
SORT_BY: [DEPRECATED--use order_by instead] String determining how the results are sorted. Supported values:
KEY: Indicates that the returned values should be sorted by key, which corresponds to the grouping columns. If you have multiple grouping columns (and are sorting by key), it will first sort by the first grouping column, then the second grouping column, etc.
VALUE: Indicates that the returned values should be sorted by value, which corresponds to the aggregates. If you have multiple aggregates (and are sorting by value), it will first sort by the first aggregate, then the second aggregate, etc.
The default value is VALUE.
ORDER_BY: Comma-separated list of the columns to be sorted by as well as the sort direction, e.g., 'timestamp asc, x desc'. The default value is ''.
STRATEGY_DEFINITION: The tier strategy for the table and its columns.
RESULT_TABLE: The name of a table used to store the results, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria. Column names (group-by and aggregate fields) need to be given aliases, e.g. ["FChar256 as fchar256", "sum(FDouble) as sfd"]. If present, no results are returned in the response. This option is not available if one of the grouping attributes is an unrestricted string (i.e., not charN) type.
RESULT_TABLE_PERSIST: If TRUE, then the result table specified in RESULT_TABLE will be persisted and will not expire unless a TTL is specified. If FALSE, then the result table will be an in-memory table and will expire unless a TTL is specified otherwise. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE_FORCE_REPLICATED: Force the result table to be replicated (ignores any sharding). Must be used in combination with the RESULT_TABLE option. Supported values: TRUE, FALSE. The default value is FALSE.
RESULT_TABLE_GENERATE_PK: If TRUE, then set a primary key for the result table. Must be used in combination with the RESULT_TABLE option. Supported values: TRUE, FALSE. The default value is FALSE.
TTL: Sets the TTL of the table specified in RESULT_TABLE.
CHUNK_SIZE: Indicates the number of records per chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CHUNK_COLUMN_MAX_MEMORY: Indicates the target maximum data size for each column in a chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CHUNK_MAX_MEMORY: Indicates the target maximum data size for all columns in a chunk to be used for the result table. Must be used in combination with the RESULT_TABLE option.
CREATE_INDEXES: Comma-separated list of columns on which to create indexes on the result table. Must be used in combination with the RESULT_TABLE option.
VIEW_ID: ID of view of which the result table will be a member. The default value is ''.
PIVOT: The pivot column.
PIVOT_VALUES: The value list provided will become the column headers in the output. Should be the values from the pivot_column.
GROUPING_SETS: Customize the grouping attribute sets to compute the aggregates. These sets can include ROLLUP or CUBE operators. The attribute sets should be enclosed in parentheses and can include composite attributes. All attributes specified in the grouping sets must be present in the group-by attributes.
ROLLUP: This option is used to specify the multilevel aggregates.
CUBE: This option is used to specify the multidimensional aggregates.
SHARD_KEY: Comma-separated list of the columns to be sharded on; e.g. 'column1, column2'. The columns specified must be present in columnNames. If any alias is given for any column name, the alias must be used, rather than the original column name. The default value is ''.
The default value is an empty Map.
options - The new value for options.
Returns this to mimic the builder pattern.
public org.apache.avro.Schema getSchema()
Specified by getSchema in interface org.apache.avro.generic.GenericContainer
public Object get(int index)
Specified by get in interface org.apache.avro.generic.IndexedRecord
index - the position of the field to get
Throws IndexOutOfBoundsException
public void put(int index, Object value)
Specified by put in interface org.apache.avro.generic.IndexedRecord
index - the position of the field to set
value - the value to set
Throws IndexOutOfBoundsException
Copyright © 2025. All rights reserved.