public class AggregateStatisticsRequest extends Object implements org.apache.avro.generic.IndexedRecord
A set of parameters for GPUdb.aggregateStatistics.
Calculates the requested statistics of the given column(s) in a given table.
The available statistics are: COUNT (number of total
objects), MEAN, STDV (standard
deviation), VARIANCE, SKEW, KURTOSIS, SUM, MIN,
MAX, WEIGHTED_AVERAGE,
CARDINALITY (unique count), ESTIMATED_CARDINALITY, PERCENTILE, and PERCENTILE_RANK.
Estimated cardinality is calculated using the HyperLogLog approximation technique.
Percentiles and percentile ranks are approximate and are calculated using
the t-digest algorithm. Each must include its argument: the desired percentile for PERCENTILE, or the value to rank for PERCENTILE_RANK.
To compute multiple percentiles, each value must be specified separately
(e.g.
'percentile(75.0),percentile(99.0),percentile_rank(1234.56),percentile_rank(-5)').
A second, comma-separated value can be added to the PERCENTILE statistic to calculate percentile resolution, e.g., a 50th
percentile with 200 resolution would be 'percentile(50,200)'.
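The percentile syntax described above can be assembled programmatically. The following is a minimal sketch only: PercentileStatsSketch and its methods are hypothetical helpers, not part of the GPUdb API, and the code just builds the comma-separated string that would be passed as the stats parameter. Numbers are rendered with Java's default double formatting, so -5 appears as -5.0.

```java
/** Hypothetical helper, not part of the GPUdb API: formats terms for the stats string. */
final class PercentileStatsSketch {
    private PercentileStatsSketch() {}

    /** One percentile term, e.g. percentile(75.0). */
    static String percentile(double p) {
        return "percentile(" + p + ")";
    }

    /** One percentile term with a resolution as the second, comma-separated value. */
    static String percentile(double p, int resolution) {
        return "percentile(" + p + "," + resolution + ")";
    }

    /** One percentile rank term, e.g. percentile_rank(1234.56). */
    static String percentileRank(double value) {
        return "percentile_rank(" + value + ")";
    }

    /** Joins the individual terms into a single comma-separated stats string. */
    static String join(String... terms) {
        return String.join(",", terms);
    }

    public static void main(String[] args) {
        // Each percentile and percentile rank is specified as its own term.
        String stats = join(
                percentile(75.0),
                percentile(99.0),
                percentileRank(1234.56),
                percentileRank(-5));
        System.out.println(stats);
        // A 50th percentile with 200 resolution.
        System.out.println(percentile(50, 200));
    }
}
```

The resulting string could then be supplied to setStats(String) or to the four-argument constructor.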
The weighted average statistic requires a weight column to be specified in
WEIGHT_COLUMN_NAME. The weighted average
is then defined as the sum of the products of each columnName value and its corresponding WEIGHT_COLUMN_NAME
value, divided by the sum of the WEIGHT_COLUMN_NAME values.
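To make the weighted average definition concrete, here is a small client-side sketch of the same arithmetic. It is an illustration of the formula only, not the server's implementation; the class name, method name, and sample data are invented for the example.

```java
/** Illustrative only: the weighted average formula applied to in-memory arrays. */
final class WeightedAverageSketch {
    private WeightedAverageSketch() {}

    /**
     * Sum of (value * weight) divided by the sum of the weights.
     * values stands in for the columnName column and weights for the
     * WEIGHT_COLUMN_NAME column; both names are assumptions of this sketch.
     */
    static double weightedAverage(double[] values, double[] weights) {
        if (values.length != weights.length) {
            throw new IllegalArgumentException("values and weights must have the same length");
        }
        double weightedSum = 0.0;
        double weightSum = 0.0;
        for (int i = 0; i < values.length; i++) {
            weightedSum += values[i] * weights[i];
            weightSum += weights[i];
        }
        return weightedSum / weightSum;
    }

    public static void main(String[] args) {
        double[] values = {10.0, 20.0, 30.0};
        double[] weights = {1.0, 1.0, 2.0};
        // (10*1 + 20*1 + 30*2) / (1 + 1 + 2) = 90 / 4
        System.out.println(weightedAverage(values, weights));
    }
}
```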
Additional columns can be used in the calculation of statistics via ADDITIONAL_COLUMN_NAMES. Values in these
columns will be included in the overall aggregate calculation--individual
aggregates will not be calculated per additional column. For instance,
requesting the COUNT & MEAN of columnName x and ADDITIONAL_COLUMN_NAMES y & z, where x holds the numbers 1-10, y holds
11-20, and z holds 21-30, would return the total number of x, y, & z values
(30), and the single average value across all x, y, & z values (15.5).
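The pooling behaviour in the example above can be checked with a few lines of plain Java. This sketch reproduces only the arithmetic of the x, y, z example (one COUNT and one MEAN across all columns); the class and method names are hypothetical and nothing here calls the database.

```java
import java.util.stream.IntStream;

/** Illustrative only: one aggregate computed across several columns pooled together. */
final class PooledStatsSketch {
    private PooledStatsSketch() {}

    /** Total number of values across all supplied columns. */
    static long pooledCount(int[]... columns) {
        long count = 0;
        for (int[] column : columns) {
            count += column.length;
        }
        return count;
    }

    /** Single mean over every value in every supplied column. */
    static double pooledMean(int[]... columns) {
        long sum = 0;
        for (int[] column : columns) {
            for (int value : column) {
                sum += value;
            }
        }
        return (double) sum / pooledCount(columns);
    }

    public static void main(String[] args) {
        int[] x = IntStream.rangeClosed(1, 10).toArray();   // primary column
        int[] y = IntStream.rangeClosed(11, 20).toArray();  // additional column
        int[] z = IntStream.rangeClosed(21, 30).toArray();  // additional column
        System.out.println(pooledCount(x, y, z)); // 30
        System.out.println(pooledMean(x, y, z));  // 15.5
    }
}
```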
The response includes a list of key/value pairs of each statistic requested and its corresponding value.
| Modifier and Type | Class and Description |
|---|---|
| static class | AggregateStatisticsRequest.Options A set of string constants for the AggregateStatisticsRequest parameter options. |
| static class | AggregateStatisticsRequest.Stats A set of string constants for the AggregateStatisticsRequest parameter stats. |
| Constructor and Description |
|---|
| AggregateStatisticsRequest() Constructs an AggregateStatisticsRequest object with default parameters. |
| AggregateStatisticsRequest(String tableName, String columnName, String stats, Map<String,String> options) Constructs an AggregateStatisticsRequest object with the specified parameters. |
| Modifier and Type | Method and Description |
|---|---|
| boolean | equals(Object obj) |
| Object | get(int index) This method supports the Avro framework and is not intended to be called directly by the user. |
| static org.apache.avro.Schema | getClassSchema() This method supports the Avro framework and is not intended to be called directly by the user. |
| String | getColumnName() Name of the primary column for which the statistics are to be calculated. |
| Map<String,String> | getOptions() Optional parameters. |
| org.apache.avro.Schema | getSchema() This method supports the Avro framework and is not intended to be called directly by the user. |
| String | getStats() Comma separated list of the statistics to calculate, e.g. "sum,mean". |
| String | getTableName() Name of the table on which the statistics operation will be performed, in [schema_name.]table_name format, using standard name resolution rules. |
| int | hashCode() |
| void | put(int index, Object value) This method supports the Avro framework and is not intended to be called directly by the user. |
| AggregateStatisticsRequest | setColumnName(String columnName) Name of the primary column for which the statistics are to be calculated. |
| AggregateStatisticsRequest | setOptions(Map<String,String> options) Optional parameters. |
| AggregateStatisticsRequest | setStats(String stats) Comma separated list of the statistics to calculate, e.g. "sum,mean". |
| AggregateStatisticsRequest | setTableName(String tableName) Name of the table on which the statistics operation will be performed, in [schema_name.]table_name format, using standard name resolution rules. |
public AggregateStatisticsRequest()
public AggregateStatisticsRequest(String tableName, String columnName, String stats, Map<String,String> options)
tableName - Name of the table on which the statistics operation
will be performed, in [schema_name.]table_name format,
using standard name resolution rules.
columnName - Name of the primary column for which the statistics
are to be calculated.
stats - Comma separated list of the statistics to calculate, e.g.
"sum,mean".
Supported values:
COUNT: Number of objects
(independent of the given column(s)).
MEAN: Arithmetic mean
(average), equivalent to sum/count.
STDV: Sample standard deviation
(denominator is count-1).
VARIANCE: Unbiased sample
variance (denominator is count-1).
SKEW: Skewness (third
standardized moment).
KURTOSIS: Kurtosis (fourth
standardized moment).
SUM: Sum of all values in the
column(s).
MIN: Minimum value of the
column(s).
MAX: Maximum value of the
column(s).
WEIGHTED_AVERAGE:
Weighted arithmetic mean (using the option WEIGHT_COLUMN_NAME as
the weighting column).
CARDINALITY: Number of
unique values in the column(s).
ESTIMATED_CARDINALITY: Estimate (via hyperloglog
technique) of the number of unique values in the
column(s).
PERCENTILE: Estimate (via
t-digest) of the given percentile of the column(s)
(percentile(50.0) will be an approximation of the
median). Add a second, comma-separated value to
calculate percentile resolution, e.g.,
'percentile(75,150)'
PERCENTILE_RANK:
Estimate (via t-digest) of the percentile rank of
the given value in the column(s) (if the given
value is the median of the column(s),
percentile_rank(<median>) will return
approximately 50.0).
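The count-1 denominators noted for STDV and VARIANCE in the list above correspond to the usual sample formulas. The sketch below shows that arithmetic on an in-memory array; it is an illustration under that assumption, not the server's implementation, and the class and method names are invented for the example.

```java
/** Illustrative only: sample variance and standard deviation with a count-1 denominator. */
final class SampleSpreadSketch {
    private SampleSpreadSketch() {}

    /** Unbiased sample variance: sum of squared deviations divided by (count - 1). */
    static double sampleVariance(double[] values) {
        int count = values.length;
        if (count < 2) {
            throw new IllegalArgumentException("at least two values are required");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / count; // MEAN is sum/count
        double squaredDeviations = 0.0;
        for (double v : values) {
            squaredDeviations += (v - mean) * (v - mean);
        }
        return squaredDeviations / (count - 1);
    }

    /** Sample standard deviation: square root of the sample variance. */
    static double sampleStdDev(double[] values) {
        return Math.sqrt(sampleVariance(values));
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.0, 4.0, 5.0};
        // Mean is 3.0, squared deviations sum to 10.0, so the variance is 10.0 / 4.
        System.out.println(sampleVariance(values));
        System.out.println(sampleStdDev(values));
    }
}
```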
options - Optional parameters.
ADDITIONAL_COLUMN_NAMES: A list of comma
separated column names over which statistics can
be accumulated along with the primary column.
All columns listed and columnName must
be of the same type. Must not include the
column specified in columnName and no
column can be listed twice.
WEIGHT_COLUMN_NAME: Name of column used as
weighting attribute for the weighted average
statistic.
The default value is an empty Map.
public static org.apache.avro.Schema getClassSchema()
public String getTableName()
Returns the current value of tableName.
public AggregateStatisticsRequest setTableName(String tableName)
tableName - The new value for tableName.
Returns this to mimic the builder pattern.
public String getColumnName()
Returns the current value of columnName.
public AggregateStatisticsRequest setColumnName(String columnName)
columnName - The new value for columnName.
Returns this to mimic the builder pattern.
public String getStats()
Comma separated list of the statistics to calculate, e.g. "sum,mean". Supported values:
COUNT: Number of objects (independent of the
given column(s)).
MEAN: Arithmetic mean (average), equivalent
to sum/count.
STDV: Sample standard deviation (denominator
is count-1).
VARIANCE: Unbiased sample variance
(denominator is count-1).
SKEW: Skewness (third standardized moment).
KURTOSIS: Kurtosis (fourth standardized
moment).
SUM: Sum of all values in the column(s).
MIN: Minimum value of the column(s).
MAX: Maximum value of the column(s).
WEIGHTED_AVERAGE: Weighted
arithmetic mean (using the option WEIGHT_COLUMN_NAME as the weighting
column).
CARDINALITY: Number of unique values
in the column(s).
ESTIMATED_CARDINALITY:
Estimate (via hyperloglog technique) of the number of unique
values in the column(s).
PERCENTILE: Estimate (via t-digest) of
the given percentile of the column(s) (percentile(50.0) will be
an approximation of the median). Add a second, comma-separated
value to calculate percentile resolution, e.g.,
'percentile(75,150)'
PERCENTILE_RANK: Estimate (via
t-digest) of the percentile rank of the given value in the
column(s) (if the given value is the median of the column(s),
percentile_rank(<median>) will return approximately 50.0).
Returns the current value of stats.
public AggregateStatisticsRequest setStats(String stats)
Comma separated list of the statistics to calculate, e.g. "sum,mean". Supported values are the same as those listed for getStats().
stats - The new value for stats.
Returns this to mimic the builder pattern.
public Map<String,String> getOptions()
Optional parameters.
ADDITIONAL_COLUMN_NAMES:
A list of comma separated column names over which statistics can
be accumulated along with the primary column. All columns
listed and columnName must be of the
same type. Must not include the column specified in columnName and no column can be listed twice.
WEIGHT_COLUMN_NAME: Name of
column used as weighting attribute for the weighted average
statistic.
The default value is an empty Map.
Returns the current value of options.
public AggregateStatisticsRequest setOptions(Map<String,String> options)
Optional parameters. Supported keys are the same as those listed for getOptions(). The default value is an empty Map.
options - The new value for options.
Returns this to mimic the builder pattern.
public org.apache.avro.Schema getSchema()
getSchema in interface org.apache.avro.generic.GenericContainer
public Object get(int index)
get in interface org.apache.avro.generic.IndexedRecord
index - the position of the field to get
Throws IndexOutOfBoundsException
public void put(int index, Object value)
put in interface org.apache.avro.generic.IndexedRecord
index - the position of the field to set
value - the value to set
Throws IndexOutOfBoundsException
Copyright © 2025. All rights reserved.