Class AggregateStatisticsByRangeRequest

  • All Implemented Interfaces:
    org.apache.avro.generic.GenericContainer, org.apache.avro.generic.IndexedRecord

    public class AggregateStatisticsByRangeRequest
    extends Object
    implements org.apache.avro.generic.IndexedRecord
    A set of parameters for GPUdb.aggregateStatisticsByRange.

    Divides the given set into bins and calculates statistics of the values of a value-column in each bin. The bins are based on the values of a given binning-column. The statistics that may be requested are mean, stdv (standard deviation), variance, skew, kurtosis, sum, min, max, first, last, and weighted average. In addition to the requested statistics, the count of total samples in each bin is returned. This counts vector is simply the histogram of the binning-column. The weighted average statistic requires a weight column to be specified in WEIGHT_COLUMN_NAME; it is defined as the sum of the products of the value column and the weight column, divided by the sum of the weight column.
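
The weighted average definition can be illustrated with a minimal, self-contained sketch. It runs client-side on plain arrays and is not the server's implementation; the class name, method name, and the choice to return NaN when the weights sum to zero are assumptions made for illustration only.

```java
final class WeightedAverageSketch {

    /**
     * Computes sum(value[i] * weight[i]) / sum(weight[i]).
     * Returning NaN for a zero weight sum is an assumption of this sketch,
     * not documented server behavior.
     */
    static double weightedAverage(double[] values, double[] weights) {
        if (values.length != weights.length) {
            throw new IllegalArgumentException("values and weights must have the same length");
        }
        double weightedSum = 0.0;
        double weightSum = 0.0;
        for (int i = 0; i < values.length; i++) {
            weightedSum += values[i] * weights[i];
            weightSum += weights[i];
        }
        return weightSum == 0.0 ? Double.NaN : weightedSum / weightSum;
    }

    public static void main(String[] args) {
        double[] values = {10.0, 20.0, 30.0};
        double[] weights = {1.0, 1.0, 2.0};
        // (10*1 + 20*1 + 30*2) / (1 + 1 + 2) = 90 / 4 = 22.5
        System.out.println(weightedAverage(values, weights));
    }
}
```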

    There are two methods for binning the set members. In the first, which can be used for numeric-valued binning-columns, a min, max, and interval are specified. The number of bins, nbins, is the ceiling of (max-min)/interval. Values that fall in the range [min+n*interval, min+(n+1)*interval) are placed in the nth bin, where n ranges from 0 to nbins-2. The final bin is [min+(nbins-1)*interval, max]. In the second method, BIN_VALUES specifies a list of binning-column values. Records whose binning-column value matches the nth member of the BIN_VALUES list are placed in the nth bin. When a list is provided, the binning-column must be of type string or int.
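
The two binning rules can be sketched as follows. This is an illustrative client-side reading of the description, not the server's code: the class and method names are hypothetical, and returning -1 for values outside [min, max] or absent from the list is an assumption, since the text does not say how such values are handled.

```java
import java.util.List;

final class BinningSketch {

    /** Number of bins for the range method: the ceiling of (max - min) / interval. */
    static int binCount(double min, double max, double interval) {
        return (int) Math.ceil((max - min) / interval);
    }

    /**
     * Bin index for the range method. Bins 0..nbins-2 are half-open;
     * the final bin is closed at max, so value == max lands in bin nbins-1.
     * Returns -1 for out-of-range values (an assumption of this sketch).
     */
    static int rangeBinIndex(double value, double min, double max, double interval) {
        if (value < min || value > max) {
            return -1;
        }
        int nbins = binCount(min, max, interval);
        int n = (int) Math.floor((value - min) / interval);
        return Math.min(n, nbins - 1);
    }

    /** Bin index for the list method: position of the value in the bin-values list, or -1. */
    static int listBinIndex(String value, List<String> binValues) {
        return binValues.indexOf(value);
    }

    public static void main(String[] args) {
        // min = 0, max = 10, interval = 3 gives bins [0,3), [3,6), [6,9), [9,10]
        System.out.println(binCount(0, 10, 3));
        System.out.println(rangeBinIndex(10, 0, 10, 3));
    }
}
```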

    NOTE: The Kinetica instance being accessed must be running a CUDA (GPU-based) build to service this request.

    • Constructor Detail

      • AggregateStatisticsByRangeRequest

        public AggregateStatisticsByRangeRequest()
        Constructs an AggregateStatisticsByRangeRequest object with default parameters.
      • AggregateStatisticsByRangeRequest

        public AggregateStatisticsByRangeRequest​(String tableName,
                                                 String selectExpression,
                                                 String columnName,
                                                 String valueColumnName,
                                                 String stats,
                                                 double start,
                                                 double end,
                                                 double interval,
                                                 Map<String,​String> options)
        Constructs an AggregateStatisticsByRangeRequest object with the specified parameters.
        Parameters:
        tableName - Name of the table on which the ranged-statistics operation will be performed, in [schema_name.]table_name format, using standard name resolution rules.
        selectExpression - For a non-empty expression, statistics are calculated only for those records for which the expression is true. The default value is ''.
        columnName - Name of the binning-column used to divide the set samples into bins.
        valueColumnName - Name of the value-column for which statistics are to be computed.
        stats - A comma-separated list of the statistics to calculate, e.g. 'sum,mean'. Available statistics: mean, stdv (standard deviation), variance, skew, kurtosis, sum.
        start - The lower bound of the binning-column.
        end - The upper bound of the binning-column.
        interval - The interval of a bin. Set members fall into bin i if the binning-column value falls in the range [start+interval*i, start+interval*(i+1)).
        options - Map of optional parameters:
        • ADDITIONAL_COLUMN_NAMES: A comma-separated list of value-column names over which statistics can be accumulated along with the primary value_column.
        • BIN_VALUES: A comma-separated list of binning-column values. Values that match the nth bin_values value are placed in the nth bin.
        • WEIGHT_COLUMN_NAME: Name of the column used as the weighting column for the weighted_average statistic.
        • ORDER_COLUMN_NAME: Name of the column used for candlestick charting techniques.
        The default value is an empty Map.
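
Because stats and several options are passed as comma-separated strings inside a Map<String,String>, a small helper can assemble them. The sketch below is hypothetical and uses only the standard library; it assumes the option keys are the lowercase forms of the names listed above (the text itself refers to bin_values and weighted_average), so real code should prefer whatever key constants the library provides.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class StatsOptionsSketch {

    /** Joins items into the comma-separated form used by stats and list-valued options. */
    static String joinCsv(List<String> items) {
        return String.join(",", items);
    }

    /**
     * Builds an options map. The literal keys are assumed lowercase forms of
     * BIN_VALUES and WEIGHT_COLUMN_NAME; they are not verified against the library.
     */
    static Map<String, String> buildOptions(List<String> binValues, String weightColumn) {
        Map<String, String> options = new LinkedHashMap<>();
        if (binValues != null && !binValues.isEmpty()) {
            options.put("bin_values", joinCsv(binValues));
        }
        if (weightColumn != null) {
            options.put("weight_column_name", weightColumn);
        }
        return options;
    }

    public static void main(String[] args) {
        // Hypothetical bin values and weight column name.
        String stats = joinCsv(Arrays.asList("sum", "mean"));
        Map<String, String> options = buildOptions(Arrays.asList("A", "B", "C"), "w");
        System.out.println(stats);
        System.out.println(options);
    }
}
```

The resulting strings and map would then be passed as the stats and options arguments of the constructor above.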
    • Method Detail

      • getClassSchema

        public static org.apache.avro.Schema getClassSchema()
        This method supports the Avro framework and is not intended to be called directly by the user.
        Returns:
        The schema for the class.
      • getTableName

        public String getTableName()
        Name of the table on which the ranged-statistics operation will be performed, in [schema_name.]table_name format, using standard name resolution rules.
        Returns:
        The current value of tableName.
      • setTableName

        public AggregateStatisticsByRangeRequest setTableName​(String tableName)
        Name of the table on which the ranged-statistics operation will be performed, in [schema_name.]table_name format, using standard name resolution rules.
        Parameters:
        tableName - The new value for tableName.
        Returns:
        this to mimic the builder pattern.
      • getSelectExpression

        public String getSelectExpression()
        For a non-empty expression, statistics are calculated only for those records for which the expression is true. The default value is ''.
        Returns:
        The current value of selectExpression.
      • setSelectExpression

        public AggregateStatisticsByRangeRequest setSelectExpression​(String selectExpression)
        For a non-empty expression, statistics are calculated only for those records for which the expression is true. The default value is ''.
        Parameters:
        selectExpression - The new value for selectExpression.
        Returns:
        this to mimic the builder pattern.
      • getColumnName

        public String getColumnName()
        Name of the binning-column used to divide the set samples into bins.
        Returns:
        The current value of columnName.
      • setColumnName

        public AggregateStatisticsByRangeRequest setColumnName​(String columnName)
        Name of the binning-column used to divide the set samples into bins.
        Parameters:
        columnName - The new value for columnName.
        Returns:
        this to mimic the builder pattern.
      • getValueColumnName

        public String getValueColumnName()
        Name of the value-column for which statistics are to be computed.
        Returns:
        The current value of valueColumnName.
      • setValueColumnName

        public AggregateStatisticsByRangeRequest setValueColumnName​(String valueColumnName)
        Name of the value-column for which statistics are to be computed.
        Parameters:
        valueColumnName - The new value for valueColumnName.
        Returns:
        this to mimic the builder pattern.
      • getStats

        public String getStats()
        A comma-separated list of the statistics to calculate, e.g. 'sum,mean'. Available statistics: mean, stdv (standard deviation), variance, skew, kurtosis, sum.
        Returns:
        The current value of stats.
      • setStats

        public AggregateStatisticsByRangeRequest setStats​(String stats)
        A comma-separated list of the statistics to calculate, e.g. 'sum,mean'. Available statistics: mean, stdv (standard deviation), variance, skew, kurtosis, sum.
        Parameters:
        stats - The new value for stats.
        Returns:
        this to mimic the builder pattern.
      • getStart

        public double getStart()
        The lower bound of the binning-column.
        Returns:
        The current value of start.
      • setStart

        public AggregateStatisticsByRangeRequest setStart​(double start)
        The lower bound of the binning-column.
        Parameters:
        start - The new value for start.
        Returns:
        this to mimic the builder pattern.
      • getEnd

        public double getEnd()
        The upper bound of the binning-column.
        Returns:
        The current value of end.
      • setEnd

        public AggregateStatisticsByRangeRequest setEnd​(double end)
        The upper bound of the binning-column.
        Parameters:
        end - The new value for end.
        Returns:
        this to mimic the builder pattern.
      • getInterval

        public double getInterval()
        The interval of a bin. Set members fall into bin i if the binning-column value falls in the range [start+interval*i, start+interval*(i+1)).
        Returns:
        The current value of interval.
      • setInterval

        public AggregateStatisticsByRangeRequest setInterval​(double interval)
        The interval of a bin. Set members fall into bin i if the binning-column value falls in the range [start+interval*i, start+interval*(i+1)).
        Parameters:
        interval - The new value for interval.
        Returns:
        this to mimic the builder pattern.
      • getOptions

        public Map<String,​String> getOptions()
        Map of optional parameters:
        • ADDITIONAL_COLUMN_NAMES: A comma-separated list of value-column names over which statistics can be accumulated along with the primary value_column.
        • BIN_VALUES: A comma-separated list of binning-column values. Values that match the nth bin_values value are placed in the nth bin.
        • WEIGHT_COLUMN_NAME: Name of the column used as the weighting column for the weighted_average statistic.
        • ORDER_COLUMN_NAME: Name of the column used for candlestick charting techniques.
        The default value is an empty Map.
        Returns:
        The current value of options.
      • setOptions

        public AggregateStatisticsByRangeRequest setOptions​(Map<String,​String> options)
        Map of optional parameters:
        • ADDITIONAL_COLUMN_NAMES: A comma-separated list of value-column names over which statistics can be accumulated along with the primary value_column.
        • BIN_VALUES: A comma-separated list of binning-column values. Values that match the nth bin_values value are placed in the nth bin.
        • WEIGHT_COLUMN_NAME: Name of the column used as the weighting column for the weighted_average statistic.
        • ORDER_COLUMN_NAME: Name of the column used for candlestick charting techniques.
        The default value is an empty Map.
        Parameters:
        options - The new value for options.
        Returns:
        this to mimic the builder pattern.
      • getSchema

        public org.apache.avro.Schema getSchema()
        This method supports the Avro framework and is not intended to be called directly by the user.
        Specified by:
        getSchema in interface org.apache.avro.generic.GenericContainer
        Returns:
        The schema object describing this class.
      • get

        public Object get​(int index)
        This method supports the Avro framework and is not intended to be called directly by the user.
        Specified by:
        get in interface org.apache.avro.generic.IndexedRecord
        Parameters:
        index - the position of the field to get
        Returns:
        value of the field with the given index.
        Throws:
        IndexOutOfBoundsException
      • put

        public void put​(int index,
                        Object value)
        This method supports the Avro framework and is not intended to be called directly by the user.
        Specified by:
        put in interface org.apache.avro.generic.IndexedRecord
        Parameters:
        index - the position of the field to set
        value - the value to set
        Throws:
        IndexOutOfBoundsException
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object