# /aggregate/statistics

```
URL: http://<db.host>:<db.port>/aggregate/statistics
```

Calculates the requested statistics of the given column(s) in a given table.

The available statistics are: *count* (total number of objects), *mean*, *stdv*
(standard deviation), *variance*, *skew*, *kurtosis*, *sum*, *min*, *max*,
*weighted\_average*, *cardinality* (unique count), *estimated\_cardinality*,
*percentile*, and *percentile\_rank*.

Estimated cardinality is calculated using the HyperLogLog approximation
technique.

Percentiles and percentile ranks are approximate and are calculated using the
t-digest algorithm. Each must include its argument: the desired percentile for
*percentile*, or the value to rank for *percentile\_rank*. To compute multiple
percentiles or percentile ranks, each value must be specified separately (e.g.,
'percentile(75.0),percentile(99.0),percentile\_rank(1234.56),percentile\_rank(-5)').

A second, comma-separated value can be added to the *percentile* statistic to
specify the percentile resolution, e.g., a 50th percentile with a resolution of
200 would be 'percentile(50,200)'.
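
As a minimal sketch of the syntax described above, the following Python helper assembles a stats string from plain statistic names, percentiles (with an optional resolution), and percentile ranks. The helper name `build_stats` and its parameters are illustrative assumptions and are not part of the Kinetica API.

```python
def build_stats(simple=(), percentiles=(), percentile_ranks=()):
    """Join statistics into one comma-separated string.

    Hypothetical helper, not part of any Kinetica client library.
    Each entry in `percentiles` is either a number p or a (p, resolution) tuple.
    """
    parts = list(simple)
    for p in percentiles:
        if isinstance(p, tuple):
            # Second value is the percentile resolution
            parts.append(f"percentile({p[0]},{p[1]})")
        else:
            parts.append(f"percentile({p})")
    for value in percentile_ranks:
        # Each percentile rank is requested as a separate entry
        parts.append(f"percentile_rank({value})")
    return ",".join(parts)


stats = build_stats(
    simple=["count", "mean"],
    percentiles=[75.0, (50, 200)],
    percentile_ranks=[1234.56, -5],
)
print(stats)
# count,mean,percentile(75.0),percentile(50,200),percentile_rank(1234.56),percentile_rank(-5)
```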

The weighted average statistic requires a weight column to be specified in
*weight\_column\_name*. The weighted average is defined as the sum of the
products of each input parameter *column\_name* value and its corresponding
*weight\_column\_name* value, divided by the sum of the *weight\_column\_name*
values.
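
The definition above can be checked locally with a short Python sketch. The data is made up for illustration, and the function only mirrors the stated formula; it does not describe how the server computes the value internally.

```python
def weighted_average(values, weights):
    """Sum of value*weight products divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of weights must be non-zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Illustrative data: `values` stands in for column_name,
# `weights` for weight_column_name.
values = [10.0, 20.0, 30.0]
weights = [1.0, 2.0, 3.0]

# (10*1 + 20*2 + 30*3) / (1 + 2 + 3) = 140 / 6
result = weighted_average(values, weights)
print(result)
```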

Additional columns can be used in the calculation of statistics via
*additional\_column\_names*. Values in these columns will be included in the
overall aggregate calculation; individual aggregates will not be calculated per
additional column. For instance, requesting the *count* and *mean* of input
parameter *column\_name* x and *additional\_column\_names* y and z, where x holds
the numbers 1-10, y holds 11-20, and z holds 21-30, would return the total
number of x, y, and z values (30), and the single average value across all x,
y, and z values (15.5).
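
The arithmetic in this example can be reproduced with a few lines of Python. This is only a local check of the numbers quoted above, pooling the three columns into one set of values before aggregating.

```python
# Columns from the example: x holds 1-10, y holds 11-20, z holds 21-30
x = list(range(1, 11))
y = list(range(11, 21))
z = list(range(21, 31))

# All values are pooled into one aggregate, not computed per column
combined = x + y + z
count = len(combined)
mean = sum(combined) / count

print(count, mean)  # prints: 30 15.5
```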

The response includes a key/value pair for each requested statistic, mapping
the statistic name to its value.

## Input Parameter Description

<ParamField body="table_name" type="string">
  Name of the table on which the statistics operation will be performed, in \[schema\_name.]table\_name format, using standard [name resolution rules](../../concepts/tables/#table-name-resolution).
</ParamField>

<ParamField body="column_name" type="string">
  Name of the primary column for which the statistics are to be calculated.
</ParamField>

<ParamField body="stats" type="string">
  Comma-separated list of the statistics to calculate, e.g. "sum,mean".

  * **count**: Number of objects (independent of the given column(s)).
  * **mean**: Arithmetic mean (average), equivalent to sum/count.
  * **stdv**: Sample standard deviation (denominator is count-1).
  * **variance**: Unbiased sample variance (denominator is count-1).
  * **skew**: Skewness (third standardized moment).
  * **kurtosis**: Kurtosis (fourth standardized moment).
  * **sum**: Sum of all values in the column(s).
  * **min**: Minimum value of the column(s).
  * **max**: Maximum value of the column(s).
  * **weighted\_average**: Weighted arithmetic mean (using the option *weight\_column\_name* as the weighting column).
  * **cardinality**: Number of unique values in the column(s).
  * **estimated\_cardinality**: Estimate (via the HyperLogLog technique) of the number of unique values in the column(s).
  * **percentile**: Estimate (via t-digest) of the given percentile of the column(s) (percentile(50.0) will be an approximation of the median). Add a second, comma-separated value to specify the percentile resolution, e.g., 'percentile(75,150)'.
  * **percentile\_rank**: Estimate (via t-digest) of the percentile rank of the given value in the column(s) (if the given value is the median of the column(s), percentile\_rank(\<median>) will return approximately 50.0).
</ParamField>
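
To illustrate the "count-1" denominator noted for **stdv** and **variance**, here is a small sketch using Python's standard `statistics` module, whose `variance` and `stdev` functions also use the sample (n-1) form. The data is invented for illustration, and the comparison is a local check of the definition, not a statement about server internals.

```python
import statistics

# Illustrative data; the mean is 5.0 and the squared deviations sum to 32.0
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Manual computation with denominator count-1
mean = sum(data) / len(data)
squared_deviations = sum((v - mean) ** 2 for v in data)
manual_variance = squared_deviations / (len(data) - 1)

# Standard-library equivalents (sample variance and sample standard deviation)
sample_variance = statistics.variance(data)
sample_stdv = statistics.stdev(data)

print(manual_variance, sample_variance, sample_stdv)
```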

<ParamField body="options" type="map of string to strings">
  Optional parameters.

  The default value is an empty map ( \{} ).

  <Expandable title="options">
    <ParamField body="additional_column_names">
      A comma-separated list of column names over which statistics can be accumulated along with the primary column. All listed columns and input parameter *column\_name* must be of the same type. The list must not include the column specified in input parameter *column\_name*, and no column can be listed twice.
    </ParamField>

    <ParamField body="weight_column_name">
      Name of column used as weighting attribute for the weighted average statistic.
    </ParamField>
  </Expandable>
</ParamField>
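
The following Python sketch shows one way to assemble the input parameters described above into a JSON request body. It assumes the endpoint accepts a JSON object whose keys match the documented parameter names; the function name `build_request` and the table and column names are hypothetical, and the sketch does not send the request.

```python
import json


def build_request(table_name, column_name, stats,
                  additional_column_names=(), weight_column_name=None):
    """Build a request body from the documented input parameters.

    Hypothetical helper; assumes a JSON body keyed by parameter name.
    """
    options = {}
    if additional_column_names:
        # Documented constraints: no primary column, no duplicates
        if column_name in additional_column_names:
            raise ValueError("additional columns must not include column_name")
        if len(set(additional_column_names)) != len(additional_column_names):
            raise ValueError("no column can be listed twice")
        options["additional_column_names"] = ",".join(additional_column_names)
    if weight_column_name is not None:
        options["weight_column_name"] = weight_column_name
    return {
        "table_name": table_name,
        "column_name": column_name,
        "stats": ",".join(stats),
        "options": options,  # stays an empty map when no options are given
    }


# Hypothetical table and column names
request_body = build_request(
    "example_schema.example_table",
    "x",
    ["count", "mean", "weighted_average"],
    additional_column_names=["y", "z"],
    weight_column_name="w",
)
print(json.dumps(request_body, indent=2))
```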

## Output Parameter Description

The Kinetica server embeds the endpoint response inside a standard response structure, which contains status information and the actual response to the query. The fields of this wrapper are described below:

<ResponseField name="status" type="String">
  'OK' or 'ERROR'
</ResponseField>

<ResponseField name="message" type="String">
  Empty on success; otherwise, an error message
</ResponseField>

<ResponseField name="data_type" type="String">
  'aggregate\_statistics\_response' or 'none' in case of an error
</ResponseField>

<ResponseField name="data" type="String">
  Empty string
</ResponseField>

<ResponseField name="data_str" type="JSON or String">
  This embedded JSON represents the result of the /aggregate/statistics endpoint:

  <Expandable title="data_str">
    <ResponseField name="stats" type="map of string to doubles">
      (statistic name, double value) pairs of the requested statistics, including the total count by default.
    </ResponseField>

    <ResponseField name="info" type="map of string to strings">
      Additional information.
    </ResponseField>
  </Expandable>

  Empty string in case of an error.
</ResponseField>
