Class GPUdbTable
Parameters
Returns
Returns a randomly generated UUID-based name, with underscores in place of hyphens.
Returns a random name with the specified prefix.
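As a rough illustration of the two naming helpers described above, the following sketch builds a UUID-based name with underscores instead of hyphens, and a prefixed variant. The function names `random_name` and `prefixed_random_name` are hypothetical, and the way the prefix is joined to the random part is an assumption rather than the library's documented behavior.

```python
import uuid


def random_name() -> str:
    """Return a UUID-based name with hyphens replaced by underscores."""
    return str(uuid.uuid4()).replace("-", "_")


def prefixed_random_name(prefix: str) -> str:
    """Return a random name that starts with the given prefix.

    Assumption: the prefix is simply prepended to the random part.
    """
    return f"{prefix}{random_name()}"


name = random_name()
prefixed = prefixed_random_name("tmp_")
```

Names built this way contain only hexadecimal characters and underscores, apart from whatever the caller supplies as the prefix.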
Return the table’s size/length/count.
Set the log level for the GPUdbTable class and any multi-head i/o related classes it uses.
Parameters
Return the user-given name for this table (or the randomly generated one, if applicable).
Return the qualified version of the user-given name for this table (or of the randomly generated one, if applicable).
Return the fully qualified name for this table, including any schemas.
Is the table read-only, or can we modify it?
Return the table’s size/length/count.
Returns True if the table is a collection; False otherwise.
Returns the name of the collection this table is a member of; None if this table does not belong to any collection.
Returns True if the table is replicated.
Return the table’s (record) type (the GPUdbRecordType object, not the c-extension RecordType).
Create an alias string for this table.
Parameters
Returns
Given a view name and a related response, create a new GPUdbTable object which is a read-only table with the intermediate tables automatically updated.
Returns
Clear/drop all intermediate tables if settings allow it.
Returns
Checks for the existence of a table with the given name.
Returns
If multi-head ingestion is enabled, then flush all records in the ingestors’ worker queues so that they actually get inserted to the server database.
Suitably convert any special value types (array, JSON, vector, etc.) of a single record.
Parameters
Returns
Suitably convert any special value type (array, JSON, vector, etc.).
Parameters
Returns
Insert one or more records.
Parameters
Returns
Generates a specified number of random records and adds them to the given table. There is an optional parameter that allows the user to customize the ranges of the column values. It also allows the user to specify linear profiles for some or all columns in which case linear values are generated rather than random ones. Only individual tables are supported for this operation.
This operation is synchronous, meaning that a response will not be returned until all random records are fully available.
Parameters
Returns
Fetches the record(s) from the appropriate worker rank directly (or, if multi-head record retrieval is not set up, then from the head node) that map to the given shard key.
Parameters
Returns
Retrieves records from a given table, optionally filtered by an expression and/or sorted by a column. This operation can be performed on tables, views, or on homogeneous collections (collections containing tables of all the same type). Records can be returned encoded as binary or JSON.
This operation supports paging through the data via the input parameters offset and limit. Note that when paging through a table, if the table (or the underlying table in the case of a view) is updated (records are inserted, deleted, or modified), the records retrieved may differ between calls based on the updates applied.
Decodes and returns the fetched records.
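The offset/limit paging described above can be sketched in plain Python. This is only an illustration: `fetch_page` is a hypothetical stand-in for whatever call actually retrieves one page of records, and `fake_fetch` simulates a table with an in-memory list.

```python
from typing import Callable, Iterator, List


def iter_pages(fetch_page: Callable[[int, int], List], limit: int,
               start: int = 0) -> Iterator[List]:
    """Yield successive pages until a short or empty page is returned."""
    offset = start
    while True:
        page = fetch_page(offset, limit)
        if not page:
            return
        yield page
        if len(page) < limit:
            # A short page means the end of the data was reached.
            return
        offset += len(page)


# Simulated table of 23 records (hypothetical data).
data = list(range(23))


def fake_fetch(offset: int, limit: int) -> List[int]:
    return data[offset:offset + limit]


pages = list(iter_pages(fake_fetch, limit=10))
```

Because the underlying table may change between calls, a loop like this does not guarantee a consistent snapshot; pages can overlap or skip records if the table is updated while paging.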
Parameters
Returns
For a given table, retrieves the values of the given columns within a given range. It returns maps of column name to the vector of values for each supported data type (double, float, long, int and string). This operation supports pagination, i.e., the values retrieved are those associated with the indices between the start (offset) and end (offset + limit) values (inclusive). If there are num_points values in the table, then each of the indices between 0 and num_points-1 retrieves a unique value.
Note that when using the pagination feature, if the table (or the underlying table in the case of a view) is updated (records are inserted, deleted, or modified), the records or values retrieved may differ between calls (they may be noncontiguous or overlap) based on the type of the update.
The response is returned as a dynamic schema. For details see: dynamic schemas documentation.
Parameters
Decodes the fetched records and saves them in the response class in an attribute called data.
Returns
Retrieves the complete series/track records from the given input parameter world_table_name based on the partial track information contained in the input parameter table_name.
This operation supports paging through the data via the input parameters offset and limit.
In contrast to get_records(), this returns records grouped by series/track. So if input parameter offset is 0 and input parameter limit is 5, this operation would return the first 5 series/tracks in input parameter table_name. Each series/track will be returned sorted by its TIMESTAMP column.
Parameters
Returns
Retrieves records from a collection. The operation can optionally return the record IDs which can be used in certain queries such as delete_records().
This operation supports paging through the data via the input parameters offset and limit.
Parameters
Returns
Retrieves records as GeoJSON from a given table, optionally filtered by an expression and/or sorted by a column. This operation can be performed on tables, views, or on homogeneous collections (collections containing tables of all the same type). Records can be returned encoded as binary or JSON.
This operation supports paging through the data via the input parameters offset and limit. Note that when paging through a table, if the table (or the underlying table in the case of a view) is updated (records are inserted, deleted, or modified), the records retrieved may differ between calls based on the updates applied.
Decodes and returns the fetched records.
Parameters
Returns
Converts the table data to a Pandas Data Frame.
Parameters
Returns
Load a Data Frame into a table; optionally dropping any existing table, creating it if it doesn’t exist, and loading data into it; and then returning a GPUdbTable reference to the table.
Parameters
Raises
Returns
Return table columns as a dataframe for inspection.
Insert into a GPUdbTable from a dataframe.
Creates a table that is the result of a SQL JOIN.
For join details and examples see: Joins. For limitations, see Join Limitations and Cautions.
Parameters
Returns
Raises
Merges data from one or more tables with comparable data types into a new table.
The following merges are supported:
UNION (DISTINCT/ALL) - For data set union details and examples, see Union. For limitations, see Union Limitations and Cautions.
INTERSECT (DISTINCT/ALL) - For data set intersection details and examples, see Intersect. For limitations, see Intersect Limitations.
EXCEPT (DISTINCT/ALL) - For data set subtraction details and examples, see Except. For limitations, see Except Limitations.
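To make the DISTINCT versus ALL distinction in the merges listed above concrete, here is a small sketch over plain Python lists. It follows standard SQL multiset semantics; whether the server matches these semantics in every edge case is an assumption, and the sample data is made up.

```python
from collections import Counter

a = [1, 1, 2, 3]
b = [1, 2, 2, 4]

# UNION ALL keeps every row; UNION DISTINCT removes duplicates.
union_all = sorted(a + b)
union_distinct = sorted(set(a) | set(b))

# INTERSECT ALL keeps each value min(count_a, count_b) times.
intersect_all = sorted((Counter(a) & Counter(b)).elements())
intersect_distinct = sorted(set(a) & set(b))

# EXCEPT ALL subtracts multiplicities; EXCEPT DISTINCT removes any
# value that appears in b at all.
except_all = sorted((Counter(a) - Counter(b)).elements())
except_distinct = sorted(set(a) - set(b))
```

Note how the value 1 survives EXCEPT ALL (it appears twice in `a` and only once in `b`) but not EXCEPT DISTINCT.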
Parameters
Returns
Raises
Calculates and returns the convex hull for the values in a table specified by input parameter table_name.
Parameters
Returns
Raises
Calculates unique combinations (groups) of values for the given columns in a given table or view and computes aggregates on each unique combination. This is somewhat analogous to an SQL-style SELECT…GROUP BY.
For aggregation details and examples, see Aggregation. For limitations, see Aggregation Limitations.
Any column(s) can be grouped on, and all column types except unrestricted-length strings may be used for computing applicable aggregates.
The results can be paged via the input parameters offset and limit. For example, to get 10 groups with the largest counts the inputs would be: limit=10, options={"sort_order":"descending", "sort_by":"value"}.
Input parameter options can be used to customize the behavior of this call, e.g., filtering or sorting the results.
To group by columns 'x' and 'y' and compute the number of objects within each group, use: column_names=['x','y','count(*)'].
To also compute the sum of 'z' over each group, use: column_names=['x','y','count(*)','sum(z)'].
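For readers who want to see what the grouping in the two examples above computes, the following pure-Python sketch emulates grouping by 'x' and 'y' with count(*) and sum(z). It does not call the database; the sample records and the helper name `group_count_sum` are hypothetical.

```python
from collections import defaultdict

records = [
    {"x": 1, "y": "a", "z": 10.0},
    {"x": 1, "y": "a", "z": 5.0},
    {"x": 1, "y": "b", "z": 2.0},
    {"x": 2, "y": "a", "z": 7.0},
]


def group_count_sum(rows, group_cols, sum_col):
    """Return {group key: (count, sum of sum_col)} for each unique key."""
    groups = defaultdict(lambda: [0, 0.0])
    for row in rows:
        key = tuple(row[c] for c in group_cols)
        groups[key][0] += 1             # count(*)
        groups[key][1] += row[sum_col]  # sum(z)
    return {key: tuple(acc) for key, acc in groups.items()}


result = group_count_sum(records, ["x", "y"], "z")
```

Each key in `result` is one unique (x, y) combination, mirroring one row of a SELECT ... GROUP BY result.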
Available aggregation functions are: count(*), sum, min, max, avg, mean, stddev, stddev_pop, stddev_samp, var, var_pop, var_samp, arg_min, arg_max and count_distinct.
Available grouping functions are Rollup, Cube, and Grouping Sets.
This service also provides support for Pivot operations.
Filtering on aggregates is supported via expressions using aggregation functions supplied to having.
The response is returned as a dynamic schema. For details see: dynamic schemas documentation.
If a result_table name is specified in the input parameter options, the results are stored in a new table with that name; no results are returned in the response. Both the table name and resulting column names must adhere to standard naming conventions; column/aggregation expressions will need to be aliased. If the source table’s shard key is used as the grouping column(s) and all result records are selected (input parameter offset is 0 and input parameter limit is -9999), the result table will be sharded; in all other cases it will be replicated. Sorting will function properly only if the result table is replicated or if there is only one processing node, and should not be relied upon in other cases. Not available when any of the values of input parameter column_names is an unrestricted-length string.
Parameters
Returns
Raises
Performs a histogram calculation given a table, a column, and an interval function. The input parameter interval is used to produce bins of that size, and the result, computed over the records falling within each bin, is returned. For each bin, the start value is inclusive but the end value is exclusive, except for the very last bin, for which the end value is also inclusive. The value returned for each bin is the number of records in it, except when a column name is provided as a value_column. In this latter case, the sum of the values corresponding to the value_column is used as the result instead. The total number of bins requested cannot exceed 10,000.
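The bin boundary rule described above (start inclusive, end exclusive, last bin closed on both sides) can be sketched as follows. This is a local illustration only; the parameter names `start` and `end` for the histogram range are assumptions, and only the record-count variant (no value_column) is shown.

```python
import math


def histogram(values, start, end, interval):
    """Count values per bin of width `interval` over [start, end].

    Every bin is [lo, hi) except the last, which also includes `end`.
    """
    if interval <= 0 or end <= start:
        raise ValueError("need interval > 0 and end > start")
    nbins = math.ceil((end - start) / interval)
    counts = [0] * nbins
    for v in values:
        if v < start or v > end:
            continue  # outside the requested range
        # Clamp so that v == end falls into the last bin.
        idx = min(int((v - start) // interval), nbins - 1)
        counts[idx] += 1
    return counts


counts = histogram([0, 1, 4.9, 5, 9.9, 10, 11], start=0, end=10, interval=5)
```

Here the value 5 lands in the second bin because the first bin's end is exclusive, while 10 is still counted because the last bin's end is inclusive.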
NOTE: The Kinetica instance being accessed must be running a CUDA (GPU-based) build to service a request that specifies a value_column.
Parameters
Returns
Raises
This endpoint runs the k-means algorithm, a heuristic algorithm that attempts to do k-means clustering. An ideal k-means clustering algorithm selects k points such that the sum of the mean squared distances of each member of the set to the nearest of the k points is minimized. The k-means algorithm, however, does not necessarily produce such an ideal clustering. It begins with a randomly selected set of k points, then refines the location of the points iteratively and settles to a local minimum. Various parameters and options are provided to control the heuristic search.
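The iterative refinement described above can be illustrated with a minimal, CPU-only sketch of the textbook (Lloyd's) k-means heuristic. This is not the server's implementation; the function name, the `iterations` and `seed` parameters, and the sample points are all assumptions made for illustration.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Return k cluster centers found by Lloyd's heuristic."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared distance).
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep empty ones put.
        new_centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # converged to a local minimum
        centers = new_centers
    return centers


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = sorted(kmeans(points, k=2))
```

With two well-separated groups like these, the heuristic settles on one center per group; on less clean data the result depends on the random starting points, which is the local-minimum behavior the description warns about.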
NOTE: The Kinetica instance being accessed must be running a CUDA (GPU-based) build to service this request.
Parameters
Returns
Raises
Calculates and returns the minimum and maximum values of a particular column in a table.
Parameters
Returns
Raises
Calculates and returns the minimum and maximum x- and y-coordinates of a particular geospatial geometry column in a table.
Parameters
Returns
Raises
Calculates the requested statistics of the given column(s) in a given table.
The available statistics are: count (number of total objects), mean, stdv (standard deviation), variance, skew, kurtosis, sum, min, max, weighted_average, cardinality (unique count), estimated_cardinality, percentile, and percentile_rank.
Estimated cardinality is calculated by using the hyperloglog approximation technique.
Percentiles and percentile ranks are approximate and are calculated using the t-digest algorithm. They must include the desired percentile/percentile_rank. To compute multiple percentiles, each value must be specified separately (e.g., 'percentile(75.0),percentile(99.0),percentile_rank(1234.56),percentile_rank(-5)').
A second, comma-separated value can be added to the percentile statistic to calculate percentile resolution, e.g., a 50th percentile with 200 resolution would be ‘percentile(50,200)’.
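Since each percentile and percentile rank has to be listed separately, it can be convenient to assemble the statistics string programmatically. The helper below is a hypothetical convenience function, not part of the API; it only formats strings in the shape shown above.

```python
def percentile_stats(percentiles=(), ranks=(), resolution=None):
    """Build a comma-separated statistics string.

    `resolution`, if given, is appended as the second value of each
    percentile entry, e.g. 'percentile(50,200)'.
    """
    parts = []
    for p in percentiles:
        if resolution is None:
            parts.append(f"percentile({p})")
        else:
            parts.append(f"percentile({p},{resolution})")
    parts.extend(f"percentile_rank({r})" for r in ranks)
    return ",".join(parts)


stats = percentile_stats([75.0, 99.0], [1234.56, -5])
with_resolution = percentile_stats([50], resolution=200)
```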
The weighted average statistic requires a weight column to be specified in weight_column_name. The weighted average is then defined as the sum of the products of input parameter column_name times the weight_column_name values divided by the sum of the weight_column_name values.
Additional columns can be used in the calculation of statistics via additional_column_names. Values in these columns will be included in the overall aggregate calculation; individual aggregates will not be calculated per additional column. For instance, requesting the count and mean of input parameter column_name x and additional_column_names y and z, where x holds the numbers 1-10, y holds 11-20, and z holds 21-30, would return the total number of x, y, and z values (30), and the single average value across all x, y, and z values (15.5).
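The pooled count and mean in the example above, and the weighted average definition given earlier, can be checked with a few lines of plain Python. The `weighted_average` helper and its sample inputs are illustrative only.

```python
# Columns from the example: x holds 1-10, y holds 11-20, z holds 21-30.
x = list(range(1, 11))
y = list(range(11, 21))
z = list(range(21, 31))

# Additional columns are pooled into one overall aggregate.
pooled = x + y + z
count = len(pooled)
mean = sum(pooled) / count


def weighted_average(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


wavg = weighted_average([1, 2, 3], [1, 1, 2])
```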
The response includes a list of key/value pairs of each statistic requested and its corresponding value.
Parameters
Returns
Raises
Divides the given set into bins and calculates statistics of the values of a value-column in each bin. The bins are based on the values of a given binning-column. The statistics that may be requested are mean, stdv (standard deviation), variance, skew, kurtosis, sum, min, max, first, last and weighted average. In addition to the requested statistics the count of total samples in each bin is returned. This counts vector is just the histogram of the column used to divide the set members into bins. The weighted average statistic requires a weight column to be specified in weight_column_name. The weighted average is then defined as the sum of the products of the value column times the weight column divided by the sum of the weight column.
There are two methods for binning the set members. In the first, which can be used for numeric valued binning-columns, a min, max and interval are specified. The number of bins, nbins, is the integer upper bound of (max-min)/interval. Values that fall in the range [min+n*interval,min+(n+1)*interval) are placed in the nth bin, where n ranges from 0..nbins-2. The final bin is [min+(nbins-1)*interval,max]. In the second method, bin_values specifies a list of binning column values. Records whose binning-column value matches the nth member of the bin_values list are placed in the nth bin. When a list is provided, the binning-column must be of type string or int.
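Both binning methods described above can be sketched locally. The first helper lists the bin ranges implied by min, max and interval; the second assigns records to bins by matching against a bin_values list and reports the per-bin count and mean. The function names and sample rows are hypothetical.

```python
import math


def interval_bins(lo, hi, interval):
    """Return (start, end) pairs; all bins are half-open except the last."""
    nbins = math.ceil((hi - lo) / interval)
    bins = [(lo + n * interval, lo + (n + 1) * interval)
            for n in range(nbins - 1)]
    bins.append((lo + (nbins - 1) * interval, hi))  # final bin ends at max
    return bins


def stats_by_bin_values(rows, bin_col, value_col, bin_values):
    """Return (counts, means) with one entry per member of bin_values."""
    index = {v: i for i, v in enumerate(bin_values)}
    counts = [0] * len(bin_values)
    sums = [0.0] * len(bin_values)
    for row in rows:
        i = index.get(row[bin_col])
        if i is None:
            continue  # value not listed in bin_values
        counts[i] += 1
        sums[i] += row[value_col]
    means = [s / c if c else None for s, c in zip(sums, counts)]
    return counts, means


edges = interval_bins(0, 10, 4)
rows = [
    {"kind": "a", "v": 1.0},
    {"kind": "b", "v": 4.0},
    {"kind": "a", "v": 3.0},
    {"kind": "c", "v": 9.0},
]
counts, means = stats_by_bin_values(rows, "kind", "v", ["a", "b"])
```

The `counts` list is exactly the histogram of the binning column over the listed values, as the description notes.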
NOTE: The Kinetica instance being accessed must be running a CUDA (GPU-based) build to service this request.
Parameters
Returns
Raises
Returns all the unique values from a particular column (specified by input parameter column_name) of a particular table or view (specified by input parameter table_name). If input parameter column_name is a numeric column, the values will be in output parameter binary_encoded_response. Otherwise, if input parameter column_name is a string column, the values will be in output parameter json_encoded_response. The results can be paged via the input parameters offset and limit.
Example options: {"limit":"10", "sort_order":"descending"}
The response is returned as a dynamic schema. For details see: dynamic schemas documentation.
If a result_table name is specified in the input parameter options, the results are stored in a new table with that name; no results are returned in the response. Both the table name and resulting column name must adhere to standard naming conventions; any column expression will need to be aliased. If the source table’s shard key is used as the input parameter column_name, the result table will be sharded; in all other cases it will be replicated. Sorting will function properly only if the result table is replicated or if there is only one processing node, and should not be relied upon in other cases. Not available if the value of input parameter column_name is an unrestricted-length string.
Parameters
Returns
Raises
Rotates column values into row values.
For unpivot details and examples, see Unpivot. For limitations, see Unpivot Limitations.
Unpivot is used to normalize tables that are built for cross tabular reporting purposes. The unpivot operator rotates the column values for all the pivoted columns. A variable column, value column and all columns from the source table except the unpivot columns are projected into the result table. The variable column and value columns in the result table indicate the pivoted column name and values respectively.
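The rotation described above can be pictured with a small pure-Python sketch. It is not the server's implementation; the default output column names "variable" and "value" and the sample row are assumptions chosen to mirror the description.

```python
def unpivot(rows, pivoted_columns, variable_name="variable",
            value_name="value"):
    """Turn each pivoted column of each row into its own output row."""
    out = []
    for row in rows:
        # Columns that are not unpivoted are carried over unchanged.
        kept = {k: v for k, v in row.items() if k not in pivoted_columns}
        for col in pivoted_columns:
            out.append({**kept, variable_name: col, value_name: row[col]})
    return out


wide = [{"id": 1, "q1": 10, "q2": 20}]
tall = unpivot(wide, ["q1", "q2"])
```

One wide row with two pivoted columns becomes two tall rows, each naming the original column in the variable column and carrying its value in the value column.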
The response is returned as a dynamic schema. For details see: dynamic schemas documentation.
Parameters
Returns
Raises
Apply various modifications to a table or view. The available modifications include the following:
Manage a table’s columns: a column can be added, removed, or have its type and properties modified, including whether it is dictionary encoded or not.
External tables cannot be modified except for their refresh method.
Create or delete a column, low-cardinality index, chunk skip, geospatial, CAGRA, or HNSW index. This can speed up certain operations when using expressions containing equality or relational operators on indexed columns. This only applies to tables.
Create or delete a foreign key on a particular column.
Manage a range-partitioned or a manual list-partitioned table’s partitions.
Set (or reset) the tier strategy of a table or view.
Refresh and manage the refresh mode of a materialized view or an external table.
Set the time-to-live (TTL). This can be applied to tables or views.
Set the global access mode (i.e. locking) for a table. This setting trumps any role-based access controls that may be in place; e.g., a user with write access to a table marked read-only will not be able to insert records into it. The mode can be set to read-only, write-only, read/write, and no access.
Parameters
Returns
Raises
Apply various modifications to columns in a table or view. The available modifications include the following:
Create or delete an index on a particular column. This can speed up certain operations when using expressions containing equality or relational operators on indexed columns. This only applies to tables.
Manage a table’s columns: a column can be added, removed, or have its type and properties modified, including whether it is dictionary encoded or not.
Parameters
Returns
Raises
Append (or insert) all records from a source table (specified by input parameter source_table_name) to a particular target table (specified by input parameter table_name). The field map (specified by input parameter field_map) holds the user specified map of target table column names with their mapped source column names.
Parameters
Returns
Raises
Clears statistics (cardinality, mean value, etc.) for a column in a specified table.
Parameters
Returns
Raises
Clears (drops) one or all tables in the database cluster. The operation is synchronous, meaning that the table will be cleared before the function returns. The response payload returns the status of the operation along with the name of the table that was cleared.
Parameters
Returns
Raises
Collects statistics for one or more columns in a specified table.
Parameters
Returns
Raises
Creates a new projection of an existing table. A projection represents a subset of the columns (potentially including derived columns) of a table.
For projection details and examples, see Projections. For limitations, see Projection Limitations and Cautions.
Window functions, which can perform operations like moving averages, are available through this endpoint as well as GPUdb.get_records_by_column().
A projection can be created with a different shard key than the source table. By specifying shard_key, the projection will be sharded according to the specified columns, regardless of how the source table is sharded. The source table can even be unsharded or replicated.
If input parameter table_name is empty, selection is performed against a single-row virtual table. This can be useful in executing temporal (NOW()), identity (USER()), or constant-based functions (GEODIST(-77.11, 38.88, -71.06, 42.36)).
Parameters
Returns
Raises
Creates a monitor that watches for a single table modification event type (insert, update, or delete) on a particular table (identified by input parameter table_name) and forwards event notifications to subscribers via ZMQ. After this call completes, subscribe to the returned output parameter topic_id on the ZMQ table monitor port (default 9002). Each time an operation of the given type on the table completes, a multipart message is published for that topic; the first part contains only the topic ID, and each subsequent part contains one binary-encoded Avro object that corresponds to the event and can be decoded using output parameter type_schema. The monitor will continue to run (regardless of whether or not there are any subscribers) until deactivated with GPUdb.clear_table_monitor().
For more information on table monitors, see Table Monitors.
Parameters
Returns
Raises
Deletes record(s) matching the provided criteria from the given table. The record selection criteria can be one or more input parameter expressions (matching multiple records), a single record identified by the record_id option, or all records when using delete_all_records. Note that the three selection criteria are mutually exclusive. This operation cannot be run on a view. The operation is synchronous, meaning that a response will not be available until the request is completely processed and all the matching records are deleted.
Parameters
Returns
Raises
Filters data based on the specified expression. The results are stored in a result set with the given input parameter view_name.
For details see Expressions.
The response message contains the number of points for which the expression evaluated to be true, which is equivalent to the size of the result view.
Parameters
Returns
Raises
Calculates which objects from a table are within a named area of interest (NAI/polygon). The operation is synchronous, meaning that a response will not be returned until all the matching objects are fully available. The response payload provides the count of the resulting set. A new resultant set (view) which satisfies the input NAI restriction specification is created with the name given by input parameter view_name.
Parameters
Returns
Raises
Calculates which geospatial geometry objects from a table intersect a named area of interest (NAI/polygon). The operation is synchronous, meaning that a response will not be returned until all the matching objects are fully available. The response payload provides the count of the resulting set. A new resultant set (view) which satisfies the input NAI restriction specification is created with the name given by input parameter view_name.
Parameters
Returns
Raises
Calculates how many objects within the given table lie in a rectangular box. The operation is synchronous, meaning that a response will not be returned until all the objects are fully available. The response payload provides the count of the resulting set. A new resultant set which satisfies the input NAI restriction specification is also created when the input parameter view_name is passed in as part of the input payload.
Parameters
Returns
Raises
Calculates which geospatial geometry objects from a table intersect a rectangular box. The operation is synchronous, meaning that a response will not be returned until all the objects are fully available. The response payload provides the count of the resulting set. A new resultant set which satisfies the input NAI restriction specification is also created when the input parameter view_name is passed in as part of the input payload.
Parameters
Returns
Raises
Applies a geometry filter against a geospatial geometry column in a given table or view. The filtering geometry is provided by input parameter input_wkt.
Parameters
Returns
Raises
Calculates which records from a table have values in the given list for the corresponding column. The operation is synchronous, meaning that a response will not be returned until all the objects are fully available. The response payload provides the count of the resulting set. A new resultant set (view) which satisfies the input filter specification is also created if the input parameter view_name is passed in as part of the request.
For example, if a type definition has the columns 'x' and 'y', then a filter by list query with the column map {"x":["10.1", "2.3"], "y":["0.0", "-31.5", "42.0"]} will return the count of all data points whose x and y values match both in the respective x- and y-lists, e.g., "x = 10.1 and y = 0.0", "x = 2.3 and y = -31.5", etc. However, a record with "x = 10.1 and y = -31.5" or "x = 2.3 and y = 0.0" would not be returned because the values in the given lists do not correspond.
Parameters
Returns
Raises
Calculates which objects from a table lie within a circle with the given radius and center point (i.e. circular NAI). The operation is synchronous, meaning that a response will not be returned until all the objects are fully available. The response payload provides the count of the resulting set. A new resultant set (view) which satisfies the input circular NAI restriction specification is also created if the input parameter view_name is passed in as part of the request.
For track data, all track points that lie within the circle plus one point on either side of the circle (if the track goes beyond the circle) will be included in the result.
Parameters
Returns
Raises
Calculates which geospatial geometry objects from a table intersect a circle with the given radius and center point (i.e. circular NAI). The operation is synchronous, meaning that a response will not be returned until all the objects are fully available. The response payload provides the count of the resulting set. A new resultant set (view) which satisfies the input circular NAI restriction specification is also created if the input parameter view_name is passed in as part of the request.
Parameters
Returns
Raises
Calculates which objects from a table have a column that is within the given bounds. An object from the table identified by input parameter table_name is added to the view specified by input parameter view_name if its column value is within [input parameter lower_bound, input parameter upper_bound] (inclusive). The operation is synchronous. The response provides a count of the number of objects which passed the bound filter. Although this functionality can also be accomplished with the standard filter function, this operation is more efficient.
For track objects, the count reflects how many points fall within the given bounds (which may not include all the track points of any given track).
Parameters
Returns
Raises
Filters objects matching all points of the given track (works only on track type data). It allows users to specify a particular track to find all other points in the table that fall within specified ranges (spatial and temporal) of all points of the given track. Additionally, the user can specify another track to see if the two intersect (or go close to each other within the specified ranges). The user also has the flexibility of using different metrics for the spatial distance calculation: Euclidean (flat geometry) or Great Circle (spherical geometry to approximate the Earth’s surface distances). The filtered points are stored in a newly created result set. The return value of the function is the number of points in the resultant set (view).
This operation is synchronous, meaning that a response will not be returned until all the objects are fully available.
Parameters
Returns
Raises
Calculates which objects from a table or view match a string expression for the given string columns. Setting case_sensitive can modify case sensitivity in matching for all modes except search. For search mode details and limitations, see Full Text Search.
Parameters
Returns
Raises
Filters objects in one table based on objects in another table. The user must specify matching column types from the two tables (i.e. the target table from which objects will be filtered and the source table based on which the filter will be created); the column names need not be the same. If the input parameter view_name is specified, the filtered objects will be put in a newly created view. The operation is synchronous, meaning that a response will not be returned until all objects are fully available in the result view. The return value contains the count (i.e. the size) of the resulting view.
Parameters
Returns
Raises
Calculates which objects from a table have a particular value for a particular column. The input parameters provide a way to specify either a String or a Double valued column and a desired value for the column on which the filter is performed. The operation is synchronous, meaning that a response will not be returned until all the objects are fully available. The response payload provides the count of the resulting set. A new result view which satisfies the input filter restriction specification is also created with a view name passed in as part of the input payload. Although this functionality can also be accomplished with the standard filter function, this operation is more efficient.
Parameters
Returns
Raises
Manages global access to a table’s data. By default, a table has an input parameter lock_type of read_write, indicating all operations are permitted. A user may request a read_only or a write_only lock, after which only read or write operations, respectively, are permitted on the table until the lock is removed. When input parameter lock_type is no_access, no operations are permitted on the table. The lock status can be queried by setting input parameter lock_type to status.
Parameters
Returns
Raises
Retrieves detailed information about a table, view, or schema, specified in input parameter table_name. If the supplied input parameter table_name is a schema the call can return information about either the schema itself or the tables and views it contains. If input parameter table_name is empty, information about all schemas will be returned.
If the option get_sizes is set to true, then the number of records in each table is returned (in output parameter sizes and output parameter full_sizes), along with the total number of objects across all requested tables (in output parameter total_size and output parameter total_full_size).
For a schema, setting the show_children option to false returns only information about the schema itself; setting show_children to true returns a list of tables and views contained in the schema, along with their corresponding detail.
To retrieve a list of every table, view, and schema in the database, set input parameter table_name to ‘*’ and show_children to true. When doing this, the returned output parameter total_size and output parameter total_full_size will not include the sizes of non-base tables (e.g., filters, views, joins, etc.).
Parameters
Returns
Raises
Runs multiple predicate-based updates in a single call. With the list of given expressions, any matching record’s column values will be updated as provided in input parameter new_values_maps. There is also an optional ‘upsert’ capability where if a particular predicate doesn’t match any existing record, then a new record can be inserted.
Note that this operation can only be run on an original table and not on a result view.
This operation can update primary key values. By default, only ‘pure primary key’ predicates are allowed when updating primary key values. If the primary key for a table is the column ‘attr1’, then the operation will only accept predicates of the form: “attr1 == ‘foo’” if the attr1 column is being updated. For a composite primary key (e.g. columns ‘attr1’ and ‘attr2’), this operation will only accept predicates of the form: “(attr1 == ‘foo’) and (attr2 == ‘bar’)”. That is, all primary key columns must appear in an equality predicate in the expressions. Furthermore, each ‘pure primary key’ predicate must be unique within a given request. These restrictions can be removed by utilizing some available options through input parameter options.
The update_on_existing_pk option specifies the record primary key collision policy for tables with a primary key, while ignore_existing_pk specifies the record primary key collision error-suppression policy when those collisions result in the update being rejected. Both are ignored on tables with no primary key.
Parameters
Returns
Raises