<column list> is a
comma-separated list of columns to use as the primary key for the tableColumn Types
Number
| Data Type | Description |
|---|---|
BOOLEAN | Effective specific type: boolean |
TINYINT | Effective specific type: int8 |
BYTE | Alias for TINYINT |
SMALLINT | Effective specific type: int16 |
INTEGER | Effective specific type: int |
INT | Alias for INTEGER |
BIGINT | Effective specific type: long |
DECIMAL | Alias for BIGINT |
LONG | Alias for BIGINT |
UNSIGNED BIGINT | Effective specific type: ulong |
UNSIGNED LONG | Alias for UNSIGNED BIGINT |
REAL | Effective specific type: float |
FLOAT | Alias for REAL |
DOUBLE | Effective specific type: double |
DECIMAL(P,S) | Effective specific type: decimal(p,s) Maximum precision P is 27 Maximum scale S is 18 Precision determines decimal storage size: 1 - 18: 8 19 - 27: 12 |
NUMERIC | Alias for DECIMAL(P,S) |
String
| Data Type | Description |
|---|---|
VARCHAR | Effective specific type: string; character limit based on configured system property |
VARCHAR(N) | Effective specific type: the smallest charN type that is at least size N, or string if N is greater than 256 |
CHAR | Alias for VARCHAR / VARCHAR(N) |
STRING | Alias for VARCHAR / VARCHAR(N) |
TEXT | Alias for VARCHAR / VARCHAR(N) |
IPV4 | Shorthand for VARCHAR(IPV4), which applies the IPV4 column property |
UUID | Effective specific type uuid |
JSON | Effective specific type json |
Date/Time
Binary
| Data Type | Description |
|---|---|
BLOB | Effective specific type: bytes |
BINARY | Alias for BLOB |
BYTES | Alias for BLOB |
VARBINARY | Alias for BLOB |
Geospatial
Composite
| Data Type | Description |
|---|---|
BOOLEAN[N] | Effective specific type: array(boolean) (string base type) |
INTEGER[N] | Effective specific type: array(int) (string base type) |
BIGINT[N] | Effective specific type: array(long) (string base type) |
UNSIGNED BIGINT[N] | Effective specific type: array(ulong) (string base type) |
VECTOR(N) | Effective specific type: vector(n) (bytes base type); generally used in vector search |
REAL[N] | Effective specific type: array(float) (string base type) |
DOUBLE[N] | Effective specific type: array(double) (string base type) |
VARCHAR[N] | Effective specific type: array(string) (string base type) |
Column Properties
| Column Property | Description |
|---|---|
COMPRESS(TYPE) | Applies compress data handling to a column using the specified compression type, enabling compression of its values; see Column Compression for details |
DICT | Applies dict data handling to a column, enabling dictionary-encoding of its values; see Dictionary Encoding for details |
INIT_WITH_NOW | For DATE, TIME, DATETIME, and TIMESTAMP column types, enables the database to use NOW() as the value when the column is not specified in an INSERT statement; also directs the database to replace empty strings (for DATE, TIME, & DATETIME columns) and invalid timestamps (for TIMESTAMP columns) with NOW() during inserts |
INIT_WITH_UUID | For UUID column type, enables the database to use a universally unique identifier (UUID) as the value when the column is not specified in an INSERT statement; also directs the database to replace empty strings (for UUID columns) with UUIDs |
IPV4 | Treats the associated string-based column as an IPv4 address |
NORMALIZE | For VECTOR column type, automatically normalizes each vector to have a magnitude of 1; see Vector Type |
PRIMARY_KEY | Treats the associated column as a primary key, or part of a composite primary key if other columns also have this property |
SHARD_KEY | Treats the associated column as a shard key, or part of a composite shard key if other columns also have this property |
TEXT_SEARCH | Applies text-searchability to a column |
UPDATE_WITH_NOW | For DATE, TIME, DATETIME, and TIMESTAMP column types, enables the database to use NOW() as the value when the column is not specified in an UPDATE statement; also directs the database to replace empty strings (for DATE, TIME, & DATETIME columns) and invalid timestamps (for TIMESTAMP columns) with NOW() during updates |
CREATE SCHEMA
Schemas are logical containers for all database objects (tables, views, etc.). In order to place an object in a schema, the schema must be created first—schemas will not be automatically created when specified inCREATE TABLE or similar calls.
Parameters
IF NOT EXISTS
IF NOT EXISTS
<schema name>
<schema name>
Examples
ALTER SCHEMA
Alters the configuration of a schema. The name of a schema is the only property that can be altered.Rename Schema
A schema can be renamed.DROP SCHEMA
Removes an existing schema.Parameters
IF EXISTS
IF EXISTS
<schema name>
<schema name>
CASCADE
CASCADE
Examples
To drop a schema, including all contained objects:SHOW SCHEMA
Outputs the DDL statement required to reconstruct the given schema.SHOW SCHEMA is a single-column result set
with the DDL statement as the value in the DDL column, shown below with
the column separators returned by kisql.Parameters
CREATE
CREATE
SCHEMA
SCHEMA
SHOW TABLE statement instead<schema name>
<schema name>
* instead
to output the DDL of all schemas, tables, & viewsExamples
To output the DDL for the example schema created in the CREATE SCHEMA section:DESCRIBE SCHEMA
Lists the contained tables and views of a given schema.Parameters
SCHEMA
SCHEMA
DESCRIBE TABLE statement instead<schema name>
<schema name>
Examples
To describe the tables contained in thedemo schema, into
which demo data is usually downloaded:
SET CURRENT SCHEMA
To override the default schema with a different schema, the following syntax can be used:schema name will switch back to the original default schema.
CREATE TABLE
Creates a new table.Parameters
OR REPLACE
OR REPLACE
IF NOT EXISTSREPLICATED
REPLICATED
TEMP
TEMP
IF NOT EXISTS
IF NOT EXISTS
OR REPLACE<schema name>
<schema name>
<table name>
<table name>
<column name>
<column name>
<column definition>
<column definition>
<column name>; see Data Definition (DDL) for column
formatCOMMENT '<column comment>'
COMMENT '<column comment>'
SOFT
SOFT
PRIMARY KEY that creates a soft primary key instead
of a standard primary keyPRIMARY KEY (<column list>)
PRIMARY KEY (<column list>)
SHARD KEY (<column list>)
SHARD KEY (<column list>)
FOREIGN KEY ...
FOREIGN KEY ...
| Parameter | Description |
|---|---|
<column list> | Comma-separated list of columns in the table to create that will reference a matching set of primary key columns in another table |
<foreign table name> | Name of target table referred to in this foreign key |
<foreign column list> | The primary key columns in the target table referred to in this foreign key, matching the list of columns specified in <column list> in the table to create |
AS <foreign key name> | Optional alias for the foreign key |
<partition clause>
<partition clause>
<tier strategy clause>
<tier strategy clause>
<index clause>
<index clause>
<table property clause>
<table property clause>
Examples
To create a table with various column types and properties:Partition Clause
A table can be further segmented into partitions. The supported partition types are:Range
Interval
List
Hash
Series
Range Partitioning
The general format for the range partition clause is:PARTITIONS, is optional, though it is
recommended to define partitions at table creation time, when feasible.
For example, to create a range-partitioned table with the following
criteria:
- partitioned on the date/time of the order
-
partitions for years:
- 2014 - 2016
- 2017
- 2018
- 2019
- records not in that range go to the default partition
Interval Partitioning
The general format for the interval partition clause is:- partitioned on the date/time of the order
- one partition for each year from 2014 on
- later year partitions are added as necessary
- records prior to 2014 go to the default partition
- partitioned on the date/time of the order
- one partition for each day from January 1st, 2014 on
- later day partitions are added as necessary
- records prior to 2014 go to the default partition
INTERVAL function
(described in the Date/Time Functions section):
List Partitioning
The list partition clause has two forms:manual
automatic
Manual
The general format for the manual list partition clause is:PARTITIONS, is optional, though it is
recommended to define partitions at table creation time, when feasible.
For example, to create a manual list-partitioned table with the following
criteria:
- partitioned on the date/time of the order
-
partitions for years:
- 2014 - 2016
- 2017
- 2018
- 2019
- records not in that list go to the default partition
- partitioned on the date/time of the order
- each partition corresponds to a unique year & month pair
-
partitions for years/months:
- February 2016 & March 2016
- March 2020
- records not in that list go to the default partition
Automatic
The general format for the automatic list partition clause is:- partitioned on the date/time of the order
- one partition for each unique year & month across all orders
- partitions are added as necessary
Hash Partitioning
The general format for the hash partition clause is:- partitioned on the date/time of the order
- distributed among the fixed set of partitions, based on the hash of the year & month of the order
- 10 partitions
Series Partitioning
The general format for the series partition clause is:PERCENT_FULL should be an integer between 1 and 100; the default is
50%.
To create a series-partitioned table with the following criteria:
- partitioned on the customer of each order
- partitions with closed key sets will contain all orders from a set of unique customers
- 50% fill threshold
- partitioned on the track ID
- partitions with closed key sets will contain all points from a unique set of tracks
- 25% fill threshold
Tier Strategy Clause
A table can have a tier strategy specified at creation time. If not assigned a tier strategy upon creation, a default tier strategy will be assigned.customer_order table with an above-average
eviction priority in the
RAM Tier:
SHOW TABLE is a single-record result set
with the DDL statement as the value in the DDL column, shown here with
the column separators returned by kisql.Index Clause
A table can have any number of indexes applied to any of its columns at creation time. The types of explicit indexes supported are:Column (Attribute) Index
Low Cardinality Index
Chunk Skip Index
Geospatial Index
CAGRA Index
HNSW Index
- column index on
last_name - low-cardinality index on
dept_id - chunk skip index on
id - geospatial index on
work_district - geospatial index on the pair of
office_longitude&office_latitude - CAGRA index on
profile
Table Property Clause
A subset of table properties can be applied to the table at creation time.CHUNK COLUMN MEMORY
CHUNK COLUMN MEMORY
CHUNK MEMORY
CHUNK MEMORY
CHUNK SIZE
CHUNK SIZE
COMPRESSION_CODEC
COMPRESSION_CODEC
NO_ERROR_IF_EXISTS
NO_ERROR_IF_EXISTS
FALSE.IF NOT EXISTS.PRIMARY_KEY_TYPE
PRIMARY_KEY_TYPE
memory.| Type | Description |
|---|---|
memory | Primary key index is loaded into memory, occupying RAM but improving performance. |
disk | Primary key index is stored on disk only, increasing available RAM at the cost of some performance. |
TTL
TTL
CREATE EXTERNAL TABLE
Creates a new external table, which is a database object whose source data is located in one or more files, either internal or external to the database.- in KiFS
- on a remote system, accessible via a data source
| Provider | Description | One-Time Load | Subscription |
|---|---|---|---|
| Azure | Microsoft blob storage | Yes | Yes |
| GCS | Google Cloud Storage | Yes | Yes |
| HDFS | Apache Hadoop Distributed File System | Yes | |
| JDBC | Java DataBase Connectivity; using a user-supplied driver or one of the drivers on the list supported list | Yes | Yes |
| S3 | Amazon S3 Bucket | Yes | Yes |
Parameters
OR REPLACE
OR REPLACE
REPLICATED
REPLICATED
TEMP
TEMP
LOGICAL
LOGICAL
MATERIALIZED
MATERIALIZED
<schema name>
<schema name>
<table name>
<table name>
<table definition clause>
<table definition clause>
REMOTE QUERY
REMOTE QUERY
<source data query> is a SQL query selecting the data
which will be loaded.FILE PATHS clause, and is only
applicable to JDBC data sources.- Any column expression used is given a column alias.
- The first column is not a
WKTor unlimited lengthVARCHARtype. - The columns and expressions queried should match the intended order, number, & type of the columns in the target table.
- If
REMOTE_QUERY_NO_SPLITisTRUE, the query will not be distributed. - If a valid
REMOTE_QUERY_PARTITION_COLUMNis specified, the query will be distributed by partitioning on the given column’s values - If a valid
REMOTE_QUERY_ORDER_BYis specified, the query will be distributed by ordering the data accordingly and then partitioning into sequential blocks from the first record - If a non-null numeric/date/time column exists, the query will be distributed by partitioning on the first such column’s values
- The query will be distributed by sorting the data on the first column and then partitioning into sequential blocks from the first record
FILE PATHS
FILE PATHS
<file paths> is a comma-separated list of
single-quoted file paths from which data will be loaded; all files specified are presumed to have
the same format and data types.REMOTE QUERY clause, and is not
applicable to JDBC data sources.-
Data Source: If a data source is specified in the load options, these file paths must resolve
to accessible files at that data source location. A “path prefix” can be specified instead,
which will cause all files whose path begins with the given prefix to be included.
For example, a “path prefix” of
/data/gefor<file paths>would match all of the following:/data/geo.csv/data/geo/flights.csv/data/geo/2021/airline.csv
-
KiFS: The path must resolve to an accessible file path within KiFS.
A “path prefix” can be specified instead, which will cause all files whose path begins with the
given prefix to be included.
For example, a “path prefix” of
kifs://data/gewould match all of the following files under the KiFSdatadirectory:kifs://data/geo.csvkifs://data/geo/flights.csvkifs://data/geo/2021/airline.csv
FORMAT
FORMAT
| Keyword | Description |
|---|---|
[DELIMITED] TEXT | Any text-based, delimited field data file (CSV, PSV, TSV, etc.); a comma-delimited list of options can be given to specify the way in which the data file(s) should be parsed, including the delimiter used, whether headers are present, etc. Records spanning multiple lines are not supported. See Delimited Text Options for the complete list of <delimited text options>. |
AVRO | Apache Avro data file |
JSON | Either a JSON or GeoJSON data file See JSON/GeoJSON Limitations for the supported data types. |
PARQUET | Apache Parquet data file See Parquet Limitations for the supported data types. |
SHAPEFILE | ArcGIS shapefile |
WITH OPTIONS
WITH OPTIONS
<partition clause>
<partition clause>
<tier strategy clause>
<tier strategy clause>
<index clause>
<index clause>
<table property clause>
<table property clause>
Delimited Text Options
The following options can be specified when loading data from delimited text files. When reading from multiple files, options specific to the source file will be applied to each file being read.COMMENT = '<string>'
COMMENT = '<string>'
string as comments and skip.The default comment marker is #.DELIMITER = '<char>'
DELIMITER = '<char>'
char as the source file field delimiter.The default delimiter is a comma, unless a source file has one of these extensions:.psv- will cause|to be the delimiter.tsv- will cause the tab character to be the delimiter
ESCAPE = '<char>'
ESCAPE = '<char>'
char as the source file data escape character. The escape character preceding any
other character, in the source data, will be converted into that other character, except
in the following special cases:| Source Data String | Representation when Loaded into the Database |
|---|---|
<char>a | ASCII bell |
<char>b | ASCII backspace |
<char>f | ASCII form feed |
<char>n | ASCII line feed |
<char>r | ASCII carriage return |
<char>t | ASCII horizontal tab |
<char>v | ASCII vertical tab |
\, a \t encountered in
the data will be converted to a tab character when stored in the database.The escape character can be used to escape the quoting character, and will be treated as
an escape character whether it is within a quoted field value or not.There is no default escape character.HEADER DELIMITER = '<char>'
HEADER DELIMITER = '<char>'
char as the source file header field name/property delimiter, when the source file
header contains both names and properties. This is largely specific to the Kinetica
export to delimited text feature, which will, within each field’s header, contain the
field name and any associated properties, delimited by the pipe | character.An example Kinetica header in a CSV file:| (pipe) character. See
Delimited Text Option Characters for allowed characters.DELIMITER character will still be used to separate
field name/property sets from each other in the header rowINCLUDES HEADER = <TRUE|FALSE>
INCLUDES HEADER = <TRUE|FALSE>
TRUE.NULL = '<string>'
NULL = '<string>'
string as the indicator of a null source field value.The default is the empty string.QUOTE = '<char>'
QUOTE = '<char>'
char as the source file data quoting character, for enclosing field values.
Usually used to wrap field values that contain embedded delimiter characters, though any
field may be enclosed in quote characters (for clarity, for instance). The quote
character must appear as the first and last character of a field value in order to be
interpreted as quoting the value. Within a quoted value, embedded quote characters may be
escaped by preceding them with another quote character or the escape character specified
by ESCAPE, if given.The default is the " (double-quote) character. See
Delimited Text Option Characters for allowed characters.Delimited Text Option Characters
ForDELIMITER, HEADER DELIMITER, ESCAPE, & QUOTE, any single
character can be used, or any one of the following escaped characters:
| Escaped Char | Corresponding Source File Character |
|---|---|
'' | Single quote |
\a | ASCII bell |
\b | ASCII backspace |
\f | ASCII form feed |
\t | ASCII horizontal tab |
\v | ASCII vertical tab |
'') are specified for a QUOTE
character, the parser will interpret single quotes in the source file as
quoting characters; specifying \t for DELIMITER will cause the parser
to interpret ASCII horizontal tab characters in the source file as delimiter
characters.
Load Options
The following options can be specified to modify the way data is loaded (or not loaded) into the target table.BAD RECORD TABLE
BAD RECORD TABLE
| Column Name | Source Data Format Codes |
|---|---|
line_number | Number of the line in the input file containing the failed record |
char_number | Position of character within a failed record that is assessed as the beginning of the portion of the record that failed to process |
filename | Name of file that contained the failed record |
line_rejected | Text of the record that failed to process |
error_msg | Error message associated with the record processing failure |
ON ERROR mode of ABORT. In that
mode, processing stops at the first error and that error is returned to the user.BATCH SIZE
BATCH SIZE
COLUMN FORMATS
COLUMN FORMATS
- date
- time
- datetime
| Code | Description |
|---|---|
YYYY | 4-digit year |
MM | 2-digit month, where January is 01 |
DD | 2-digit day of the month, where the 1st of each month is 01 |
2010.10.30 into date column d and times of
the 24-hour format 18:36:54.789 into time column t:DATA SOURCE
DATA SOURCE
DEFAULT COLUMN FORMATS
DEFAULT COLUMN FORMATS
COLUMN FORMATS option.For example, to make the default format for loading source data dates like 2010.10.30
and 24-hour times like 18:36:54.789:FIELDS IGNORED BY
FIELDS IGNORED BY
POSITION or NAME. If ignoring by NAME, the
specified names must match the source file field names exactly.-
Identifying by Name:
-
Identifying by Position:
- When ignoring source data file fields, the set of fields that are not ignored must align, in type & number in their order in the source file, with the external table columns into which the data will be loaded.
- Ignoring fields by
POSITIONis only supported for delimited text files.
FIELDS MAPPED BY
FIELDS MAPPED BY
POSITION or NAME. If mapping by NAME, the
specified names must match the source file field names exactly.-
Identifying by Name:
-
Identifying by Position:
- When mapping source data file fields, the set of fields that are identified must align, in type & number in the specified order, with the external table columns into which data will be loaded.
- Mapping fields by
POSITIONis only supported for delimited text files.
FLATTEN_COLUMNS
FLATTEN_COLUMNS
FALSE.| Value | Description |
|---|---|
TRUE | Break up nested columns into multiple columns. |
FALSE | Treat nested columns as JSON columns instead of flattening. |
IGNORE_EXISTING_PK
IGNORE_EXISTING_PK
UPDATE_ON_EXISTING_PK option is used, then this options has no effect.The default is FALSE.| Value | Description |
|---|---|
TRUE | Suppress errors when inserted records and existing records’ PKs match. |
FALSE | Return errors when inserted records and existing records’ PKs match. |
INGESTION MODE
INGESTION MODE
FULL.| Value | Description |
|---|---|
DRY RUN | No data will be inserted, but the file will be read with the applied ON ERROR mode and the number of valid records that would normally be inserted is returned. |
FULL | Data is fully ingested according to the active ON ERROR mode. |
TYPE INFERENCE | Infer the type of the source data and return, without ingesting any data. The inferred type is returned in the response, as the output of a SHOW TABLE command. |
JDBC_FETCH_SIZE
JDBC_FETCH_SIZE
NUM_SPLITS_PER_RANK
NUM_SPLITS_PER_RANK
NUM_TASKS_PER_RANK
NUM_TASKS_PER_RANK
JDBC_SESSION_INIT_STATEMENT
JDBC_SESSION_INIT_STATEMENT
REFRESH ON START or SUBSCRIBE is TRUE.For example, to set the time zone to UTC before running each load, use:ON ERROR
ON ERROR
ABORT.| Value | Description |
|---|---|
SKIP | If an error is encountered parsing a source record, skip the record. |
ABORT | If an error is encountered parsing a source record, stop the data load process. Primary key collisions are considered abortable errors in this mode. |
POLL_INTERVAL
POLL_INTERVAL
SUBSCRIBE is
TRUE.REFRESH ON START
REFRESH ON START
FALSE. This option is ignored if SUBSCRIBE is TRUE.| Value | Description |
|---|---|
TRUE | Refresh the external table’s data when the database is restarted. |
FALSE | Do not refresh the external table’s data when the database is restarted. |
REMOTE_QUERY_INCREASING_COLUMN
REMOTE_QUERY_INCREASING_COLUMN
SUBSCRIBE is TRUE.REMOTE_QUERY_NO_SPLIT
REMOTE_QUERY_NO_SPLIT
FALSE.| Value | Description |
|---|---|
TRUE | Issue the remote data retrieval as a single query. |
FALSE | Distribute and parallelize the remote data retrieval in queries for blocks of data at a time. |
REMOTE_QUERY_ORDER_BY
REMOTE_QUERY_ORDER_BY
REMOTE_QUERY_PARTITION_COLUMN.If REMOTE_QUERY_NO_SPLIT is TRUE, a valid REMOTE_QUERY_PARTITION_COLUMN is specified, or the column given is
invalid, this option is ignored.REMOTE_QUERY_PARTITION_COLUMN
REMOTE_QUERY_PARTITION_COLUMN
REMOTE_QUERY_NO_SPLIT is TRUE or the column given is invalid, this option is ignored.SUBSCRIBE
SUBSCRIBE
DATA SOURCE option. Only relevant for materialized external tables using
data sources configured to allow streaming.The default is FALSE. If TRUE, the REFRESH ON START option is ignored.| Value | Description |
|---|---|
TRUE | Subscribe to the specified streaming data source. |
FALSE | Do not subscribe to the specified data source. |
TRUNCATE_STRINGS
TRUNCATE_STRINGS
VARCHAR columns that are
not large enough to hold the entire text value.The default is FALSE.| Value | Description |
|---|---|
TRUE | Truncate any inserted string value at the maximum size for its column. |
FALSE | Reject any record with a string value that is too long for its column. |
TYPE_INFERENCE_MODE
TYPE_INFERENCE_MODE
SPEED.| Value | Description |
|---|---|
ACCURACY | Scan all available data to arrive at column types that are the narrowest possible that can still hold all the data. |
SPEED | Pick the widest possible column types from the minimum data scanned in order to quickly arrive at column types that should fit all data values. |
UPDATE_ON_EXISTING_PK
UPDATE_ON_EXISTING_PK
FALSE.| Value | Description |
|---|---|
TRUE | Update existing records with records being inserted, when PKs match. |
FALSE | Discard records being inserted when existing records’ PKs match. |
Table Definition Clause
The table definition clause allows for an explicit local table structure to be defined, irrespective of the source data type. This specification mirrors that of CREATE TABLE.Partition Clause
An external table can be further segmented into partitions. The supported Partition Clause syntax & features are the same as those in the CREATE TABLE Partition Clause.Tier Strategy Clause
An external table can have a tier strategy specified at creation time. If not assigned a tier strategy upon creation, a default tier strategy will be assigned. The supported Tier Strategy Clause syntax & features are the same as those in the CREATE TABLE Tier Strategy Clause.Index Clause
An external table can have any number of indexes applied to any of its columns at creation time. The supported Index Clause syntax & features are the same as those in the CREATE TABLE Index Clause.Table Property Clause
A subset of table properties can be applied to the external table associated with the external data at creation time. The supported Table Property Clause syntax & features are the same as those in the CREATE TABLE Table Property Clause.Examples
To create a logical external table with the following features, using a query as the source of data:- External table named
ext_employee_dept2in theexampleschema - Source is
department2 employees from theexample.employeetable, queried through the example.jdbc_ds data source - Data is re-queried from the source each time the external table is queried
- External table named
ext_productin theexampleschema - External source is a KiFS file named
product.csvlocated in thedatadirectory - Data is not refreshed on database startup
- External table named
ext_employeein theexampleschema - External source is a Parquet file named
employee.parquetlocated in the KiFS directorydata - External table has a primary key on the
idcolumn - Data is not refreshed on database startup
- External table named
ext_employeein theexampleschema - External source is a file named
employee.csvlocated in the KiFS directorydata - Apply a date format to the
hire_datecolumn
- External table named
ext_productin theexampleschema - External source is a data source named
product_dsin theexampleschema - Source is a file named
products.csv - Data is refreshed on database startup
- External table named
ext_productin theexampleschema - External source is a data source named
product_dsin theexampleschema - Source is a file named
products.csv - Data updates are streamed continuously
- External table named
ext_employee_dept2in theexampleschema - External source is a data source named
jdbc_dsin theexampleschema - Source data is a remote query of employees in department 2 from that
database’s
example.ext_employeetable - Data is refreshed on database startup
Data Sources
File-Based
To create an external table that loads a CSV file, products.csv, from the data source example.product_ds, into a table namedexample.ext_product:
Query-Based
To create an external table that is the result of a remote query of employees in department 2 from the JDBC data source example.jdbc_ds, into a local table namedexample.ext_employee_dept2:
Change Data Capture
File-Based
To create an external table loaded by a set of order data in a change data capture scheme with the following conditions:- data pulled through a data source, example.order_ds
- data files contained with an orders directory
- initially, all files in the directory will be loaded; subsequently, only those files that have been updated since the last check will be reloaded
- files will be polled for updates every 60 seconds
- target table named
example.ext_order
Query-Based
To create an external table loaded from a remote query of orders in a change data capture scheme with the following conditions:- data pulled through a data source, example.jdbc_ds
- data contained with an
example.orderstable, where only orders for product with ID 42 will be loaded into the target table - initially, all orders will be loaded; subsequently, only those orders with an
order_idcolumn value higher than the highest one on the previous poll cycle will be loaded - remote table will be polled for updates every 60 seconds
- target table named
example.ext_order_product42
CREATE TABLE … AS
Creates a new table from the given query in the specified schema.SELECT list.
Parameters
| Parameter/Key | Description |
|---|---|
OR REPLACE | Any existing table or view with the same name will be dropped before creating this one |
REPLICATED | The table will be distributed within the database as a replicated table |
TEMP | The table will be a memory-only table; which, among other things, means it will not be persisted (if the database is restarted, the table will be removed), but it will have increased ingest performance |
<schema name> | Name of the schema that will contain the created table; if no schema is specified, the table will be created in the user’s default schema |
<table name> | Name of the table to create; must adhere to supported naming criteria |
<select statement> | The query that will define both the initial structure and content of the created table |
<select statement> to affect the resulting
table:
| Keyword | Type | Description |
|---|---|---|
| KI_HINT_GROUP_BY_PK | hint | Creates a primary keys on the columns in the GROUP BY clause if the outermost SELECT statement contains a GROUP BY |
| KI_HINT_INDEX(column list) | hint | Indexes each of the columns specified in the column list |
| KI_SHARD_KEY(column list) | pseudo- function | Shards the result table with a shard key composed of all columns in the specified column list |
Examples
To create a replicated temporary table that is a copy of an existing table, failing if a table with the same name as the target table already exists:a, b, c, & d a new
shard key on columns a & b, and an index on column d, replacing a
table with the same name as the target table, if it exists:
a, b, c, & d, preserving the
primary key on a, b, & c, and the foreign key from d; a new
table must be created to match the schema of the old one and then records can be
copied from the old one to the new one:
ALTER TABLE
Alters the configuration of a table. Any of the following facets of a table can be altered:- Name
- Schema
- Access Mode
- TTL
- Columns
- Column Indexes
- Low-Cardinality Indexes
- Chunk Skip Indexes
- Geospatial Indexes
- CAGRA Indexes
- HNSW Indexes
- Foreign Keys
- Partitions
- Tier Strategy
- External Data Source Subscription
Rename Table
A table can be renamed, following the supported naming criteria.Move Table
A table can be moved from one schema to another.sales_2017 table from the example_olap schema
to the example_archive schema:
Set Access Mode
A table can have its global accessibility modified for all users in the system, independently from and further restricting any role-based access controls in place. Note that changing the access mode cannot widen access for users not already granted access; it can only narrow access for those who already have access. This setting will also trump administrative access to a table.Set TTL
A table’s time-to-live (TTL) can be altered.Add Column
A column can be added, specifying a column definition.DEFAULT keyword. These values can either be a string/numeric constant or
the name of an existing column in the table from which values can be copied into
the new column. This default value is only in effect for the column creation;
the new column will have no default value after that.
Examples
To add, to the employee table, a salary column that is a non-nullable,
10-digit number field containing 2 decimal places with a default value of 0:
employee table, a category column that is a nullable,
dictionary-encoded, 32-character text
field:
employee table, a bio column that is a nullable,
text-searchable, unrestricted-width text field:
Rename Column
An existing column can be renamed.Modify Column
A column can have its column definition modified, affecting column type, column size, column properties, and nullability.0 for numeric fields.
Examples
To change, in the employee table, the first_name column to one that is a
non-nullable, dictionary-encoded,
50-character text field:
Drop Column
An existing column can be removed from a table.Add Column Index
A column (attribute) index can be added to a table column in order to improve the performance of operations whose expressions contain relational operators against the column. See Limitations for restrictions.employee table’s last_name column:
Drop Column Index
An existing column (attribute) index can be removed from a table.employee table’s last_name column:
Add Low-Cardinality Index
A low-cardinality index can be added to a table column with few distinct values in order to improve the performance of operations whose expressions contain relational operators against the column. See Limitations for restrictions.employee table’s dept_id column:
Drop Low-Cardinality Index
An existing low-cardinality index can be removed from a table.employee table’s dept_id column:
Add Chunk Skip Index
A chunk skip index can be added to a table column in order to improve the performance of operations containing equality-based filters against the column. See Limitations for restrictions.employee table’s employee ID column:
Drop Chunk Skip Index
An existing chunk skip index can be removed from a table.employee table’s employee ID column:
Add Geospatial Index
A geospatial index can be added to one or more table columns to improve the performance of geospatial functions applied to them.employee table’s work district WKT column:
employee table’s office location coordinate pair columns:
Drop Geospatial Index
An existing geospatial index can be removed from a table.employee table’s work
district WKT column:
employee table’s office location
coordinate pair columns:
Add CAGRA Index
A CAGRA index can be added to a table column in order to improve the performance of vector searches applied to the column.employee table’s profile column:
Refresh CAGRA Index
An existing CAGRA index can be refreshed.employee table’s profile column:
Drop CAGRA Index
An existing CAGRA index can be removed from a table.employee table’s profile column:
Add HNSW Index
An HNSW index can be added to a table column in order to improve the performance of vector searches applied to the column.employee table’s profile column:
Drop HNSW Index
An existing HNSW index can be removed from a table.employee table’s profile column:
Add Foreign Key
A foreign key can be added to any column or set of columns with primary key applicable types in order to improve the performance of join operations between the table being altered and the table referenced in the foreign key.employee table’s department ID
column, linking it to the department table’s department ID column:
Drop Foreign Key
An existing foreign key can be removed from a table, either by the name (alias) given to it during creation or by its definition:employee table’s department ID
column:
Add Partition
A partition can be added to a range-partitioned or list-partitioned table.Range Partition
The new partition can be given a minimum bound (inclusive) and a maximum bound (exclusive). If the new partition would come before an existing partition, omitting the maximum bound would cause the new partition to take on the nearest following existing partition’s minimum bound as its maximum bound. If the new partition would come after an existing partition, omitting the minimum bound would cause the new partition to take on the nearest preceding partition’s maximum bound as its minimum bound. If no partitions are present in the table, the new partition will have to declare both a minimum and maximum bound.customer_order_range_by_year table,
containing all records with a partition key less than 2020 and greater
than or equal to the maximum bound of the nearest preceding partition:
List Partition
The new partition can be given a list of values to match against the partition key values of incoming records.customer_order_manual_list_partition_by_year table, containing all records
from 2020:
customer_order_manual_list_partition_by_year_and_month table, containing all
records from February 2020 & April 2020:
Remove Partition
An existing partition can be removed from a range-partitioned or list-partitioned table, sending all data contained within that partition back to the default partition.order_2017 from the
customer_order_range_by_year table:
Delete Partition
An existing partition can be dropped from a range-partitioned or list-partitioned table, deleting all data contained within it.order_2014_2016 from the
customer_order_range_by_year table, deleting all data within that
partition:
Set Tier Strategy
A table’s eviction priorities can be adjusted by setting its tier strategy.customer_order table’s tier strategy, to one with
a below-average eviction priority in the
RAM Tier:
customer_order table’s tier strategy:
Manage Subscription
Any table that is subscribed to a streaming external data source can have that subscription paused, resumed, canceled, or dropped.ALTER TABLE. An external table will need to be
re-created, while a table whose subscription was initiated through the
LOAD INTO command can have that command
re-executed.ext_product external table
through the product_ds data source:
REFRESH EXTERNAL TABLE
Refreshes the data within a materialized external table that is not subscription-based.REPAIR TABLE
Repairs one or more corrupted tables. Tables are specified as a comma-separated list of table names.Parameters
<schema name>
<schema name>
<view name>
<view name>
REPAIR_POLICY
REPAIR_POLICY
| Type | Description |
|---|---|
delete_chunks | Deletes any corrupted chunks |
shrink_columns | Shrinks corrupted chunks to the shortest column |
replay_wal | Manually invokes WAL replay on the table |
Examples
To manually replay the WAL on two tables:TRUNCATE TABLE
Deletes all the records from a table.DROP TABLE
Removes an existing table. All dependent views, materialized views, streams, and SQL procedures will be dropped.Parameters
IF EXISTS
IF EXISTS
<schema name>
<schema name>
<table name>
<table name>
SHOW TABLE
Outputs the DDL statement required to reconstruct the given table.SHOW TABLE is a single-record result set
with the DDL statement as the value in the DDL column, shown below with
the column separators returned by kisql.Parameters
CREATE
CREATE
TABLE
TABLE
- if given, and a view with the given name exists instead, the command will return an error
- if omitted, and a view with the given name exists instead, the command will be interpreted as
a
SHOW VIEWstatement
<schema name>
<schema name>
<table name>
<table name>
Examples
To output the DDL for the example table created in the CREATE TABLE section:DESCRIBE TABLE
Lists the columns and column types & properties for a given table.Parameters
TABLE
TABLE
- if given, and a view with the given name exists instead, the command will return an error
- if omitted, and a view with the given name exists instead, the command will be interpreted as
a
DESCRIBE VIEWstatement
<schema name>
<schema name>
<table name>
<table name>
Examples
To describe the example table created in the CREATE TABLE section:CREATE VIEW
Creates a new virtual table from the given query.Parameters
OR REPLACE
OR REPLACE
<schema name>
<schema name>
<view name>
<view name>
<select statement>
<select statement>
Examples
To create a view that is a copy of an existing table, failing if a table or view with the same name as the target view already exists:CREATE MATERIALIZED VIEW
SpecifyingMATERIALIZED in a CREATE VIEW statement
will make the view a materialized view.
SELECT list. A new shard key can
be specified for the materialized view by using the
KI_SHARD_KEY(<column list>) pseudo-function in the SELECT list.
Parameters
OR REPLACE
OR REPLACE
TEMP
TEMP
<schema name>
<schema name>
<view name>
<view name>
REFRESH
REFRESH
| Constant | Description |
|---|---|
OFF | (the default) Will prevent the materialized view from being automatically refreshed, but will still allow manual refreshes of the data to be requested |
ON CHANGE | Will cause the materialized view to be updated any time a record is added, modified, or deleted from the subtending tables in its query |
ON QUERY | Will cause the materialized view to be updated any time it is queried |
EVERY | Allows specification of an interval in seconds, minutes, hours, or days, at which the materialized view should be refreshed. By default, the first refresh interval will be one interval’s worth of time from the point at which the materialized view creation was requested. This can be modified with the following options:
|
<select statement>
<select statement>
WITH OPTIONS
WITH OPTIONS
| Option | Description |
|---|---|
EXECUTE AS | Executes materialized view refreshes as the given user with that user’s privileges, when EVERY … is specified as the REFRESH method. If this user doesn’t exist at the time of a refresh, the refresh will be executed as the creating user, and, failing that, the system administration user. |
<table property clause>
<table property clause>
Table Property Clause
A subset of table properties can be applied to the materialized view at creation time. The supported Table Property Clause syntax & features are the same as those in the CREATE TABLE Table Property Clause.Changes-Only Views via Delta Tables
A materialized view can be configured to only show the changes in the query result since its last refresh. This applies strictly for inserts; updates and deletes will not be reflected in the view’s data. Using a materialized view to only show deltas in the result data can be achieved using theKI_HINT_DELTA_TABLE scoped hint. This hint should be
placed immediately after the table name (and before any table alias) of each
source table in the query that should be treated as a delta table—a table
which will contribute to the result set only those records added since the last
time the materialized view referencing it was refreshed.
The marked delta table must be a regular table—it cannot be any of the
following:
For instance, a changes-only materialized view weather_zone can be created
with its weather source table marked as a delta table in order to only
show new weather events since the last refresh of weather_zone:
Examples
To create a materialized view with columnsa, b, c, & d and a
new shard key on columns a & b, that refreshes once per half hour,
replacing a view with the same name as the target view, if it exists:
weather table and their locations from a static geo_zone table that have
occurred since the last time the materialized view weather_zone joining the
two data sets was refreshed:
ALTER VIEW
Alters the configuration of a view. The following facet of a view can be altered:Move View
A view can be moved from one schema to another.sales_2017 view from the example_olap
schema to the example_archive schema:
ALTER MATERIALIZED VIEW
Alters the configuration of a materialized view. Any of the following facets of a materialized view can be altered:Move View
A materialized view can be moved from one schema to another.sales_current view from the example_olap
schema to the example_archive schema:
Set Access Mode
A materialized view can have its global accessibility modified for all users in the system, independently from and further restricting any role-based access controls in place.Set Execution User
A materialized view can have its execution user for periodic refreshes (REFRESH is set to EVERY ...) changed to the given user.
Set Refresh Mode
The refresh mode of a materialized view can be modified.| Constant | Description |
|---|---|
OFF | Will prevent the materialized view from being automatically refreshed, but will still allow manual refreshes of the data to be requested |
ON CHANGE | Will cause the materialized view to be updated any time a record is added, modified, or deleted from the subtending tables in its query |
ON QUERY | Will cause the materialized view to be updated any time it is queried |
EVERY | Allows specification of an interval in seconds, minutes, hours, or days, at which the materialized view should be refreshed. By default, the first refresh interval will be one interval’s worth of time from the point at which the materialized view alteration was requested. This can be modified with the following options: * STARTING AT: specify a date or timestamp at which refresh cycles should begin * STOP AFTER: specify a date, timestamp, or time interval after which refresh cycles should end |
Set TTL
A materialized view’s time-to-live (TTL) can be altered.REFRESH VIEW
Refreshes the data within a materialized view. This will also re-validate invalidated materialized views, making them accessible once again if their query criteria is met (it can be successfully run again).DROP VIEW
Removes an existing view or materialized view. All dependent views, materialized views, streams, and SQL procedures will be dropped.Parameters
MATERIALIZED
MATERIALIZED
IF EXISTS
IF EXISTS
<schema name>
<schema name>
<view name>
<view name>
SHOW VIEW
Outputs the DDL statement required to reconstruct the given view or materialized view.SHOW VIEW is a single-record result set
with the DDL statement as the value in the DDL column, shown below with
the column separators returned by kisql.Parameters
CREATE
CREATE
VIEW
VIEW
- if given, and a table with the given name exists instead, the command will return an error
- if omitted, and a table with the given name exists instead, the command will be interpreted
as a
SHOW TABLEstatement
<schema name>
<schema name>
<view name>
<view name>
WITH OPTIONS
WITH OPTIONS
Examples
To output the DDL for the example view created in the CREATE VIEW section:DESCRIBE VIEW
Lists the columns and column types & properties for a given view or materialized view.Parameters
VIEW
VIEW
- if given, and a table with the given name exists instead, the command will return an error
- if omitted, and a table with the given name exists instead, the command will be interpreted
as a
DESCRIBE TABLEstatement
<schema name>
<schema name>
<view name>
<view name>
Examples
To describe the example view created in the CREATE VIEW section:CREATE CREDENTIAL
Creates a new credential, which is a record that contains authentication information required to connect to a resource outside the database. Any user may create a credential for their own use.- Azure (Microsoft blob storage)
- GCS (Google Cloud Storage)
- HDFS (Apache Hadoop Distributed File System)
- JDBC (Java Database Connectivity, using a user-supplied driver or one of the drivers on the supported list)
-
Kafka (streaming feed)
- Apache
- Confluent
-
Remote Repositories
- Docker
- Nvidia
- OpenAI
- S3 (Amazon S3 Bucket)
Parameters
OR REPLACE
OR REPLACE
<schema name>
<schema name>
<credential name>
<credential name>
TYPE
TYPE
| Type | Description |
|---|---|
aws_access_key | Authenticate to Amazon Web Services (AWS) via Access Key |
aws_iam_role | Authenticate to Amazon Web Services (AWS) via IAM Role |
azure_ad | Authenticate to Microsoft Azure via Active Directory |
azure_oauth | Authenticate to Microsoft Azure via OAuth |
azure_sas | Authenticate to Microsoft Azure via Shared Access Signature (SAS) using an account-level access token, not a container-level one |
azure_storage_key | Authenticate to Microsoft Azure via Storage Key |
confluent | Authenticate to a Confluent Kafka cluster or schema registry |
docker | Authenticate to a Docker repository |
gcs_service_account_id | Authenticate to Google Cloud via user ID & private key |
gcs_service_account_keys | Authenticate to Google Cloud via JSON key |
hdfs | Authenticate to HDFS |
jdbc | Authenticate via Java Database Connectivity |
kafka | Authenticate to an Apache Kafka cluster or schema registry |
nvidia_api_key | Authenticate to the Nvidia AI API |
openai_api_key | Authenticate to the OpenAI API |
IDENTITY
IDENTITY
SECRET
SECRET
WITH OPTIONS
WITH OPTIONS
Examples
To create a credential,auser_azure_active_dir_creds, for
connecting to Microsoft Azure Active Directory:
kafka_cred, for connecting to Apache Kafka via SSL:
Options
The following is a list of possible credential options and their associated providers. For valid combinations of credential options per provider and authentication mechanism, see Provider-Specific Syntax.| Option | Provider | Description |
|---|---|---|
azure_storage_account_name | Azure | Azure storage account name (only used if azure_tenant_id is specified) |
azure_tenant_id | Azure | Azure Active Directory tenant identifier |
gcs_service_account_keys | GCS | Text of the JSON file containing the GCS private key |
hdfs_kerberos_keytab | HDFS | Location of the Kerberos keytab file in KiFS |
hdfs_use_kerberos | HDFS | Whether to attempt Kerberos authentication to HDFS |
s3_aws_role_arn | S3 | AWS S3 IAM role |
sasl.kerberos.keytab | Kafka | Location of the Kerberos keytab file in KiFS |
sasl.kerberos.principal | Kafka | Kerberos principal ID |
sasl.kerberos.service.name | Kafka | Kerberos service name |
sasl.mechanism | Kafka | SASL scheme to use; one of: * PLAIN * GSSAPI |
sasl.password | Kafka | SASL user password |
sasl.username | Kafka | SASL user ID |
security.protocol | Kafka | Security protocol to use for authentication; one of: * SSL * SASL_SSL * SASL_PLAINTEXT |
ssl.ca.location | Kafka | Location of CA certificates file in KiFS |
ssl.certificate.location | Kafka | Location of client certificate in KiFS |
ssl.key.location | Kafka | Location of client key in KiFS |
ssl.key.password | Kafka | Password to client key or trust store |
Provider-Specific Syntax
Several authentication schemes across multiple providers are supported.Azure
Syntax below, examples here.GCS
Syntax below, examples here.HDFS
Syntax below, examples here.JDBC
Syntax below, examples here.Kafka (Apache)
Syntax below, examples here.Kafka (Confluent)
Syntax below, examples here.Remote Repository
Syntax below, examples here.S3
Syntax below, examples here.ALTER CREDENTIAL
Alters the properties of an existing credential.system_admin or system_user_admin, or users with
credential_admin on the credential, may alter it.
Parameters
<schema name>
<schema name>
<credential name>
<credential name>
SET PROPERTY
SET PROPERTY
Examples
To alter a credential,auser_azure_active_dir_creds, updating the secret:
kafka_cred, updating the keystore password:
Set Properties
All credential properties can be altered via ALTER CREDENTIAL. The following are the property names and descriptions to use when performing an alteration.TYPE
TYPE
| Type | Description |
|---|---|
aws_access_key | Authenticate to Amazon Web Services (AWS) via Access Key |
aws_iam_role | Authenticate to Amazon Web Services (AWS) via IAM Role |
azure_ad | Authenticate to Microsoft Azure via Active Directory |
azure_sas | Authenticate to Microsoft Azure via Shared Access Signature (SAS) |
azure_storage_key | Authenticate to Microsoft Azure via Storage Key |
confluent | Authenticate to a Confluent Kafka cluster or schema registry |
docker | Authenticate to a Docker repository |
gcs_service_account_id | Authenticate to Google Cloud via user ID & private key |
gcs_service_account_keys | Authenticate to Google Cloud via JSON key |
hdfs | Authenticate to HDFS |
jdbc | Authenticate via Java Database Connectivity |
kafka | Authenticate to a Kafka cluster or schema registry |
nvidia_api_key | Authenticate to the Nvidia AI API |
openai_api_key | Authenticate to the OpenAI API |
IDENTITY
IDENTITY
SECRET
SECRET
'<option>'
'<option>'
DROP CREDENTIAL
Removes an existing credential.system_admin or system_user_admin, or users with
credential_admin on the credential to drop, may drop a credential.
Parameters
<schema name>
<schema name>
<credential name>
<credential name>
* instead of schema/credential name to
drop all credentialsExamples
To drop a credential,auser_azure_active_dir_creds:
SHOW CREDENTIAL
Outputs the DDL statement required to reconstruct the given credential.secret value will be masked and would need to be replaced with the
actual secret value if attempting to reconstruct the credential. Only
users with system_admin or system_user_admin, or users with
credential_admin or credential_read on the credential to show, may
show a credential.
SHOW CREDENTIAL is a single-column result set
with the DDL statement as the value in the DDL column.Parameters
<schema name>
<schema name>
<credential name>
<credential name>
* instead of
schema/credential name to output the DDL of all credentialsExamples
To output the DDL for a credential,auser_azure_active_dir_creds:
DESCRIBE CREDENTIAL
Outputs the configuration of an existing credential.system_admin or system_user_admin, or users with
credential_admin or credential_read on the credential to describe, may
describe a credential.
Parameters
<schema name>
<schema name>
<credential name>
<credential name>
* instead
of schema/credential name to output the configuration of all credentialsResponse
The response toDESCRIBE CREDENTIAL is a four-column result set:
| Output Column | Description |
|---|---|
CREDENTIAL_NAME | Name of the credential |
CREDENTIAL_TYPE | Type of the credential; see CREATE CREDENTIAL for values |
CREDENTIAL_IDENTITY | Username associated with the credential |
CREDENTIAL_OPTIONS | Options associated with the credential; see Options for values |
Examples
To show the configuration for a credential,auser_azure_active_dir_creds:
CREATE DATA SOURCE
Creates a new data source, which contains the location and connection information for a data store that is external to Kinetica. A data source serves as an authentication & access mechanism to a remote resource and can optionally be used as a streaming source as well.- Azure (Microsoft blob storage)
- GCS (Google Cloud Storage)
- HDFS (Apache Hadoop Distributed File System)
- JDBC (Java Database Connectivity, using a user-supplied driver or one of the drivers on the supported list)
-
Kafka (streaming feed)
- Apache
- Confluent
- S3 (Amazon S3 Bucket)
- Azure anonymous data sources are only supported when both the container and the contained objects allow anonymous access.
- HDFS systems with wire encryption are not supported.
- Confluent & Kafka data sources require an associated credential for authentication.
Parameters
OR REPLACE
OR REPLACE
EXTERNAL
EXTERNAL
<schema name>
<schema name>
<data source name>
<data source name>
<provider>
<provider>
| Provider | Description |
|---|---|
AZURE | Microsoft Azure blob storage |
CONFLUENT | Confluent Kafka streaming feed The LOCATION can be a comma-delimited list of Kafka URLs, to be used for high-availability; only one of which will be streamed from at any given time. |
GCS | Google Cloud Storage |
HDFS | Apache Hadoop Distributed File System |
JDBC | JDBC connection, where LOCATION is the JDBC URL. See the supported list for the full list of supported drivers, or specify one with JDBC_DRIVER_JAR_PATH and JDBC_DRIVER_CLASS_NAME. |
KAFKA | Apache Kafka streaming feed The LOCATION can be a comma-delimited list of Kafka URLs, to be used for high-availability; only one of which will be streamed from at any given time. |
S3 | Amazon S3 bucket |
<host>
<host>
host is left blank:| Provider | Default Host |
|---|---|
| Azure | <storage_account_name>.blob.core.windows.net |
| GCS | storage.googleapis.com |
| S3 | <region>.amazonaws.com |
<port>
<port>
USER
USER
<username>, to use for authenticating to the
data sourcePASSWORD
PASSWORD
<password>, to use for authenticating to the
data sourceWITH OPTIONS
WITH OPTIONS
Examples
To create a data source,kin_ds, that connects to an Amazon S3 bucket,
kinetica_ds, in the US East (N. Virginia) region:
Options
| Option | Provider | Description |
|---|---|---|
CREDENTIAL | Any | Credential object to use to authenticate to the remote system |
VALIDATE | Any | Whether to test the connection to the data source upon creation; if TRUE (default), the creation of a data source that cannot be connected to will fail; if FALSE, the data source will be created regardless of connectivity |
WAIT TIMEOUT | Any except JDBC | Timeout in seconds for reading from the storage provider |
USE_MANAGED_CREDENTIALS | AZURE, GCS, S3 | Whether to connect to the storage provider with provider-managed credentials AZURE: Use the Azure Instance Metadata Service (IMDS) endpoint local to the Kinetica cluster to acquire credentials (only for on-prem clusters deployed within Azure) GCS: Satisfy the Application Default Credentials requirements for credentials, making sure any supporting server-side configuration (files, environment variables) is done on every node in the cluster (only for on-prem clusters deployed within GCS) S3: Use the AWS Default Credential Provider Chain to acquire credentials, making sure any supporting server-side configuration (files, environment variables) is done on every node in the cluster (only for on-prem clusters deployed within S3) |
CONNECTION TIMEOUT | HDFS, S3 | Timeout in seconds for connecting to a given storage provider |
CONTAINER NAME | AZURE | Azure storage container name |
SAS TOKEN | AZURE | Azure storage account shared access signature token; this should be an account-level access token, not a container-level one |
STORAGE ACCOUNT NAME | AZURE | Azure storage account name (only used if TENANT ID is specified) |
TENANT ID | AZURE | Azure Active Directory tenant identifier |
GCS_BUCKET_NAME | GCS | Name of the GCS bucket to use as the data source |
GCS_PROJECT_ID | GCS | Name of the Google Cloud project to use for request billing |
GCS_SERVICE_ACCOUNT_KEYS | GCS | Text of the JSON key file containing the GCS private key |
DELEGATION TOKEN | HDFS | Optional Kerberos delegation token for worker nodes; if not specified, the token will be acquired from HDFS |
KERBEROS KEYTAB | HDFS | Location of the Kerberos keytab file in KiFS |
USE KERBEROS | HDFS | Whether to attempt Kerberos authentication to HDFS |
JDBC_DRIVER_CLASS_NAME | JDBC | JDBC driver class name (optional, if the name is available in the JAR file’s manifest) |
JDBC_DRIVER_JAR_PATH | JDBC | KiFS path of the JDBC driver JAR file to use |
KAFKA_TOPIC_NAME | KAFKA, CONFLUENT | Kafka topic to access |
SCHEMA_REGISTRY_CREDENTIAL | KAFKA, CONFLUENT | Credential object to use to authenticate to the Confluent Schema Registry |
SCHEMA_REGISTRY_LOCATION | KAFKA, | Location of the Confluent Schema Registry in format: [storage_path[:storage_port]] |
SCHEMA_REGISTRY_PORT | KAFKA, CONFLUENT | Port of the Confluent Schema Registry |
BUCKET NAME | S3 | Amazon S3 bucket name |
REGION | S3 | Amazon S3 region identifier |
S3_AWS_ROLE_ARN | S3 | Amazon Resource Name (ARN) specifying the role |
S3_ENCRYPTION_CUSTOMER_ALGORITHM | S3 | Algorithm used to encrypt/decrypt data |
S3_ENCRYPTION_CUSTOMER_KEY | S3 | Key used to encrypt/decrypt data |
S3_USE_VIRTUAL_ADDRESSING | S3 | Whether to use virtual addressing when referencing the S3 sink true: The requests URI should be specified in virtual-hosted-style format where the bucket name is part of the domain name in the URL false: Use path-style URI for requests |
S3_VERIFY_SSL | S3 | Whether to verify SSL connections true: Verify SSL connections (default) false: Don’t verify SSL connections; for testing purposes, bypassing TLS errors, self-signed certificates, etc. |
Provider-Specific Syntax
Several authentication schemes across multiple providers are supported.Azure
Syntax below, examples here.GCS
Syntax below, examples here.HDFS
Syntax below, examples here.JDBC
Syntax below, examples here.Kafka (Apache)
Syntax below, examples here.Kafka (Confluent)
Syntax below, examples here.S3
Syntax below, examples here.ALTER DATA SOURCE
Alters the connection parameters of an existing data source.Parameters
EXTERNAL
EXTERNAL
<schema name>
<schema name>
<data source name>
<data source name>
SET PROPERTY
SET PROPERTY
Examples
To alter a data source,kin_ds, updating the username & password:
Set Properties
All data source properties can be altered via ALTER DATA SOURCE. The following are the property names and descriptions to use when performing an alteration.Provider-Agnostic Properties
LOCATION
LOCATION
| Provider | Description |
|---|---|
AZURE | Microsoft Azure blob storage |
CONFLUENT | Confluent Kafka streaming feed |
GCS | Google Cloud Storage |
HDFS | Apache Hadoop Distributed File System |
JDBC | JDBC connection, where LOCATION is the JDBC or CData JDBC URL. See the supported list for the full list of supported drivers, or specify one with JDBC_DRIVER_JAR_PATH and JDBC_DRIVER_CLASS_NAME. |
KAFKA | Apache Kafka streaming feed |
S3 | Amazon S3 bucket |
USER
USER
<username>, to use for authenticating to the data sourcePASSWORD
PASSWORD
<password>, to use for authenticating to the data sourceCREDENTIAL
CREDENTIAL
VALIDATE
VALIDATE
TRUE, the creation of a
data source that cannot be connected to will fail; if FALSE, the data source will be
created regardless of connectivity. Default is TRUE.Provider-Specific Properties
| Option | Provider | Description |
|---|---|---|
WAIT TIMEOUT | Any except JDBC | Timeout in seconds for reading from the storage provider |
USE_MANAGED_CREDENTIALS | AZURE, GCS, S3 | Whether to connect to the storage provider with provider-managed credentials AZURE: Use the Azure Instance Metadata Service (IMDS) endpoint local to the Kinetica cluster to acquire credentials (only for on-prem clusters deployed within Azure) GCS: Satisfy the Application Default Credentials requirements for credentials, making sure any supporting server-side configuration (files, environment variables) is done on every node in the cluster (only for on-prem clusters deployed within GCS) S3: Use the AWS Default Credential Provider Chain to acquire credentials, making sure any supporting server-side configuration (files, environment variables) is done on every node in the cluster (only for on-prem clusters deployed within S3) |
CONNECTION TIMEOUT | HDFS, S3 | Timeout in seconds for connecting to a given storage provider |
CONTAINER NAME | AZURE | Azure storage container name |
SAS TOKEN | AZURE | Azure storage account shared access signature token; this should be an account-level access token, not a container-level one |
STORAGE ACCOUNT NAME | AZURE | Azure storage account name (only used if TENANT ID is specified) |
TENANT ID | AZURE | Azure Active Directory tenant identifier |
GCS_BUCKET_NAME | GCS | Name of the GCS bucket to use as the data source |
GCS_PROJECT_ID | GCS | Name of the Google Cloud project to use for request billing |
GCS_SERVICE_ACCOUNT_KEYS | GCS | Text of the JSON key file containing the GCS private key |
DELEGATION TOKEN | HDFS | Optional Kerberos delegation token for worker nodes; if not specified, the token will be acquired from HDFS |
KERBEROS KEYTAB | HDFS | Location of the Kerberos keytab file in KiFS |
USE KERBEROS | HDFS | Whether to attempt Kerberos authentication to HDFS |
JDBC_DRIVER_CLASS_NAME | JDBC | JDBC driver class name (optional, if the name is available in the JAR file’s manifest) |
JDBC_DRIVER_JAR_PATH | JDBC | KiFS path of the JDBC driver JAR file to use |
KAFKA_TOPIC_NAME | KAFKA, CONFLUENT | Kafka topic to access |
SCHEMA_REGISTRY_CREDENTIAL | KAFKA, CONFLUENT | Credential object to use to authenticate to the Confluent Schema Registry |
SCHEMA_REGISTRY_LOCATION | KAFKA, | Location of the Confluent Schema Registry in format: [storage_path[:storage_port]] |
SCHEMA_REGISTRY_PORT | KAFKA, CONFLUENT | Port of the Confluent Schema Registry |
BUCKET NAME | S3 | Amazon S3 bucket name |
REGION | S3 | Amazon S3 region identifier |
S3_AWS_ROLE_ARN | S3 | Amazon Resource Name (ARN) specifying the role |
S3_ENCRYPTION_CUSTOMER_ALGORITHM | S3 | Algorithm used to encrypt/decrypt data |
S3_ENCRYPTION_CUSTOMER_KEY | S3 | Key used to encrypt/decrypt data |
S3_USE_VIRTUAL_ADDRESSING | S3 | Whether to use virtual addressing when referencing the S3 sink true: The requests URI should be specified in virtual-hosted-style format where the bucket name is part of the domain name in the URL false: Use path-style URI for requests |
S3_VERIFY_SSL | S3 | Whether to verify SSL connections true: Verify SSL connections (default) false: Don’t verify SSL connections; for testing purposes, bypassing TLS errors, self-signed certificates, etc. |
LIST DATA SOURCE
Outputs a list of tables accessible through a JDBC data source.LIST DATA SOURCE is a single-column result set
of table names in <schema_name>.<table_name> format.Parameters
DATA SOURCE
DATA SOURCE
<schema name>
<schema name>
<data source name>
<data source name>
WITH OPTIONS
WITH OPTIONS
Examples
To list all the tables accessible through a data source,kin_ds_jdbc:
Options
The_ and % characters can be used in any of these options to match
any single character or any number of characters in object names, respectively.
If multiple options are used, the tables must match all specified criteria.
CATALOG
CATALOG
SCHEMA
SCHEMA
TABLE
TABLE
DROP DATA SOURCE
Removes an existing data source.Parameters
EXTERNAL
EXTERNAL
<schema name>
<schema name>
<data source name>
<data source name>
Examples
To drop a data source,kin_ds:
SHOW DATA SOURCE
Outputs the DDL statement required to reconstruct the given data source.PASSWORD value will be masked and would need to be replaced with the
actual password if attempting to reconstruct the data source.
SHOW DATA SOURCE is a single-column result set
with the DDL statement as the value in the DDL column.Parameters
EXTERNAL
EXTERNAL
<schema name>
<schema name>
<data source name>
<data source name>
* instead of
schema/data source name to output the DDL of all data sourcesExamples
To output the DDL for a data source,kin_ds:
DESCRIBE DATA SOURCE
Outputs the configuration of an existing data source.Parameters
EXTERNAL
EXTERNAL
<schema name>
<schema name>
<data source name>
<data source name>
* instead of
schema/data source name to output the configuration of all data sourcesResponse
The response toDESCRIBE DATA SOURCE is a three-column result set:
| Output Column | Description |
|---|---|
DATA_SOURCE | Name of the data source |
STORAGE_PROVIDER_TYPE | Keyword associated with storage provider |
ADDITIONAL_INFO | Data source configuration; passwords will be redacted |
Examples
To show the configuration for a data source,kin_ds:
CREATE DATA SINK
Creates a new data sink, which contains the location and connection information for a data consumer that is generally external to Kinetica. A data sink serves as an authentication & access mechanism to the remote resource.- Azure (Microsoft blob storage)
- GCS (Google Cloud Storage)
- HDFS (Apache Hadoop Distributed File System)
- JDBC (Java Database Connectivity, using a user-supplied driver or one of the drivers on the supported list)
- Kafka (Apache Kafka streaming feed)
- Local (Table within the same Kinetica instance)
- S3 (Amazon S3 Bucket)
- Webhook (HTTP/HTTPS)
Parameters
OR REPLACE
OR REPLACE
EXTERNAL
EXTERNAL
<schema name>
<schema name>
<data sink name>
<data sink name>
<consumer>
<consumer>
| Consumer | Description |
|---|---|
AZURE | Microsoft Azure blob storage |
GCS | Google Cloud Storage |
HDFS | Apache Hadoop Distributed File System |
HTTP | Unsecured webhook |
HTTPS | Secured webhook |
JDBC | JDBC connection, where LOCATION is the JDBC URL. See the supported list for the full list of supported drivers, or specify one with JDBC_DRIVER_JAR_PATH and JDBC_DRIVER_CLASS_NAME. |
KAFKA | Apache Kafka broker |
TABLE | Local Kinetica table residing in the same database as the data sink: |
S3 | Amazon S3 bucket |
<target>
<target>
target is left blank:| Consumer | Default Host |
|---|---|
| Azure | <storage_account_name>.blob.core.windows.net |
| GCS | storage.googleapis.com |
| S3 | <region>.amazonaws.com |
target is the name of the name of the target table:<port>
<port>
WITH OPTIONS
WITH OPTIONS
Examples
To create a data sink,kin_dsink, that connects to an Apache Kafka broker:
Options
| Option | Consumer | Description |
|---|---|---|
CREDENTIAL | Any | Credential object to use to authenticate to the remote consumer |
VALIDATE | Any | Whether to test the connection to the data sink upon creation; if TRUE (default), the creation of a data sink that cannot be connected to will fail; if FALSE, the data sink will be created regardless of connectivity |
WAIT TIMEOUT | Any except JDBC | Timeout in seconds for reading from the consumer |
CONNECTION TIMEOUT | Any except JDBC | Timeout in seconds for connecting to a consumer |
USE_HTTPS | AZURE, GCS, S3 | Whether to connect to the storage consumer over HTTPS or not true: Use HTTPS (default) false: Use HTTP |
USE_MANAGED_CREDENTIALS | AZURE, GCS, S3 | Whether to connect to the storage provider with consumer-managed credentials AZURE: Use the Azure Instance Metadata Service (IMDS) endpoint local to the Kinetica cluster to acquire credentials (only for on-prem clusters deployed within Azure) GCS: Satisfy the Application Default Credentials requirements for credentials, making sure any supporting server-side configuration (files, environment variables) is done on every node in the cluster (only for on-prem clusters deployed within GCS) S3: Use the AWS Default Credential Provider Chain to acquire credentials, making sure any supporting server-side configuration (files, environment variables) is done on every node in the cluster (only for on-prem clusters deployed within S3) |
KAFKA_TOPIC_NAME | KAFKA | Kafka topic to write to |
JSON_FORMAT | HTTP, HTTPS, KAFKA | The desired format of JSON encoded notifications message. flat: A single record is returned per message nested: Records are returned in an array per message |
MAX_BATCH_SIZE | HTTP, HTTPS, KAFKA | Maximum number of records per notification message |
MAX_MESSAGE_SIZE | HTTP, HTTPS, KAFKA | Maximum size in bytes of each notification message |
CONTAINER NAME | AZURE | Azure storage container name |
SAS TOKEN | AZURE | Azure storage account shared access signature token; this should be an account-level access token, not a container-level one |
STORAGE ACCOUNT NAME | AZURE | Azure storage account name (only used if TENANT ID is specified) |
TENANT ID | AZURE | Azure Active Directory tenant identifier |
GCS_BUCKET_NAME | GCS | Name of the GCS bucket to use as the data sink |
GCS_PROJECT_ID | GCS | Name of the Google Cloud project to use for request billing |
GCS_SERVICE_ACCOUNT_KEYS | GCS | Text of the JSON key file containing the GCS private key |
DELEGATION TOKEN | HDFS | Optional Kerberos delegation token for worker nodes; if not specified, the token will be acquired from HDFS |
KERBEROS KEYTAB | HDFS | Location of the Kerberos keytab file in KiFS |
USE KERBEROS | HDFS | Whether to attempt Kerberos authentication to HDFS |
JDBC_DRIVER_CLASS_NAME | JDBC | JDBC driver class name (optional, if the name is available in the JAR file’s manifest) |
JDBC_DRIVER_JAR_PATH | JDBC | KiFS path of the JDBC driver JAR file to use |
BUCKET NAME | S3 | Amazon S3 bucket name |
REGION | S3 | Amazon S3 region identifier |
S3_AWS_ROLE_ARN | S3 | Amazon Resource Name (ARN) specifying the role |
S3_ENCRYPTION_CUSTOMER_ALGORITHM | S3 | Algorithm used to encrypt/decrypt data |
S3_ENCRYPTION_CUSTOMER_KEY | S3 | Key used to encrypt/decrypt data |
S3_ENCRYPTION_TYPE | S3 | Server side encryption type |
S3_KMS_KEY_ID | S3 | KMS key |
S3_USE_VIRTUAL_ADDRESSING | S3 | Whether to use virtual addressing when referencing the S3 sink true: The requests URI should be specified in virtual-hosted-style format where the bucket name is part of the domain name in the URL false: Use path-style URI for requests |
S3_VERIFY_SSL | S3 | Whether to verify SSL connections true: Verify SSL connections (default) false: Don’t verify SSL connections; for testing purposes, bypassing TLS errors, self-signed certificates, etc. |
Consumer-Specific Syntax
Several authentication schemes across multiple consumers are supported.Azure
Syntax below, examples here.GCS
Syntax below, examples here.HDFS
Syntax below, examples here.JDBC
Syntax below, examples here.Kafka
Syntax below, examples here.Local (Kinetica)
Syntax below, examples here.S3
Syntax below, examples here.Webhook
Syntax below, examples here.ALTER DATA SINK
Alters the connection parameters of an existing data sink.Parameters
EXTERNAL
EXTERNAL
<schema name>
<schema name>
<data sink name>
<data sink name>
SET PROPERTY
SET PROPERTY
Examples
To alter a data sink,kin_dsink, updating the timeouts:
Set Properties
All data sink properties can be altered via ALTER DATA SINK. The following are the property names and descriptions to use when performing an alteration.| Option | Consumer | Description |
|---|---|---|
CREDENTIAL | Any | Credential object to use to authenticate to the remote consumer |
LOCATION | Any | Location of the data sink; see Consumer-Specific Syntax for details |
VALIDATE | Any | Whether to test the connection to the data sink upon creation; if TRUE (default), the creation of a data sink that cannot be connected to will fail; if FALSE, the data sink will be created regardless of connectivity |
WAIT TIMEOUT | Any except JDBC | Timeout in seconds for reading from the consumer |
CONNECTION TIMEOUT | Any except JDBC | Timeout in seconds for connecting to a consumer |
USE_HTTPS | AZURE, GCS, S3 | Whether to connect to the storage provider over HTTPS or not true: Use HTTPS (default) false: Use HTTP |
USE_MANAGED_CREDENTIALS | AZURE, GCS, S3 | Whether to connect to the storage provider with consumer-managed credentials AZURE: Use the Azure Instance Metadata Service (IMDS) endpoint local to the Kinetica cluster to acquire credentials (only for on-prem clusters deployed within Azure) GCS: Satisfy the Application Default Credentials requirements for credentials, making sure any supporting server-side configuration (files, environment variables) is done on every node in the cluster (only for on-prem clusters deployed within GCS) S3: Use the AWS Default Credential Provider Chain to acquire credentials, making sure any supporting server-side configuration (files, environment variables) is done on every node in the cluster (only for on-prem clusters deployed within S3) |
KAFKA_TOPIC_NAME | KAFKA | Kafka topic to write to |
JSON_FORMAT | HTTP, HTTPS, KAFKA | The desired format of JSON encoded notifications message. flat: A single record is returned per message nested: Records are returned in an array per message |
MAX_BATCH_SIZE | HTTP, HTTPS, KAFKA | Maximum number of records per notification message |
MAX_MESSAGE_SIZE | HTTP, HTTPS, KAFKA | Maximum size in bytes of each notification message |
CONTAINER NAME | AZURE | Azure storage container name |
SAS TOKEN | AZURE | Azure storage account shared access signature token; this should be an account-level access token, not a container-level one |
STORAGE ACCOUNT NAME | AZURE | Azure storage account name (only used if TENANT ID is specified) |
TENANT ID | AZURE | Azure Active Directory tenant identifier |
GCS_BUCKET_NAME | GCS | Name of the GCS bucket to use as the data sink |
GCS_PROJECT_ID | GCS | Name of the Google Cloud project to use for request billing |
GCS_SERVICE_ACCOUNT_KEYS | GCS | Text of the JSON key file containing the GCS private key |
DELEGATION TOKEN | HDFS | Optional Kerberos delegation token for worker nodes; if not specified, the token will be acquired from HDFS |
KERBEROS KEYTAB | HDFS | Location of the Kerberos keytab file in KiFS |
USE KERBEROS | HDFS | Whether to attempt Kerberos authentication to HDFS |
JDBC_DRIVER_CLASS_NAME | JDBC | JDBC driver class name (optional, if the name is available in the JAR file’s manifest) |
JDBC_DRIVER_JAR_PATH | JDBC | KiFS path of the JDBC driver JAR file to use |
BUCKET NAME | S3 | Amazon S3 bucket name |
REGION | S3 | Amazon S3 region identifier |
S3_AWS_ROLE_ARN | S3 | Amazon Resource Name (ARN) specifying the role |
S3_ENCRYPTION_CUSTOMER_ALGORITHM | S3 | Algorithm used to encrypt/decrypt data |
S3_ENCRYPTION_CUSTOMER_KEY | S3 | Key used to encrypt/decrypt data |
S3_ENCRYPTION_TYPE | S3 | Server side encryption type |
S3_KMS_KEY_ID | S3 | KMS key |
S3_USE_VIRTUAL_ADDRESSING | S3 | Whether to use virtual addressing when referencing the S3 sink true: The requests URI should be specified in virtual-hosted-style format where the bucket name is part of the domain name in the URL false: Use path-style URI for requests |
S3_VERIFY_SSL | S3 | Whether to verify SSL connections true: Verify SSL connections (default) false: Don’t verify SSL connections; for testing purposes, bypassing TLS errors, self-signed certificates, etc. |
DROP DATA SINK
Removes an existing data sink.Parameters
EXTERNAL
EXTERNAL
<schema name>
<schema name>
<data sink name>
<data sink name>
Examples
To drop a data sink,kin_dsink:
SHOW DATA SINK
Outputs the DDL statement required to reconstruct the given data sink.SHOW DATA SINK is a single-column result set
with the DDL statement as the value in the DDL column.Parameters
EXTERNAL
EXTERNAL
<schema name>
<schema name>
<data sink name>
<data sink name>
* instead of
schema/data sink name to output the DDL of all data sinksExamples
To output the DDL for a data sink,kin_dsink:
DESCRIBE DATA SINK
Outputs the configuration of an existing data sink.Parameters
EXTERNAL
EXTERNAL
<schema name>
<schema name>
<data sink name>
<data sink name>
* instead of
schema/data sink name to output the configuration of all data sinksResponse
The response toDESCRIBE DATA SINK is a three-column result set:
| Output Column | Description |
|---|---|
DATA_SINK | Name of the data sink |
STORAGE_PROVIDER_TYPE | Keyword identifying data sink consumer |
ADDITIONAL_INFO | Data sink configuration |
Examples
To show the configuration for a data sink,kin_dsink:
CREATE STREAM
Creates a new data stream (natively, a table monitor), which publishes changes in a given table to a target.- a table (external tables not supported)
- a materialized view (insert monitoring only)
- an external Apache Kafka broker
- an external webhook
- a local database table
Parameters
<stream schema name>
<stream schema name>
<stream name>
<stream name>
TABLE
TABLE
<table schema name>
<table schema name>
<table name>
<table name>
QUERY
QUERY
<query>
<query>
<delta_table> is the data table being monitored and <lookup_table> contains the
criteria for whether a <delta_table> record should be streamed or not, as determined by
the given <join_clause>.REFRESH
REFRESH
| Constant | Description |
|---|---|
ON CHANGE | Will cause notifications to be streamed any time a record is added, modified, or deleted from the monitored table |
EVERY | Allows specification of an interval in seconds, minutes, hours, or days, with the optional specification of a starting time at which the first monitor interval will run; if no start time is specified, the default will be an interval’s worth of time from the point at which the stream was created |
<filter expression>
<filter expression>
WITH OPTIONS
WITH OPTIONS
DATASINK_NAME or DESTINATION is required.Examples
To create a stream,kin_stream, that publishes inserts into the
order_stream table of orders over $10,000 via the kin_dsink data sink:
kin_stream, that publishes inserts into the
order_stream table of orders, placed within a given string-literal geofence,
via the kin_dsink data sink:
kin_stream, that publishes inserts into the
order_stream table of orders, placed within a table-driven geofence defined
in order_location, via the kin_dsink data sink; noting that the
order_location table in this example is
replicated:
kin_stream, that inserts into the local
order_target table any orders over $10,000 inserted into the
order_stream table:
Options
| Option | Consumer | Description |
|---|---|---|
DATASINK_NAME | Any | Data sink object to use to identify the consumer of this stream; mutually exclusive with destination |
DESTINATION | Any | Location of the target when not using a data sink; the URL of the consumer to connect to in this format: <consumer>://<target>[:<port>] Supported consumers include: http: Unsecured webhook https: Secured webhook kafka: Apache Kafka broker table: Local Kinetica table, where target is the name of the target table in the following format: [<schema name>.]<table name> |
EVENT | Any | Specifies the type of event to monitor insert: (default) Monitor inserts and publish them to the target update: Monitor updates and publish their counts to the target delete: Monitor deletes and publish their counts to the target |
INCREASING_COLUMN | Any | Column of the monitored table that contains ever-increasing values for new data; e.g., a timestamp or sequence ID column |
MAX_CONSECUTIVE_FAILURES | Any | Maximum number of failed notification attempts before automatically suspending the the stream. The default, -1, disables auto-suspension. |
FAILED_NOTIFICATIONS_TABLE_NAME | http https kafka | Name of the table to which failed stream notifications are written and from which they are replayed when the stream is re-enabled from a suspended state. The default is no table for storing/replaying failed stream notifications. |
KAFKA_TOPIC_NAME | kafka | Kafka topic to write to, if the target is a Kafka broker |
Consumer-Specific Syntax
Several configurations across multiple consumers are supported.DROP STREAM
Removes an existing stream.Parameters
<schema name>
<schema name>
<stream name>
<stream name>
Examples
To drop a stream,kin_stream:
SHOW STREAM
Outputs the DDL statement required to reconstruct one or more existing streams.[<schema name>.]<stream name>- output the DDL statement of the given stream<schema name>.*- output the DDL statements of all streams under the given schema*- output the DDL statements of all streams
SHOW STREAM is a single-column result set
with the DDL statement as the value in the DDL column.Parameters
<schema name>
<schema name>
<stream name>
<stream name>
Examples
To output the DDL for a stream,kin_stream:
DESCRIBE STREAM
Outputs the configuration of one or more existing streams.[<schema name>.]<stream name>- output the configuration of the given stream<schema name>.*- output the configuration of all streams under the given schema*- output the configuration of all streams
Parameters
<schema name>
<schema name>
<stream name>
<stream name>
Response
The response toDESCRIBE STREAM is a nine-column result set:
| Output Column | Description |
|---|---|
MONITOR_ID | Unique name or topic ID for the stream |
TABLE_NAME | Name of the table being monitored |
EVENT | Table data event that triggers streamed content |
INCREASING_COLUMN | Column in table being monitored that contains ever-increasing data values |
FILTER_EXPRESSION | Filter used to monitor only data of interest |
MAX_CONSECUTIVE_FAILURES | Maximum number of failed notification attempts before suspending the stream; -1 disables auto-suspend |
FAILED_NOTIFICATIONS_TABLE_NAME | Name of the table to which failed stream notifications are written and replayed when re-enabled |
DATASINK_NAME | Name of the data sink associated with the consumer |
STATUS | Whether the stream is enabled or suspended |
Examples
To show the configuration for a stream,kin_stream:
DESCRIBE
Lists the contained tables and views of a given schema, or lists the columns and column types & properties for a given table or view.Examples
To describe the tables contained in thedemo schema, into
which demo data is usually downloaded:
Comments
Comments, text-based tags, can be applied to a variety of object types. Normally, comments are applied to objects after the object is created; however, comments can also be applied in-line to table columns within a CREATE TABLE or CREATE EXTERNAL TABLE statement.Parameters
<object type>
<object type>
| Type | Description |
|---|---|
SCHEMA | Apply a schema-level comment; can be viewed in the ki_catalog.ki_schemas Kinetica catalog table |
TABLE | Apply a table-level comment; can be viewed in the ki_catalog.ki_objects Kinetica catalog table |
VIEW | Apply a view-level comment; can be viewed in the ki_catalog.ki_objects Kinetica catalog table |
COLUMN | Apply a column-level comment; can be viewed in the ki_catalog.ki_columns Kinetica catalog table |
PROCEDURE | Apply a procedure-level comment; can be viewed in the ki_catalog.ki_objects Kinetica catalog table |
USER | Apply a user-level comment; can be viewed in the ki_catalog.ki_users_and_roles Kinetica catalog table |
ROLE | Apply a role-level comment; can be viewed in the ki_catalog.ki_users_and_roles Kinetica catalog table |
<object name>
<object name>
<comment text>
<comment text>
<column comment>to the column associated with<column name>