A type is analogous to a traditional database schema for a table. Before data can be stored in Kinetica, a type must be specified for that data. Kinetica assigns each type a GUID, and all types with identical characteristics share the same GUID.
Every type in Kinetica consists of the following:
A type label serves as a tagging mechanism for the type. The type label can be any text string specified by the client. The type label serves two purposes. First, it identifies tables with similar data. Second, it helps determine a type’s uniqueness.
A type schema consists of a set of column names and their respective primitive types. Each column can also be assigned a number of pre-defined properties that determine aspects of that column, like searchability and whether or not the column is indexed.
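As a rough illustration, a type can be pictured as a label, a schema, and a set of per-column properties. The sketch below builds such a description as plain Python data. The Avro-style record layout, the dictionary keys (`label`, `type_definition`, `properties`), and all column names are assumptions made for illustration; this section does not specify the request format, so check the API reference before relying on it.

```python
import json

# Hypothetical columns: each name is paired with one of the base types.
columns = [
    ("id", "long"),
    ("name", "string"),
    ("score", "double"),
]

# Assumed Avro-style record layout for the type schema.
type_schema = {
    "type": "record",
    "name": "example_type",
    "fields": [{"name": name, "type": base} for name, base in columns],
}

# Column properties, keyed by column name (illustrative choices only).
column_properties = {
    "id": ["primary_key"],
    "name": ["char32"],
}

# The label is free-form text chosen by the client.
type_request = {
    "label": "example_label",
    "type_definition": json.dumps(type_schema),
    "properties": column_properties,
}

print(json.dumps(type_request, indent=2))
```

Keeping the properties in a separate mapping, rather than inside the schema, mirrors the way this page describes them: the schema fixes names and base types, and the properties refine how each column is interpreted or stored.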
A given column in a type schema must be of one of the following primitive types:
Base Type | Native Type | Bytes in Memory | Minimum Value | Maximum Value |
---|---|---|---|---|
int | integer (default) | 4 | -2147483648 | 2147483647 |
int | int16 | 2 | -32768 | 32767 |
int | int8 | 1 | -128 | 127 |
long | long (default) | 8 | -9223372036854775808 | 9223372036854775807 |
long | timestamp | 8 | -30610224000000 (1/1/1000 00:00:00.000) | 29379542399999 (12/31/2900 23:59:59.999) |
float | float (default) | 4 | -3.40282 * 10^38 | 3.40282 * 10^38 |
double | double (default) | 8 | -1.79769313486231 * 10^308 | 1.79769313486231 * 10^308 |
string | string (default) | 8 | (empty string) | (100,000,000,000 characters) |
string | char256 | 256 | (empty string) | (256 characters) |
string | char128 | 128 | (empty string) | (128 characters) |
string | char64 | 64 | (empty string) | (64 characters) |
string | char32 | 32 | (empty string) | (32 characters) |
string | char16 | 16 | (empty string) | (16 characters) |
string | char8 | 8 | (empty string) | (8 characters) |
string | char4 | 4 | (empty string) | (4 characters) |
string | char2 | 2 | (empty string) | (2 characters) |
string | char1 | 1 | (empty string) | (1 character) |
string | ipv4 | 4 | 0.0.0.0 | 255.255.255.255 |
string | decimal | 8 | -922337203685477.5808 | 922337203685477.5807 |
string | date | 4 | 1000-01-01 | 2900-12-31 |
string | time | 4 | 00:00:00.000 | 23:59:59.999 |
string | datetime | 8 | 1000-01-01 00:00:00.000 | 2900-12-31 23:59:59.999 |
string | wkt | varies | (empty string) | (100,000,000,000 characters) |
bytes | bytes (default) | N/A | (empty array) | (100,000,000,000 bytes) |
bytes | wkt | varies | (empty array) | (100,000,000,000 bytes) |
Note: Adding nullability to a column requires an additional byte per value; e.g., a nullable integer requires 5 bytes in memory instead of 4.
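The per-value sizes in the table of primitive types, together with the one-byte nullability overhead, allow a back-of-the-envelope estimate of memory per record. The following is a minimal sketch: the function names and the example column list are hypothetical, and variable-size types (string, wkt, bytes) are left out because the table gives no fixed per-value size that could be summed for them.

```python
# Bytes in memory per value, copied from the table of primitive types.
# Variable-size types (string, wkt, bytes) are deliberately omitted.
BYTES_IN_MEMORY = {
    "int": 4, "int16": 2, "int8": 1,
    "long": 8, "timestamp": 8,
    "float": 4, "double": 8,
    "char1": 1, "char2": 2, "char4": 4, "char8": 8, "char16": 16,
    "char32": 32, "char64": 64, "char128": 128, "char256": 256,
    "ipv4": 4, "decimal": 8, "date": 4, "time": 4, "datetime": 8,
}


def value_size(type_name, nullable=False):
    """Size of one value in bytes; nullability adds one byte."""
    return BYTES_IN_MEMORY[type_name] + (1 if nullable else 0)


def record_size(columns):
    """Sum the value sizes over (type_name, nullable) pairs."""
    return sum(value_size(name, nullable) for name, nullable in columns)


# Hypothetical record layout used only for this example.
example_columns = [
    ("int", True),         # 4 + 1 = 5
    ("timestamp", False),  # 8
    ("char16", True),      # 16 + 1 = 17
    ("decimal", False),    # 8
]

print(value_size("int", nullable=True))  # 5
print(record_size(example_columns))      # 38
```

An estimate like this covers only the column values themselves; it says nothing about index or bookkeeping overhead, which this page does not quantify.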
The decimal type should be used instead of float or double when exact values need to be stored and rounding errors are unacceptable, such as currency.
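The difference between binary floating point and exact decimal arithmetic can be seen with Python's standard `decimal` module, used here purely as a stand-in for the database type. The sketch also checks a value against the documented decimal range and 4-digit scale. The documented limits equal a signed 64-bit integer divided by 10^4; treating the type as stored that way is an assumption, and `fits_decimal` is a hypothetical helper.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_total = 0.1 + 0.2
print(float_total == 0.3)  # False

# Decimal arithmetic keeps these values exact.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total == Decimal("0.30"))  # True

# The documented range corresponds to a signed 64-bit integer scaled by 10^4.
SCALE = 10 ** 4
DECIMAL_MIN = Decimal(-(2 ** 63)) / SCALE
DECIMAL_MAX = Decimal(2 ** 63 - 1) / SCALE


def fits_decimal(text):
    """Check that a numeric string is in range and has at most 4 decimal places."""
    value = Decimal(text)
    if not (DECIMAL_MIN <= value <= DECIMAL_MAX):
        return False
    # Quantizing to 4 places must not change the value.
    return value == value.quantize(Decimal("0.0001"))


print(DECIMAL_MAX)               # 922337203685477.5807
print(fits_decimal("19.99"))     # True
print(fits_decimal("0.00001"))   # False
```

The range check runs before the scale check so that very large inputs are rejected without being quantized.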
Decimal fields are supported by most, but not all, endpoint functions. Those that currently support decimal are:
Kinetica adds a further layer of semantics on top of the column data type. When a column is created, it can be given one or more of the supported properties, which give the column special meaning or handling. The properties can refine the data type, direct special handling, or define the keyed nature of the column. These modifiers can affect the number of records that can be stored in memory, the performance of queries, and the types of operations that can be performed on the data.
The following properties can be used to modify the allowable set of values for the corresponding base type. Only one of the following properties may be applied to a given column, and the column must be of the stated base type. The exception to this is nullability, which can be applied to any column in addition to the data type specifiers listed here. However, nullable is not used at type creation, but will be returned in a /show/table call. Denoting a column's nullability is API-dependent, and, with the exception of the Java API, does not involve the direct use of the nullable column property.
Property | Base Type | Description |
---|---|---|
char1 | string | Text of up to 1 character; optimizes memory, disk, and query performance |
char2 | string | Text of up to 2 characters; optimizes memory, disk, and query performance |
char4 | string | Text of up to 4 characters; optimizes memory, disk, and query performance |
char8 | string | Text of up to 8 characters; optimizes memory, disk, and query performance |
char16 | string | Text of up to 16 characters; optimizes memory, disk, and query performance |
char32 | string | Text of up to 32 characters; optimizes memory, disk, and query performance |
char64 | string | Text of up to 64 characters; optimizes memory, disk, and query performance |
char128 | string | Text of up to 128 characters; optimizes memory, disk, and query performance |
char256 | string | Text of up to 256 characters; optimizes memory, disk, and query performance |
date | string | Interprets a string field as a date of the form YYYY-MM-DD |
datetime | string | Interprets a string field as a combination of date and time in the form of YYYY-MM-DD HH:MM:SS[.mmm] |
decimal | string | Interprets a string field as a decimal number, with up to 19 digits of precision and 4 digits of scale |
int8 | int | Numbers limited to 8-bit signed integers; optimizes memory and query performance |
int16 | int | Numbers limited to 16-bit signed integers; optimizes memory and query performance |
ipv4 | string | Dotted decimal IPv4 addresses of the form A.B.C.D, where A, B, C, and D are between 0 and 255, inclusive (e.g. 127.0.0.1); optimizes memory, disk, and query performance |
nullable | <any> | Values can be set to null. Note: This property is generally not used in column creation; assigning nullability is an API-dependent exercise |
time | string | Interprets a string field as a time of the form HH:MM:SS[.mmm] |
timestamp | long | Timestamps in milliseconds since the Unix epoch: Jan 1 1970 00:00:00 |
wkt | string or bytes | Indicates that the column has WKT (or WKB) strings that should be handled as geometry objects. |
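To make the timestamp and datetime formats concrete, the sketch below converts between milliseconds since the Unix epoch and the YYYY-MM-DD HH:MM:SS.mmm form, and confirms that the documented timestamp limits line up with the documented datetime limits. It uses only Python's standard library; the helper names are hypothetical, and time zones are ignored.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# Limits from the table of primitive types.
MIN_TIMESTAMP_MS = -30610224000000
MAX_TIMESTAMP_MS = 29379542399999


def timestamp_to_datetime(ms):
    """Convert milliseconds since the Unix epoch to a naive datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def datetime_to_timestamp(dt):
    """Convert a naive datetime to whole milliseconds since the Unix epoch."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def format_datetime(dt):
    """Render a datetime as YYYY-MM-DD HH:MM:SS.mmm."""
    return dt.isoformat(sep=" ", timespec="milliseconds")


print(format_datetime(timestamp_to_datetime(MIN_TIMESTAMP_MS)))  # 1000-01-01 00:00:00.000
print(format_datetime(timestamp_to_datetime(MAX_TIMESTAMP_MS)))  # 2900-12-31 23:59:59.999
print(datetime_to_timestamp(datetime(1970, 1, 2)))               # 86400000
```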
One or more of the following properties can be assigned to a column to alter the way the data is stored & handled. Valid combinations are detailed in the descriptions.
Property | Description |
---|---|
data | Directs that the column's data should be stored in memory, making it available for use in query expressions. Default property for all numeric and string type columns, unless overridden by store_only. |
dict | Dictionary encodes the associated charN column, reducing memory and disk usage. Queries against the column will also be faster. For details, see Dictionary Encoding. |
disk_optimized | Prevents variable-width strings from being written to an indexing service, saving disk space at the cost of some functionality. A /filter/bystring applied to the column will only work in equals mode, /aggregate/unique cannot be applied, and /aggregate/groupby can only be used when the count or count_distinct function is applied to the column; the column itself cannot otherwise appear in the column list. Requires the data property. |
store_only | Reduces system memory usage by not keeping a copy of the data in memory and only persisting the values. The data can be queried by column name, but since it is not in memory, no expressions (column, filter, aggregation, etc.) can be applied. It is mutually exclusive with the data property. Default property for bytes type columns. |
text_search | Enables full text search for string columns. Can be set independently of data, disk_optimized, & store_only. For full text search details and limitations, see Full Text Search. |
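The combination rules in the table above can be restated as a small checker. This is only a sketch of one reading of those descriptions: the function is hypothetical, it applies the documented defaults (data for numeric and string columns, store_only for bytes columns) when neither is given, and it does not reflect how the server itself validates properties.

```python
CHAR_N = {
    "char1", "char2", "char4", "char8", "char16",
    "char32", "char64", "char128", "char256",
}


def check_storage_properties(base_type, properties):
    """Return a list of problems found in one column's property list."""
    props = set(properties)
    problems = []

    if {"data", "store_only"} <= props:
        problems.append("data and store_only are mutually exclusive")

    # Apply the documented default when neither property is given.
    if not props & {"data", "store_only"}:
        props.add("store_only" if base_type == "bytes" else "data")

    if "disk_optimized" in props and "data" not in props:
        problems.append("disk_optimized requires data")

    if "dict" in props and not props & CHAR_N:
        problems.append("dict applies to charN columns")

    if "text_search" in props and base_type != "string":
        problems.append("text_search applies to string columns")

    return problems


print(check_storage_properties("string", ["char16", "dict"]))
# []
print(check_storage_properties("string", ["store_only", "disk_optimized"]))
# ['disk_optimized requires data']
```

Returning a list of problems rather than raising on the first one makes it easier to report every conflict in a column definition at once.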
One or both of these keyed attributes can be assigned to one or more columns. For more information, see the sections on primary keys & shard keys. Foreign keys cannot be assigned as column properties.
Property | Description |
---|---|
primary_key | Makes this column part of (or the entire) primary key |
shard_key | Makes this column part of (or the entire) shard key |
The following property can be assigned to a column to replace certain kinds of data.
Property | Description |
---|---|
init_with_now | For date, time, datetime, and timestamp columns, this property will replace empty strings and invalid timestamp values with NOW(). |