Version:

/create/type

URL: http://GPUDB_IP_ADDRESS:GPUDB_PORT/create/type

Creates a new type describing the layout or schema of a table. The type definition is a JSON string describing the fields (i.e. columns) of the type. Each field consists of a name and a data type. Supported data types are: double, float, int, long, string, and bytes. In addition one or more properties can be specified for each column which customize the memory usage and query availability of that column. Note that some properties are mutually exclusive--i.e. they cannot be specified for any given column simultaneously. One example of mutually exclusive properties are data and store_only.

A single primary key and/or single shard key can be set across one or more columns. If a primary key is specified, then a uniqueness constraint is enforced, in that only a single object can exist with a given primary key. When inserting data into a table with a primary key, depending on the parameters in the request, incoming objects with primary key values that match existing objects will either overwrite (i.e. update) the existing object or will be skipped and not added into the set.

Example of a type definition with some of the parameters:

{"type":"record",
"name":"point",
"fields":[{"name":"msg_id","type":"string"},
                {"name":"x","type":"double"},
                {"name":"y","type":"double"},
                {"name":"TIMESTAMP","type":"double"},
                {"name":"source","type":"string"},
                {"name":"group_id","type":"string"},
                {"name":"OBJECT_ID","type":"string"}]
}

Properties:

{"group_id":["store_only"],
"msg_id":["store_only","text_search"]
}

Input Parameter Description

Name Type Description
type_definition string a JSON string describing the columns of the type to be registered.
label string A user-defined description string which can be used to differentiate between tables and types with otherwise identical schemas.
properties map of string to arrays of strings

Each key-value pair specifies the properties to use for a given column where the key is the column name. All keys used must be relevant column names for the given table. Specifying any property overrides the default properties for that column (which is based on the column's data type).

Allowed Values Description
data Default property for all numeric and string type columns; makes the column available for GPU queries.
text_search Valid only for 'string' columns. Enables full text search for string columns. Can be set independently of data and store_only.
store_only Persist the column value but do not make it available to queries (e.g. /filter)-i.e. it is mutually exclusive to the data property. Any 'bytes' type column must have a store_only property. This property reduces system memory usage.
disk_optimized Works in conjunction with the data property for string columns. This property reduces system disk usage by disabling reverse string lookups. Queries like /filter, /filter/bylist, and /filter/byvalue work as usual but /aggregate/unique, /aggregate/groupby and /get/records/bycolumn are not allowed on columns with this property.
timestamp Valid only for 'long' columns. Indicates that this field represents a timestamp and will be provided in milliseconds since the Unix epoch: 00:00:00 Jan 1 1970. Dates represented by a timestamp must fall between the year 1000 and the year 2900.
decimal Valid only for 'string' columns. It represents a SQL type NUMERIC(19, 4) data type. There can be up to 15 digits before the decimal point and up to four digits in the fractional part. The value can be positive or negative (indicated by a minus sign at the beginning). This property is mutually exclusive with the text_search property.
date Valid only for 'string' columns. Indicates that this field represents a date and will be provided in the format 'YYYY-MM-DD'. The allowable range is 1000-01-01 through 2900-01-01. This property is mutually exclusive with the text_search property.
time Valid only for 'string' columns. Indicates that this field represents a time-of-day and will be provided in the format 'HH:MM:SS.mmm'. The allowable range is 00:00:00.000 through 23:59:59.999. This property is mutually exclusive with the text_search property.
datetime Valid only for 'string' columns. Indicates that this field represents a datetime and will be provided in the format 'YYYY-MM-DD HH:MM:SS.mmm'. The allowable range is 1000-01-01 00:00:00.000 through 2900-01-01 23:59:59.999. This property is mutually exclusive with the text_search property.
char1 This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 1 character.
char2 This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 2 characters.
char4 This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 4 characters.
char8 This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 8 characters.
char16 This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 16 characters.
char32 This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 32 characters.
char64 This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 64 characters.
char128 This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 128 characters.
char256 This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 256 characters.
int8 This property provides optimized memory and query performance for int columns. Ints with this property must be between -128 and +127 (inclusive)
int16 This property provides optimized memory and query performance for int columns. Ints with this property must be between -32768 and +32767 (inclusive)
ipv4 This property provides optimized memory, disk and query performance for string columns representing IPv4 addresses (i.e. 192.168.1.1). Strings with this property must be of the form: A.B.C.D where A, B, C and D are in the range of 0-255.
wkt Valid only for 'string' and 'bytes' columns. Indicates that this field contains geospatial geometry objects in Well-Known Text (WKT) or Well-Known Binary (WKB) format.
primary_key This property indicates that this column will be part of (or the entire) primary key.
shard_key This property indicates that this column will be part of (or the entire) shard key.
nullable This property indicates that this column is nullable. However, setting this property is insufficient for making the column nullable. The user must declare the type of the column as a union between its regular type and 'null' in the avro schema for the record type in input parameter type_definition. For example, if a column is of type integer and is nullable, then the entry for the column in the avro schema must be: ['int', 'null']. The C++, C#, Java, and Python APIs have built-in convenience for bypassing setting the avro schema by hand. For those languages, one can use this property as usual and not have to worry about the avro schema for the record.
dict This property indicates that this column should be dictionary encoded. It can only be used in conjunction with string columns marked with a charN or date property or with int or long columns. This property is appropriate for columns where the cardinality (the number of unique values) is expected to be low, and can save a large amount of memory.
init_with_now For columns with attributes of date, time, datetime or timestamp, at insert time, replace empty strings and invalid timestamps with NOW()
options map of string to strings Optional parameters. The default value is an empty map ( {} ).

Output Parameter Description

The GPUdb server embeds the endpoint response inside a standard response structure which contains status information and the actual response to the query. Here is a description of the various fields of the wrapper:

Name Type Description
status String 'OK' or 'ERROR'
message String Empty if success or an error message
data_type String 'create_type_request' or 'none' in case of an error
data String Empty string
data_str JSON or String

This embedded JSON represents the result of the /create/type endpoint:

Name Type Description
type_id string An identifier representing the created type. This type_id can be used in subsequent calls to create a table
type_definition string Value of input parameter type_definition.
label string Value of input parameter label.
properties map of string to arrays of strings Value of input parameter properties.

Empty string in case of an error.