public class CreateTypeRequest extends Object implements org.apache.avro.generic.IndexedRecord
A set of parameters for GPUdb.createType(CreateTypeRequest).
Creates a new type describing the layout or schema of a table. The type
definition is a JSON string describing the fields (i.e. columns) of the
type. Each field consists of a name and a data type. Supported data types
are: double, float, int, long, string, and bytes. In addition, one or more
properties can be specified for each column, which customize the memory usage
and query availability of that column. Note that some properties are
mutually exclusive, i.e. they cannot be specified for any given column
simultaneously. One example of mutually exclusive properties is data and
store_only.
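The mutual-exclusion rule can be checked on the client before a request is sent. The following is a minimal sketch, not part of the GPUdb API: the class name PropertyConflicts and its method are hypothetical, and the conflict table contains only the pairs this page names (data with store_only, and text_search with decimal, date, time, and datetime).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of the GPUdb API): reports pairs of column
// properties that this page documents as mutually exclusive.
final class PropertyConflicts {

    // Only the exclusions stated in this documentation; the server may enforce more.
    private static final String[][] EXCLUSIVE_PAIRS = {
        {"data", "store_only"},
        {"text_search", "decimal"},
        {"text_search", "date"},
        {"text_search", "time"},
        {"text_search", "datetime"},
    };

    // Returns one message per conflicting pair found in any column's property list.
    static List<String> findConflicts(Map<String, List<String>> properties) {
        List<String> conflicts = new ArrayList<>();
        for (Map.Entry<String, List<String>> column : properties.entrySet()) {
            List<String> props = column.getValue();
            for (String[] pair : EXCLUSIVE_PAIRS) {
                if (props.contains(pair[0]) && props.contains(pair[1])) {
                    conflicts.add(column.getKey() + ": " + pair[0] + " and " + pair[1]);
                }
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Map<String, List<String>> properties = Map.of(
            "msg_id", List.of("store_only", "text_search"), // allowed combination
            "x", List.of("data", "store_only"));            // conflicting combination
        System.out.println(findConflicts(properties)); // prints [x: data and store_only]
    }
}
```

A non-empty result means the property map should be corrected before it is passed to the request.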
A single primary key and/or single shard key can be set across one or more
columns. If a primary key is specified, then a uniqueness constraint is
enforced, in that only a single object can exist with a given primary key.
When inserting data into a table with a primary key, depending on the
parameters in the request, incoming objects with primary key values that
match existing objects will either overwrite (i.e. update) the existing
object or will be skipped and not added into the set.
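The overwrite-or-skip behaviour can be pictured with an in-memory map keyed by primary key. This is only an illustrative model of the semantics described above, not how GPUdb stores data; the class PrimaryKeyModel and the flag overwriteExisting are names invented for this sketch and stand in for whichever request parameter selects the behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: a table with a primary key behaves like a map
// from key to record. This is not the GPUdb implementation.
final class PrimaryKeyModel {
    private final Map<String, String> rows = new LinkedHashMap<>();

    // Inserts a record and returns true if the table changed.
    // overwriteExisting is a hypothetical stand-in for the request parameter
    // that chooses between the two documented behaviours.
    boolean insert(String primaryKey, String record, boolean overwriteExisting) {
        if (rows.containsKey(primaryKey) && !overwriteExisting) {
            return false; // skipped: the existing object is kept
        }
        rows.put(primaryKey, record); // new key, or overwrite (update) of the existing object
        return true;
    }

    String get(String primaryKey) {
        return rows.get(primaryKey);
    }

    int size() {
        return rows.size();
    }

    public static void main(String[] args) {
        PrimaryKeyModel table = new PrimaryKeyModel();
        table.insert("id-1", "first", false);
        table.insert("id-1", "second", false);  // same key, skipped
        System.out.println(table.get("id-1"));  // prints first
        table.insert("id-1", "third", true);    // same key, overwritten
        System.out.println(table.get("id-1"));  // prints third
    }
}
```

In both cases the table still holds exactly one object for the key, which is the uniqueness constraint at work.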
Example of a type definition with some of the parameters::
{"type":"record", "name":"point", "fields":[{"name":"msg_id","type":"string"}, {"name":"x","type":"double"}, {"name":"y","type":"double"}, {"name":"TIMESTAMP","type":"double"}, {"name":"source","type":"string"}, {"name":"group_id","type":"string"}, {"name":"OBJECT_ID","type":"string"}] }
Properties::
{"group_id":["store_only"], "msg_id":["store_only","text_search"] }
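As a rough illustration of how the two JSON fragments above map onto Java values, the sketch below builds the type definition string and the properties map with the standard library only. The class name PointTypeInputs and the label "point_label" are assumptions made for the example; the call to the four-argument CreateTypeRequest constructor is shown only as a comment because it needs the GPUdb Java API on the classpath.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that assembles the inputs shown in the example above.
final class PointTypeInputs {

    // The example type definition, written as a Java string.
    static String typeDefinition() {
        return "{\"type\":\"record\",\"name\":\"point\",\"fields\":["
            + "{\"name\":\"msg_id\",\"type\":\"string\"},"
            + "{\"name\":\"x\",\"type\":\"double\"},"
            + "{\"name\":\"y\",\"type\":\"double\"},"
            + "{\"name\":\"TIMESTAMP\",\"type\":\"double\"},"
            + "{\"name\":\"source\",\"type\":\"string\"},"
            + "{\"name\":\"group_id\",\"type\":\"string\"},"
            + "{\"name\":\"OBJECT_ID\",\"type\":\"string\"}]}";
    }

    // The example properties: column name -> list of property names.
    static Map<String, List<String>> properties() {
        Map<String, List<String>> properties = new LinkedHashMap<>();
        properties.put("group_id", Arrays.asList("store_only"));
        properties.put("msg_id", Arrays.asList("store_only", "text_search"));
        return properties;
    }

    public static void main(String[] args) {
        String typeDefinition = typeDefinition();
        Map<String, List<String>> properties = properties();
        Map<String, String> options = new LinkedHashMap<>(); // empty map is the documented default

        // With the GPUdb Java API available, these values would be passed as:
        //   new CreateTypeRequest(typeDefinition, "point_label", properties, options);
        // "point_label" is an arbitrary label chosen for this sketch.
        System.out.println(typeDefinition);
        System.out.println(properties);
        System.out.println(options);
    }
}
```

Columns that do not appear in the properties map keep the default properties for their data type.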
Modifier and Type | Class and Description |
---|---|
static class | CreateTypeRequest.Properties - Each key-value pair specifies the properties to use for a given column where the key is the column name. |
Constructor and Description |
---|
CreateTypeRequest() - Constructs a CreateTypeRequest object with default parameters. |
CreateTypeRequest(String typeDefinition, String label, Map<String,List<String>> properties, Map<String,String> options) - Constructs a CreateTypeRequest object with the specified parameters. |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj) |
Object | get(int index) - This method supports the Avro framework and is not intended to be called directly by the user. |
static org.apache.avro.Schema | getClassSchema() - This method supports the Avro framework and is not intended to be called directly by the user. |
String | getLabel() |
Map<String,String> | getOptions() |
Map<String,List<String>> | getProperties() |
org.apache.avro.Schema | getSchema() - This method supports the Avro framework and is not intended to be called directly by the user. |
String | getTypeDefinition() |
int | hashCode() |
void | put(int index, Object value) - This method supports the Avro framework and is not intended to be called directly by the user. |
CreateTypeRequest | setLabel(String label) |
CreateTypeRequest | setOptions(Map<String,String> options) |
CreateTypeRequest | setProperties(Map<String,List<String>> properties) |
CreateTypeRequest | setTypeDefinition(String typeDefinition) |
String | toString() |
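Each setter in the table above returns CreateTypeRequest, and the method details on this page state that the returned value is this, so calls can be chained. The sketch below uses a small stand-in class, RequestStandIn, that mirrors only the documented setter and getter signatures so that it runs without the GPUdb library. It is not the real class, its field defaults are assumptions, and the label value is arbitrary.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in that mirrors the documented accessor signatures.
// This is NOT the real CreateTypeRequest; defaults here are assumptions.
final class RequestStandIn {
    private String typeDefinition;
    private String label;
    private Map<String, List<String>> properties = new HashMap<>();
    private Map<String, String> options = new HashMap<>();

    RequestStandIn setTypeDefinition(String typeDefinition) {
        this.typeDefinition = typeDefinition;
        return this; // returning this is what allows chaining
    }

    RequestStandIn setLabel(String label) {
        this.label = label;
        return this;
    }

    RequestStandIn setProperties(Map<String, List<String>> properties) {
        this.properties = properties;
        return this;
    }

    RequestStandIn setOptions(Map<String, String> options) {
        this.options = options;
        return this;
    }

    String getTypeDefinition() { return typeDefinition; }
    String getLabel() { return label; }
    Map<String, List<String>> getProperties() { return properties; }
    Map<String, String> getOptions() { return options; }

    public static void main(String[] args) {
        RequestStandIn request = new RequestStandIn()
            .setTypeDefinition("{\"type\":\"record\",\"name\":\"point\","
                + "\"fields\":[{\"name\":\"x\",\"type\":\"double\"}]}")
            .setLabel("point_label")
            .setProperties(Collections.singletonMap("x", Collections.singletonList("data")))
            .setOptions(new HashMap<>());
        System.out.println(request.getLabel());      // prints point_label
        System.out.println(request.getProperties()); // prints {x=[data]}
    }
}
```

With the real class, the same chain would start from the no-argument constructor listed in the constructor table.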
public CreateTypeRequest()
public CreateTypeRequest(String typeDefinition, String label, Map<String,List<String>> properties, Map<String,String> options)
typeDefinition - a JSON string describing the columns of the type to be registered.
label - A user-defined description string which can be used to differentiate between tables and types with otherwise identical schemas.
properties - Each key-value pair specifies the properties to use for a given column where the key is the column name. All keys used must be relevant column names for the given table. Specifying any property overrides the default properties for that column (which are based on the column's data type). Valid values are:
DATA: Default property for all numeric and string type columns; makes the column available for GPU queries.
TEXT_SEARCH: Valid only for 'string' columns. Enables full text search for string columns. Can be set independently of data and store_only.
STORE_ONLY: Persists the column value but does not make it available to queries (e.g. GPUdb.filter(FilterRequest)); i.e. it is mutually exclusive with the data property. Any 'bytes' type column must have the store_only property. This property reduces system memory usage.
DISK_OPTIMIZED: Works in conjunction with the data property for string columns. This property reduces system disk usage by disabling reverse string lookups. Queries like GPUdb.filter(FilterRequest), GPUdb.filterByList(FilterByListRequest), and GPUdb.filterByValue(FilterByValueRequest) work as usual, but GPUdb.aggregateUniqueRaw(AggregateUniqueRequest) and GPUdb.aggregateGroupByRaw(AggregateGroupByRequest) are not allowed on columns with this property.
TIMESTAMP: Valid only for 'long' columns. Indicates that this field represents a timestamp and will be provided in milliseconds since the Unix epoch (00:00:00 Jan 1 1970). Dates represented by a timestamp must fall between the year 1000 and the year 2900.
ULONG: Valid only for 'string' columns. It represents an unsigned long integer data type. The string can only be interpreted as an unsigned long with a minimum value of zero and a maximum value of 18446744073709551615.
DECIMAL: Valid only for 'string' columns. It represents the SQL type NUMERIC(19, 4). There can be up to 15 digits before the decimal point and up to four digits in the fractional part. The value can be positive or negative (indicated by a minus sign at the beginning). This property is mutually exclusive with the text_search property.
DATE: Valid only for 'string' columns. Indicates that this field represents a date and will be provided in the format 'YYYY-MM-DD'. The allowable range is 1000-01-01 through 2900-01-01. This property is mutually exclusive with the text_search property.
TIME: Valid only for 'string' columns. Indicates that this field represents a time-of-day and will be provided in the format 'HH:MM:SS.mmm'. The allowable range is 00:00:00.000 through 23:59:59.999. This property is mutually exclusive with the text_search property.
DATETIME: Valid only for 'string' columns. Indicates that this field represents a datetime and will be provided in the format 'YYYY-MM-DD HH:MM:SS.mmm'. The allowable range is 1000-01-01 00:00:00.000 through 2900-01-01 23:59:59.999. This property is mutually exclusive with the text_search property.
CHAR1: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 1 character.
CHAR2: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 2 characters.
CHAR4: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 4 characters.
CHAR8: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 8 characters.
CHAR16: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 16 characters.
CHAR32: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 32 characters.
CHAR64: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 64 characters.
CHAR128: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 128 characters.
CHAR256: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 256 characters.
INT8: This property provides optimized memory and query performance for int columns. Ints with this property must be between -128 and +127 (inclusive).
INT16: This property provides optimized memory and query performance for int columns. Ints with this property must be between -32768 and +32767 (inclusive).
IPV4: This property provides optimized memory, disk and query performance for string columns representing IPv4 addresses (e.g. 192.168.1.1). Strings with this property must be of the form A.B.C.D, where A, B, C and D are in the range 0-255.
WKT: Valid only for 'string' and 'bytes' columns. Indicates that this field contains geospatial geometry objects in Well-Known Text (WKT) or Well-Known Binary (WKB) format.
PRIMARY_KEY: This property indicates that this column will be part of (or the entire) primary key.
SHARD_KEY: This property indicates that this column will be part of (or the entire) shard key.
NULLABLE: This property indicates that this column is nullable. However, setting this property is insufficient for making the column nullable. The user must declare the type of the column as a union between its regular type and 'null' in the Avro schema for the record type in typeDefinition. For example, if a column is of type integer and is nullable, then the entry for the column in the Avro schema must be: ['int', 'null']. The C++, C#, Java, and Python APIs have built-in convenience for bypassing setting the Avro schema by hand. For those languages, one can use this property as usual and not have to worry about the Avro schema for the record.
DICT: This property indicates that this column should be dictionary encoded. It can only be used in conjunction with restricted string (charN), int, long or date columns. Dictionary encoding is best for columns where the cardinality (the number of unique values) is expected to be low. This property can save a large amount of memory.
INIT_WITH_NOW: For 'date', 'time', 'datetime', or 'timestamp' column types, replace empty strings and invalid timestamps with 'NOW()' upon insert.
options - Optional parameters. The default value is an empty Map.
public static org.apache.avro.Schema getClassSchema()
public String getTypeDefinition()
public CreateTypeRequest setTypeDefinition(String typeDefinition)
typeDefinition - a JSON string describing the columns of the type to be registered.
Returns this to mimic the builder pattern.
public String getLabel()
public CreateTypeRequest setLabel(String label)
label - A user-defined description string which can be used to differentiate between tables and types with otherwise identical schemas.
Returns this to mimic the builder pattern.
public Map<String,List<String>> getProperties()
DATA: Default property for all numeric and string type columns; makes the column available for GPU queries.
TEXT_SEARCH: Valid only for 'string' columns. Enables full text search for string columns. Can be set independently of data and store_only.
STORE_ONLY: Persists the column value but does not make it available to queries (e.g. GPUdb.filter(FilterRequest)); i.e. it is mutually exclusive with the data property. Any 'bytes' type column must have the store_only property. This property reduces system memory usage.
DISK_OPTIMIZED: Works in conjunction with the data property for string columns. This property reduces system disk usage by disabling reverse string lookups. Queries like GPUdb.filter(FilterRequest), GPUdb.filterByList(FilterByListRequest), and GPUdb.filterByValue(FilterByValueRequest) work as usual, but GPUdb.aggregateUniqueRaw(AggregateUniqueRequest) and GPUdb.aggregateGroupByRaw(AggregateGroupByRequest) are not allowed on columns with this property.
TIMESTAMP: Valid only for 'long' columns. Indicates that this field represents a timestamp and will be provided in milliseconds since the Unix epoch (00:00:00 Jan 1 1970). Dates represented by a timestamp must fall between the year 1000 and the year 2900.
ULONG: Valid only for 'string' columns. It represents an unsigned long integer data type. The string can only be interpreted as an unsigned long with a minimum value of zero and a maximum value of 18446744073709551615.
DECIMAL: Valid only for 'string' columns. It represents the SQL type NUMERIC(19, 4). There can be up to 15 digits before the decimal point and up to four digits in the fractional part. The value can be positive or negative (indicated by a minus sign at the beginning). This property is mutually exclusive with the text_search property.
DATE: Valid only for 'string' columns. Indicates that this field represents a date and will be provided in the format 'YYYY-MM-DD'. The allowable range is 1000-01-01 through 2900-01-01. This property is mutually exclusive with the text_search property.
TIME: Valid only for 'string' columns. Indicates that this field represents a time-of-day and will be provided in the format 'HH:MM:SS.mmm'. The allowable range is 00:00:00.000 through 23:59:59.999. This property is mutually exclusive with the text_search property.
DATETIME: Valid only for 'string' columns. Indicates that this field represents a datetime and will be provided in the format 'YYYY-MM-DD HH:MM:SS.mmm'. The allowable range is 1000-01-01 00:00:00.000 through 2900-01-01 23:59:59.999. This property is mutually exclusive with the text_search property.
CHAR1: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 1 character.
CHAR2: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 2 characters.
CHAR4: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 4 characters.
CHAR8: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 8 characters.
CHAR16: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 16 characters.
CHAR32: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 32 characters.
CHAR64: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 64 characters.
CHAR128: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 128 characters.
CHAR256: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 256 characters.
INT8: This property provides optimized memory and query performance for int columns. Ints with this property must be between -128 and +127 (inclusive).
INT16: This property provides optimized memory and query performance for int columns. Ints with this property must be between -32768 and +32767 (inclusive).
IPV4: This property provides optimized memory, disk and query performance for string columns representing IPv4 addresses (e.g. 192.168.1.1). Strings with this property must be of the form A.B.C.D, where A, B, C and D are in the range 0-255.
WKT: Valid only for 'string' and 'bytes' columns. Indicates that this field contains geospatial geometry objects in Well-Known Text (WKT) or Well-Known Binary (WKB) format.
PRIMARY_KEY: This property indicates that this column will be part of (or the entire) primary key.
SHARD_KEY: This property indicates that this column will be part of (or the entire) shard key.
NULLABLE: This property indicates that this column is nullable. However, setting this property is insufficient for making the column nullable. The user must declare the type of the column as a union between its regular type and 'null' in the Avro schema for the record type in typeDefinition. For example, if a column is of type integer and is nullable, then the entry for the column in the Avro schema must be: ['int', 'null']. The C++, C#, Java, and Python APIs have built-in convenience for bypassing setting the Avro schema by hand. For those languages, one can use this property as usual and not have to worry about the Avro schema for the record.
DICT: This property indicates that this column should be dictionary encoded. It can only be used in conjunction with restricted string (charN), int, long or date columns. Dictionary encoding is best for columns where the cardinality (the number of unique values) is expected to be low. This property can save a large amount of memory.
INIT_WITH_NOW: For 'date', 'time', 'datetime', or 'timestamp' column types, replace empty strings and invalid timestamps with 'NOW()' upon insert.
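Several of the properties above constrain the values a column may hold. As a rough client-side pre-check, the limits quoted in this list can be expressed with standard Java. The class PropertyValueChecks and its method names are hypothetical, the timestamp upper bound is assumed to match the DATETIME limit of 2900-01-01 23:59:59.999 in UTC, and the decimal check assumes at least one digit before the decimal point. The server remains the authority on what it accepts.

```java
import java.math.BigInteger;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-checks derived from the limits quoted in the property list.
final class PropertyValueChecks {
    private static final BigInteger ULONG_MAX = new BigInteger("18446744073709551615");

    // Optional leading minus, 1 to 15 integer digits, optional fraction of 1 to 4 digits.
    private static final Pattern DECIMAL = Pattern.compile("-?[0-9]{1,15}(\\.[0-9]{1,4})?");

    private static final Pattern IPV4 =
        Pattern.compile("([0-9]{1,3})\\.([0-9]{1,3})\\.([0-9]{1,3})\\.([0-9]{1,3})");

    // Assumed bounds: start of year 1000 through the last millisecond of 2900-01-01 (UTC).
    private static final long MIN_MILLIS =
        LocalDate.of(1000, 1, 1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    private static final long MAX_MILLIS =
        LocalDate.of(2900, 1, 2).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli() - 1;

    // ULONG: digits only, between zero and 18446744073709551615.
    static boolean isUlong(String s) {
        return s.matches("[0-9]+") && new BigInteger(s).compareTo(ULONG_MAX) <= 0;
    }

    // DECIMAL: NUMERIC(19, 4) as described in the list above.
    static boolean isDecimal(String s) {
        return DECIMAL.matcher(s).matches();
    }

    // IPV4: A.B.C.D with each part in the range 0-255.
    static boolean isIpv4(String s) {
        Matcher m = IPV4.matcher(s);
        if (!m.matches()) {
            return false;
        }
        for (int i = 1; i <= 4; i++) {
            if (Integer.parseInt(m.group(i)) > 255) {
                return false;
            }
        }
        return true;
    }

    // TIMESTAMP: milliseconds since the Unix epoch within the assumed bounds.
    static boolean isTimestampInRange(long millis) {
        return millis >= MIN_MILLIS && millis <= MAX_MILLIS;
    }

    public static void main(String[] args) {
        System.out.println(isUlong("18446744073709551615")); // prints true
        System.out.println(isDecimal("-12.3456"));            // prints true
        System.out.println(isIpv4("256.0.0.1"));              // prints false
        System.out.println(isTimestampInRange(0L));           // prints true
    }
}
```

Regular expressions are used for DECIMAL and IPV4 because both rules are about the shape of the string, while ULONG needs BigInteger because its maximum exceeds the range of a signed Java long.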
public CreateTypeRequest setProperties(Map<String,List<String>> properties)
properties - Each key-value pair specifies the properties to use for a given column where the key is the column name. All keys used must be relevant column names for the given table. Specifying any property overrides the default properties for that column (which are based on the column's data type). Valid values are:
DATA: Default property for all numeric and string type columns; makes the column available for GPU queries.
TEXT_SEARCH: Valid only for 'string' columns. Enables full text search for string columns. Can be set independently of data and store_only.
STORE_ONLY: Persists the column value but does not make it available to queries (e.g. GPUdb.filter(FilterRequest)); i.e. it is mutually exclusive with the data property. Any 'bytes' type column must have the store_only property. This property reduces system memory usage.
DISK_OPTIMIZED: Works in conjunction with the data property for string columns. This property reduces system disk usage by disabling reverse string lookups. Queries like GPUdb.filter(FilterRequest), GPUdb.filterByList(FilterByListRequest), and GPUdb.filterByValue(FilterByValueRequest) work as usual, but GPUdb.aggregateUniqueRaw(AggregateUniqueRequest) and GPUdb.aggregateGroupByRaw(AggregateGroupByRequest) are not allowed on columns with this property.
TIMESTAMP: Valid only for 'long' columns. Indicates that this field represents a timestamp and will be provided in milliseconds since the Unix epoch (00:00:00 Jan 1 1970). Dates represented by a timestamp must fall between the year 1000 and the year 2900.
ULONG: Valid only for 'string' columns. It represents an unsigned long integer data type. The string can only be interpreted as an unsigned long with a minimum value of zero and a maximum value of 18446744073709551615.
DECIMAL: Valid only for 'string' columns. It represents the SQL type NUMERIC(19, 4). There can be up to 15 digits before the decimal point and up to four digits in the fractional part. The value can be positive or negative (indicated by a minus sign at the beginning). This property is mutually exclusive with the text_search property.
DATE: Valid only for 'string' columns. Indicates that this field represents a date and will be provided in the format 'YYYY-MM-DD'. The allowable range is 1000-01-01 through 2900-01-01. This property is mutually exclusive with the text_search property.
TIME: Valid only for 'string' columns. Indicates that this field represents a time-of-day and will be provided in the format 'HH:MM:SS.mmm'. The allowable range is 00:00:00.000 through 23:59:59.999. This property is mutually exclusive with the text_search property.
DATETIME: Valid only for 'string' columns. Indicates that this field represents a datetime and will be provided in the format 'YYYY-MM-DD HH:MM:SS.mmm'. The allowable range is 1000-01-01 00:00:00.000 through 2900-01-01 23:59:59.999. This property is mutually exclusive with the text_search property.
CHAR1: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 1 character.
CHAR2: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 2 characters.
CHAR4: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 4 characters.
CHAR8: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 8 characters.
CHAR16: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 16 characters.
CHAR32: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 32 characters.
CHAR64: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 64 characters.
CHAR128: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 128 characters.
CHAR256: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 256 characters.
INT8: This property provides optimized memory and query performance for int columns. Ints with this property must be between -128 and +127 (inclusive).
INT16: This property provides optimized memory and query performance for int columns. Ints with this property must be between -32768 and +32767 (inclusive).
IPV4: This property provides optimized memory, disk and query performance for string columns representing IPv4 addresses (e.g. 192.168.1.1). Strings with this property must be of the form A.B.C.D, where A, B, C and D are in the range 0-255.
WKT: Valid only for 'string' and 'bytes' columns. Indicates that this field contains geospatial geometry objects in Well-Known Text (WKT) or Well-Known Binary (WKB) format.
PRIMARY_KEY: This property indicates that this column will be part of (or the entire) primary key.
SHARD_KEY: This property indicates that this column will be part of (or the entire) shard key.
NULLABLE: This property indicates that this column is nullable. However, setting this property is insufficient for making the column nullable. The user must declare the type of the column as a union between its regular type and 'null' in the Avro schema for the record type in typeDefinition. For example, if a column is of type integer and is nullable, then the entry for the column in the Avro schema must be: ['int', 'null']. The C++, C#, Java, and Python APIs have built-in convenience for bypassing setting the Avro schema by hand. For those languages, one can use this property as usual and not have to worry about the Avro schema for the record.
DICT: This property indicates that this column should be dictionary encoded. It can only be used in conjunction with restricted string (charN), int, long or date columns. Dictionary encoding is best for columns where the cardinality (the number of unique values) is expected to be low. This property can save a large amount of memory.
INIT_WITH_NOW: For 'date', 'time', 'datetime', or 'timestamp' column types, replace empty strings and invalid timestamps with 'NOW()' upon insert.
Returns this to mimic the builder pattern.
public Map<String,String> getOptions()
Returns the optional parameters. The default value is an empty Map.
public CreateTypeRequest setOptions(Map<String,String> options)
options - Optional parameters. The default value is an empty Map.
Returns this to mimic the builder pattern.
public org.apache.avro.Schema getSchema()
Specified by getSchema in interface org.apache.avro.generic.GenericContainer
public Object get(int index)
Specified by get in interface org.apache.avro.generic.IndexedRecord
index - the position of the field to get
Throws IndexOutOfBoundsException
public void put(int index, Object value)
Specified by put in interface org.apache.avro.generic.IndexedRecord
index - the position of the field to set
value - the value to set
Throws IndexOutOfBoundsException
Copyright © 2020. All rights reserved.