Class CreateTypeRequest

  • All Implemented Interfaces:
    org.apache.avro.generic.GenericContainer, org.apache.avro.generic.IndexedRecord

    public class CreateTypeRequest
    extends Object
    implements org.apache.avro.generic.IndexedRecord
    A set of parameters for GPUdb.createType.

    Creates a new type describing the columns of a table. The type definition is specified as a list of columns, each specified as a list of the column name, data type, and any column attributes.

    Example of a type definition with some parameters:

    
         [
             ["id", "int8", "primary_key"],
             ["dept_id", "int8", "primary_key", "shard_key"],
             ["manager_id", "int8", "nullable"],
             ["first_name", "char32"],
             ["last_name", "char64"],
             ["salary", "decimal"],
             ["hire_date", "date"]
         ]
     
    Each column definition consists of the column name (which should meet the standard column naming criteria), the column's specific type (int, long, float, double, string, bytes, or any of the possible values for properties), and any data handling, data key, or data replacement properties.
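
    The type definition above is plain JSON, so it can be assembled with ordinary string handling before being passed as typeDefinition. The following is a minimal sketch under that assumption; the class TypeDefinitionSketch and its helper methods are hypothetical and are not part of the GPUdb API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of the GPUdb API) that renders column
// definitions in the nested-array JSON form shown in the example above.
final class TypeDefinitionSketch {

    // Wraps a value in double quotes, escaping backslashes and quotes.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Renders one column as ["name", "type", "property", ...].
    static String column(String name, String type, String... properties) {
        List<String> parts = new ArrayList<>();
        parts.add(quote(name));
        parts.add(quote(type));
        for (String property : properties) {
            parts.add(quote(property));
        }
        return "[" + String.join(", ", parts) + "]";
    }

    // Joins already-rendered column entries into the outer JSON array.
    static String typeDefinition(List<String> columns) {
        return "[" + String.join(", ", columns) + "]";
    }

    public static void main(String[] args) {
        String definition = typeDefinition(Arrays.asList(
                column("id", "int8", "primary_key"),
                column("first_name", "char32"),
                column("hire_date", "date")));
        System.out.println(definition);
    }
}
```

    The resulting string is what would be supplied as the typeDefinition argument; a JSON library could be substituted for the manual quoting.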

    Note that some properties are mutually exclusive; that is, they cannot both be specified for the same column. One example of a mutually exclusive pair is PRIMARY_KEY and NULLABLE.
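
    A client could catch the PRIMARY_KEY/NULLABLE conflict before sending the request. The sketch below assumes the lowercase property strings used in the example type definition ("primary_key", "nullable"); the class PropertyConflictSketch is hypothetical, and the server may enforce further exclusions that are not modeled here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical client-side check (not part of the GPUdb API).
final class PropertyConflictSketch {

    // Returns true when one column lists both "primary_key" and "nullable".
    static boolean hasKeyNullableConflict(List<String> columnProperties) {
        Set<String> properties = new HashSet<>(columnProperties);
        return properties.contains("primary_key") && properties.contains("nullable");
    }

    public static void main(String[] args) {
        // Conflicting: a primary key column declared nullable.
        System.out.println(hasKeyNullableConflict(
                Arrays.asList("int8", "primary_key", "nullable"))); // prints true
        // Fine: nullable without primary_key.
        System.out.println(hasKeyNullableConflict(
                Arrays.asList("int8", "nullable")));                // prints false
    }
}
```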

    A single primary key and/or single shard key can be set across one or more columns. If a primary key is specified, then a uniqueness constraint is enforced, in that only a single object can exist with a given primary key column value (or set of values for the key columns, if using a composite primary key). When inserting data into a table with a primary key, depending on the parameters in the request, incoming objects with primary key values that match existing objects will either overwrite (i.e. update) the existing object or will be skipped and not added into the set.
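
    The overwrite-or-skip behavior for a composite primary key can be pictured with a small in-memory model. This is an illustration of the semantics described above, not the server's implementation; the class PrimaryKeySketch and the overwriteExisting flag are hypothetical stand-ins for whatever request parameter selects the behavior.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a table whose primary key is (id, dept_id).
final class PrimaryKeySketch {

    // Composite key (id, dept_id) mapped to the stored first_name value.
    private final Map<List<Integer>, String> rows = new LinkedHashMap<>();

    // Returns true if the record was stored, false if it was skipped
    // because a record with the same key already exists.
    boolean insert(int id, int deptId, String firstName, boolean overwriteExisting) {
        List<Integer> key = Arrays.asList(id, deptId);
        if (rows.containsKey(key) && !overwriteExisting) {
            return false; // uniqueness constraint: the incoming record is skipped
        }
        rows.put(key, firstName); // new record, or overwrite (update) of the old one
        return true;
    }

    String get(int id, int deptId) {
        return rows.get(Arrays.asList(id, deptId));
    }

    int size() {
        return rows.size();
    }

    public static void main(String[] args) {
        PrimaryKeySketch table = new PrimaryKeySketch();
        table.insert(1, 10, "Ann", false);
        table.insert(1, 10, "Bob", false); // skipped: key (1, 10) already present
        System.out.println(table.get(1, 10));
        table.insert(1, 10, "Bob", true);  // overwrites the existing record
        System.out.println(table.get(1, 10));
    }
}
```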

    • Constructor Detail

      • CreateTypeRequest

        public CreateTypeRequest()
        Constructs a CreateTypeRequest object with default parameters.
      • CreateTypeRequest

        public CreateTypeRequest(String typeDefinition,
                                 String label,
                                 Map<String, List<String>> properties,
                                 Map<String, String> options)
        Constructs a CreateTypeRequest object with the specified parameters.
        Parameters:
        typeDefinition - A JSON string describing the columns of the type to be registered, as described above.
        label - A user-defined description string which can be used to differentiate between tables and types with otherwise identical schemas.
        properties - [DEPRECATED: specify these property values directly in the typeDefinition, as described at the top, instead.] Each key-value pair specifies the properties to use for a given column, where the key is the column name. All keys used must be relevant column names for the given table. Specifying any property overrides the default properties for that column (which are based on the column's data type). Valid values are:
        • DATA: Default property for all numeric and string type columns; makes the column available for GPU queries.
        • TEXT_SEARCH: Valid only for select 'string' columns. Enables full text search--see Full Text Search for details and applicable string column types.
        • TIMESTAMP: Valid only for 'long' columns. Indicates that this field represents a timestamp and will be provided in milliseconds since the Unix epoch: 00:00:00 Jan 1 1970. Dates represented by a timestamp must fall between the year 1000 and the year 2900.
        • ULONG: Valid only for 'string' columns. It represents an unsigned long integer data type. The string can only be interpreted as an unsigned long data type with minimum value of zero, and maximum value of 18446744073709551615.
        • UUID: Valid only for 'string' columns. It represents a UUID data type. Internally, it is stored as a 128-bit integer.
        • DECIMAL: Valid only for 'string' columns. It represents a SQL type NUMERIC(19, 4) data type. There can be up to 15 digits before the decimal point and up to four digits in the fractional part. The value can be positive or negative (indicated by a minus sign at the beginning). This property is mutually exclusive with the TEXT_SEARCH property.
        • DATE: Valid only for 'string' columns. Indicates that this field represents a date and will be provided in the format 'YYYY-MM-DD'. The allowable range is 1000-01-01 through 2900-01-01. This property is mutually exclusive with the TEXT_SEARCH property.
        • TIME: Valid only for 'string' columns. Indicates that this field represents a time-of-day and will be provided in the format 'HH:MM:SS.mmm'. The allowable range is 00:00:00.000 through 23:59:59.999. This property is mutually exclusive with the TEXT_SEARCH property.
        • DATETIME: Valid only for 'string' columns. Indicates that this field represents a datetime and will be provided in the format 'YYYY-MM-DD HH:MM:SS.mmm'. The allowable range is 1000-01-01 00:00:00.000 through 2900-01-01 23:59:59.999. This property is mutually exclusive with the TEXT_SEARCH property.
        • CHAR1: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 1 character.
        • CHAR2: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 2 characters.
        • CHAR4: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 4 characters.
        • CHAR8: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 8 characters.
        • CHAR16: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 16 characters.
        • CHAR32: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 32 characters.
        • CHAR64: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 64 characters.
        • CHAR128: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 128 characters.
        • CHAR256: This property provides optimized memory, disk and query performance for string columns. Strings with this property must be no longer than 256 characters.
        • BOOLEAN: This property provides optimized memory and query performance for int columns. Ints with this property must be between 0 and 1 (inclusive).
        • INT8: This property provides optimized memory and query performance for int columns. Ints with this property must be between -128 and +127 (inclusive).
        • INT16: This property provides optimized memory and query performance for int columns. Ints with this property must be between -32768 and +32767 (inclusive).
        • IPV4: This property provides optimized memory, disk and query performance for string columns representing IPv4 addresses (e.g., 192.168.1.1). Strings with this property must be of the form A.B.C.D, where A, B, C and D are each in the range 0-255.
        • ARRAY: Valid only for 'string' columns. Indicates that this field contains an array. The value type and (optionally) the item count should be specified in parentheses; e.g., 'array(int, 10)' for a 10-integer array. Both 'array(int)' and 'array(int, -1)' will designate an unlimited-length integer array, though no bounds checking is performed on arrays of any length.
        • JSON: Valid only for 'string' columns. Indicates that this field contains values in JSON format.
        • VECTOR: Valid only for 'bytes' columns. Indicates that this field contains a vector of floats. The length should be specified in parentheses, e.g., 'vector(1000)'.
        • WKT: Valid only for 'string' and 'bytes' columns. Indicates that this field contains geospatial geometry objects in Well-Known Text (WKT) or Well-Known Binary (WKB) format.
        • PRIMARY_KEY: This property indicates that this column will be part of (or the entire) primary key.
        • SOFT_PRIMARY_KEY: This property indicates that this column will be part of (or the entire) soft primary key.
        • SHARD_KEY: This property indicates that this column will be part of (or the entire) shard key.
        • NULLABLE: This property indicates that this column is nullable. However, setting this property is insufficient for making the column nullable. The user must declare the type of the column as a union between its regular type and 'null' in the Avro schema for the record type in typeDefinition. For example, if a column is of type integer and is nullable, then the entry for the column in the Avro schema must be: ['int', 'null']. The C++, C#, Java, and Python APIs have built-in convenience for bypassing setting the Avro schema by hand. For those languages, one can use this property as usual and not have to worry about the Avro schema for the record.
        • COMPRESS: This property indicates that this column should be compressed with the given codec and optional level; e.g., 'compress(snappy)' for Snappy compression and 'compress(zstd(7))' for zstd level 7 compression. This property is primarily used in order to save disk space.
        • DICT: This property indicates that this column should be dictionary encoded. It can only be used in conjunction with restricted string (charN), int, long or date columns. Dictionary encoding is best for columns where the cardinality (the number of unique values) is expected to be low. This property can save a large amount of memory.
        • INIT_WITH_NOW: For 'date', 'time', 'datetime', or 'timestamp' column types, replace empty strings and invalid timestamps with 'NOW()' upon insert.
        • INIT_WITH_UUID: For 'uuid' type, replace empty strings and invalid UUID values with randomly-generated UUIDs upon insert.
        • UPDATE_WITH_NOW: For 'date', 'time', 'datetime', or 'timestamp' column types, always update the field with 'NOW()' upon any update.
        The default value is an empty Map.
        options - Optional parameters. The default value is an empty Map.
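
        Several of the properties above constrain the string values a column may hold (ULONG, DECIMAL, DATE, IPV4). The sketch below shows how a client might pre-check values against those documented limits. The class PropertyValueChecks is hypothetical, and edge cases the text does not settle (for example leading zeros, or a DECIMAL with no integer digits) are assumptions of this sketch rather than documented server behavior.

```java
import java.math.BigInteger;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical client-side checks (not part of the GPUdb API).
final class PropertyValueChecks {

    private static final BigInteger ULONG_MAX = new BigInteger("18446744073709551615");
    private static final LocalDate DATE_MIN = LocalDate.of(1000, 1, 1);
    private static final LocalDate DATE_MAX = LocalDate.of(2900, 1, 1);

    // ULONG: decimal digits only, value between 0 and 18446744073709551615.
    static boolean isUlong(String s) {
        return s.matches("[0-9]{1,20}") && new BigInteger(s).compareTo(ULONG_MAX) <= 0;
    }

    // DECIMAL: optional leading minus, up to 15 integer digits,
    // and up to four fractional digits.
    static boolean isDecimal(String s) {
        return s.matches("-?[0-9]{1,15}(\\.[0-9]{1,4})?");
    }

    // DATE: 'YYYY-MM-DD' within 1000-01-01 through 2900-01-01.
    static boolean isDate(String s) {
        try {
            LocalDate date = LocalDate.parse(s); // ISO format, rejects invalid days
            return !date.isBefore(DATE_MIN) && !date.isAfter(DATE_MAX);
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // IPV4: A.B.C.D with each part in the range 0-255.
    static boolean isIpv4(String s) {
        String[] parts = s.split("\\.", -1);
        if (parts.length != 4) {
            return false;
        }
        for (String part : parts) {
            if (!part.matches("[0-9]{1,3}") || Integer.parseInt(part) > 255) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isUlong("18446744073709551615")); // prints true
        System.out.println(isDecimal("1.23456"));            // prints false
        System.out.println(isDate("2900-01-02"));            // prints false
        System.out.println(isIpv4("192.168.1.1"));           // prints true
    }
}
```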
    • Method Detail

      • getClassSchema

        public static org.apache.avro.Schema getClassSchema()
        This method supports the Avro framework and is not intended to be called directly by the user.
        Returns:
        The schema for the class.
      • getTypeDefinition

        public String getTypeDefinition()
        A JSON string describing the columns of the type to be registered, as described above.
        Returns:
        The current value of typeDefinition.
      • setTypeDefinition

        public CreateTypeRequest setTypeDefinition(String typeDefinition)
        A JSON string describing the columns of the type to be registered, as described above.
        Parameters:
        typeDefinition - The new value for typeDefinition.
        Returns:
        this to mimic the builder pattern.
      • getLabel

        public String getLabel()
        A user-defined description string which can be used to differentiate between tables and types with otherwise identical schemas.
        Returns:
        The current value of label.
      • setLabel

        public CreateTypeRequest setLabel(String label)
        A user-defined description string which can be used to differentiate between tables and types with otherwise identical schemas.
        Parameters:
        label - The new value for label.
        Returns:
        this to mimic the builder pattern.
      • getProperties

        public Map<String, List<String>> getProperties()
        [DEPRECATED: specify these property values directly in the typeDefinition, as described at the top, instead.] Each key-value pair specifies the properties to use for a given column, where the key is the column name. All keys used must be relevant column names for the given table. Specifying any property overrides the default properties for that column (which are based on the column's data type). The valid values are those listed for the properties parameter of the constructor above. The default value is an empty Map.
        Returns:
        The current value of properties.
      • setProperties

        public CreateTypeRequest setProperties(Map<String, List<String>> properties)
        [DEPRECATED: specify these property values directly in the typeDefinition, as described at the top, instead.] Each key-value pair specifies the properties to use for a given column, where the key is the column name. All keys used must be relevant column names for the given table. Specifying any property overrides the default properties for that column (which are based on the column's data type). The valid values are those listed for the properties parameter of the constructor above. The default value is an empty Map.
        Parameters:
        properties - The new value for properties.
        Returns:
        this to mimic the builder pattern.
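
        Because each setter returns this, calls can be chained. The sketch below illustrates that pattern with a stand-in class so that it runs without the GPUdb library on the classpath; FluentRequestSketch is hypothetical, and only the setter and getter names mirror the ones documented here. The empty-string initial values are an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in that mirrors the documented setter shape;
// it is not the real CreateTypeRequest class.
final class FluentRequestSketch {

    private String typeDefinition = "";
    private String label = "";
    private Map<String, List<String>> properties = new LinkedHashMap<>();

    // Each setter stores the value and returns this to allow chaining.
    FluentRequestSketch setTypeDefinition(String typeDefinition) {
        this.typeDefinition = typeDefinition;
        return this;
    }

    FluentRequestSketch setLabel(String label) {
        this.label = label;
        return this;
    }

    FluentRequestSketch setProperties(Map<String, List<String>> properties) {
        this.properties = properties;
        return this;
    }

    String getTypeDefinition() {
        return typeDefinition;
    }

    String getLabel() {
        return label;
    }

    Map<String, List<String>> getProperties() {
        return properties;
    }

    public static void main(String[] args) {
        // Chained configuration in a single expression.
        FluentRequestSketch request = new FluentRequestSketch()
                .setTypeDefinition("[[\"id\", \"int8\", \"primary_key\"]]")
                .setLabel("employee");
        System.out.println(request.getLabel());
    }
}
```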
      • getSchema

        public org.apache.avro.Schema getSchema()
        This method supports the Avro framework and is not intended to be called directly by the user.
        Specified by:
        getSchema in interface org.apache.avro.generic.GenericContainer
        Returns:
        The schema object describing this class.
      • get

        public Object get(int index)
        This method supports the Avro framework and is not intended to be called directly by the user.
        Specified by:
        get in interface org.apache.avro.generic.IndexedRecord
        Parameters:
        index - the position of the field to get
        Returns:
        value of the field with the given index.
        Throws:
        IndexOutOfBoundsException
      • put

        public void put(int index,
                        Object value)
        This method supports the Avro framework and is not intended to be called directly by the user.
        Specified by:
        put in interface org.apache.avro.generic.IndexedRecord
        Parameters:
        index - the position of the field to set
        value - the value to set
        Throws:
        IndexOutOfBoundsException
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object