Package com.gpudb.protocol
Class InsertRecordsFromPayloadRequest
- java.lang.Object
  - com.gpudb.protocol.InsertRecordsFromPayloadRequest
- All Implemented Interfaces:
org.apache.avro.generic.GenericContainer, org.apache.avro.generic.IndexedRecord
- Direct Known Subclasses:
GPUdbBase.InsertRecordsJsonRequest
public class InsertRecordsFromPayloadRequest extends Object implements org.apache.avro.generic.IndexedRecord
A set of parameters for GPUdb.insertRecordsFromPayload. Reads from the given text-based or binary payload and inserts the data into a new or existing table. The table will be created if it doesn't already exist.
Returns once all records are processed.
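A minimal usage sketch follows, assuming a reachable GPUdb endpoint; the connection URL, schema/table name, and CSV payload are illustrative, and the response type is assumed to be the standard InsertRecordsFromPayloadResponse counterpart of this request.

import com.gpudb.GPUdb;
import com.gpudb.GPUdbException;
import com.gpudb.protocol.InsertRecordsFromPayloadRequest;
import com.gpudb.protocol.InsertRecordsFromPayloadResponse;

import java.util.HashMap;

public class PayloadInsertExample {
    public static void main(String[] args) throws GPUdbException {
        // Connect to the database (URL is illustrative)
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        // A small delimited-text payload with a header row
        String csv = "id,name\n1,alpha\n2,beta\n";

        // Build the request with the chained setters; unset fields keep their defaults
        InsertRecordsFromPayloadRequest request = new InsertRecordsFromPayloadRequest()
                .setTableName("ki_home.payload_example")      // [schema_name.]table_name (illustrative)
                .setDataText(csv)
                .setOptions(new HashMap<String, String>());   // empty map = default options

        // Submit the payload; the call returns once all records are processed
        InsertRecordsFromPayloadResponse response = gpudb.insertRecordsFromPayload(request);
        System.out.println("Insert finished: " + response);
    }
}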
-
-
Nested Class Summary
Nested Classes
- static class InsertRecordsFromPayloadRequest.CreateTableOptions: A set of string constants for the InsertRecordsFromPayloadRequest parameter createTableOptions.
- static class InsertRecordsFromPayloadRequest.Options: A set of string constants for the InsertRecordsFromPayloadRequest parameter options.
-
Constructor Summary
Constructors
- InsertRecordsFromPayloadRequest(): Constructs an InsertRecordsFromPayloadRequest object with default parameters.
- InsertRecordsFromPayloadRequest(String tableName, String dataText, ByteBuffer dataBytes, Map<String,Map<String,String>> modifyColumns, Map<String,String> createTableOptions, Map<String,String> options): Constructs an InsertRecordsFromPayloadRequest object with the specified parameters.
-
Method Summary
- boolean equals(Object obj)
- Object get(int index): This method supports the Avro framework and is not intended to be called directly by the user.
- static org.apache.avro.Schema getClassSchema(): This method supports the Avro framework and is not intended to be called directly by the user.
- Map<String,String> getCreateTableOptions(): Options used when creating the target table.
- ByteBuffer getDataBytes(): Records formatted as binary data.
- String getDataText(): Records formatted as delimited text.
- Map<String,Map<String,String>> getModifyColumns(): Not implemented yet.
- Map<String,String> getOptions(): Optional parameters.
- org.apache.avro.Schema getSchema(): This method supports the Avro framework and is not intended to be called directly by the user.
- String getTableName(): Name of the table into which the data will be inserted, in [schema_name.]table_name format, using standard name resolution rules.
- int hashCode()
- void put(int index, Object value): This method supports the Avro framework and is not intended to be called directly by the user.
- InsertRecordsFromPayloadRequest setCreateTableOptions(Map<String,String> createTableOptions): Options used when creating the target table.
- InsertRecordsFromPayloadRequest setDataBytes(ByteBuffer dataBytes): Records formatted as binary data.
- InsertRecordsFromPayloadRequest setDataText(String dataText): Records formatted as delimited text.
- InsertRecordsFromPayloadRequest setModifyColumns(Map<String,Map<String,String>> modifyColumns): Not implemented yet.
- InsertRecordsFromPayloadRequest setOptions(Map<String,String> options): Optional parameters.
- InsertRecordsFromPayloadRequest setTableName(String tableName): Name of the table into which the data will be inserted, in [schema_name.]table_name format, using standard name resolution rules.
- String toString()
-
-
-
Constructor Detail
-
InsertRecordsFromPayloadRequest
public InsertRecordsFromPayloadRequest()
Constructs an InsertRecordsFromPayloadRequest object with default parameters.
-
InsertRecordsFromPayloadRequest
public InsertRecordsFromPayloadRequest(String tableName, String dataText, ByteBuffer dataBytes, Map<String,Map<String,String>> modifyColumns, Map<String,String> createTableOptions, Map<String,String> options)
Constructs an InsertRecordsFromPayloadRequest object with the specified parameters. A construction sketch follows the parameter list below.
Parameters:
- tableName - Name of the table into which the data will be inserted, in [schema_name.]table_name format, using standard name resolution rules. If the table does not exist, the table will be created using either an existing TYPE_ID or the type inferred from the payload, and the new table name will have to meet standard table naming criteria.
- dataText - Records formatted as delimited text.
- dataBytes - Records formatted as binary data.
- modifyColumns - Not implemented yet. The default value is an empty Map.
- createTableOptions - Options used when creating the target table. Includes type to use. The other options match those in GPUdb.createTable.
  - TYPE_ID: ID of a currently registered type. The default value is ''.
  - NO_ERROR_IF_EXISTS: If TRUE, prevents an error from occurring if the table already exists and is of the given type. If a table with the same ID but a different type exists, it is still an error. Supported values: TRUE, FALSE. The default value is FALSE.
  - IS_REPLICATED: Affects the distribution scheme for the table's data. If TRUE and the given type has no explicit shard key defined, the table will be replicated. If FALSE, the table will be sharded according to the shard key specified in the given TYPE_ID, or randomly sharded, if no shard key is specified. Note that a type containing a shard key cannot be used to create a replicated table. Supported values: TRUE, FALSE. The default value is FALSE.
  - FOREIGN_KEYS: Semicolon-separated list of foreign keys, of the format '(source_column_name [, ...]) references target_table_name(primary_key_column_name [, ...]) [as foreign_key_name]'.
  - FOREIGN_SHARD_KEY: Foreign shard key of the format 'source_column references shard_by_column from target_table(primary_key_column)'.
  - PARTITION_TYPE: Partitioning scheme to use. Supported values:
    - RANGE: Use range partitioning.
    - INTERVAL: Use interval partitioning.
    - LIST: Use list partitioning.
    - HASH: Use hash partitioning.
    - SERIES: Use series partitioning.
  - PARTITION_KEYS: Comma-separated list of partition keys, which are the columns or column expressions by which records will be assigned to partitions defined by PARTITION_DEFINITIONS.
  - PARTITION_DEFINITIONS: Comma-separated list of partition definitions, whose format depends on the choice of PARTITION_TYPE. See range partitioning, interval partitioning, list partitioning, hash partitioning, or series partitioning for example formats.
  - IS_AUTOMATIC_PARTITION: If TRUE, a new partition will be created for values which don't fall into an existing partition. Currently only supported for list partitions. Supported values: TRUE, FALSE. The default value is FALSE.
  - TTL: Sets the TTL of the table specified in tableName.
  - CHUNK_SIZE: Indicates the number of records per chunk to be used for this table.
  - CHUNK_COLUMN_MAX_MEMORY: Indicates the target maximum data size for each column in a chunk to be used for this table.
  - CHUNK_MAX_MEMORY: Indicates the target maximum data size for all columns in a chunk to be used for this table.
  - IS_RESULT_TABLE: Indicates whether the table is a memory-only table. A result table cannot contain columns with text_search data-handling, and it will not be retained if the server is restarted. Supported values: TRUE, FALSE. The default value is FALSE.
  - STRATEGY_DEFINITION: The tier strategy for the table and its columns.
  - COMPRESSION_CODEC: The default compression codec for this table's columns.
  The default value is an empty Map.
- options - Optional parameters.
  - BAD_RECORD_TABLE_NAME: Optional name of a table to which records that were rejected are written. The bad-record-table has the following columns: line_number (long), line_rejected (string), error_message (string).
  - BAD_RECORD_TABLE_LIMIT: A positive integer indicating the maximum number of records that can be written to the bad-record-table. The default value is 10000.
  - BAD_RECORD_TABLE_LIMIT_PER_INPUT: For subscriptions: a positive integer indicating the maximum number of records that can be written to the bad-record-table per file/payload. The default value will be 'bad_record_table_limit', and the total size of the table per rank is limited to 'bad_record_table_limit'.
  - BATCH_SIZE: Internal tuning parameter: the number of records per batch when inserting data.
  - COLUMN_FORMATS: For each target column specified, applies the column-property-bound format to the source data loaded into that column. Each column format will contain a mapping of one or more of its column properties to an appropriate format for each property. Currently supported column properties include date, time, and datetime. The parameter value must be formatted as a JSON string of maps of column names to maps of column properties to their corresponding column formats, e.g., '{ "order_date" : { "date" : "%Y.%m.%d" }, "order_time" : { "time" : "%H:%M:%S" } }'. See DEFAULT_COLUMN_FORMATS for valid format syntax.
  - COLUMNS_TO_LOAD: Specifies a comma-delimited list of columns from the source data to load. If more than one file is being loaded, this list applies to all files. Column numbers can be specified discretely or as a range. For example, a value of '5,7,1..3' will insert values from the fifth column in the source data into the first column in the target table, from the seventh column in the source data into the second column in the target table, and from the first through third columns in the source data into the third through fifth columns in the target table. If the source data contains a header, column names matching the file header names may be provided instead of column numbers. If the target table doesn't exist, the table will be created with the columns in this order. If the target table does exist with columns in a different order than the source data, this list can be used to match the order of the target table. For example, a value of 'C, B, A' will create a three-column table with column C, followed by column B, followed by column A; or will insert those fields in that order into a table created with columns in that order. If the target table exists, the column names must match the source data field names for a name-mapping to be successful. Mutually exclusive with COLUMNS_TO_SKIP.
  - COLUMNS_TO_SKIP: Specifies a comma-delimited list of columns from the source data to skip. Mutually exclusive with COLUMNS_TO_LOAD.
  - COMPRESSION_TYPE: Optional: payload compression type. Supported values:
    - NONE: Uncompressed.
    - AUTO: Default. Auto detect compression type.
    - GZIP: gzip file compression.
    - BZIP2: bzip2 file compression.
    The default value is AUTO.
  - DEFAULT_COLUMN_FORMATS: Specifies the default format to be applied to source data loaded into columns with the corresponding column property. Currently supported column properties include date, time, and datetime. This default column-property-bound format can be overridden by specifying a column property and format for a given target column in COLUMN_FORMATS. For each specified annotation, the format will apply to all columns with that annotation unless a custom COLUMN_FORMATS for that annotation is specified. The parameter value must be formatted as a JSON string that is a map of column properties to their respective column formats, e.g., '{ "date" : "%Y.%m.%d", "time" : "%H:%M:%S" }'. Column formats are specified as a string of control characters and plain text. The supported control characters are 'Y', 'm', 'd', 'H', 'M', 'S', and 's', which follow the Linux 'strptime()' specification, as well as 's', which specifies seconds and fractional seconds (though the fractional component will be truncated past milliseconds). Formats for the 'date' annotation must include the 'Y', 'm', and 'd' control characters. Formats for the 'time' annotation must include the 'H', 'M', and either 'S' or 's' (but not both) control characters. Formats for the 'datetime' annotation meet both the 'date' and 'time' control character requirements. For example, '{"datetime" : "%m/%d/%Y %H:%M:%S" }' would be used to interpret text as "05/04/2000 12:12:11".
  - ERROR_HANDLING: Specifies how errors should be handled upon insertion. Supported values:
    - PERMISSIVE: Records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped.
    - IGNORE_BAD_RECORDS: Malformed records are skipped.
    - ABORT: Stops current insertion and aborts entire operation when an error is encountered. Primary key collisions are considered abortable errors in this mode.
    The default value is ABORT.
  - FILE_TYPE: Specifies the type of the file(s) whose records will be inserted. Supported values:
    - AVRO: Avro file format.
    - DELIMITED_TEXT: Delimited text file format; e.g., CSV, TSV, PSV, etc.
    - GDB: Esri/GDB file format.
    - JSON: JSON file format.
    - PARQUET: Apache Parquet file format.
    - SHAPEFILE: ShapeFile file format.
    The default value is DELIMITED_TEXT.
  - FLATTEN_COLUMNS: Specifies how to handle nested columns. Supported values:
    - TRUE: Break up nested columns into multiple columns.
    - FALSE: Treat nested columns as JSON columns instead of flattening.
    The default value is FALSE.
  - GDAL_CONFIGURATION_OPTIONS: Comma separated list of gdal conf options, for the specific requests: key=value. The default value is ''.
  - IGNORE_EXISTING_PK: Specifies the record collision error-suppression policy for inserting into a table with a primary key, only used when not in upsert mode (upsert mode is disabled when UPDATE_ON_EXISTING_PK is FALSE). If set to TRUE, any record being inserted that is rejected for having primary key values that match those of an existing table record will be ignored with no error generated. If FALSE, the rejection of any record for having primary key values matching an existing record will result in an error being reported, as determined by ERROR_HANDLING. If the specified table does not have a primary key or if upsert mode is in effect (UPDATE_ON_EXISTING_PK is TRUE), then this option has no effect. Supported values:
    - TRUE: Ignore new records whose primary key values collide with those of existing records.
    - FALSE: Treat as errors any new records whose primary key values collide with those of existing records.
    The default value is FALSE.
  - INGESTION_MODE: Whether to do a full load, dry run, or perform a type inference on the source data. Supported values:
    - FULL: Run a type inference on the source data (if needed) and ingest.
    - DRY_RUN: Does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of ERROR_HANDLING.
    - TYPE_INFERENCE_ONLY: Infer the type of the source data and return, without ingesting any data. The inferred type is returned in the response.
    The default value is FULL.
  - LAYER: Optional: geo files layer(s) name(s): comma separated. The default value is ''.
  - LOADING_MODE: Scheme for distributing the extraction and loading of data from the source data file(s). This option applies only when loading files that are local to the database. Supported values:
    - HEAD: The head node loads all data. All files must be available to the head node.
    - DISTRIBUTED_SHARED: The head node coordinates loading data by worker processes across all nodes from shared files available to all workers. NOTE: Instead of existing on a shared source, the files can be duplicated on a source local to each host to improve performance, though the files must appear as the same data set from the perspective of all hosts performing the load.
    - DISTRIBUTED_LOCAL: A single worker process on each node loads all files that are available to it. This option works best when each worker loads files from its own file system, to maximize performance. In order to avoid data duplication, either each worker performing the load needs to have visibility to a set of files unique to it (no file is visible to more than one node) or the target table needs to have a primary key (which will allow the worker to automatically deduplicate data). NOTE: If the target table doesn't exist, the table structure will be determined by the head node. If the head node has no files local to it, it will be unable to determine the structure and the request will fail. If the head node is configured to have no worker processes, no data strictly accessible to the head node will be loaded.
    The default value is HEAD.
  - LOCAL_TIME_OFFSET: For Avro local timestamp columns.
  - MAX_RECORDS_TO_LOAD: Limit the number of records to load in this request: if this number is larger than batch_size, then the number of records loaded will be limited to the next whole number of batch_size (per working thread). The default value is ''.
  - NUM_TASKS_PER_RANK: Optional: number of tasks for reading file per rank. Default will be external_file_reader_num_tasks.
  - POLL_INTERVAL: If TRUE, the number of seconds between attempts to load external files into the table. If zero, polling will be continuous as long as data is found. If no data is found, the interval will steadily increase to a maximum of 60 seconds.
  - PRIMARY_KEYS: Optional: comma separated list of column names to set as primary keys, when not specified in the type. The default value is ''.
  - SCHEMA_REGISTRY_CONNECTION_RETRIES: Confluent Schema registry connection retries.
  - SCHEMA_REGISTRY_CONNECTION_TIMEOUT: Confluent Schema registry connection timeout (in Secs).
  - SCHEMA_REGISTRY_MAX_CONSECUTIVE_CONNECTION_FAILURES: Max records to skip due to Schema Registry connection failures, before failing.
  - MAX_CONSECUTIVE_INVALID_SCHEMA_FAILURE: Max records to skip due to schema-related errors, before failing.
  - SCHEMA_REGISTRY_SCHEMA_NAME: Name of the Avro schema in the schema registry to use when reading Avro records.
  - SHARD_KEYS: Optional: comma separated list of column names to set as shard keys, when not specified in the type. The default value is ''.
  - SKIP_LINES: Skip a number of lines from the beginning of the file.
  - SUBSCRIBE: Continuously poll the data source to check for new data and load it into the table. Supported values: TRUE, FALSE. The default value is FALSE.
  - TABLE_INSERT_MODE: Optional: table_insert_mode. When inserting records from multiple files: if table_per_file, then insert from each file into a new table. Currently supported only for shapefiles. Supported values: SINGLE, TABLE_PER_FILE. The default value is SINGLE.
  - TEXT_COMMENT_STRING: Specifies the character string that should be interpreted as a comment line prefix in the source data. All lines in the data starting with the provided string are ignored. For DELIMITED_TEXT FILE_TYPE only. The default value is '#'.
  - TEXT_DELIMITER: Specifies the character delimiting field values in the source data and field names in the header (if present). For DELIMITED_TEXT FILE_TYPE only. The default value is ','.
  - TEXT_ESCAPE_CHARACTER: Specifies the character that is used to escape other characters in the source data. An 'a', 'b', 'f', 'n', 'r', 't', or 'v' preceded by an escape character will be interpreted as the ASCII bell, backspace, form feed, line feed, carriage return, horizontal tab, and vertical tab, respectively. For example, the escape character followed by an 'n' will be interpreted as a newline within a field value. The escape character can also be used to escape the quoting character, and will be treated as an escape character whether it is within a quoted field value or not. For DELIMITED_TEXT FILE_TYPE only.
  - TEXT_HAS_HEADER: Indicates whether the source data contains a header row. For DELIMITED_TEXT FILE_TYPE only. Supported values: TRUE, FALSE. The default value is TRUE.
  - TEXT_HEADER_PROPERTY_DELIMITER: Specifies the delimiter for column properties in the header row (if present). Cannot be set to the same value as TEXT_DELIMITER. For DELIMITED_TEXT FILE_TYPE only. The default value is '|'.
  - TEXT_NULL_STRING: Specifies the character string that should be interpreted as a null value in the source data. For DELIMITED_TEXT FILE_TYPE only. The default value is '\N'.
  - TEXT_QUOTE_CHARACTER: Specifies the character that should be interpreted as a field value quoting character in the source data. The character must appear at the beginning and end of a field value to take effect. Delimiters within quoted fields are treated as literals and not delimiters. Within a quoted field, two consecutive quote characters will be interpreted as a single literal quote character, effectively escaping it. To not have a quote character, specify an empty string. For DELIMITED_TEXT FILE_TYPE only. The default value is '"'.
  - TEXT_SEARCH_COLUMNS: Add 'text_search' property to internally inferred string columns. Comma separated list of column names or '*' for all columns. To add the text_search property only to string columns of a minimum size, also set the option 'text_search_min_column_length'.
  - TEXT_SEARCH_MIN_COLUMN_LENGTH: Set the minimum column size. Used only when 'text_search_columns' has a value.
  - TRIM_SPACE: If set to TRUE, remove leading or trailing space from fields. Supported values: TRUE, FALSE. The default value is FALSE.
  - TRUNCATE_STRINGS: If set to TRUE, truncate string values that are longer than the column's type size. Supported values: TRUE, FALSE. The default value is FALSE.
  - TRUNCATE_TABLE: If set to TRUE, truncates the table specified by tableName prior to loading the file(s). Supported values: TRUE, FALSE. The default value is FALSE.
  - TYPE_INFERENCE_MAX_RECORDS_READ: The default value is ''.
  - TYPE_INFERENCE_MODE: Optimize type inference for: Supported values:
    - ACCURACY: Scans data to get exactly-typed and sized columns for all data scanned.
    - SPEED: Scans data and picks the widest possible column types so that 'all' values will fit with minimum data scanned.
    The default value is ACCURACY.
  - UPDATE_ON_EXISTING_PK: Specifies the record collision policy for inserting into a table with a primary key. If set to TRUE, any existing table record with primary key values that match those of a record being inserted will be replaced by that new record (the new data will be "upserted"). If set to FALSE, any existing table record with primary key values that match those of a record being inserted will remain unchanged, while the new record will be rejected and the error handled as determined by IGNORE_EXISTING_PK and ERROR_HANDLING. If the specified table does not have a primary key, then this option has no effect. Supported values:
    - TRUE: Upsert new records when primary keys match existing records.
    - FALSE: Reject new records when primary keys match existing records.
    The default value is FALSE.
  The default value is an empty Map.
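A sketch of the all-arguments constructor follows. The values are illustrative; the CreateTableOptions and Options constant fields are assumed to match the option names documented above, and passing an empty ByteBuffer for an unused binary payload is also an assumption.

import com.gpudb.protocol.InsertRecordsFromPayloadRequest;

import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

public class ConstructorSketch {
    public static InsertRecordsFromPayloadRequest build() {
        String csv = "x,y\n1.0,2.0\n3.0,4.0\n";

        Map<String, String> createTableOptions = new HashMap<>();
        createTableOptions.put(
                InsertRecordsFromPayloadRequest.CreateTableOptions.NO_ERROR_IF_EXISTS,
                InsertRecordsFromPayloadRequest.CreateTableOptions.TRUE);

        Map<String, String> options = new HashMap<>();
        options.put(InsertRecordsFromPayloadRequest.Options.ERROR_HANDLING,
                    InsertRecordsFromPayloadRequest.Options.ABORT);

        // Text payload, so the binary payload is left empty (assumption: an empty
        // buffer is acceptable when dataText carries the records)
        return new InsertRecordsFromPayloadRequest(
                "ki_home.points",                             // tableName (illustrative)
                csv,                                          // dataText
                ByteBuffer.wrap(new byte[0]),                 // dataBytes (unused here)
                new HashMap<String, Map<String, String>>(),   // modifyColumns (not implemented yet)
                createTableOptions,
                options);
    }
}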
-
-
Method Detail
-
getClassSchema
public static org.apache.avro.Schema getClassSchema()
This method supports the Avro framework and is not intended to be called directly by the user.
Returns:
- The schema for the class.
-
getTableName
public String getTableName()
Name of the table into which the data will be inserted, in [schema_name.]table_name format, using standard name resolution rules. If the table does not exist, the table will be created using either an existing TYPE_ID or the type inferred from the payload, and the new table name will have to meet standard table naming criteria.
Returns:
- The current value of tableName.
-
setTableName
public InsertRecordsFromPayloadRequest setTableName(String tableName)
Name of the table into which the data will be inserted, in [schema_name.]table_name format, using standard name resolution rules. If the table does not exist, the table will be created using either an existing TYPE_ID or the type inferred from the payload, and the new table name will have to meet standard table naming criteria.
Parameters:
- tableName - The new value for tableName.
Returns:
- this to mimic the builder pattern.
-
getDataText
public String getDataText()
Records formatted as delimited text.
Returns:
- The current value of dataText.
-
setDataText
public InsertRecordsFromPayloadRequest setDataText(String dataText)
Records formatted as delimited text.
Parameters:
- dataText - The new value for dataText.
Returns:
- this to mimic the builder pattern.
-
getDataBytes
public ByteBuffer getDataBytes()
Records formatted as binary data.
Returns:
- The current value of dataBytes.
-
setDataBytes
public InsertRecordsFromPayloadRequest setDataBytes(ByteBuffer dataBytes)
Records formatted as binary data.
Parameters:
- dataBytes - The new value for dataBytes.
Returns:
- this to mimic the builder pattern.
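A brief sketch of supplying a binary payload through this setter, assuming a Parquet file on local disk; the FILE_TYPE and PARQUET constants are assumed to match the option values documented under getOptions(), and the file path and table name are illustrative.

import com.gpudb.protocol.InsertRecordsFromPayloadRequest;
import com.gpudb.protocol.InsertRecordsFromPayloadRequest.Options;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class BinaryPayloadSketch {
    public static InsertRecordsFromPayloadRequest fromParquet(String path) throws IOException {
        // Read the file into memory and wrap it for the request
        byte[] bytes = Files.readAllBytes(Paths.get(path));

        // Tell the server how to parse the binary payload (constant names assumed)
        Map<String, String> options = new HashMap<>();
        options.put(Options.FILE_TYPE, Options.PARQUET);

        return new InsertRecordsFromPayloadRequest()
                .setTableName("ki_home.parquet_load")   // illustrative table name
                .setDataBytes(ByteBuffer.wrap(bytes))
                .setOptions(options);
    }
}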
-
getModifyColumns
public Map<String,Map<String,String>> getModifyColumns()
Not implemented yet. The default value is an empty Map.
Returns:
- The current value of modifyColumns.
-
setModifyColumns
public InsertRecordsFromPayloadRequest setModifyColumns(Map<String,Map<String,String>> modifyColumns)
Not implemented yet. The default value is an empty Map.
Parameters:
- modifyColumns - The new value for modifyColumns.
Returns:
- this to mimic the builder pattern.
-
getCreateTableOptions
public Map<String,String> getCreateTableOptions()
Options used when creating the target table. Includes type to use. The other options match those in GPUdb.createTable.
- TYPE_ID: ID of a currently registered type. The default value is ''.
- NO_ERROR_IF_EXISTS: If TRUE, prevents an error from occurring if the table already exists and is of the given type. If a table with the same ID but a different type exists, it is still an error. Supported values: TRUE, FALSE. The default value is FALSE.
- IS_REPLICATED: Affects the distribution scheme for the table's data. If TRUE and the given type has no explicit shard key defined, the table will be replicated. If FALSE, the table will be sharded according to the shard key specified in the given TYPE_ID, or randomly sharded, if no shard key is specified. Note that a type containing a shard key cannot be used to create a replicated table. Supported values: TRUE, FALSE. The default value is FALSE.
- FOREIGN_KEYS: Semicolon-separated list of foreign keys, of the format '(source_column_name [, ...]) references target_table_name(primary_key_column_name [, ...]) [as foreign_key_name]'.
- FOREIGN_SHARD_KEY: Foreign shard key of the format 'source_column references shard_by_column from target_table(primary_key_column)'.
- PARTITION_TYPE: Partitioning scheme to use. Supported values:
  - RANGE: Use range partitioning.
  - INTERVAL: Use interval partitioning.
  - LIST: Use list partitioning.
  - HASH: Use hash partitioning.
  - SERIES: Use series partitioning.
- PARTITION_KEYS: Comma-separated list of partition keys, which are the columns or column expressions by which records will be assigned to partitions defined by PARTITION_DEFINITIONS.
- PARTITION_DEFINITIONS: Comma-separated list of partition definitions, whose format depends on the choice of PARTITION_TYPE. See range partitioning, interval partitioning, list partitioning, hash partitioning, or series partitioning for example formats.
- IS_AUTOMATIC_PARTITION: If TRUE, a new partition will be created for values which don't fall into an existing partition. Currently only supported for list partitions. Supported values: TRUE, FALSE. The default value is FALSE.
- TTL: Sets the TTL of the table specified in tableName.
- CHUNK_SIZE: Indicates the number of records per chunk to be used for this table.
- CHUNK_COLUMN_MAX_MEMORY: Indicates the target maximum data size for each column in a chunk to be used for this table.
- CHUNK_MAX_MEMORY: Indicates the target maximum data size for all columns in a chunk to be used for this table.
- IS_RESULT_TABLE: Indicates whether the table is a memory-only table. A result table cannot contain columns with text_search data-handling, and it will not be retained if the server is restarted. Supported values: TRUE, FALSE. The default value is FALSE.
- STRATEGY_DEFINITION: The tier strategy for the table and its columns.
- COMPRESSION_CODEC: The default compression codec for this table's columns.
The default value is an empty Map.
Returns:
- The current value of createTableOptions.
-
setCreateTableOptions
public InsertRecordsFromPayloadRequest setCreateTableOptions(Map<String,String> createTableOptions)
Options used when creating the target table. Includes type to use. The other options match those in GPUdb.createTable. A map-building sketch follows this option list.
- TYPE_ID: ID of a currently registered type. The default value is ''.
- NO_ERROR_IF_EXISTS: If TRUE, prevents an error from occurring if the table already exists and is of the given type. If a table with the same ID but a different type exists, it is still an error. Supported values: TRUE, FALSE. The default value is FALSE.
- IS_REPLICATED: Affects the distribution scheme for the table's data. If TRUE and the given type has no explicit shard key defined, the table will be replicated. If FALSE, the table will be sharded according to the shard key specified in the given TYPE_ID, or randomly sharded, if no shard key is specified. Note that a type containing a shard key cannot be used to create a replicated table. Supported values: TRUE, FALSE. The default value is FALSE.
- FOREIGN_KEYS: Semicolon-separated list of foreign keys, of the format '(source_column_name [, ...]) references target_table_name(primary_key_column_name [, ...]) [as foreign_key_name]'.
- FOREIGN_SHARD_KEY: Foreign shard key of the format 'source_column references shard_by_column from target_table(primary_key_column)'.
- PARTITION_TYPE: Partitioning scheme to use. Supported values:
  - RANGE: Use range partitioning.
  - INTERVAL: Use interval partitioning.
  - LIST: Use list partitioning.
  - HASH: Use hash partitioning.
  - SERIES: Use series partitioning.
- PARTITION_KEYS: Comma-separated list of partition keys, which are the columns or column expressions by which records will be assigned to partitions defined by PARTITION_DEFINITIONS.
- PARTITION_DEFINITIONS: Comma-separated list of partition definitions, whose format depends on the choice of PARTITION_TYPE. See range partitioning, interval partitioning, list partitioning, hash partitioning, or series partitioning for example formats.
- IS_AUTOMATIC_PARTITION: If TRUE, a new partition will be created for values which don't fall into an existing partition. Currently only supported for list partitions. Supported values: TRUE, FALSE. The default value is FALSE.
- TTL: Sets the TTL of the table specified in tableName.
- CHUNK_SIZE: Indicates the number of records per chunk to be used for this table.
- CHUNK_COLUMN_MAX_MEMORY: Indicates the target maximum data size for each column in a chunk to be used for this table.
- CHUNK_MAX_MEMORY: Indicates the target maximum data size for all columns in a chunk to be used for this table.
- IS_RESULT_TABLE: Indicates whether the table is a memory-only table. A result table cannot contain columns with text_search data-handling, and it will not be retained if the server is restarted. Supported values: TRUE, FALSE. The default value is FALSE.
- STRATEGY_DEFINITION: The tier strategy for the table and its columns.
- COMPRESSION_CODEC: The default compression codec for this table's columns.
The default value is an empty Map.
Parameters:
- createTableOptions - The new value for createTableOptions.
Returns:
- this to mimic the builder pattern.
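A short sketch of assembling this map with the nested CreateTableOptions constants before passing it to this setter; the constant field names are assumed to match the option names listed above, and the values are illustrative.

import com.gpudb.protocol.InsertRecordsFromPayloadRequest;
import com.gpudb.protocol.InsertRecordsFromPayloadRequest.CreateTableOptions;

import java.util.HashMap;
import java.util.Map;

public class CreateTableOptionsSketch {
    public static InsertRecordsFromPayloadRequest withTableOptions(InsertRecordsFromPayloadRequest request) {
        Map<String, String> cto = new HashMap<>();
        // Don't fail if the table already exists with the same type
        cto.put(CreateTableOptions.NO_ERROR_IF_EXISTS, CreateTableOptions.TRUE);
        // Shard (rather than replicate) the created table
        cto.put(CreateTableOptions.IS_REPLICATED, CreateTableOptions.FALSE);
        // Records per chunk for the created table (value is illustrative)
        cto.put(CreateTableOptions.CHUNK_SIZE, "500000");
        return request.setCreateTableOptions(cto);
    }
}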
-
getOptions
public Map<String,String> getOptions()
Optional parameters.
- BAD_RECORD_TABLE_NAME: Optional name of a table to which records that were rejected are written. The bad-record-table has the following columns: line_number (long), line_rejected (string), error_message (string).
- BAD_RECORD_TABLE_LIMIT: A positive integer indicating the maximum number of records that can be written to the bad-record-table. The default value is 10000.
- BAD_RECORD_TABLE_LIMIT_PER_INPUT: For subscriptions: a positive integer indicating the maximum number of records that can be written to the bad-record-table per file/payload. The default value will be 'bad_record_table_limit', and the total size of the table per rank is limited to 'bad_record_table_limit'.
- BATCH_SIZE: Internal tuning parameter: the number of records per batch when inserting data.
- COLUMN_FORMATS: For each target column specified, applies the column-property-bound format to the source data loaded into that column. Each column format will contain a mapping of one or more of its column properties to an appropriate format for each property. Currently supported column properties include date, time, and datetime. The parameter value must be formatted as a JSON string of maps of column names to maps of column properties to their corresponding column formats, e.g., '{ "order_date" : { "date" : "%Y.%m.%d" }, "order_time" : { "time" : "%H:%M:%S" } }'. See DEFAULT_COLUMN_FORMATS for valid format syntax.
- COLUMNS_TO_LOAD: Specifies a comma-delimited list of columns from the source data to load. If more than one file is being loaded, this list applies to all files. Column numbers can be specified discretely or as a range. For example, a value of '5,7,1..3' will insert values from the fifth column in the source data into the first column in the target table, from the seventh column in the source data into the second column in the target table, and from the first through third columns in the source data into the third through fifth columns in the target table. If the source data contains a header, column names matching the file header names may be provided instead of column numbers. If the target table doesn't exist, the table will be created with the columns in this order. If the target table does exist with columns in a different order than the source data, this list can be used to match the order of the target table. For example, a value of 'C, B, A' will create a three-column table with column C, followed by column B, followed by column A; or will insert those fields in that order into a table created with columns in that order. If the target table exists, the column names must match the source data field names for a name-mapping to be successful. Mutually exclusive with COLUMNS_TO_SKIP.
- COLUMNS_TO_SKIP: Specifies a comma-delimited list of columns from the source data to skip. Mutually exclusive with COLUMNS_TO_LOAD.
- COMPRESSION_TYPE: Optional: payload compression type. Supported values:
  - NONE: Uncompressed.
  - AUTO: Default. Auto detect compression type.
  - GZIP: gzip file compression.
  - BZIP2: bzip2 file compression.
  The default value is AUTO.
- DEFAULT_COLUMN_FORMATS: Specifies the default format to be applied to source data loaded into columns with the corresponding column property. Currently supported column properties include date, time, and datetime. This default column-property-bound format can be overridden by specifying a column property and format for a given target column in COLUMN_FORMATS. For each specified annotation, the format will apply to all columns with that annotation unless a custom COLUMN_FORMATS for that annotation is specified. The parameter value must be formatted as a JSON string that is a map of column properties to their respective column formats, e.g., '{ "date" : "%Y.%m.%d", "time" : "%H:%M:%S" }'. Column formats are specified as a string of control characters and plain text. The supported control characters are 'Y', 'm', 'd', 'H', 'M', 'S', and 's', which follow the Linux 'strptime()' specification, as well as 's', which specifies seconds and fractional seconds (though the fractional component will be truncated past milliseconds). Formats for the 'date' annotation must include the 'Y', 'm', and 'd' control characters. Formats for the 'time' annotation must include the 'H', 'M', and either 'S' or 's' (but not both) control characters. Formats for the 'datetime' annotation meet both the 'date' and 'time' control character requirements. For example, '{"datetime" : "%m/%d/%Y %H:%M:%S" }' would be used to interpret text as "05/04/2000 12:12:11".
- ERROR_HANDLING: Specifies how errors should be handled upon insertion. Supported values:
  - PERMISSIVE: Records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped.
  - IGNORE_BAD_RECORDS: Malformed records are skipped.
  - ABORT: Stops current insertion and aborts entire operation when an error is encountered. Primary key collisions are considered abortable errors in this mode.
  The default value is ABORT.
- FILE_TYPE: Specifies the type of the file(s) whose records will be inserted. Supported values:
  - AVRO: Avro file format.
  - DELIMITED_TEXT: Delimited text file format; e.g., CSV, TSV, PSV, etc.
  - GDB: Esri/GDB file format.
  - JSON: JSON file format.
  - PARQUET: Apache Parquet file format.
  - SHAPEFILE: ShapeFile file format.
  The default value is DELIMITED_TEXT.
- FLATTEN_COLUMNS: Specifies how to handle nested columns. Supported values:
  - TRUE: Break up nested columns into multiple columns.
  - FALSE: Treat nested columns as JSON columns instead of flattening.
  The default value is FALSE.
- GDAL_CONFIGURATION_OPTIONS: Comma separated list of gdal conf options, for the specific requests: key=value. The default value is ''.
- IGNORE_EXISTING_PK: Specifies the record collision error-suppression policy for inserting into a table with a primary key, only used when not in upsert mode (upsert mode is disabled when UPDATE_ON_EXISTING_PK is FALSE). If set to TRUE, any record being inserted that is rejected for having primary key values that match those of an existing table record will be ignored with no error generated. If FALSE, the rejection of any record for having primary key values matching an existing record will result in an error being reported, as determined by ERROR_HANDLING. If the specified table does not have a primary key or if upsert mode is in effect (UPDATE_ON_EXISTING_PK is TRUE), then this option has no effect. Supported values:
  - TRUE: Ignore new records whose primary key values collide with those of existing records.
  - FALSE: Treat as errors any new records whose primary key values collide with those of existing records.
  The default value is FALSE.
- INGESTION_MODE: Whether to do a full load, dry run, or perform a type inference on the source data. Supported values:
  - FULL: Run a type inference on the source data (if needed) and ingest.
  - DRY_RUN: Does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of ERROR_HANDLING.
  - TYPE_INFERENCE_ONLY: Infer the type of the source data and return, without ingesting any data. The inferred type is returned in the response.
  The default value is FULL.
- LAYER: Optional: geo files layer(s) name(s): comma separated. The default value is ''.
- LOADING_MODE: Scheme for distributing the extraction and loading of data from the source data file(s). This option applies only when loading files that are local to the database. Supported values:
  - HEAD: The head node loads all data. All files must be available to the head node.
  - DISTRIBUTED_SHARED: The head node coordinates loading data by worker processes across all nodes from shared files available to all workers. NOTE: Instead of existing on a shared source, the files can be duplicated on a source local to each host to improve performance, though the files must appear as the same data set from the perspective of all hosts performing the load.
  - DISTRIBUTED_LOCAL: A single worker process on each node loads all files that are available to it. This option works best when each worker loads files from its own file system, to maximize performance. In order to avoid data duplication, either each worker performing the load needs to have visibility to a set of files unique to it (no file is visible to more than one node) or the target table needs to have a primary key (which will allow the worker to automatically deduplicate data). NOTE: If the target table doesn't exist, the table structure will be determined by the head node. If the head node has no files local to it, it will be unable to determine the structure and the request will fail. If the head node is configured to have no worker processes, no data strictly accessible to the head node will be loaded.
  The default value is HEAD.
- LOCAL_TIME_OFFSET: For Avro local timestamp columns.
- MAX_RECORDS_TO_LOAD: Limit the number of records to load in this request: if this number is larger than batch_size, then the number of records loaded will be limited to the next whole number of batch_size (per working thread). The default value is ''.
- NUM_TASKS_PER_RANK: Optional: number of tasks for reading file per rank. Default will be external_file_reader_num_tasks.
- POLL_INTERVAL: If TRUE, the number of seconds between attempts to load external files into the table. If zero, polling will be continuous as long as data is found. If no data is found, the interval will steadily increase to a maximum of 60 seconds.
- PRIMARY_KEYS: Optional: comma separated list of column names to set as primary keys, when not specified in the type. The default value is ''.
- SCHEMA_REGISTRY_CONNECTION_RETRIES: Confluent Schema registry connection retries.
- SCHEMA_REGISTRY_CONNECTION_TIMEOUT: Confluent Schema registry connection timeout (in Secs).
- SCHEMA_REGISTRY_MAX_CONSECUTIVE_CONNECTION_FAILURES: Max records to skip due to Schema Registry connection failures, before failing.
- MAX_CONSECUTIVE_INVALID_SCHEMA_FAILURE: Max records to skip due to schema-related errors, before failing.
- SCHEMA_REGISTRY_SCHEMA_NAME: Name of the Avro schema in the schema registry to use when reading Avro records.
- SHARD_KEYS: Optional: comma separated list of column names to set as shard keys, when not specified in the type. The default value is ''.
- SKIP_LINES: Skip a number of lines from the beginning of the file.
- SUBSCRIBE: Continuously poll the data source to check for new data and load it into the table. Supported values: TRUE, FALSE. The default value is FALSE.
- TABLE_INSERT_MODE: Optional: table_insert_mode. When inserting records from multiple files: if table_per_file, then insert from each file into a new table. Currently supported only for shapefiles. Supported values: SINGLE, TABLE_PER_FILE. The default value is SINGLE.
- TEXT_COMMENT_STRING: Specifies the character string that should be interpreted as a comment line prefix in the source data. All lines in the data starting with the provided string are ignored. For DELIMITED_TEXT FILE_TYPE only. The default value is '#'.
- TEXT_DELIMITER: Specifies the character delimiting field values in the source data and field names in the header (if present). For DELIMITED_TEXT FILE_TYPE only. The default value is ','.
- TEXT_ESCAPE_CHARACTER: Specifies the character that is used to escape other characters in the source data. An 'a', 'b', 'f', 'n', 'r', 't', or 'v' preceded by an escape character will be interpreted as the ASCII bell, backspace, form feed, line feed, carriage return, horizontal tab, and vertical tab, respectively. For example, the escape character followed by an 'n' will be interpreted as a newline within a field value. The escape character can also be used to escape the quoting character, and will be treated as an escape character whether it is within a quoted field value or not. For DELIMITED_TEXT FILE_TYPE only.
- TEXT_HAS_HEADER: Indicates whether the source data contains a header row. For DELIMITED_TEXT FILE_TYPE only. Supported values: TRUE, FALSE. The default value is TRUE.
- TEXT_HEADER_PROPERTY_DELIMITER: Specifies the delimiter for column properties in the header row (if present). Cannot be set to the same value as TEXT_DELIMITER. For DELIMITED_TEXT FILE_TYPE only. The default value is '|'.
- TEXT_NULL_STRING: Specifies the character string that should be interpreted as a null value in the source data. For DELIMITED_TEXT FILE_TYPE only. The default value is '\N'.
- TEXT_QUOTE_CHARACTER: Specifies the character that should be interpreted as a field value quoting character in the source data. The character must appear at the beginning and end of a field value to take effect. Delimiters within quoted fields are treated as literals and not delimiters. Within a quoted field, two consecutive quote characters will be interpreted as a single literal quote character, effectively escaping it. To not have a quote character, specify an empty string. For DELIMITED_TEXT FILE_TYPE only. The default value is '"'.
- TEXT_SEARCH_COLUMNS: Add 'text_search' property to internally inferred string columns. Comma separated list of column names or '*' for all columns. To add the text_search property only to string columns of a minimum size, also set the option 'text_search_min_column_length'.
- TEXT_SEARCH_MIN_COLUMN_LENGTH: Set the minimum column size. Used only when 'text_search_columns' has a value.
- TRIM_SPACE: If set to TRUE, remove leading or trailing space from fields. Supported values: TRUE, FALSE. The default value is FALSE.
- TRUNCATE_STRINGS: If set to TRUE, truncate string values that are longer than the column's type size. Supported values: TRUE, FALSE. The default value is FALSE.
- TRUNCATE_TABLE: If set to TRUE, truncates the table specified by tableName prior to loading the file(s). Supported values: TRUE, FALSE. The default value is FALSE.
- TYPE_INFERENCE_MAX_RECORDS_READ: The default value is ''.
- TYPE_INFERENCE_MODE: Optimize type inference for: Supported values:
  - ACCURACY: Scans data to get exactly-typed and sized columns for all data scanned.
  - SPEED: Scans data and picks the widest possible column types so that 'all' values will fit with minimum data scanned.
  The default value is ACCURACY.
- UPDATE_ON_EXISTING_PK: Specifies the record collision policy for inserting into a table with a primary key. If set to TRUE, any existing table record with primary key values that match those of a record being inserted will be replaced by that new record (the new data will be "upserted"). If set to FALSE, any existing table record with primary key values that match those of a record being inserted will remain unchanged, while the new record will be rejected and the error handled as determined by IGNORE_EXISTING_PK and ERROR_HANDLING. If the specified table does not have a primary key, then this option has no effect. Supported values:
  - TRUE: Upsert new records when primary keys match existing records.
  - FALSE: Reject new records when primary keys match existing records.
  The default value is FALSE.
The default value is an empty Map.
Returns:
- The current value of options.
-
setOptions
public InsertRecordsFromPayloadRequest setOptions(Map<String,String> options)
Optional parameters. A map-building sketch follows this option list.
- BAD_RECORD_TABLE_NAME: Optional name of a table to which records that were rejected are written. The bad-record-table has the following columns: line_number (long), line_rejected (string), error_message (string).
- BAD_RECORD_TABLE_LIMIT: A positive integer indicating the maximum number of records that can be written to the bad-record-table. The default value is 10000.
- BAD_RECORD_TABLE_LIMIT_PER_INPUT: For subscriptions: a positive integer indicating the maximum number of records that can be written to the bad-record-table per file/payload. The default value will be 'bad_record_table_limit', and the total size of the table per rank is limited to 'bad_record_table_limit'.
- BATCH_SIZE: Internal tuning parameter: the number of records per batch when inserting data.
- COLUMN_FORMATS: For each target column specified, applies the column-property-bound format to the source data loaded into that column. Each column format will contain a mapping of one or more of its column properties to an appropriate format for each property. Currently supported column properties include date, time, and datetime. The parameter value must be formatted as a JSON string of maps of column names to maps of column properties to their corresponding column formats, e.g., '{ "order_date" : { "date" : "%Y.%m.%d" }, "order_time" : { "time" : "%H:%M:%S" } }'. See DEFAULT_COLUMN_FORMATS for valid format syntax.
- COLUMNS_TO_LOAD: Specifies a comma-delimited list of columns from the source data to load. If more than one file is being loaded, this list applies to all files. Column numbers can be specified discretely or as a range. For example, a value of '5,7,1..3' will insert values from the fifth column in the source data into the first column in the target table, from the seventh column in the source data into the second column in the target table, and from the first through third columns in the source data into the third through fifth columns in the target table. If the source data contains a header, column names matching the file header names may be provided instead of column numbers. If the target table doesn't exist, the table will be created with the columns in this order. If the target table does exist with columns in a different order than the source data, this list can be used to match the order of the target table. For example, a value of 'C, B, A' will create a three-column table with column C, followed by column B, followed by column A; or will insert those fields in that order into a table created with columns in that order. If the target table exists, the column names must match the source data field names for a name-mapping to be successful. Mutually exclusive with COLUMNS_TO_SKIP.
- COLUMNS_TO_SKIP: Specifies a comma-delimited list of columns from the source data to skip. Mutually exclusive with COLUMNS_TO_LOAD.
- COMPRESSION_TYPE: Optional: payload compression type. Supported values:
  - NONE: Uncompressed.
  - AUTO: Default. Auto detect compression type.
  - GZIP: gzip file compression.
  - BZIP2: bzip2 file compression.
  The default value is AUTO.
- DEFAULT_COLUMN_FORMATS: Specifies the default format to be applied to source data loaded into columns with the corresponding column property. Currently supported column properties include date, time, and datetime. This default column-property-bound format can be overridden by specifying a column property and format for a given target column in COLUMN_FORMATS. For each specified annotation, the format will apply to all columns with that annotation unless a custom COLUMN_FORMATS for that annotation is specified. The parameter value must be formatted as a JSON string that is a map of column properties to their respective column formats, e.g., '{ "date" : "%Y.%m.%d", "time" : "%H:%M:%S" }'. Column formats are specified as a string of control characters and plain text. The supported control characters are 'Y', 'm', 'd', 'H', 'M', 'S', and 's', which follow the Linux 'strptime()' specification, as well as 's', which specifies seconds and fractional seconds (though the fractional component will be truncated past milliseconds). Formats for the 'date' annotation must include the 'Y', 'm', and 'd' control characters. Formats for the 'time' annotation must include the 'H', 'M', and either 'S' or 's' (but not both) control characters. Formats for the 'datetime' annotation meet both the 'date' and 'time' control character requirements. For example, '{"datetime" : "%m/%d/%Y %H:%M:%S" }' would be used to interpret text as "05/04/2000 12:12:11".
- ERROR_HANDLING: Specifies how errors should be handled upon insertion. Supported values:
  - PERMISSIVE: Records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped.
  - IGNORE_BAD_RECORDS: Malformed records are skipped.
  - ABORT: Stops current insertion and aborts entire operation when an error is encountered. Primary key collisions are considered abortable errors in this mode.
  The default value is ABORT.
- FILE_TYPE: Specifies the type of the file(s) whose records will be inserted. Supported values:
  - AVRO: Avro file format.
  - DELIMITED_TEXT: Delimited text file format; e.g., CSV, TSV, PSV, etc.
  - GDB: Esri/GDB file format.
  - JSON: JSON file format.
  - PARQUET: Apache Parquet file format.
  - SHAPEFILE: ShapeFile file format.
  The default value is DELIMITED_TEXT.
- FLATTEN_COLUMNS: Specifies how to handle nested columns. Supported values:
  - TRUE: Break up nested columns into multiple columns.
  - FALSE: Treat nested columns as JSON columns instead of flattening.
  The default value is FALSE.
- GDAL_CONFIGURATION_OPTIONS: Comma separated list of gdal conf options, for the specific requests: key=value. The default value is ''.
- IGNORE_EXISTING_PK: Specifies the record collision error-suppression policy for inserting into a table with a primary key, only used when not in upsert mode (upsert mode is disabled when UPDATE_ON_EXISTING_PK is FALSE). If set to TRUE, any record being inserted that is rejected for having primary key values that match those of an existing table record will be ignored with no error generated. If FALSE, the rejection of any record for having primary key values matching an existing record will result in an error being reported, as determined by ERROR_HANDLING. If the specified table does not have a primary key or if upsert mode is in effect (UPDATE_ON_EXISTING_PK is TRUE), then this option has no effect. Supported values:
  - TRUE: Ignore new records whose primary key values collide with those of existing records.
  - FALSE: Treat as errors any new records whose primary key values collide with those of existing records.
  The default value is FALSE.
- INGESTION_MODE: Whether to do a full load, dry run, or perform a type inference on the source data. Supported values:
  - FULL: Run a type inference on the source data (if needed) and ingest.
  - DRY_RUN: Does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of ERROR_HANDLING.
  - TYPE_INFERENCE_ONLY: Infer the type of the source data and return, without ingesting any data. The inferred type is returned in the response.
  The default value is FULL.
- LAYER: Optional: geo files layer(s) name(s): comma separated. The default value is ''.
- LOADING_MODE: Scheme for distributing the extraction and loading of data from the source data file(s). This option applies only when loading files that are local to the database. Supported values:
  - HEAD: The head node loads all data. All files must be available to the head node.
  - DISTRIBUTED_SHARED: The head node coordinates loading data by worker processes across all nodes from shared files available to all workers. NOTE: Instead of existing on a shared source, the files can be duplicated on a source local to each host to improve performance, though the files must appear as the same data set from the perspective of all hosts performing the load.
  - DISTRIBUTED_LOCAL: A single worker process on each node loads all files that are available to it. This option works best when each worker loads files from its own file system, to maximize performance. In order to avoid data duplication, either each worker performing the load needs to have visibility to a set of files unique to it (no file is visible to more than one node) or the target table needs to have a primary key (which will allow the worker to automatically deduplicate data). NOTE: If the target table doesn't exist, the table structure will be determined by the head node. If the head node has no files local to it, it will be unable to determine the structure and the request will fail. If the head node is configured to have no worker processes, no data strictly accessible to the head node will be loaded.
  The default value is HEAD.
- LOCAL_TIME_OFFSET: For Avro local timestamp columns.
- MAX_RECORDS_TO_LOAD: Limit the number of records to load in this request: if this number is larger than batch_size, then the number of records loaded will be limited to the next whole number of batch_size (per working thread). The default value is ''.
- NUM_TASKS_PER_RANK: Optional: number of tasks for reading file per rank. Default will be external_file_reader_num_tasks.
- POLL_INTERVAL: If TRUE, the number of seconds between attempts to load external files into the table. If zero, polling will be continuous as long as data is found. If no data is found, the interval will steadily increase to a maximum of 60 seconds.
- PRIMARY_KEYS: Optional: comma separated list of column names to set as primary keys, when not specified in the type. The default value is ''.
- SCHEMA_REGISTRY_CONNECTION_RETRIES: Confluent Schema registry connection retries.
- SCHEMA_REGISTRY_CONNECTION_TIMEOUT: Confluent Schema registry connection timeout (in Secs).
- SCHEMA_REGISTRY_MAX_CONSECUTIVE_CONNECTION_FAILURES: Max records to skip due to Schema Registry connection failures, before failing.
- MAX_CONSECUTIVE_INVALID_SCHEMA_FAILURE: Max records to skip due to schema-related errors, before failing.
- SCHEMA_REGISTRY_SCHEMA_NAME: Name of the Avro schema in the schema registry to use when reading Avro records.
- SHARD_KEYS: Optional: comma separated list of column names to set as shard keys, when not specified in the type. The default value is ''.
- SKIP_LINES: Skip a number of lines from the beginning of the file.
- SUBSCRIBE: Continuously poll the data source to check for new data and load it into the table. Supported values: TRUE, FALSE. The default value is FALSE.
- TABLE_INSERT_MODE: Optional: table_insert_mode. When inserting records from multiple files: if table_per_file, then insert from each file into a new table. Currently supported only for shapefiles. Supported values: SINGLE, TABLE_PER_FILE. The default value is SINGLE.
- TEXT_COMMENT_STRING: Specifies the character string that should be interpreted as a comment line prefix in the source data. All lines in the data starting with the provided string are ignored. For DELIMITED_TEXT FILE_TYPE only. The default value is '#'.
- TEXT_DELIMITER: Specifies the character delimiting field values in the source data and field names in the header (if present). For DELIMITED_TEXT FILE_TYPE only. The default value is ','.
- TEXT_ESCAPE_CHARACTER: Specifies the character that is used to escape other characters in the source data. An 'a', 'b', 'f', 'n', 'r', 't', or 'v' preceded by an escape character will be interpreted as the ASCII bell, backspace, form feed, line feed, carriage return, horizontal tab, and vertical tab, respectively. For example, the escape character followed by an 'n' will be interpreted as a newline within a field value. The escape character can also be used to escape the quoting character, and will be treated as an escape character whether it is within a quoted field value or not. For DELIMITED_TEXT FILE_TYPE only.
- TEXT_HAS_HEADER: Indicates whether the source data contains a header row. For DELIMITED_TEXT FILE_TYPE only. Supported values: TRUE, FALSE. The default value is TRUE.
- TEXT_HEADER_PROPERTY_DELIMITER: Specifies the delimiter for column properties in the header row (if present). Cannot be set to the same value as TEXT_DELIMITER. For DELIMITED_TEXT FILE_TYPE only. The default value is '|'.
- TEXT_NULL_STRING: Specifies the character string that should be interpreted as a null value in the source data. For DELIMITED_TEXT FILE_TYPE only. The default value is '\N'.
- TEXT_QUOTE_CHARACTER: Specifies the character that should be interpreted as a field value quoting character in the source data. The character must appear at the beginning and end of a field value to take effect. Delimiters within quoted fields are treated as literals and not delimiters. Within a quoted field, two consecutive quote characters will be interpreted as a single literal quote character, effectively escaping it. To not have a quote character, specify an empty string. For DELIMITED_TEXT FILE_TYPE only. The default value is '"'.
- TEXT_SEARCH_COLUMNS: Add 'text_search' property to internally inferred string columns. Comma separated list of column names or '*' for all columns. To add the text_search property only to string columns of a minimum size, also set the option 'text_search_min_column_length'.
- TEXT_SEARCH_MIN_COLUMN_LENGTH: Set the minimum column size. Used only when 'text_search_columns' has a value.
- TRIM_SPACE: If set to TRUE, remove leading or trailing space from fields. Supported values: TRUE, FALSE. The default value is FALSE.
- TRUNCATE_STRINGS: If set to TRUE, truncate string values that are longer than the column's type size. Supported values: TRUE, FALSE. The default value is FALSE.
- TRUNCATE_TABLE: If set to TRUE, truncates the table specified by tableName prior to loading the file(s). Supported values: TRUE, FALSE. The default value is FALSE.
- TYPE_INFERENCE_MAX_RECORDS_READ: The default value is ''.
- TYPE_INFERENCE_MODE: Optimize type inference for: Supported values:
  - ACCURACY: Scans data to get exactly-typed and sized columns for all data scanned.
  - SPEED: Scans data and picks the widest possible column types so that 'all' values will fit with minimum data scanned.
  The default value is ACCURACY.
- UPDATE_ON_EXISTING_PK: Specifies the record collision policy for inserting into a table with a primary key. If set to TRUE, any existing table record with primary key values that match those of a record being inserted will be replaced by that new record (the new data will be "upserted"). If set to FALSE, any existing table record with primary key values that match those of a record being inserted will remain unchanged, while the new record will be rejected and the error handled as determined by IGNORE_EXISTING_PK and ERROR_HANDLING. If the specified table does not have a primary key, then this option has no effect. Supported values:
  - TRUE: Upsert new records when primary keys match existing records.
  - FALSE: Reject new records when primary keys match existing records.
  The default value is FALSE.
The default value is an empty Map.
Parameters:
- options - The new value for options.
Returns:
- this to mimic the builder pattern.
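A short sketch of assembling this map with the nested Options constants before passing it to this setter; the constant field names are assumed to match the option names listed above, and the delimiter, bad-record table name, and limit are illustrative.

import com.gpudb.protocol.InsertRecordsFromPayloadRequest;
import com.gpudb.protocol.InsertRecordsFromPayloadRequest.Options;

import java.util.HashMap;
import java.util.Map;

public class LoadOptionsSketch {
    public static InsertRecordsFromPayloadRequest withCsvOptions(InsertRecordsFromPayloadRequest request) {
        Map<String, String> options = new HashMap<>();
        // Delimited-text payload with a header row and '|' as the field delimiter
        options.put(Options.FILE_TYPE, Options.DELIMITED_TEXT);
        options.put(Options.TEXT_HAS_HEADER, Options.TRUE);
        options.put(Options.TEXT_DELIMITER, "|");
        // Skip malformed rows instead of aborting the whole load
        options.put(Options.ERROR_HANDLING, Options.IGNORE_BAD_RECORDS);
        // Route rejected rows to a side table (name and limit are illustrative)
        options.put(Options.BAD_RECORD_TABLE_NAME, "ki_home.bad_rows");
        options.put(Options.BAD_RECORD_TABLE_LIMIT, "1000");
        return request.setOptions(options);
    }
}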
-
getSchema
public org.apache.avro.Schema getSchema()
This method supports the Avro framework and is not intended to be called directly by the user.
Specified by:
- getSchema in interface org.apache.avro.generic.GenericContainer
Returns:
- The schema object describing this class.
-
get
public Object get(int index)
This method supports the Avro framework and is not intended to be called directly by the user.
Specified by:
- get in interface org.apache.avro.generic.IndexedRecord
Parameters:
- index - the position of the field to get
Returns:
- value of the field with the given index.
Throws:
- IndexOutOfBoundsException
-
put
public void put(int index, Object value)
This method supports the Avro framework and is not intended to be called directly by the user.
Specified by:
- put in interface org.apache.avro.generic.IndexedRecord
Parameters:
- index - the position of the field to set
- value - the value to set
Throws:
- IndexOutOfBoundsException
-
-