public static final class CreateTableExternalRequest.Options extends Object
BAD_RECORD_TABLE_NAME: Name of a table to which records that were rejected are written. The bad-record table has the following columns: line_number (long), line_rejected (string), error_message (string). When error_handling is abort, the bad-record table is not populated.
BAD_RECORD_TABLE_LIMIT: A positive integer indicating the maximum number of records that can be written to the bad-record table. The default value is '10000'.
BAD_RECORD_TABLE_LIMIT_PER_INPUT: For subscriptions, a positive integer indicating the maximum number of records that can be written to the bad-record table per file/payload. The default value will be bad_record_table_limit, and the total size of the table per rank is limited to bad_record_table_limit.
BATCH_SIZE: Number of records to insert per batch when inserting data. The default value is '50000'.
COLUMN_FORMATS: For each target column specified, applies the column-property-bound format to the source data loaded into that column. Each column format will contain a mapping of one or more of its column properties to an appropriate format for each property. Currently supported column properties include date, time, & datetime. The parameter value must be formatted as a JSON string of maps of column names to maps of column properties to their corresponding column formats, e.g., '{ "order_date" : { "date" : "%Y.%m.%d" }, "order_time" : { "time" : "%H:%M:%S" } }'. See default_column_formats for valid format syntax.
COLUMNS_TO_LOAD: Specifies a comma-delimited list of columns from the source data to load. If more than one file is being loaded, this list applies to all files.
Column numbers can be specified discretely or as a range. For example, a value of '5,7,1..3' will insert values from the fifth column in the source data into the first column in the target table, from the seventh column in the source data into the second column in the target table, and from the first through third columns in the source data into the third through fifth columns in the target table.
If the source data contains a header, column names matching the file header names may be provided instead of column numbers. If the target table doesn't exist, the table will be created with the columns in this order. If the target table does exist with columns in a different order than the source data, this list can be used to match the order of the target table. For example, a value of 'C, B, A' will create a three-column table with column C, followed by column B, followed by column A; or will insert those fields in that order into a table created with columns in that order. If the target table exists, the column names must match the source data field names for a name-mapping to be successful.
Mutually exclusive with columns_to_skip.
COLUMNS_TO_SKIP: Specifies a comma-delimited list of columns from the source data to skip. Mutually exclusive with columns_to_load.
COMPRESSION_TYPE: Source data compression type.
Supported values:
NONE: No compression.
AUTO: Auto detect compression type.
GZIP: gzip file compression.
BZIP2: bzip2 file compression.
The default value is AUTO.
DATASOURCE_NAME: Name of an existing external data source from which the data file(s) specified in filepaths will be loaded.
DEFAULT_COLUMN_FORMATS: Specifies the default format to be applied to source data loaded into columns with the corresponding column property. Currently supported column properties include date, time, & datetime. This default column-property-bound format can be overridden by specifying a column property & format for a given target column in column_formats. For each specified annotation, the format will apply to all columns with that annotation unless a custom column_formats for that annotation is specified.
The parameter value must be formatted as a JSON string that is a map of column properties to their respective column formats, e.g., '{ "date" : "%Y.%m.%d", "time" : "%H:%M:%S" }'. Column formats are specified as a string of control characters and plain text. The supported control characters are 'Y', 'm', 'd', 'H', 'M', 'S', and 's', which follow the Linux 'strptime()' specification, as well as 's', which specifies seconds and fractional seconds (though the fractional component will be truncated past milliseconds).
Formats for the 'date' annotation must include the 'Y', 'm', and 'd' control characters. Formats for the 'time' annotation must include the 'H', 'M', and either 'S' or 's' (but not both) control characters. Formats for the 'datetime' annotation meet both the 'date' and 'time' control character requirements. For example, '{ "datetime" : "%m/%d/%Y %H:%M:%S" }' would be used to interpret text as "05/04/2000 12:12:11".
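As a minimal illustrative sketch (not normative API documentation), the following shows how column_formats and default_column_formats might be supplied via the constants in this class; the column names reuse the example above, and the com.gpudb.protocol package path is assumed.

```java
import java.util.HashMap;
import java.util.Map;

import com.gpudb.protocol.CreateTableExternalRequest.Options;

public class ColumnFormatOptionsSketch {
    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();

        // Per-column formats: a JSON map of column name -> { column property -> format }
        options.put(Options.COLUMN_FORMATS,
            "{ \"order_date\" : { \"date\" : \"%Y.%m.%d\" }, "
                + "\"order_time\" : { \"time\" : \"%H:%M:%S\" } }");

        // Fallback formats applied to any other column carrying the given property
        options.put(Options.DEFAULT_COLUMN_FORMATS,
            "{ \"date\" : \"%Y-%m-%d\", \"datetime\" : \"%m/%d/%Y %H:%M:%S\" }");

        System.out.println(options);
    }
}
```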
ERROR_HANDLING: Specifies how errors should be handled upon insertion.
Supported values:
PERMISSIVE: Records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped.
IGNORE_BAD_RECORDS: Malformed records are skipped.
ABORT: Stops current insertion and aborts entire operation when an error is encountered. Primary key collisions are considered abortable errors in this mode.
The default value is ABORT.
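Continuing the options-map sketch above, one plausible combination is to skip malformed rows rather than abort the load while capturing them for review; the bad-record table name and limit below are hypothetical.

```java
// Skip malformed records instead of aborting the whole operation
options.put(Options.ERROR_HANDLING, Options.IGNORE_BAD_RECORDS);

// Write rejected records (line number, rejected line, error message) to a side table
options.put(Options.BAD_RECORD_TABLE_NAME, "load_errors");   // hypothetical table name
options.put(Options.BAD_RECORD_TABLE_LIMIT, "5000");
```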
EXTERNAL_TABLE_TYPE: Specifies whether the external table holds a local copy of the external data.
Supported values:
MATERIALIZED: Loads a copy of the external data into the database, refreshed on demand.
LOGICAL: External data will not be loaded into the database; the data will be retrieved from the source upon servicing each query against the external table.
The default value is MATERIALIZED.
FILE_TYPE: Specifies the type of the file(s) whose records will be inserted.
Supported values:
AVRO: Avro file format.
DELIMITED_TEXT: Delimited text file format; e.g., CSV, TSV, PSV, etc.
GDB: Esri/GDB file format.
JSON: JSON file format.
PARQUET: Apache Parquet file format.
SHAPEFILE: ShapeFile file format.
The default value is DELIMITED_TEXT.
GDAL_CONFIGURATION_OPTIONS: Comma-separated list of GDAL configuration options for the specific request, in key=value form.
IGNORE_EXISTING_PK: Specifies the record collision error-suppression policy for inserting into a table with a primary key, only used when not in upsert mode (upsert mode is disabled when update_on_existing_pk is false). If set to true, any record being inserted that is rejected for having primary key values that match those of an existing table record will be ignored with no error generated. If false, the rejection of any record for having primary key values matching an existing record will result in an error being reported, as determined by error_handling. If the specified table does not have a primary key or if upsert mode is in effect (update_on_existing_pk is true), then this option has no effect.
Supported values:
TRUE: Ignore new records whose primary key values collide with those of existing records.
FALSE: Treat as errors any new records whose primary key values collide with those of existing records.
The default value is FALSE.
INGESTION_MODE: Whether to do a full load, dry run, or perform a type inference on the source data.
Supported values:
FULL: Run a type inference on the source data (if needed) and ingest.
DRY_RUN: Does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of error_handling.
TYPE_INFERENCE_ONLY: Infer the type of the source data and return, without ingesting any data. The inferred type is returned in the response.
The default value is FULL.
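A common pattern, sketched here under the same assumptions as the snippets above, is to validate the source with a dry run before switching to a full load.

```java
// First pass: walk the source data and count valid records without loading anything
options.put(Options.INGESTION_MODE, Options.DRY_RUN);
// ... submit the request and inspect the response ...

// Second pass: perform the actual ingest
options.put(Options.INGESTION_MODE, Options.FULL);
```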
JDBC_FETCH_SIZE: The JDBC fetch size, which determines how many rows to fetch per round trip. The default value is '50000'.
KAFKA_CONSUMERS_PER_RANK: Number of Kafka consumer threads per rank (valid range 1-6). The default value is '1'.
KAFKA_GROUP_ID: The group id to be used when consuming data from a Kafka topic (valid only for Kafka datasource subscriptions).
KAFKA_OFFSET_RESET_POLICY: Policy to determine whether the Kafka data consumption starts either at earliest offset or latest offset.
Supported values: EARLIEST, LATEST. The default value is EARLIEST.
KAFKA_OPTIMISTIC_INGEST: Enable optimistic ingestion where Kafka topic offsets and table data are committed independently to achieve parallelism.
Supported values: TRUE, FALSE. The default value is FALSE.
KAFKA_SUBSCRIPTION_CANCEL_AFTER: Sets the Kafka subscription lifespan (in minutes). Expired subscriptions will be cancelled automatically.
KAFKA_TYPE_INFERENCE_FETCH_TIMEOUT: Maximum time to collect Kafka messages before type inferencing on the set of them.
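For a Kafka-backed subscription, the Kafka-related options might be combined as in this hedged sketch; the data source name and consumer group id are hypothetical, and the subscribe option is documented further below.

```java
options.put(Options.DATASOURCE_NAME, "kafka_orders_ds");           // hypothetical Kafka data source
options.put(Options.SUBSCRIBE, Options.TRUE);                      // keep polling the topic for new data
options.put(Options.KAFKA_GROUP_ID, "orders_loader");              // hypothetical consumer group id
options.put(Options.KAFKA_OFFSET_RESET_POLICY, Options.EARLIEST);  // start from the earliest offset
options.put(Options.KAFKA_CONSUMERS_PER_RANK, "2");
options.put(Options.KAFKA_SUBSCRIPTION_CANCEL_AFTER, "1440");      // cancel the subscription after 24 hours
```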
LAYER: Geo files layer(s) name(s), comma-separated.
LOADING_MODE: Scheme for distributing the extraction and loading of data from the source data file(s). This option applies only when loading files that are local to the database.
Supported values:
HEAD: The head node loads all data. All files must be available to the head node.
DISTRIBUTED_SHARED: The head node coordinates loading data by worker processes across all nodes from shared files available to all workers.
NOTE: Instead of existing on a shared source, the files can be duplicated on a source local to each host to improve performance, though the files must appear as the same data set from the perspective of all hosts performing the load.
DISTRIBUTED_LOCAL: A single worker process on each node loads all files that are available to it. This option works best when each worker loads files from its own file system, to maximize performance. In order to avoid data duplication, either each worker performing the load needs to have visibility to a set of files unique to it (no file is visible to more than one node) or the target table needs to have a primary key (which will allow the worker to automatically deduplicate data).
NOTE: If the target table doesn't exist, the table structure will be determined by the head node. If the head node has no files local to it, it will be unable to determine the structure and the request will fail. If the head node is configured to have no worker processes, no data strictly accessible to the head node will be loaded.
The default value is HEAD.
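As a brief sketch, distributing the load across workers that each see their own files could be requested as follows, assuming the target table has a primary key so that any duplicate visibility is deduplicated as described above.

```java
// Each node's worker loads the files visible to that node; a primary key on the
// target table (or per-node-unique file sets) guards against duplicate loading
options.put(Options.LOADING_MODE, Options.DISTRIBUTED_LOCAL);
```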
LOCAL_TIME_OFFSET: Apply an offset to Avro local timestamp columns.
MAX_RECORDS_TO_LOAD: Limit the number of records to load in this request: if this number is larger than batch_size, then the number of records loaded will be limited to the next whole number of batch_size (per working thread).
NUM_TASKS_PER_RANK: Number of tasks for reading file per rank. Default will be the system configuration parameter, external_file_reader_num_tasks.
POLL_INTERVAL: If true, the number of seconds between attempts to load external files into the table. If zero, polling will be continuous as long as data is found. If no data is found, the interval will steadily increase to a maximum of 60 seconds. The default value is '0'.
PRIMARY_KEYS: Comma-separated list of column names to set as primary keys, when not specified in the type.
REFRESH_METHOD: Method by which the table can be refreshed from its source data.
Supported values:
MANUAL: Refresh only occurs when manually requested by invoking the refresh action of GPUdb.alterTable(AlterTableRequest) on this table.
ON_START: Refresh table on database startup and when manually requested by invoking the refresh action of GPUdb.alterTable(AlterTableRequest) on this table.
The default value is MANUAL.
SCHEMA_REGISTRY_SCHEMA_NAME: Name of the Avro schema in the schema registry to use when reading Avro records.
SHARD_KEYS: Comma-separated list of column names to set as shard keys, when not specified in the type.
SKIP_LINES: Skip number of lines from beginning of file.
SUBSCRIBE: Continuously poll the data source to check for new data and load it into the table.
Supported values: TRUE, FALSE. The default value is FALSE.
TABLE_INSERT_MODE: Insertion scheme to use when inserting records from multiple shapefiles.
Supported values:
SINGLE: Insert all records into a single table.
TABLE_PER_FILE: Insert records from each file into a new table corresponding to that file.
The default value is SINGLE.
TEXT_COMMENT_STRING: Specifies the character string that should be interpreted as a comment line prefix in the source data. All lines in the data starting with the provided string are ignored.
For delimited_text file_type only. The default value is '#'.
TEXT_DELIMITER: Specifies the character delimiting field values in the source data and field names in the header (if present).
For delimited_text file_type only. The default value is ','.
TEXT_ESCAPE_CHARACTER: Specifies the character that is used to escape other characters in the source data.
An 'a', 'b', 'f', 'n', 'r', 't', or 'v' preceded by an escape character will be interpreted as the ASCII bell, backspace, form feed, line feed, carriage return, horizontal tab, & vertical tab, respectively. For example, the escape character followed by an 'n' will be interpreted as a newline within a field value.
The escape character can also be used to escape the quoting character, and will be treated as an escape character whether it is within a quoted field value or not.
For delimited_text file_type only.
TEXT_HAS_HEADER: Indicates whether the source data contains a header row.
For delimited_text file_type only.
Supported values: TRUE, FALSE. The default value is TRUE.
TEXT_HEADER_PROPERTY_DELIMITER: Specifies the delimiter for column properties in the header row (if present). Cannot be set to the same value as text_delimiter.
For delimited_text file_type only. The default value is '|'.
TEXT_NULL_STRING: Specifies the character string that should be interpreted as a null value in the source data.
For delimited_text file_type only. The default value is '\\N'.
TEXT_QUOTE_CHARACTER: Specifies the character that should be interpreted as a field value quoting character in the source data. The character must appear at beginning and end of field value to take effect. Delimiters within quoted fields are treated as literals and not delimiters. Within a quoted field, two consecutive quote characters will be interpreted as a single literal quote character, effectively escaping it. To not have a quote character, specify an empty string.
For delimited_text file_type only. The default value is '"'.
TEXT_SEARCH_COLUMNS: Add 'text_search' property to internally inferred string columns. Comma-separated list of column names or '*' for all columns. To add the 'text_search' property only to string columns greater than or equal to a minimum size, also set text_search_min_column_length.
TEXT_SEARCH_MIN_COLUMN_LENGTH: Set the minimum column size for strings to apply the 'text_search' property to. Used only when text_search_columns has a value.
TRUNCATE_STRINGS: If set to true, truncate string values that are longer than the column's type size.
Supported values: TRUE, FALSE. The default value is FALSE.
TRUNCATE_TABLE: If set to true, truncates the table specified by tableName prior to loading the file(s).
Supported values: TRUE, FALSE. The default value is FALSE.
TYPE_INFERENCE_MODE: Optimize type inferencing for either speed or accuracy.
Supported values:
ACCURACY: Scans data to get exactly-typed & sized columns for all data scanned.
SPEED: Scans data and picks the widest possible column types so that 'all' values will fit with minimum data scanned.
The default value is SPEED.
REMOTE_QUERY: Remote SQL query from which data will be sourced.
REMOTE_QUERY_FILTER_COLUMN: Name of column to be used for splitting remote_query into multiple sub-queries using the data distribution of the given column.
REMOTE_QUERY_INCREASING_COLUMN: Column on subscribed remote query result that will increase for new records (e.g., TIMESTAMP).
REMOTE_QUERY_PARTITION_COLUMN: Alias name for remote_query_filter_column.
UPDATE_ON_EXISTING_PK: Specifies the record collision policy for inserting into a table with a primary key. If set to true, any existing table record with primary key values that match those of a record being inserted will be replaced by that new record (the new data will be 'upserted'). If set to false, any existing table record with primary key values that match those of a record being inserted will remain unchanged, while the new record will be rejected and the error handled as determined by ignore_existing_pk & error_handling. If the specified table does not have a primary key, then this option has no effect.
Supported values:
TRUE: Upsert new records when primary keys match existing records.
FALSE: Reject new records when primary keys match existing records.
The default value is FALSE.
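Sketching the two collision policies against the same options map: upsert mode replaces matching records, while the commented alternative keeps existing records and suppresses collision errors on new ones.

```java
// Upsert: replace existing records whose primary key values match incoming records
options.put(Options.UPDATE_ON_EXISTING_PK, Options.TRUE);

// Alternative: keep existing records and ignore colliding inserts without error
// options.put(Options.UPDATE_ON_EXISTING_PK, Options.FALSE);
// options.put(Options.IGNORE_EXISTING_PK, Options.TRUE);
```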
The default value is an empty Map.
A set of string constants for the parameter options.
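Putting the pieces together, here is a hedged end-to-end sketch; the connection URL, data source, file path, and table name are hypothetical, and the five-argument createTableExternal call shown here should be checked against the GPUdb Java client version in use.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import com.gpudb.GPUdb;
import com.gpudb.protocol.CreateTableExternalRequest.Options;

public class CreateTableExternalSketch {
    public static void main(String[] args) throws Exception {
        GPUdb db = new GPUdb("http://localhost:9191");   // hypothetical endpoint

        Map<String, String> options = new HashMap<>();
        options.put(Options.DATASOURCE_NAME, "orders_ds");               // hypothetical data source
        options.put(Options.FILE_TYPE, Options.DELIMITED_TEXT);
        options.put(Options.ERROR_HANDLING, Options.IGNORE_BAD_RECORDS);
        options.put(Options.EXTERNAL_TABLE_TYPE, Options.MATERIALIZED);
        options.put(Options.REFRESH_METHOD, Options.ON_START);

        // table name, file paths, per-column modifications, create-table options,
        // and the options documented above
        db.createTableExternal(
            "example.orders_ext",                   // hypothetical table name
            Arrays.asList("data/orders/*.csv"),     // hypothetical file path on the data source
            Collections.emptyMap(),                 // modifyColumns: none
            Collections.emptyMap(),                 // createTableOptions: defaults
            options);
    }
}
```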
Field Summary

Modifier and Type | Field and Description
---|---
static String | ABORT: Stops current insertion and aborts entire operation when an error is encountered.
static String | ACCURACY: Scans data to get exactly-typed & sized columns for all data scanned.
static String | AUTO: Auto detect compression type.
static String | AVRO: Avro file format.
static String | BAD_RECORD_TABLE_LIMIT: A positive integer indicating the maximum number of records that can be written to the bad-record table.
static String | BAD_RECORD_TABLE_LIMIT_PER_INPUT: For subscriptions, a positive integer indicating the maximum number of records that can be written to the bad-record table per file/payload.
static String | BAD_RECORD_TABLE_NAME: Name of a table to which records that were rejected are written.
static String | BATCH_SIZE: Number of records to insert per batch when inserting data.
static String | BZIP2: bzip2 file compression.
static String | COLUMN_FORMATS: For each target column specified, applies the column-property-bound format to the source data loaded into that column.
static String | COLUMNS_TO_LOAD: Specifies a comma-delimited list of columns from the source data to load.
static String | COLUMNS_TO_SKIP: Specifies a comma-delimited list of columns from the source data to skip.
static String | COMPRESSION_TYPE: Source data compression type.
static String | DATASOURCE_NAME: Name of an existing external data source from which the data file(s) specified in filepaths will be loaded.
static String | DEFAULT_COLUMN_FORMATS: Specifies the default format to be applied to source data loaded into columns with the corresponding column property.
static String | DELIMITED_TEXT: Delimited text file format; e.g., CSV, TSV, PSV, etc.
static String | DISTRIBUTED_LOCAL: A single worker process on each node loads all files that are available to it.
static String | DISTRIBUTED_SHARED: The head node coordinates loading data by worker processes across all nodes from shared files available to all workers.
static String | DRY_RUN: Does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of error_handling.
static String | EARLIEST
static String | ERROR_HANDLING: Specifies how errors should be handled upon insertion.
static String | EXTERNAL_TABLE_TYPE: Specifies whether the external table holds a local copy of the external data.
static String | FALSE: Reject new records when primary keys match existing records.
static String | FILE_TYPE: Specifies the type of the file(s) whose records will be inserted.
static String | FULL: Run a type inference on the source data (if needed) and ingest.
static String | GDAL_CONFIGURATION_OPTIONS: Comma-separated list of GDAL configuration options for the specific request, in key=value form.
static String | GDB: Esri/GDB file format.
static String | GZIP: gzip file compression.
static String | HEAD: The head node loads all data.
static String | IGNORE_BAD_RECORDS: Malformed records are skipped.
static String | IGNORE_EXISTING_PK: Specifies the record collision error-suppression policy for inserting into a table with a primary key.
static String | INGESTION_MODE: Whether to do a full load, dry run, or perform a type inference on the source data.
static String | JDBC_FETCH_SIZE: The JDBC fetch size, which determines how many rows to fetch per round trip.
static String | JSON: JSON file format.
static String | KAFKA_CONSUMERS_PER_RANK: Number of Kafka consumer threads per rank (valid range 1-6).
static String | KAFKA_GROUP_ID: The group id to be used when consuming data from a Kafka topic (valid only for Kafka datasource subscriptions).
static String | KAFKA_OFFSET_RESET_POLICY: Policy to determine whether the Kafka data consumption starts either at earliest offset or latest offset.
static String | KAFKA_OPTIMISTIC_INGEST: Enable optimistic ingestion where Kafka topic offsets and table data are committed independently to achieve parallelism.
static String | KAFKA_SUBSCRIPTION_CANCEL_AFTER: Sets the Kafka subscription lifespan (in minutes).
static String | KAFKA_TYPE_INFERENCE_FETCH_TIMEOUT: Maximum time to collect Kafka messages before type inferencing on the set of them.
static String | LATEST
static String | LAYER: Geo files layer(s) name(s), comma-separated.
static String | LOADING_MODE: Scheme for distributing the extraction and loading of data from the source data file(s).
static String | LOCAL_TIME_OFFSET: Apply an offset to Avro local timestamp columns.
static String | LOGICAL: External data will not be loaded into the database; the data will be retrieved from the source upon servicing each query against the external table.
static String | MANUAL: Refresh only occurs when manually requested by invoking the refresh action of GPUdb.alterTable(AlterTableRequest) on this table.
static String | MATERIALIZED: Loads a copy of the external data into the database, refreshed on demand.
static String | MAX_RECORDS_TO_LOAD: Limit the number of records to load in this request: if this number is larger than batch_size, then the number of records loaded will be limited to the next whole number of batch_size (per working thread).
static String | NONE: No compression.
static String | NUM_TASKS_PER_RANK: Number of tasks for reading file per rank.
static String | ON_START: Refresh table on database startup and when manually requested by invoking the refresh action of GPUdb.alterTable(AlterTableRequest) on this table.
static String | PARQUET: Apache Parquet file format.
static String | PERMISSIVE: Records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped.
static String | POLL_INTERVAL: If true, the number of seconds between attempts to load external files into the table.
static String | PRIMARY_KEYS: Comma-separated list of column names to set as primary keys, when not specified in the type.
static String | REFRESH_METHOD: Method by which the table can be refreshed from its source data.
static String | REMOTE_QUERY: Remote SQL query from which data will be sourced.
static String | REMOTE_QUERY_FILTER_COLUMN: Name of column to be used for splitting remote_query into multiple sub-queries using the data distribution of the given column.
static String | REMOTE_QUERY_INCREASING_COLUMN: Column on subscribed remote query result that will increase for new records (e.g., TIMESTAMP).
static String | REMOTE_QUERY_PARTITION_COLUMN: Alias name for remote_query_filter_column.
static String | SCHEMA_REGISTRY_SCHEMA_NAME: Name of the Avro schema in the schema registry to use when reading Avro records.
static String | SHAPEFILE: ShapeFile file format.
static String | SHARD_KEYS: Comma-separated list of column names to set as shard keys, when not specified in the type.
static String | SINGLE: Insert all records into a single table.
static String | SKIP_LINES: Skip number of lines from beginning of file.
static String | SPEED: Scans data and picks the widest possible column types so that 'all' values will fit with minimum data scanned.
static String | SUBSCRIBE: Continuously poll the data source to check for new data and load it into the table.
static String | TABLE_INSERT_MODE: Insertion scheme to use when inserting records from multiple shapefiles.
static String | TABLE_PER_FILE: Insert records from each file into a new table corresponding to that file.
static String | TEXT_COMMENT_STRING: Specifies the character string that should be interpreted as a comment line prefix in the source data.
static String | TEXT_DELIMITER: Specifies the character delimiting field values in the source data and field names in the header (if present).
static String | TEXT_ESCAPE_CHARACTER: Specifies the character that is used to escape other characters in the source data.
static String | TEXT_HAS_HEADER: Indicates whether the source data contains a header row.
static String | TEXT_HEADER_PROPERTY_DELIMITER: Specifies the delimiter for column properties in the header row (if present).
static String | TEXT_NULL_STRING: Specifies the character string that should be interpreted as a null value in the source data.
static String | TEXT_QUOTE_CHARACTER: Specifies the character that should be interpreted as a field value quoting character in the source data.
static String | TEXT_SEARCH_COLUMNS: Add 'text_search' property to internally inferred string columns.
static String | TEXT_SEARCH_MIN_COLUMN_LENGTH: Set the minimum column size for strings to apply the 'text_search' property to.
static String | TRUE: Upsert new records when primary keys match existing records.
static String | TRUNCATE_STRINGS: If set to true, truncate string values that are longer than the column's type size.
static String | TRUNCATE_TABLE: If set to true, truncates the table specified by tableName prior to loading the file(s).
static String | TYPE_INFERENCE_MODE: Optimize type inferencing for either speed or accuracy.
static String | TYPE_INFERENCE_ONLY: Infer the type of the source data and return, without ingesting any data.
static String | UPDATE_ON_EXISTING_PK: Specifies the record collision policy for inserting into a table with a primary key.