InsertRecordsFromFilesRequest.Options

java.lang.Object

com.gpudb.protocol.InsertRecordsFromFilesRequest.Options

Enclosing class:

InsertRecordsFromFilesRequest

public static final class InsertRecordsFromFilesRequest.Options extends Object

A set of string constants for the InsertRecordsFromFilesRequest parameter options.

Optional parameters.

Field Summary
Fields
Modifier and Type
Field
Description
static final String
ABORT
Stops current insertion and aborts entire operation when an error is encountered.
static final String
ACCURACY
Scans data to get exactly-typed and sized columns for all data scanned.
static final String
AUTO
Auto detect compression type.
static final String
AVRO
Avro file format.
static final String
BAD_RECORD_TABLE_LIMIT
A positive integer indicating the maximum number of records that can be written to the bad-record-table.
static final String
BAD_RECORD_TABLE_LIMIT_PER_INPUT
For subscriptions, a positive integer indicating the maximum number of records that can be written to the bad-record-table per file/payload.
static final String
BAD_RECORD_TABLE_NAME
Name of a table to which records that were rejected are written.
static final String
BATCH_SIZE
Number of records to insert per batch when inserting data.
static final String
BZIP2
bzip2 file compression.
static final String
COLUMN_FORMATS
For each target column specified, applies the column-property-bound format to the source data loaded into that column.
static final String
COLUMNS_TO_LOAD
Specifies a comma-delimited list of columns from the source data to load.
static final String
COLUMNS_TO_SKIP
Specifies a comma-delimited list of columns from the source data to skip.
static final String
COMPRESSION_TYPE
Source data compression type.
static final String
DATASOURCE_NAME
Name of an existing external data source from which data file(s) specified in filepaths will be loaded.
static final String
DEFAULT_COLUMN_FORMATS
Specifies the default format to be applied to source data loaded into columns with the corresponding column property.
static final String
DELIMITED_TEXT
Delimited text file format; e.g., CSV, TSV, PSV, etc.
static final String
DISTRIBUTED_LOCAL
A single worker process on each node loads all files that are available to it.
static final String
DISTRIBUTED_SHARED
The head node coordinates loading data by worker processes across all nodes from shared files available to all workers.
static final String
DRY_RUN
Does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of ERROR_HANDLING.
static final String
EARLIEST

static final String
ENABLE_INPLACE_UPDATES
Applies only when upserting (when update_on_existing_pk is true).
static final String
ERROR_HANDLING
Specifies how errors should be handled upon insertion.
static final String
FALSE
Reject new records when primary keys match existing records.
static final String
FILE_TYPE
Specifies the type of the file(s) whose records will be inserted.
static final String
FLATTEN_COLUMNS
Specifies how to handle nested columns.
static final String
FULL
Run a type inference on the source data (if needed) and ingest.
static final String
GDAL_CONFIGURATION_OPTIONS
Comma-separated list of GDAL configuration options for this request, each in key=value form.
static final String
GDB
Esri/GDB file format.
static final String
GZIP
gzip file compression.
static final String
HEAD
The head node loads all data.
static final String
IGNORE_BAD_RECORDS
Alias for SKIP.
static final String
IGNORE_EXISTING_PK
Specifies the record collision error-suppression policy for inserting into a table with a primary key, only used when not in upsert mode (upsert mode is disabled when UPDATE_ON_EXISTING_PK is FALSE).
static final String
INGESTION_MODE
Whether to do a full load, dry run, or perform a type inference on the source data.
static final String
JSON
JSON file format.
static final String
KAFKA_CONSUMERS_PER_RANK
Number of Kafka consumer threads per rank (valid range 1-6).
static final String
KAFKA_GROUP_ID
The group id to be used when consuming data from a Kafka topic (valid only for Kafka datasource subscriptions).
static final String
KAFKA_OFFSET_RESET_POLICY
Policy to determine whether the Kafka data consumption starts either at earliest offset or latest offset.
static final String
KAFKA_OPTIMISTIC_INGEST
Enable optimistic ingestion where Kafka topic offsets and table data are committed independently to achieve parallelism.
static final String
KAFKA_SUBSCRIPTION_CANCEL_AFTER
Sets the Kafka subscription lifespan (in minutes).
static final String
KAFKA_TYPE_INFERENCE_FETCH_TIMEOUT
Maximum time to collect Kafka messages before type inferencing on the set of them.
static final String
LATEST

static final String
LAYER
Comma-separated list of geo file layer names.
static final String
LOADING_MODE
Scheme for distributing the extraction and loading of data from the source data file(s).
static final String
LOCAL_TIME_OFFSET
Apply an offset to Avro local timestamp columns.
static final String
MAX_CONSECUTIVE_INVALID_SCHEMA_FAILURE
Maximum number of records to skip due to schema-related errors before failing.
static final String
MAX_RECORDS_TO_LOAD
Limit the number of records to load in this request: if this number is larger than BATCH_SIZE, then the number of records loaded will be limited to the next whole number of BATCH_SIZE (per working thread).
static final String
NAME_COLUMNS_FROM_FILE
Specifies a comma-delimited list of column names to be used as the source-data column names.
static final String
NONE
No compression.
static final String
NUM_TASKS_PER_RANK
Number of tasks for reading file per rank.
static final String
PARQUET
Apache Parquet file format.
static final String
PERMISSIVE
Records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped.
static final String
POLL_INTERVAL
If SUBSCRIBE is TRUE, the number of seconds between attempts to load external files into the table.
static final String
PRIMARY_KEYS
Comma separated list of column names to set as primary keys, when not specified in the type.
static final String
SCHEMA_REGISTRY_CONNECTION_RETRIES
Number of Confluent Schema Registry connection retries.
static final String
SCHEMA_REGISTRY_CONNECTION_TIMEOUT
Confluent Schema registry connection timeout (in secs).
static final String
SCHEMA_REGISTRY_MAX_CONSECUTIVE_CONNECTION_FAILURES
Maximum number of records to skip due to Schema Registry connection failures before failing.
static final String
SCHEMA_REGISTRY_SCHEMA_NAME
Name of the Avro schema in the schema registry to use when reading Avro records.
static final String
SHAPEFILE
ShapeFile file format.
static final String
SHARD_KEYS
Comma separated list of column names to set as shard keys, when not specified in the type.
static final String
SINGLE
Insert all records into a single table.
static final String
SKIP
Malformed records are skipped.
static final String
SKIP_LINES
Skip a number of lines from the beginning of the file.
static final String
SPEED
Scans data and picks the widest possible column types so that ‘all’ values will fit with minimum data scanned.
static final String
START_OFFSETS
Starting offsets by partition to fetch from Kafka.
static final String
SUBSCRIBE
Continuously poll the data source to check for new data and load it into the table.
static final String
TABLE_INSERT_MODE
Insertion scheme to use when inserting records from multiple shapefiles.
static final String
TABLE_PER_FILE
Insert records from each file into a new table corresponding to that file.
static final String
TEXT_COMMENT_STRING
Specifies the character string that should be interpreted as a comment line prefix in the source data.
static final String
TEXT_DELIMITER
Specifies the character delimiting field values in the source data and field names in the header (if present).
static final String
TEXT_ESCAPE_CHARACTER
Specifies the character that is used to escape other characters in the source data.
static final String
TEXT_HAS_HEADER
Indicates whether the source data contains a header row.
static final String
TEXT_HEADER_PROPERTY_DELIMITER
Specifies the delimiter for column properties in the header row (if present).
static final String
TEXT_NULL_STRING
Specifies the character string that should be interpreted as a null value in the source data.
static final String
TEXT_QUOTE_CHARACTER
Specifies the character that should be interpreted as a field value quoting character in the source data.
static final String
TEXT_SEARCH_COLUMNS
Add the ‘text_search’ property to internally inferred string columns.
static final String
TEXT_SEARCH_MIN_COLUMN_LENGTH
Set the minimum column size for strings to apply the ‘text_search’ property to.
static final String
TRANSFORMATIONS
Comma-separated expressions, one per target table column.
static final String
TRIM_SPACE
If set to TRUE, remove leading or trailing space from fields.
static final String
TRUE
Upsert new records when primary keys match existing records.
static final String
TRUNCATE_STRINGS
If set to TRUE, truncate string values that are longer than the column’s type size.
static final String
TRUNCATE_TABLE
If set to TRUE, truncates the table specified by tableName prior to loading the file(s).
static final String
TYPE_INFERENCE_MAX_RECORDS_READ

static final String
TYPE_INFERENCE_MODE
Optimize type inferencing for either speed or accuracy.
static final String
TYPE_INFERENCE_ONLY
Infer the type of the source data and return, without ingesting any data.
static final String
UPDATE_ON_EXISTING_PK
Specifies the record collision policy for inserting into a table with a primary key.
Method Summary

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- BAD_RECORD_TABLE_NAME
  public static final String BAD_RECORD_TABLE_NAME
  Name of a table to which records that were rejected are written. The bad-record-table has the following columns: line_number (long), line_rejected (string), error_message (string). When ERROR_HANDLING is ABORT, the bad-record-table is not populated.
  See Also:
  Constant Field Values
- BAD_RECORD_TABLE_LIMIT
  public static final String BAD_RECORD_TABLE_LIMIT
  A positive integer indicating the maximum number of records that can be written to the bad-record-table. The default value is ‘10000’.
  See Also:
  Constant Field Values
- BAD_RECORD_TABLE_LIMIT_PER_INPUT
  public static final String BAD_RECORD_TABLE_LIMIT_PER_INPUT
  For subscriptions, a positive integer indicating the maximum number of records that can be written to the bad-record-table per file/payload. The default value is the value of BAD_RECORD_TABLE_LIMIT, and the total size of the table per rank is limited to BAD_RECORD_TABLE_LIMIT.
  See Also:
  Constant Field Values
- BATCH_SIZE
  public static final String BATCH_SIZE
  Number of records to insert per batch when inserting data. The default value is ‘50000’.
  See Also:
  Constant Field Values
- COLUMN_FORMATS
  public static final String COLUMN_FORMATS
  For each target column specified, applies the column-property-bound format to the source data loaded into that column. Each column format will contain a mapping of one or more of its column properties to an appropriate format for each property. Currently supported column properties include date, time, and datetime. The parameter value must be formatted as a JSON string of maps of column names to maps of column properties to their corresponding column formats, e.g., ‘{ "order_date" : { "date" : "%Y.%m.%d" }, "order_time" : { "time" : "%H:%M:%S" } }’.
  See DEFAULT_COLUMN_FORMATS for valid format syntax.
  See Also:
  Constant Field Values
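The nested shape of the COLUMN_FORMATS value (column name, then column property, then format) can be easier to see when built programmatically. The following is a minimal sketch, not part of the API: the class `ColumnFormatsSketch` and its `toJson` helper are hypothetical, and the helper does no JSON escaping, so it assumes column names and formats contain no double quotes or backslashes.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the GPUdb API.
final class ColumnFormatsSketch {

    /**
     * Serializes column name -> (column property -> format) as a JSON string.
     * Assumes no value needs JSON escaping (no quotes or backslashes).
     */
    static String toJson(Map<String, Map<String, String>> formats) {
        StringJoiner outer = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, Map<String, String>> column : formats.entrySet()) {
            StringJoiner inner = new StringJoiner(", ", "{", "}");
            for (Map.Entry<String, String> property : column.getValue().entrySet()) {
                inner.add("\"" + property.getKey() + "\": \"" + property.getValue() + "\"");
            }
            outer.add("\"" + column.getKey() + "\": " + inner);
        }
        return outer.toString();
    }
}
```

The resulting string would be supplied as the value for the COLUMN_FORMATS key in the request's options map.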
- COLUMNS_TO_LOAD
  public static final String COLUMNS_TO_LOAD
  Specifies a comma-delimited list of columns from the source data to load. If more than one file is being loaded, this list applies to all files.
  Column numbers can be specified discretely or as a range. For example, a value of ‘5,7,1..3’ will insert values from the fifth column in the source data into the first column in the target table, from the seventh column in the source data into the second column in the target table, and from the first through third columns in the source data into the third through fifth columns in the target table.
  
  If the source data contains a header, column names matching the file header names may be provided instead of column numbers. If the target table doesn’t exist, the table will be created with the columns in this order. If the target table does exist with columns in a different order than the source data, this list can be used to match the order of the target table. For example, a value of ‘C, B, A’ will create a three column table with column C, followed by column B, followed by column A; or will insert those fields in that order into a table created with columns in that order. If the target table exists, the column names must match the source data field names for a name-mapping to be successful.
  
  Mutually exclusive with COLUMNS_TO_SKIP.
  See Also:
  Constant Field Values
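To make the positional mapping concrete, the sketch below expands a COLUMNS_TO_LOAD value such as ‘5,7,1..3’ into the ordered list of source column numbers; the position of each number in the result is the target column it feeds. This only illustrates the rule described above and is not the server's parser. The class and method names are hypothetical, and only ascending numeric ranges are handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the documented mapping, not the server's parser.
final class ColumnsToLoadSketch {

    /** Expands a numeric column spec such as "5,7,1..3" into source column numbers, in order. */
    static List<Integer> expand(String spec) {
        List<Integer> sourceColumns = new ArrayList<>();
        for (String part : spec.split(",")) {
            String token = part.trim();
            int dots = token.indexOf("..");
            if (dots >= 0) {
                // A range "a..b" contributes a, a+1, ..., b (ascending ranges only).
                int from = Integer.parseInt(token.substring(0, dots).trim());
                int to = Integer.parseInt(token.substring(dots + 2).trim());
                for (int column = from; column <= to; column++) {
                    sourceColumns.add(column);
                }
            } else {
                sourceColumns.add(Integer.parseInt(token));
            }
        }
        return sourceColumns;
    }

    public static void main(String[] args) {
        List<Integer> columns = expand("5,7,1..3");
        for (int i = 0; i < columns.size(); i++) {
            System.out.println("source column " + columns.get(i) + " -> target column " + (i + 1));
        }
    }
}
```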
- COLUMNS_TO_SKIP
  public static final String COLUMNS_TO_SKIP
  Specifies a comma-delimited list of columns from the source data to skip. Mutually exclusive with COLUMNS_TO_LOAD.
  See Also:
  Constant Field Values
- COMPRESSION_TYPE
  public static final String COMPRESSION_TYPE
  Source data compression type. Supported values:
  NONE: No compression.
  AUTO: Auto detect compression type.
  GZIP: gzip file compression.
  BZIP2: bzip2 file compression.
  The default value is AUTO.
  See Also:
  Constant Field Values
- NONE
  public static final String NONE
  No compression.
  See Also:
  Constant Field Values
- AUTO
  public static final String AUTO
  Auto detect compression type.
  See Also:
  Constant Field Values
- GZIP
  public static final String GZIP
  gzip file compression.
  See Also:
  Constant Field Values
- BZIP2
  public static final String BZIP2
  bzip2 file compression.
  See Also:
  Constant Field Values
- DATASOURCE_NAME
  public static final String DATASOURCE_NAME
  Name of an existing external data source from which data file(s) specified in filepaths will be loaded.
  See Also:
  Constant Field Values
- DEFAULT_COLUMN_FORMATS
  public static final String DEFAULT_COLUMN_FORMATS
  Specifies the default format to be applied to source data loaded into columns with the corresponding column property. Currently supported column properties include date, time, and datetime. This default column-property-bound format can be overridden by specifying a column property and format for a given target column in COLUMN_FORMATS. For each specified annotation, the format will apply to all columns with that annotation unless a custom COLUMN_FORMATS for that annotation is specified.
  The parameter value must be formatted as a JSON string that is a map of column properties to their respective column formats, e.g., ‘{ "date" : "%Y.%m.%d", "time" : "%H:%M:%S" }’. Column formats are specified as a string of control characters and plain text. The supported control characters are ‘Y’, ‘m’, ‘d’, ‘H’, ‘M’, and ‘S’, which follow the Linux ‘strptime()’ specification, as well as ‘s’, which specifies seconds and fractional seconds (though the fractional component will be truncated past milliseconds).
  
  Formats for the ‘date’ annotation must include the ‘Y’, ‘m’, and ‘d’ control characters. Formats for the ‘time’ annotation must include the ‘H’, ‘M’, and either ‘S’ or ‘s’ (but not both) control characters. Formats for the ‘datetime’ annotation must meet both the ‘date’ and ‘time’ control character requirements. For example, ‘{ "datetime" : "%m/%d/%Y %H:%M:%S" }’ would be used to interpret text such as “05/04/2000 12:12:11”.
  See Also:
  Constant Field Values
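The control character requirements above can be checked on the client before a request is sent. The following is a rough sketch under the rules as stated here; `ColumnFormatCheck` and its methods are hypothetical names, and the database may apply further validation that this does not cover.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical client-side check of the documented control character rules.
final class ColumnFormatCheck {

    /** Collects the characters that directly follow a '%' in the format. */
    static Set<Character> controlChars(String format) {
        Set<Character> found = new HashSet<>();
        for (int i = 0; i + 1 < format.length(); i++) {
            if (format.charAt(i) == '%') {
                found.add(format.charAt(i + 1));
                i++; // skip the control character itself
            }
        }
        return found;
    }

    static boolean meetsDateRule(Set<Character> c) {
        return c.contains('Y') && c.contains('m') && c.contains('d');
    }

    static boolean meetsTimeRule(Set<Character> c) {
        // Either 'S' or 's' is required, but not both.
        return c.contains('H') && c.contains('M') && (c.contains('S') ^ c.contains('s'));
    }

    /** Returns true if the format satisfies the rule for "date", "time", or "datetime". */
    static boolean isValid(String property, String format) {
        Set<Character> c = controlChars(format);
        switch (property) {
            case "date":
                return meetsDateRule(c);
            case "time":
                return meetsTimeRule(c);
            case "datetime":
                return meetsDateRule(c) && meetsTimeRule(c);
            default:
                return false; // unsupported column property
        }
    }
}
```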
- ERROR_HANDLING
  public static final String ERROR_HANDLING
  Specifies how errors should be handled upon insertion. Supported values:
  PERMISSIVE: Records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped.
  SKIP: Malformed records are skipped.
  IGNORE_BAD_RECORDS: Alias for SKIP.
  ABORT: Stops current insertion and aborts entire operation when an error is encountered. Primary key collisions are considered abortable errors in this mode.
  The default value is ABORT.
  See Also:
  Constant Field Values
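As an illustration of how ERROR_HANDLING combines with the bad-record-table options, the sketch below builds an options map that skips malformed records and captures them in a table. The string literals are assumptions: they are taken to be the lowercase forms of the constant names, and real code should reference the constants in this class instead. The class name is hypothetical, as is any table name passed in.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; string literals stand in for the Options constants.
final class ErrorHandlingOptionsSketch {

    /** Builds options that skip bad records and log up to 'limit' of them to a table. */
    static Map<String, String> skipWithBadRecordTable(String badRecordTable, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be a positive integer");
        }
        Map<String, String> options = new LinkedHashMap<>();
        options.put("error_handling", "skip");                 // assumed value of ERROR_HANDLING / SKIP
        options.put("bad_record_table_name", badRecordTable);  // assumed value of BAD_RECORD_TABLE_NAME
        options.put("bad_record_table_limit", Integer.toString(limit)); // assumed value of BAD_RECORD_TABLE_LIMIT
        return options;
    }
}
```

Under ABORT, the bad-record-table is not populated, so a map like this is only useful with PERMISSIVE or SKIP.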
- PERMISSIVE
  public static final String PERMISSIVE
  Records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped.
  See Also:
  Constant Field Values
- SKIP
  public static final String SKIP
  Malformed records are skipped.
  See Also:
  Constant Field Values
- IGNORE_BAD_RECORDS
  public static final String IGNORE_BAD_RECORDS
  Alias for SKIP.
  See Also:
  Constant Field Values
- ABORT
  public static final String ABORT
  Stops current insertion and aborts entire operation when an error is encountered. Primary key collisions are considered abortable errors in this mode.
  See Also:
  Constant Field Values
- FILE_TYPE
  public static final String FILE_TYPE
  Specifies the type of the file(s) whose records will be inserted. Supported values:
  AVRO: Avro file format.
  DELIMITED_TEXT: Delimited text file format; e.g., CSV, TSV, PSV, etc.
  GDB: Esri/GDB file format.
  JSON: JSON file format.
  PARQUET: Apache Parquet file format.
  SHAPEFILE: ShapeFile file format.
  The default value is DELIMITED_TEXT.
  See Also:
  Constant Field Values
- AVRO
  public static final String AVRO
  Avro file format.
  See Also:
  Constant Field Values
- DELIMITED_TEXT
  public static final String DELIMITED_TEXT
  Delimited text file format; e.g., CSV, TSV, PSV, etc.
  See Also:
  Constant Field Values
- GDB
  public static final String GDB
  Esri/GDB file format.
  See Also:
  Constant Field Values
- JSON
  public static final String JSON
  JSON file format.
  See Also:
  Constant Field Values
- PARQUET
  public static final String PARQUET
  Apache Parquet file format.
  See Also:
  Constant Field Values
- SHAPEFILE
  public static final String SHAPEFILE
  ShapeFile file format.
  See Also:
  Constant Field Values
- FLATTEN_COLUMNS
  public static final String FLATTEN_COLUMNS
  Specifies how to handle nested columns. Supported values:
  TRUE: Break up nested columns to multiple columns.
  FALSE: Treat nested columns as JSON columns instead of flattening.
  The default value is FALSE.
  See Also:
  Constant Field Values
- TRUE
  public static final String TRUE
  Upsert new records when primary keys match existing records.
  See Also:
  Constant Field Values
- FALSE
  public static final String FALSE
  Reject new records when primary keys match existing records.
  See Also:
  Constant Field Values
- GDAL_CONFIGURATION_OPTIONS
  public static final String GDAL_CONFIGURATION_OPTIONS
  Comma-separated list of GDAL configuration options for this request, each in key=value form.
  See Also:
  Constant Field Values
- IGNORE_EXISTING_PK
  public static final String IGNORE_EXISTING_PK
  Specifies the record collision error-suppression policy for inserting into a table with a primary key, only used when not in upsert mode (upsert mode is disabled when UPDATE_ON_EXISTING_PK is FALSE). If set to TRUE, any record being inserted that is rejected for having primary key values that match those of an existing table record will be ignored with no error generated. If FALSE, the rejection of any record for having primary key values matching an existing record will result in an error being reported, as determined by ERROR_HANDLING. If the specified table does not have a primary key or if upsert mode is in effect (UPDATE_ON_EXISTING_PK is TRUE), then this option has no effect. Supported values:
  TRUE: Ignore new records whose primary key values collide with those of existing records.
  FALSE: Treat as errors any new records whose primary key values collide with those of existing records.
  The default value is FALSE.
  See Also:
  Constant Field Values
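The interaction between UPDATE_ON_EXISTING_PK and IGNORE_EXISTING_PK can be summarized as a small decision function. This sketch only restates the rules described above; the enum and method names are hypothetical and nothing here calls the database.

```java
// Hypothetical restatement of the documented primary key collision rules.
final class PkCollisionSketch {

    enum Outcome {
        NOT_APPLICABLE,           // table has no primary key, so neither option has an effect
        UPSERT,                   // the existing record is updated with the new one
        IGNORED_SILENTLY,         // the new record is rejected with no error generated
        ERROR_PER_ERROR_HANDLING  // the rejection is reported as determined by ERROR_HANDLING
    }

    /** Decides what happens to a new record whose primary key matches an existing record. */
    static Outcome onCollision(boolean tableHasPrimaryKey,
                               boolean updateOnExistingPk,
                               boolean ignoreExistingPk) {
        if (!tableHasPrimaryKey) {
            return Outcome.NOT_APPLICABLE;
        }
        if (updateOnExistingPk) {
            return Outcome.UPSERT; // upsert mode: IGNORE_EXISTING_PK has no effect
        }
        return ignoreExistingPk ? Outcome.IGNORED_SILENTLY : Outcome.ERROR_PER_ERROR_HANDLING;
    }
}
```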
- INGESTION_MODE
  public static final String INGESTION_MODE
  Whether to do a full load, dry run, or perform a type inference on the source data. Supported values:
  FULL: Run a type inference on the source data (if needed) and ingest.
  DRY_RUN: Does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of ERROR_HANDLING.
  TYPE_INFERENCE_ONLY: Infer the type of the source data and return, without ingesting any data. The inferred type is returned in the response.
  The default value is FULL.
  See Also:
  Constant Field Values
- FULL
  public static final String FULL
  Run a type inference on the source data (if needed) and ingest.
  See Also:
  Constant Field Values
- DRY_RUN
  public static final String DRY_RUN
  Does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of ERROR_HANDLING.
  See Also:
  Constant Field Values
- TYPE_INFERENCE_ONLY
  public static final String TYPE_INFERENCE_ONLY
  Infer the type of the source data and return, without ingesting any data. The inferred type is returned in the response.
  See Also:
  Constant Field Values
- KAFKA_CONSUMERS_PER_RANK
  public static final String KAFKA_CONSUMERS_PER_RANK
  Number of Kafka consumer threads per rank (valid range 1-6). The default value is ‘1’.
  See Also:
  Constant Field Values
- KAFKA_GROUP_ID
  public static final String KAFKA_GROUP_ID
  The group id to be used when consuming data from a Kafka topic (valid only for Kafka datasource subscriptions).
  See Also:
  Constant Field Values
- KAFKA_OFFSET_RESET_POLICY
  public static final String KAFKA_OFFSET_RESET_POLICY
  Policy to determine whether the Kafka data consumption starts either at earliest offset or latest offset. Supported values:
  EARLIEST
  LATEST
  The default value is EARLIEST.
  See Also:
  Constant Field Values
- EARLIEST
  public static final String EARLIEST
  See Also:
  Constant Field Values
- LATEST
  public static final String LATEST
  See Also:
  Constant Field Values
- KAFKA_OPTIMISTIC_INGEST
  public static final String KAFKA_OPTIMISTIC_INGEST
  Enable optimistic ingestion where Kafka topic offsets and table data are committed independently to achieve parallelism. Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  See Also:
  Constant Field Values
- KAFKA_SUBSCRIPTION_CANCEL_AFTER
  public static final String KAFKA_SUBSCRIPTION_CANCEL_AFTER
  Sets the Kafka subscription lifespan (in minutes). Expired subscriptions are cancelled automatically.
  See Also:
  Constant Field Values
- KAFKA_TYPE_INFERENCE_FETCH_TIMEOUT
  public static final String KAFKA_TYPE_INFERENCE_FETCH_TIMEOUT
  Maximum time to collect Kafka messages before type inferencing on the set of them.
  See Also:
  Constant Field Values
- LAYER
  public static final String LAYER
  Comma-separated list of geo file layer names.
  See Also:
  Constant Field Values
- LOADING_MODE
  public static final String LOADING_MODE
  Scheme for distributing the extraction and loading of data from the source data file(s). This option applies only when loading files that are local to the database. Supported values:
  HEAD: The head node loads all data. All files must be available to the head node.
  DISTRIBUTED_SHARED: The head node coordinates loading data by worker processes across all nodes from shared files available to all workers. NOTE: Instead of existing on a shared source, the files can be duplicated on a source local to each host to improve performance, though the files must appear as the same data set from the perspective of all hosts performing the load.
  DISTRIBUTED_LOCAL: A single worker process on each node loads all files that are available to it. This option works best when each worker loads files from its own file system, to maximize performance. In order to avoid data duplication, either each worker performing the load needs to have visibility to a set of files unique to it (no file is visible to more than one node) or the target table needs to have a primary key (which will allow the worker to automatically deduplicate data). NOTE: If the target table doesn’t exist, the table structure will be determined by the head node. If the head node has no files local to it, it will be unable to determine the structure and the request will fail. If the head node is configured to have no worker processes, no data strictly accessible to the head node will be loaded.
  The default value is HEAD.
  See Also:
  Constant Field Values
- HEAD
  public static final String HEAD
  The head node loads all data. All files must be available to the head node.
  See Also:
  Constant Field Values
- DISTRIBUTED_SHARED
  public static final String DISTRIBUTED_SHARED
  The head node coordinates loading data by worker processes across all nodes from shared files available to all workers.
  NOTE:
  
  Instead of existing on a shared source, the files can be duplicated on a source local to each host to improve performance, though the files must appear as the same data set from the perspective of all hosts performing the load.
  See Also:
  Constant Field Values
- DISTRIBUTED_LOCAL
  public static final String DISTRIBUTED_LOCAL
  A single worker process on each node loads all files that are available to it. This option works best when each worker loads files from its own file system, to maximize performance. In order to avoid data duplication, either each worker performing the load needs to have visibility to a set of files unique to it (no file is visible to more than one node) or the target table needs to have a primary key (which will allow the worker to automatically deduplicate data).
  NOTE:
  
  If the target table doesn’t exist, the table structure will be determined by the head node. If the head node has no files local to it, it will be unable to determine the structure and the request will fail.
  
  If the head node is configured to have no worker processes, no data strictly accessible to the head node will be loaded.
  See Also:
  Constant Field Values
- LOCAL_TIME_OFFSET
  public static final String LOCAL_TIME_OFFSET
  Apply an offset to Avro local timestamp columns.
  See Also:
  Constant Field Values
- MAX_RECORDS_TO_LOAD
  public static final String MAX_RECORDS_TO_LOAD
  Limit the number of records to load in this request: if this number is larger than BATCH_SIZE, then the number of records loaded will be limited to the next whole number of BATCH_SIZE (per working thread).
  See Also:
  Constant Field Values
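One reading of the rounding rule above is that a limit larger than BATCH_SIZE is rounded up to the next whole multiple of BATCH_SIZE. The sketch below works through that arithmetic for a single working thread. It is an interpretation rather than confirmed server behavior: the class and method names are hypothetical, and returning the limit unchanged when it does not exceed BATCH_SIZE is an assumption.

```java
// Hypothetical arithmetic sketch of one reading of the MAX_RECORDS_TO_LOAD rule.
final class MaxRecordsSketch {

    /** Rounds a limit larger than batchSize up to the next whole multiple of batchSize. */
    static long effectiveLimit(long maxRecordsToLoad, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        if (maxRecordsToLoad <= batchSize) {
            return maxRecordsToLoad; // assumption: small limits are applied as given
        }
        long batches = (maxRecordsToLoad + batchSize - 1) / batchSize; // ceiling division
        return batches * batchSize;
    }
}
```

With the default BATCH_SIZE of 50000, a requested limit of 120000 would come out as 150000 under this reading.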
- NAME_COLUMNS_FROM_FILE
  public static final String NAME_COLUMNS_FROM_FILE
  Specifies a comma-delimited list of column names to be used as the source-data column names. If the file has a header row (i.e., TEXT_HAS_HEADER is TRUE), these names override the file’s header names. If the file has no header row, these names are used as the source-data column names. Either way, the i-th name in this list applies to the i-th column in the file, enabling name-based matching against the target table’s columns (and use with COLUMNS_TO_LOAD / COLUMNS_TO_SKIP).
  See Also:
  Constant Field Values
- NUM_TASKS_PER_RANK
  public static final String NUM_TASKS_PER_RANK
  Number of tasks for reading file per rank. The default is the value of the system configuration parameter external_file_reader_num_tasks.
  See Also:
  Constant Field Values
- POLL_INTERVAL
  public static final String POLL_INTERVAL
  If SUBSCRIBE is TRUE, the number of seconds between attempts to load external files into the table. If zero, polling will be continuous as long as data is found. If no data is found, the interval will steadily increase to a maximum of 60 seconds. The default value is ‘0’.
  See Also:
  Constant Field Values
- PRIMARY_KEYS
  public static final String PRIMARY_KEYS
  Comma separated list of column names to set as primary keys, when not specified in the type.
  See Also:
  Constant Field Values
- SCHEMA_REGISTRY_CONNECTION_RETRIES
  public static final String SCHEMA_REGISTRY_CONNECTION_RETRIES
  Number of Confluent Schema Registry connection retries.
  See Also:
  Constant Field Values
- SCHEMA_REGISTRY_CONNECTION_TIMEOUT
  public static final String SCHEMA_REGISTRY_CONNECTION_TIMEOUT
  Confluent Schema registry connection timeout (in secs).
  See Also:
  Constant Field Values
- SCHEMA_REGISTRY_MAX_CONSECUTIVE_CONNECTION_FAILURES
  public static final String SCHEMA_REGISTRY_MAX_CONSECUTIVE_CONNECTION_FAILURES
  Maximum number of records to skip due to Schema Registry connection failures before failing.
  See Also:
  Constant Field Values
- MAX_CONSECUTIVE_INVALID_SCHEMA_FAILURE
  public static final String MAX_CONSECUTIVE_INVALID_SCHEMA_FAILURE
  Maximum number of records to skip due to schema-related errors before failing.
  See Also:
  Constant Field Values
- SCHEMA_REGISTRY_SCHEMA_NAME
  public static final String SCHEMA_REGISTRY_SCHEMA_NAME
  Name of the Avro schema in the schema registry to use when reading Avro records.
  See Also:
  Constant Field Values
- SHARD_KEYS
  public static final String SHARD_KEYS
  Comma separated list of column names to set as shard keys, when not specified in the type.
  See Also:
  Constant Field Values
- SKIP_LINES
  public static final String SKIP_LINES
  Skip a number of lines from the beginning of the file.
  See Also:
  Constant Field Values
- START_OFFSETS
  public static final String START_OFFSETS
  Starting offsets by partition to fetch from Kafka. A comma-separated list of partition:offset pairs.
  See Also:
  Constant Field Values
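A START_OFFSETS value can be assembled from a map of partition numbers to offsets. The sketch below is a hypothetical helper that produces the comma-separated partition:offset form described above; sorting by partition is a choice made here for readable, repeatable output and is not stated as a requirement.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper for building a START_OFFSETS value.
final class StartOffsetsSketch {

    /** Formats partition -> offset entries as "partition:offset" pairs joined by commas. */
    static String format(Map<Integer, Long> offsetsByPartition) {
        // TreeMap orders the pairs by partition number.
        return new TreeMap<>(offsetsByPartition).entrySet().stream()
                .map(entry -> entry.getKey() + ":" + entry.getValue())
                .collect(Collectors.joining(","));
    }
}
```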
- SUBSCRIBE
  public static final String SUBSCRIBE
  Continuously poll the data source to check for new data and load it into the table. Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  See Also:
  Constant Field Values
- TABLE_INSERT_MODE
  public static final String TABLE_INSERT_MODE
  Insertion scheme to use when inserting records from multiple shapefiles. Supported values:
  SINGLE: Insert all records into a single table.
  TABLE_PER_FILE: Insert records from each file into a new table corresponding to that file.
  The default value is SINGLE.
  See Also:
  Constant Field Values
- SINGLE
  public static final String SINGLE
  Insert all records into a single table.
  See Also:
  Constant Field Values
- TABLE_PER_FILE
  public static final String TABLE_PER_FILE
  Insert records from each file into a new table corresponding to that file.
  See Also:
  Constant Field Values
- TEXT_COMMENT_STRING
  public static final String TEXT_COMMENT_STRING
  Specifies the character string that should be interpreted as a comment line prefix in the source data. All lines in the data starting with the provided string are ignored.
  For DELIMITED_TEXT FILE_TYPE only. The default value is ‘#’.
  See Also:
  Constant Field Values
- TEXT_DELIMITER
  public static final String TEXT_DELIMITER
  Specifies the character delimiting field values in the source data and field names in the header (if present).
  For DELIMITED_TEXT FILE_TYPE only. The default value is ‘,’.
  See Also:
  Constant Field Values
- TEXT_ESCAPE_CHARACTER
  public static final String TEXT_ESCAPE_CHARACTER
  Specifies the character that is used to escape other characters in the source data.
  An ‘a’, ‘b’, ‘f’, ‘n’, ‘r’, ‘t’, or ‘v’ preceded by an escape character will be interpreted as the ASCII bell, backspace, form feed, line feed, carriage return, horizontal tab, and vertical tab, respectively. For example, the escape character followed by an ‘n’ will be interpreted as a newline within a field value.
  
  The escape character can also be used to escape the quoting character, and will be treated as an escape character whether it is within a quoted field value or not.
  
  For DELIMITED_TEXT FILE_TYPE only.
  See Also:
  Constant Field Values
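The escape sequences listed above can be illustrated with a small client-side decoder. This is a sketch of the described mapping, not the database's implementation; the class and method names are hypothetical. Emitting any other escaped character literally (which covers an escaped quote character) and keeping a trailing lone escape character as-is are assumptions.

```java
// Hypothetical decoder for the documented escape sequences.
final class EscapeSketch {

    /** Replaces escape sequences in a raw field value using the given escape character. */
    static String unescape(String raw, char escape) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char ch = raw.charAt(i);
            if (ch == escape && i + 1 < raw.length()) {
                char next = raw.charAt(++i);
                switch (next) {
                    case 'a': out.append((char) 7); break;   // ASCII bell
                    case 'b': out.append('\b'); break;       // backspace
                    case 'f': out.append('\f'); break;       // form feed
                    case 'n': out.append('\n'); break;       // line feed
                    case 'r': out.append('\r'); break;       // carriage return
                    case 't': out.append('\t'); break;       // horizontal tab
                    case 'v': out.append((char) 11); break;  // vertical tab
                    default: out.append(next);               // assumption: emit the character literally
                }
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }
}
```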
- TEXT_HAS_HEADER
  public static final String TEXT_HAS_HEADER
  Indicates whether the source data contains a header row.
  For DELIMITED_TEXT FILE_TYPE only. Supported values:
  TRUE
  FALSE
  The default value is TRUE.
  See Also:
  Constant Field Values
- TEXT_HEADER_PROPERTY_DELIMITER
  public static final String TEXT_HEADER_PROPERTY_DELIMITER
  Specifies the delimiter for column properties in the header row (if present). Cannot be set to same value as TEXT_DELIMITER.
  For DELIMITED_TEXT FILE_TYPE only. The default value is '|'.
  See Also:
  Constant Field Values
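  As a rough sketch of how the delimiter options might be assembled into an options map, the helper below also enforces the documented rule that TEXT_HEADER_PROPERTY_DELIMITER cannot equal TEXT_DELIMITER. The class and method names are hypothetical, and the constants are local stand-ins whose string values are assumed; real code would reference the fields of InsertRecordsFromFilesRequest.Options instead.

  ```java
  import java.util.LinkedHashMap;
  import java.util.Map;

  final class DelimitedTextOptionsSketch {
      // Local stand-ins for the Options constants; the string values are assumptions.
      static final String FILE_TYPE = "file_type";
      static final String DELIMITED_TEXT = "delimited_text";
      static final String TEXT_DELIMITER = "text_delimiter";
      static final String TEXT_HAS_HEADER = "text_has_header";
      static final String TEXT_HEADER_PROPERTY_DELIMITER = "text_header_property_delimiter";
      static final String TRUE = "true";

      /** Builds options for a delimited-text load with a header row. */
      static Map<String, String> build(String delimiter, String headerPropertyDelimiter) {
          // Documented constraint: the two delimiters must differ.
          if (delimiter.equals(headerPropertyDelimiter)) {
              throw new IllegalArgumentException(
                  "TEXT_HEADER_PROPERTY_DELIMITER cannot be the same as TEXT_DELIMITER");
          }
          Map<String, String> options = new LinkedHashMap<>();
          options.put(FILE_TYPE, DELIMITED_TEXT);
          options.put(TEXT_HAS_HEADER, TRUE);
          options.put(TEXT_DELIMITER, delimiter);
          options.put(TEXT_HEADER_PROPERTY_DELIMITER, headerPropertyDelimiter);
          return options;
      }

      public static void main(String[] args) {
          // Tab-delimited data with the default '|' header property delimiter.
          System.out.println(build("\t", "|").keySet());
      }
  }
  ```

  Validating on the client is optional; it only surfaces the conflict before the request is sent.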
- TEXT_NULL_STRING
  public static final String TEXT_NULL_STRING
  Specifies the character string that should be interpreted as a null value in the source data.
  For DELIMITED_TEXT FILE_TYPE only. The default value is '\N'.
  See Also:
  Constant Field Values
- TEXT_QUOTE_CHARACTER
  public static final String TEXT_QUOTE_CHARACTER
  Specifies the character that should be interpreted as a field value quoting character in the source data. The character must appear at beginning and end of field value to take effect. Delimiters within quoted fields are treated as literals and not delimiters. Within a quoted field, two consecutive quote characters will be interpreted as a single literal quote character, effectively escaping it. To not have a quote character, specify an empty string.
  For DELIMITED_TEXT FILE_TYPE only. The default value is '"'.
  See Also:
  Constant Field Values
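  The quoting rules for TEXT_QUOTE_CHARACTER (delimiters inside quoted fields are literals, and two consecutive quote characters stand for one literal quote) can be illustrated with a minimal line splitter. This is a sketch under those stated rules only; QuotedFieldSketch and split are hypothetical names, and the server's handling of malformed input is not modeled.

  ```java
  import java.util.ArrayList;
  import java.util.List;

  final class QuotedFieldSketch {
      /** Splits one line into field values, honoring the quote character. */
      static List<String> split(String line, char delimiter, char quote) {
          List<String> fields = new ArrayList<>();
          StringBuilder current = new StringBuilder();
          boolean inQuotes = false;
          for (int i = 0; i < line.length(); i++) {
              char c = line.charAt(i);
              if (inQuotes) {
                  if (c != quote) {
                      current.append(c);            // delimiters are literals here
                  } else if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                      current.append(quote);        // doubled quote -> one literal quote
                      i++;
                  } else {
                      inQuotes = false;             // closing quote
                  }
              } else if (c == quote && current.length() == 0) {
                  inQuotes = true;                  // quote at the start of a field
              } else if (c == delimiter) {
                  fields.add(current.toString());
                  current.setLength(0);
              } else {
                  current.append(c);
              }
          }
          fields.add(current.toString());
          return fields;
      }

      public static void main(String[] args) {
          System.out.println(split("1,\"a,b\",\"say \"\"hi\"\"\"", ',', '"'));
      }
  }
  ```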
- TEXT_SEARCH_COLUMNS
  public static final String TEXT_SEARCH_COLUMNS
  Adds the 'text_search' property to string columns whose types are inferred internally. Specify a comma-separated list of column names, or '*' for all columns. To add the 'text_search' property only to string columns greater than or equal to a minimum size, also set TEXT_SEARCH_MIN_COLUMN_LENGTH.
  See Also:
  Constant Field Values
- TEXT_SEARCH_MIN_COLUMN_LENGTH
  public static final String TEXT_SEARCH_MIN_COLUMN_LENGTH
  Sets the minimum string column size to which the 'text_search' property is applied. Used only when TEXT_SEARCH_COLUMNS has a value.
  See Also:
  Constant Field Values
- TRIM_SPACE
  public static final String TRIM_SPACE
  If set to TRUE, removes leading or trailing spaces from fields. Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  See Also:
  Constant Field Values
- TRUNCATE_STRINGS
  public static final String TRUNCATE_STRINGS
  If set to TRUE, truncates string values that are longer than the column's type size. Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  See Also:
  Constant Field Values
- TRUNCATE_TABLE
  public static final String TRUNCATE_TABLE
  If set to TRUE, truncates the table specified by tableName prior to loading the file(s). Supported values:
  TRUE
  FALSE
  The default value is FALSE.
  See Also:
  Constant Field Values
- TYPE_INFERENCE_MAX_RECORDS_READ
  public static final String TYPE_INFERENCE_MAX_RECORDS_READ
  See Also:
  Constant Field Values
- TYPE_INFERENCE_MODE
  public static final String TYPE_INFERENCE_MODE
  Optimize type inferencing for either speed or accuracy. Supported values:
  ACCURACY: Scans data to get exactly-typed and sized columns for all data scanned.
  SPEED: Scans data and picks the widest possible column types so that ‘all’ values will fit with minimum data scanned.
  The default value is ACCURACY.
  See Also:
  Constant Field Values
- ACCURACY
  public static final String ACCURACY
  Scans data to get exactly-typed and sized columns for all data scanned.
  See Also:
  Constant Field Values
- SPEED
  public static final String SPEED
  Scans data and picks the widest possible column types so that ‘all’ values will fit with minimum data scanned.
  See Also:
  Constant Field Values
- ENABLE_INPLACE_UPDATES
  public static final String ENABLE_INPLACE_UPDATES
  Applies only when upserting (when update_on_existing_pk is true). If set to true (the default), an existing record matched by primary key is modified in place. If set to false, the matched record is updated by deleting it and inserting a replacement (delete and insert), which prevents the change from being reflected in dependent materialized views until they are refreshed. Supported values:
  TRUE
  FALSE
  The default value is TRUE.
  See Also:
  Constant Field Values
- UPDATE_ON_EXISTING_PK
  public static final String UPDATE_ON_EXISTING_PK
  Specifies the record collision policy for inserting into a table with a primary key. If set to TRUE, any existing table record with primary key values that match those of a record being inserted will be replaced by that new record (the new data will be ‘upserted’). If set to FALSE, any existing table record with primary key values that match those of a record being inserted will remain unchanged, while the new record will be rejected and the error handled as determined by IGNORE_EXISTING_PK and ERROR_HANDLING. If the specified table does not have a primary key, then this option has no effect. Supported values:
  TRUE: Upsert new records when primary keys match existing records.
  FALSE: Reject new records when primary keys match existing records.
  The default value is FALSE.
  See Also:
  Constant Field Values
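  The collision policy selected by UPDATE_ON_EXISTING_PK can be modeled with an in-memory map keyed by primary key. This is a toy model of the two documented outcomes (upsert versus reject), not how the database stores records; UpsertPolicySketch and its methods are hypothetical, and the follow-on handling by IGNORE_EXISTING_PK and ERROR_HANDLING is reduced to a boolean return value.

  ```java
  import java.util.LinkedHashMap;
  import java.util.Map;

  final class UpsertPolicySketch {
      private final Map<Integer, String> rowsByPk = new LinkedHashMap<>();

      /** Returns true if the record was stored, false if it was rejected. */
      boolean insert(int pk, String value, boolean updateOnExistingPk) {
          if (rowsByPk.containsKey(pk) && !updateOnExistingPk) {
              return false;             // FALSE: existing record stays unchanged
          }
          rowsByPk.put(pk, value);      // new key, or TRUE: existing record replaced
          return true;
      }

      String get(int pk) {
          return rowsByPk.get(pk);
      }

      public static void main(String[] args) {
          UpsertPolicySketch table = new UpsertPolicySketch();
          table.insert(1, "old", false);
          System.out.println(table.insert(1, "new", false) + " " + table.get(1));
          System.out.println(table.insert(1, "new", true) + " " + table.get(1));
      }
  }
  ```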
- TRANSFORMATIONS
  public static final String TRANSFORMATIONS
  Comma-separated expressions, one per target table column. Each expression is evaluated per record. Empty entries (two consecutive commas) mean no transformation for that column; the value is resolved from the input record, the table default, or NULL, or results in an error. Expressions may reference input columns by name or by position ($1 for the first input column, $2 for the second, etc.). The default value is '' (an empty string).
  See Also:
  Constant Field Values
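  To make the TRANSFORMATIONS format concrete, the sketch below joins one optional expression per target column into the comma-separated value, leaving an empty entry where no transformation applies. TransformationsSketch and build are hypothetical names, and the sketch assumes the expressions themselves contain no commas.

  ```java
  import java.util.Arrays;
  import java.util.List;
  import java.util.stream.Collectors;

  final class TransformationsSketch {
      /** Joins one expression per target column; null means "no transformation". */
      static String build(List<String> perColumnExpressions) {
          return perColumnExpressions.stream()
              .map(expression -> expression == null ? "" : expression)
              .collect(Collectors.joining(","));
      }

      public static void main(String[] args) {
          // Column 1 takes input column 2, column 2 is untransformed,
          // column 3 takes input column 1.
          System.out.println(build(Arrays.asList("$2", null, "$1")));
      }
  }
  ```

  The resulting string would be supplied as the value of the TRANSFORMATIONS option.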
