public class CreateTableExternalRequest extends Object implements org.apache.avro.generic.IndexedRecord

A set of parameters for GPUdb.createTableExternal(CreateTableExternalRequest).

Creates a new external table, which is a local database object whose source data is located externally to the database. The source data can be located either in KiFS; on the cluster, accessible to the database; or remotely, accessible via a pre-defined external data source.

The external table can have its structure defined explicitly, via createTableOptions, which contains many of the options from GPUdb.createTable(CreateTableRequest); or defined implicitly, inferred from the source data.
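As a minimal sketch of end-to-end usage (the connection URL, schema, table name, and KiFS path below are illustrative placeholders, not values taken from this page):

```java
import com.gpudb.GPUdb;
import com.gpudb.protocol.CreateTableExternalRequest;

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class CreateExternalTableExample {
    public static void main(String[] args) throws Exception {
        // Connect to the database (URL is a placeholder)
        GPUdb db = new GPUdb("http://localhost:9191");

        // Empty createTableOptions: let the table structure be inferred
        // from the source data; empty options: accept all defaults.
        CreateTableExternalRequest request = new CreateTableExternalRequest(
                "ki_home.ext_orders",                        // [schema_name.]table_name
                Arrays.asList("kifs://data/orders.csv"),     // source file in KiFS
                new HashMap<String, Map<String, String>>(),  // modifyColumns (not implemented yet)
                new HashMap<String, String>(),               // createTableOptions
                new HashMap<String, String>());              // options

        db.createTableExternal(request);
        System.out.println("Created external table: " + request.getTableName());
    }
}
```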
Modifier and Type | Class and Description
---|---
static class | CreateTableExternalRequest.CreateTableOptions: Options from GPUdb.createTable(CreateTableRequest), allowing the structure of the table to be defined independently of the data source.
static class | CreateTableExternalRequest.Options: Optional parameters.
Constructor and Description
---
CreateTableExternalRequest(): Constructs a CreateTableExternalRequest object with default parameters.
CreateTableExternalRequest(String tableName, List<String> filepaths, Map<String,Map<String,String>> modifyColumns, Map<String,String> createTableOptions, Map<String,String> options): Constructs a CreateTableExternalRequest object with the specified parameters.
Modifier and Type | Method and Description
---|---
boolean | equals(Object obj)
Object | get(int index): This method supports the Avro framework and is not intended to be called directly by the user.
static org.apache.avro.Schema | getClassSchema(): This method supports the Avro framework and is not intended to be called directly by the user.
Map<String,String> | getCreateTableOptions()
List<String> | getFilepaths()
Map<String,Map<String,String>> | getModifyColumns()
Map<String,String> | getOptions()
org.apache.avro.Schema | getSchema(): This method supports the Avro framework and is not intended to be called directly by the user.
String | getTableName()
int | hashCode()
void | put(int index, Object value): This method supports the Avro framework and is not intended to be called directly by the user.
CreateTableExternalRequest | setCreateTableOptions(Map<String,String> createTableOptions)
CreateTableExternalRequest | setFilepaths(List<String> filepaths)
CreateTableExternalRequest | setModifyColumns(Map<String,Map<String,String>> modifyColumns)
CreateTableExternalRequest | setOptions(Map<String,String> options)
CreateTableExternalRequest | setTableName(String tableName)
String | toString()
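Each setter returns this, so a request can also be assembled fluently. A brief sketch, assuming the imports from the example above plus java.util.Collections, and the usual String constants on the nested Options class (table and file names are illustrative):

```java
CreateTableExternalRequest request = new CreateTableExternalRequest()
        .setTableName("ki_home.ext_events")
        .setFilepaths(Arrays.asList("kifs://data/events.psv"))
        .setOptions(Collections.singletonMap(
                CreateTableExternalRequest.Options.FILE_TYPE,
                CreateTableExternalRequest.Options.DELIMITED_TEXT));
```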
public CreateTableExternalRequest()
public CreateTableExternalRequest(String tableName, List<String> filepaths, Map<String,Map<String,String>> modifyColumns, Map<String,String> createTableOptions, Map<String,String> options)
tableName - Name of the table to be created, in [schema_name.]table_name format, using standard name resolution rules and meeting table naming criteria.
filepaths
- A list of file paths from which data will be sourced;
For paths in KiFS, use the uri prefix of kifs://
followed by the path to
a file or directory. File matching by prefix is
supported, e.g. kifs://dir/file would match dir/file_1
and dir/file_2. When prefix matching is used, the path
must start with a full, valid KiFS directory name.
If an external data source is specified in datasource_name
, these file
paths must resolve to accessible files at that data
source location. Prefix matching is supported.
If the data source is hdfs, prefixes must be aligned
with directories, i.e. partial file names will
not match.
If no data source is specified, the files are assumed
to be local to the database and must all be
accessible to the gpudb user, residing on the path (or
relative to the path) specified by the
external files directory in the Kinetica
configuration file. Wildcards (*)
can be used to
specify a group of files. Prefix matching is
supported, the prefixes must be aligned with
directories.
If the first path ends in .tsv, the text delimiter
will be defaulted to a tab character.
If the first path ends in .psv, the text delimiter
will be defaulted to a pipe character (|).
modifyColumns - Not implemented yet. The default value is an empty Map.
createTableOptions
- Options from GPUdb.createTable(CreateTableRequest)
,
allowing the structure of the table to
be defined independently of the data source
TYPE_ID
: ID of a currently registered type.
NO_ERROR_IF_EXISTS: If true, prevents an error from occurring if the table already exists and is of the given type. If a table with the same name but a different type exists, it is still an error. Supported values: TRUE, FALSE. The default value is FALSE.
IS_REPLICATED: Affects the distribution scheme for the table's data. If true and the given table has no explicit shard key defined, the table will be replicated. If false, the table will be sharded according to the shard key specified in the given type_id, or randomly sharded, if no shard key is specified. Note that a type containing a shard key cannot be used to create a replicated table. Supported values: TRUE, FALSE. The default value is FALSE.
FOREIGN_KEYS
: Semicolon-separated list of
foreign keys, of the format
'(source_column_name [, ...]) references
target_table_name(primary_key_column_name [,
...]) [as foreign_key_name]'.
FOREIGN_SHARD_KEY
: Foreign shard key of the
format
'source_column references shard_by_column
from target_table(primary_key_column)'.
PARTITION_TYPE
: Partitioning scheme to use.
Supported values:
RANGE
: Use range partitioning.
INTERVAL
: Use interval partitioning.
LIST
: Use list partitioning.
HASH
: Use hash partitioning.
SERIES
: Use series partitioning.
PARTITION_KEYS
: Comma-separated list of
partition keys, which are the columns or
column expressions by which records will be
assigned to partitions defined by
partition_definitions
.
PARTITION_DEFINITIONS
: Comma-separated list
of partition definitions, whose format
depends
on the choice of partition_type
. See
range partitioning,
interval partitioning,
list partitioning,
hash partitioning, or
series partitioning for
example formats.
IS_AUTOMATIC_PARTITION: If true, a new partition will be created for values which don't fall into an existing partition. Currently, only supported for list partitions. Supported values: TRUE, FALSE. The default value is FALSE.
TTL
: Sets the TTL of the table specified
in tableName
.
CHUNK_SIZE
: Indicates the number of records
per chunk to be used for this table.
IS_RESULT_TABLE
: Indicates whether the table
is a
memory-only table. A result
table cannot contain
columns with store_only or text_search
data-handling or that are
non-charN strings, and it
will not be retained if
the server is restarted.
Supported values:
The default value is FALSE
.
STRATEGY_DEFINITION
: The tier strategy
for the table and its columns.
The default value is an empty Map.
options - Optional parameters.
BAD_RECORD_TABLE_NAME
: Name of a table to which records
that were rejected are written.
The bad-record-table has the following columns:
line_number (long), line_rejected (string),
error_message (string). When error_handling
is
abort
, bad records table is not populated.
BAD_RECORD_TABLE_LIMIT
: A positive integer indicating
the maximum number of records that can be
written to the bad-record-table. The default value is
'10000'.
BAD_RECORD_TABLE_LIMIT_PER_INPUT
: For subscriptions, a
positive integer indicating the maximum number
of records that can be written to the bad-record-table
per file/payload. Default value will be
bad_record_table_limit
and total size of the
table per rank is limited to
bad_record_table_limit
.
BATCH_SIZE
: Number of records to insert per batch when
inserting data. The default value is '50000'.
COLUMN_FORMATS
: For each target column specified,
applies the column-property-bound
format to the source data loaded into that column. Each
column format will contain a mapping of one
or more of its column properties to an appropriate
format for each property. Currently supported
column properties include date, time, & datetime. The
parameter value must be formatted as a JSON
string of maps of column names to maps of column
properties to their corresponding column formats,
e.g.,
'{ "order_date" : { "date" : "%Y.%m.%d" }, "order_time"
: { "time" : "%H:%M:%S" } }'.
See default_column_formats
for valid format
syntax.
COLUMNS_TO_LOAD
: Specifies a comma-delimited list of
columns from the source data to
load. If more than one file is being loaded, this list
applies to all files.
Column numbers can be specified discretely or as a
range. For example, a value of '5,7,1..3' will
insert values from the fifth column in the source data
into the first column in the target table,
from the seventh column in the source data into the
second column in the target table, and from the
first through third columns in the source data into the
third through fifth columns in the target
table.
If the source data contains a header, column names
matching the file header names may be provided
instead of column numbers. If the target table doesn't
exist, the table will be created with the
columns in this order. If the target table does exist
with columns in a different order than the
source data, this list can be used to match the order of
the target table. For example, a value of
'C, B, A' will create a three column table with column
C, followed by column B, followed by column
A; or will insert those fields in that order into a
table created with columns in that order. If
the target table exists, the column names must match the
source data field names for a name-mapping
to be successful.
Mutually exclusive with columns_to_skip
.
COLUMNS_TO_SKIP
: Specifies a comma-delimited list of
columns from the source data to
skip. Mutually exclusive with columns_to_load
.
COMPRESSION_TYPE
: Source data compression type
Supported values:
NONE
: No compression.
AUTO
: Auto detect compression type
GZIP
: gzip file compression.
BZIP2
: bzip2 file compression.
AUTO
.
DATASOURCE_NAME
: Name of an existing external data
source from which data file(s) specified in filepaths
will be loaded
DEFAULT_COLUMN_FORMATS
: Specifies the default format to
be applied to source data loaded
into columns with the corresponding column property.
Currently supported column properties include
date, time, & datetime. This default
column-property-bound format can be overridden by
specifying a
column property & format for a given target column in
column_formats
. For
each specified annotation, the format will apply to all
columns with that annotation unless a custom
column_formats
for that annotation is specified.
The parameter value must be formatted as a JSON string
that is a map of column properties to their
respective column formats, e.g., '{ "date" : "%Y.%m.%d",
"time" : "%H:%M:%S" }'. Column
formats are specified as a string of control characters
and plain text. The supported control
characters are 'Y', 'm', 'd', 'H', 'M', 'S', and 's',
which follow the Linux 'strptime()'
specification, as well as 's', which specifies seconds
and fractional seconds (though the fractional
component will be truncated past milliseconds).
Formats for the 'date' annotation must include the 'Y',
'm', and 'd' control characters. Formats for
the 'time' annotation must include the 'H', 'M', and
either 'S' or 's' (but not both) control
characters. Formats for the 'datetime' annotation meet
both the 'date' and 'time' control character
requirements. For example, '{"datetime" : "%m/%d/%Y
%H:%M:%S" }' would be used to interpret text
as "05/04/2000 12:12:11"
ERROR_HANDLING
: Specifies how errors should be handled
upon insertion.
Supported values:
PERMISSIVE
: Records with missing columns are populated
with nulls if possible; otherwise, the malformed records
are skipped.
IGNORE_BAD_RECORDS
: Malformed records are skipped.
ABORT
: Stops current insertion and aborts entire
operation when an error is encountered. Primary key
collisions are considered abortable errors in this mode.
ABORT
.
EXTERNAL_TABLE_TYPE
: Specifies whether the external
table holds a local copy of the external data.
Supported values:
MATERIALIZED
: Loads a copy of the external data into
the database, refreshed on demand
LOGICAL
: External data will not be loaded into the
database; the data will be retrieved from the source
upon servicing each query against the external table
MATERIALIZED
.
FILE_TYPE
: Specifies the type of the file(s) whose
records will be inserted.
Supported values:
AVRO
: Avro file format
DELIMITED_TEXT
: Delimited text file format; e.g., CSV,
TSV, PSV, etc.
GDB
: Esri/GDB file format
JSON
: Json file format
PARQUET
: Apache Parquet file format
SHAPEFILE
: ShapeFile file format
DELIMITED_TEXT
.
GDAL_CONFIGURATION_OPTIONS
: Comma-separated list of
GDAL configuration options for the specific request: key=value
IGNORE_EXISTING_PK
: Specifies the record collision
error-suppression policy for
inserting into a table with a primary key, only used when
not in upsert mode (upsert mode is disabled when update_on_existing_pk
is
false
). If set to
true
, any record being inserted that is rejected
for having primary key values that match those of an
existing table record will be ignored with no
error generated. If false
, the rejection of any
record for having primary key values matching an
existing record will result in an error being
reported, as determined by error_handling
. If
the specified table does not
have a primary key or if upsert mode is in effect
(update_on_existing_pk
is
true
), then this option has no effect.
Supported values:
TRUE
: Ignore new records whose primary key values
collide with those of existing records
FALSE
: Treat as errors any new records whose primary
key values collide with those of existing records
FALSE
.
INGESTION_MODE
: Whether to do a full load, dry run, or
perform a type inference on the source data.
Supported values:
FULL
: Run a type inference on the source data (if
needed) and ingest
DRY_RUN
: Does not load data, but walks through the
source data and determines the number of valid records,
taking into account the current mode of error_handling
.
TYPE_INFERENCE_ONLY
: Infer the type of the source data
and return, without ingesting any data. The inferred
type is returned in the response.
FULL
.
JDBC_FETCH_SIZE
: The JDBC fetch size, which determines
how many rows to fetch per round trip. The default
value is '50000'.
KAFKA_CONSUMERS_PER_RANK
: Number of Kafka consumer
threads per rank (valid range 1-6). The default value
is '1'.
KAFKA_GROUP_ID
: The group id to be used when consuming
data from a Kafka topic (valid only for Kafka datasource
subscriptions).
KAFKA_OFFSET_RESET_POLICY
: Policy to determine whether
the Kafka data consumption starts either at earliest
offset or latest offset.
Supported values:
The default value is EARLIEST
.
KAFKA_OPTIMISTIC_INGEST
: Enable optimistic ingestion
where Kafka topic offsets and table data are committed
independently to achieve parallelism.
Supported values:
The default value is FALSE
.
KAFKA_SUBSCRIPTION_CANCEL_AFTER
: Sets the Kafka
subscription lifespan (in minutes). Expired subscription
will be cancelled automatically.
KAFKA_TYPE_INFERENCE_FETCH_TIMEOUT
: Maximum time to
collect Kafka messages before type inferencing on the
set of them.
LAYER
: Geo files layer(s) name(s): comma separated.
LOADING_MODE
: Scheme for distributing the extraction
and loading of data from the source data file(s). This
option applies only when loading files that are local to
the database
Supported values:
HEAD
: The head node loads all data. All files must be
available to the head node.
DISTRIBUTED_SHARED
: The head node coordinates loading
data by worker
processes across all nodes from shared files available
to all workers.
NOTE:
Instead of existing on a shared source, the files can be
duplicated on a source local to each host
to improve performance, though the files must appear as
the same data set from the perspective of
all hosts performing the load.
DISTRIBUTED_LOCAL
: A single worker process on each node
loads all files
that are available to it. This option works best when
each worker loads files from its own file
system, to maximize performance. In order to avoid data
duplication, either each worker performing
the load needs to have visibility to a set of files
unique to it (no file is visible to more than
one node) or the target table needs to have a primary
key (which will allow the worker to
automatically deduplicate data).
NOTE:
If the target table doesn't exist, the table structure
will be determined by the head node. If the
head node has no files local to it, it will be unable to
determine the structure and the request
will fail.
If the head node is configured to have no worker
processes, no data strictly accessible to the head
node will be loaded.
HEAD
.
LOCAL_TIME_OFFSET
: Apply an offset to Avro local
timestamp columns.
MAX_RECORDS_TO_LOAD
: Limit the number of records to
load in this request: if this number
is larger than batch_size
, then the number of
records loaded will be
limited to the next whole number of batch_size
(per working thread).
NUM_TASKS_PER_RANK
: Number of tasks for reading file
per rank. Default will be system configuration
parameter, external_file_reader_num_tasks.
POLL_INTERVAL
: If true
, the number of
seconds between attempts to load external files into the
table. If zero, polling will be continuous
as long as data is found. If no data is found, the
interval will steadily increase to a maximum of
60 seconds. The default value is '0'.
PRIMARY_KEYS
: Comma separated list of column names to
set as primary keys, when not specified in the type.
REFRESH_METHOD
: Method by which the table can be
refreshed from its source data.
Supported values:
MANUAL
: Refresh only occurs when manually requested by
invoking the refresh action of GPUdb.alterTable(AlterTableRequest)
on this
table.
ON_START
: Refresh table on database startup and when
manually requested by invoking the refresh action of
GPUdb.alterTable(AlterTableRequest)
on
this table.
MANUAL
.
SCHEMA_REGISTRY_SCHEMA_NAME
: Name of the Avro schema in
the schema registry to use when reading Avro records.
SHARD_KEYS
: Comma separated list of column names to set
as shard keys, when not specified in the type.
SKIP_LINES
: Skip this number of lines from the beginning of the file.
SUBSCRIBE
: Continuously poll the data source to check
for new data and load it into the table.
Supported values:
The default value is FALSE
.
TABLE_INSERT_MODE
: Insertion scheme to use when
inserting records from multiple shapefiles.
Supported values:
SINGLE
: Insert all records into a single table.
TABLE_PER_FILE
: Insert records from each file into a
new table corresponding to that file.
SINGLE
.
TEXT_COMMENT_STRING
: Specifies the character string
that should be interpreted as a comment line
prefix in the source data. All lines in the data
starting with the provided string are ignored.
For delimited_text
file_type
only. The
default value is '#'.
TEXT_DELIMITER
: Specifies the character delimiting
field values in the source data
and field names in the header (if present).
For delimited_text
file_type
only. The
default value is ','.
TEXT_ESCAPE_CHARACTER
: Specifies the character that is
used to escape other characters in
the source data.
An 'a', 'b', 'f', 'n', 'r', 't', or 'v' preceded by an
escape character will be interpreted as the
ASCII bell, backspace, form feed, line feed, carriage
return, horizontal tab, & vertical tab,
respectively. For example, the escape character
followed by an 'n' will be interpreted as a newline
within a field value.
The escape character can also be used to escape the
quoting character, and will be treated as an
escape character whether it is within a quoted field
value or not.
For delimited_text
file_type
only.
TEXT_HAS_HEADER
: Indicates whether the source data
contains a header row.
For delimited_text
file_type
only.
Supported values:
The default value is TRUE
.
TEXT_HEADER_PROPERTY_DELIMITER
: Specifies the delimiter
for
column properties in the header row
(if
present). Cannot be set to same value as text_delimiter
.
For delimited_text
file_type
only. The
default value is '|'.
TEXT_NULL_STRING
: Specifies the character string that
should be interpreted as a null
value in the source data.
For delimited_text
file_type
only. The
default value is '\\N'.
TEXT_QUOTE_CHARACTER
: Specifies the character that
should be interpreted as a field value
quoting character in the source data. The character
must appear at beginning and end of field value
to take effect. Delimiters within quoted fields are
treated as literals and not delimiters. Within
a quoted field, two consecutive quote characters will be
interpreted as a single literal quote
character, effectively escaping it. To not have a quote
character, specify an empty string.
For delimited_text
file_type
only. The
default value is '"'.
TEXT_SEARCH_COLUMNS
: Add 'text_search' property to
internally inferred string columns.
Comma-separated list of column names or '*' for all
columns. To add 'text_search' property only to
string columns greater than or equal to a minimum size,
also set the
text_search_min_column_length
TEXT_SEARCH_MIN_COLUMN_LENGTH
: Set the minimum column
size for strings to apply the 'text_search' property to.
Used only when text_search_columns
has a value.
TRUNCATE_STRINGS
: If set to true
, truncate
string values that are longer than the column's type
size.
Supported values:
The default value is FALSE
.
TRUNCATE_TABLE
: If set to true
, truncates the
table specified by tableName
prior to loading
the file(s).
Supported values:
The default value is FALSE
.
TYPE_INFERENCE_MODE
: Optimize type inferencing for
either speed or accuracy.
Supported values:
ACCURACY
: Scans data to get exactly-typed & sized
columns for all data scanned.
SPEED
: Scans data and picks the widest possible column
types so that 'all' values will fit with minimum data
scanned
SPEED
.
REMOTE_QUERY
: Remote SQL query from which data will be
sourced
REMOTE_QUERY_FILTER_COLUMN
: Name of column to be used
for splitting remote_query
into multiple
sub-queries using the data distribution of given column
REMOTE_QUERY_INCREASING_COLUMN
: Column on subscribed
remote query result that will increase for new records
(e.g., TIMESTAMP).
REMOTE_QUERY_PARTITION_COLUMN
: Alias name for remote_query_filter_column
.
UPDATE_ON_EXISTING_PK
: Specifies the record collision
policy for inserting into a table
with a primary key. If set to
true
, any existing table record with primary
key values that match those of a record being inserted
will be replaced by that new record (the new
data will be 'upserted'). If set to false
,
any existing table record with primary key values that
match those of a record being inserted will
remain unchanged, while the new record will be rejected
and the error handled as determined by
ignore_existing_pk
& error_handling
. If
the
specified table does not have a primary key, then this
option has no effect.
Supported values:
TRUE
: Upsert new records when primary keys match
existing records
FALSE
: Reject new records when primary keys match
existing records
The default value is FALSE.
The default value is an empty Map.
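To make the options map above concrete, here is a hedged sketch of the five-argument constructor with a few populated entries (assuming the usual String constants on the nested Options class; the table name, file path, and date format are only illustrative):

```java
Map<String, String> options = new HashMap<>();
options.put(CreateTableExternalRequest.Options.FILE_TYPE,
            CreateTableExternalRequest.Options.DELIMITED_TEXT);
options.put(CreateTableExternalRequest.Options.ERROR_HANDLING,
            CreateTableExternalRequest.Options.PERMISSIVE);
// Interpret the source's order_date column as year.month.day
options.put(CreateTableExternalRequest.Options.COLUMN_FORMATS,
            "{ \"order_date\" : { \"date\" : \"%Y.%m.%d\" } }");

CreateTableExternalRequest request = new CreateTableExternalRequest(
        "ki_home.ext_sales",
        Arrays.asList("kifs://data/sales.csv"),
        new HashMap<String, Map<String, String>>(),  // modifyColumns: not implemented yet
        new HashMap<String, String>(),               // createTableOptions: infer the structure
        options);
```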
public static org.apache.avro.Schema getClassSchema()
public String getTableName()
public CreateTableExternalRequest setTableName(String tableName)
tableName
- Name of the table to be created, in
[schema_name.]table_name format, using
standard name resolution rules and meeting
table naming criteria.
Returns this to mimic the builder pattern.
public List<String> getFilepaths()
A list of file paths from which data will be sourced. If an external data source is specified in datasource_name, these file
paths must resolve to accessible files at that data source
location. Prefix matching is supported.
If the data source is hdfs, prefixes must be aligned with
directories, i.e. partial file names will
not match.
If no data source is specified, the files are assumed to be
local to the database and must all be
accessible to the gpudb user, residing on the path (or relative
to the path) specified by the
external files directory in the Kinetica
configuration file. Wildcards (*) can be used
to
specify a group of files. Prefix matching is supported, the
prefixes must be aligned with
directories.
If the first path ends in .tsv, the text delimiter will be
defaulted to a tab character.
If the first path ends in .psv, the text delimiter will be
defaulted to a pipe character (|).
public CreateTableExternalRequest setFilepaths(List<String> filepaths)
filepaths
- A list of file paths from which data will be sourced;
For paths in KiFS, use the uri prefix of kifs://
followed by the path to
a file or directory. File matching by prefix is
supported, e.g. kifs://dir/file would match dir/file_1
and dir/file_2. When prefix matching is used, the path
must start with a full, valid KiFS directory name.
If an external data source is specified in datasource_name
, these file
paths must resolve to accessible files at that data
source location. Prefix matching is supported.
If the data source is hdfs, prefixes must be aligned
with directories, i.e. partial file names will
not match.
If no data source is specified, the files are assumed
to be local to the database and must all be
accessible to the gpudb user, residing on the path (or
relative to the path) specified by the
external files directory in the Kinetica
configuration file. Wildcards (*)
can be used to
specify a group of files. Prefix matching is
supported, the prefixes must be aligned with
directories.
If the first path ends in .tsv, the text delimiter
will be defaulted to a tab character.
If the first path ends in .psv, the text delimiter
will be defaulted to a pipe character (|).
Returns this to mimic the builder pattern.
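A brief sketch of setting KiFS paths, including a wildcard group (directory and file names are placeholders; kifs://data/2023 is assumed to be an existing KiFS directory):

```java
CreateTableExternalRequest request = new CreateTableExternalRequest();
request.setFilepaths(Arrays.asList(
        "kifs://data/2023/orders.psv",     // single file; a .psv first path defaults the delimiter to '|'
        "kifs://data/2023/archive/part_*"  // wildcard selecting a group of files
));
```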
public Map<String,Map<String,String>> getModifyColumns()
Not implemented yet. The default value is an empty Map.
public CreateTableExternalRequest setModifyColumns(Map<String,Map<String,String>> modifyColumns)
modifyColumns - Not implemented yet. The default value is an empty Map.
Returns this to mimic the builder pattern.
public Map<String,String> getCreateTableOptions()
Options from GPUdb.createTable(CreateTableRequest), allowing the
structure of the table to
be defined independently of the data source
TYPE_ID
: ID of a currently registered type.
NO_ERROR_IF_EXISTS
: If true
,
prevents an error from occurring if the table already exists and
is of the given type. If a table with
the same name but a different type exists, it is still an error.
Supported values:
The default value is FALSE
.
IS_REPLICATED
: Affects the distribution scheme
for the table's data. If true
and the
given table has no explicit shard key defined, the
table will be replicated. If
false
, the table will be
sharded according to the shard key specified
in the
given type_id
, or
randomly sharded, if no shard key is
specified.
Note that a type containing a shard key cannot be used to create
a replicated table.
Supported values:
The default value is FALSE
.
FOREIGN_KEYS
: Semicolon-separated list of
foreign keys, of the format
'(source_column_name [, ...]) references
target_table_name(primary_key_column_name [, ...]) [as
foreign_key_name]'.
FOREIGN_SHARD_KEY
: Foreign shard key of the format
'source_column references shard_by_column from
target_table(primary_key_column)'.
PARTITION_TYPE
: Partitioning scheme to use.
Supported values:
RANGE
: Use range partitioning.
INTERVAL
: Use interval partitioning.
LIST
: Use list partitioning.
HASH
: Use hash partitioning.
SERIES
: Use series partitioning.
PARTITION_KEYS
: Comma-separated list of partition keys, which
are the columns or
column expressions by which records will be assigned to
partitions defined by
partition_definitions
.
PARTITION_DEFINITIONS
: Comma-separated list of partition
definitions, whose format depends
on the choice of partition_type
. See
range partitioning,
interval partitioning,
list partitioning,
hash partitioning, or
series partitioning for example formats.
IS_AUTOMATIC_PARTITION
: If true
,
a new partition will be created for values which don't fall into
an existing partition. Currently,
only supported for list partitions.
Supported values:
The default value is FALSE
.
TTL
: Sets the TTL of the table specified in tableName
.
CHUNK_SIZE
: Indicates the number of records per chunk to be
used for this table.
IS_RESULT_TABLE
: Indicates whether the table is a
memory-only table. A result table cannot
contain
columns with store_only or text_search
data-handling or that are
non-charN strings, and it will not be retained
if
the server is restarted.
Supported values:
The default value is FALSE
.
STRATEGY_DEFINITION
: The tier strategy
for the table and its columns.
The default value is an empty Map.
public CreateTableExternalRequest setCreateTableOptions(Map<String,String> createTableOptions)
createTableOptions
- Options from GPUdb.createTable(CreateTableRequest)
,
allowing the structure of the table to
be defined independently of the data source
TYPE_ID
: ID of a currently registered type.
NO_ERROR_IF_EXISTS
: If true
,
prevents an error from occurring if the table
already exists and is of the given type. If
a table with
the same name but a different type exists, it
is still an error.
Supported values:
The default value is FALSE
.
IS_REPLICATED
: Affects the distribution scheme
for the table's data. If true
and
the
given table has no explicit shard key defined, the
table will be replicated. If
false
, the table will be
sharded according to the
shard key specified in the
given type_id
, or
randomly sharded, if no
shard key is specified.
Note that a type containing a shard key
cannot be used to create a replicated table.
Supported values:
The default value is FALSE
.
FOREIGN_KEYS
: Semicolon-separated list of
foreign keys, of the format
'(source_column_name [, ...]) references
target_table_name(primary_key_column_name [,
...]) [as foreign_key_name]'.
FOREIGN_SHARD_KEY
: Foreign shard key of the
format
'source_column references shard_by_column
from target_table(primary_key_column)'.
PARTITION_TYPE
: Partitioning scheme to use.
Supported values:
RANGE
: Use range partitioning.
INTERVAL
: Use interval partitioning.
LIST
: Use list partitioning.
HASH
: Use hash partitioning.
SERIES
: Use series partitioning.
PARTITION_KEYS
: Comma-separated list of
partition keys, which are the columns or
column expressions by which records will be
assigned to partitions defined by
partition_definitions
.
PARTITION_DEFINITIONS
: Comma-separated list
of partition definitions, whose format
depends
on the choice of partition_type
. See
range partitioning,
interval partitioning,
list partitioning,
hash partitioning, or
series partitioning for
example formats.
IS_AUTOMATIC_PARTITION
: If true
,
a new partition will be created for values
which don't fall into an existing partition.
Currently,
only supported for list partitions.
Supported values:
The default value is FALSE
.
TTL
: Sets the TTL of the table specified
in tableName
.
CHUNK_SIZE
: Indicates the number of records
per chunk to be used for this table.
IS_RESULT_TABLE
: Indicates whether the table
is a
memory-only table. A result
table cannot contain
columns with store_only or text_search
data-handling or that are
non-charN strings, and it
will not be retained if
the server is restarted.
Supported values:
The default value is FALSE
.
STRATEGY_DEFINITION
: The tier strategy
for the table and its columns.
The default value is an empty Map.
Returns this to mimic the builder pattern.
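As an illustration of setCreateTableOptions, a sketch assuming the usual String constants on the nested CreateTableOptions class; the type ID placeholder stands for an ID returned by a prior type registration, and request refers to a CreateTableExternalRequest built as in the earlier sketches:

```java
Map<String, String> createTableOptions = new HashMap<>();
// Use an explicitly registered type instead of inferring the structure
createTableOptions.put(CreateTableExternalRequest.CreateTableOptions.TYPE_ID,
                       "<registered type id>");
// Don't error out if a table of this name and type already exists
createTableOptions.put(CreateTableExternalRequest.CreateTableOptions.NO_ERROR_IF_EXISTS,
                       CreateTableExternalRequest.CreateTableOptions.TRUE);
// Replicate the table across the cluster rather than sharding it
createTableOptions.put(CreateTableExternalRequest.CreateTableOptions.IS_REPLICATED,
                       CreateTableExternalRequest.CreateTableOptions.TRUE);

request.setCreateTableOptions(createTableOptions);
```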
public Map<String,String> getOptions()
BAD_RECORD_TABLE_NAME
: Name of a table to which records that
were rejected are written.
The bad-record-table has the following columns: line_number
(long), line_rejected (string),
error_message (string). When error_handling
is
abort
, bad records table is not populated.
BAD_RECORD_TABLE_LIMIT
: A positive integer indicating the
maximum number of records that can be
written to the bad-record-table. The default value is '10000'.
BAD_RECORD_TABLE_LIMIT_PER_INPUT
: For subscriptions, a positive
integer indicating the maximum number
of records that can be written to the bad-record-table per
file/payload. Default value will be
bad_record_table_limit
and total size of the table per
rank is limited to
bad_record_table_limit
.
BATCH_SIZE
: Number of records to insert per batch when
inserting data. The default value is '50000'.
COLUMN_FORMATS
: For each target column specified, applies the
column-property-bound
format to the source data loaded into that column. Each column
format will contain a mapping of one
or more of its column properties to an appropriate format for
each property. Currently supported
column properties include date, time, & datetime. The parameter
value must be formatted as a JSON
string of maps of column names to maps of column properties to
their corresponding column formats,
e.g.,
'{ "order_date" : { "date" : "%Y.%m.%d" }, "order_time" : {
"time" : "%H:%M:%S" } }'.
See default_column_formats
for valid format syntax.
COLUMNS_TO_LOAD
: Specifies a comma-delimited list of columns
from the source data to
load. If more than one file is being loaded, this list applies
to all files.
Column numbers can be specified discretely or as a range. For
example, a value of '5,7,1..3' will
insert values from the fifth column in the source data into the
first column in the target table,
from the seventh column in the source data into the second
column in the target table, and from the
first through third columns in the source data into the third
through fifth columns in the target
table.
If the source data contains a header, column names matching the
file header names may be provided
instead of column numbers. If the target table doesn't exist,
the table will be created with the
columns in this order. If the target table does exist with
columns in a different order than the
source data, this list can be used to match the order of the
target table. For example, a value of
'C, B, A' will create a three column table with column C,
followed by column B, followed by column
A; or will insert those fields in that order into a table
created with columns in that order. If
the target table exists, the column names must match the source
data field names for a name-mapping
to be successful.
Mutually exclusive with columns_to_skip
.
COLUMNS_TO_SKIP
: Specifies a comma-delimited list of columns
from the source data to
skip. Mutually exclusive with columns_to_load
.
COMPRESSION_TYPE
: Source data compression type
Supported values:
NONE
: No compression.
AUTO
: Auto detect compression type
GZIP
: gzip file compression.
BZIP2
: bzip2 file compression.
AUTO
.
DATASOURCE_NAME
: Name of an existing external data source from
which data file(s) specified in filepaths
will be loaded
DEFAULT_COLUMN_FORMATS
: Specifies the default format to be
applied to source data loaded
into columns with the corresponding column property. Currently
supported column properties include
date, time, & datetime. This default column-property-bound
format can be overridden by specifying a
column property & format for a given target column in column_formats
. For
each specified annotation, the format will apply to all columns
with that annotation unless a custom
column_formats
for that annotation is specified.
The parameter value must be formatted as a JSON string that is a
map of column properties to their
respective column formats, e.g., '{ "date" : "%Y.%m.%d", "time"
: "%H:%M:%S" }'. Column
formats are specified as a string of control characters and
plain text. The supported control
characters are 'Y', 'm', 'd', 'H', 'M', 'S', and 's', which
follow the Linux 'strptime()'
specification, as well as 's', which specifies seconds and
fractional seconds (though the fractional
component will be truncated past milliseconds).
Formats for the 'date' annotation must include the 'Y', 'm', and
'd' control characters. Formats for
the 'time' annotation must include the 'H', 'M', and either 'S'
or 's' (but not both) control
characters. Formats for the 'datetime' annotation meet both the
'date' and 'time' control character
requirements. For example, '{"datetime" : "%m/%d/%Y %H:%M:%S" }'
would be used to interpret text
as "05/04/2000 12:12:11"
ERROR_HANDLING
: Specifies how errors should be handled upon
insertion.
Supported values:
PERMISSIVE
: Records with missing columns are populated with
nulls if possible; otherwise, the malformed records are skipped.
IGNORE_BAD_RECORDS
: Malformed records are skipped.
ABORT
: Stops current insertion and aborts entire operation when
an error is encountered. Primary key collisions are considered
abortable errors in this mode.
ABORT
.
EXTERNAL_TABLE_TYPE
: Specifies whether the external table holds
a local copy of the external data.
Supported values:
MATERIALIZED
: Loads a copy of the external data into the
database, refreshed on demand
LOGICAL
: External data will not be loaded into the database;
the data will be retrieved from the source upon servicing each
query against the external table
MATERIALIZED
.
FILE_TYPE
: Specifies the type of the file(s) whose records will
be inserted.
Supported values:
AVRO
: Avro file format
DELIMITED_TEXT
: Delimited text file format; e.g., CSV, TSV,
PSV, etc.
GDB
:
Esri/GDB file format
JSON
: Json file format
PARQUET
: Apache Parquet file format
SHAPEFILE
: ShapeFile file format
DELIMITED_TEXT
.
GDAL_CONFIGURATION_OPTIONS
: Comma-separated list of GDAL configuration options for the specific request: key=value
IGNORE_EXISTING_PK
: Specifies the record collision
error-suppression policy for
inserting into a table with a primary key, only used when
not in upsert mode (upsert mode is disabled when update_on_existing_pk
is
false
). If set to
true
, any record being inserted that is rejected
for having primary key values that match those of an existing
table record will be ignored with no
error generated. If false
, the rejection of any
record for having primary key values matching an existing record
will result in an error being
reported, as determined by error_handling
. If the
specified table does not
have a primary key or if upsert mode is in effect (update_on_existing_pk
is
true
), then this option has no effect.
Supported values:
TRUE
: Ignore new records whose primary key values collide with
those of existing records
FALSE
: Treat as errors any new records whose primary key values
collide with those of existing records
FALSE
.
INGESTION_MODE
: Whether to do a full load, dry run, or perform
a type inference on the source data.
Supported values:
FULL
: Run a type inference on the source data (if needed) and
ingest
DRY_RUN
: Does not load data, but walks through the source data
and determines the number of valid records, taking into account
the current mode of error_handling
.
TYPE_INFERENCE_ONLY
: Infer the type of the source data and
return, without ingesting any data. The inferred type is
returned in the response.
FULL
.
JDBC_FETCH_SIZE
: The JDBC fetch size, which determines how many
rows to fetch per round trip. The default value is '50000'.
KAFKA_CONSUMERS_PER_RANK
: Number of Kafka consumer threads per
rank (valid range 1-6). The default value is '1'.
KAFKA_GROUP_ID
: The group id to be used when consuming data
from a Kafka topic (valid only for Kafka datasource
subscriptions).
KAFKA_OFFSET_RESET_POLICY
: Policy to determine whether the
Kafka data consumption starts either at earliest offset or
latest offset.
Supported values:
The default value is EARLIEST
.
KAFKA_OPTIMISTIC_INGEST
: Enable optimistic ingestion where
Kafka topic offsets and table data are committed independently
to achieve parallelism.
Supported values:
The default value is FALSE
.
KAFKA_SUBSCRIPTION_CANCEL_AFTER
: Sets the Kafka subscription
lifespan (in minutes). Expired subscription will be cancelled
automatically.
KAFKA_TYPE_INFERENCE_FETCH_TIMEOUT
: Maximum time to collect
Kafka messages before type inferencing on the set of them.
LAYER
: Geo files layer(s) name(s): comma separated.
LOADING_MODE
: Scheme for distributing the extraction and
loading of data from the source data file(s). This option
applies only when loading files that are local to the database
Supported values:
HEAD
: The head node loads all data. All files must be available
to the head node.
DISTRIBUTED_SHARED
: The head node coordinates loading data by
worker
processes across all nodes from shared files available to all
workers.
NOTE:
Instead of existing on a shared source, the files can be
duplicated on a source local to each host
to improve performance, though the files must appear as the same
data set from the perspective of
all hosts performing the load.
DISTRIBUTED_LOCAL
: A single worker process on each node loads
all files
that are available to it. This option works best when each
worker loads files from its own file
system, to maximize performance. In order to avoid data
duplication, either each worker performing
the load needs to have visibility to a set of files unique to it
(no file is visible to more than
one node) or the target table needs to have a primary key (which
will allow the worker to
automatically deduplicate data).
NOTE:
If the target table doesn't exist, the table structure will be
determined by the head node. If the
head node has no files local to it, it will be unable to
determine the structure and the request
will fail.
If the head node is configured to have no worker processes, no
data strictly accessible to the head
node will be loaded.
HEAD
.
LOCAL_TIME_OFFSET
: Apply an offset to Avro local timestamp
columns.
MAX_RECORDS_TO_LOAD
: Limit the number of records to load in
this request: if this number
is larger than batch_size
, then the number of records
loaded will be
limited to the next whole number of batch_size
(per
working thread).
NUM_TASKS_PER_RANK
: Number of tasks for reading file per rank.
Default will be system configuration parameter,
external_file_reader_num_tasks.
POLL_INTERVAL
: If true
, the number of
seconds between attempts to load external files into the table.
If zero, polling will be continuous
as long as data is found. If no data is found, the interval
will steadily increase to a maximum of
60 seconds. The default value is '0'.
PRIMARY_KEYS
: Comma separated list of column names to set as
primary keys, when not specified in the type.
REFRESH_METHOD
: Method by which the table can be refreshed from
its source data.
Supported values:
MANUAL
: Refresh only occurs when manually requested by invoking
the refresh action of GPUdb.alterTable(AlterTableRequest)
on this table.
ON_START
: Refresh table on database startup and when manually
requested by invoking the refresh action of GPUdb.alterTable(AlterTableRequest)
on this table.
MANUAL
.
SCHEMA_REGISTRY_SCHEMA_NAME
: Name of the Avro schema in the
schema registry to use when reading Avro records.
SHARD_KEYS
: Comma separated list of column names to set as
shard keys, when not specified in the type.
SKIP_LINES
: Skip this number of lines from the beginning of the file.
SUBSCRIBE
: Continuously poll the data source to check for new
data and load it into the table.
Supported values:
The default value is FALSE
.
TABLE_INSERT_MODE
: Insertion scheme to use when inserting
records from multiple shapefiles.
Supported values:
SINGLE
: Insert all records into a single table.
TABLE_PER_FILE
: Insert records from each file into a new table
corresponding to that file.
SINGLE
.
TEXT_COMMENT_STRING
: Specifies the character string that should
be interpreted as a comment line
prefix in the source data. All lines in the data starting with
the provided string are ignored.
For delimited_text
file_type
only. The default
value is '#'.
TEXT_DELIMITER
: Specifies the character delimiting field values
in the source data
and field names in the header (if present).
For delimited_text
file_type
only. The default
value is ','.
TEXT_ESCAPE_CHARACTER
: Specifies the character that is used to
escape other characters in
the source data.
An 'a', 'b', 'f', 'n', 'r', 't', or 'v' preceded by an escape
character will be interpreted as the
ASCII bell, backspace, form feed, line feed, carriage return,
horizontal tab, & vertical tab,
respectively. For example, the escape character followed by an
'n' will be interpreted as a newline
within a field value.
The escape character can also be used to escape the quoting
character, and will be treated as an
escape character whether it is within a quoted field value or
not.
For delimited_text
file_type
only.
TEXT_HAS_HEADER
: Indicates whether the source data contains a
header row.
For delimited_text
file_type
only.
Supported values:
The default value is TRUE
.
TEXT_HEADER_PROPERTY_DELIMITER
: Specifies the delimiter for
column properties in the header row (if
present). Cannot be set to same value as text_delimiter
.
For delimited_text
file_type
only. The default
value is '|'.
TEXT_NULL_STRING
: Specifies the character string that should be
interpreted as a null
value in the source data.
For delimited_text
file_type
only. The default
value is '\\N'.
TEXT_QUOTE_CHARACTER
: Specifies the character that should be
interpreted as a field value
quoting character in the source data. The character must appear
at beginning and end of field value
to take effect. Delimiters within quoted fields are treated as
literals and not delimiters. Within
a quoted field, two consecutive quote characters will be
interpreted as a single literal quote
character, effectively escaping it. To not have a quote
character, specify an empty string.
For delimited_text
file_type
only. The default
value is '"'.
TEXT_SEARCH_COLUMNS
: Add 'text_search' property to internally
inferred string columns.
Comma-separated list of column names or '*' for all columns. To
add 'text_search' property only to
string columns greater than or equal to a minimum size, also set
the
text_search_min_column_length
TEXT_SEARCH_MIN_COLUMN_LENGTH
: Set the minimum column size for
strings to apply the 'text_search' property to. Used only when
text_search_columns
has a value.
TRUNCATE_STRINGS
: If set to true
, truncate string
values that are longer than the column's type size.
Supported values:
The default value is FALSE
.
TRUNCATE_TABLE
: If set to true
, truncates the table
specified by tableName
prior to loading the file(s).
Supported values:
The default value is FALSE
.
TYPE_INFERENCE_MODE
: Optimize type inferencing for either speed
or accuracy.
Supported values:
ACCURACY
: Scans data to get exactly-typed & sized columns for
all data scanned.
SPEED
: Scans data and picks the widest possible column types so
that 'all' values will fit with minimum data scanned
SPEED
.
REMOTE_QUERY
: Remote SQL query from which data will be sourced
REMOTE_QUERY_FILTER_COLUMN
: Name of column to be used for
splitting remote_query
into multiple sub-queries using
the data distribution of given column
REMOTE_QUERY_INCREASING_COLUMN
: Column on subscribed remote
query result that will increase for new records (e.g.,
TIMESTAMP).
REMOTE_QUERY_PARTITION_COLUMN
: Alias name for remote_query_filter_column
.
UPDATE_ON_EXISTING_PK
: Specifies the record collision policy
for inserting into a table
with a primary key. If set to
true
, any existing table record with primary
key values that match those of a record being inserted will be
replaced by that new record (the new
data will be 'upserted'). If set to false
,
any existing table record with primary key values that match
those of a record being inserted will
remain unchanged, while the new record will be rejected and the
error handled as determined by
ignore_existing_pk
& error_handling
. If the
specified table does not have a primary key, then this option
has no effect.
Supported values:
TRUE
: Upsert new records when primary keys match existing
records
FALSE
: Reject new records when primary keys match existing
records
The default value is FALSE.
The default value is an empty Map.
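A hedged sketch of setOptions for loading from a pre-defined external data source with subscription polling (the data source name is a placeholder for one created beforehand, and request is assumed built as in the earlier sketches):

```java
Map<String, String> options = new HashMap<>();
options.put(CreateTableExternalRequest.Options.DATASOURCE_NAME, "order_stream_ds");
// Keep polling the data source for new data...
options.put(CreateTableExternalRequest.Options.SUBSCRIBE,
            CreateTableExternalRequest.Options.TRUE);
// ...checking every 60 seconds
options.put(CreateTableExternalRequest.Options.POLL_INTERVAL, "60");

request.setOptions(options);
```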
public CreateTableExternalRequest setOptions(Map<String,String> options)
options
- Optional parameters.
BAD_RECORD_TABLE_NAME
: Name of a table to which records
that were rejected are written.
The bad-record-table has the following columns:
line_number (long), line_rejected (string),
error_message (string). When error_handling
is
abort
, bad records table is not populated.
BAD_RECORD_TABLE_LIMIT
: A positive integer indicating
the maximum number of records that can be
written to the bad-record-table. The default value is
'10000'.
BAD_RECORD_TABLE_LIMIT_PER_INPUT
: For subscriptions, a
positive integer indicating the maximum number
of records that can be written to the bad-record-table
per file/payload. Default value will be
bad_record_table_limit
and total size of the
table per rank is limited to
bad_record_table_limit
.
BATCH_SIZE
: Number of records to insert per batch when
inserting data. The default value is '50000'.
COLUMN_FORMATS
: For each target column specified,
applies the column-property-bound
format to the source data loaded into that column. Each
column format will contain a mapping of one
or more of its column properties to an appropriate
format for each property. Currently supported
column properties include date, time, & datetime. The
parameter value must be formatted as a JSON
string of maps of column names to maps of column
properties to their corresponding column formats,
e.g.,
'{ "order_date" : { "date" : "%Y.%m.%d" }, "order_time"
: { "time" : "%H:%M:%S" } }'.
See default_column_formats
for valid format
syntax.
COLUMNS_TO_LOAD
: Specifies a comma-delimited list of
columns from the source data to
load. If more than one file is being loaded, this list
applies to all files.
Column numbers can be specified discretely or as a
range. For example, a value of '5,7,1..3' will
insert values from the fifth column in the source data
into the first column in the target table,
from the seventh column in the source data into the
second column in the target table, and from the
first through third columns in the source data into the
third through fifth columns in the target
table.
If the source data contains a header, column names
matching the file header names may be provided
instead of column numbers. If the target table doesn't
exist, the table will be created with the
columns in this order. If the target table does exist
with columns in a different order than the
source data, this list can be used to match the order of
the target table. For example, a value of
'C, B, A' will create a three column table with column
C, followed by column B, followed by column
A; or will insert those fields in that order into a
table created with columns in that order. If
the target table exists, the column names must match the
source data field names for a name-mapping
to be successful.
Mutually exclusive with columns_to_skip
.
COLUMNS_TO_SKIP
: Specifies a comma-delimited list of
columns from the source data to
skip. Mutually exclusive with columns_to_load
.
COMPRESSION_TYPE
: Source data compression type
Supported values:
NONE
: No compression.
AUTO
: Auto detect compression type
GZIP
: gzip file compression.
BZIP2
: bzip2 file compression.
AUTO
.
DATASOURCE_NAME
: Name of an existing external data
source from which data file(s) specified in filepaths
will be loaded
DEFAULT_COLUMN_FORMATS
: Specifies the default format to
be applied to source data loaded
into columns with the corresponding column property.
Currently supported column properties include
date, time, & datetime. This default
column-property-bound format can be overridden by
specifying a
column property & format for a given target column in
column_formats
. For
each specified annotation, the format will apply to all
columns with that annotation unless a custom
column_formats
for that annotation is specified.
The parameter value must be formatted as a JSON string
that is a map of column properties to their
respective column formats, e.g., '{ "date" : "%Y.%m.%d",
"time" : "%H:%M:%S" }'. Column
formats are specified as a string of control characters
and plain text. The supported control
characters are 'Y', 'm', 'd', 'H', 'M', 'S', and 's',
which follow the Linux 'strptime()'
specification, as well as 's', which specifies seconds
and fractional seconds (though the fractional
component will be truncated past milliseconds).
Formats for the 'date' annotation must include the 'Y',
'm', and 'd' control characters. Formats for
the 'time' annotation must include the 'H', 'M', and
either 'S' or 's' (but not both) control
characters. Formats for the 'datetime' annotation meet
both the 'date' and 'time' control character
requirements. For example, '{"datetime" : "%m/%d/%Y
%H:%M:%S" }' would be used to interpret text
as "05/04/2000 12:12:11"
ERROR_HANDLING
: Specifies how errors should be handled
upon insertion.
Supported values:
PERMISSIVE
: Records with missing columns are populated
with nulls if possible; otherwise, the malformed records
are skipped.
IGNORE_BAD_RECORDS
: Malformed records are skipped.
ABORT
: Stops current insertion and aborts entire
operation when an error is encountered. Primary key
collisions are considered abortable errors in this mode.
ABORT
.
EXTERNAL_TABLE_TYPE
: Specifies whether the external
table holds a local copy of the external data.
Supported values:
MATERIALIZED
: Loads a copy of the external data into
the database, refreshed on demand
LOGICAL
: External data will not be loaded into the
database; the data will be retrieved from the source
upon servicing each query against the external table
MATERIALIZED
.
FILE_TYPE
: Specifies the type of the file(s) whose
records will be inserted.
Supported values:
AVRO
: Avro file format
DELIMITED_TEXT
: Delimited text file format; e.g., CSV,
TSV, PSV, etc.
GDB
: Esri/GDB file format
JSON
: Json file format
PARQUET
: Apache Parquet file format
SHAPEFILE
: ShapeFile file format
DELIMITED_TEXT
.
GDAL_CONFIGURATION_OPTIONS
: Comma-separated list of
GDAL configuration options for the specific request: key=value
IGNORE_EXISTING_PK
: Specifies the record collision
error-suppression policy for
inserting into a table with a primary key, only used when
not in upsert mode (upsert mode is disabled when update_on_existing_pk
is
false
). If set to
true
, any record being inserted that is rejected
for having primary key values that match those of an
existing table record will be ignored with no
error generated. If false
, the rejection of any
record for having primary key values matching an
existing record will result in an error being
reported, as determined by error_handling
. If
the specified table does not
have a primary key or if upsert mode is in effect
(update_on_existing_pk
is
true
), then this option has no effect.
Supported values:
TRUE
: Ignore new records whose primary key values
collide with those of existing records
FALSE
: Treat as errors any new records whose primary
key values collide with those of existing records
FALSE
.
INGESTION_MODE
: Whether to do a full load, dry run, or
perform a type inference on the source data.
Supported values:
FULL
: Run a type inference on the source data (if
needed) and ingest
DRY_RUN
: Does not load data, but walks through the
source data and determines the number of valid records,
taking into account the current mode of error_handling
.
TYPE_INFERENCE_ONLY
: Infer the type of the source data
and return, without ingesting any data. The inferred
type is returned in the response.
FULL
.
JDBC_FETCH_SIZE
: The JDBC fetch size, which determines
how many rows to fetch per round trip. The default
value is '50000'.
KAFKA_CONSUMERS_PER_RANK
: Number of Kafka consumer
threads per rank (valid range 1-6). The default value
is '1'.
KAFKA_GROUP_ID
: The group id to be used when consuming
data from a Kafka topic (valid only for Kafka datasource
subscriptions).
KAFKA_OFFSET_RESET_POLICY
: Policy to determine whether
the Kafka data consumption starts either at earliest
offset or latest offset.
Supported values:
The default value is EARLIEST
.
KAFKA_OPTIMISTIC_INGEST
: Enable optimistic ingestion
where Kafka topic offsets and table data are committed
independently to achieve parallelism.
Supported values:
The default value is FALSE
.
KAFKA_SUBSCRIPTION_CANCEL_AFTER
: Sets the Kafka
subscription lifespan (in minutes). Expired subscription
will be cancelled automatically.
KAFKA_TYPE_INFERENCE_FETCH_TIMEOUT
: Maximum time to
collect Kafka messages before type inferencing on the
set of them.
LAYER
: Geo files layer(s) name(s): comma separated.
LOADING_MODE
: Scheme for distributing the extraction
and loading of data from the source data file(s). This
option applies only when loading files that are local to
the database
Supported values:
HEAD
: The head node loads all data. All files must be
available to the head node.
DISTRIBUTED_SHARED
: The head node coordinates loading
data by worker
processes across all nodes from shared files available
to all workers.
NOTE:
Instead of existing on a shared source, the files can be
duplicated on a source local to each host
to improve performance, though the files must appear as
the same data set from the perspective of
all hosts performing the load.
DISTRIBUTED_LOCAL
: A single worker process on each node
loads all files
that are available to it. This option works best when
each worker loads files from its own file
system, to maximize performance. In order to avoid data
duplication, either each worker performing
the load needs to have visibility to a set of files
unique to it (no file is visible to more than
one node) or the target table needs to have a primary
key (which will allow the worker to
automatically deduplicate data).
NOTE:
If the target table doesn't exist, the table structure
will be determined by the head node. If the
head node has no files local to it, it will be unable to
determine the structure and the request
will fail.
If the head node is configured to have no worker
processes, no data strictly accessible to the head
node will be loaded.
The default value is HEAD.
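The following sketch shows distributed_local loading, where each worker reads the files visible on its own file system and a primary key allows overlapping files to be deduplicated, per the note above. The endpoint, table, key columns, and per-host directory are assumptions.
// Minimal sketch of distributed_local loading with deduplication by primary key.
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import com.gpudb.GPUdb;
import com.gpudb.GPUdbException;
import com.gpudb.protocol.CreateTableExternalRequest;

public class DistributedLocalSketch {
    public static void main(String[] args) throws GPUdbException {
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        Map<String, String> options = new HashMap<>();
        options.put("loading_mode", "distributed_local");
        options.put("primary_keys", "sensor_id,ts");  // enables automatic deduplication

        gpudb.createTableExternal(new CreateTableExternalRequest(
                "example.sensor_readings_ext",
                Collections.singletonList("/data/readings/"),  // hypothetical per-host directory
                Collections.emptyMap(),
                Collections.emptyMap(),
                options));
    }
}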
LOCAL_TIME_OFFSET
: Apply an offset to Avro local
timestamp columns.
MAX_RECORDS_TO_LOAD
: Limit the number of records to
load in this request: if this number
is larger than batch_size
, then the number of
records loaded will be
limited to the next whole multiple of batch_size
(per working thread).
NUM_TASKS_PER_RANK
: Number of tasks for reading files per rank. The default
is the system configuration parameter
external_file_reader_num_tasks.
POLL_INTERVAL
: If true
, the number of
seconds between attempts to load external files into the
table. If zero, polling will be continuous
as long as data is found. If no data is found, the
interval will steadily increase to a maximum of
60 seconds. The default value is '0'.
PRIMARY_KEYS
: Comma separated list of column names to
set as primary keys, when not specified in the type.
REFRESH_METHOD
: Method by which the table can be
refreshed from its source data.
Supported values:
MANUAL
: Refresh only occurs when manually requested by
invoking the refresh action of GPUdb.alterTable(AlterTableRequest)
on this
table.
ON_START
: Refresh table on database startup and when
manually requested by invoking the refresh action of
GPUdb.alterTable(AlterTableRequest)
on
this table.
The default value is MANUAL.
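A manually refreshed external table might be created and later refreshed on demand as sketched below; the refresh call assumes the expanded-parameter alterTable overload with the "refresh" action referenced above, and the endpoint, table, and path are illustrative.
// Minimal sketch of refresh_method=manual followed by an on-demand refresh.
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import com.gpudb.GPUdb;
import com.gpudb.GPUdbException;
import com.gpudb.protocol.CreateTableExternalRequest;

public class ManualRefreshSketch {
    public static void main(String[] args) throws GPUdbException {
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        Map<String, String> options = new HashMap<>();
        options.put("refresh_method", "manual");  // refresh only when requested

        gpudb.createTableExternal(new CreateTableExternalRequest(
                "example.prices_ext",
                Collections.singletonList("kifs://data/prices.parquet"),
                Collections.emptyMap(),
                Collections.emptyMap(),
                options));

        // Later: re-pull the source data on demand.
        gpudb.alterTable("example.prices_ext", "refresh", "",
                Collections.<String, String>emptyMap());
    }
}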
SCHEMA_REGISTRY_SCHEMA_NAME
: Name of the Avro schema in
the schema registry to use when reading Avro records.
SHARD_KEYS
: Comma separated list of column names to set
as shard keys, when not specified in the type.
SKIP_LINES
: Number of lines to skip from the beginning of the file.
SUBSCRIBE
: Continuously poll the data source to check
for new data and load it into the table.
Supported values:
TRUE
FALSE
The default value is FALSE.
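As a sketch, subscribe and poll_interval can be combined for continuous file ingestion, with the source polled at a fixed interval; the specific values below are assumptions.
// Illustrative sketch of continuous ingestion options.
import java.util.HashMap;
import java.util.Map;

public class SubscribeOptionsSketch {
    public static Map<String, String> subscriptionOptions() {
        Map<String, String> options = new HashMap<>();
        options.put("subscribe", "true");    // keep checking the source for new data
        options.put("poll_interval", "60");  // seconds between load attempts
        return options;
    }
}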
TABLE_INSERT_MODE
: Insertion scheme to use when
inserting records from multiple shapefiles.
Supported values:
SINGLE
: Insert all records into a single table.
TABLE_PER_FILE
: Insert records from each file into a
new table corresponding to that file.
The default value is SINGLE.
TEXT_COMMENT_STRING
: Specifies the character string
that should be interpreted as a comment line
prefix in the source data. All lines in the data
starting with the provided string are ignored.
For delimited_text
file_type
only. The
default value is '#'.
TEXT_DELIMITER
: Specifies the character delimiting
field values in the source data
and field names in the header (if present).
For delimited_text
file_type
only. The
default value is ','.
TEXT_ESCAPE_CHARACTER
: Specifies the character that is
used to escape other characters in
the source data.
An 'a', 'b', 'f', 'n', 'r', 't', or 'v' preceded by an
escape character will be interpreted as the
ASCII bell, backspace, form feed, line feed, carriage
return, horizontal tab, & vertical tab,
respectively. For example, the escape character
followed by an 'n' will be interpreted as a newline
within a field value.
The escape character can also be used to escape the
quoting character, and will be treated as an
escape character whether it is within a quoted field
value or not.
For delimited_text
file_type
only.
TEXT_HAS_HEADER
: Indicates whether the source data
contains a header row.
For delimited_text
file_type
only.
Supported values:
TRUE
FALSE
The default value is TRUE.
TEXT_HEADER_PROPERTY_DELIMITER
: Specifies the delimiter
for
column properties in the header row
(if
present). Cannot be set to the same value as text_delimiter
.
For delimited_text
file_type
only. The
default value is '|'.
TEXT_NULL_STRING
: Specifies the character string that
should be interpreted as a null
value in the source data.
For delimited_text
file_type
only. The
default value is '\\N'.
TEXT_QUOTE_CHARACTER
: Specifies the character that
should be interpreted as a field value
quoting character in the source data. The character
must appear at beginning and end of field value
to take effect. Delimiters within quoted fields are
treated as literals and not delimiters. Within
a quoted field, two consecutive quote characters will be
interpreted as a single literal quote
character, effectively escaping it. To not have a quote
character, specify an empty string.
For delimited_text
file_type
only. The
default value is '"'.
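The delimited-text parsing options above might be configured as in the sketch below for a semicolon-delimited file with no header row and 'NULL' as the null marker; all of the specific values are assumptions.
// Illustrative sketch of delimited-text parsing options.
import java.util.HashMap;
import java.util.Map;

public class DelimitedTextOptionsSketch {
    public static Map<String, String> csvOptions() {
        Map<String, String> options = new HashMap<>();
        options.put("file_type", "delimited_text");
        options.put("text_delimiter", ";");         // field separator
        options.put("text_has_header", "false");    // no header row in the source
        options.put("text_comment_string", "#");    // ignore lines starting with '#'
        options.put("text_quote_character", "\"");  // standard double-quote quoting
        options.put("text_null_string", "NULL");    // treat 'NULL' as a null value
        return options;
    }
}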
TEXT_SEARCH_COLUMNS
: Add 'text_search' property to
internally inferred string columns.
Comma-separated list of column names or '*' for all
columns. To add 'text_search' property only to
string columns greater than or equal to a minimum size,
also set the
text_search_min_column_length
option.
TEXT_SEARCH_MIN_COLUMN_LENGTH
: Set the minimum column
size for strings to apply the 'text_search' property to.
Used only when text_search_columns
has a value.
TRUNCATE_STRINGS
: If set to true
, truncate
string values that are longer than the column's type
size.
Supported values:
TRUE
FALSE
The default value is FALSE.
TRUNCATE_TABLE
: If set to true
, truncates the
table specified by tableName
prior to loading
the file(s).
Supported values:
TRUE
FALSE
The default value is FALSE.
TYPE_INFERENCE_MODE
: Optimize type inferencing for
either speed or accuracy.
Supported values:
ACCURACY
: Scans data to get exactly-typed & sized
columns for all data scanned.
SPEED
: Scans data and picks the widest possible column
types so that 'all' values will fit with minimum data
scanned.
The default value is SPEED.
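To obtain an exact schema without loading any data, accuracy-mode inference can be paired with an inference-only ingestion mode, as in this sketch; the pairing shown is an illustrative choice, not a requirement.
// Illustrative sketch: infer an exactly-typed schema and return it, load nothing.
import java.util.HashMap;
import java.util.Map;

public class TypeInferenceOptionsSketch {
    public static Map<String, String> inferenceOptions() {
        Map<String, String> options = new HashMap<>();
        options.put("ingestion_mode", "type_inference_only");  // return the inferred type only
        options.put("type_inference_mode", "accuracy");        // scan for exactly-typed columns
        return options;
    }
}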
REMOTE_QUERY
: Remote SQL query from which data will be
sourced.
REMOTE_QUERY_FILTER_COLUMN
: Name of column to be used
for splitting remote_query
into multiple
sub-queries using the data distribution of the given column.
REMOTE_QUERY_INCREASING_COLUMN
: Column on subscribed
remote query result that will increase for new records
(e.g., TIMESTAMP).
REMOTE_QUERY_PARTITION_COLUMN
: Alias name for remote_query_filter_column
.
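The remote-query options above might be used as sketched below to source data from a pre-defined JDBC data source, split into sub-queries on a filter column. The data source name, query, column names, and endpoint are assumptions, as is leaving the file path list empty when remote_query supplies the data.
// Minimal sketch of ingesting from a remote SQL query over a JDBC data source.
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import com.gpudb.GPUdb;
import com.gpudb.GPUdbException;
import com.gpudb.protocol.CreateTableExternalRequest;

public class RemoteQuerySketch {
    public static void main(String[] args) throws GPUdbException {
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        Map<String, String> options = new HashMap<>();
        options.put("datasource_name", "jdbc_ds");                   // assumed JDBC data source
        options.put("remote_query", "SELECT * FROM public.orders");  // hypothetical source query
        options.put("remote_query_filter_column", "order_id");       // split into sub-queries
        options.put("jdbc_fetch_size", "100000");                    // rows per round trip

        gpudb.createTableExternal(new CreateTableExternalRequest(
                "example.orders_remote_ext",
                Collections.<String>emptyList(),  // assumption: no file paths with a remote query
                Collections.emptyMap(),
                Collections.emptyMap(),
                options));
    }
}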
UPDATE_ON_EXISTING_PK
: Specifies the record collision
policy for inserting into a table
with a primary key. If set to
true
, any existing table record with primary
key values that match those of a record being inserted
will be replaced by that new record (the new
data will be 'upserted'). If set to false
,
any existing table record with primary key values that
match those of a record being inserted will
remain unchanged, while the new record will be rejected
and the error handled as determined by
ignore_existing_pk
& error_handling
. If
the
specified table does not have a primary key, then this
option has no effect.
Supported values:
TRUE
: Upsert new records when primary keys match
existing records
FALSE
: Reject new records when primary keys match
existing records
The default value is FALSE.
The default value is an empty Map.
Returns:
this to mimic the builder pattern.
public org.apache.avro.Schema getSchema()
Specified by:
getSchema in interface org.apache.avro.generic.GenericContainer
public Object get(int index)
Specified by:
get in interface org.apache.avro.generic.IndexedRecord
Parameters:
index - the position of the field to get
Throws:
IndexOutOfBoundsException
public void put(int index, Object value)
Specified by:
put in interface org.apache.avro.generic.IndexedRecord
Parameters:
index - the position of the field to set
value - the value to set
Throws:
IndexOutOfBoundsException
Copyright © 2024. All rights reserved.