public static final class InsertRecordsFromFilesRequest.Options extends Object
BATCH_SIZE
: Specifies the number of records to process before inserting.
COLUMN_FORMATS
: For each target column specified, applies the column-property-bound format
to the source data loaded into that column. Each column format will contain a
mapping of one or more of its column properties to an appropriate format for
each property. Currently supported column properties include date, time, &
datetime. The parameter value must be formatted as a JSON string of maps of
column names to maps of column properties to their corresponding column
formats, e.g., { "order_date" : { "date" : "%Y.%m.%d" }, "order_time" :
{ "time" : "%H:%M:%S" } }. See default_column_formats for valid format syntax.
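As a minimal sketch, the JSON value above could be supplied through this
option key from the Java API documented on this page (the package
com.gpudb.protocol and the column names order_date and order_time are
assumptions for illustration):

```java
import java.util.HashMap;
import java.util.Map;

import com.gpudb.protocol.InsertRecordsFromFilesRequest;

// Options map passed along with the insert-from-files request.
Map<String, String> options = new HashMap<>();

// Per-column formats: map each column name to a property/format pair.
options.put(
    InsertRecordsFromFilesRequest.Options.COLUMN_FORMATS,
    "{ \"order_date\" : { \"date\" : \"%Y.%m.%d\" }, "
        + "\"order_time\" : { \"time\" : \"%H:%M:%S\" } }");
```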
COLUMNS_TO_LOAD
: For delimited_text file_type only. Specifies a comma-delimited list of
column positions or names to load instead of loading all columns in the
file(s); if more than one file is being loaded, the list of columns will
apply to all files. Column numbers can be specified discretely or as a range,
e.g., a value of '5,7,1..3' will create a table with the first column in the
table being the fifth column in the file, followed by the seventh column in
the file, then the first column through the fourth column in the file.
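For instance, a sketch of selecting source columns by position, adding to the
same hypothetical options map introduced above:

```java
// Load only selected source columns, using the range syntax described above.
options.put(InsertRecordsFromFilesRequest.Options.COLUMNS_TO_LOAD, "5,7,1..3");
```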
DEFAULT_COLUMN_FORMATS
: Specifies the default format to be applied to source data loaded into
columns with the corresponding column property. This default
column-property-bound format can be overridden by specifying a column
property & format for a given target column in column_formats. For each
specified annotation, the format will apply to all columns with that
annotation unless a custom column_formats for that annotation is specified.
The parameter value must be formatted as a JSON string that is a map of
column properties to their respective column formats, e.g.,
{ "date" : "%Y.%m.%d", "time" : "%H:%M:%S" }. Column formats are specified as
a string of control characters and plain text. The supported control
characters are 'Y', 'm', 'd', 'H', 'M', and 'S', which follow the Linux
'strptime()' specification, as well as 's', which specifies seconds and
fractional seconds (though the fractional component will be truncated past
milliseconds). Formats for the 'date' annotation must include the 'Y', 'm',
and 'd' control characters. Formats for the 'time' annotation must include
the 'H', 'M', and either 'S' or 's' (but not both) control characters.
Formats for the 'datetime' annotation must meet both the 'date' and 'time'
control character requirements. For example, '{"datetime" : "%m/%d/%Y
%H:%M:%S" }' would be used to interpret text as "05/04/2000 12:12:11".
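As a sketch, a default format for all date and time columns, overridden for
one hypothetical column via column_formats (same assumed options map as
above):

```java
// Default formats applied to every column with the 'date' or 'time' property.
options.put(
    InsertRecordsFromFilesRequest.Options.DEFAULT_COLUMN_FORMATS,
    "{ \"date\" : \"%Y.%m.%d\", \"time\" : \"%H:%M:%S\" }");

// Override the default for one specific (hypothetical) column.
options.put(
    InsertRecordsFromFilesRequest.Options.COLUMN_FORMATS,
    "{ \"order_date\" : { \"date\" : \"%m/%d/%Y\" } }");
```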
DRY_RUN
: If set to true, no data will be inserted, but the file will be read with
the applied error_handling mode and the number of valid records that would
normally be inserted is returned.
Supported values: TRUE, FALSE.
The default value is FALSE.
ERROR_HANDLING
: Specifies how errors should be handled upon insertion.
Supported values:
PERMISSIVE
: Records with missing columns are populated with nulls if possible;
otherwise, the malformed records are skipped.
IGNORE_BAD_RECORDS
: Malformed records are skipped.
ABORT
: Stops current insertion and aborts entire operation when an error is
encountered.
The default value is PERMISSIVE.
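For example, a dry-run validation sketch combining the two options above
(again extending the hypothetical options map):

```java
// Validate the file(s) without inserting: report how many records would load
// under permissive error handling.
options.put(InsertRecordsFromFilesRequest.Options.DRY_RUN,
            InsertRecordsFromFilesRequest.Options.TRUE);
options.put(InsertRecordsFromFilesRequest.Options.ERROR_HANDLING,
            InsertRecordsFromFilesRequest.Options.PERMISSIVE);
```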
FILE_TYPE
: File type for the file(s).
Supported values:
DELIMITED_TEXT
: Indicates the file(s) are in delimited text format, e.g., CSV, TSV, PSV,
etc.
The default value is DELIMITED_TEXT.
LOADING_MODE
: Specifies how to divide data loading among nodes.
Supported values:
HEAD
: The head node loads all data. All files must be available on the head node.
DISTRIBUTED_SHARED
: The worker nodes coordinate loading a set of files that are available to
all of them. All files must be available on all nodes. This option is best
when there is a shared file system.
DISTRIBUTED_LOCAL
: Each worker node loads all files that are available to it. This option is
best when each worker node has its own file system.
The default value is HEAD.
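For instance, when the files sit on a file system visible to every node, the
workers can be asked to coordinate the load (a sketch, same assumed options
map):

```java
// Files are available to all worker nodes on a shared file system.
options.put(InsertRecordsFromFilesRequest.Options.LOADING_MODE,
            InsertRecordsFromFilesRequest.Options.DISTRIBUTED_SHARED);
```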
TEXT_COMMENT_STRING
: For delimited_text file_type only. All lines in the file(s) starting with
the provided string are ignored. The comment string has no effect unless it
appears at the beginning of a line. The default value is '#'.
TEXT_DELIMITER
: For delimited_text file_type only. Specifies the delimiter for values and
columns in the header row (if present). Must be a single character. The
default value is ','.
TEXT_ESCAPE_CHARACTER
: For delimited_text file_type only. The character used in the file(s) to
escape certain character sequences in text. For example, the escape character
followed by a literal 'n' escapes to a newline character within the field.
Can be used within a quoted string to escape a quote character. An empty
value for this option does not specify an escape character.
TEXT_HAS_HEADER
: For delimited_text file_type only. Indicates whether the delimited text
files have a header row.
Supported values: TRUE, FALSE.
The default value is TRUE.
TEXT_HEADER_PROPERTY_DELIMITER
: For delimited_text file_type only. Specifies the delimiter for column
properties in the header row (if present). Cannot be set to the same value as
text_delimiter. The default value is '|'.
TEXT_NULL_STRING
: For delimited_text file_type only. The value in the file(s) to treat as a
null value in the database. The default value is ''.
TEXT_QUOTE_CHARACTER
: For delimited_text file_type only. The quote character used in the file(s),
typically encompassing a field value. The character must appear at the
beginning and end of a field to take effect. Delimiters within quoted fields
are not treated as delimiters. Within a quoted field, double quotes (") can
be used to escape a single literal quote character. To not have a quote
character, specify an empty string (""). The default value is '"'.
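Putting the delimited-text options together, a sketch describing a
tab-delimited file with a header row, backslash escapes, and a literal \N as
the null marker (all values illustrative, added to the same assumed options
map):

```java
options.put(InsertRecordsFromFilesRequest.Options.FILE_TYPE,
            InsertRecordsFromFilesRequest.Options.DELIMITED_TEXT);
options.put(InsertRecordsFromFilesRequest.Options.TEXT_DELIMITER, "\t");
options.put(InsertRecordsFromFilesRequest.Options.TEXT_HAS_HEADER,
            InsertRecordsFromFilesRequest.Options.TRUE);
options.put(InsertRecordsFromFilesRequest.Options.TEXT_ESCAPE_CHARACTER, "\\");
options.put(InsertRecordsFromFilesRequest.Options.TEXT_NULL_STRING, "\\N");
options.put(InsertRecordsFromFilesRequest.Options.TEXT_COMMENT_STRING, "#");
```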
TRUNCATE_TABLE
: If set to true, truncates the table specified by tableName prior to loading
the file(s).
Supported values: TRUE, FALSE.
The default value is FALSE.
NUM_TASKS_PER_RANK
: Optional: number of tasks for reading file per rank. Default will be
external_file_reader_num_tasks.
The default value is an empty Map.
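Tying these options together, a minimal end-to-end sketch of loading a
delimited file. The connection URL, table name, and file path are
hypothetical, and the surrounding client calls (GPUdb, the request setters,
insertRecordsFromFiles) are assumptions about the Java API rather than
guaranteed signatures:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import com.gpudb.GPUdb;
import com.gpudb.protocol.InsertRecordsFromFilesRequest;

public class LoadOrdersFromCsv {
    public static void main(String[] args) throws Exception {
        GPUdb db = new GPUdb("http://localhost:9191");   // hypothetical endpoint

        // Assemble the options map using the constants documented on this page.
        Map<String, String> options = new HashMap<>();
        options.put(InsertRecordsFromFilesRequest.Options.FILE_TYPE,
                    InsertRecordsFromFilesRequest.Options.DELIMITED_TEXT);
        options.put(InsertRecordsFromFilesRequest.Options.LOADING_MODE,
                    InsertRecordsFromFilesRequest.Options.HEAD);
        options.put(InsertRecordsFromFilesRequest.Options.ERROR_HANDLING,
                    InsertRecordsFromFilesRequest.Options.IGNORE_BAD_RECORDS);
        options.put(InsertRecordsFromFilesRequest.Options.TRUNCATE_TABLE,
                    InsertRecordsFromFilesRequest.Options.TRUE);
        options.put(InsertRecordsFromFilesRequest.Options.BATCH_SIZE, "50000");
        options.put(InsertRecordsFromFilesRequest.Options.NUM_TASKS_PER_RANK, "2");

        // Build the request; the table name and file path are illustrative.
        InsertRecordsFromFilesRequest request = new InsertRecordsFromFilesRequest();
        request.setTableName("orders");
        request.setFilepaths(Arrays.asList("/data/orders.csv"));
        request.setOptions(options);

        System.out.println(db.insertRecordsFromFiles(request));
    }
}
```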
A set of string constants for the parameter options.

| Modifier and Type | Field and Description |
|---|---|
| static String | ABORT: Stops current insertion and aborts entire operation when an error is encountered. |
| static String | BATCH_SIZE: Specifies the number of records to process before inserting. |
| static String | COLUMN_FORMATS: For each target column specified, applies the column-property-bound format to the source data loaded into that column. |
| static String | COLUMNS_TO_LOAD: For delimited_text file_type only. |
| static String | DEFAULT_COLUMN_FORMATS: Specifies the default format to be applied to source data loaded into columns with the corresponding column property. |
| static String | DELIMITED_TEXT: Indicates the file(s) are in delimited text format, e.g., CSV, TSV, PSV, etc. |
| static String | DISTRIBUTED_LOCAL: Each worker node loads all files that are available to it. |
| static String | DISTRIBUTED_SHARED: The worker nodes coordinate loading a set of files that are available to all of them. |
| static String | DRY_RUN: If set to true, no data will be inserted, but the file will be read with the applied error_handling mode and the number of valid records that would normally be inserted is returned. |
| static String | ERROR_HANDLING: Specifies how errors should be handled upon insertion. |
| static String | FALSE |
| static String | FILE_TYPE: File type for the file(s). |
| static String | HEAD: The head node loads all data. |
| static String | IGNORE_BAD_RECORDS: Malformed records are skipped. |
| static String | LOADING_MODE: Specifies how to divide data loading among nodes. |
| static String | NUM_TASKS_PER_RANK: Optional number of tasks for reading file per rank. |
| static String | PERMISSIVE: Records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped. |
| static String | TEXT_COMMENT_STRING: For delimited_text file_type only. |
| static String | TEXT_DELIMITER: For delimited_text file_type only. |
| static String | TEXT_ESCAPE_CHARACTER: For delimited_text file_type only. |
| static String | TEXT_HAS_HEADER: For delimited_text file_type only. |
| static String | TEXT_HEADER_PROPERTY_DELIMITER: For delimited_text file_type only. |
| static String | TEXT_NULL_STRING: For delimited_text file_type only. |
| static String | TEXT_QUOTE_CHARACTER: For delimited_text file_type only. |
| static String | TRUE |
| static String | TRUNCATE_TABLE: If set to true, truncates the table specified by tableName prior to loading the file(s). |
public static final String BATCH_SIZE
Specifies the number of records to process before inserting.

public static final String COLUMN_FORMATS
For each target column specified, applies the column-property-bound format to
the source data loaded into that column. See default_column_formats for valid
format syntax.

public static final String COLUMNS_TO_LOAD
For delimited_text file_type only. Specifies a comma-delimited list of column
positions or names to load instead of loading all columns in the file(s); if
more than one file is being loaded, the list of columns will apply to all
files. Column numbers can be specified discretely or as a range, e.g., a
value of '5,7,1..3' will create a table with the first column in the table
being the fifth column in the file, followed by the seventh column in the
file, then the first column through the fourth column in the file.

public static final String DEFAULT_COLUMN_FORMATS
Specifies the default format to be applied to source data loaded into columns
with the corresponding column property. This default column-property-bound
format can be overridden by specifying a column property & format for a given
target column in column_formats. For each specified annotation, the format
will apply to all columns with that annotation unless a custom column_formats
for that annotation is specified. The parameter value must be formatted as a
JSON string that is a map of column properties to their respective column
formats, e.g., { "date" : "%Y.%m.%d", "time" : "%H:%M:%S" }. Column formats
are specified as a string of control characters and plain text. The supported
control characters are 'Y', 'm', 'd', 'H', 'M', and 'S', which follow the
Linux 'strptime()' specification, as well as 's', which specifies seconds and
fractional seconds (though the fractional component will be truncated past
milliseconds). Formats for the 'date' annotation must include the 'Y', 'm',
and 'd' control characters. Formats for the 'time' annotation must include
the 'H', 'M', and either 'S' or 's' (but not both) control characters.
Formats for the 'datetime' annotation must meet both the 'date' and 'time'
control character requirements. For example, '{"datetime" : "%m/%d/%Y
%H:%M:%S" }' would be used to interpret text as "05/04/2000 12:12:11".

public static final String DRY_RUN
If set to true, no data will be inserted, but the file will be read with the
applied error_handling mode and the number of valid records that would
normally be inserted is returned.
Supported values: TRUE, FALSE.
The default value is FALSE.

public static final String FALSE
public static final String TRUE
public static final String ERROR_HANDLING
Specifies how errors should be handled upon insertion.
Supported values:
PERMISSIVE
: Records with missing columns are populated with nulls if possible;
otherwise, the malformed records are skipped.
IGNORE_BAD_RECORDS
: Malformed records are skipped.
ABORT
: Stops current insertion and aborts entire operation when an error is
encountered.
The default value is PERMISSIVE.

public static final String PERMISSIVE
public static final String IGNORE_BAD_RECORDS
public static final String ABORT
public static final String FILE_TYPE
File type for the file(s).
Supported values:
DELIMITED_TEXT
: Indicates the file(s) are in delimited text format, e.g., CSV, TSV, PSV,
etc.
The default value is DELIMITED_TEXT.

public static final String DELIMITED_TEXT
public static final String LOADING_MODE
Specifies how to divide data loading among nodes.
Supported values:
HEAD
: The head node loads all data. All files must be available on the head node.
DISTRIBUTED_SHARED
: The worker nodes coordinate loading a set of files that are available to
all of them. All files must be available on all nodes. This option is best
when there is a shared file system.
DISTRIBUTED_LOCAL
: Each worker node loads all files that are available to it. This option is
best when each worker node has its own file system.
The default value is HEAD.

public static final String HEAD
public static final String DISTRIBUTED_SHARED
public static final String DISTRIBUTED_LOCAL
public static final String TEXT_COMMENT_STRING
For delimited_text file_type only. All lines in the file(s) starting with the
provided string are ignored. The comment string has no effect unless it
appears at the beginning of a line. The default value is '#'.

public static final String TEXT_DELIMITER
For delimited_text file_type only. Specifies the delimiter for values and
columns in the header row (if present). Must be a single character. The
default value is ','.

public static final String TEXT_ESCAPE_CHARACTER
For delimited_text file_type only. The character used in the file(s) to
escape certain character sequences in text. For example, the escape character
followed by a literal 'n' escapes to a newline character within the field.
Can be used within a quoted string to escape a quote character. An empty
value for this option does not specify an escape character.

public static final String TEXT_HAS_HEADER
For delimited_text file_type only. Indicates whether the delimited text files
have a header row.
Supported values: TRUE, FALSE.
The default value is TRUE.

public static final String TEXT_HEADER_PROPERTY_DELIMITER
For delimited_text file_type only. Specifies the delimiter for column
properties in the header row (if present). Cannot be set to the same value as
text_delimiter. The default value is '|'.

public static final String TEXT_NULL_STRING
For delimited_text file_type only. The value in the file(s) to treat as a
null value in the database. The default value is ''.

public static final String TEXT_QUOTE_CHARACTER
For delimited_text file_type only. The quote character used in the file(s),
typically encompassing a field value. The character must appear at the
beginning and end of a field to take effect. Delimiters within quoted fields
are not treated as delimiters. Within a quoted field, double quotes (") can
be used to escape a single literal quote character. To not have a quote
character, specify an empty string (""). The default value is '"'.

public static final String TRUNCATE_TABLE
If set to true, truncates the table specified by tableName prior to loading
the file(s).
Supported values: TRUE, FALSE.
The default value is FALSE.

public static final String NUM_TASKS_PER_RANK
Optional: number of tasks for reading file per rank. Default will be
external_file_reader_num_tasks.