public class InsertRecordsFromQueryRequest extends Object implements org.apache.avro.generic.IndexedRecord
A set of parameters for GPUdb.insertRecordsFromQuery(InsertRecordsFromQueryRequest).
Computes a remote query result and inserts the result data into a new or existing table.
Modifier and Type | Class and Description
---|---
static class | InsertRecordsFromQueryRequest.CreateTableOptions: Options used when creating the target table.
static class | InsertRecordsFromQueryRequest.Options: Optional parameters.

Constructor and Description
---
InsertRecordsFromQueryRequest(): Constructs an InsertRecordsFromQueryRequest object with default parameters.
InsertRecordsFromQueryRequest(String tableName, String remoteQuery, Map<String,Map<String,String>> modifyColumns, Map<String,String> createTableOptions, Map<String,String> options): Constructs an InsertRecordsFromQueryRequest object with the specified parameters.

Modifier and Type | Method and Description
---|---
boolean | equals(Object obj)
Object | get(int index): This method supports the Avro framework and is not intended to be called directly by the user.
static org.apache.avro.Schema | getClassSchema(): This method supports the Avro framework and is not intended to be called directly by the user.
Map<String,String> | getCreateTableOptions()
Map<String,Map<String,String>> | getModifyColumns()
Map<String,String> | getOptions()
String | getRemoteQuery()
org.apache.avro.Schema | getSchema(): This method supports the Avro framework and is not intended to be called directly by the user.
String | getTableName()
int | hashCode()
void | put(int index, Object value): This method supports the Avro framework and is not intended to be called directly by the user.
InsertRecordsFromQueryRequest | setCreateTableOptions(Map<String,String> createTableOptions)
InsertRecordsFromQueryRequest | setModifyColumns(Map<String,Map<String,String>> modifyColumns)
InsertRecordsFromQueryRequest | setOptions(Map<String,String> options)
InsertRecordsFromQueryRequest | setRemoteQuery(String remoteQuery)
InsertRecordsFromQueryRequest | setTableName(String tableName)
String | toString()
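For orientation, here is a sketch of building the two option maps and passing them to the five-argument constructor documented below. It assumes the lowercase string keys that the `CreateTableOptions` and `Options` constants resolve to; the table name, query, and option values are placeholders, and the actual request construction is shown only in comments since it needs the GPUdb API on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: assembling createTableOptions and options maps for an
// InsertRecordsFromQueryRequest. Keys mirror the documented constants.
public class BuildRequestMaps {
    public static Map<String, String> createTableOptions() {
        Map<String, String> opts = new HashMap<>();
        opts.put("no_error_if_exists", "true"); // don't fail if the table already exists with this type
        opts.put("is_replicated", "false");     // shard by the type's shard key (or randomly)
        return opts;
    }

    public static Map<String, String> options() {
        Map<String, String> opts = new HashMap<>();
        opts.put("error_handling", "permissive");      // null-fill or skip malformed records
        opts.put("bad_record_table_name", "bad_rows"); // placeholder table for rejected records
        opts.put("batch_size", "50000");               // records per insert batch
        return opts;
    }

    public static void main(String[] args) {
        // With the GPUdb Java API available, these maps would be passed as:
        // InsertRecordsFromQueryRequest req = new InsertRecordsFromQueryRequest(
        //     "ki_home.target_table",                  // placeholder target table
        //     "SELECT * FROM remote_db.source_table",  // placeholder remote query
        //     new HashMap<>(),                         // modifyColumns: not implemented yet
        //     createTableOptions(),
        //     options());
        System.out.println(options().get("error_handling"));
    }
}
```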
public InsertRecordsFromQueryRequest()
public InsertRecordsFromQueryRequest(String tableName, String remoteQuery, Map<String,Map<String,String>> modifyColumns, Map<String,String> createTableOptions, Map<String,String> options)
tableName - Name of the table into which the data will be inserted, in [schema_name.]table_name format, using standard name resolution rules. If the table does not exist, the table will be created using either an existing type_id or the type inferred from the remote query, and the new table name will have to meet standard table naming criteria.
remoteQuery - Query for which result data needs to be imported.
modifyColumns - Not implemented yet. The default value is an empty Map.
createTableOptions - Options used when creating the target table.
TYPE_ID: ID of a currently registered type. The default value is ''.
NO_ERROR_IF_EXISTS: If true, prevents an error from occurring if the table already exists and is of the given type. If a table with the same ID but a different type exists, it is still an error. Supported values: TRUE, FALSE. The default value is FALSE.
IS_REPLICATED: Affects the distribution scheme for the table's data. If true and the given type has no explicit shard key defined, the table will be replicated. If false, the table will be sharded according to the shard key specified in the given type_id, or randomly sharded if no shard key is specified. Note that a type containing a shard key cannot be used to create a replicated table. Supported values: TRUE, FALSE. The default value is FALSE.
FOREIGN_KEYS: Semicolon-separated list of foreign keys, of the format '(source_column_name [, ...]) references target_table_name(primary_key_column_name [, ...]) [as foreign_key_name]'.
FOREIGN_SHARD_KEY: Foreign shard key of the format 'source_column references shard_by_column from target_table(primary_key_column)'.
PARTITION_TYPE: Partitioning scheme to use. Supported values: RANGE (range partitioning), INTERVAL (interval partitioning), LIST (list partitioning), HASH (hash partitioning), SERIES (series partitioning).
PARTITION_KEYS: Comma-separated list of partition keys, which are the columns or column expressions by which records will be assigned to partitions defined by partition_definitions.
PARTITION_DEFINITIONS: Comma-separated list of partition definitions, whose format depends on the choice of partition_type. See range partitioning, interval partitioning, list partitioning, hash partitioning, or series partitioning for example formats.
IS_AUTOMATIC_PARTITION: If true, a new partition will be created for values which don't fall into an existing partition. Currently only supported for list partitions. Supported values: TRUE, FALSE. The default value is FALSE.
TTL: Sets the TTL of the table specified in tableName.
CHUNK_SIZE: Indicates the number of records per chunk to be used for this table.
IS_RESULT_TABLE: Indicates whether the table is a memory-only table. A result table cannot contain columns with store_only or text_search data-handling, or columns that are non-charN strings, and it will not be retained if the server is restarted. Supported values: TRUE, FALSE. The default value is FALSE.
STRATEGY_DEFINITION: The tier strategy for the table and its columns.
The default value is an empty Map.
options - Optional parameters.
BAD_RECORD_TABLE_NAME: Optional name of a table to which rejected records are written. The bad-record table has the following columns: line_number (long), line_rejected (string), error_message (string). When error handling is Abort, the bad-record table is not populated.
BAD_RECORD_TABLE_LIMIT: A positive integer indicating the maximum number of records that can be written to the bad-record table. The default value is 10000.
BATCH_SIZE: Number of records per batch when inserting data.
DATASOURCE_NAME: Name of an existing external data source from which the table will be loaded.
ERROR_HANDLING: Specifies how errors should be handled upon insertion. Supported values: PERMISSIVE (records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped), IGNORE_BAD_RECORDS (malformed records are skipped), ABORT (stops the current insertion and aborts the entire operation when an error is encountered; primary key collisions are considered abortable errors in this mode). The default value is ABORT.
IGNORE_EXISTING_PK: Specifies the record collision error-suppression policy for inserting into a table with a primary key, only used when not in upsert mode (upsert mode is disabled when update_on_existing_pk is false). If set to true, any record being inserted that is rejected for having primary key values that match those of an existing table record will be ignored with no error generated. If false, the rejection of any record for having primary key values matching an existing record will result in an error being reported, as determined by error_handling. If the specified table does not have a primary key or if upsert mode is in effect (update_on_existing_pk is true), then this option has no effect. Supported values: TRUE (ignore new records whose primary key values collide with those of existing records), FALSE (treat as errors any new records whose primary key values collide with those of existing records). The default value is FALSE.
INGESTION_MODE: Whether to do a full load, dry run, or perform a type inference on the source data. Supported values: FULL (run a type inference on the source data, if needed, and ingest), DRY_RUN (does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of error_handling), TYPE_INFERENCE_ONLY (infer the type of the source data and return, without ingesting any data; the inferred type is returned in the response). The default value is FULL.
JDBC_FETCH_SIZE: The JDBC fetch size, which determines how many rows to fetch per round trip.
JDBC_SESSION_INIT_STATEMENT: Executes the statement per each JDBC session before doing the actual load. The default value is ''.
NUM_SPLITS_PER_RANK: Optional: number of splits for reading data per rank. Default will be external_file_reader_num_tasks. The default value is ''.
NUM_TASKS_PER_RANK: Optional: number of tasks for reading data per rank. Default will be external_file_reader_num_tasks.
PRIMARY_KEYS: Optional: comma-separated list of column names to set as primary keys, when not specified in the type. The default value is ''.
SHARD_KEYS: Optional: comma-separated list of column names to set as shard keys, when not specified in the type. The default value is ''.
SUBSCRIBE: Continuously poll the data source to check for new data and load it into the table. Supported values: TRUE, FALSE. The default value is FALSE.
TRUNCATE_TABLE: If set to true, truncates the table specified by tableName prior to loading the data. Supported values: TRUE, FALSE. The default value is FALSE.
REMOTE_QUERY: Remote SQL query from which data will be sourced.
REMOTE_QUERY_ORDER_BY: Name of the column to be used for splitting the query into multiple sub-queries, using the ordering of the given column. The default value is ''.
REMOTE_QUERY_FILTER_COLUMN: Name of the column to be used for splitting the query into multiple sub-queries, using the data distribution of the given column. The default value is ''.
REMOTE_QUERY_INCREASING_COLUMN: Column on the subscribed remote query result that will increase for new records (e.g., TIMESTAMP). The default value is ''.
REMOTE_QUERY_PARTITION_COLUMN: Alias name for remote_query_filter_column. The default value is ''.
TRUNCATE_STRINGS: If set to true, truncate string values that are longer than the column's type size. Supported values: TRUE, FALSE. The default value is FALSE.
UPDATE_ON_EXISTING_PK: Specifies the record collision policy for inserting into a table with a primary key. If set to true, any existing table record with primary key values that match those of a record being inserted will be replaced by that new record (the new data will be "upserted"). If set to false, any existing table record with primary key values that match those of a record being inserted will remain unchanged, while the new record will be rejected and the error handled as determined by ignore_existing_pk and error_handling. If the specified table does not have a primary key, then this option has no effect. Supported values: TRUE (upsert new records when primary keys match existing records), FALSE (reject new records when primary keys match existing records). The default value is FALSE.
The default value is an empty Map.
public static org.apache.avro.Schema getClassSchema()
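Each setter that follows returns `this`, so a request can be configured in one chained expression. Here is a minimal stand-in sketch of that fluent pattern; `FluentRequest` is a hypothetical miniature used only to illustrate the convention, not the real class.

```java
// Stand-in demo of the fluent-setter (builder-style) convention used by
// InsertRecordsFromQueryRequest: every setter returns `this`.
public class FluentRequest {
    private String tableName = "";
    private String remoteQuery = "";

    public FluentRequest setTableName(String tableName) {
        this.tableName = tableName;
        return this; // returning `this` is what enables chaining
    }

    public FluentRequest setRemoteQuery(String remoteQuery) {
        this.remoteQuery = remoteQuery;
        return this;
    }

    public String getTableName() { return tableName; }
    public String getRemoteQuery() { return remoteQuery; }

    public static void main(String[] args) {
        // Configure in a single chained expression, as with the real request.
        FluentRequest req = new FluentRequest()
                .setTableName("ki_home.target_table")             // placeholder
                .setRemoteQuery("SELECT * FROM remote.source_t"); // placeholder
        System.out.println(req.getTableName());
    }
}
```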
public String getTableName()
Returns: Name of the table into which the data will be inserted, in [schema_name.]table_name format, using standard name resolution rules. If the table does not exist, the table will be created using either an existing type_id or the type inferred from the remote query, and the new table name will have to meet standard table naming criteria.
public InsertRecordsFromQueryRequest setTableName(String tableName)
tableName - Name of the table into which the data will be inserted, in [schema_name.]table_name format, using standard name resolution rules. If the table does not exist, the table will be created using either an existing type_id or the type inferred from the remote query, and the new table name will have to meet standard table naming criteria.
Returns: this to mimic the builder pattern.
public String getRemoteQuery()
public InsertRecordsFromQueryRequest setRemoteQuery(String remoteQuery)
remoteQuery - Query for which result data needs to be imported.
Returns: this to mimic the builder pattern.
public Map<String,Map<String,String>> getModifyColumns()
Returns: Not implemented yet. The default value is an empty Map.
public InsertRecordsFromQueryRequest setModifyColumns(Map<String,Map<String,String>> modifyColumns)
modifyColumns - Not implemented yet. The default value is an empty Map.
Returns: this to mimic the builder pattern.
public Map<String,String> getCreateTableOptions()
Returns: Options used when creating the target table.
TYPE_ID: ID of a currently registered type. The default value is ''.
NO_ERROR_IF_EXISTS: If true, prevents an error from occurring if the table already exists and is of the given type. If a table with the same ID but a different type exists, it is still an error. Supported values: TRUE, FALSE. The default value is FALSE.
IS_REPLICATED: Affects the distribution scheme for the table's data. If true and the given type has no explicit shard key defined, the table will be replicated. If false, the table will be sharded according to the shard key specified in the given type_id, or randomly sharded if no shard key is specified. Note that a type containing a shard key cannot be used to create a replicated table. Supported values: TRUE, FALSE. The default value is FALSE.
FOREIGN_KEYS: Semicolon-separated list of foreign keys, of the format '(source_column_name [, ...]) references target_table_name(primary_key_column_name [, ...]) [as foreign_key_name]'.
FOREIGN_SHARD_KEY: Foreign shard key of the format 'source_column references shard_by_column from target_table(primary_key_column)'.
PARTITION_TYPE: Partitioning scheme to use. Supported values: RANGE (range partitioning), INTERVAL (interval partitioning), LIST (list partitioning), HASH (hash partitioning), SERIES (series partitioning).
PARTITION_KEYS: Comma-separated list of partition keys, which are the columns or column expressions by which records will be assigned to partitions defined by partition_definitions.
PARTITION_DEFINITIONS: Comma-separated list of partition definitions, whose format depends on the choice of partition_type. See range partitioning, interval partitioning, list partitioning, hash partitioning, or series partitioning for example formats.
IS_AUTOMATIC_PARTITION: If true, a new partition will be created for values which don't fall into an existing partition. Currently only supported for list partitions. Supported values: TRUE, FALSE. The default value is FALSE.
TTL: Sets the TTL of the table specified in tableName.
CHUNK_SIZE: Indicates the number of records per chunk to be used for this table.
IS_RESULT_TABLE: Indicates whether the table is a memory-only table. A result table cannot contain columns with store_only or text_search data-handling, or columns that are non-charN strings, and it will not be retained if the server is restarted. Supported values: TRUE, FALSE. The default value is FALSE.
STRATEGY_DEFINITION: The tier strategy for the table and its columns.
The default value is an empty Map.
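The partitioning-related entries above work together: partition_type selects the scheme, partition_keys names the bucketing columns, and partition_definitions spells out the partitions. A hedged sketch of a range-partitioned setup follows; the key strings mirror the CreateTableOptions constants, while the column name and the partition-definition strings are illustrative placeholders (consult the range-partitioning documentation for the exact definition grammar).

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a createTableOptions map requesting a range-partitioned target
// table. Column names and definition strings are placeholders.
public class PartitionOptions {
    public static Map<String, String> rangePartitioned() {
        Map<String, String> o = new HashMap<>();
        o.put("partition_type", "RANGE");      // use range partitioning
        o.put("partition_keys", "order_date"); // hypothetical bucketing column
        // Illustrative definitions only; the real grammar is documented
        // under range partitioning.
        o.put("partition_definitions",
              "p2023 min(2023-01-01) max(2024-01-01), "
            + "p2024 min(2024-01-01) max(2025-01-01)");
        return o;
    }

    public static void main(String[] args) {
        System.out.println(rangePartitioned().get("partition_type"));
    }
}
```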
public InsertRecordsFromQueryRequest setCreateTableOptions(Map<String,String> createTableOptions)
createTableOptions - Options used when creating the target table.
TYPE_ID: ID of a currently registered type. The default value is ''.
NO_ERROR_IF_EXISTS: If true, prevents an error from occurring if the table already exists and is of the given type. If a table with the same ID but a different type exists, it is still an error. Supported values: TRUE, FALSE. The default value is FALSE.
IS_REPLICATED: Affects the distribution scheme for the table's data. If true and the given type has no explicit shard key defined, the table will be replicated. If false, the table will be sharded according to the shard key specified in the given type_id, or randomly sharded if no shard key is specified. Note that a type containing a shard key cannot be used to create a replicated table. Supported values: TRUE, FALSE. The default value is FALSE.
FOREIGN_KEYS: Semicolon-separated list of foreign keys, of the format '(source_column_name [, ...]) references target_table_name(primary_key_column_name [, ...]) [as foreign_key_name]'.
FOREIGN_SHARD_KEY: Foreign shard key of the format 'source_column references shard_by_column from target_table(primary_key_column)'.
PARTITION_TYPE: Partitioning scheme to use. Supported values: RANGE (range partitioning), INTERVAL (interval partitioning), LIST (list partitioning), HASH (hash partitioning), SERIES (series partitioning).
PARTITION_KEYS: Comma-separated list of partition keys, which are the columns or column expressions by which records will be assigned to partitions defined by partition_definitions.
PARTITION_DEFINITIONS: Comma-separated list of partition definitions, whose format depends on the choice of partition_type. See range partitioning, interval partitioning, list partitioning, hash partitioning, or series partitioning for example formats.
IS_AUTOMATIC_PARTITION: If true, a new partition will be created for values which don't fall into an existing partition. Currently only supported for list partitions. Supported values: TRUE, FALSE. The default value is FALSE.
TTL: Sets the TTL of the table specified in tableName.
CHUNK_SIZE: Indicates the number of records per chunk to be used for this table.
IS_RESULT_TABLE: Indicates whether the table is a memory-only table. A result table cannot contain columns with store_only or text_search data-handling, or columns that are non-charN strings, and it will not be retained if the server is restarted. Supported values: TRUE, FALSE. The default value is FALSE.
STRATEGY_DEFINITION: The tier strategy for the table and its columns.
The default value is an empty Map.
Returns: this to mimic the builder pattern.
public Map<String,String> getOptions()
Returns: Optional parameters.
BAD_RECORD_TABLE_NAME: Optional name of a table to which rejected records are written. The bad-record table has the following columns: line_number (long), line_rejected (string), error_message (string). When error handling is Abort, the bad-record table is not populated.
BAD_RECORD_TABLE_LIMIT: A positive integer indicating the maximum number of records that can be written to the bad-record table. The default value is 10000.
BATCH_SIZE: Number of records per batch when inserting data.
DATASOURCE_NAME: Name of an existing external data source from which the table will be loaded.
ERROR_HANDLING: Specifies how errors should be handled upon insertion. Supported values: PERMISSIVE (records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped), IGNORE_BAD_RECORDS (malformed records are skipped), ABORT (stops the current insertion and aborts the entire operation when an error is encountered; primary key collisions are considered abortable errors in this mode). The default value is ABORT.
IGNORE_EXISTING_PK: Specifies the record collision error-suppression policy for inserting into a table with a primary key, only used when not in upsert mode (upsert mode is disabled when update_on_existing_pk is false). If set to true, any record being inserted that is rejected for having primary key values that match those of an existing table record will be ignored with no error generated. If false, the rejection of any record for having primary key values matching an existing record will result in an error being reported, as determined by error_handling. If the specified table does not have a primary key or if upsert mode is in effect (update_on_existing_pk is true), then this option has no effect. Supported values: TRUE (ignore new records whose primary key values collide with those of existing records), FALSE (treat as errors any new records whose primary key values collide with those of existing records). The default value is FALSE.
INGESTION_MODE: Whether to do a full load, dry run, or perform a type inference on the source data. Supported values: FULL (run a type inference on the source data, if needed, and ingest), DRY_RUN (does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of error_handling), TYPE_INFERENCE_ONLY (infer the type of the source data and return, without ingesting any data; the inferred type is returned in the response). The default value is FULL.
JDBC_FETCH_SIZE: The JDBC fetch size, which determines how many rows to fetch per round trip.
JDBC_SESSION_INIT_STATEMENT: Executes the statement per each JDBC session before doing the actual load. The default value is ''.
NUM_SPLITS_PER_RANK: Optional: number of splits for reading data per rank. Default will be external_file_reader_num_tasks. The default value is ''.
NUM_TASKS_PER_RANK: Optional: number of tasks for reading data per rank. Default will be external_file_reader_num_tasks.
PRIMARY_KEYS: Optional: comma-separated list of column names to set as primary keys, when not specified in the type. The default value is ''.
SHARD_KEYS: Optional: comma-separated list of column names to set as shard keys, when not specified in the type. The default value is ''.
SUBSCRIBE: Continuously poll the data source to check for new data and load it into the table. Supported values: TRUE, FALSE. The default value is FALSE.
TRUNCATE_TABLE: If set to true, truncates the table specified by tableName prior to loading the data. Supported values: TRUE, FALSE. The default value is FALSE.
REMOTE_QUERY: Remote SQL query from which data will be sourced.
REMOTE_QUERY_ORDER_BY: Name of the column to be used for splitting the query into multiple sub-queries, using the ordering of the given column. The default value is ''.
REMOTE_QUERY_FILTER_COLUMN: Name of the column to be used for splitting the query into multiple sub-queries, using the data distribution of the given column. The default value is ''.
REMOTE_QUERY_INCREASING_COLUMN: Column on the subscribed remote query result that will increase for new records (e.g., TIMESTAMP). The default value is ''.
REMOTE_QUERY_PARTITION_COLUMN: Alias name for remote_query_filter_column. The default value is ''.
TRUNCATE_STRINGS: If set to true, truncate string values that are longer than the column's type size. Supported values: TRUE, FALSE. The default value is FALSE.
UPDATE_ON_EXISTING_PK: Specifies the record collision policy for inserting into a table with a primary key. If set to true, any existing table record with primary key values that match those of a record being inserted will be replaced by that new record (the new data will be "upserted"). If set to false, any existing table record with primary key values that match those of a record being inserted will remain unchanged, while the new record will be rejected and the error handled as determined by ignore_existing_pk and error_handling. If the specified table does not have a primary key, then this option has no effect. Supported values: TRUE (upsert new records when primary keys match existing records), FALSE (reject new records when primary keys match existing records). The default value is FALSE.
The default value is an empty Map.
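The three primary-key options above (update_on_existing_pk, ignore_existing_pk, error_handling) combine into three distinct collision behaviors: upsert, silently ignore, or error out. A hedged sketch of the corresponding option maps, using the lowercase key strings the Options constants resolve to:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: option maps for the three primary-key collision behaviors.
public class PkCollisionOptions {
    // Upsert: existing records with matching PKs are replaced.
    public static Map<String, String> upsert() {
        Map<String, String> o = new HashMap<>();
        o.put("update_on_existing_pk", "true");
        return o;
    }

    // Silent ignore: upsert mode off, colliding inserts dropped with no error.
    public static Map<String, String> ignoreCollisions() {
        Map<String, String> o = new HashMap<>();
        o.put("update_on_existing_pk", "false");
        o.put("ignore_existing_pk", "true");
        return o;
    }

    // Error: collisions reported per error_handling; abort stops the load.
    public static Map<String, String> errorOnCollisions() {
        Map<String, String> o = new HashMap<>();
        o.put("update_on_existing_pk", "false");
        o.put("ignore_existing_pk", "false");
        o.put("error_handling", "abort");
        return o;
    }

    public static void main(String[] args) {
        System.out.println(errorOnCollisions().get("error_handling"));
    }
}
```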
public InsertRecordsFromQueryRequest setOptions(Map<String,String> options)
options - Optional parameters.
BAD_RECORD_TABLE_NAME: Optional name of a table to which rejected records are written. The bad-record table has the following columns: line_number (long), line_rejected (string), error_message (string). When error handling is Abort, the bad-record table is not populated.
BAD_RECORD_TABLE_LIMIT: A positive integer indicating the maximum number of records that can be written to the bad-record table. The default value is 10000.
BATCH_SIZE: Number of records per batch when inserting data.
DATASOURCE_NAME: Name of an existing external data source from which the table will be loaded.
ERROR_HANDLING: Specifies how errors should be handled upon insertion. Supported values: PERMISSIVE (records with missing columns are populated with nulls if possible; otherwise, the malformed records are skipped), IGNORE_BAD_RECORDS (malformed records are skipped), ABORT (stops the current insertion and aborts the entire operation when an error is encountered; primary key collisions are considered abortable errors in this mode). The default value is ABORT.
IGNORE_EXISTING_PK: Specifies the record collision error-suppression policy for inserting into a table with a primary key, only used when not in upsert mode (upsert mode is disabled when update_on_existing_pk is false). If set to true, any record being inserted that is rejected for having primary key values that match those of an existing table record will be ignored with no error generated. If false, the rejection of any record for having primary key values matching an existing record will result in an error being reported, as determined by error_handling. If the specified table does not have a primary key or if upsert mode is in effect (update_on_existing_pk is true), then this option has no effect. Supported values: TRUE (ignore new records whose primary key values collide with those of existing records), FALSE (treat as errors any new records whose primary key values collide with those of existing records). The default value is FALSE.
INGESTION_MODE: Whether to do a full load, dry run, or perform a type inference on the source data. Supported values: FULL (run a type inference on the source data, if needed, and ingest), DRY_RUN (does not load data, but walks through the source data and determines the number of valid records, taking into account the current mode of error_handling), TYPE_INFERENCE_ONLY (infer the type of the source data and return, without ingesting any data; the inferred type is returned in the response). The default value is FULL.
JDBC_FETCH_SIZE: The JDBC fetch size, which determines how many rows to fetch per round trip.
JDBC_SESSION_INIT_STATEMENT: Executes the statement per each JDBC session before doing the actual load. The default value is ''.
NUM_SPLITS_PER_RANK: Optional: number of splits for reading data per rank. Default will be external_file_reader_num_tasks. The default value is ''.
NUM_TASKS_PER_RANK: Optional: number of tasks for reading data per rank. Default will be external_file_reader_num_tasks.
PRIMARY_KEYS: Optional: comma-separated list of column names to set as primary keys, when not specified in the type. The default value is ''.
SHARD_KEYS: Optional: comma-separated list of column names to set as shard keys, when not specified in the type. The default value is ''.
SUBSCRIBE: Continuously poll the data source to check for new data and load it into the table. Supported values: TRUE, FALSE. The default value is FALSE.
TRUNCATE_TABLE: If set to true, truncates the table specified by tableName prior to loading the data. Supported values: TRUE, FALSE. The default value is FALSE.
REMOTE_QUERY: Remote SQL query from which data will be sourced.
REMOTE_QUERY_ORDER_BY: Name of the column to be used for splitting the query into multiple sub-queries, using the ordering of the given column. The default value is ''.
REMOTE_QUERY_FILTER_COLUMN: Name of the column to be used for splitting the query into multiple sub-queries, using the data distribution of the given column. The default value is ''.
REMOTE_QUERY_INCREASING_COLUMN: Column on the subscribed remote query result that will increase for new records (e.g., TIMESTAMP). The default value is ''.
REMOTE_QUERY_PARTITION_COLUMN: Alias name for remote_query_filter_column. The default value is ''.
TRUNCATE_STRINGS: If set to true, truncate string values that are longer than the column's type size. Supported values: TRUE, FALSE. The default value is FALSE.
UPDATE_ON_EXISTING_PK: Specifies the record collision policy for inserting into a table with a primary key. If set to true, any existing table record with primary key values that match those of a record being inserted will be replaced by that new record (the new data will be "upserted"). If set to false, any existing table record with primary key values that match those of a record being inserted will remain unchanged, while the new record will be rejected and the error handled as determined by ignore_existing_pk and error_handling. If the specified table does not have a primary key, then this option has no effect. Supported values: TRUE (upsert new records when primary keys match existing records), FALSE (reject new records when primary keys match existing records). The default value is FALSE.
The default value is an empty Map.
Returns: this to mimic the builder pattern.
public org.apache.avro.Schema getSchema()
Specified by: getSchema in interface org.apache.avro.generic.GenericContainer
public Object get(int index)
Specified by: get in interface org.apache.avro.generic.IndexedRecord
index - the position of the field to get
Throws: IndexOutOfBoundsException
public void put(int index, Object value)
Specified by: put in interface org.apache.avro.generic.IndexedRecord
index - the position of the field to set
value - the value to set
Throws: IndexOutOfBoundsException
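The get/put methods above implement the Avro IndexedRecord contract: fields are addressed by their position in the Avro schema, and an out-of-range index throws IndexOutOfBoundsException. A stand-in sketch of that contract follows; IndexedDemo is hypothetical, and the real field order comes from getClassSchema(), not from any ordering assumed here.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in demo of the Avro IndexedRecord positional get/put contract.
public class IndexedDemo {
    private final List<Object> fields = new ArrayList<>();

    public IndexedDemo(int fieldCount) {
        for (int i = 0; i < fieldCount; i++) fields.add(null);
    }

    public Object get(int index) {
        if (index < 0 || index >= fields.size())
            throw new IndexOutOfBoundsException("no field at index " + index);
        return fields.get(index);
    }

    public void put(int index, Object value) {
        if (index < 0 || index >= fields.size())
            throw new IndexOutOfBoundsException("no field at index " + index);
        fields.set(index, value);
    }

    public static void main(String[] args) {
        IndexedDemo rec = new IndexedDemo(5); // the request has five fields
        rec.put(0, "ki_home.target_table");   // field 0 here is illustrative
        System.out.println(rec.get(0));
    }
}
```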
Copyright © 2024. All rights reserved.