Class GPUdbIngestor

class gpudb_ingestor.GPUdbIngestor(gpudb, table_name, record_type, batch_size, options=None, workers=None)[source]

Initializes the GPUdbIngestor instance.

Parameters

gpudb (GPUdb) –
The client handle through which the ingestion process is to be conducted.
table_name (str) –
The name of the table into which records will be ingested. Must be an existing table.
record_type (GPUdbRecordType) –
The type for the records which will be ingested; must match the type of the given table.
batch_size (int) –
The size of each client-side queue. One queue is kept per worker rank of the database server; when any queue reaches this size, its queued records are flushed automatically. Until then, the records are held client-side and are not ingested unless flush() is called.
options (dict of str to str) –
Any insertion options to be passed on to the GPUdb server. Optional parameter.
workers (GPUdbWorkerList) –
Optional parameter. A list of GPUdb worker rank addresses.
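The batch_size behaviour described above can be illustrated with a small toy model. This is only a sketch of the documented semantics for a single queue, not the library's implementation; the class name BatchQueueModel and the send callback are hypothetical and exist only for this illustration.

```python
class BatchQueueModel:
    """Toy model of ONE client-side queue (hypothetical; not part of gpudb)."""

    def __init__(self, batch_size, send):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.batch_size = batch_size
        self._send = send      # callback standing in for the server call
        self._queue = []

    def insert(self, record):
        # Records are only held client-side until the queue is full.
        self._queue.append(record)
        if len(self._queue) >= self.batch_size:
            self.flush()

    def flush(self):
        # The queue is emptied before sending, so a failed send leaves the
        # records neither in the queue nor on the server.
        if self._queue:
            batch, self._queue = self._queue, []
            self._send(batch)


sent = []
queue = BatchQueueModel(batch_size=3, send=sent.append)
for i in range(7):
    queue.insert(i)
batches_before_flush = list(sent)   # two full batches; record 6 is still held
queue.flush()
batches_after_flush = list(sent)    # the explicit flush sends the remainder
```

In this model, seven inserts with a batch size of three produce two automatic flushes, and the seventh record is only sent once flush() is called explicitly.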
get_gpudb()[source]

Return the instance of GPUdb client used by this ingestor.

get_table_name()[source]

Return the GPUdb table associated with this ingestor.

get_batch_size()[source]

Return the batch_size used for this ingestor.

get_options()[source]

Return the options used for this ingestor.

get_count_inserted()[source]

Return the number of records inserted thus far.

get_count_updated()[source]

Return the number of records updated thus far.

insert_record(record, record_encoding='binary')[source]

Queues a record for insertion into GPUdb. If the queue reaches the batch size (see get_batch_size()), all records in the queue will be inserted into GPUdb before the method returns. If an error occurs while inserting the records, the records will be neither in the queue nor in GPUdb; catch InsertionException to get the list of records that were being inserted if needed (for example, to retry).

Parameters

record (GPUdbRecord, collections.OrderedDict) –
The record to insert.
record_encoding (str) –

The encoding to use for the insertion. Allowed values are:

  • ‘binary’
  • ‘json’

The default value is ‘binary’.

Raises InsertionException if an error occurs while inserting.

insert_records(records, record_encoding='binary')[source]

Queues a list of records for insertion into GPUdb. If any queue reaches the batch size (see get_batch_size()), all records in that queue will be inserted into GPUdb before the method returns. If an error occurs while inserting the queued records, the records will be neither in that queue nor in GPUdb; catch InsertionException to get the list of records that were being inserted (including any from the queue in question and any remaining in the list not yet queued) if needed (for example, to retry). Note that, depending on the number of records, multiple calls to GPUdb may occur.

Parameters

records (list of GPUdbRecord or collections.OrderedDict) –
The records to insert.

Raises InsertionException if an error occurs while inserting.
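The retry pattern mentioned above might look roughly like the following sketch. It is written against stand-in classes so that it runs without a server: DemoInsertionError and FlakyIngestor are hypothetical, and the way the failed records are read from the exception is passed in as a caller-supplied function (get_failed_records), because the accessor on the real exception is not described in this section.

```python
class DemoInsertionError(Exception):
    """Stand-in for the ingestor's insertion exception (hypothetical)."""

    def __init__(self, records):
        super().__init__("insertion failed")
        self.records = records


class FlakyIngestor:
    """Stand-in ingestor whose first insert_records() call fails."""

    def __init__(self):
        self.calls = 0
        self.stored = []

    def insert_records(self, records, record_encoding="binary"):
        self.calls += 1
        if self.calls == 1:
            # Simulate a failed insertion: nothing is stored, and the
            # records are handed back through the exception.
            raise DemoInsertionError(list(records))
        self.stored.extend(records)


def insert_records_with_retry(ingestor, records, error_type,
                              get_failed_records, max_retries=2):
    """Insert records, re-submitting the failed ones up to max_retries times.

    Returns the number of retries that were needed.
    """
    pending = list(records)
    for attempt in range(max_retries + 1):
        try:
            ingestor.insert_records(pending)
            return attempt
        except error_type as exc:
            # Only the records reported by the exception are re-submitted.
            pending = list(get_failed_records(exc))
    raise RuntimeError("%d records still not inserted after %d retries"
                       % (len(pending), max_retries))


ingestor = FlakyIngestor()
retries = insert_records_with_retry(
    ingestor,
    [{"id": 1}, {"id": 2}],
    DemoInsertionError,
    lambda exc: exc.records,   # assumption: how the stand-in exposes records
)
```

With a real GPUdbIngestor, error_type would be the InsertionException named above, and get_failed_records would need to be whatever accessor that exception actually provides.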

flush()[source]

Ensures that any queued records are inserted into GPUdb. If an error occurs while inserting the records from any queue, the records will be neither in that queue nor in GPUdb; catch InsertionException to get the list of records that were being inserted if needed (for example, to retry). If this occurs, other queues may still contain unflushed records.

Raises InsertionException if an error occurs while inserting records.
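Because records below the batch size stay client-side, a typical ingestion loop ends with an explicit flush(). The sketch below shows that pattern using only the method names documented here (insert_record, flush, get_count_inserted). StubIngestor is a hypothetical single-queue stand-in so the example runs without a server; a real GPUdbIngestor would be passed to ingest_all in its place.

```python
class StubIngestor:
    """Hypothetical single-queue stand-in for a GPUdbIngestor."""

    def __init__(self, batch_size):
        self._batch_size = batch_size
        self._queue = []
        self._inserted = 0

    def insert_record(self, record, record_encoding="binary"):
        self._queue.append(record)
        if len(self._queue) >= self._batch_size:
            self.flush()

    def flush(self):
        # Pretend every queued record was inserted successfully.
        self._inserted += len(self._queue)
        self._queue = []

    def get_count_inserted(self):
        return self._inserted


def ingest_all(ingestor, records):
    """Queue every record, then flush whatever is left in the queues."""
    for record in records:
        ingestor.insert_record(record)
    ingestor.flush()   # without this, a partial final batch would stay queued
    return ingestor.get_count_inserted()


stub = StubIngestor(batch_size=4)
total = ingest_all(stub, [{"id": i} for i in range(10)])
```

Here ten records with a batch size of four trigger two automatic flushes, and the final flush() sends the remaining two.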