- Distributed Ingest Client
- Fact Table (ingestion target)
- Materialized View from Fact Table (key lookup source)
- Distributed Key Lookup Client

Setup Phase
First, a database connection needs to be established.Establish Database Connection
Create/Use Optional Schema
order_history, needs to be created
before ingestion can begin. It serves as the distributed ingestion target.
The table is sharded on store_id, so that the
subsequent aggregation on store_id by the materialized view can execute
quickly, without having to synchronize aggregation results across shards.
Create Order History Table
store_sales, needs to be created before keyed lookups can begin. It serves
as the distributed key lookup source.
The materialized view aggregates each store’s total sales by date from the
order_history table, grouping on store_id and the date portion of the
order timestamp. It also uses the KI_HINT_GROUP_BY_PK hint to make a
primary key out of these two grouping columns.
Doing so automatically creates a
primary key index on the two columns, meeting
the keyed lookup criteria that
all columns involved in the lookup be indexed.
The refresh interval given to the materialized view is for reference only; in
this example, the view will be manually refreshed to allow results to be shown
immediately.
Create Store Sales Materialized View
Ingest Phase
First, theType schema of the ingest target table needs to be extracted,
for later use in BulkInserter construction and record insertion.
Get Table Schema
BulkInserter is the Java object that performs distributed ingestion.
To create one, give it the database handle, table name & Type schema, record
ingest queue size, and any ingest options.
Configure & Create a BulkInserter
BulkInserter has been created, GenericRecord objects of the
given Type are inserted into it. The indexes used in assigning column
values to each GenericRecord are 0-based and match the column order of the
target table: store_id, id, total_amount, and then timestamp.
Insert Data
BulkInserter automatically inserts records as each of its queues reaches
the configured queueSize. Before ending the ingest session, the
BulkInserter is flushed, initiating inserts of any remaining queued records.
Flush Data to Server
Refresh Phase
Though the materialized view in this example (and in the associated use case) is configured to periodically refresh, a manual refresh is done between ingest & key lookup phases to avoid having to wait for the refresh cycle. The manual refresh requires only a database connection in order to be initiated.Refresh Materialized View
Key Lookup Phase
TheRecordRetriever is the Java object that performs
distributed key lookup. Creating one requires steps similar to creating a
BulkInserter.
First, the database connection is used to extract the Type schema of the
lookup source.
Get View Schema
RecordRetriever is created with that Type.
Create a RecordRetriever
Create Key Lookup Filter
store_id, of the
source view, store_sales. Since the view has only one column in its shard
key, there is only one entry in each key List.
Create Key Lookup Shard Value List
RecordRetriever extracts the corresponding store sales total and outputs the
store_id & total_sales. Similar to the assigning of column values with
the GenericRecord during ingest, the indexes used in extracting column
values from the result of the keyed lookup are 0-based and match the column
order of the source view.
Execute Key Lookup
Download & Run
Included below are the artifacts needed to run this example on an instance of Kinetica. Maven will be used to compile the Java program.- pom.xml - for compiling the Java example program
- DistributedIOUseCase.java
- the Java example program itself
-
Create a directory for the example Java project and move the example files
into it as shown:
Distributed Ingest & Key Lookup Project Layout
-
Compile the Java program in the directory with the pom.xml:
Compile Distributed Ingest & Key Lookup Project
-
Run the example:
Run Distributed Ingest & Key Lookup Project
-
Verify the output consists of aggregated sales totals for each store for the
current day and should be similar to the
following:
Ingest & Key Lookup Project Output