Kinetica supports the concept of a materialized view, which is a greater-performant query across one or more tables. For details on interfacing with materialized views from SQL, see CREATE MATERIALIZED VIEW.
A materialized view consists of one or more operations against one or more input tables. These operations will result in member tables that will be used to construct the materialized view and keep its data current. The member tables will be managed by the materialized view and will not be available to be operated on directly outside of the materialized view.
Member tables can be the result of any of the following:
- /aggregate/groupby with result_table option
- /aggregate/unique with result_table option
- /aggregate/unpivot with result_table option
The materialized view can be accessed via its root table, which is the final member table in the materialized view creation sequence.
When a change occurs in an input table, the member tables that are derived from that input table are marked as stale and in need of rebuilding if and when a data refresh is requested. When such a request is received, any stale member tables are rebuilt, and the root table will once again contain current data.
There are seven events that will cause the member tables of a materialized view to be marked as stale:
- /append/records against an input table
- /insert/records against an input table
- /insert/records/random against an input table
- /update/records against an input table
- Update within a filtered view that is an input table
- Update within a join view that is an input table
- UDF writing to an input table
There are four modes that govern the way in which a materialized view's data can be kept current:
The materialized view manages the refresh processing for all entities contained within it. The specified refresh mode of any member table (join, for instance) will be ignored.
A manual refresh of a materialized view can be performed on any materialized view, regardless of operating mode, to update the view's data. If a refresh is already in progress, the manual refresh request will wait until the currently executing refresh is complete before executing.
An example, in Python, of performing a manual refresh of a materialized view (customer_order_count):
In manual mode, the materialized view will only be made fully current by performing a manual refresh of the view. This will result in the immediate rebuilding of any of the view's member tables that require it.
On Change Mode
In on change mode, the materialized view will be kept current by detecting changes in any member tables or input tables and updating any member tables that require it. This behavior means that the returned data will always be current, with no overhead on query, but with potentially significant refresh processing for each change in the supporting data.
The granularity of the change detection is at the endpoint call level; e.g., a call to /update/records that resulted in 1,000,000 records being updated would be detected as one change, and one refresh would be initiated.
On change mode cannot be applied if the enable_worker_http_servers configuration setting (i.e. multi-head) is set to true. This is because multi-head inserts bypass the on change mode's notification system and are written directly to the tables on the nodes they're located on instead of communicating to the head node that a change has occurred.
On Query Mode
In on query mode, the materialized view will be refreshed upon query. This behavior means that the returned data will always be current, with some overhead on query.
In periodic mode, the materialized view will be kept current by refreshing the data at a given interval. This behavior means that the returned data may not be current for a given query, but there will be no query overhead and no refresh waiting time for modification calls as with on change mode.
When the beginning of a period is reached, the time of the following period is calculated and a refresh process initiated. If the refresh process does not complete by that pre-calculated time, another refresh process will be initiated immediately after the first completes. When this next refresh begins, the time of the next period in the sequence that occurs at some time past the current time will be calculated and the cycle repeated. In this manner, only one catch-up refresh will be processed regardless of the number of periods skipped by a long-running refresh.
Creating a Materialized View
There are three steps in creating a materialized view using the native API:
- Reserve the name of the materialized view and receive a unique ID that will be used to identify member tables as being part of the materialized view. The /create/materializedview endpoint will be used for this purpose. Options such as refresh mode and TTL will also be configured in this step.
- Issue requests against the input tables and against any result tables created by those requests, tagging each result table created with the unique ID for the materialized view.
- Issue the final processing request, naming the resulting table with the reserved name of the materialized view and tagging it with the unique ID. This member table becomes the root table and the materialized view creation sequence is complete.
To create a materialized view in SQL, see CREATE MATERIALIZED VIEW.
The following examples use the Python API to construct a materialized view.
To create a materialized view (recent_orders) on a single table (orders), and have recent_orders show records in orders that were ordered in the last 24 hours (as denoted by the table's o_orderdate column), refreshed every time the data in orders is updated:
To create a materialized view (customer_order_count) across two tables: a base table (orders) and a parent table (customer), and refresh the order total per customer in customer_order_count every hour:
Limitations and Cautions
- A materialized view cannot have a persisted table as a member table; e.g., attempting to add a projection created with a persist option set to true to a materialized view will fail.
- A materialized view is not persisted, by default, and will not survive a database restart. However, specifying a persist option of true when creating one will make the view permanent and not expire; database restarts will cause the materialized view to be rebuilt from its input tables.
- A /merge/records operation cannot be part of a materialized view.
- When performing a refresh, a materialized view will not take into account DML operations on member tables that occurred during its creation; these include:
- When any one of the input tables to a materialized view is altered or dropped, the materialized view will also be dropped.