Version:

Migrating to Kinetica 7.0

While any version of Kinetica can be upgraded to the latest version using KAgent, there are several aspects of upgrading that should be considered before doing so, as well as a number of steps that should be taken after installing Kinetica 7.0.

Database Clients

The native API is not backward compatible with prior releases; any clients that make use of these interfaces need to be updated after installation. See Native APIs for instructions on upgrading client APIs. After the upgrade, client code that makes use of those interfaces will also need to be modified, as well as any supporting configuration (Java pom.xml files, Python packages, etc.).

Database Connectors

Database connectors (Kafka, Spark, etc.) make use of the updated native API and ODBC/JDBC drivers when communicating with the database and should also be updated to the latest versions after installation. See Data Connectors for the list of connectors and their respective configuration guides.

Database Drivers

The ODBC/JDBC interfaces are not backward compatible with prior releases; any clients that make use of the ODBC or JDBC drivers need to be updated after installation. See ODBC/JDBC for instructions on upgrading drivers.

Core

On-Change Refreshed Materialized Views

Materialized views with an on-change refresh mode will not detect changes to input tables being ingested into with multi-head ingest turned on. Both manual and periodic refresh modes will detect these changes and rebuild the view appropriately.

Refreshable Join Views

Join Views, entities created by the /create/jointable endpoint, are no longer able to be refreshed directly or by event. This refresh functionality has been moved into materialized views, entities created by the /create/materializedview endpoint. Wrapping a join view in a materialized view can restore this behavior; details below.

Previous refresh_method options:

  • manual
  • on_query
  • on_insert

Previous refresh options:

  • refresh (incremental refresh if base table modified; complete otherwise)
  • full_refresh (complete refresh of member tables)

Manually Refreshed Join Views

A manually-refreshed join view can be wrapped in materialized view for the same effect.

For example, in Python:

Previously supported:

h_db.create_join_table(
    join_table_name="order_products,
    table_names=["orders as o", "products as p"],
    column_names=["o.id as order_id", "p.product_name"],
    expressions=["o.product_id = p.id"],
    options={"refresh_method": "manual"}
)

Currently supported:

view_id = h_db.create_materialized_view(
              table_name = 'order_products',
              options = {'refresh_method':'manual'}
          )['view_id']

h_db.create_join_table(
    join_table_name="order_products,
    table_names=["orders as o", "products as p"],
    column_names=["o.id as order_id", "p.product_name"],
    expressions=["o.product_id = p.id"]
)

On-Insert Refreshed Join Views

The on-insert refresh mode for join views is no longer supported. The recommended replacement strategy is to wrap the join view in a materialized view in on-change refresh mode.

For example, in Python:

Previously supported:

h_db.create_join_table(
    join_table_name="order_products,
    table_names=["orders as o", "products as p"],
    column_names=["o.id as order_id", "p.product_name"],
    expressions=["o.product_id = p.id"],
    options={"refresh_method": "on_insert"}
)

Currently supported:

view_id = h_db.create_materialized_view(
              table_name = 'order_products',
              options = {'refresh_method':'on_change'}
          )['view_id']

h_db.create_join_table(
    join_table_name="order_products,
    table_names=["orders as o", "products as p"],
    column_names=["o.id as order_id", "p.product_name"],
    expressions=["o.product_id = p.id"]
)

Note

This replacement strategy will cause updates to the view to be processed more often than the original if records are updated or deleted in the base tables of the join.

On-Query Refreshed Join Views

On-query refresh modes for both join views & materialized views are no longer supported. However, there are several ways to achieve the effective result with the current feature set:

  • Wrap the join view in a materialized view in manual refresh mode, as demonstrated under Manually Refreshed Join Views; then manually refresh it before each query.

  • Wrap the join view in a materialized view in on-change refresh mode, as demonstrated under On-Insert Refreshed Join Views.

  • Wrap the join view in a materialized view in periodic refresh mode with a reasonable refresh interval

    For example, in Python:

    Previously supported:

    h_db.create_join_table(
        join_table_name="order_products,
        table_names=["orders as o", "products as p"],
        column_names=["o.id as order_id", "p.product_name"],
        expressions=["o.product_id = p.id"],
        options={"refresh_method": "on_query"}
    )
    

    Currently supported:

    view_id = h_db.create_materialized_view(
                  table_name = 'order_products',
                  options = {
                      'refresh_method':'periodic',
                      'refresh_period':'60'
                  }
              )['view_id']
    
    h_db.create_join_table(
        join_table_name="order_products,
        table_names=["orders as o", "products as p"],
        column_names=["o.id as order_id", "p.product_name"],
        expressions=["o.product_id = p.id"]
    )
    

    Note

    This creates a materialized view with a 1-minute periodic refresh.

  • Use a SQL view to perform the join

    For example, in Python:

    Previously supported:

    h_db.create_join_table(
        join_table_name="order_products,
        table_names=["orders as o", "products as p"],
        column_names=["o.id as order_id", "p.product_name"],
        expressions=["o.product_id = p.id"],
        options={"refresh_method": "on_query"}
    )
    

    Now, in SQL:

    Currently supported:

    CREATE VIEW order_products AS
    SELECT o.id AS order_id, p.product_name
    FROM orders o
    JOIN products p ON o.product_id = p.id
    

    Note

    SQL views are only accessible via ODBC/JDBC or the /execute/sql endpoint.

Manual Refresh

Manually refreshing a join view is no longer supported. However, when wrapping a join view in a materialized view in manual refresh mode, as demonstrated under Manually Refreshed Join Views, the view can be manually refreshed as outlined under Manual Refresh.

Note

Only a full refresh is supported for materialized views.

SQL Support

Kinetica 7.0 implements a new SQL engine that adheres more closely to the SQL-92 standard than the previous one. This, along with other changes to the core, has narrowed support for some SQL operations, listed below.

Root Schema/Collection

In previous releases, the top-level (or root) schema/collection was referenced either by not specifying a schema, or by specifying an empty string. Now, the top-level schema is referenced by the keyword ROOT. The default schema is ROOT, and the default catalog is KINETICA.

Any existing database objects named ROOT should be renamed before upgrading. While ALTER TABLE (and the corresponding /alter/table endpoint) will still be supported for renaming objects after upgrade, use of these objects until they are renamed will be diminished.

General Reference

Previously supported:

SELECT *
FROM top_level_table

Currently supported:

SELECT *
FROM top_level_table
SELECT *
FROM root.top_level_table
SELECT *
FROM kinetica.root.top_level_table

Moving to Top Level

Previously supported:

ALTER TABLE some_schema.in_schema_table
MOVE TO ""

Currently supported:

ALTER TABLE some_schema.in_schema_table
MOVE TO ROOT

Data Types

The following data types are no longer supported, both of which were previously backed by native int8 type columns:

  • BIT
  • SQL_BIT

Previously supported:

CREATE TABLE bit_fields
(
    b BIT,
    sb SQL_BIT
)

Currently supported:

CREATE TABLE bit_fields
(
    b INT8,
    sb TINYINT
)

Note

Both of these types will result in native int8 column types.

Column Aliases

Some queries, such as those involving the GROUPING function may require any column alias used to be different from the original (or another) column name in the same query.

Previously supported:

SELECT
    IF (GROUPING(vendor_id) = 1, '<ALL VENDORS>', vendor_id) AS vendor_id,
    COUNT(*) AS total_pickups
FROM
    nyctaxi
GROUP BY
    ROLLUP(vendor_id)

Currently supported:

SELECT
    IF (GROUPING(vendor_id) = 1, '<ALL VENDORS>', vendor_id) AS vendor_id_alias,
    COUNT(*) AS total_pickups
FROM
    nyctaxi
GROUP BY
    ROLLUP(vendor_id)

Common Table Expressions (WITH statement)

Common Table Expressions (CTEs) are no longer supported outside of DML or DDL statements. Instead, they must be moved within the statement.

DML

Previously supported:

WITH source_cte (new_id) AS
(SELECT id + 1 FROM old_table)
INSERT INTO new_table
SELECT new_id, 'B'
FROM source_cte

Currently supported:

INSERT INTO new_table
WITH source_cte (new_id) AS
(SELECT id + 1 FROM old_table)
SELECT new_id, 'B'
FROM source_cte

DDL

Previously supported:

WITH source_cte (new_id) AS
(SELECT id + 1 FROM old_table)
CREATE TABLE new_table AS
SELECT new_id, 'B' AS new_code
FROM source_cte

Currently supported:

CREATE TABLE new_table AS
WITH source_cte (new_id) AS
(SELECT id + 1 FROM old_table)
SELECT new_id, 'B' AS new_code
FROM source_cte

Interval Expression

Three changes relating to ANSI standards compliance have affected the syntax of the interval expression:

  • Quotes around the interval value are required
  • Digits in the interval value have to be reflected in the interval type's parameter for interval values of more than two digits
  • Interval types are always singular

Previously supported:

SELECT
    purchase_date + INTERVAL 1 DAY,
    purchase_date + INTERVAL 10 DAYS,
    purchase_date + INTERVAL 100 DAYS,
    purchase_date + INTERVAL 1000 DAYS
FROM orders

Currently supported:

SELECT
    purchase_date + INTERVAL '1' DAY,
    purchase_date + INTERVAL '10' DAY,
    purchase_date + INTERVAL '100' DAY(3),
    purchase_date + INTERVAL '1000' DAY(4)
FROM orders

Reserved Words

Reserved words can be used as table & column names, but must be quoted if so. See SQL Reserved Word List for the complete list of reserved words.

Given that FLOOR & CONDITION are reserved words, both must be quoted when referenced as table/column names:

CREATE OR REPLACE TABLE "floor"
(
    id INT,
    type VARCHAR(16),
    "condition" VARCHAR(32)
)
INSERT INTO "floor" (id, type, "condition")
VALUES (1, 'HARDWOOD', 'NEW')
SELECT id, type, "condition"
FROM "floor"

Reading/Writing CSV Files

Reading from CSV files is now only supported via INSERT INTO statements. The target table must already exist and its columns must match the CSV file being loaded in number and order.

Writing to CSV files is no longer supported, though they can be loaded by other means.

Previously supported:

  • File Reading

    SELECT <column list>
    FROM FILE."<file name>"
    <JOIN, WHERE, GROUP BY, etc. clauses>
    
  • File Loading

    INSERT INTO <table name> [(<column list>)]
    SELECT <column list>
    FROM FILE."<file name>"
    <JOIN, WHERE, GROUP BY, etc. clauses>
    
  • File Creation

    CREATE TABLE FILE."<file name>" AS
    <SELECT statement>
    
  • File Appending

    INSERT INTO FILE."<file name>"
    <SELECT statement>
    

Currently supported:

Hints

The following hints are no longer supported with the new SQL engine, most of which are now optimized internally:

  • KI_HINT_ALLOW_PARTIAL_PASSDOWN
  • KI_HINT_DONT_FILTER_IN_AGGREGATE
  • KI_HINT_DONT_SPLIT_JOINS
  • KI_HINT_EXPLAIN_JOINS (replaced by full execution planner command)
  • KI_HINT_KEEP_CROSSJOINS
  • KI_HINT_MAX_QUERY_DIMENSIONS
  • KI_HINT_MAX_ROWS_TO_FETCH
  • KI_HINT_NO_PASSDOWN
  • KI_HINT_ROWS_PER_FETCH
  • KI_HINT_SIMULATION
  • KI_HINT_SPLIT_JOINS