Core
The size limit for primary keys has been increased to 320 bytes [1]
Dictionary encoding performance enhancements
Jobs can now be performed asynchronously
The Kinetica File System (KiFS) has been introduced to greatly ease ML workflows centered around file processing and file generation
Non-HA multi-head primary key lookup is now possible using the RecordRetriever object available in the Java and Python APIs
HA support for multi-head primary key lookup and multi-head ingest
/alter/table jobs are now cancellable
A Host Manager-controlled alerting system has been added for significant application-level events and hardware resource usage. Alerts can be managed via gpudb.conf, and the most recent alerts can be viewed in the Kinetica Administration Application (GAdmin)
Improvements to aggregation:
- Aggregates listed in a HAVING expression no longer have to exist in the column name list
- Aggregates can now be listed before grouping attributes in the column name list
- Additional grouping functions (for both SQL and the native APIs):
Full Materialized View support via SQL or the native APIs--any number of source tables and intermediary tables & operations can be involved in creating a materialized view
Various improvements to the GAdmin user interface
Support for rank, partition, and window functionality in both SQL and the native APIs
The Python API has been updated to include an extension that enables increased speed when inserting and retrieving records from a GPUdbTable object
Joins can now be created using derived columns, e.g.,
h_db.create_join_table(join_table_name="my_join", table_names=[table.alias("a"), table.alias("b")], column_names=["a.x as ax", "b.y as by", "a.x+b.y as c"], expressions=["a.x = b.x"])
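To make the derived-column semantics concrete, here is a minimal pure-Python sketch of the rows such a join would be expected to produce. It does not call Kinetica: the sample rows and the helper name self_join_with_derived_columns are hypothetical, and the nested-loop join is only an illustration, not a description of how the database evaluates the join.

```python
# Illustrative only: emulates the result of the join above on in-memory rows.
# The sample data and function name are hypothetical, not part of any Kinetica API.

def self_join_with_derived_columns(rows):
    """Join rows to themselves on a.x = b.x and project ax, by, and c = a.x + b.y."""
    result = []
    for a in rows:
        for b in rows:
            if a["x"] == b["x"]:  # join expression: a.x = b.x
                result.append({
                    "ax": a["x"],          # a.x as ax
                    "by": b["y"],          # b.y as by
                    "c": a["x"] + b["y"],  # derived column: a.x+b.y as c
                })
    return result


sample_rows = [{"x": 1, "y": 10}, {"x": 2, "y": 20}]
joined = self_join_with_derived_columns(sample_rows)
for row in joined:
    print(row)
```

The derived column c is computed per joined row from columns of both aliases, which is what distinguishes it from a plain column projection.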
The /aggregate/statistics endpoint can now tune the behavior of the percentile() function using a second, comma-separated resolution value. A higher resolution gives a more accurate estimate but takes longer to calculate, e.g., the 50th percentile at a resolution of 200:
h_db.aggregate_statistics(table_name="my_table", column_name="col1", stats="count,min,percentile(50,200)")
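The accuracy versus cost trade-off can be sketched with a simple histogram-based estimator. This is an assumption made for illustration only: the release notes do not say how percentile() is computed internally, and the functions approx_percentile and exact_percentile below are hypothetical helpers, not Kinetica APIs. Here, "resolution" is treated as the number of equal-width bins.

```python
import math
import random


def approx_percentile(values, pct, resolution):
    """Estimate the pct-th percentile using `resolution` equal-width bins.

    Assumption: this histogram scheme only illustrates why more bins give a
    tighter estimate; it is not taken from the database implementation.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return float(lo)
    width = (hi - lo) / resolution
    counts = [0] * resolution
    for v in values:
        # Clamp so the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), resolution - 1)
        counts[idx] += 1
    target = pct / 100.0 * len(values)
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= target:
            return lo + (i + 0.5) * width  # midpoint of the matching bin
    return float(hi)


def exact_percentile(values, pct):
    """Nearest-rank percentile on the fully sorted data, for comparison."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


rng = random.Random(0)  # fixed seed so the comparison is repeatable
data = [rng.uniform(0.0, 1000.0) for _ in range(10_000)]
exact = exact_percentile(data, 50)
for resolution in (10, 200):
    estimate = approx_percentile(data, 50, resolution)
    print(resolution, round(abs(estimate - exact), 3))
```

In this sketch the estimate can be off by at most about half a bin width, so raising the resolution narrows the bins and tightens the bound, at the cost of more bins to maintain and scan.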
Geospatial
- Improved symbology scaling
Security
- Passwords now have a character limit of 1024, and user names and role names now have a character limit of 64
SQL
- SQL Views & Materialized Views
- Operations:
User-Defined Functions (UDFs)
- Improved performance of non-distributed UDF read and write
- A per-node concurrency limit can now be set in GAdmin or via the max_concurrency_per_node option of the /create/proc endpoint
- Each UDF API has access to a status field under ProcData that helps convey status information during UDF execution
Footnotes
[1] Primary key max length increased from 160 to 320 bytes in version 6.2.0.25