Kinetica provides support for User-Defined Function (UDF) creation and management in SQL. Unlike conventional UDFs, Kinetica UDFs are external programs that can be managed via SQL and may run in distributed fashion across the cluster.
A UDF that is defined to return a specific table type is known as a User-Defined Table Function (UDTF), which can be used within a SELECT statement to return the output of the function as a query result set.
Package management for Python UDFs is available via the UDF environment feature.
UDF/UDTF features accessible via SQL include:
For UDF/UDTF execute permission management, see:
UDF/UDTF environment management features accessible via SQL include:
- CREATE FUNCTION ENVIRONMENT
- ALTER FUNCTION ENVIRONMENT
- DROP FUNCTION ENVIRONMENT
- SHOW FUNCTION ENVIRONMENT
- DESCRIBE FUNCTION ENVIRONMENT
CREATE FUNCTION
Creates a new UDF/UDTF with the given options. A UDTF is simply a UDF defined to return a single table, using the RETURNS TABLE clause.
|
|
Parameters | Description | ||||||
---|---|---|---|---|---|---|---|
OR REPLACE | Any existing UDF with the same name will be dropped before creating this one | ||||||
<function name> | Name of the UDF, which can be referenced in subsequent commands | ||||||
RETURNS TABLE | Specifies that the UDF returns a table after execution and defines the column names and types to which the function's output data is expected to conform Important This defines the UDF as a User-Defined Table Function (UDTF), which allows the function to be called within a SQL query; see Executing Functions for details. | ||||||
MODE | Name of the execution mode for the UDF; the following modes are available:
| ||||||
RUN_COMMAND | The command used to execute the files associated with the UDF; e.g., python | ||||||
RUN_COMMAND_ARGS | The arguments supplied alongside the given RUN_COMMAND; e.g.: 'udf.py','--username','jdoe','--password','jdoe123' | ||||||
FILE PATHS | A comma-separated list of single-quoted file paths from which the function and any additional files will be loaded; the file paths should be KiFS locations and exist prior to the UDF/UDTF creation. | ||||||
WITH OPTIONS | Optional indicator that a comma-delimited list of option/value assignments will follow
|
Before creating a UDF/UDTF, upload the function file,
udf_st.py
to a new directory, udf
, in
KiFS:
|
|
|
|
To create a distributed Python udf_st UDF that uses that file:
|
|
To create the same function as a UDTF that returns a result set instead:
|
|
Executing Functions
An existing UDF/UDTF can be executed using the EXECUTE FUNCTION command. Additionally, a UDTF can be executed as a table function within a SELECT statement.
|
|
|
|
The <function name> should be the name of an existing UDF.
Tip
To preserve the runtime statistics of a UDF/UDTF, use the KI_HINT_SAVE_UDF_STATS hint in the call to it; see below for an example.
Function Parameters
The following are the available function parameters that may be passed to a UDF in an EXECUTE FUNCTION or table function call, along with their corresponding function values.
Parameters | Description | ||||||
---|---|---|---|---|---|---|---|
INPUT_TABLE_NAMES | Optional set of "tables" that will be used as input to the UDF/UDTF, specified as a set of queries defining the source data for each "table". The value of this parameter is either the INPUT_TABLE or INPUT_TABLES function, depending on the number of input "tables".
| ||||||
OUTPUT_TABLE_NAMES | Optional set of tables that will be used as output, specified as a set of names passed as a comma-delimited list of strings to the OUTPUT_TABLES function; e.g.: OUTPUT_TABLES('sales_summary', 'customer_profile') | ||||||
PARAMS | Optional list of parameters to pass to the function, specified as a set of key/value pairs passed as a comma-delimited list of <key> = '<value>' assignments to the KV_PAIRS function; e.g.: KV_PAIRS(customer_id = '1', report_type = 'yearly') | ||||||
OPTIONS | Optional list of function execution options, specified as a set of key/value pairs passed as a comma-delimited list of <key> = '<value>' assignments to the KV_PAIRS function; e.g.: KV_PAIRS(run_tag = 'my-udf') All valid options from Execute Proc can be used here. |
Examples
To execute the udf_st UDF, specifying an input and output table and the hint to save runtime statistics:
|
|
To execute the UDTF_ST UDTF, specifying an input table and returning the output as the result set of a SELECT statement:
|
|
DROP FUNCTION
Removes an existing UDF/UDTF.
|
|
Parameters | Description |
---|---|
<function name> | Name of the UDF to remove |
To drop the UDF_SOS_PY_PROC UDF:
|
|
SHOW FUNCTION
Outputs the configuration of one or more existing UDFs/UDTFs. Users with SYSTEM ADMIN or ADMIN permission on the UDF (or all UDFs) will see the entire UDF configuration, while users with EXECUTE permissions will only see the name of the UDF & its execution mode.
|
|
Parameters | Description |
---|---|
<function name> | Name of the UDF whose configuration will be output. Use * instead to output the DDL of all UDFs. |
Note
The response to SHOW FUNCTION is a single-column result set with the DDL statement as the value in the DDL column.
To show the UDF_SOS_PY_PROC UDF:
|
|
SHOW FUNCTION STATUS
Outputs the status of one or more running, or previously running, UDFs/UDTFs. Users with SYSTEM ADMIN or ADMIN permission on all UDFs will see all current & historical UDF runs, while users with EXECUTE or ADMIN permissions on individual UDFs will only see the current & historical runs for UDFs/UDTFs that they have executed themselves. Historical run data is preserved when a UDF/UDTF is executed with the KI_HINT_SAVE_UDF_STATS hint.
|
|
Parameters | Description |
---|---|
VERBOSE | Specifies verbose output mode. This modifies the response as noted below. |
FOR '<run ID>' | Specifies that only the status of UDF/UDTF execution with run ID <run ID> should be output. |
Note
The response to SHOW FUNCTION STATUS depends on whether the VERBOSE option was specified.
- No VERBOSE - output is a three-column result set:
- FUNCTION_NAME - name of the UDF/UDTF
- RUN_ID - ID associated with a given execution of the UDF/UDTF
- OVERALL_STATUS - completion status of UDF/UDTF
- VERBOSE - output is a single-column, single-row result set with a list of all UDF/UDTF statuses, each as a formatted block of information in the FUNCTION_STATUSES column
To show the status of all UDF/UDTF runs:
|
|
To show the verbose status of a UDF/UDTF with run ID 1000:
|
|
DESCRIBE FUNCTION
Displays the attributes of the given UDF/UDTF in tabular format. Users with SYSTEM ADMIN or ADMIN permission on the UDF (or all UDFs) will see the entire UDF configuration, while users with EXECUTE permissions will only see the name of the UDF & its execution mode.
|
|
Parameters | Description |
---|---|
<function name> | Name of the UDF whose attributes will be output. Use * instead to output the attributes of all UDFs. |
Note
The response to DESCRIBE FUNCTION is a six-column result set:
- FUNCTION_NAME - name of the UDF/UDTF
- MODE - execution mode of the function; distributed or nondistributed
- RUN_COMMAND - command used to run the function
- RUN_COMMAND_ARGS - command arguments used to run the function
- FILE_PATHS - list of files involved in the execution of the function
- WITH_OPTIONS - list of key/value pairs of options used to create the function
To describe the UDF_SOS_PY_PROC UDF:
|
|
CREATE FUNCTION ENVIRONMENT
Creates a new function environment for UDFs/UDTFs written using the Python UDF API, where any packages required by the Python UDF can be installed and used. Packages can be installed with the ALTER FUNCTION ENVIRONMENT command.
|
|
Note
UDFs running with no function environment specified run in the default function environment, kinetica-default-environment. See Pre-installed Libraries on the Cluster for the list of pre-installed Python packages.
Parameters | Description |
---|---|
OR REPLACE | Any existing UDF environment with the same name will be dropped before creating this one |
<environment name> | Name of the UDF environment, which can be referenced in subsequent commands |
To create the Python UDF environment udfe_st:
|
|
ALTER FUNCTION ENVIRONMENT
Manages the installed Python packages associated with the given Python UDF/UDTF environment. Packages are given as a space-separated list of names; and for installations, an optional version number per package can be specified.
|
|
|
|
Parameters | Description |
---|---|
<environment name> | Name of the UDF environment to manage |
<package name> | Name of the Python package to install into or uninstall from the UDF environment |
<package version> | Version number of the Python package to install |
To install the sentence-transformers package and urllib3 version 1.26.18 into the Python UDF environment udfe_st:
|
|
To uninstall the sentence-transformers & urllib3 packages from the Python UDF environment udfe_st:
|
|
DROP FUNCTION ENVIRONMENT
Removes an existing Python UDF/UDTF environment.
|
|
Parameters | Description |
---|---|
<environment name> | Name of the UDF environment to remove |
To remove the Python UDF environment udfe_st:
|
|
SHOW FUNCTION ENVIRONMENT
Outputs the CREATE FUNCTION ENVIRONMENT statement used to create the given Python UDF/UDTF environment(s).
|
|
Note
The response to SHOW FUNCTION ENVIRONMENT is a single-column result set with the DDL statement as the value in the DDL column.
Parameters | Description |
---|---|
<environment name> | Name of the UDF environment whose CREATE statement will be output. Use * instead to output the CREATE statements of all UDF environments. |
To show the Python UDF environment udfe_st:
|
|
To show all Python UDF environments:
|
|
DESCRIBE FUNCTION ENVIRONMENT
Displays the Python package listing for the given Python UDF/UDTF environment(s).
|
|
Note
The response to DESCRIBE FUNCTION ENVIRONMENT is a two-column result set:
- ENVIRONMENT_NAME - name of the function environment name
- PACKAGES - names & versions of the Python packages installed in the function environment; this includes both user-installed packages and those installed by default
Parameters | Description |
---|---|
<environment name> | Name of the UDF environment whose Python packages will be output. Use * instead to output the Python packages of all UDF environments. |
To list the packages in the Python UDF environment udfe_st:
|
|
To list the packages in all Python UDF environments:
|
|