Kinetica provides support for User-Defined Function (UDF) creation and management in SQL. Unlike conventional UDFs, Kinetica UDFs are external programs that can be managed via SQL and may run in distributed fashion across the cluster.
A UDF that is defined to return a specific table type is known as a User-Defined Table Function (UDTF), which can be used within a SELECT statement to return the output of the function as a query result set.
UDF/UDTF features accessible via SQL include:
For UDF/UDTF execute permission management, see:
CREATE FUNCTION
Creates a new UDF/UDTF with the given options. A UDTF is simply a UDF defined to return a single table, using the RETURNS TABLE clause. Only users with the SYSTEM ADMIN permission can create UDFs & UDTFs.
|
|
Parameters | Description | ||||||
---|---|---|---|---|---|---|---|
OR REPLACE | Any existing UDF with the same name will be dropped before creating this one | ||||||
<function name> | Name of the UDF, which can be referenced in subsequent commands | ||||||
RETURNS TABLE | Specifies that the UDF returns a table after execution and defines the column names and types to which the function's output data is expected to conform Important This defines the UDF as a User-Defined Table Function (UDTF), which allows the function to be called within a SQL query; see Executing Functions for details. |
||||||
MODE | Name of the execution mode for the UDF; the following modes are available:
|
||||||
RUN_COMMAND | The command used to execute the files associated with the UDF; e.g., python, java | ||||||
RUN_COMMAND_ARGS | The arguments supplied alongside the given RUN_COMMAND; e.g., udf.py, --username jdoe --password 'jdoe123' | ||||||
FILE PATHS | A comma-separated list of single-quoted file paths from which the function and any additional files will be loaded; the file paths should be KiFS locations and exist prior to the UDF/UDTF creation. | ||||||
WITH OPTIONS | Optional indicator that a comma-delimited list of option/value assignments will follow
|
Before creating a UDF/UDTF, upload the function file,
udf_sos_proc.py
to a new directory, udf
, in
KiFS:
|
|
|
|
To create a distributed Python UDF_SOS_PY_PROC UDF that uses that file and returns a table featuring two columns:
|
|
To create the same function as a UDTF instead:
|
|
The Python script, udf_sos_proc.py , employed in these examples is shown below:
|
|
Executing Functions
An existing UDF/UDTF can be executed using the EXECUTE FUNCTION command. Additionally, a UDTF can be executed as a table function within a SELECT statement. Any user with the SYSTEM ADMIN permission or the EXECUTE FUNCTION permission on a specific UDF/UDTF (or across all UDFs/UDTFs) is allowed to execute it.
|
|
|
|
The <function name> should be the name of an existing UDF.
Function Parameters
The following are the available function parameters that may be passed to a UDF in an EXECUTE FUNCTION or table function call, along with their corresponding function values.
Parameters | Description | ||||||
---|---|---|---|---|---|---|---|
INPUT_TABLE_NAMES | Optional set of "tables" that will be used as input to the UDF/UDTF, specified as a set of queries defining the source data for each "table". The value of this parameter is either the INPUT_TABLE or INPUT_TABLES function, depending on the number of input "tables".
|
||||||
OUTPUT_TABLE_NAMES | Optional set of tables that will be used as output, specified as a set of names passed as a comma-delimited list of strings to the OUTPUT_TABLES function; e.g.: OUTPUT_TABLES('sales_summary', 'customer_profile') |
||||||
PARAMS | Optional list of parameters to pass to the function, specified as a set of key/value pairs passed as a comma-delimited list of <key> = '<value>' assignments to the KV_PAIRS function; e.g.: KV_PAIRS(customer_id = '1', report_type = 'yearly') |
||||||
OPTIONS | Optional list of function execution options, specified as a set of key/value pairs passed as a comma-delimited list of <key> = '<value>' assignments to the KV_PAIRS function; e.g.: KV_PAIRS(max_concurrency_per_node = '2') |
For example, to execute the UDF_SOS_PY_PROC UDF, specifying an input and output table:
|
|
To execute the UDTF_SOS_PY_PROC UDTF, specifying an input table and returning the output as the result set of a SELECT statement:
|
|
DROP FUNCTION
Removes an existing UDF/UDTF. Only users with the SYSTEM ADMIN permission can remove UDFs.
|
|
Parameters | Description |
---|---|
<function name> | Name of the UDF to remove |
To drop the UDF_SOS_PY_PROC UDF:
|
|
SHOW FUNCTION
Outputs the configuration of an existing UDF/UDTF. Users with SYSTEM ADMIN permission will see the entire UDF configuration, while users with proc level permissions will only see the name of the UDF & its execution mode.
|
|
Parameters | Description |
---|---|
<function name> | Name of the UDF whose configuration will be output. Use * instead to output the DDL of all UDFs. |
Note
The response to SHOW FUNCTION is a single-column result set with the DDL statement as the value in the DDL column.
To show the UDF_SOS_PY_PROC UDF:
|
|
DESCRIBE FUNCTION
Displays the attributes of the given UDF/UDTF in tabular format. Users with SYSTEM ADMIN permission or proc-level permissions on the function (or across all functions) will be allowed to see the function attributes.
|
|
Parameters | Description |
---|---|
<function name> | Name of the UDF whose attributes will be output. Use * instead to output the attributes of all UDFs. |
Note
The response to DESCRIBE FUNCTION is a six-column result set:
- FUNCTION_NAME - name of the UDF/UDTF
- MODE - execution mode of the function; distributed or nondistributed
- RUN_COMMAND - command used to run the function
- RUN_COMMAND_ARGS - command arguments used to run the function
- FILE_PATHS - list of files involved in the execution of the function
- WITH_OPTIONS - list of key/value pairs of options used to create the function
To describe the UDF_SOS_PY_PROC UDF:
|
|