> ## Documentation Index
> Fetch the complete documentation index at: https://docs.kinetica.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Example UDF (Non-CUDA) - Sum of Squares

The following is a complete example, using the Python API, of a non-CUDA UDF
that takes a list of *input tables* and corresponding *output tables* (must be
the same number) and, for each record of each *input table*, sums the squares of
*input table* columns and saves the result to the corresponding *output table*
column; i.e.:

in.a<sup>2</sup> + in.b<sup>2</sup> + ... + in.n<sup>2</sup> -> out.a

## Download & Run

This example will contain the following *Python* scripts *(click to download)*:

* A UDF management program,
  [udf\_sos\_manager.py](https://raw.githubusercontent.com/kineticadb/kinetica-docs/master/content/udf/python/examples/dist_noncuda_sum_of_squares/udf_sos_manager.py),
  written using the Python API, which creates the input & output tables, and
  creates the UDF and executes it.
* A UDF,
  [udf\_sos\_proc.py](https://raw.githubusercontent.com/kineticadb/kinetica-docs/master/content/udf/python/examples/dist_noncuda_sum_of_squares/udf_sos_proc.py),
  written using the Python UDF API, which contains the sum-of-squares example.

After copying these scripts to a local directory, the example can be run as
follows, specifying the database URL, username, & password:

```bash title="Run Example" theme={null}
$ python3 udf_sos_manager.py init <url> <username> <password>
$ python3 udf_sos_manager.py exec <url> <username> <password>
```

Verify the results, using a SQL client ([KiSQL](/content/tools/kisql)),
[Kinetica Workbench](/content/admin/workbench), or other:

* The `udf_sos_py_in_table` table is created in the user's default schema
  (`ki_home`, unless a different one was assigned during account creation)
* A matching `udf_sos_py_out_table` table is created in the same schema
* The `udf_sos_py_in_table` contains 10,000 records of random data
* The `udf_sos_py_out_table` contains the sum of square of the two columns
  from `udf_sos_py_in_table`.
* To show the source columns and the sum-of-squares together, run the
  following query:

  ```sql title="Show Sum of Squares Calculations" theme={null}
  SELECT
     sos_in.id,
     x1 || '^2 + ' || x2 || '^2 = ' || y AS "Equation"
  FROM
     udf_sos_py_in_table sos_in,
     udf_sos_py_out_table sos_out
  WHERE
     sos_in.id = sos_out.id
  ORDER BY
     id
  ```

## UDF Detail

The example UDF uses a single table, `udf_sos_py_in_table`, as input and a
corresponding table, `udf_sos_py_out_table`, for output.

The input table will contain two *float* columns and be populated with 10,000
pairs of randomly-generated numbers.  The output table will contain one *float*
column that will hold the sums calculated by the UDF.  Both tables will also
contain an *int* column that is the calculation identifier, allowing the input
data to be matched up with the output data after the UDF has run.

<Info>
  The UDF will assume the first column of the input table, as defined in
  the original table creation process, is the identifier field.  All of
  the remaining columns after the first will be used in the
  sum-of-squares calculation.
</Info>

The UDF will calculate the sum of the squares of each of the 10,000 pairs of
numbers and insert into the output table the corresponding 10,000 sums.

### Initialization (udf\_sos\_manager.py init)

The *init* option invokes the `init()` function in the
`udf_sos_manager.py` script.  This function will create the input table for
the UDF to use as the source of the calculations and the output table into which
the results will be inserted.  It also populates the input data using the
standard Kinetica Python API, all outside of the UDF execution framework.

Several aspects of the initialization process are noteworthy:

* The external database connection, indicative of the use of the standard
  Kinetica Python API--the UDF itself will not have this, as it runs within
  the database:

  ```python Connect to the Database theme={null}
  kinetica = gpudb.GPUdb(host=[args.url], username=args.username, password=args.password)
  ```

* Input and output table creation:

  ```python Create Input Table theme={null}
  input_table_obj = gpudb.GPUdbTable(
      _type = [
          ["id", "int", gpudb.GPUdbColumnProperty.INT16, gpudb.GPUdbColumnProperty.PRIMARY_KEY],
          ["x1", "float"],
          ["x2", "float"]
      ],
      name = input_table,
      db = kinetica
  )
  ```

  ```python Populate Input Table Data theme={null}
  records = []
  for val in range(1, MAX_RECORDS+1):
      records.append([val, random.gauss(1, 1), random.gauss(1, 2)])
  input_table_obj.insert_records(records)
  ```

  ```python Create Results Table theme={null}
  gpudb.GPUdbTable(
      _type = [
          ["id", "int", gpudb.GPUdbColumnProperty.INT16, gpudb.GPUdbColumnProperty.PRIMARY_KEY],
          ["y", "float"]
      ],
      name = output_table,
      db = kinetica
  )
  ```

### UDF (udf\_sos\_proc.py)

The `udf_sos_proc.py` script is the UDF itself.  It uses the Kinetica Python
UDF API to compute the sums of squares of input table columns and output those
sums to the output table.  It runs within the UDF execution framework, and as
such, is not called directly--instead, it is registered and launched by
<Badge color="blue-destructive">udf\_sos\_manager.py</Badge>.

Noteworthy in the UDF are the following:

* The initial call to `ProcData()` to access the database:

  ```python Begin UDF theme={null}
  proc_data = ProcData()
  ```

* The size of the output table must be specified before writing to it:

  ```python Size Results Table to Match the Input Table theme={null}
  out_table.size = in_table.size
  ```

* The sum-of-squares computation and writing to the output table:

  ```python Compute Sum-of-Squares and Write to Output Table theme={null}
  # Loop through the remaining input table columns
  for in_column in islice(in_table, 1, None):
      # For every record value in the column...
      for calc_num in range(0, in_table.size):
          # Add the square of that value to the corresponding output column
          y[calc_num] += in_column[calc_num] ** 2
  ```

* The final call to `complete()` to mark the process as finished and ready for
  clean-up:

  ```python End UDF theme={null}
  proc_data.complete()
  ```

### Execution (udf\_sos\_manager.py exec)

The *exec* option invokes the `exec()` function in the
`udf_sos_manager.py` script.  This function will read the UDF script in as
bytes, and create a UDF, uploading the script to the database. The function will
then execute the UDF.

* The registration step associates a name with the UDF execution code contained
  in <Badge color="blue-destructive">udf\_sos\_proc.py</Badge>, the command
  *(* <Badge color="blue-destructive">python3</Badge> *)* and arguments
  *(the name of the proc script)* to use to run it, and that it will run in
  *distributed* mode.

  ```python Create UDF theme={null}
  response = kinetica.create_proc(PROC_NAME, 'distributed', files, 'python3', [PROC_FILE_NAME], {})
  ```

* The execution step invokes the UDF by name, passing in the input & output
  table names against which the UDF will execute.

  ```python Execute UDF theme={null}
  response = kinetica.execute_proc(PROC_NAME, {}, {}, [input_table], {}, [output_table], {})
  ```
