The unpivot operation works much the same as the SQL concept: "rotating" a
set of column headers (and the associated values) into rows to create a more
normalized table that's longer and narrower. It is nearly the opposite of
the pivot operation, though it will not be able to unmerge
records and restore nulls that were aggregated by the pivot operation. You can
perform unpivot operations using /aggregate/unpivot or
via the SQL UNPIVOT function. Visit the Aggregation
documentation to review details on aggregating data.
An unpivot operation could be used to convert a table like this:
1
2
3
4
5
6
+--------+--------------+--------------+--------------+
| name | home_phone | work_phone | cell_phone ||--------+--------------+--------------+--------------|| Jane | 123-456-7890 | 111-222-3333 ||| John | 123-456-7890 || 333-222-1111 |+--------+--------------+--------------+--------------+
into a table like this:
1
2
3
4
5
6
7
8
9
10
+--------+--------------+----------------+
| name | phone_type | phone_number ||--------+--------------+----------------|| Jane | Cell ||| Jane | Home | 123-456-7890 || Jane | Work | 111-222-3333 || John | Cell | 333-222-1111 || John | Home | 123-456-7890 || John | Work ||+--------+--------------+----------------+
Each original record will become three records in the unpivoted version, one for
each phone type & number associated with the name. Note that nulls in the
source data will still result in an output record being produced; e.g., a
contact with no cell phone number will still have a record in the unpivoted
output attributing a null phone number with the cell phone type to that contact.
The results of an unpivot may be written to a separate
memory-only table using the result_table option.
This table can be persisted (like normal tables) using the
result_table_persist option.
A memory-only table name must adhere to the standard
naming criteria. Each memory-only table
exists within a schema and follows the standard
name resolution rules for tables.
Examples
Customer
Given a source table customer_contact, which lists the name, ID, and home,
work, & cell phone numbers for each customer in the table:
1
2
3
4
5
6
+--------+--------------+--------------+--------------+
| name | home_phone | work_phone | cell_phone ||--------+--------------+--------------+--------------|| Jane | 123-456-7890 | 111-222-3333 ||| John | 123-456-7890 || 333-222-1111 |+--------+--------------+--------------+--------------+
An unpivot operation can be performed like so, creating separate
home, work, & cell phone records for each customer without copying over their
names:
kinetica.aggregate_unpivot(table_name="example.customer_contact",column_names=["name"],variable_column_name="phone_type",value_column_name="phone_number",pivoted_columns=["home_phone AS Home","work_phone AS Work","cell_phone AS Cell"],encoding="json",options={"order_by":"name, phone_type"})
The resulting table looks like this
1
2
3
4
5
6
7
8
9
10
+--------+--------------+----------------+
| name | phone_type | phone_number ||--------+--------------+----------------|| Jane | Cell ||| Jane | Home | 123-456-7890 || Jane | Work | 111-222-3333 || John | Cell | 333-222-1111 || John | Home | 123-456-7890 || John | Work ||+--------+--------------+----------------+
Student
Given a source table student_grade, which lists each student's quarterly
grades as separate columns in a single record:
kinetica.aggregate_unpivot(table_name="example.student_grade",column_names=["*"],variable_column_name="quarter",value_column_name="grade",pivoted_columns=["q1_grade AS q1","q2_grade AS q2","q3_grade AS q3","q4_grade AS q4"],encoding="json",options={"order_by":"student_id"})
The unpivot operation cannot be applied to vector or arraydata types, meaning they cannot appear in the
pivoted_columns list.
The column types for the pivoted_columns must match, e.g., int8 to
int8, char4 to char4, etc.
Unpivot Memory-Only Tables
Creating a memory-only table results in an entirely new data set, so be
mindful of the memory usage implications.
If an unpivot memory-only table is created and its source data set's rows
are updated (or removed), the memory-only table will not be updated to
reflect the changes in the source data set.
An unpivot memory-only table is transient, by default, and will expire after
the default TTL setting.
An unpivot memory-only table is not persisted, by default, and will not
survive a database restart; specifying a result_table_persist option of
true will make the table permanent and not expire.