The pivot operation works much the same as the SQL concept: "rotating" and
expanding single columns of values into multiple columns (one for each value),
creating a wider but shorter table. It is nearly the opposite of the
unpivot operation, though it will merge duplicates and drop
null values, which unpivot cannot undo. You can perform pivot
operations using the pivot and pivot_values parameters in the
options map of /aggregate/groupby or via the SQL
PIVOT function. Visit the Aggregation documentation to review
details on aggregating data.
A pivot operation could be used to convert a table like this:
1
2
3
4
5
6
7
8
+--------+--------------+----------------+
| name | phone_type | phone_number ||--------+--------------+----------------|| Jane | Home | 123-456-7890 || Jane | Work | 111-222-3333 || John | Home | 123-456-7890 || John | Cell | 333-222-1111 |+--------+--------------+----------------+
into a table like this:
1
2
3
4
5
6
+--------+--------------+--------------+--------------+
| name | Home_Phone | Work_Phone | Cell_Phone ||--------+--------------+--------------+--------------|| Jane | 123-456-7890 | 111-222-3333 ||| John | 123-456-7890 || 333-222-1111 |+--------+--------------+--------------+--------------+
Each phone type in the pivoted key column becomes a prefix of one of the
resulting value columns of the pivot, with the suffix for those columns
(Phone) specified by the user.
The results of a pivot may be written to a separate
memory-only table using the result_table option.
This table can be persisted (like normal tables) using the
result_table_persist option.
A memory-only table name must adhere to the standard
naming criteria. Each memory-only table
exists within a schema and follows the standard
name resolution rules for tables.
Examples
Customer Phone Numbers
Given a source table phone_list, which lists each phone number for a
customer as a separate record in the table:
1
2
3
4
5
6
7
8
+--------+--------------+----------------+
| name | phone_type | phone_number ||--------+--------------+----------------|| Jane | Home | 123-456-7890 || Jane | Work | 111-222-3333 || John | Home | 123-456-7890 || John | Cell | 333-222-1111 |+--------+--------------+----------------+
A pivot operation can be performed in Python like so, creating a single
record per customer with the home, work, & cell phone numbers as separate
columns:
1
2
3
4
5
6
7
8
9
10
11
12
h_db.aggregate_group_by(table_name="example.phone_list",column_names=["name","MAX(phone_number) as Phone"],offset=0,limit=1000,encoding="json",options={"sort_by":"key","pivot":"phone_type","pivot_values":"Home, Work, Cell"})
The resulting table looks like this:
1
2
3
4
5
6
+--------+--------------+--------------+--------------+
| name | Home_Phone | Work_Phone | Cell_Phone ||--------+--------------+--------------+--------------|| Jane | 123-456-7890 | 111-222-3333 ||| John | 123-456-7890 || 333-222-1111 |+--------+--------------+--------------+--------------+
Student Grades
Given a source table qtr_grade, which lists each student's quarterly grade
as a separate record in the table:
Store-only columns cannot have Pivot operations
applied to them, nor can they appear in the result set
Non-charN string column types cannot appear in the result set
if the result_table option is used
Pivot values are case sensitive; e.g., a value of Home will not match
records with values of HOME or home, and it will generate a new column
with Home as the name (or part of the name if an alias was specified).