The pivot operation works much the same as the SQL concept: "rotating" and
expanding single columns of values into multiple columns (one for each value),
creating a wider but shorter table. It is nearly the opposite of the
unpivot operation, though it will merge duplicates and drop
null values, which unpivot cannot undo. You can perform pivot
operations using the pivot
and pivot_values
parameters in the
options
map of /aggregate/groupby or via the SQL
PIVOT function. Visit the Aggregation documentation to review
details on aggregating data.
A pivot operation could be used to convert a table like this:
Name | Phone Type | Phone Number |
---|---|---|
Jane | Home | 123-456-7890 |
Jane | Work | 111-222-3333 |
John | Home | 123-456-7890 |
John | Cell | 333-222-1111 |
into a table like this:
Name | Home Phone | Work Phone | Cell Phone |
---|---|---|---|
Jane | 123-456-7890 | 111-222-3333 | |
John | 123-456-7890 | 333-222-1111 |
Given a source table customer
, which lists each phone number for a customer
as a separate record in the table:
name | phone_type | phone_number |
---|---|---|
... | ... | ... |
A pivot operation can be performed in Python like so, creating a single record per customer with the home, work, & cell phone numbers as separate columns:
gpudb.aggregate_groupby(
table_name= "customer",
column_names= ["name", "MAX(phone_number) as Phone"],
offset= 0,
limit= 1000,
encoding= "json",
options= {"pivot":"phone_type", "pivot_values":"Home, Work, Cell"}
)
The resulting table looks like this:
name | Home_Phone | Work_Phone | Cell_Phone |
---|---|---|---|
... | ... | ... | ... |
Given a source table students
, which lists each student's quarterly grade as
a separate record in the table:
student_id | quarter | grade |
---|---|---|
... | ... | ... |
Performing a pivot operation in Python on the student
table converts it
into a table with one record per student with each quarter grade as a separate
column:
gpudb.aggregate_groupby(
table_name = "student",
column_names= ["student_id", "MAX(grade) as grade"]
offset= 0,
limit= 1000,
encoding= "json",
options= {"pivot":"quarter", "pivot_values":"q1, q2, q3, q4"}
)
The resulting table looks like this:
student_id | q1_grade | q2_grade | q3_grade | q4_grade |
---|---|---|---|---|
... | ... | ... | ... | ... |
result_table
option is usedHome
will not match
records with values of HOME
or home
, and it will generate a new column
with Home
as the name (or part of the name if an alias was specified).